Re: [GLISS] Existing syntax abominations

lilypond-devel

From:	Janek Warchoł
Subject:	Re: [GLISS] Existing syntax abominations
Date:	Sat, 22 Sep 2012 01:10:03 +0200

Hi David, James & all,

On Fri, Sep 21, 2012 at 7:41 PM, James <address@hidden> wrote:
> About the only part of the thread I could follow [...]

Indeed, David's message was really technical.
I'll try to "translate" his email.  Apart from letting me check whether
I understood everything correctly myself, I hope that this will benefit
you as well.

On Fri, Sep 21, 2012 at 6:46 PM, David Kastrup <address@hidden> wrote:
> Sometimes it is important to be able to parse some expression without
> further lookahead,

When Lily reads a .ly file, she sees just a load of characters.  She
needs to use two subprograms, the lexer and the parser, to understand
the file (translate the text into meaningful objects, like a NoteHead,
Stem, Accidental etc.).
The lexer reads the .ly file letter-by-letter.  The lexer's problem is
"is this letter a continuation of the previous 'word', or the beginning
of a new one?"  For example, the lexer's job is to divide this:
cis4\fermata-.
into this:
cis  4  \fermata  -.
(a pitch, a duration, a postevent (postevent is something attached to
a note, like an articulation), another postevent)
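The splitting step can be sketched in a few lines of Python.  This is
only a toy illustration: the token names and the regular expressions
are my own assumptions and are far simpler than LilyPond's real lexer
rules.

```python
import re

# Toy token rules (assumptions for illustration, not LilyPond's real ones).
TOKEN = re.compile(r"""
    (?P<pitch>[a-z]+)              # note name such as c, cis, g
  | (?P<duration>\d+\.*)           # 4, 8, 2. and so on
  | (?P<command>\\[a-zA-Z]+)       # backslash command such as \fermata
  | (?P<articulation>-[.>^_+!-])   # shorthand articulation such as -.
  | (?P<space>\s+)                 # whitespace separates words
""", re.VERBOSE)


def tokenize(text):
    """Split the input into 'words', dropping whitespace."""
    words = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}: {text[pos:]!r}")
        if match.lastgroup != "space":
            words.append(match.group())
        pos = match.end()
    return words


print(tokenize(r"cis4\fermata-."))
```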
After that, the parser's job is to group these 'words' into meaningful
'sentences'.  For example,
c4 g \f d8-.
becomes
c4
g \f
d8-.
(i.e., all things that go with the pitch - a duration, articulations
etc - are merged together).
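The grouping step can be sketched like this, again as a toy in Python.
The rule "a purely alphabetic word starts a new note" is an assumption
made for this example; the real parser works from a full grammar.

```python
def group_notes(tokens):
    """Attach durations and postevents to the pitch that precedes them."""
    notes = []
    for tok in tokens:
        if tok.isalpha():
            # A bare name like c or cis starts a new 'sentence'.
            notes.append([tok])
        elif notes:
            # Everything else belongs to the most recent pitch.
            notes[-1].append(tok)
        else:
            raise ValueError(f"{tok!r} has no pitch to attach to")
    return notes


print(group_notes(["c", "4", "g", "\\f", "d", "8", "-."]))
```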
The problem is that sometimes it's impossible to tell what something
is without looking at the next thing.  For example, when reading this
\markup " \ bla"
letter-by-letter, Lily sees
\  <= a beginning of a command
m  <= first letter of the command name
a  <= second letter of the command name
r   etc.
k
u
p
   <= whitespace - this means command name ended
"  <= beginning of a string
   <= space in the string
\  <= another character in the string
   <= another space in the string
b
l
a
"  <= end of the string.

That was easy.  Now, take this:
\markup " \" bla"
The tricky part is that \" means a double quote char inside the string
(as opposed to " without a backslash, which means the beginning/end of
a string).  How should Lily know that the second backslash doesn't
mean a backslash char inside the string, like in the previous example,
but rather something special?  That's what we need lookahead for.
Lookahead means that before deciding what the current letter in the
input means, we look at the next one.  So, every time Lily sees a
backslash inside a string (inside " "), she looks at the next letter in
the input to know whether the backslash is just another char or has a
special meaning.
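Here is a minimal Python sketch of that one-letter lookahead.  It only
knows about the \" escape described above; whatever other escapes the
real lexer handles are deliberately left out, and the function name is
made up for this example.

```python
def read_string(text, start):
    """Read a double-quoted string starting at text[start].

    Returns the string contents and the index just after the closing quote.
    """
    if text[start] != '"':
        raise ValueError("expected an opening double quote")
    chars = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Unescaped quote: the string ends here.
            return "".join(chars), i + 1
        if ch == "\\":
            # Lookahead: peek at the next letter before deciding.
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == '"':
                chars.append('"')  # \" is a quote char inside the string
                i += 2
                continue
        chars.append(ch)  # ordinary char (including a lone backslash)
        i += 1
    raise ValueError("unterminated string")


print(read_string(r'" \ bla"', 0))
print(read_string(r'" \" bla"', 0))
```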
Now, that was lookahead at the lexer level (when gluing letters into
words).  If I understand correctly, we have similar lookahead in the
parser, i.e. when we have the words already separated and we analyze
their meaning.  (Unfortunately I don't have any example handy.)

I hope that it's now clear what a lookahead is.

> for example because lexer modes need switching.

I'm not sure what lexer modes are, but I suppose that it's about
different rules in different contexts.  For example, when you're
inside a string you have to do a lookahead when you encounter a
backslash, but you don't have to do this when you're not inside a
string.

> I am just now experimenting with code where the _lexer_ will transparently
> call music functions in its own parser copy, inserting the result back
> into LilyPond.

I think that David checked what would happen if the lexer "calculated"
music functions on its own (using a privately run parser) and just
inserted the results back into LilyPond.
It's like changing this (not Lily code here, just math)
a + 2*3
into this:
a + 6
(i.e. calculating 2*3 and inserting the result in place of the multiplication).
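The same analogy in runnable form, using Python's own ast module on a
Python expression.  This only illustrates "calculate early and splice
the result back in"; it says nothing about how David's experimental
code is actually written.

```python
import ast
import operator

# Only a few arithmetic operators are handled in this sketch.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldConstants(ast.NodeTransformer):
    """Replace 'constant op constant' by its computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _OPS.get(type(node.op))
        if (op is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            value = op(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value, kind=None), node)
        return node


def fold(expression):
    """Return the expression text with constant parts pre-calculated."""
    tree = ast.parse(expression, mode="eval")
    return ast.unparse(FoldConstants().visit(tree))


print(fold("a + 2*3"))
```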

> Now I get the following output for cue-clef-new-line.ly:
>
> input/regression/cue-clef-new-line.ly:14:20: error: unknown escaped string: `\vI'
> \addQuote vIQuote {
>                     \vI }
> input/regression/cue-clef-new-line.ly:14:20: error: syntax error, unexpected STRING
> \addQuote vIQuote {
>                     \vI }
>
> The input is
>
> vI = \relative c'' { \clef "treble" \repeat unfold 40 g4 }
> \addQuote vIQuote { \vI }

LilyPond says "I don't know what a \vI is.  \vI looks like a string,
and I don't want a string here."

> Huh?  Why is \vI undefined at the time \addQuote is called?  Now since
> \addQuote is called in the lexer in this LilyPond version,

David's experimental change resulted in \addQuote being called and
"calculated" during the lexer phase.  This didn't happen before.

> it is called when the preceding code is asking for a lookahead token.

If \addQuote was called during the lexer phase, it means that it
happened when a previous element of the code asked the lexer to look
ahead to check something.

> Why on Earth
> would the preceding code ask for a lookahead token to finish that
> assignment?

The preceding code is a closing brace.  A closing brace generally means
the end of a music expression.  It's strange that a closing brace says
"please check what happens after me, because I'm not sure what I'm
doing here".  The strange thing is: why does the brace ask for this,
instead of just saying "hey, I closed a music expression now!"?

> Calling lilypond with -ddebug-parser tells us:
>
> Entering state 55
> Reducing stack by rule 134 (line 1007):
>    $1 = nterm braced_music_list (: )
> -> $$ = nterm sequential_music (: )
> Stack now 0 2 6 168 296
> Entering state 57
> Reducing stack by rule 157 (line 1093):
>    $1 = nterm sequential_music (: )
> -> $$ = nterm grouped_music_list (: )
> Stack now 0 2 6 168 296
> Entering state 61
> Reducing stack by rule 155 (line 1088):
>    $1 = nterm grouped_music_list (: )
> -> $$ = nterm music_bare (: )
> Stack now 0 2 6 168 296
> Entering state 60
> Reducing stack by rule 152 (line 1082):
>    $1 = nterm music_bare (: )
> -> $$ = nterm composite_music (: )
> Stack now 0 2 6 168 296
> Entering state 136
> Reading a token: Starting parse
> Entering state 0
> Reading a token: Next token is token "(music-function-call)" (: #<Music
> function
>
> So here is where we have "composite_music", and the following \addQuote
> is called prematurely in the search of a lookahead token.  Why?  Let's
> look at state 136 in the parser:
>
> state 136
>
>   130 music_assign: composite_music .  ["end of input", error, "\\repeat", 
> "\\alternative", "\\default", ':', '(', ')', '[', ']', '~', '^', '_', "--", 
> "__", "\\!", EVENT_IDENTIFIER, E_UNSIGNED, "\\[", "\\]", "\\(", "\\)", "\\<", 
> "\\>", DURATION_IDENTIFIER, REAL, UNSIGNED, NUMBER_IDENTIFIER, "\\accepts", 
> "\\alias", "\\book", "\\bookpart", "\\change", "\\chordmode", "\\chords", 
> "\\consists", "\\context", "\\defaultchild", "\\denies", "\\description", 
> "\\drummode", "\\drums", "\\figuremode", "\\figures", "\\header", 
> "\\version-error", "\\layout", "\\lyricmode", "\\lyrics", "\\lyricsto", 
> "\\markup", "\\markuplist", "\\midi", "\\name", "\\notemode", "\\override", 
> "\\paper", "\\remove", "\\revert", "\\score", "\\sequential", "\\set", 
> "\\simultaneous", "\\tempo", "\\type", "\\unset", "\\with", "\\new", "<", 
> "<<", ">>", "\\", "\\~", FIGURE_OPEN, LYRIC_MARKUP, MULTI_MEASURE_REST, 
> "(backed-up?)", "(reparsed?)", CHORD_REPETITION, CONTEXT_MOD_IDENTIFIER, 
> DRUM_PITCH, PITCH_IDENTIFIER, FRACTION, LYRICS_STRING, 
> LYRIC_MARKUP_IDENTIFIER, MARKUP_IDENTIFIER, MARKUPLIST_IDENTIFIER, 
> MUSIC_IDENTIFIER, NOTENAME_PITCH, RESTNAME, SCM_IDENTIFIER, SCM_TOKEN, 
> STRING, STRING_IDENTIFIER, TONICNAME_PITCH, '-', '{', '}', '|']
>   227 new_lyrics: . "\\addlyrics" address@hidden composite_music
>   229           | . new_lyrics "\\addlyrics" address@hidden composite_music
>   230 re_rhythmed_music: composite_music . new_lyrics
>
>     "\\addlyrics"  shift, and go to state 202
>
>     $default  reduce using rule 130 (music_assign)
>
>     new_lyrics  go to state 203
>
>     Conflict between rule 130 and token "\\addlyrics" resolved as shift 
> (COMPOSITE < "\\addlyrics").

This is very technical and I can only guess what it means based on
what David writes below.
Basically I think David provided this for people who can understand it
- if you cannot, skip it as there is a written explanation.

> Look and behold: after the closing brace of the sequential music, the
> expression is not finished because LilyPond has to see whether there is
> an \addlyrics after that, as it would become part of the expression.

When there's an \addlyrics present, the music expression doesn't end
where it normally does.  That's why there was a lookahead from } ,
which caused \addQuote to be evaluated before the music expression
being assigned to \vI was finished.
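The ordering problem can be reproduced with a toy model in Python.
Everything here is invented for illustration: the token "USE:vI" stands
in for the whole \addQuote vIQuote { \vI } call, "\\addlyrics" stands
in for a complete lyrics block, and the function is not taken from
LilyPond's sources.

```python
def parse(tokens, eager_lexer):
    """Parse 'name = { ... }' assignments and USE:name calls.

    Returns a log of events in the order they happened.  The token list
    must end with "EOF".
    """
    variables, log, pos = {}, [], 0

    def evaluate(name):
        state = "defined" if name in variables else "undefined"
        log.append(f"use {name}: {state}")

    def scan():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if eager_lexer and tok.startswith("USE:"):
            evaluate(tok[4:])  # the lexer runs the call as soon as it scans it
        return tok

    def expect(wanted):
        tok = scan()
        if tok != wanted:
            raise SyntaxError(f"expected {wanted!r}, got {tok!r}")

    lookahead = scan()
    while lookahead != "EOF":
        if lookahead.startswith("USE:"):
            if not eager_lexer:
                evaluate(lookahead[4:])  # normal case: the parser runs the call
            lookahead = scan()
            continue
        name = lookahead
        expect("=")
        expect("{")
        body = []
        tok = scan()
        while tok != "}":
            body.append(tok)
            tok = scan()
        # After } the expression may still continue with \addlyrics,
        # so one more token has to be scanned before assigning.
        lookahead = scan()
        if lookahead == "\\addlyrics":
            body.append(lookahead)
            lookahead = scan()
        variables[name] = body
        log.append(f"assign {name}")
    return log


tokens = ["vI", "=", "{", "g4", "}", "USE:vI", "EOF"]
print(parse(tokens, eager_lexer=False))
print(parse(tokens, eager_lexer=True))
```

With eager_lexer=True the "use" event is logged before the "assign"
event, which is the situation behind the error message above.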

> Well, it seems my "stealthy" music function call in the lexer can't work
> just as stealthily as that since a brace-enclosed music expression is
> potentially incomplete.  That's actually rather bad news for other
> potentially mode-switching commands as well.

I'm not sure about mode-switching commands.
But generally, having to do excessive lookahead is bad.  You prefer to
know what's happening without looking ahead.

> \addlyrics is actually something calling itself "re_rhythmed_music" in
> the parser.  That concept would make independent sense, but it is only
> available for lyrics, not anything else.

I think David means that there is some interesting idea behind having
\addlyrics work in this way, but it isn't actually used for anything
else.

> Then we have \override Grob #'this #'that = 7 which needs to get
> reverted with \revert Grob #'(this that) rather counter-intuitively, to
> the degree that people complain about a recently introduced warning
> resulting from \revert Grob #'this #'that (which never did what people
> thought it would, ignoring all but the first Scheme expression).

If you want to override a multi-level property - for example #'details
#'lengths property of Stem object - you can write this:
  \override Stem #'details #'lengths = #4
or
  \override Stem #'(details lengths) = #4
and both work.  But to revert this change, you must write
  \revert Stem #'(details lengths)
because writing
  \revert Stem #'details #'lengths
won't work.
This is inconsistent.

> Then strings in lyrics are sufficiently differently delimited from the
> way strings in markups are.  For example, they can contain unquoted
> curly braces in some positions.  In my opinion, lyrics (which can be
> interspersed with durations) have even less business to allow curly
> braces as part of words without using quote marks than markups have.

\markup { blah } , \markup {"blah"} and \markup {blah} work.  (notice
that the text isn't separated from the braces in the last case)
On the other hand, \addlyrics { blah } and \addlyrics {"blah"} work,
but \addlyrics {blah} doesn't - the closing brace gets glued to the
text.
This is inconsistent.
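A toy Python sketch of how two different word rules give this result.
The two regular expressions are assumptions chosen to reproduce the
behaviour described above; David notes that braces are only accepted
"in some positions", so the real lyrics rule is more subtle.

```python
import re

# Assumed rules, for illustration only.
MARKUP_WORD = re.compile(r"[{}]|[^\s{}]+")  # braces always stand alone
LYRIC_WORD = re.compile(r"\S+")             # any run of non-whitespace is a word


def split_words(text, mode):
    """Split text into words using the rule for the given mode."""
    pattern = MARKUP_WORD if mode == "markup" else LYRIC_WORD
    return pattern.findall(text)


# The text that follows the opening brace in each case:
print(split_words("blah}", "markup"))   # brace is separated from the word
print(split_words("blah}", "lyrics"))   # brace is glued to the word
print(split_words("blah }", "lyrics"))  # a space keeps them apart
```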

I hope that my "translations" are both correct and clear :-)
As for my opinion on this, I agree that all these are annoying and it
will be good to get rid of them.

cheers,
Janek
