Re: pygment regex question
From: Jean Abou Samra
Subject: Re: pygment regex question
Date: Fri, 25 Nov 2022 21:39:28 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.0
On 25/11/2022 at 20:28, Luca Fascione wrote:
On Fri, 25 Nov 2022, 18:11 Jean Abou Samra, <jean@abou-samra.fr> wrote:
What makes you think Pygments can’t do this? You can do
(?<=\w+)\d+
Nothing but my not remembering lookaheads/lookbehinds, which I may
argue aren't very common constructs. In fact, aside from Perl, I'm not
even sure what precedent they have (no, Python doesn't count). Besides,
this has nothing to do with Pygments; this is the regex matching
engine that does its thing, Pygments just gratefully receives the benefit.
Well, reusing a feature found in the underlying tools is not bad
design, it is good design that shares functionality instead of
reinventing the wheel.
(Sorry, I co-maintain Pygments, which is why I am a bit sensitive
to this "bad design" criticism.)
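A side note on the lookbehind quoted above: Pygments compiles its rules with Python's re module, which (at least in the Python versions I know of) only accepts fixed-width lookbehinds, so the pattern needs a small adjustment. A minimal sketch with plain re; the sample input is made up:

```python
import re

# Python's re rejects variable-width lookbehinds such as (?<=\w+).
try:
    re.compile(r"(?<=\w+)\d+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# A single-character lookbehind is fixed-width and expresses the same idea:
# digits that directly follow a word character.
duration_after_pitch = re.compile(r"(?<=\w)\d+")

sample = "c4 d8 e16"  # made-up input for illustration
durations = duration_after_pitch.findall(sample)
```

The lookbehind only checks the character immediately before the digits, which is enough for "digits glued to a pitch name" but says nothing about what kind of word precedes them.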
and things like that. You could also arrange so that the regex
parsing a pitch leaves you in a state of the lexer where something
special will happen for \d+
This does sound like pygments code. Interesting, I wasn't aware you
could mess with the state of the lexer to that depth.
Hrrm... It's not an advanced feature, it's really the basic way
Pygments lexers work. You have a set of states, the lexer has a
state stack, each state tries regex-based rules in turn and a rule
adds to or removes from the stack. This example would be done as
tokens = {
    "root": [
        ...
        (r"\w+", Token.Pitch, "after_note"),
        ...
    ],
    "after_note": [
        (r"\d+", Token.Duration, "#pop"),
        ...
        default("#pop"),
    ],
    ...
}
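To illustrate the state-stack mechanism just described without depending on Pygments itself, here is a toy lexer written with plain re. It is a sketch of the idea, not Pygments' actual implementation: the rule table, the token names as strings, and the tokenize helper are all made up for this example, and the pitch regex is narrowed to [a-z]+ so that it does not consume the digits.

```python
import re

# Hypothetical rule table in the spirit of the example above.
# Each rule is (regex, token name, stack action); the action is None,
# "#pop", or the name of a state to push.
RULES = {
    "root": [
        (r"\s+", "Whitespace", None),
        (r"[a-z]+", "Pitch", "after_note"),  # assumption: letters only
        (r"\d+", "Number", None),
    ],
    "after_note": [
        (r"\d+", "Duration", "#pop"),
        (r"", None, "#pop"),  # plays the role of default("#pop")
    ],
}


def tokenize(text):
    """Yield (token name, matched text) pairs using a state stack."""
    stack = ["root"]
    pos = 0
    while pos < len(text):
        # Try the rules of the state on top of the stack, in order.
        for pattern, token, action in RULES[stack[-1]]:
            match = re.compile(pattern).match(text, pos)
            if match is None:
                continue
            if token is not None and match.group():
                yield token, match.group()
            pos = match.end()
            if action == "#pop":
                stack.pop()
            elif action is not None:
                stack.append(action)
            break
        else:
            # No rule matched: emit the character as an error and move on.
            yield "Error", text[pos]
            pos += 1


tokens = list(tokenize("c4 d 8"))
```

In this sketch the "4" right after "c" comes out as a Duration, while the "8" that follows whitespace is seen in the "root" state and comes out as a plain Number.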
In simple cases (if there is no complex stuff in the "after_note"
state), you can also get by with
tokens = {
    "root": [
        ...
        (r"(\w+)(\d*)", bygroups(Token.Pitch, Token.Duration)),
        ...
    ],
    ...
}
which in hindsight may be closer to what you were thinking
of originally.
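One caveat with the schematic regexes above, checked here with plain re rather than Pygments: \w also matches digits, so a greedy \w+ takes the duration with it and the second group stays empty. A narrower pitch pattern avoids that; [a-z]+ below is only an assumption for the sake of the example and is far from LilyPond's real note syntax.

```python
import re

note = "c4"  # made-up input

# \w+ is greedy and \w includes digits, so it consumes "c4" entirely.
greedy = re.match(r"(\w+)(\d*)", note).groups()

# Restricting the pitch to letters leaves the digits for the second group.
split = re.match(r"([a-z]+)(\d*)", note).groups()
```

Here greedy is ('c4', '') and split is ('c', '4').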
However, durations don’t always follow a pitch, as in
\tuplet 3/2 8. { … }
which is the reason why we don’t want to do that.
Does Lilypond's parser even know that's a duration? Isn't that just a
bare string that \tuplet internally interprets as a duration?
\tuplet is defined (in ly/music-functions-init.ly) as
tuplet =
#(define-music-function (ratio tuplet-span music)
   (fraction? (ly:duration? '()) ly:music?)
   ...)
When the parser sees "8", it notes that this could
be either a number or a duration, so it tries the
different variants against the predicate ly:duration?
The function receives an argument of the right
type thanks to the predicate it declares for this
argument.
If you wanted to do that in Pygments, you would have
to know the signature of every LilyPond music function
and which predicates match numbers or durations,
not to mention the problem of user-written functions.
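To make that concrete, here is a rough sketch of what a signature-aware highlighter would have to carry around. Everything in it is hypothetical: the table, the classify_digits helper, and the token names are invented for illustration. Only the \tuplet predicates come from the definition above (ignoring that the duration argument is optional), and the \barNumberCheck entry is my assumption.

```python
# Hypothetical signature table: function name -> predicate per argument.
SIGNATURES = {
    r"\tuplet": ["fraction?", "ly:duration?", "ly:music?"],
    r"\barNumberCheck": ["integer?"],  # assumed, not checked against LilyPond
}


def classify_digits(function_name, argument_index):
    """Guess how bare digits in the given argument slot should be highlighted."""
    predicates = SIGNATURES.get(function_name)
    if predicates is None or argument_index >= len(predicates):
        return "Unknown"  # e.g. a user-written function
    if predicates[argument_index] == "ly:duration?":
        return "Duration"
    return "Number"


tuplet_span = classify_digits(r"\tuplet", 1)
bar_number = classify_digits(r"\barNumberCheck", 0)
user_defined = classify_digits(r"\myFunction", 0)
```

The "Unknown" branch is the crux: any function missing from the table, including every user-written one, leaves the highlighter guessing.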
When implementing this kind of simplistic syntax highlighting (that is,
one not assisted by awareness of the semantics of the language, as
you'd have in Visual Studio or Qt Creator, say) there's always
this problem of how much of the common libraries you reimplement by
hand. I'm not sure how Frescobaldi does its thing, for example; a lot
of it seems quite magic to me (or the result of a huge labour of
love... I mean, that program is just brilliant).
Anyways whatever Frescobaldi does, I wonder if we could mimic for
Pygments...
What Frescobaldi does is here:
https://github.com/frescobaldi/python-ly/blob/master/ly/lex/lilypond.py
1500+ lines of code, obviously a lot of work and dedication.
Nevertheless, it has to make assumptions too. For example, if
you enter this in Frescobaldi:
\version "2.22.2"
{
  \barNumberCheck 1
  \tweak duration-log 2 c'1
}
... you will notice that the "1" after \barNumberCheck is highlighted
in the same color as the duration in "c'1", in spite of it being
a number like the "2" in "\tweak duration-log 2 ..."
On the reasons not to reuse Frescobaldi's code for syntax
highlighting in the documentation, see
https://lists.gnu.org/archive/html/lilypond-devel/2022-10/msg00207.html
Jean