texmacs-dev

Re: [Texmacs-dev] Recent Works on Programming Language Parsers


From: TeXmacs
Subject: Re: [Texmacs-dev] Recent Works on Programming Language Parsers
Date: Thu, 26 Mar 2020 02:07:47 +0100
User-agent: Mutt/1.5.20 (2009-12-10)

Hi Darcy,

I noticed your recent changes.
Please be very careful when committing changes,
since version 2.1 is still approaching, despite the Covid-19 setback.

Best wishes, --Joris


On Mon, Mar 23, 2020 at 02:36:03AM +0800, Darcy Shen via Texmacs-dev wrote:
> In the last three months, I wrote some parsers under src/Data/Parser:
> 
> + blanks_parser
> 
> + escaped_char_parser
> 
> + identifier_parser
> 
> + inline_comment_parser
> 
> + keyword_parser
> 
> + number_parser
> 
> + operator_parser
> 
> + string_parser (TODO: multi-line support)
> 
> On top of these parsers, I rewrote various parts of xxx_language.cpp.
> 
> I didn't use a higher-level abstraction such as a packrat parser. In my
> opinion, for now, small low-level parsers written in C++ should work
> fine.
> 
> Most programming languages are similar in syntax. A simple
> abstraction is sufficient for syntax highlighting.
> 
> 
> The goal of these small parsers with `can_parser` and `do_parser` is
> to minimize the time needed to implement a new programming language
> parser.
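
To make the `can_parser` / `do_parser` idea concrete, here is a minimal sketch of what such small parsers could look like. Everything in it is an assumption for illustration: the `sketch` namespace, the `small_parser` base class, the method signatures, and the use of `std::string` are hypothetical and are not taken from src/Data/Parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not the TeXmacs code

// Assumed interface: only the names can_parser/do_parser come from the
// message above; the signatures are guesses made for this example.
struct small_parser {
  virtual ~small_parser() = default;
  // True if a token of this kind starts at s[pos].
  virtual bool can_parser(const std::string& s, std::size_t pos) const = 0;
  // Moves pos past the token; meant to be called after can_parser succeeded.
  virtual void do_parser(const std::string& s, std::size_t& pos) const = 0;
  // Convenience wrapper: parse if possible and report whether it happened.
  bool parse(const std::string& s, std::size_t& pos) const {
    if (!can_parser(s, pos)) return false;
    do_parser(s, pos);
    return true;
  }
};

// Skips runs of spaces and tabs.
struct blanks_parser : small_parser {
  static bool is_blank(char c) { return c == ' ' || c == '\t'; }
  bool can_parser(const std::string& s, std::size_t pos) const override {
    return pos < s.size() && is_blank(s[pos]);
  }
  void do_parser(const std::string& s, std::size_t& pos) const override {
    while (pos < s.size() && is_blank(s[pos])) ++pos;
  }
};

// Recognizes C-like identifiers: [A-Za-z_][A-Za-z0-9_]*
struct identifier_parser : small_parser {
  static bool is_start(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
  }
  static bool is_part(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
  }
  bool can_parser(const std::string& s, std::size_t pos) const override {
    return pos < s.size() && is_start(s[pos]);
  }
  void do_parser(const std::string& s, std::size_t& pos) const override {
    while (pos < s.size() && is_part(s[pos])) ++pos;
  }
};

}  // namespace sketch
```

With parsers of this shape, a language file would mostly decide the order in which the small parsers are tried at each position, which is where the time saving for a new language would come from.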
> 
> 
>      Below is the detailed progress:
> 
> + dot: new composition of parsers, string_parser, keyword_parser,
> operator_parser
> 
> + cpp: new composition of parsers, string_parser
> 
> + java/scala/python: old composition of parsers, string_parser
> 
> + others: old composition of parsers
> 
> 
>      Composition of parsers: old vs new
> 
> Almost every xxx_language.cpp (except scheme_language.cpp) is derived
> from mathemagix_language.cpp.
> 
> I call this the old-style composition of parsers. It is not efficient
> enough, because we have to re-parse the code in `get_color`.
> 
> Looking into concat_text.cpp:typeset_prog_string shows what
> `get_color` and `advance` actually do.
> 
> The new-style composition of parsers reduces the unnecessary
> parsing in `get_color`.
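
The difference between the two styles can be illustrated with a toy model. This is only a sketch of the general idea (remember the token kind found while advancing instead of scanning again when the color is requested); the `sketch` namespace, the `scan` helper, the `token_kind` enum and the simplified `advance` / `get_color` signatures are assumptions and do not reproduce the actual TeXmacs code.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not the TeXmacs code

enum class token_kind { none, number, identifier };

// Scans one token starting at pos, moves pos past it, returns its kind.
inline token_kind scan(const std::string& s, std::size_t& pos) {
  if (pos >= s.size()) return token_kind::none;
  unsigned char c = static_cast<unsigned char>(s[pos]);
  if (std::isdigit(c)) {
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
      ++pos;
    return token_kind::number;
  }
  if (std::isalpha(c) || c == '_') {
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_'))
      ++pos;
    return token_kind::identifier;
  }
  ++pos;  // any other single character
  return token_kind::none;
}

// Old style (as modeled here): advance() throws the classification away,
// so get_color() has to scan the same token a second time.
struct old_style {
  int scans = 0;
  void advance(const std::string& s, std::size_t& pos) {
    ++scans;
    scan(s, pos);
  }
  token_kind get_color(const std::string& s, std::size_t start) {
    ++scans;
    std::size_t p = start;
    return scan(s, p);
  }
};

// New style (as modeled here): advance() remembers what it found,
// and get_color() only reads the stored result.
struct new_style {
  int scans = 0;
  token_kind last = token_kind::none;
  void advance(const std::string& s, std::size_t& pos) {
    ++scans;
    last = scan(s, pos);
  }
  token_kind get_color() const { return last; }
};

}  // namespace sketch
```

In this model the old style scans every token twice and the new style once; the `scans` counters are there only to make that visible.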
> 
> 
>      Keyword/Operator Parser
> 
> The aim is to keep the (type, keyword) mapping in Scheme files. Please
> refer to `dot-lang.scm`.
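
On the C++ side, a (type, keyword) mapping can be consumed through a plain lookup table. The sketch below assumes the pairs have already been read from somewhere; how they are loaded from the Scheme file is not shown, and the `keyword_table` class, its methods and the type names used with it are hypothetical.

```cpp
#include <string>
#include <unordered_map>

namespace sketch {  // hypothetical namespace, not the TeXmacs code

// Maps a keyword to the name of its type (for example, a color group).
class keyword_table {
  std::unordered_map<std::string, std::string> type_of_;

 public:
  // Registers one (type, keyword) pair; a later pair for the same keyword
  // replaces the earlier one.
  void put(const std::string& type, const std::string& keyword) {
    type_of_[keyword] = type;
  }
  // True if the word is a registered keyword.
  bool can_parser(const std::string& word) const {
    return type_of_.count(word) != 0;
  }
  // Returns the type of the word, or an empty string if it is not a keyword.
  std::string type_of(const std::string& word) const {
    auto it = type_of_.find(word);
    return it == type_of_.end() ? std::string() : it->second;
  }
};

}  // namespace sketch
```

Keeping the table as data means that a new language needs a new list of pairs rather than new C++ code, which seems to be the point of moving the mapping into Scheme files.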
> 
> 
>      String Parser
> 
> Currently, the string parser only supports single-line strings. Actually,
> strings and multi-line comments are the same kind of construct:
> they both have openings and corresponding closings.
> 
> The string parser will eventually support multi-line strings. Once it is
> ready, the multi-line comment parser will also be implemented in a short
> time.
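
The observation that strings and multi-line comments both consist of an opening and a matching closing suggests a single parser parameterized by the two delimiters. The following is a sketch under that assumption only; the `delimited_parser` name, its fields and the escape handling are hypothetical, and it works on one `std::string` that may contain newlines rather than on whatever line structure the real code uses.

```cpp
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not the TeXmacs code

// Parses anything of the form  opening ... closing.
// opening and closing are assumed to be non-empty.
struct delimited_parser {
  std::string opening;
  std::string closing;
  char escape;  // '\0' means "no escape character"

  // True if the opening delimiter starts at s[pos].
  bool can_parser(const std::string& s, std::size_t pos) const {
    return pos <= s.size() && s.compare(pos, opening.size(), opening) == 0;
  }

  // Moves pos past the closing delimiter and returns true. If the input
  // ends first, pos is left at the end of the input and false is returned,
  // which a caller could use to continue on the next line.
  bool do_parser(const std::string& s, std::size_t& pos) const {
    pos += opening.size();
    while (pos < s.size()) {
      if (escape != '\0' && s[pos] == escape && pos + 1 < s.size()) {
        pos += 2;  // skip the escaped character
        continue;
      }
      if (s.compare(pos, closing.size(), closing) == 0) {
        pos += closing.size();
        return true;
      }
      ++pos;
    }
    return false;
  }
};

}  // namespace sketch
```

A double-quoted string would then be `{"\"", "\"", '\\'}` and a C-style comment `{"/*", "*/", '\0'}`, so both cases share one implementation.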
> 
> 
>      Recent Plans
> 
> I will continue my development on newly supported languages (like
> dot). The goal is to make it extremely easy to support a new language.
> Coloring schemes are another topic. In the next one or two months, I
> will continue to work on improving the xyz_parser and abc_language.
> 
> 
> Darcy
> 
> 2020/03/23
> 

> _______________________________________________
> Texmacs-dev mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/texmacs-dev



