lilypond-devel

Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)


From: k-ohara5a5a
Subject: Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)
Date: Sun, 01 Jan 2012 02:01:11 +0000

Works nicely.

Showing the input location will probably be very helpful.  We probably
want to remove the similar message from lily/misc.cc, because both
messages together are very noisy.

I wish I could think of a way to check the input with a canned regular
expression like
<http://flex.sourceforge.net/manual/Identifiers.html#Identifiers> or,
better, one with comments:
<http://www.w3.org/International/questions/qa-forms-utf-8>

Doing so seems to require backing up (which probably won't cause any
harm), or maybe I'm just not seeing an easy way.
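
For reference, the byte ranges in the W3C expression can also be checked
by hand outside the scanner.  The following is only a rough sketch, not
code from the patch: the names utf8_sequence_length and is_valid_utf8
are made up for illustration, and unlike the W3C pattern it accepts
every ASCII byte, including control characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of LilyPond.  Returns the length in bytes
// of the well-formed UTF-8 sequence starting at s[i], or 0 if the bytes
// there are not well-formed UTF-8 (or i is past the end of s).
std::size_t
utf8_sequence_length (const std::string &s, std::size_t i)
{
  // Fetch byte k as an unsigned value; 0x100 marks "past the end" and
  // falls outside every range tested below.
  auto byte = [&] (std::size_t k) -> unsigned {
    return k < s.size () ? static_cast<unsigned char> (s[k]) : 0x100u;
  };
  auto in = [] (unsigned b, unsigned lo, unsigned hi) {
    return lo <= b && b <= hi;
  };
  // Continuation byte 10xxxxxx.
  auto cont = [&] (std::size_t k) { return in (byte (k), 0x80, 0xBF); };

  unsigned b0 = byte (i);
  if (b0 <= 0x7F)                       // ASCII
    return 1;
  if (in (b0, 0xC2, 0xDF))              // non-overlong 2-byte
    return cont (i + 1) ? 2 : 0;
  if (b0 == 0xE0)                       // 3-byte, excluding overlongs
    return (in (byte (i + 1), 0xA0, 0xBF) && cont (i + 2)) ? 3 : 0;
  if (in (b0, 0xE1, 0xEC) || in (b0, 0xEE, 0xEF))   // straight 3-byte
    return (cont (i + 1) && cont (i + 2)) ? 3 : 0;
  if (b0 == 0xED)                       // 3-byte, excluding surrogates
    return (in (byte (i + 1), 0x80, 0x9F) && cont (i + 2)) ? 3 : 0;
  if (b0 == 0xF0)                       // planes 1-3
    return (in (byte (i + 1), 0x90, 0xBF) && cont (i + 2) && cont (i + 3))
           ? 4 : 0;
  if (in (b0, 0xF1, 0xF3))              // planes 4-15
    return (cont (i + 1) && cont (i + 2) && cont (i + 3)) ? 4 : 0;
  if (b0 == 0xF4)                       // plane 16
    return (in (byte (i + 1), 0x80, 0x8F) && cont (i + 2) && cont (i + 3))
           ? 4 : 0;
  return 0;                             // 0x80-0xC1 and 0xF5-0xFF never start a sequence
}

// True if the whole string is a run of well-formed UTF-8 sequences.
bool
is_valid_utf8 (const std::string &s)
{
  for (std::size_t i = 0; i < s.size ();)
    {
      std::size_t n = utf8_sequence_length (s, i);
      if (n == 0)
        return false;
      i += n;
    }
  return true;
}
```

With this, is_valid_utf8 ("\xC3\xA9") accepts an e-acute encoded as
UTF-8, while the Latin-1 byte "\xE9" on its own is rejected.  Whether
something like this belongs in the lexer or in a helper elsewhere is a
separate question.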


http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll
File lily/lexer.ll (right):

http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode134
lily/lexer.ll:134: A            [a-zA-Z\200-\377]
non-ASCII characters are used internally as-read, tested below only to
warn the user if the input is invalid as UTF-8

http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode220
lily/lexer.ll:220: (void) YYText_utf8 ();
Complaining about comments doesn't help anybody.

http://codereview.appspot.com/5505090/diff/1/lily/lexer.ll#newcode1083
lily/lexer.ll:1083: LexerWarning (_ ("non-UTF-8 characters").c_str ());
I suggest "unsupported file encoding, expected UTF-8"

You could warn once per token, rather than once per character, by
following this with  break;
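
To illustrate the difference, here is a small stand-alone sketch.  It
assumes the warning is issued from inside a loop over the characters of
one token, which is a guess about the shape of the patched code; the
names sequence_length and check_token are hypothetical, warnings are
collected in a vector instead of going through LexerWarning, and the
validity check is only a simplified structural one (it does not reject
overlong forms or surrogates).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified check: length of the UTF-8 sequence starting at
// s[i], judged only by the lead byte and the 10xxxxxx continuation bytes;
// 0 if the sequence is malformed or truncated.
static std::size_t
sequence_length (const std::string &s, std::size_t i)
{
  unsigned char b = static_cast<unsigned char> (s[i]);
  std::size_t n = b < 0x80 ? 1
                  : (b & 0xE0) == 0xC0 ? 2
                  : (b & 0xF0) == 0xE0 ? 3
                  : (b & 0xF8) == 0xF0 ? 4
                  : 0;
  if (n == 0 || i + n > s.size ())
    return 0;
  for (std::size_t k = 1; k < n; k++)
    if ((static_cast<unsigned char> (s[i + k]) & 0xC0) != 0x80)
      return 0;
  return n;
}

// Scans one token and returns the warnings it would produce.  The vector
// stands in for calls to LexerWarning.
std::vector<std::string>
check_token (const std::string &token, bool once_per_token)
{
  std::vector<std::string> warnings;
  for (std::size_t i = 0; i < token.size ();)
    {
      std::size_t n = sequence_length (token, i);
      if (n == 0)
        {
          warnings.push_back ("non-UTF-8 characters");
          if (once_per_token)
            break;  // stop scanning after the first warning for this token
          n = 1;    // otherwise skip the offending byte and keep scanning
        }
      i += n;
    }
  return warnings;
}
```

For a token made of three Latin-1 bytes such as "\xE9\xE8\xE0",
check_token (..., false) collects three warnings and
check_token (..., true) collects one.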

http://codereview.appspot.com/5505090/


