help-bison

Re: Need help in Bison with Unicode


From: Hans Aberg
Subject: Re: Need help in Bison with Unicode
Date: Sat, 23 Jul 2005 18:08:19 +0200

On 23 Jul 2005, at 09:41, Hieu Le Trung wrote:

Hi, I have a project that uses Bison to parse some files.

I need to change it to a Unicode version, and I don't know whether Bison
supports Unicode or not.

The Bison-generated parser parses tokens, so from that point of view it is a non-issue. You cannot, though, use the character token form '...' for Unicode characters. Instead, declare ordinary named tokens with "%token ...". As for the error messages that Bison writes, at least those in English, these stay within the 7-bit ASCII subset. If you need error messages in other languages, you could use UTF-8.
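A small sketch of why the '...' form does not carry over, assuming the input is encoded as UTF-8: a character token stands for one char-sized code, whereas a non-ASCII character occupies several bytes. The helper names below are hypothetical and only illustrate the byte counts; they are not part of Bison.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 encoding of a single character."""
    return ch.encode("utf-8")


def fits_char_literal(ch: str) -> bool:
    """True if ch encodes to exactly one UTF-8 byte (that is, ch is ASCII),
    so a one-character '...' token could stand for it."""
    return len(utf8_bytes(ch)) == 1


# Compare an ASCII operator with a few non-ASCII characters.
for ch in ["+", "é", "λ", "€"]:
    encoded = utf8_bytes(ch)
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    advice = "'...' works" if fits_char_literal(ch) else "use a named %token"
    print(f"U+{ord(ch):04X}: {hex_bytes} ({len(encoded)} byte(s)) -> {advice}")
```

Anything that takes more than one byte would have to be recognized by the lexer and handed to the parser as a named token.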

The problem is mostly with the lexer. I have started to use Flex by simply writing the .l file in UTF-8, with the UTF-8 text inside the "..." patterns, thus generating a UTF-8 lexer, and it seems to work (which it should, unless there is some unforeseen snag). I have also posted, on the Flex list, a Haskell program with which one can generate Flex-like regular expressions from Unicode character classes. I think there may be work on Unicode support in Flex itself, but I do not know how that work is progressing.
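To give an idea of what generating byte-level patterns from a Unicode range can look like, here is a minimal sketch in Python. It is not the Haskell program mentioned above, and the function names are hypothetical. It assumes an 8-bit Flex scanner reading UTF-8 input and a contiguous range of code points, and it simply enumerates the range rather than doing anything clever.

```python
def esc(byte: int) -> str:
    """Format one byte as a \\xHH escape usable in a Flex pattern."""
    return f"\\x{byte:02X}"


def flex_pattern(first: int, last: int) -> str:
    """Build a byte-level pattern matching the UTF-8 encodings of the
    code points first..last (inclusive). Surrogates are skipped because
    they have no UTF-8 encoding."""
    groups = {}  # leading bytes -> [lowest final byte, highest final byte]
    for cp in range(first, last + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue
        encoded = chr(cp).encode("utf-8")
        # Within a contiguous range, code points that share the same
        # leading bytes have consecutive final bytes, so tracking the
        # lowest and highest final byte is enough.
        bounds = groups.setdefault(encoded[:-1], [encoded[-1], encoded[-1]])
        bounds[1] = encoded[-1]

    alternatives = []
    for prefix, (lo, hi) in groups.items():
        head = "".join(esc(b) for b in prefix)
        tail = esc(lo) if lo == hi else f"[{esc(lo)}-{esc(hi)}]"
        alternatives.append(head + tail)
    return "|".join(alternatives)


# Greek lowercase alpha..omega, U+03B1..U+03C9
print(flex_pattern(0x3B1, 0x3C9))  # \xCE[\xB1-\xBF]|\xCF[\x80-\x89]
```

The resulting string could be pasted into a definition in the .l file. For large ranges the output has one alternative per distinct leading-byte prefix, so a real tool would want to merge prefixes as well.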

  Hans Aberg