Re: [Monotone-devel] Representing EBCDIC history in monotone


From: hendrik
Subject: Re: [Monotone-devel] Representing EBCDIC history in monotone
Date: Tue, 1 Jul 2008 17:10:28 -0400
User-agent: Mutt/1.5.9i

On Tue, Jul 01, 2008 at 03:27:23PM -0400, Jack Lloyd wrote:
> On Tue, Jul 01, 2008 at 03:22:05PM -0400, address@hidden wrote:
> 
> > But there are a few small files containing lots of characters not in the 
> > ASCII character set.  These are tables used by the lexical scanner to 
> > classify characters.  It makes no sense to translate the weird 
> > characters into ASCII.  But leaving them out will violate the 
> > development history.
> > 
> > Any advice?
> 
> My first thought would be to convert them to reading/recognizing
> the Unicode/UTF-8 equivalents [1], though that would also require
> changing the code to understand some amount of UTF-8/Unicode which
> might not be the point.
> 
> Perhaps you can change it to match, rather than the literal EBCDIC characters,
> the equivalent character values (in whatever EBCDIC layout
> this thing is set up for), with comments specifying what the actual
> characters being matched are?
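
A minimal sketch of what Jack describes, in Python purely for illustration (the actual scanner is not shown in this thread). It assumes EBCDIC code page 037, which may well not be the layout the original code used; the table name, the class names, and the handful of entries are hypothetical.

```python
# Hypothetical classification table keyed by EBCDIC code values rather than
# by literal characters. Code page 037 is an assumption; the real tables may
# use a different EBCDIC layout.
LETTER, DIGIT, OPERATOR, OTHER = "letter", "digit", "operator", "other"

CHAR_CLASS = {
    0xC1: LETTER,    # 'A'
    0xF0: DIGIT,     # '0'
    0x4E: OPERATOR,  # '+'
    0x5F: OPERATOR,  # not sign (U+00AC), which has no ASCII equivalent
}


def classify(code: int) -> str:
    """Return the class of one EBCDIC code value, or OTHER if unlisted."""
    return CHAR_CLASS.get(code, OTHER)
```

Written this way the file itself is plain ASCII, while the comments record which character each value stood for.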

Perhaps sticking the EBCDIC file in the archive, along with a 
separate document explaining it?  That would satisfy the historical
record.

And then revising it immediately to a usable ASCII version?
There aren't *many* files that would need this treatment, so they won't 
take an inordinate amount of space in the database.

> Or maybe just mark that file as binary, then delete the manual-merge
> cert on it? (Does that even work?)

Manual-merge cert?  What is that?

> Is the intent to feed this thing ASCII sources, or EBCDIC ones?

The existing thing is set up for EBCDIC source.  The first order of 
business is to make it accept 7-bit ASCII source instead.  Yes, that 
makes it recognise not just a different character code from the 
original, but a different character *set*, and changes the permitted 
symbols that represent some operators.  That's OK.  The language 
I'm compiling has been adapted to ASCII since the original code was 
written.

> 
> -Jack
> 
> [1]: I assume at some point _someone_ thought it would be useful to
> include all of the EBCDIC character points into Unicode somewhere.

It has been done --- several times.  But there are so many EBCDICs that 
it's hard to find the right one.  Look up EBCDIC on unicode.org, and you 
will find several.  None of the ones I've found online is the one
I used.
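
To illustrate how the variants disagree, here is a small sketch using two EBCDIC codecs that ship with Python, cp037 and cp500. Neither is claimed to be the code page used for this code; the sample bytes and the helper name are chosen only for illustration.

```python
# Three byte values that cp037 and cp500 assign to different characters.
SAMPLE = bytes([0x4A, 0x5A, 0x5F])


def decode_with(codec: str, data: bytes = SAMPLE) -> str:
    """Decode EBCDIC bytes using the named Python codec."""
    return data.decode(codec)


if __name__ == "__main__":
    for codec in ("cp037", "cp500"):
        print(codec, decode_with(codec))
```

Letters and digits decode identically under both; the disagreement is in the punctuation and special symbols, which is exactly where a scanner's character tables are sensitive.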

If all I were doing was converting an ancient piece of code, I'd just 
ignore the issue, and do everything in ASCII, editing out any junk left 
from the translation.  Maybe I should just do that.  But I'd like to 
preserve the historical record, for archaeological reasons.

Maybe I'm just being silly.

-- hendrik
