monotone-devel

[Monotone-devel] Re: non utf-8 filenames


From: Graydon Hoare
Subject: [Monotone-devel] Re: non utf-8 filenames
Date: Thu, 05 Oct 2006 13:59:38 -0700
User-agent: Thunderbird 1.5.0.7 (Windows/20060909)

Markus Schiltknecht wrote:
Hi,

Recently, I've stumbled across the following invariant when doing
'mtn ls unknown':

mtn: fatal: std::logic_error: paths.cc:255: invariant 'I(utf8_validate(path))' violated

A warning here would be fine, but this is certainly not a bug in monotone. For some reason, I happen to have files in my working copy whose names are not UTF-8 encoded. Could we have a nicer warning here instead of failing on an invariant? Maybe even one saying that such files cannot be added to the repository?
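A minimal sketch of the kind of behaviour being asked for, written in Python rather than monotone's C++ and not taken from monotone's code: names that fail UTF-8 validation are set aside so the caller can warn about them instead of aborting. The function name `split_by_utf8_validity` is hypothetical.

```python
from typing import List, Tuple


def split_by_utf8_validity(names: List[bytes]) -> Tuple[List[str], List[bytes]]:
    """Separate raw file names into valid UTF-8 names and rejected ones."""
    valid: List[str] = []
    rejected: List[bytes] = []
    for raw in names:
        try:
            # Strict decoding raises on any byte sequence that is not UTF-8.
            valid.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Keep the raw bytes so the caller can report them in a warning.
            rejected.append(raw)
    return valid, rejected
```

A caller could feed this the result of `os.listdir(b".")`, which returns names as bytes, and print one warning line per rejected name.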

There's an important difference to examine here:

  - There are file names that, while not presently encoded as UTF-8,
    can be faithfully transformed to and from Unicode (and thus UTF-8).
    These are very common -- several euc, koi, 8859-x, gb and jis
    standards fall in this category -- but they are all supposed to
    map bijectively to Unicode codepoints. We support these.

  - There are file names that cannot be faithfully transformed to and
    from Unicode. These are very rare -- possibly some 2022-x standards
    -- and we've decided not to support these.
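The difference between the two categories can be probed mechanically. The following is a rough Python sketch, not monotone's implementation: it checks whether a raw name survives a trip to Unicode and back in a given character set. The name `round_trips` is made up for illustration, and the charset arguments are Python codec names, which need not match what monotone or the host system calls them.

```python
def round_trips(raw: bytes, charset: str) -> bool:
    """Return True if raw decodes in charset and re-encodes to the same bytes."""
    try:
        text = raw.decode(charset)
    except UnicodeDecodeError:
        # The bytes are not valid in this character set at all.
        return False
    try:
        return text.encode(charset) == raw
    except UnicodeEncodeError:
        # The decoded text cannot be written back in this character set.
        return False
```

For example, `round_trips(b"caf\xe9", "iso-8859-1")` holds, while a name containing bytes that are invalid in the chosen charset is rejected.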

If you have filenames of the latter sort, you are out of luck: our rosters (internal data structures) only deal in Unicode, so before monotone can do anything with your filename it tries to convert it to Unicode.

If you have filenames of the former sort, we should be able to deal with them. Monotone is supposed to transform host character sets to Unicode while reading them from disk, and transform back from Unicode to the host character set when writing back to disk.
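As an illustration of that read/write transformation, here is a small Python sketch. It assumes the host character set is known and passed in explicitly; the function names `host_to_utf8` and `utf8_to_host` are hypothetical and do not correspond to monotone's actual functions.

```python
def host_to_utf8(raw: bytes, host_charset: str) -> bytes:
    """Reading from disk: host-charset bytes in, UTF-8 bytes out."""
    return raw.decode(host_charset).encode("utf-8")


def utf8_to_host(stored: bytes, host_charset: str) -> bytes:
    """Writing to disk: UTF-8 bytes in, host-charset bytes out."""
    return stored.decode("utf-8").encode(host_charset)
```

On an ISO 8859-1 host, `host_to_utf8(b"caf\xe9", "iso-8859-1")` gives `b"caf\xc3\xa9"`, and `utf8_to_host` turns that back into the original bytes. Both functions raise a `UnicodeError` subclass when the conversion is not possible, which is the case a caller would want to report as a warning.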

If it's not doing so, there's a bug.

-graydon




