gnustep-dev

Re: Hash computation and TFB


From: Richard Frith-Macdonald
Subject: Re: Hash computation and TFB
Date: Tue, 6 Aug 2013 14:39:41 +0100

On 6 Aug 2013, at 14:30, Stefan Bidi <address@hidden> wrote:

> I copied the hash algorithm straight out of -base, so they should match.  I 
> remember a few months ago Richard was playing around with hash functions and 
> this might be causing some issues, now.

It wouldn't on a normal setup ... the experimental hash code is used only if 
you explicitly build it.

> I just looked it up, the changes were made on rev 36344.
> 
> There is another issue... -base allows UTF-8 strings, which will not be 
> hashed to the same UTF-16 value.

They are hashed to the same value as other strings; in base, hashing is 
computed on unicode codepoints.
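
To illustrate the point, here is a rough sketch of hashing by codepoint rather than by byte. It is not the code in -base: the mixing steps are the well-known one-at-a-time scheme picked purely for illustration, the names hash_utf8/hash_utf16 are made up, the UTF-8 input is assumed to be well formed, and feeding characters above U+FFFF as surrogate pairs is an assumption about how 16-bit units would be handled.

```c
#include <stddef.h>
#include <stdint.h>

/* One mixing step over a single 16-bit unit (illustrative choice of mixer). */
static uint32_t mix16(uint32_t h, uint16_t unit)
{
    h += unit;
    h += h << 10;
    h ^= h >> 6;
    return h;
}

/* Final avalanche of the one-at-a-time scheme. */
static uint32_t finish(uint32_t h)
{
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Hash a buffer of UTF-16 units directly. */
uint32_t hash_utf16(const uint16_t *u, size_t n)
{
    uint32_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = mix16(h, u[i]);
    return finish(h);
}

/* Hash UTF-8 by decoding each sequence and feeding the resulting 16-bit
 * units to the same mixer.  Assumes well-formed input; no validation. */
uint32_t hash_utf8(const unsigned char *s, size_t n)
{
    uint32_t h = 0;
    size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        uint32_t cp;
        size_t len;
        if (b < 0x80)      { cp = b;        len = 1; }
        else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
        else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
        else               { cp = b & 0x07; len = 4; }
        if (i + len > n)
            break;                      /* truncated tail: stop early */
        for (size_t k = 1; k < len; k++)
            cp = (cp << 6) | (s[i + k] & 0x3F);
        i += len;
        if (cp >= 0x10000) {            /* outside the BMP: surrogate pair */
            cp -= 0x10000;
            h = mix16(h, (uint16_t)(0xD800 | (cp >> 10)));
            h = mix16(h, (uint16_t)(0xDC00 | (cp & 0x3FF)));
        } else {
            h = mix16(h, (uint16_t)cp);
        }
    }
    return finish(h);
}
```

With this arrangement the UTF-8 bytes 68 C3 A9 6C 6C 6F and the UTF-16 units 0068 00E9 006C 006C 006F go through exactly the same sequence of mixing steps, so the storage encoding does not affect the result.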

>  In my opinion, allowing UTF-8 string literals is not a good idea and base 
> should revert back to Latin1 as the default C string encoding.

gnustep-base still uses latin1 as the default C string encoding.  The change 
with string literals is one from ascii to utf-8.

>  I'm actually debating adding a UTF-16 string literals configure option for 
> corebase.  I believe using UTF-16 internally is the only sane solution to 
> non-ASCII encodings.
> 
> I've tried experimenting with other hash functions that are not 
> one-at-a-time, but unfortunately have not found anything that will work on 
> both ASCII and Unicode strings consistently.  It would be really nice to be 
> able to work with 32- or 64-bit integers directly instead of 8- or 16-bit 
> characters.  If we could use UTF-16 across the board, this wouldn't be a problem.

base uses the 16bit codepoints to compute string hashes ... which is of course 
fine for ascii and utf-16 since ascii is a true subset of unicode and each 
ascii character therefore has exactly the same value as the corresponding 
utf-16 character.
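
As a small sketch of why no conversion step is needed for ascii: each byte can simply be widened to 16 bits as it is fed to the hash. The function names and the one-at-a-time style mixing below are illustrative assumptions, not what -base or corebase actually contains.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative mixer: one step per 16-bit unit, then a final scramble. */
static uint32_t step(uint32_t h, uint16_t unit)
{
    h += unit;
    h += h << 10;
    h ^= h >> 6;
    return h;
}

static uint32_t scramble(uint32_t h)
{
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Hash 16-bit units as stored. */
uint32_t hash_units(const uint16_t *u, size_t n)
{
    uint32_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = step(h, u[i]);
    return scramble(h);
}

/* Hash an ascii buffer: each 7-bit byte already equals its unicode
 * value, so widening it to 16 bits is the only "conversion" needed. */
uint32_t hash_ascii(const char *s, size_t n)
{
    uint32_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = step(h, (uint16_t)(unsigned char)s[i]);
    return scramble(h);
}
```

The shortcut only holds for bytes below 0x80. A byte with the high bit set is not a character on its own in utf-8, so it would have to be decoded before hashing.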




