[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 00/18] rewrite PSPP lexer
From: |
Ben Pfaff |
Subject: |
Re: [PATCH 00/18] rewrite PSPP lexer |
Date: |
Sun, 20 Mar 2011 09:45:04 -0700 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux) |
John Darrington <address@hidden> writes:
> I had a look through all these patches and don't see any major
> problems.
Thank you for looking it over.
> Minor issues:
> * I see that a lot of new files have the copyright year as 2010, 2011.
> As I understand it, the years listed are supposed to be those in which
> the changes were released (where "released" means made available to the
> public in any way), not necessarily the years in which the code was
> written. So if this is the first time you're made these changes public,
> then 2010 is inappropriate.
I don't know how public something has to be to be "public". I've
been posting updates to my intermediate work on about a daily
basis on the public Git repository at git://benpfaff.org/pspp. I
don't publicize that very much, but I do tell people about it
from time to time.
> * In light of recent discussions, the case insensitive hash should perhaps
> use the C locale toupper instead of the locale dependent one. On the other
> hand the particular use case might require locale dependency. - On the
> third hand all sorts of things might break, if an string is placed into a
> hash table in oen locale, and lookup is performed in another, all hell
> could break loose. I think this deserves further consideration.
I agree that case-insensitive comparison and hashing needs more
work. I decided that that could be left for later.
It seems likely to me that we should use the Unicode casefolding
operation to do case-insensitive comparison and hashing.
libunistring has functions for that.
> * In my opinion the test in i18n-test.c is tautological. It's not testing
> anything except that you didn't make any typos in the recode_string_len
> function. Maybe a short table of example strings with their expected
> lengths would
> be more useful?
I guess you mean that the new assertion, added in the commit
"i18n: New function recode_string_len()", is tautological? Yes,
it is; it doesn't test much. But there is already a (very) short
table of example strings in tests/libpspp/i18n.at, which looks at
the output of the i18n-test code. Adding an assertion to the
i18n-test.c seemed like a reasonable way to at least make sure
that that function gets coverage.
> * In the comment within encoding-guesser, there is the word "other;" which
> seems to be erroneous.
Thanks. I decided to just remove that part.
I'm going to do a little final testing and push it soon.
--
Ben Pfaff
http://benpfaff.org
- [PATCH 06/18] str: Rename ss_chomp() to ss_chomp_byte(), ds_chomp() to ds_chomp_byte()., (continued)
- [PATCH 06/18] str: Rename ss_chomp() to ss_chomp_byte(), ds_chomp() to ds_chomp_byte()., Ben Pfaff, 2011/03/19
- [PATCH 02/18] file-name: Do not make output files line-buffered in fn_open()., Ben Pfaff, 2011/03/19
- [PATCH 12/18] identifier: Rename token_type_to_string() and make a new version., Ben Pfaff, 2011/03/19
- [PATCH 08/18] hash-functions: New function hash_case_bytes()., Ben Pfaff, 2011/03/19
- [PATCH 11/18] i18n: New functions for truncating strings in an arbitrary encoding., Ben Pfaff, 2011/03/19
- [PATCH 17/18] scan: New library for high-level PSPP syntax lexical analysis., Ben Pfaff, 2011/03/19
- [PATCH 15/18] u8-istream: New library for reading a text file and recoding to UTF-8., Ben Pfaff, 2011/03/19
- [PATCH 16/18] segment: New library for low-level phase of lexical syntax analysis., Ben Pfaff, 2011/03/19
- [PATCH 03/18] sys-file-reader: Refactor to clean up character encoding support., Ben Pfaff, 2011/03/19
- Re: [PATCH 00/18] rewrite PSPP lexer, John Darrington, 2011/03/20
- Re: [PATCH 00/18] rewrite PSPP lexer,
Ben Pfaff <=
- Re: [PATCH 00/18] rewrite PSPP lexer, John Darrington, 2011/03/22