Re: [PATCH] implement --enable-encoding for UTF-8 info files
From: Eli Zaretskii
Subject: Re: [PATCH] implement --enable-encoding for UTF-8 info files
Date: Sat, 06 Oct 2007 14:49:09 +0200
> From: Bruno Haible <address@hidden>
> Date: Sat, 6 Oct 2007 13:37:24 +0200
> Cc: address@hidden
>
> Eli Zaretskii wrote:
> > > ! for (i = 0; i < sizeof (unicode_map) / sizeof (unicode_map[0]); i++)
> > > !   if (strcmp (html, unicode_map[i].html) == 0)
> > > !     return unicode_map[i].unicode;
> >
> > unicode_map[] has over 200 entries. I think linear search is not
> > really appropriate for such a long list.
>
> Here is a revised patch, using binary search.
Thanks!
> If even binary search is not fast enough, one can also use gperf for
> maximal speed lookup.
No, I think binary search is okay for a list like this.
> + /* List of HTML entities. */
> + static struct { const char *html; unsigned int unicode; } unicode_map[] = {
> +   /* Extracted from http://www.w3.org/TR/html401/sgml/entities.html through
> +      sed -n -e 's|<!ENTITY \([^ ][^ ]*\) *CDATA "[&]#\([0-9][0-9]*\);".*|  { "\1", \2 },|p'
I get empty output when I run this sed command on entities.html
downloaded with wget.  I think that's because the downloaded file uses
"&lt;" and "&amp;" instead of the literal "<" and "&" characters that
you seem to have in your copy of the file.
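If the downloaded entities.html indeed has the markup entity-encoded, one way to cope is to match the encoded forms in the sed pattern. This is a sketch only, assuming the file contains lines of the shape "&amp;lt;!ENTITY name CDATA "&amp;amp;#NNN;" ..."; the printf below just simulates one such line:

```shell
# Simulate one entity-encoded line from the downloaded file and extract
# the { "name", codepoint } pair by matching "&lt;" and "&amp;#"
# instead of the literal "<" and "&#".
printf '&lt;!ENTITY nbsp CDATA "&amp;#160;" -- no-break space --&gt;\n' |
sed -n -e 's|&lt;!ENTITY \([^ ][^ ]*\) *CDATA "&amp;#\([0-9][0-9]*\);".*|  { "\1", \2 },|p'
# prints:   { "nbsp", 160 },
```

Alternatively, piping the file through a decoding step first (turning &amp;lt; back into <) would let the original sed command run unmodified.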