bug-gnulib

Re: new module 'regex-quote'


From: Reuben Thomas
Subject: Re: new module 'regex-quote'
Date: Tue, 21 Sep 2010 10:31:38 +0100

On 21 September 2010 01:07, Bruno Haible <address@hidden> wrote:
> Reuben,
>
>> Heh. My point precisely: 3 functions and 50 lines versus 1 flag and 5
>> lines (RE_PLAIN) to solve the same problem
>
> I agree that if we had the opportunity to invent regex APIs from scratch
> now, all 4 syntaxes (literals, wildcards, basic regular expression, extended
> regular expression) would be worth supporting equally.
>
> But the fact is that POSIX standardizes the regex API, and therefore there
> is a border between "in glibc" and "outside glibc". Functionality in glibc
> is available at no cost; functionality outside glibc requires additional
> link options and increased startup time or a 50KB bigger executable.

Equally, libc APIs such as the crappy standard string handling
functions waste my time on a daily basis, whereas APIs from other
libraries save it. C does make linking harder than newer languages,
but link options and increased startup time and/or a bigger executable
have never put me off (the penalties are tiny on modern machines),
although I do spend time considering licensing and portability of
libraries I use. Indeed, libc's general weakness in so many areas
means I consider third-party libraries much more often than in other
languages. (glibc is at least better than many libcs in this respect,
by covering a lot more ground, though it is a pity that it carries so
much non-standard crud just for backwards binary compatibility.) In
this particular case, the point is somewhat moot: GNU regex is still
not synced with glibc, many applications continue to use internal
copies unconditionally (though, thanks to hard work by GNU developers,
most GNU programs now use gnulib), and tons of other applications use
other regex libraries altogether.

So, I am not really making things much worse by proposing extensions
to the POSIX API, and indeed I am leaving the door open to make things
better: the chances of any other C regex API ever being standardised
are practically zero, so applications using non-POSIX APIs are always
going to suffer the penalty of an external library; whereas API and
ABI-compatible extensions at least have a chance of one day being
added to the standard.

Not to mention the big picture: the vast majority of C apps these days
use either POSIX or PCRE regex APIs. On most GNU systems, there will
today still be plenty of apps in which POSIX regexes are compiled in
statically via GNU regex (old glibcs and/or old apps). My suggestions
aim for an outcome in which, in a few years, things are much the
same but the application code is cleaner (and there are not lots of
statically-linked "quote_regexp" functions). And then, a few years
later, the changes get into glibc and the statically linked copies
disappear. Not to mention that plenty of mature programs won't want
any of my extensions, and therefore will not need statically-linked
POSIX regex.
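
For concreteness, the kind of per-application "quote_regexp" helper
meant above looks roughly like the sketch below. It is only an
illustration, not code from gnulib or glibc: the names quote_bre and
contains_literal are hypothetical, it handles POSIX basic regular
expressions only, and it uses nothing beyond the standard regcomp and
regexec calls. The proposed RE_PLAIN flag is not used, since it is not
part of any existing API.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of `literal` with every
   POSIX basic-regex special character backslash-escaped, or NULL on
   allocation failure.  The caller frees the result. */
static char *quote_bre(const char *literal)
{
    assert(literal != NULL);
    size_t len = strlen(literal);
    char *out = malloc(2 * len + 1);   /* worst case: every char escaped */
    if (out == NULL)
        return NULL;
    char *p = out;
    for (const char *s = literal; *s != '\0'; s++) {
        /* Characters that can be special in a basic regex. */
        if (strchr(".[\\*^$", *s) != NULL)
            *p++ = '\\';
        *p++ = *s;
    }
    *p = '\0';
    return out;
}

/* Hypothetical caller: return 1 if `literal` occurs verbatim in `text`,
   0 if it does not, and -1 on error. */
static int contains_literal(const char *text, const char *literal)
{
    char *pattern = quote_bre(literal);
    if (pattern == NULL)
        return -1;
    regex_t re;
    /* No REG_EXTENDED, so the pattern is compiled as a basic regex. */
    int rc = regcomp(&re, pattern, REG_NOSUB);
    free(pattern);
    if (rc != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

Every application that wants literal matching through the POSIX API
ends up carrying something like this, plus a second variant if it also
uses extended syntax; that is the duplication a single compile-time
flag would remove.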

The fundamental point is this: the two scenarios are pretty much equal
with respect to time and space overhead, but evolving standard APIs
over time wins hands down when it comes to improving things for
application programmers, and reducing application code. Developer time
is a much more important resource than machine cycles or bytes. That
is not to disrespect users, because the time that developers save on
coding they can spend on useful optimisation.

(I am also rather alarmed at the way that gnulib seems to be growing
without bound; it should be making bits of itself redundant just as
fast as it can, so that it's the glue between the near past and the
near future, not just another big ball of wax that keeps accreting.)

-- 
http://rrt.sc3d.org


