guile-user

Re: Stupid module and pregexp questions


From: Tom Lord
Subject: Re: Stupid module and pregexp questions
Date: Tue, 6 May 2003 02:28:21 -0700 (PDT)


    > From: address@hidden

    > > Maybe that suggestion, to choose a minimalist, truly regular regular
    > > expression language -- then do the rest in scheme -- satisfies the
    > > spirit of "do as little as possible in C".

    > Hm. Technically, the idea sounds quite attractive, in a way. I
    > see several issues, though.

    >  - This leaves still the question open whether it'd be possible to
    >    have a regexp interface spec which could be fairly portable
    >    across Schemes. It might leave many things unspecified, but it
    >    would have to be powerful/specific enough that people dare to
    >    use it (when trying to write portable Scheme, that is).

POSIX regexps are your friend, in this regard.

A _subset_ of POSIX regexps is a minimalist, truly regular, regular
expression pattern language.  So, to the (slightly problematic) extent
that you can lay your hands on accurate POSIX regexp engines, you can
use such engines to implement the kind of Scheme regexp library I'm
suggesting.
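
As a rough illustration of that idea, here is a minimal C sketch that
leans on the standard POSIX <regex.h> engine for yes/no matching only.
The function name posix_matches and its -1/0/1 return convention are
made up for this example; a Scheme binding built on top of it is
assumed, not shown. Staying "truly regular" here means the caller keeps
to patterns without backreferences, and REG_NOSUB drops submatch
reporting so the engine only has to answer whether the text matches.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * The pattern is treated as a POSIX extended regular expression. */
int posix_matches(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    regex_t re;
    /* REG_NOSUB: we only want a match/no-match answer. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Everything richer than this (submatch extraction, substitution, a
friendlier pattern syntax) would then be layered on in Scheme.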



    >  - If there is a possibility to provide a ``high level'' interface
    >    resembling more traditional regexp languages, I see no problem.
    >    It's this ``high level'' interface I was talking about (after all
    >    it seems pregexp does *everything* in Scheme).

It's really foolish, performance-wise, to do _all_ of a regexp engine
in Scheme until you can scan a string through a DFA table at <20
instructions per character.   If some of the hard-core compilers are
up to that, I'm impressed -- but I'm quite sure none of the
interpreters are.   The interpreters will be off by no less than one
order of magnitude, and I'd expect two or three (powers of 10, here).
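
For a sense of what "scanning a string through a DFA table" looks like
in C, here is a small sketch. The table is hand-built for the pattern
a+b (one or more 'a' followed by a single 'b', matching the whole
string); the names dfa_init and dfa_match and the four-state layout are
assumptions made for this example, not taken from any particular
engine. The point is the inner loop: one table lookup per input
character, which is what keeps the per-character cost so low.

```c
#include <assert.h>
#include <stddef.h>

/* States: 0 = start, 1 = saw one or more 'a', 2 = saw a+b (accept),
 * 3 = dead state (no match possible from here). */
enum { N_STATES = 4, DEAD = 3 };

unsigned char dfa[N_STATES][256];
const unsigned char accepting[N_STATES] = { 0, 0, 1, 0 };

/* Fill the transition table: everything goes to DEAD except the
 * three transitions that the pattern a+b allows. */
void dfa_init(void)
{
    for (int s = 0; s < N_STATES; s++)
        for (int c = 0; c < 256; c++)
            dfa[s][c] = DEAD;
    dfa[0]['a'] = 1;
    dfa[1]['a'] = 1;
    dfa[1]['b'] = 2;
}

/* Returns 1 if the first `len` bytes of `s` match a+b, else 0. */
int dfa_match(const char *s, size_t len)
{
    assert(s != NULL);

    unsigned state = 0;
    for (size_t i = 0; i < len; i++)
        state = dfa[state][(unsigned char)s[i]];  /* one lookup per char */
    return accepting[state];
}
```

Call dfa_init() once before matching. A real engine would generate the
table from the pattern rather than write it by hand, but the scanning
loop has the same shape.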


    > > Another design dimension to consider: what are Guile's plans re:
    > > Unicode?

    > Uh, oh.

Tee hee.

No point talking regexps there until you get characters and strings
right.   I've actually mapped a bunch of that stuff out:  how to do
strings at the C level and chars and strings at the Scheme level.   I'm
starting to fear I'm getting too old to ever make it real, though.

-t



