guile-user

Re: guile can't find a chinese named file


From: Andy Wingo
Subject: Re: guile can't find a chinese named file
Date: Mon, 27 Feb 2017 12:02:12 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)

Hello,

On Mon 27 Feb 2017 10:10, David Kastrup <address@hidden> writes:

> Andy Wingo <address@hidden> writes:
>
>> Legacy programs don't use codepoints >255.
>
> Sort of a moot point when Guile makes the decision to interpret external
> files with codepoints >255.  Not all data processed by a "legacy
> program" originates from inside the program.

Not a moot point at all.  If you want to decode/encode characters
to/from ports, you have to call Guile's setlocale function; that's a
choice you can make.  In Guile 1.8 and earlier you would get ISO-8859-1,
one character per byte, regardless, so there is no significant change here.
If you would prefer to continue to use this encoding with every port in
your program, you can do that.
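
To make the difference concrete, here is a minimal sketch of the same two bytes read through a port under two encodings. The helper `first-char-code` and the sample bytes are made up for illustration; it assumes a Guile recent enough to provide `(ice-9 binary-ports)`.

```scheme
;; Sketch: the same two bytes decoded under two port encodings.
(use-modules (ice-9 binary-ports))

;; Hypothetical helper: open a port over BYTES, set its encoding, and
;; return the code point of the first character read from it.
(define (first-char-code bytes encoding)
  (let ((port (open-bytevector-input-port bytes)))
    (set-port-encoding! port encoding)
    (char->integer (read-char port))))

;; U+00E9 (e with acute accent) encoded as UTF-8.
(define e-acute-utf8 #vu8(#xC3 #xA9))

;; Latin-1 maps each byte to one character, so the first byte #xC3
;; comes back as code point 195.
(define as-latin1 (first-char-code e-acute-utf8 "ISO-8859-1"))

;; UTF-8 combines both bytes into the single code point 233.
(define as-utf8 (first-char-code e-acute-utf8 "UTF-8"))
```

To apply one encoding to every port opened afterwards, rather than per port, setting the `%default-port-encoding` fluid should have the same effect.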

>> In Scheme, strings are sequences of characters.  Encoding and decoding
>> is only needed when going to and from bytes.
>
> A string port is strictly passing characters to characters completely
> inside of Guile

This is an implementation concern.  May I remind you and the list that
we have kindly asked you not to post to guile-devel, because
implementation discussions with you are not productive?  I'm not
interested in having the same discussions on another list.  Thanks.

>>> PostScript files are usually encoded in Latin-1 with occasional UCS-16
>>> passages.  Reading and writing and copying such files byte-correctly
>>> while trying to actually parse their contents is not feasible with
>>> Guile.
>>
>> Works perfectly well.  The web server for example reads the request as
>> Latin-1 and the body as something else.  Just re-set the port encoding
>> and there you go.
>
> Reading and writing and copying cannot always afford to _parse_ and
> switch encodings based on the content.  It needs to work even when you
> don't do that.

If you would like to read just the bytes and parse yourself, you can do
that too.  Re-setting the encoding while parsing from a port can often
be more efficient though, as you don't have to read all of the data and
then parse it all; you can parse incrementally.
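
As a rough sketch of what re-setting the encoding mid-stream looks like: the input below is made up (an ASCII header line followed by two Chinese characters in UTF-8), and `read-header-and-body` is a hypothetical name, not anything from the web server code.

```scheme
;; Sketch: read a header as Latin-1, then switch the same port to UTF-8.
(use-modules (ice-9 binary-ports) (ice-9 rdelim) (rnrs bytevectors))

;; Made-up input: an ASCII header line, then U+6587 U+4EF6 as UTF-8 bytes.
(define message
  (u8-list->bytevector
   (append (bytevector->u8-list (string->utf8 "charset: utf-8\n"))
           '(#xE6 #x96 #x87 #xE4 #xBB #xB6))))

;; Hypothetical helper: returns (header . body), decoding each part with
;; its own encoding without reading the whole input up front.
(define (read-header-and-body port)
  (set-port-encoding! port "ISO-8859-1")
  (let ((header (read-line port)))
    ;; Everything after the header line is decoded as UTF-8.
    (set-port-encoding! port "UTF-8")
    (cons header (read-line port))))

(define result
  (read-header-and-body (open-bytevector-input-port message)))
;; (car result) is the header text; (cdr result) is a two-character string.
```

In a real parser the second encoding would normally be taken from the header itself rather than hard-coded as it is here.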

>> String ports have nothing to do with the discussion AFAIU.  (Ports in
>> Guile are sequences of bytes also.  They may be accessed using
>> textual interfaces as well.)
>
> They can _only_ be accessed using textual interfaces.  They are
> character-in/character-out.

You misunderstand what Guile ports are.  I seriously invite you to read
the fine manual, specifically the first four subsections of this node:

  
https://www.gnu.org/software/guile/docs/master/guile.html/Input-and-Output.html
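
For a quick illustration of the point, here is a small sketch that uses the binary and the textual interface on one and the same port. The sample bytes are made up, and it assumes `(ice-9 binary-ports)` is available.

```scheme
;; Sketch: one port, read first as bytes and then as characters.
(use-modules (ice-9 binary-ports))

;; #x41 is "A"; #xC3 #xA9 is U+00E9 encoded as UTF-8.
(define port (open-bytevector-input-port #vu8(#x41 #xC3 #xA9)))
(set-port-encoding! port "UTF-8")

;; Binary interface: consumes exactly one byte (65).
(define first-byte (get-u8 port))

;; Textual interface on the same port: decodes the next two bytes
;; into the single character U+00E9.
(define next-char (read-char port))

;; Nothing is left afterwards, so the next read yields the EOF object.
(define at-end (eof-object? (get-u8 port)))
```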

Thanks,

Andy


