guile-user

Re: string-ports issue on Windows


From: Christopher Lam
Subject: Re: string-ports issue on Windows
Date: Tue, 16 Apr 2019 23:26:49 +0000

Thank you Mark

The problem is rather obscure and may have been fixed in 2.2.

I've taken the reins of handling the Guile code in GnuCash. For reasons
I can't fathom, the Windows build includes Guile 2.0.14 rather than
Guile 2.2. I've checked NEWS and there was a change in SRFI-6 string-ports
to make them Unicode-capable in 2.0.6.

Bear in mind that the majority of the string code in GnuCash handles
Unicode just fine. However, some currencies, e.g. TRY
(https://en.wikipedia.org/wiki/Turkish_lira), need extended Unicode and
are misprinted as ? in the reports.

I've delved down and figure there are only 2 offending functions:
(format #f "~a bla" str) and (with-output-to-string), as described above.
After much experimentation I can fix it by changing (format) to
(string-append), and changing (with-output-to-string) to (open-string-port)
and importing srfi-6 as described in the original post, and these fix the
TRY symbol display. Hence my suspicion that string-ports on Windows are
munging Unicode. To try to elucidate this I've also tried removing
(setlocale LC_ALL "") and dumping (locale-encoding), which is "CP1252".
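
To make the comparison above concrete, here is a minimal sketch of the
variants. It assumes the SRFI-6 procedures meant are open-output-string and
get-output-string; the helper names are hypothetical and not taken from
GnuCash. It only builds the same string in four ways and reports whether
the results agree, it does not claim what any given build prints.

```scheme
(use-modules (srfi srfi-6))   ; open-output-string, get-output-string

;; U+20BA TURKISH LIRA SIGN, a character outside CP1252.
(define lira (string #\x20BA))

;; Variant 1: format with a #f destination (goes through a string port).
(define (label-with-format str)
  (format #f "~a bla" str))

;; Variant 2: with-output-to-string, capturing the current output port.
(define (label-with-output-to-string str)
  (with-output-to-string
    (lambda ()
      (display str)
      (display " bla"))))

;; Variant 3: plain string-append, no port involved.
(define (label-with-append str)
  (string-append str " bla"))

;; Variant 4: an explicit SRFI-6 output string port.
(define (label-with-string-port str)
  (let ((port (open-output-string)))
    (display str port)
    (display " bla" port)
    (get-output-string port)))

;; Compare every variant against the port-free string-append result.
(define (report name result)
  (display name)
  (display (if (string=? result (label-with-append lira))
               ": matches string-append"
               ": DIFFERS from string-append"))
  (newline))

(report "format" (label-with-format lira))
(report "with-output-to-string" (label-with-output-to-string lira))
(report "open-output-string" (label-with-string-port lira))
```

If a variant reports a difference on one build and not on another, that
would narrow the problem down to how that build's string ports encode
characters.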

There are also other bits where UTF8 is being interpreted as CP1252 but
these are outside the scope of this post.

So, I'm rather late in this game (I started diving into Scheme 18 months
ago) and have probably missed many controversial changes in the past
years, but the issue above seems weird to me: why is the Windows port
munging Unicode? :)

On Tue, 16 Apr 2019 at 17:29, Mark H Weaver <address@hidden> wrote:

> Hi Christopher,
>
> Christopher Lam <address@hidden> writes:
>
> > I'm struggling with string-ports on Windows.
> >
> > Last para of
> > https://www.gnu.org/software/guile/manual/html_node/String-Ports.html
> > "With string ports, the port-encoding is treated differently than other
> > types of ports. When string ports are created, they do not inherit a
> > character encoding from the current locale. They are given a default
> > locale that allows them to handle all valid string characters."
> >
> > This causes a string-sanitize function to not run correctly in Windows.
> > (locale-encoding) says "CP1252" no matter what LANG or setlocale I try.
> >
> > The use case is to sanitize string for html, but on Windows it munges
> > extended-unicode.
>
> Can you explain more fully what the problem is?  I know a fair amount
> about Unicode, but my knowledge of Windows is extremely weak.
>
> What exactly is "extended-unicode" in this context?  References welcome.
>
>       Thanks,
>         Mark
>

