Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}

guile-devel

From:	Mike Gran
Subject:	Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
Date:	Mon, 6 Sep 2010 09:28:03 -0700 (PDT)

> From: Andy Wingo <address@hidden>


[...]

> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now. Specifically:
> 
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
> 
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword
> 
> We probably also need the "n" variants.
> 

[...]

> So then we need, I think:
> 
>   scm_to_utf8_string
>   scm_to_utf16_string
>   scm_to_utf32_string
> 
> We need the "n" variants here too (perhaps more).

Some of this is already in the bytevectors module, but
perhaps not in a form that is easy to use from C source code.

It would be easy enough to do, but there is a failure case to
consider for scm_from_utf8_string: the C UTF-8 string could
contain incorrectly encoded data.

You could throw an encoding error, or you could replace the
bad UTF-8 with U+FFFD or a question mark.

The bytevector module's utf8->string always throws encoding-error.
Maybe that's good enough.

Otherwise, perhaps something like

scm_from_utf8_stringn (str, len, error_or_replace_strategy)
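
To make the error-or-replace choice concrete, here is a rough,
standalone sketch in plain C. It does not use libguile, and every name
in it (utf8_strategy, decode_one, utf8_decode_sketch) is made up for
illustration. It decodes into UTF-32 code points and either fails or
substitutes U+FFFD or '?' for each bad byte, which is only one of
several possible resynchronization policies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strategy values; none of these names exist in Guile. */
enum utf8_strategy { UTF8_ERROR, UTF8_REPLACE_FFFD, UTF8_REPLACE_QMARK };

/* Decode one sequence starting at s[0], with n >= 1 bytes available.
   Returns the number of bytes consumed and stores the code point in
   *cp, or returns 0 if the sequence is ill-formed. */
static size_t
decode_one (const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t need;
  uint32_t c, min;

  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { need = 2; c = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { need = 3; c = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { need = 4; c = s[0] & 0x07; min = 0x10000; }
  else return 0;                /* stray continuation or invalid lead byte */

  if (n < need)
    return 0;                   /* truncated sequence */
  for (size_t i = 1; i < need; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      c = (c << 6) | (s[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates, and values past U+10FFFF. */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;
  *cp = c;
  return need;
}

/* Decode len bytes of str into out, which must have room for len code
   points.  Returns the number of code points written, or -1 if the
   input is ill-formed and the strategy is UTF8_ERROR. */
static long
utf8_decode_sketch (const char *str, size_t len,
                    enum utf8_strategy strategy, uint32_t *out)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0;
  long count = 0;

  assert (out != NULL);
  while (i < len)
    {
      uint32_t cp;
      size_t used = decode_one (s + i, len - i, &cp);
      if (used == 0)
        {
          if (strategy == UTF8_ERROR)
            return -1;
          cp = (strategy == UTF8_REPLACE_FFFD) ? 0xFFFD : '?';
          used = 1;             /* skip one bad byte and resynchronize */
        }
      out[count++] = cp;
      i += used;
    }
  return count;
}
```

A real scm_from_utf8_stringn would presumably signal a Scheme error
rather than return -1, but the shape of the third argument would be
about the same.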

If you don't mind the overhead of calling the somewhat
heavyweight scm_{to,from}_stringn, these could be macros
or inline functions that wrap them.

-Mike
