Re: [PATCH] Enable utf8->string to take a range
From: Maxime Devos
Subject: Re: [PATCH] Enable utf8->string to take a range
Date: Wed, 09 Mar 2022 14:24:14 +0100
User-agent: Evolution 3.38.3-1
Vijay Marupudi wrote on Fri 21-01-2022 at 20:21 [-0500]:
> +SCM_DEFINE (scm_utf8_range_to_string, "utf8->string",
> + 1, 2, 0,
> + (SCM utf, SCM start, SCM end),
> + "Return a newly allocate string that contains from the
> UTF-8-"
> + "encoded contents of bytevector @var{utf}.")
This docstring is incorrect, since the nul character is accepted even
though UTF-8 proper does not allow encoding the nul character -- UTF-8
with an encoding of the nul character is sometimes called ‘modified
UTF-8’. The distinction is sometimes relevant, e.g. the GNS
specification asks for labels to be encoded in UTF-8, and according to
the spec writers, that implies that nul characters are forbidden.
As such, I cannot rely on 'utf8->string' to verify that there aren't
any nul characters.
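
In the meantime, a caller that needs to reject nul characters would have
to check the bytes itself before decoding. A minimal sketch in C,
assuming the nul character shows up in the input as a single zero byte;
'bytes_contain_nul' is a hypothetical helper, not something Guile
provides:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Guile: report whether the byte
   range [buf, buf + len) contains a zero byte.  Every byte of a
   multi-byte UTF-8 sequence has its high bit set, so a zero byte can
   only stand for the nul character.  */
static bool
bytes_contain_nul (const unsigned char *buf, size_t len)
{
  /* A NULL pointer is only acceptable for an empty range.  */
  assert (buf != NULL || len == 0);
  if (len == 0)
    return false;
  return memchr (buf, 0, len) != NULL;
}
```

The check would run on the bytevector contents (or on the requested
range of them) before the bytes are handed to 'utf8->string'.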
Greetings,
Maxime.