Re: Failure to render utf8 characters when sourced


From: G. Branden Robinson
Subject: Re: Failure to render utf8 characters when sourced
Date: Fri, 26 Aug 2022 11:01:19 -0500

[looping in groff list, since bug-groff isn't really for discussion]

Hi Pippo,

At 2022-08-26T12:07:17+0800, Pippo Carmona wrote:
> Greetings!
> 
> Since the Refer csl cannot be changed,

I'm not precisely sure what you mean by the "csl" here, but I think I
grasp the contours of your problem.

> I used the .ds macro to supply my footnotes and bibliography with the
> formatted entries that fit my specification. However, if the .ds
> macros are sourced from a separate file using .so, some characters are
> rendered incorrectly. For example, é becomes Ã©. And when I set the
> macro in the same document, it is rendered correctly.
> 
> I have used -k and preconv to try to solve the issue, but it just
> doesn't work.  Is there a workaround, or is this a bug?

I think you are hitting a known limitation of preconv.  Here is some
language from the version of the man page in groff Git.

[[
   Limitations
       preconv cannot perform any transformation on input that it cannot
       see.  Examples include files that are interpolated by
       preprocessors that run subsequently, including soelim(1); files
       included by troff itself through “so” and similar requests; and
       string definitions passed to troff through its -d command‐line
       option.
]]

https://git.savannah.gnu.org/cgit/groff.git/tree/src/preproc/preconv/preconv.1.man
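The symptom itself can be reproduced outside groff.  The following is a
minimal Python sketch, assuming the sourced file is UTF-8 on disk and
that, because preconv never saw it, its bytes are interpreted one at a
time as ISO Latin-1.  The function name is hypothetical and exists only
for illustration.

```python
def misread_as_latin1(text):
    """Simulate UTF-8 bytes being interpreted as ISO Latin-1."""
    raw_bytes = text.encode("utf-8")    # e-acute is stored as the bytes C3 A9
    return raw_bytes.decode("latin-1")  # each byte becomes its own character


misread = misread_as_latin1("Gass\u00e9e")

# List the non-ASCII code points that result from the misreading.
code_points = [f"U+{ord(c):04X}" for c in misread if ord(c) > 127]
print(code_points)  # ['U+00C3', 'U+00A9']
```

U+00C3 and U+00A9 are "Ã" and "©", which matches the "Ã©" seen in the
rendered output.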

There are multiple workarounds.  Bjarni offered one.

At 2022-08-26T14:46:48+0000, bjarniig@vortex.is wrote:
> This looks like a case for bug #59442.  a) Use the option '-V' for
> "groff"  to see what the pipeline is  b) Reconstruct it to put
> "soelim" first.  Add the option '-e <encoding>' to the "preconv"
> command.
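
A rough sketch of that reordered pipeline, written in Python so the
stages are explicit.  The document name, the "-ms" macro package, and
the "-Tpdf" output device are assumptions for illustration; substitute
whatever "groff -V" reports for your own invocation.

```python
import subprocess


def build_pipeline(document, encoding="utf-8", groff_args=("-ms", "-Tpdf")):
    """Return the argv list for each stage, with soelim placed first."""
    return [
        ["soelim", document],            # expand .so requests before anything else
        ["preconv", "-e", encoding],     # now preconv sees the sourced text too
        ["groff", *groff_args],          # no -k or -s: those stages already ran
    ]


def run_pipeline(stages):
    """Run the stages in order, feeding each one's output to the next."""
    data = None
    for argv in stages:
        result = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    return data                          # bytes produced by the final stage
```

For example, run_pipeline(build_pipeline("thesis.ms")) would return the
formatted output as bytes, which can then be written to a file.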

Another approach would be to convert the file you're sourcing to be
groff-friendly input on disk.

So instead of a UTF-8 encoded file like this:

.ds Gassee Jean-Louis Gassée\"

You might have:

.ds Gassee Jean-Louis Gass\['e]e\"
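
If there are many such strings, the conversion could be scripted.  Here
is a minimal Python sketch, not a replacement for preconv: the function
names are hypothetical, and the glyph table is only a small subset of
the special character names listed in groff_char(7).  Anything not in
the table falls back to a Unicode code point escape.

```python
# Illustrative subset of groff special character names; extend as needed.
NAMED_GLYPHS = {
    "\u00e9": r"\['e]",   # e acute
    "\u00e8": r"\[`e]",   # e grave
    "\u00ea": r"\[^e]",   # e circumflex
    "\u00eb": r"\[:e]",   # e dieresis
    "\u00f1": r"\[~n]",   # n tilde
    "\u00e7": r"\[,c]",   # c cedilla
}


def to_groff_ascii(text):
    """Replace every non-ASCII character with a groff escape sequence."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                        # plain ASCII passes through
        elif ch in NAMED_GLYPHS:
            out.append(NAMED_GLYPHS[ch])          # readable named glyph
        else:
            out.append(f"\\[u{ord(ch):04X}]")     # fallback: code point escape
    return "".join(out)


def convert_file(src, dst):
    """Read a UTF-8 file and write an ASCII-only, groff-friendly copy."""
    with open(src, encoding="utf-8") as f:
        text = f.read()
    with open(dst, "w", encoding="ascii") as f:
        f.write(to_groff_ascii(text))
```

The converted copy is pure ASCII, so it can be sourced with "so" no
matter which encoding the main document is preprocessed with.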

Some day I'd like to extend preconv(1) to accept options to produce
input that is more user-friendly and maintainable than the Unicode code
point escape sequences that it produces now, which look like this:

.ds Gassee Jean-Louis Gass\[u00E9]e\"

You can read more about these issues in the groff_char(7) man page; I
recommend the version from groff Git; it has been considerably
expanded and clarified since the 1.22.4 release.

https://git.savannah.gnu.org/cgit/groff.git/tree/man/groff_char.7.man

Regards,
Branden
