bug-groff
[bug #64061] pdfpic.tmac requires non-standard sed feature


From: Deri James
Subject: [bug #64061] pdfpic.tmac requires non-standard sed feature
Date: Sat, 6 May 2023 16:30:35 -0400 (EDT)

Follow-up Comment #16, bug #64061 (project groff):

The switch from grep to sed, which seems to have caused these issues, may well
have been unnecessary: the example PDF provided in bug #58206 was in fact
invalid, which is why pdfinfo did not handle it correctly. The PDF reference
says:

"For text strings encoded in Unicode, the first two bytes must be 254 followed
by 255, representing the Unicode byte order marker, U+FEFF."

From the byte dump in that bug you can see that no such BOM is present. Please
also note that later versions of pdfinfo (checked with v. 0.26.4) now handle
the mangled title correctly (it must be recognising the alternating zero bytes
and "guessing" that it is UTF-16 with a missing BOM).
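
To make the distinction concrete, here is a minimal Python sketch of the BOM
check and of the kind of "guess" described above. It is only an illustration:
the function names are hypothetical, the zero-byte heuristic is an assumption
about what a tolerant reader might do (not pdfinfo's actual code), and Latin-1
is used as a rough stand-in for PDFDocEncoding.

```python
BOM_UTF16_BE = b"\xfe\xff"  # bytes 254, 255: U+FEFF in big-endian UTF-16


def has_utf16_bom(raw: bytes) -> bool:
    """True if the string starts with the BOM the PDF reference requires."""
    return raw.startswith(BOM_UTF16_BE)


def looks_like_bomless_utf16be(raw: bytes) -> bool:
    """Heuristic (assumption): even length and every high byte is zero.

    This only catches text limited to code points below U+0100, which is
    what a byte dump with "alternating zero bytes" looks like.
    """
    if len(raw) < 2 or len(raw) % 2:
        return False
    return all(b == 0 for b in raw[0::2])


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode a PDF text string, tolerating a missing BOM."""
    if has_utf16_bom(raw):
        return raw[2:].decode("utf-16-be")   # valid Unicode text string
    if looks_like_bomless_utf16be(raw):
        return raw.decode("utf-16-be")       # invalid, but guessable
    return raw.decode("latin-1")             # stand-in for PDFDocEncoding
```

With this sketch, a title stored as b"\x00T\x00i" (no BOM) fails
has_utf16_bom() yet still decodes to "Ti" via the heuristic, which matches the
behaviour seen in the newer pdfinfo.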

So, if it makes things any easier, we could go back to a simple grep: if it
fails, we know the PDF is non-standard and that an older version of pdfinfo is
in use.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?64061>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



