bug-gnu-libiconv
[bug-gnu-libiconv] iconv fails on large Greek files


From: W. Wesley Groleau (伟思礼)
Subject: [bug-gnu-libiconv] iconv fails on large Greek files
Date: Sat, 1 Oct 2022 12:36:08 -0700

On a large file of Greek text with no known decomposed characters, iconv fails 
when converting from UTF8-MAC to UTF-8 (done “just in case”, to compose any 
decomposed letters). The reverse conversion does not fail (or at least it 
didn’t this time). The failure usually occurs after approximately 4000 bytes 
have been processed, but occasionally after approximately 8000. If only the 
line named in the error message and a few lines before and after it are 
processed, the conversion succeeds. I thought it might be related to a buffer 
size, but the exact number of bytes varies. Also, if the source file is NOT 
changed, the failure position varies (but is always close to 4000 or 8000).

The failure also occurs when the file does have known decomposed characters.
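
To check whether a file really contains no decomposed characters, and to 
relate any that exist to line and column positions, something like the 
following Python sketch might help. It is not part of iconv; the function 
names are made up for illustration. It assumes that "decomposed" means 
"contains Unicode combining marks", and it counts columns in code points, 
which may not match how iconv counts them. The NFC step is only an 
approximation of a UTF8-MAC to UTF-8 conversion, since Apple's decomposition 
is not identical to standard NFD.

```python
import unicodedata


def find_combining_marks(text):
    """Return (line, column, byte_offset) for each combining mark in text.

    Lines and columns are 1-based and counted in code points (an assumption;
    iconv may count columns differently). byte_offset is the 0-based offset
    of the mark in the UTF-8 encoding of text.
    """
    hits = []
    line, col, offset = 1, 1, 0
    for ch in text:
        # combining() returns a non-zero class for combining marks
        if unicodedata.combining(ch):
            hits.append((line, col, offset))
        offset += len(ch.encode("utf-8"))
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
    return hits


def to_nfc(text):
    """Compose decomposed letters using standard Unicode NFC normalization."""
    return unicodedata.normalize("NFC", text)
```

For a file such as el.txt, the text could be read with 
open("el.txt", encoding="utf-8", newline="").read() and passed to 
find_combining_marks. An empty result would support the claim that the file 
has no decomposed letters.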

WGroleau@MBP ~ % iconv --version
iconv (GNU libiconv 1.16)
WGroleau@MBP ~ % uname -a
Darwin MBP.local 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 
2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
WGroleau@MBP el % wc el.txt                                                 
     179     975    8621 el.txt
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el.txt > /tmp/tmp              

iconv: el.txt:90:16: cannot convert
WGroleau@MBP el % wc /tmp/tmp
      89     457    4093 /tmp/tmp
WGroleau@MBP el % iconv -f UTF-8 -t UTF8-MAC el.txt > /tmp/tmp
WGroleau@MBP el % wc /tmp/tmp
     179    1029    9537 /tmp/tmp
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el.txt > /tmp/tmp

iconv: el.txt:90:16: cannot convert
WGroleau@MBP el % wc el.txt
     179     975    8621 el.txt
WGroleau@MBP el % tail -$((179-90+2)) el.txt > el+.txt
WGroleau@MBP el % wc el+.txt
      90     522    4558 el+.txt
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 el+.txt > /tmp/tmp

iconv: el+.txt:84:36: cannot convert
WGroleau@MBP el % wc /tmp/tmp
      83     469    4093 /tmp/tmp
WGroleau@MBP el % iconv -f UTF-8 -t UTF8-MAC el.txt > /tmp/tmp
WGroleau@MBP el % iconv -f UTF8-MAC -t UTF-8 /tmp/tmp > temp.txt

iconv: /tmp/tmp:161:7: cannot convert
WGroleau@MBP el % wc temp.txt
     160     835    7390 temp.txt
WGroleau@MBP el % wc /tmp/tmp
     179    1029    9537 /tmp/tmp


-- 
Wes Groleau
伟思礼

You can make many plans,
but the Lᴏʀᴅ’s purpose will prevail.
http://biblehub.com/proverbs/19-21.htm



