base64-decode-region inserts carriage-returns
From: Eric Hanchrow
Subject: base64-decode-region inserts carriage-returns
Date: 08 Jun 2002 13:42:42 -0700
In GNU Emacs 21.2.1 (i386-debian-linux-gnu, X toolkit, Xaw3d scroll bars)
of 2002-05-18 on offby1, modified by Debian
configured using `configure i386-debian-linux-gnu --prefix=/usr
--sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib
--infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes --with-x=yes
--with-x-toolkit=athena --without-gif'
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: nil
locale-coding-system: nil
default-enable-multibyte-characters: nil
Using Bash, create a binary file containing eight bytes in two lines:
bash$ echo -n $'\001\002\003\n\001\002\003\n' > /tmp/bin
Double-check that the file contains what we think it does:
bash$ od -c /tmp/bin
you'll see 0000000 001 002 003 \n 001 002 003 \n
Start Emacs with -q --no-site-file.
Visit that file in Emacs:
M-x find-file-literally RET /tmp/bin RET
Base64-encode it:
C-x h M-x base64-encode-region RET
Put a carriage-return-linefeed pair at the end of the single line:
M-> C-q C-m RET
Save the encoded version:
C-x C-w bin.b64 RET
Revisit the file, thus setting the buffer to use the MS-DOS line
ending convention:
C-x C-v RET
Base64-decode the file:
C-x h M-x base64-decode-region RET
Save the decoded version to a different file for comparison with the
original:
C-x C-w bin.again RET
Now examine the newly-saved version with od back at the shell:
bash$ od -c /tmp/bin.again
you'll now see 0000000 001 002 003 \r \n 001 002 003 \r \n
Thus the binary file has had some carriage-returns inserted into it,
which is a Bad Thing, since those carriage-returns were not present in
the encoded data.
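For anyone who would rather not compare od listings by eye, here is a minimal sketch of the same check using wc and cmp. It does not run Emacs: both files are recreated from the two od listings above, in a scratch directory (the `workdir` variable and the file names under it are only for illustration).

```shell
# Scratch directory so the real /tmp/bin and /tmp/bin.again are left alone.
workdir=$(mktemp -d)

# The original eight bytes, and the ten bytes shown in the second od listing.
printf '\001\002\003\n\001\002\003\n' > "$workdir/bin"
printf '\001\002\003\r\n\001\002\003\r\n' > "$workdir/bin.again"

# Byte counts; tr strips the padding some wc implementations add.
orig_size=$(wc -c < "$workdir/bin" | tr -d ' ')
again_size=$(wc -c < "$workdir/bin.again" | tr -d ' ')
echo "original: $orig_size bytes, round-tripped: $again_size bytes"

# cmp -s is silent and exits non-zero when the files differ.
if cmp -s "$workdir/bin" "$workdir/bin.again"; then
    verdict=identical
else
    verdict=different
fi
echo "files are $verdict"
```

Against the files actually written by Emacs, the same two commands would be `wc -c /tmp/bin /tmp/bin.again` and `cmp /tmp/bin /tmp/bin.again`.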
RFC 2045 says both
All line breaks or other characters not found in Table 1 must
be ignored by decoding software.
and
Any characters outside of the base64 alphabet are to be
ignored in base64-encoded data.
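As a rough illustration of what those two sentences imply, the sketch below encodes the same eight bytes, appends a carriage-return-linefeed pair as in the recipe above, and then decodes after discarding every character outside the base64 alphabet. It assumes a `base64` command that accepts `-d` (GNU coreutils does); the `workdir` variable and file names are only for illustration. Nothing here involves Emacs; it only shows the result a decoder following the RFC would be expected to produce.

```shell
workdir=$(mktemp -d)

# Encode the eight test bytes, then add a CR LF pair after the encoded line.
printf '\001\002\003\n\001\002\003\n' | base64 > "$workdir/bin.b64"
printf '\r\n' >> "$workdir/bin.b64"

# Drop everything that is not in the base64 alphabet (or the '=' pad),
# which is what RFC 2045 asks decoders to ignore, then decode.
tr -cd 'A-Za-z0-9+/=' < "$workdir/bin.b64" | base64 -d > "$workdir/bin.decoded"

# Hex dump of the decoded bytes with all whitespace removed.
decoded_hex=$(od -An -tx1 < "$workdir/bin.decoded" | tr -d ' \n')
echo "$decoded_hex"
```

The dump contains no 0d bytes: the carriage return that followed the encoded line is ignored, not carried into the decoded data.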
If this is indeed a bug (as opposed to my misunderstanding how
base64-decode-region is supposed to work) then a possible fix would be
to have base64-decode-region, after it's done its work, do
(set-buffer-file-coding-system 'raw-text-unix) or something similar.
--
PGP Fingerprint: 3E7B A3F3 96CA 8958 ACC5 C8BD 6337 0041 C01C 5276