bug-gnu-emacs


From: Mark Lillibridge
Subject: bug#13328: 24.2; Rmail does not properly decode MIME messages containing "From " lines or save such attachments correctly
Date: Mon, 10 Oct 2022 11:00:44 -0700

>  From: Lars Ingebrigtsen <larsi@gnus.org>
>  To: Eli Zaretskii <eliz@gnu.org>
>  Cc: mdl@alum.mit.edu,  13328@debbugs.gnu.org
>  Subject: Re: bug#13328: 24.2; Rmail does not properly decode MIME messages
>   containing "From " lines or save such attachments correctly
>  Date: Tue, 08 Dec 2020 17:18:36 +0100
>  In-Reply-To: <83lfe8714d.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 08 Dec
>       2020 17:45:22 +0200")
>  
>  Eli Zaretskii <eliz@gnu.org> writes:
>  
>  > AFAIK, mbox format requires that every new message begins with a line
>  > that starts with "From " (sorry, I forgot the space in my previous
>  > message).  See, for example, this page:
>  >
>  >   https://www.loc.gov/preservation/digital/formats/fdd/fdd000383.shtml
>  
>  The mbox format is more restrictive than just that.
>  `message-unix-mail-delimiter' is a regexp that matches these lines.

    First, I wanted to report that this bug is still present in Emacs
28.1.  Start Emacs 28.1 with -q, visit the gunzipped version of the
attachment to this message (it should be the same mbox as the one quoted
in the original report), then do M-x rmail-mode.  You will see the last 2
messages with extra >'s on the From lines, and the extra >'s persist
when you save the attachments.


    Back to the discussion.  I think the confusion here is that there
are multiple levels of quoting.  At the outermost level there is quoting
of messages to store them in the mbox file.  This is what adds the >'s
in front of From lines.  Without this quoting, a single message
containing From lines would be decoded as multiple messages.  

    Because messages in mbox files are separated by a blank line
followed by a line starting with "From ", the traditional Unix quoting
for mbox (e.g., fetchmail, getmail) adds a > in front of any line
matching the regexp "^>*From ".  When extracting the messages, unquoting
must be done by removing one > from each line in the message matching
that regular expression.
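
    To make the rule concrete, here is a minimal Python sketch of the
quoting and unquoting steps.  It is only an illustration, not Rmail's or
fetchmail's actual code; the names mbox_quote and mbox_unquote are made
up for this example.  Note one assumption: unquoting matches "^>+From "
rather than "^>*From ", since a stored body line with no > at all has
nothing to remove.

```python
import re

# Any body line that looks like a separator, possibly already quoted.
NEEDS_QUOTING = re.compile(r"^>*From ")
# Any stored body line that carries at least one quoting '>'.
IS_QUOTED = re.compile(r"^>+From ")

def mbox_quote(lines):
    """Add one '>' in front of every line matching ^>*From ."""
    return [">" + line if NEEDS_QUOTING.match(line) else line
            for line in lines]

def mbox_unquote(lines):
    """Remove exactly one '>' from every line matching ^>+From ."""
    return [line[1:] if IS_QUOTED.match(line) else line
            for line in lines]

body = ["Hello", "From here on it gets tricky", ">From a quoted reply", "Bye"]
stored = mbox_quote(body)
restored = mbox_unquote(stored)
```

Here stored holds ">From here on it gets tricky" and ">>From a quoted
reply", and restored is identical to body, so the round trip loses
nothing.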

    Why can there be multiple >'s?  Because we have to safely quote message
lines like ">From ".
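
    A small Python sketch of why this matters (illustrative only; both
function names are hypothetical).  If only bare "From " lines were
quoted, a body line "From me" and a body line ">From me" would be stored
identically, and the reader could not tell which one was meant.
Quoting every line matching "^>*From " keeps them distinct.

```python
import re

def quote_bare_only(line):
    # Hypothetical lossy scheme: quote only lines starting with "From ".
    return ">" + line if line.startswith("From ") else line

def quote_all_levels(line):
    # Scheme described above: quote any line matching ^>*From .
    return ">" + line if re.match(r">*From ", line) else line

first, second = "From me", ">From me"

# Lossy scheme: both lines are stored as ">From me".
lossy_collision = quote_bare_only(first) == quote_bare_only(second)

# Multi-level scheme: stored as ">From me" and ">>From me".
safe_collision = quote_all_levels(first) == quote_all_levels(second)
```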

    There are some other formats of mbox that do not do this quoting in
a data-preserving manner.  They should not be relevant except when
importing mbox files (as opposed to receiving messages) from other
mailing systems.

   MIME may additionally quote parts of its messages internally as well.


    Note that the mbox quoting is not part of the messages and is never
supposed to be seen by the user or exposed when sending messages --
ideally, it should be removed whenever a message is read by the mbox
reader component.  Rmail, unfortunately, does not do this properly
because of bugs.  What should happen is that the current message is
unquoted and copied to a separate view buffer, which is then displayed,
possibly after decoding various MIME or other quoting (e.g., base64).
What I'm guessing is happening for MIME is that the region of the mbox
containing the message body is being passed directly to the MIME
decoder.  This likely explains bug #10080 as well, as the blank line of
the message separator is not being correctly excluded.
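
    The intended order of operations can be sketched in Python as
follows.  This is not how Rmail is implemented; it uses Python's
standard email module as a stand-in for the MIME decoder, and
extract_message and the sample entry are invented for the example.  The
point is only that the separator line, its trailing blank line, and the
mbox quoting are all removed before the MIME layer sees the text.

```python
import re
from email import message_from_string

IS_QUOTED = re.compile(r"^>+From ")

def extract_message(entry):
    """Turn one raw mbox entry into the message text the sender wrote."""
    lines = entry.splitlines()
    # Drop the "From " separator line that starts the entry.
    lines = lines[1:]
    # Drop the blank line that belongs to the separator of the next entry.
    if lines and lines[-1] == "":
        lines = lines[:-1]
    # Undo the mbox quoting: remove one '>' from each quoted From line.
    lines = [l[1:] if IS_QUOTED.match(l) else l for l in lines]
    return "\n".join(lines) + "\n"

# Made-up mbox entry for illustration.
entry = (
    "From sender@example.org Mon Oct 10 11:00:44 2022\n"
    "Subject: test\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello\n"
    ">From the start\n"
    "\n"
)

# Only the unquoted copy is handed to the MIME decoder.
message = message_from_string(extract_message(entry))
payload = message.get_payload()
```

With this ordering the payload contains "From the start" without the
extra >, and without the separator's blank line at the end.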

   You can look at https://en.wikipedia.org/wiki/Mbox for more on mbox
formats; we are talking about mboxrd format in that article's terms.

- Mark

Attachment: mbox_13328.gz
Description: application/gzip

