emacs-orgmode
From: Mathias Bauer
Subject: Re: [O] Bug: text export and multi-word link descriptions with line breaks
Date: Thu, 3 Apr 2014 18:30:24 +0200

Hello Nicolas,

* Nicolas Goaziou wrote on 2014-04-03 at 17:25 (+0200):

> Mathias Bauer <address@hidden> writes:
>
> > I just stumbled over Org's plain text export and how it works on
> > links with descriptions consisting of multiple words and line
> > breaks between them.  I'm running Org stable version 8.2.5h.
> >
> > Org source (spaces at the end of line 1 and 2 don't matter):
> >
> > --------------------snip--------------------
> > "OpenPGP Message Format" ([[https://tools.ietf.org/html/rfc4880][RFC
> > 4880]] which obsoletes [[https://tools.ietf.org/html/rfc1991][RFC
> > 1991]] and [[https://tools.ietf.org/html/rfc2440][RFC 2440]])...
> > ...
> > foo [[https://tools.ietf.org/html/rfc4880][RFC 4880]] bar
> > baz [[https://tools.ietf.org/html/rfc1991][RFC 1991]] foo
> > bar [[https://tools.ietf.org/html/rfc2440][RFC 2440]] baz
> > --------------------snip--------------------
> >
> > Text export result:
> >
> > --------------------snip--------------------
> > "OpenPGP Message Format" ([RFC 4880] which obsoletes [RFC 1991] and [RFC
> > 2440])...  ...  foo [RFC 4880] bar baz [RFC 1991] foo bar [RFC 2440] baz
> >
> >
> > [RFC 4880] https://tools.ietf.org/html/rfc4880
> >
> > [RFC 1991] https://tools.ietf.org/html/rfc1991
> >
> > [RFC 2440] https://tools.ietf.org/html/rfc2440
> >
> > [RFC 4880] https://tools.ietf.org/html/rfc4880
> >
> > [RFC 1991] https://tools.ietf.org/html/rfc1991
> > --------------------snip--------------------
> >
> > These multiple references look quite bad.  Is it possible to
> > "normalize" the descriptions in some way *before* checking
> > them for uniqueness and output them thereafter?
>
> Could you be more explicit? What does look quite bad? What did
> you expect instead? How is related to line breaks in the
> descriptions?

OK, let's go into more detail.  See the Org source text:

1. There are three links and each of them appears twice.  Both
   occurrences of each link have the same target.

2. Each of the two "[...][RFC 2440]" links appears on a single
   line; one occurrence each of "[...][RFC 4880]" and
   "[...][RFC 1991]" has a newline in its description.  The
   source therefore contains "[...][RFC\n4880]" and
   "[...][RFC 4880]" and, respectively, "[...][RFC\n1991]" and
   "[...][RFC 1991]".

So, now let's examine the Org text export:

The final reference part - the five links below the paragraph -
lists [RFC 4880] and [RFC 1991] twice each, but [RFC 2440] only
once.

This is, at least, inconsistent.

The point is that Org apparently considers "[...][RFC 4880]" and
"[...][RFC\n4880]" to be two different links internally and
lists both of them in the reference part.  For this listing, the
\n is removed.  This is what I called "normalization" in my
first post.

Human eyes, however, won't see any difference between these two
forms, and readers will be surprised by the duplicates.

I expect Org to do the following steps while parsing the source
text:

1. "Normalize" or clean the link description, i.e. remove any
   newlines and leading and trailing spaces, and replace any
   occurrences of "[ \t]+" in the interior by a single space.
   (To be done.)

2. Check the tuple (description,target) for duplicates and drop
   them.  (Seems ok to me.)

3. Below the paragraph list the tuples as "[description] target"
   in the order of occurrence in the original text.  (Also seems
   ok to me.)
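
To make the three steps concrete, here is a rough sketch in
Python.  It is only an illustration of the behaviour I have in
mind, not Org's actual exporter code: the function names
normalize_description and collect_references are made up, and I
assume that a newline inside a description counts as ordinary
whitespace and therefore collapses to a single space (so that
"RFC\n4880" becomes "RFC 4880" and not "RFC4880").

```python
import re


def normalize_description(description):
    """Step 1: collapse whitespace runs (spaces, tabs, newlines)
    into one space and trim both ends.  Hypothetical helper."""
    return re.sub(r"\s+", " ", description).strip()


def collect_references(links):
    """Steps 2 and 3: drop duplicate (description, target) tuples
    and keep the remaining ones in order of first occurrence.

    `links` is an iterable of (target, description) pairs in the
    order they appear in the source text.
    """
    seen = set()
    references = []
    for target, description in links:
        key = (normalize_description(description), target)
        if key not in seen:
            seen.add(key)
            references.append(key)
    return references


# The six links from the Org source above, in document order.
links = [
    ("https://tools.ietf.org/html/rfc4880", "RFC\n4880"),
    ("https://tools.ietf.org/html/rfc1991", "RFC\n1991"),
    ("https://tools.ietf.org/html/rfc2440", "RFC 2440"),
    ("https://tools.ietf.org/html/rfc4880", "RFC 4880"),
    ("https://tools.ietf.org/html/rfc1991", "RFC 1991"),
    ("https://tools.ietf.org/html/rfc2440", "RFC 2440"),
]

for description, target in collect_references(links):
    print("[%s] %s" % (description, target))
```

With the descriptions normalized before the uniqueness check,
each of the three links would be listed exactly once in the
reference part.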

I hope this makes the issue a little clearer now.

Kind regards,
Mathias
