emacs-orgmode

Re: link syntax fixing bug?


From: Maxim Nikulin
Subject: Re: link syntax fixing bug?
Date: Sun, 25 Apr 2021 17:46:08 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1

On 21/03/2021 05:46, Samuel Wales wrote:
the issue is that when i click on google, the space before "hi" does
not show up in the search box.  ergo, different results.

*** should be orig
[[http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22][retro
original]]
*** should be fixed, is not?
[[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro
original]]

Reading Kyle's response, I realized that you might have an unsafe URL handler. I hope I am wrong. To rule out some excessively smart JS, I tried

    firefox 'http:/127.0.0.1/search?q=~"retroactive whatever" "hi"'

and I got the expected result in the URL bar. With the following test script "fake-browser"

#!/bin/sh
exec kdialog --title "Fake Browser" --msgbox "Args $#: '$*'"

and some customization:

 '(browse-url-browser-function (quote browse-url-generic))
 '(browse-url-generic-program "fake-browser")

I did not get any white space problem for the following link

[[http:/127.0.0.1/search?q=~"retroactive whatever" "hi"][retro-original]]

So neither passing the URL to the handler nor Firefox's handling of the URL causes a problem.

However, protecting spaces in URLs from the `org-fill-paragraph' function was mentioned in the mailing list archive as one of the reasons for introducing a second pass of percent encoding. Double percent encoding is clearly a problem, since there is no way to reliably guess whether the second pass was applied or not. My impression is that there would be no problem if only the characters that are "offensive" to Org, "][ \", were replaced by their percent-encoded equivalents in URLs. Maybe I have just missed cases where mixing percent-encoded and Unicode characters leads to some problem, but I believe it is safe. My hypothesis is that replacing just "[", "]", and "\" with their percent-encoded equivalents in any URL does not cause any issue, since web servers are able to decode them (this is selective encoding, not a second pass over the whole URL). File links on Windows may be an exception.
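To illustrate the selective encoding I mean, here is a minimal sketch in Python (not Org's actual implementation). The function name `escape_for_org_link` and the mapping table are hypothetical, made up for this example. It touches only "[", "]", and "\" and leaves existing percent-encoded sequences and Unicode characters alone, so no second pass over the whole URL happens.

```python
# Hypothetical helper, for illustration only: escape just the characters
# that clash with Org bracket-link syntax, and nothing else.
ORG_OFFENDING = {"[": "%5B", "]": "%5D", "\\": "%5C"}


def escape_for_org_link(url: str) -> str:
    """Replace "[", "]" and backslash with their percent-encoded forms.

    Existing "%XX" sequences are copied unchanged, so the "%" sign is
    never encoded again (no double percent encoding).
    """
    return "".join(ORG_OFFENDING.get(ch, ch) for ch in url)


if __name__ == "__main__":
    print(escape_for_org_link("http://127.0.0.1/search?q=a[1]"))
    print(escape_for_org_link("http://127.0.0.1/search?q=%7E%22hi%22"))
```

Since "%" itself is never rewritten, applying the function twice gives the same result as applying it once, which is exactly the property the second pass of percent encoding lacks.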

My opinion is that `org-lint' gives false positives for URLs with percent-encoded characters. They are rather widespread, e.g. in search queries.

*** [[https://orgmode.org/Changes.html][Org mode for Emacs – Release notes]]
The following function will help switching your links to the new syntax:

(defun org-update-link-syntax (&optional no-query)
...
       (while (re-search-forward "\\[\\[[^]]*?%\\(?:2[05]\\|5[BD]\\)" nil t)

I believe the logic, at least for the space character (%20), should be more sophisticated. Maybe URLs containing "%20" should be decoded only if the decoded URL still contains percent-encoded characters. Maybe decoding should be prevented if any of the characters for which percent encoding is mandatory ("[]?/", etc.) is present alongside percent-encoded sequences. Maybe the only way is an interactive comparison of the original and decoded URLs.
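The first of these ideas could be sketched roughly as follows. This is Python rather than Emacs Lisp, and the names `looks_double_encoded` and `maybe_decode_once` are hypothetical; it is only a heuristic under the assumption that a URL which still contains "%XX" sequences after one decoding pass went through a second pass of encoding.

```python
import re
from urllib.parse import unquote

# Matches one percent-encoded octet such as "%20" or "%5B".
PERCENT_SEQ = re.compile(r"%[0-9A-Fa-f]{2}")


def looks_double_encoded(url: str) -> bool:
    """Heuristic: after one decoding pass, "%XX" sequences are still left."""
    return bool(PERCENT_SEQ.search(unquote(url)))


def maybe_decode_once(url: str) -> str:
    """Decode one level only when the URL looks double encoded.

    A singly encoded URL (e.g. an ordinary search query) is returned
    unchanged instead of being turned into raw spaces and quotes.
    """
    return unquote(url) if looks_double_encoded(url) else url


if __name__ == "__main__":
    # Singly encoded: left alone.
    print(maybe_decode_once("http://127.0.0.1/search?q=%22a%20b%22"))
    # Double encoded ("%25" is an encoded "%"): one level is removed.
    print(maybe_decode_once("http://127.0.0.1/search?q=%2522a%2520b%2522"))
```

The heuristic can be fooled, for example by a query that legitimately contains an encoded percent sign followed by two hex digits, so an interactive comparison may still be needed for doubtful cases.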

I do not think that the particular example you provided

http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22

needs decoding. It is not human-friendly, but it is safer and quite widespread. On the other hand, the decoded variant should not lead to any problem either, unless something is misconfigured

[[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro original]]



