emacs-orgmode

Re: Org mode links: Open a PDF file at a given page and highlight a given string


From: Max Nikulin
Subject: Re: Org mode links: Open a PDF file at a given page and highlight a given string
Date: Sat, 3 Sep 2022 20:00:47 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0

On 03/03/2021 03:07, Rodrigo Morales wrote:

#+begin_src emacs-lisp :results silent
(setq org-file-apps
       '(("\\.pdf::\\([0-9]+\\)::\\([^:]+\\)\\'" . "zathura -P %1 -f %2 %s")))
#+end_src

I am going to respond to a message from another thread that discusses a patch, but I suppose the following considerations are more appropriate in this thread, which discusses a combined specifier for a location within a PDF document. It is more convenient for me to test ideas using okular; I hope that adapting the code for zathura or another PDF viewer is trivial.

Ihor Radchenko. Re: [PATCH] org.el: Fix percent substitutions in `org-open-file' Fri, 02 Sep 2022 20:08:17 +0800. https://list.orgmode.org/87tu5qm11q.fsf@localhost
+    ;; Page and search string,
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf::34::order of redirections>.
+    (\"\\\\.pdf::\\\\([0-9]+\\\\)::\\\\(.+\\\\)\\\\\\='\"
+        . \"okular --page %1 --find %2 %s\")
+    ;; Internal anchor and search string,
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf::Redirections::allocate a file>.
+    (\"\\\\.pdf::\\\\(.+\\\\)::\\\\(.+\\\\)\\\\\\='\"
+        . \"okular --find %2 file://%s\\\\\\\\#%1\")
+    ;; Page number, e.g. <file:///usr/share/doc/bash/bashref.pdf::34>.
+    (\"\\\\.pdf::\\\\([0-9]+\\\\)\\\\\\='\" . \"okular --page %1 %s\")
+    ;; Internal reference, e.g. <file:///usr/share/doc/bash/bashref.pdf::Redirections>.
+    (\"\\\\.pdf::\\\\(.+\\\\)\\\\\\='\" . \"okular file://%s\\\\\\\\#%1\")
+    ;; No location within the file, optionally followed by \"::\",
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf>.
+    (\"\\\\.pdf\\\\(?:::\\\\)?\\\\\\='\" . \"okular %s\")

This is a nice set of examples, but it probably does not belong to this
docstring. I'd rather see this in `org-file-apps' docstring or even in
the manual.

It is part of a docstring, so the number of backslashes is doubled.

First of all, I overlooked the possibility of distinguishing a text search "file:text.pdf::pattern" from a cross reference target within the document "file:text.pdf::#anchor". Secondly, I forgot that PDF viewers may support compressed files.

Currently I believe that instead of injecting up to 6 entries into `org-file-apps' for various combinations of page, anchor, and search pattern, it is better to add a single record with a function handler. Notice that this approach is not affected by the bug with multiple regexp groups. An additional advantage is that no shell is involved, so peculiar file names cannot cause execution of arbitrary code when quoting and escaping are messed up.
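
To illustrate the point about the shell, here is a minimal sketch that contrasts a command line assembled by naive string formatting with an argument list handed directly to the program. The `my-demo-' names are hypothetical and exist nowhere in Org, and this is not how `org-open-file' builds its commands; it only shows the class of problem that goes away when no shell parses the file name.

```emacs-lisp
;; Hypothetical helpers for illustration only; not part of Org.
(defun my-demo-naive-shell-command (file)
  "Return a shell command line for FILE built without quoting (unsafe)."
  (format "okular %s" file))

(defun my-demo-argument-list (file)
  "Return arguments for FILE to be passed to `call-process' as is."
  ;; "--" ends option parsing.  FILE stays a single argument whatever
  ;; characters it contains, because no shell ever parses it.
  (list "--" file))

;; A peculiar but valid file name.
(let ((file "/tmp/report.pdf; touch /tmp/oops"))
  (list (my-demo-naive-shell-command file)
        (my-demo-argument-list file)))
```

If the first value were passed to a shell, ";" would act as a command separator. In the second value the whole file name remains one list element.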

I think a set of functions for popular PDF viewers (evince, zathura, okular, xpdf, xpopple, firefox, chromium & Co., etc.) should be defined in some package, but I doubt it is suitable for Org core.

Proof of concept implementation.

Configuration:

(add-to-list
 'org-file-apps
 `("\\.pdf\\(?:\\.gz\\|\\.bz2\\|\\.xz\\)?\\(?:::.*\\)?\\'"
   . ,#'my-open-file-pdf-okular))

Helper functions:

(defun my--parse-file-link-search (suffix-re link)
  "Parse PDF file LINK for page number, cross reference anchor, search string.

SUFFIX-RE is a regexp matching the file name suffix.  Return a
list (PAGE ANCHOR SEARCH), any element of which may be nil, or
nil if LINK is not a link to a PDF file."
  (let ((case-fold-search t)) ; Handle files having .PDF suffix as well
    (and (string-match
          (concat suffix-re
                  (rx
                   (optional "::"
                             (or (group (+ digit))
                                 (group "#" (+ (not (any ?:))))
                                 (optional "#")))
                   (optional "::"
                             (optional (group (+ anything))))
                   string-end))
          link)
         (mapcar (lambda (i) (match-string i link)) '(1 2 3)))))

(defun my-launch-viewer (command args)
  "Launch external application COMMAND with arguments ARGS.

The function avoids an intermediate shell and thus the need to
escape arguments that might otherwise be treated as shell
specials and run arbitrary commands.  The function launches the
viewer process in a fire-and-forget manner, like `browse-url-xdg-open',
so the application may keep running even after Emacs quits."
  (apply #'call-process command nil 0 nil args))

(defun my-open-file-pdf-okular (file link)
  "PDF file handler to be used as a command in the `org-file-apps' alist.

Customize `org-file-apps' to add the following entry:


\\='(\"\\\\.pdf\\\\(?:\\\\.gz\\\\|\\\\.bz2\\\\|\\\\.xz\\\\)?\\\\(?:::.*\\\\)?\\\\\\='\"
      . #\\='my-open-file-pdf-okular)

Open FILE at the location specified by LINK (page, internal
reference, search string).  The supported link search options
are listed below.  Side note: for the bash manual in particular,
the <info:bash#Redirections> link may be used instead.

- Page number <file:///usr/share/doc/bash/bashref.pdf::34>.
- Page number and search text
  <file:///usr/share/doc/bash/bashref.pdf::34::order of redirections>.
- Cross reference anchor
  <file:///usr/share/doc/bash/bashref.pdf::#Redirections>.
- Cross reference anchor and search text
  <file:///usr/share/doc/bash/bashref.pdf::#Redirections::allocate a file>.
- Search text <file:///usr/share/doc/bash/bashref.pdf::allocate a file>.

Optionally the FILE may be compressed by gzip, bzip2, or xz."
  (pcase-let* ((pdf-re (rx ".pdf"
                           ;; .Z and .zip are not supported by okular
                           (optional (or ".gz" ".bz2" ".xz"))))
               (`(,page ,ref ,find)
                (or (my--parse-file-link-search pdf-re link)
                    (error "Not a PDF file link: %S" link)))
               (args (list
                      ;; Protect against file names starting with a dash
                      ;; that might be taken for an option, although
                      ;; `org-open-file' passes an absolute file name, so
                      ;; this is not strictly necessary.
                      "--"
                      (if (org-string-nw-p ref)
                          (concat file ref)
                        file))))
    (when find
      (push find args)
      (push "--find" args))
    (when page
      (push page args)
      (push "--page" args))
    (my-launch-viewer "okular" args)))
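
Adapting the handler to another viewer could look like the following sketch for zathura. It is untested against a real zathura and rests on several assumptions: the names `my--zathura-arguments' and `my-open-file-pdf-zathura' are hypothetical, the -P (page) and -f (find) options are taken from the `org-file-apps' entry quoted at the top of this message, "--" is assumed to end option parsing, and a cross reference anchor is simply ignored because I do not know a zathura command line option for it.

```emacs-lisp
;; Sketch only; the names below are hypothetical.
(defun my--zathura-arguments (file page find)
  "Return zathura arguments to open FILE at PAGE and search for FIND.
PAGE and FIND are strings or nil."
  (append (and page (list "-P" page))
          (and find (list "-f" find))
          ;; Assumption: "--" ends option parsing in zathura as well.
          (list "--" file)))

(defun my-open-file-pdf-zathura (file link)
  "Open PDF FILE in zathura at the location specified by LINK.
Relies on `my--parse-file-link-search' and `my-launch-viewer'
defined in this message.  A cross reference anchor in LINK is
ignored."
  (pcase-let ((`(,page ,_ref ,find)
               (or (my--parse-file-link-search (rx ".pdf") link)
                   (error "Not a PDF file link: %S" link))))
    (my-launch-viewer "zathura"
                      (my--zathura-arguments file page find))))
```

Keeping the argument list in a separate pure function makes it possible to test it without launching a viewer.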


And some tests:

(ert-deftest test-my/parse-file-link-search ()
  (let ((pdf-re (rx ".pdf"
                    ;; .Z and .zip are not supported by okular
                    (optional (or ".gz" ".bz2" ".xz")))))
    (should-not (my--parse-file-link-search pdf-re "/no-match.doc"))
    (should-not (my--parse-file-link-search pdf-re "/no-match.doc::#ref"))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/just-file.pdf")))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/just-file-upper-case.PDF")))
    (should (equal
             '("21" nil nil)
             (my--parse-file-link-search pdf-re "/page.pdf::21")))
    (should (equal
             '(nil "#ref" nil)
             (my--parse-file-link-search pdf-re "/anchor.pdf::#ref")))
    (should (equal
             '(nil nil "some text")
             (my--parse-file-link-search pdf-re "/search-string.pdf::some text")))
    (should (equal
             '(nil nil "in gzipped file")
             (my--parse-file-link-search
              pdf-re
              "/compressed-search-string.pdf.GZ::in gzipped file")))
    (should (equal
             '("32" nil "page text")
             (my--parse-file-link-search
              pdf-re
              "/page-search.pdf::32::page text")))
    (should (equal
             '(nil "#ref" "anchor text")
             (my--parse-file-link-search
              pdf-re
              "/anchor-search.pdf::#ref::anchor text")))
    (should (equal
             '(nil nil "::")
             (my--parse-file-link-search pdf-re "/search-quad.pdf::::::")))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/nothing-1.pdf::")))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/nothing-2.PDF::::")))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/empty-anchor-1.pdf::#")))
    (should (equal
             '(nil nil nil)
             (my--parse-file-link-search pdf-re "/empty-anchor-2.pdf::#::")))
    (should (equal
             '(nil nil "empty anchor text")
             (my--parse-file-link-search
              pdf-re
              "/empty-anchor-1.pdf::#::empty anchor text")))))




