emacs-devel

Improve `replace-regexp-in-string' ergonomics?


From: Lars Ingebrigtsen
Subject: Improve `replace-regexp-in-string' ergonomics?
Date: Wed, 22 Sep 2021 06:36:27 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

`replace-regexp-in-string' often leads to pretty awkward code.  I wonder
whether we could improve it somehow.

Here's a real life example:

(defun org-babel-js-read (results)
[...]
       (org-babel-read
        (concat "'"
                (replace-regexp-in-string
                 "\\[" "(" (replace-regexp-in-string
                            "\\]" ")" (replace-regexp-in-string
                                       ",[[:space:]]" " "
                                       (replace-regexp-in-string
                                        "'" "\"" results))))))

That's kinda hard to read, but variations on this are pretty common.
When you have one `replace-regexp-in-string', you often have another.

We introduced `thread-last' in 2014, and there seems to be only one (1)
place in the Emacs code base that uses it, so I guess that didn't take
off, but rewriting with that, we get:

       (org-babel-read
        (concat "'"
                (thread-last
                  results
                  (replace-regexp-in-string "'" "\"")
                  (replace-regexp-in-string ",[[:space:]]" " ")
                  (replace-regexp-in-string "\\]" ")")
                  (replace-regexp-in-string "\\[" "("))))

Which is somewhat more readable (but note that this totally breaks down
if you want to mix in LITERAL etc., since those optional arguments come
after STRING).  But I wonder whether we should consider renaming the
function to something more palatable, and since we have `string-replace',
why not `regexp-replace'?  The length of the name of this common function
is itself off-putting.

       (org-babel-read
        (concat "'"
                (thread-last
                  results
                  (regexp-replace "'" "\"")
                  (regexp-replace ",[[:space:]]" " ")
                  (regexp-replace "\\]" ")")
                  (regexp-replace "\\[" "("))))
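
To make the LITERAL caveat concrete: `thread-last' always puts the
threaded value in the last argument position, whereas
`replace-regexp-in-string' takes STRING third, before the optional
FIXEDCASE and LITERAL.  A small sketch follows; the `my-demo-' variable
names and the lambda workaround are only illustrations, not a proposal.

```elisp
(require 'subr-x)                       ; `thread-last' lives here

;; With only REGEXP and REP given, the threaded value lands in the
;; STRING slot, which is what we want.
(defvar my-demo-plain
  (thread-last "a.b.c"
               (replace-regexp-in-string "\\." "-")))

;; Writing (replace-regexp-in-string "\\." "\\" t t) inside `thread-last'
;; would expand to
;;   (replace-regexp-in-string "\\." "\\" t t "a.b.c")
;; so the threaded value is no longer in the STRING slot.  One possible
;; workaround is an explicit lambda that puts it back in third position:
(defvar my-demo-literal
  (thread-last "a.b.c"
               (funcall (lambda (s)
                          (replace-regexp-in-string "\\." "\\" s t t)))))
```

The lambda works, but it gives back most of the readability that
`thread-last' was supposed to buy.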

We could also consider making `regexp-replace' take a series of pairs,
since this is so common.  Like:

       (org-babel-read
        (concat "'"
                (regexp-replace "'" "\""
                                ",[[:space:]]" " "
                                "\\]" ")"
                                "\\[" "("
                                results)))

Or some variation thereupon with some more ()s to group pairs.
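
As a rough sketch of what the pairs variant could look like when built
on top of the existing function (the name `my-regexp-replace-pairs' and
the flat argument layout are assumptions made for illustration; nothing
like this exists in Emacs):

```elisp
(defun my-regexp-replace-pairs (&rest args)
  "Apply REGEXP/REP pairs, in order, to the final element of ARGS.
ARGS has the form REGEXP1 REP1 REGEXP2 REP2 ... STRING.
Hypothetical helper, for illustration only."
  ;; Pairs plus one trailing string means an odd number of arguments.
  (unless (= 1 (% (length args) 2))
    (error "Expected REGEXP/REP pairs followed by a string"))
  (let ((string (car (last args)))
        (pairs (butlast args)))
    (while pairs
      (let ((regexp (pop pairs))
            (rep (pop pairs)))
        (setq string (replace-regexp-in-string regexp rep string))))
    string))

;; The replacements from the org-babel example, applied to a sample
;; JavaScript-style array literal.
(defvar my-demo-pairs
  (my-regexp-replace-pairs "'" "\""
                           ",[[:space:]]" " "
                           "\\]" ")"
                           "\\[" "("
                           "['a', 'b']"))
```

Putting STRING last keeps the pairs readable, but like `thread-last' it
leaves no obvious place for FIXEDCASE or LITERAL.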

The most popular way to deal with the awkwardness is to just give up and
go all imperative:

(defun authors-canonical-author-name (author file pos)
[...]
  (when author
    (setq author (replace-regexp-in-string "[ \t]*[(<].*$" "" author))
    (setq author (replace-regexp-in-string "\\`[ \t]+" "" author))
    (setq author (replace-regexp-in-string "[ \t]+$" "" author))
    (setq author (replace-regexp-in-string "[ \t]+" " " author))

Which leads me to my other point -- about a quarter of the usages of the
function in Emacs core have "" as the replacement, so perhaps that should
have its own function?  `regexp-remove'?

Then that could be:

  (when author
    (setq author (regexp-remove "[ \t]*[(<].*$" author))
    (setq author (regexp-remove "\\`[ \t]+" author))
    (setq author (regexp-remove "[ \t]+$" author))
    (setq author (regexp-replace "[ \t]+" " " author))

or

  (when author
    (setq author
          (regexp-replace
           "[ \t]+" " " (regexp-remove
                         "[ \t]*[(<].*$" (regexp-remove
                                          "\\`[ \t]+" (regexp-remove
                                                       "[ \t]+$" author))))))

or

  (when author
    (setq author
          (thread-last author
                       (regexp-remove "[ \t]*[(<].*$")
                       (regexp-remove "\\`[ \t]+")
                       (regexp-remove "[ \t]+$")
                       (regexp-replace "[ \t]+" " "))))


Or...  something else.  I'm sure nobody else has thought about this
issue before.  
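
For concreteness, `regexp-remove' could be sketched as a thin wrapper
around the existing function.  Everything prefixed with `my-' below is
hypothetical, and passing t for FIXEDCASE and LITERAL is an assumption
about what such a function would want, not something settled here.

```elisp
(require 'subr-x)                       ; for `thread-last'

(defun my-regexp-remove (regexp string)
  "Return STRING with every match of REGEXP removed.
Hypothetical helper, for illustration only."
  ;; FIXEDCASE and LITERAL make no difference for an empty replacement;
  ;; they are passed as t only to make the intent explicit.
  (replace-regexp-in-string regexp "" string t t))

(defun my-canonical-name (author)
  "Clean up AUTHOR with the same steps as the authors.el snippet."
  (thread-last author
               (my-regexp-remove "[ \t]*[(<].*$") ; drop "(...)" or "<...>" tail
               (my-regexp-remove "\\`[ \t]+")     ; drop leading whitespace
               (my-regexp-remove "[ \t]+$")       ; drop trailing whitespace
               (replace-regexp-in-string "[ \t]+" " "))) ; collapse runs
```

With these definitions, (my-canonical-name "  Jane   Doe  <jane@example.org>")
returns "Jane Doe".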

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



