[O] Suggestion: Add zero-width nbsp to emphasis-regexp-components

From: Chris
Subject: [O] Suggestion: Add zero-width nbsp to emphasis-regexp-components
Date: Wed, 06 Jun 2018 09:54:37 +0200


I'm not an experienced mailing list user, but I will try to be brief.
Please excuse my lack of common courtesy.

* Problem

  There needs to be a way to coax Org into treating a character as an
  emphasis marker even where it ordinarily would not be recognized as
  one (for example, in the middle of a regular word, when only part of
  the word should be emphasized).

  - Version of Org: 9.1.6
  - Version of Emacs: GNU Emacs 25.3.2 (x86_64-pc-linux-gnu)

* Suggested Solution

  Include the Unicode zero width no-break space character (U+feff) in
  both ~pre~ and ~post~ sections of ~org-emphasis-regexp-components~.

  I currently cannot reach code.orgmode.org (502 Bad Gateway), so the
  diff below is against my local copy, but I imagine the change would
  look something like

      --- org.el      2018-06-06 09:33:56.602335268 +0200
      +++ org-zwnbsp-emphasis.el      2018-06-06 09:39:37.985958647 +0200
      @@ -4355,7 +4355,7 @@
       ;; set this option proved cumbersome.  See this message/thread:
       ;; http://article.gmane.org/gmane.emacs.orgmode/68681
       (defvar org-emphasis-regexp-components
      -  '("- \t('\"{" "- \t.,:!?;'\")}\\[" " \t\r\n" "." 1)
      +  '("- \ufeff\t('\"{" "- \ufeff\t.,:!?;'\")}\\[" " \t\r\n" "." 1)
         "Components used to build the regular expression for emphasis.
       This is a list with five entries.  Terminology:  In an emphasis string
       like \" *strong word* \", we call the initial space PREMATCH, the final

  A small side benefit: legacy documents that still begin with U+feff
  as a byte order mark might then also get emphasis on their first
  word. (I am not sure this is actually a problem; I am just mentioning
  it.)
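
  Until something like this is in org.el, the same effect can probably
  be tried from an init file. The following is only a sketch: it
  assumes an Org version that still provides ~org-set-emph-re~ (9.1.x
  does) and it appends U+feff to the existing ~pre~ and ~post~ strings
  rather than replacing them. The local name ~components~ is mine, not
  something from org.el.

  ```elisp
  ;; Local workaround sketch, not the proposed patch itself.
  ;; Assumption: `org-set-emph-re' is available to rebuild the
  ;; emphasis regexps after the components change.
  (with-eval-after-load 'org
    (let* ((zwnbsp "\ufeff")
           ;; Work on a copy so the literal list in org.el is not mutated.
           (components (copy-sequence org-emphasis-regexp-components)))
      ;; Entry 0 is PRE: characters allowed right before an opening marker.
      (setf (nth 0 components) (concat (nth 0 components) zwnbsp))
      ;; Entry 1 is POST: characters allowed right after a closing marker.
      (setf (nth 1 components) (concat (nth 1 components) zwnbsp))
      ;; Store the new value and recompute the regexps built from it.
      (org-set-emph-re 'org-emphasis-regexp-components components)))
  ```

  The character itself can be inserted with ~C-x 8 RET feff RET~, once
  on each side of the emphasized part. Org buffers that are already
  open may need ~M-x org-mode-restart~ before fontification picks up
  the change.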

* Discussion

  - Does this even make sense to begin with, or is it just me?

  - Is the zero-width no-break space the most sensible character to do
    this with?

    The alternative I see is the zero-width joiner (U+200d), but that
    appears to have more legitimate uses inside words, especially in
    some non-Western writing systems such as Arabic and the Indic
    scripts. I chose U+feff mostly because it is sort of a space, but
    not quite.

* Related Reports

  I found an email in the archives which touches on the same point[1],
  but suggests a more radical change.

  [1]: https://lists.gnu.org/archive/html/emacs-orgmode/2017-09/msg00363.html

