info-gnus-english

XEmacs, Gnus and mm-coding-system-priorities.


From: Aidan Kehoe
Subject: XEmacs, Gnus and mm-coding-system-priorities.
Date: Thu, 02 Dec 2004 11:59:01 +0000
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Rational FORTRAN, linux)

Hi, 

Further to my message of the 26th, <lvmzx4a1vb.fsf@ns5.nestdesign.com>, I’ve
made a patch to mm-util.el that uses Stephen Turnbull’s Latin Unity to remap
the characters in a message, and that makes Gnus honour the
mm-coding-system-priorities variable under XEmacs.

With this patch applied, and with latin-unity available,

     (setq mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8))

in your init file tells Gnus to post in Latin 1 if the message fits into
Latin 1 (even if, say, the Latin 2 U WITH DIAERESIS is used, since it can be
remapped to its Latin 1 equivalent), in iso-8859-15 if the message fits into
that but not into Latin 1, and in UTF-8 if it fits into neither. This is
much preferable to the current behaviour.
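
The "first entry in the list that can encode the whole message" rule can be
sketched on its own. This is only an illustration, not part of the patch: it
assumes a Unicode-based GNU Emacs (23 or later), where the built-in
unencodable-char-position is available and no Latin Unity remapping is
needed, and my-first-feasible-coding-system is a hypothetical helper name.

```elisp
;; Illustrative sketch only; assumes GNU Emacs 23 or later.
;; `my-first-feasible-coding-system' is a hypothetical name and is
;; not defined by Gnus, mm-util.el or Latin Unity.
(defun my-first-feasible-coding-system (string priorities)
  "Return the first coding system in PRIORITIES that can encode STRING.
Return nil if none of them can."
  (let ((result nil))
    (while (and priorities (not result))
      ;; `unencodable-char-position' returns nil when every character
      ;; in the given range can be encoded by the coding system.
      (unless (unencodable-char-position 0 (length string)
                                         (car priorities) nil string)
        (setq result (car priorities)))
      (setq priorities (cdr priorities)))
    result))

(let ((priorities '(iso-8859-1 iso-8859-15 utf-8)))
  (list (my-first-feasible-coding-system "Grüße" priorities)
        (my-first-feasible-coding-system "10 €" priorities)
        (my-first-feasible-coding-system "Łódź" priorities)))
;; => (iso-8859-1 iso-8859-15 utf-8)
```

Under XEmacs the patch cannot rely on this check alone, because the same
letter may exist in several ISO-8859 character sets; that is the part Latin
Unity takes care of.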

Tested under XEmacs 21.4 and 21.5 and the stable GNU Emacs. If you have trouble
applying the patch from a news article, there’s a plain text version
available at
http://parhasard.net/mm-util-xemacs-coding-system-priorities.diff .

What do I have to do to get this included in the standard Gnus?

Best regards, 

        - Aidan
-- 
“As democracy is perfected, the office of president represents, more and
more closely, the inner soul of the people. On some great and glorious day
the plain folks of the land will reach their heart’s desire at last and the
White House will be adorned by a downright moron.” – H.L. Mencken 

--- mm-util.el~ 2004-12-02 10:43:08.000000000 +0000
+++ mm-util.el  2004-12-02 11:42:03.000000000 +0000
@@ -587,11 +587,68 @@
               charsets))
        ;; Otherwise we're not multibyte, we're XEmacs, or a single
        ;; coding system won't cover it.
-       (setq charsets
-             (mm-delete-duplicates
-              (mapcar 'mm-mime-charset
-                      (delq 'ascii
-                            (mm-find-charset-region b e))))))
+
+       ;; For intelligent handling of the various ISO-8859-? character sets
+       ;; and their common subsets under XEmacs, we use latin-unity.
+       (when (and (not (featurep 'latin-unity))
+                  (locate-library "latin-unity"))
+         (require 'latin-unity))
+
+       (if (featurep 'latin-unity)
+           (let ((csets (latin-unity-representations-feasible-region b e))
+                 (psets (latin-unity-representations-present-region b e))
+                 (systems mm-coding-system-priorities)
+                 (chars-region (delq 'ascii (charsets-in-region b e))) curset)
+
+             (assert (featurep 'xemacs) t 
+                     "Latin Unity shouldn't be available on GNU Emacs.")
+
+             (setq charsets
+                   (catch 'done
+
+                      ;; Check whether Latin Unity knows about all the
+                      ;; character sets in the region. If it doesn't, and we
+                      ;; have a universal coding system in the
+                      ;; mm-coding-system-priorities list, return that
+                      ;; universal coding system. Otherwise, we can't do the
+                      ;; right thing; return a multiple-entry list, so Gnus
+                      ;; will do its broken thing.
+
+                      (dolist (curset chars-region)
+                        (unless (memq curset latin-unity-character-sets)
+                          (dolist (curset systems)
+                            (if (memq curset latin-unity-ucs-list)
+                                (throw 'done (list curset))))
+                          (throw 'done (mapcar 'mm-mime-charset
+                                               (delq 'ascii
+                                                     (charsets-in-region
+                                                      b e))))))
+
+                      ;; Okay, Latin Unity does know all about the
+                      ;; character sets in the region. Pass back the first
+                      ;; coding system in the preferred list that can
+                      ;; encode the whole buffer.
+
+                      (dolist (curset systems)
+                        (setq curset 
+                              (latin-unity-massage-name curset 
+                                                        'buffer-default))
+                        (if (memq curset latin-unity-ucs-list)
+                            (throw 'done (list curset)))
+                        (if (latin-unity-maybe-remap b e curset csets psets t)
+                            (throw 'done (list curset))))
+
+                      ;; Can't encode using anything from the
+                      ;; mm-coding-system-priorities list. Return a
+                      ;; multiple entry list.
+                      (mapcar 'mm-mime-charset 
+                              (delq 'ascii (charsets-in-region b e))))))
+         ;; Otherwise, there's nothing really intelligent we can do with
+         ;; the characters.
+         (setq charsets
+               (mm-delete-duplicates 
+                (mapcar 'mm-mime-charset 
+                        (delq 'ascii (mm-find-charset-region b e)))))))
     (if (and (> (length charsets) 1)
             (memq 'iso-8859-15 charsets)
             (memq 'iso-8859-15 hack-charsets)

