emacs-devel

address@hidden: html-coding.el -- coding system from meta tag]


From: Richard M. Stallman
Subject: address@hidden: html-coding.el -- coding system from meta tag]
Date: Wed, 20 Jul 2005 18:08:46 -0400

Could people who know more than I about HTML specifications please
look at this, and tell me whether they think it is good to add to Emacs?

------- Start of forwarded message -------
From: Kevin Ryde <address@hidden>
To: address@hidden
Organization: Bah Humbug
Date: Wed, 20 Jul 2005 10:47:38 +1000
Subject: html-coding.el -- coding system from meta tag
Sender: address@hidden

- --=-=-=

This is a little spot of code for getting the coding system from the
meta tag when visiting a html file.

The emacs cvs head already has this feature, so this code is only for
emacs 21.

I'd be surprised if something like this isn't already in some or most
of the heavy duty html/sgml editing/viewing packages, though I
couldn't find the right bits on cursory inspection.  In any case all I
wanted was to see the right chars in a plain old find-file of some
random html.


- --=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: attachment; filename=html-coding.el
Content-Transfer-Encoding: quoted-printable

;;; html-coding.el --- coding system from meta tag when visiting html files.

;; Copyright 2005 Kevin Ryde
;;
;; html-coding.el is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by the
;; Free Software Foundation; either version 2, or (at your option) any later
;; version.
;;
;; html-coding.el is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
;; Public License for more details.
;;
;; You can get a copy of the GNU General Public License online at
;; http://www.gnu.org/licenses/gpl.txt, or you should have one in the file
;; COPYING which comes with GNU Emacs and other GNU programs.  Failing that,
;; write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
;; Boston, MA 02111-1307, USA.


;;; Commentary:

;; This is a spot of code for getting the coding system from a HTML <meta>
;; tag when visiting a .html, .shtml or .htm file.  mm-util.el (from Gnus)
;; is used to map a MIME charset name in the HTML to an Emacs coding system.
;;
;; This code is designed for Emacs 21.  The Emacs cvs head (which will be
;; Emacs 22 or whatever) already has this feature (in
;; sgml-html-meta-auto-coding-function), so nothing is done there.

;; If you have a file with a slightly bogus charset name, like "iso8859-1"
;; where it should be "iso-8859-1", you can map to the right one in
;; `mm-charset-synonym-alist', like
;;
;;     (eval-after-load "mm-util"
;;       '(add-to-list 'mm-charset-synonym-alist '(iso8859-1 . iso-8859-1)))
;;
;; But note that the mm-util.el which comes with Emacs 21.4a has a bug that
;; stops this working.  The test (mm-coding-system-p charset) should be
;; (mm-coding-system-p cs), i.e. it should validate the mapped good name,
;; not the bad one.  You can make that change yourself, or use the
;; separately packaged Gnus, where it's fixed.


;;; Install:

;; Put html-coding.el somewhere in your `load-path', and in your .emacs put
;;
;;     (require 'html-coding)

;;; History:

;; Version 1 - the first version.


;;; Code:

;; emacs 22 `sgml-html-meta-auto-coding-function' does this coding system
;; determination already, skip our code in that case
;;
(unless (fboundp 'sgml-html-meta-auto-coding-function)

  (defun html-coding-system (args)
    "Return the coding system for reading a HTML file, based on the <meta> tag.
If there's no charset in the file, this function checks what other rules say.

This function is for use in `file-coding-system-alist'.  ARGS is a list
naming the operation and its arguments; the only form handled here is
`(insert-file-contents ...)'."
    (or (and (eq (car args) 'insert-file-contents)
             (file-exists-p (cadr args))
             (with-temp-buffer
               (insert-file-contents-literally (cadr args))
               (and (re-search-forward "<meta\\s-[^>]*charset=\\([^\">]+\\)"
                                       ;; first 10 lines, like emacs 22
                                       (save-excursion (forward-line 10)
                                                       (point))
                                       t)
                    (let ((charset (match-string 1)))
                      (require 'mm-util)
                      (or (mm-charset-to-coding-system charset)
                          (progn
                            (message "Unrecognised HTML MIME charset: %s"
                                     charset)
                            nil))))))
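        ;; No usable charset: fall back to the other rules, with our own
        ;; entry removed from the alist so we aren't called recursively.
        (progn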
          (require 'cl)
          (let ((file-coding-system-alist
                 (remove* 'html-coding-system file-coding-system-alist
                          :key 'cdr)))
            (apply 'find-operation-coding-system args)))))

  (modify-coding-system-alist 'file "\\.\\(html\\|shtml\\|htm\\)\\'"
                              'html-coding-system))

(provide 'html-coding)

;;; html-coding.el ends here

- --=-=-=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Gnu-emacs-sources mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/gnu-emacs-sources

- --=-=-=--
------- End of forwarded message -------
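
For anyone wanting to see what the regexp in html-coding-system picks up, here is a minimal sketch that runs the same pattern over a string instead of a file.  The helper name my-html-meta-charset is hypothetical and not part of html-coding.el; it stops at the charset name and leaves out the mapping through mm-util.

```elisp
;; Hypothetical helper for illustration only, not part of html-coding.el.
;; It applies the same regexp and the same 10 line limit as
;; `html-coding-system', but to a string rather than a file.
(defun my-html-meta-charset (html)
  "Return the charset named in a <meta> tag in the string HTML, or nil."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    ;; Only look at the first 10 lines, as the attachment does.
    (let ((limit (save-excursion (forward-line 10) (point))))
      (and (re-search-forward "<meta\\s-[^>]*charset=\\([^\">]+\\)" limit t)
           (match-string 1)))))

;; Sample input, made up for this sketch.
(my-html-meta-charset
 (concat "<html><head>\n"
         "<meta http-equiv=\"Content-Type\""
         " content=\"text/html; charset=iso-8859-1\">\n"
         "</head>"))
```

For the sample above the helper returns the string "iso-8859-1".  Text with no charset in its first 10 lines gives nil, which is the case where html-coding-system falls through to the other rules.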



