emacs-elpa-diffs
[elpa] externals/guess-language d532e6217c 19/35: Merge pull request #31


From: ELPA Syncer
Subject: [elpa] externals/guess-language d532e6217c 19/35: Merge pull request #31 from hendursaga/eo-support
Date: Tue, 28 May 2024 18:58:24 -0400 (EDT)

branch: externals/guess-language
commit d532e6217c74f99483715f3fff20d58ed46d81e3
Merge: afbc3456eb 166618eb79
Author: Titus von der Malsburg <malsburg@posteo.de>
Commit: GitHub <noreply@github.com>

    Merge pull request #31 from hendursaga/eo-support
    
    Add Esperanto language support
---
 README.org                           |   3 +-
 guess-language.el                    |  16 +-
 testdata/all_supported_languages.org |   2 +
 trigrams/eo                          | 300 +++++++++++++++++++++++++++++++++++
 4 files changed, 313 insertions(+), 8 deletions(-)

diff --git a/README.org b/README.org
index e9be266126..da572279b2 100644
--- a/README.org
+++ b/README.org
@@ -16,7 +16,7 @@ Emacs minor mode that detects the language of what you're typing.  Automatically
 
 I write a lot of text in multiple languages and was getting tired of constantly having to switch the dictionary of my spell-checker.  In true Emacs spirit, I decided to dust off my grandpa's parentheses and wrote some code to address this problem.  The result is ~guess-language-mode~, a minor mode for Emacs that guesses the language of the current paragraph and then changes the dictionary of ispell and the language settings of typo-mode (if present).  It also reruns Flyspell on the curre [...]
 
-Currently, the following languages are supported: Arabic, Czech, Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Serbian (Cyrillic and Latin), Slovak, Slovenian, Spanish, Swedish.
+Currently, the following languages are supported: Arabic, Czech, Danish, Dutch, English, Esperanto, Finnish, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Serbian (Cyrillic and Latin), Slovak, Slovenian, Spanish, Swedish.
 
 It is easy to add more languages and this repository includes the necessary language statistics for 47 additional languages.  (These were copied from [[https://github.com/kent37/guess-language][guess_language.py]].)  If we already have the required language data (see directory [[https://github.com/tmalsburg/guess-language.el/tree/master/trigrams][trigrams]]), all you need to do is to add an entry to the variable ~guess-language-langcodes~.  See [[https://github.com/tmalsburg/guess-langua [...]
 
@@ -75,6 +75,7 @@ Languages that are currently supported by guess-language-mode:
 | Danish             | ~da~           | dansk                     |                                  |
 | Dutch              | ~nl~           | nederlands                |                                  |
 | English            | ~en~           | en                        | English                          |
+| Esperanto          | ~eo~           | esperanto                 | English                          |
 | Finnish            | ~fi~           | finnish                   | Finnish                          |
 | French             | ~fr~           | francais                  | French                           |
 | German             | ~de~           | de                        | German                           |
diff --git a/guess-language.el b/guess-language.el
index e2ba6ed364..24b063dca4 100644
--- a/guess-language.el
+++ b/guess-language.el
@@ -42,10 +42,10 @@
 ;;
 ;; The detection algorithm is based on counts of character trigrams.
 ;; At this time, supported languages are Arabic, Czech, Danish, Dutch,
-;; English, Finnish, French, German, Italian, Norwegian, Polish,
-;; Portuguese, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish.
-;; Adding further languages is very easy and this package already
-;; contains language statistics for 49 additional languages.
+;; English, Esperanto, Finnish, French, German, Italian, Norwegian,
+;; Polish, Portuguese, Russian, Serbian, Slovak, Slovenian, Spanish, and
+;; Swedish. Adding further languages is very easy and this package
+;; already contains language statistics for 49 additional languages.
 
 ;;; Code:
 
@@ -65,9 +65,10 @@ dictionary, input methods, etc."
 
 Uses ISO 639-1 identifiers.  Currently supported languages are:
 Arabic (ar),  Czech (cs),  Danish (da),  Dutch (nl),  English (en),
-Finnish (fi),  French (fr),  German (de),  Italian (it),
-Norwegian (nb),  Polish (pl),  Portuguese (pt),  Russian (ru),
-Slovak (sk),  Slovenian (sl),  Spanish (es),  Swedish (sv)"
+Esperanto (eo),  Finnish (fi),  French (fr),  German (de),
+Italian (it),  Norwegian (nb),  Polish (pl),  Portuguese (pt),
+Russian (ru), Slovak (sk),  Slovenian (sl),  Spanish (es),  and
+Swedish (sv)"
   :type '(repeat symbol))
 
 (defcustom guess-language-min-paragraph-length 40
@@ -87,6 +88,7 @@ little material to reliably guess the language."
     (da     . ("dansk"      nil))
     (de     . ("de"         "German"))
     (en     . ("en"         "English"))
+    (eo     . ("eo"         "English"))
     (es     . ("spanish"    nil))
     (fi     . ("finnish"    "Finnish"))
     (fr     . ("francais"   "French"))
diff --git a/testdata/all_supported_languages.org b/testdata/all_supported_languages.org
index f5ed965102..51bc120bfa 100644
--- a/testdata/all_supported_languages.org
+++ b/testdata/all_supported_languages.org
@@ -9,6 +9,8 @@ de: Dies ist ein kurzer Text zu Testzwecken geschrieben und übersetzt in mehrer
 
 en: This is a short text written for testing purposes and translated to several languages using Google Translate.
 
+eo: Ĉi tiu estas mallonga teksto skribita por elprov celoj kaj tradukitajn kelkajn lingvojn uzantan Google Traduki.
+
 es: Este es un texto corto escrito para propósitos de prueba y traducido a varios idiomas usando Google Translate.
 
 fi: Tämä on lyhyt teksti kirjoitettu testausta varten ja käännetty useita kieliä Google kääntää.
diff --git a/trigrams/eo b/trigrams/eo
new file mode 100644
index 0000000000..266134a82f
--- /dev/null
+++ b/trigrams/eo
@@ -0,0 +1,300 @@
+ la
+la
+ de
+de
+aj
+oj
+as
+is
+en
+ en
+ ka
+est
+o d
+ es
+kaj
+e l
+to
+sta
+o e
+io
+o k
+on
+ ko
+ro
+ta
+tas
+ al
+a k
+ pr
+n l
+a a
+ po
+ ki
+ ma
+o l
+jn
+ant
+ li
+a p
+ist
+s l
+nto
+sti
+j k
+no
+ita
+tis
+do
+an
+ent
+ re
+aŭ
+j e
+kon
+li
+toj
+ran
+n k
+ ti
+s e
+el
+al
+a s
+ in
+ter
+aro
+ an
+a m
+a e
+ia
+n d
+ojn
+per
+ s
+j d
+ se
+nta
+str
+sto
+a l
+ pl
+mo
+a d
+ ĝi
+ si
+ tr
+and
+s k
+o p
+lo
+j l
+tra
+par
+ pa
+unu
+pro
+ono
+o a
+nte
+j p
+ no
+ ku
+te
+mal
+taj
+ el
+kom
+iu
+art
+roj
+ ja
+ĝis
+ mo
+lan
+ra
+a r
+s a
+ vi
+era
+tro
+gra
+er
+e k
+ori
+n e
+ di
+ata
+int
+s p
+o s
+a f
+ko
+a t
+j a
+n p
+ ek
+kiu
+na
+ne
+ pe
+e e
+e d
+da
+ili
+l l
+ado
+ank
+ver
+por
+men
+e a
+ ne
+man
+ me
+ du
+un
+ un
+ato
+kun
+mon
+ali
+ste
+ajn
+dis
+tri
+rio
+j s
+ lo
+ara
+pre
+ te
+ gr
+oni
+kie
+nom
+jar
+nda
+i e
+ĝi
+noj
+kto
+ero
+n s
+igi
+cio
+e s
+a v
+a n
+or
+pri
+e p
+ fo
+ ĉe
+iĝi
+s s
+n a
+ ha
+eri
+ ar
+ndo
+a u
+ont
+ano
+lia
+iel
+ost
+ris
+ fa
+ort
+iko
+lin
+ari
+ ĉi
+ri
+iaj
+ion
+mun
+ ve
+ino
+tor
+ sa
+loj
+co
+nis
+ton
+ aŭ
+e m
+ona
+rto
+aci
+spe
+ala
+ple
+for
+o t
+vas
+olo
+tiu
+jo
+pos
+kaŭ
+re
+j m
+nio
+ fi
+ st
+o m
+ ba
+tan
+a j
+ekt
+ ge
+ons
+s m
+omo
+ing
+ mi
+omu
+a b
+a i
+ten
+enc
+res
+ika
+rbo
+vis
+nka
+pli
+ a
+ mu
+iuj
+tem
+hav
+ kr
+ na
+ila
+alo
+ ke
+aĵo
+umo
+i l
+ani
+ova
+num
+r l
+urb
+ron
+ ap
+am
+tat
+tur
+cia
+ ri
+ovi
+ava
+ntr
+ or
+ejo
+nst
+ka
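
The README hunk above says that enabling a language only takes an entry in ~guess-language-langcodes~, which this commit adds for ~eo~. What follows is a minimal configuration sketch for trying it out, not part of the commit. It assumes the package is installed and that the spell-checker has an Esperanto dictionary registered under the name "eo", the name used in the ~guess-language.el~ hunk; the dictionary name on a given system may differ. ~guess-language-languages~ is taken here to be the option whose docstring is amended in the hunk at line 65; restricting it to two languages is purely illustrative.

```elisp
;; Sketch only; not part of this commit.
(require 'guess-language)

;; Candidate languages to choose between, as ISO 639-1 symbols.
;; `eo' is a usable choice once the langcodes entry above exists.
;; (Assumption: a user who writes English and Esperanto.)
(setq guess-language-languages '(en eo))

;; Paragraphs shorter than this many characters are not guessed;
;; 40 is the default shown in the diff context above.
(setq guess-language-min-paragraph-length 40)

;; Turn the minor mode on in text buffers.
(add-hook 'text-mode-hook #'guess-language-mode)
```

If no dictionary named "eo" is installed, the second element of the ~eo~ entry in ~guess-language-langcodes~ would need to be changed to whatever name the local ispell, aspell, or hunspell setup uses.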


