[Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words

From: Navaneeth
Subject: [Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words while learning
Date: Wed, 19 Mar 2014 04:50:09 +0000
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:27.0) Gecko/20100101 Firefox/27.0


                 Summary: [libvarnam] Normalization of words while learning
                 Project: Varnamproject
            Submitted by: navaneethkn
            Submitted on: Wed 19 Mar 2014 01:50:08 PM TLT
                Category: libvarnam
                Severity: 3 - Normal
              Item Group: Bug
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any



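In Unicode, some characters look the same but have different code points.
The atomic chillus in Malayalam are an example: a chillu can be encoded
either as a single atomic code point or as a consonant + VIRAMA + ZERO WIDTH
JOINER sequence. Unicode text can also contain invisible formatting
characters such as SOFT HYPHEN (U+00AD). Such characters have to be removed,
and the word normalized to a standard form, while learning a word.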

The scheme file will define the word normalization rules, and varnam will
apply them while learning.
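
A rough sketch of the idea, in Python for brevity. The rule table, the
function name normalize_word, and the choice of the atomic chillu as the
standard form are all assumptions made for illustration; this is not
varnam's scheme syntax or API, and the real rules would come from the scheme
file.

```python
# Hypothetical normalization rules (illustration only, not varnam's format).
SOFT_HYPHEN = "\u00ad"
ZWJ = "\u200d"      # ZERO WIDTH JOINER
VIRAMA = "\u0d4d"   # MALAYALAM SIGN VIRAMA

NORMALIZATION_RULES = {
    # consonant + VIRAMA + ZWJ sequence -> atomic chillu (assumed standard form)
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # NA sequence -> CHILLU N
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # LA sequence -> CHILLU L
    # invisible formatting character: drop it
    SOFT_HYPHEN: "",
}


def normalize_word(word, rules=NORMALIZATION_RULES):
    """Return `word` with every rule pattern replaced by its standard form."""
    # Apply longer patterns first so multi-character sequences are matched
    # before any shorter pattern could touch part of them.
    for pattern in sorted(rules, key=len, reverse=True):
        word = word.replace(pattern, rules[pattern])
    return word


# The sequence form and the atomic form of the same word normalize alike.
sequence_form = "\u0d05\u0d35\u0d28\u0d4d\u200d"  # ends in NA + VIRAMA + ZWJ
atomic_form = "\u0d05\u0d35\u0d7b"                # ends in CHILLU N
same_after_learning = normalize_word(sequence_form) == normalize_word(atomic_form)
```

With rules like these, both spellings of a word would be learned as a single
entry instead of two.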


  Message sent via/by Savannah