[Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words

From:	Navaneeth
Subject:	[Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words while learning
Date:	Wed, 19 Mar 2014 04:50:09 +0000
User-agent:	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:27.0) Gecko/20100101 Firefox/27.0

URL:
  <http://savannah.nongnu.org/bugs/?41902>

                 Summary: [libvarnam] Normalization of words while learning
                 Project: Varnamproject
            Submitted by: navaneethkn
            Submitted on: Wed 19 Mar 2014 01:50:08 PM TLT
                Category: libvarnam
                Severity: 3 - Normal
              Item Group: Bug
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

In Unicode, there are characters which look the same but have different code
points. The atomic chillus in Malayalam are an example. Unicode text can also
contain invisible formatting characters like "SOFT HYPHEN (U+00AD)". These have
to be removed, and the word normalized to a standard form, while learning it.

The scheme file will define the word normalization rules, and varnam will
apply them while learning.
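
A rough sketch of what such normalization could look like, written in Python
for illustration only. It is not libvarnam code and not the scheme file
syntax. The names `LEGACY_CHILLU_TO_ATOMIC` and `normalize_word` are
hypothetical, and choosing the atomic chillu code points as the standard form
is an assumption; the rules could just as well go the other way.

```python
# Illustrative sketch only: not libvarnam's API or scheme file format.
SOFT_HYPHEN = "\u00ad"
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

# Hypothetical rule table: legacy chillu sequence (consonant + virama + ZWJ)
# mapped to the atomic chillu code point (assumed here to be the target form).
LEGACY_CHILLU_TO_ATOMIC = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",  # CHILLU NN
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",  # CHILLU N
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",  # CHILLU RR
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",  # CHILLU L
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",  # CHILLU LL
    "\u0d15" + VIRAMA + ZWJ: "\u0d7f",  # CHILLU K
}


def normalize_word(word):
    """Drop soft hyphens, then rewrite legacy chillu sequences as atomic chillus."""
    word = word.replace(SOFT_HYPHEN, "")
    for legacy, atomic in LEGACY_CHILLU_TO_ATOMIC.items():
        word = word.replace(legacy, atomic)
    return word
```

With rules like these, two spellings that render identically would be stored
as the same word when learned, instead of as two separate entries.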




    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?41902>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/

Current Thread

[Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words while learning, Navaneeth <=
- [Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words while learning, kiran, 2014/03/27
  - [Varnamproject-discuss] [bug #41902] [libvarnam] Normalization of words while learning, Navaneeth, 2014/03/28
