varnamproject-discuss

[Varnamproject-discuss] [bug #40412] Repeated tokens in the tokens list


From: Navaneeth
Subject: [Varnamproject-discuss] [bug #40412] Repeated tokens in the tokens list
Date: Tue, 29 Oct 2013 14:11:34 +0000
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:24.0) Gecko/20100101 Firefox/24.0

URL:
  <http://savannah.nongnu.org/bugs/?40412>

                 Summary: Repeated tokens in the tokens list 
                 Project: Varnamproject
            Submitted by: navaneethkn
            Submitted on: Tue 29 Oct 2013 02:11:33 PM GMT
                Category: libvarnam
                Severity: 4 - Important
              Item Group: Bug
                  Status: None
                 Privacy: Public
             Assigned to: navaneethkn
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

It looks like `libvarnam` is adding duplicate items to the tokens list. For
example, tokenizing the word "അതുറങ്ങിയിട്ടുണ്ടായിരുന്നു" produces the
following output, in which several positions list the same token more than once:

[1 - 'a']
[4 - 'thu', 4 - 'tu']
[2 - 'ra', 2 - 'ra']
[4 - 'ngi', 4 - 'ngi', 4 - 'ngy', 4 - 'ngy']
[4 - 'yi', 4 - 'yy']
[4 - 'ttu']
[4 - 'nda', 4 - 'ndaa', 4 - 'nda', 4 - 'nda', 4 - 'ndaa', 4 - 'nda',
 4 - 'nta', 4 - 'ntaa', 4 - 'nta']
[4 - 'yi', 4 - 'yy']
[4 - 'ru']
[4 - 'nnu']
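
To make the duplication concrete, here is a minimal Python sketch (not libvarnam code, which is written in C) that removes repeated entries at each position while keeping first-seen order. The `(number, pattern)` tuple representation, the name `dedupe_tokens`, and the reading of the leading number as a token type are assumptions made for illustration only; the actual fix in the library may look quite different.

```python
from typing import List, Tuple

# Hypothetical representation: (leading number from the report, pattern).
Token = Tuple[int, str]


def dedupe_tokens(positions: List[List[Token]]) -> List[List[Token]]:
    """Drop repeated tokens within each position, preserving first-seen order."""
    result = []
    for tokens in positions:
        seen = set()
        unique = []
        for tok in tokens:
            if tok not in seen:
                seen.add(tok)
                unique.append(tok)
        result.append(unique)
    return result


# Token list copied from the report above.
reported = [
    [(1, "a")],
    [(4, "thu"), (4, "tu")],
    [(2, "ra"), (2, "ra")],
    [(4, "ngi"), (4, "ngi"), (4, "ngy"), (4, "ngy")],
    [(4, "yi"), (4, "yy")],
    [(4, "ttu")],
    [(4, "nda"), (4, "ndaa"), (4, "nda"), (4, "nda"), (4, "ndaa"),
     (4, "nda"), (4, "nta"), (4, "ntaa"), (4, "nta")],
    [(4, "yi"), (4, "yy")],
    [(4, "ru")],
    [(4, "nnu")],
]

for tokens in dedupe_tokens(reported):
    print("[" + ", ".join(f"{kind} - '{pattern}'" for kind, pattern in tokens) + "]")
```

Under these assumptions, the third position collapses to a single 'ra' and the seventh position to 'nda', 'ndaa', 'nta', 'ntaa'; positions without repeats are unchanged.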




    _______________________________________________________




