
[Varnamproject-discuss] Next step

From: Kevin Martin
Subject: [Varnamproject-discuss] Next step
Date: Fri, 27 Jun 2014 20:39:22 +0530

I've almost finished drafting the stem rules for varnam. Stemming accuracy is around 85% when tested with words from Malayalam Wikipedia articles. However, this 85% figure includes words that need no stemming at all. That is, out of 100 words, perhaps 40 (a guess) won't be stemmed and would still be counted as correct because they do not need stemming. If 100 words that all need stemming were supplied to the algorithm, the accuracy might be lower.
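The gap between the two figures can be made concrete with a small sketch. It is not part of varnam: the function name `stemming_accuracy` and the `(word, expected_stem, produced_stem)` tuple layout are assumptions for illustration, and the 40/45/15 split below is synthetic data built from the guessed numbers above, not a measured result.

```python
def stemming_accuracy(results):
    """Return (overall accuracy, accuracy on stem-able words only).

    `results` is a list of (word, expected_stem, produced_stem) tuples.
    A word counts as stem-able when its expected stem differs from the word.
    """
    total = len(results)
    correct = sum(1 for _, expected, produced in results if expected == produced)

    # Keep only the words that actually need stemming.
    stemmable = [r for r in results if r[1] != r[0]]
    stemmable_correct = sum(1 for _, expected, produced in stemmable
                            if expected == produced)

    overall = correct / total if total else 0.0
    stemmable_only = stemmable_correct / len(stemmable) if stemmable else 0.0
    return overall, stemmable_only


# Synthetic sample: 40 words that need no stemming (all left untouched),
# 45 stem-able words stemmed correctly, 15 stem-able words left unstemmed.
sample = (
    [("w%d" % i, "w%d" % i, "w%d" % i) for i in range(40)]
    + [("s%dx" % i, "s%d" % i, "s%d" % i) for i in range(45)]
    + [("t%dx" % i, "t%d" % i, "t%dx" % i) for i in range(15)]
)

overall, stemmable_only = stemming_accuracy(sample)
print("overall: %.2f, stem-able only: %.2f" % (overall, stemmable_only))
```

Under these assumed numbers, 85 of 100 words are correct overall, but only 45 of the 60 stem-able words are, so reporting both figures side by side shows how much of the headline accuracy comes from words that never needed stemming.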

I hope to make varnam stem words while it learns soon. I suppose I will have to test the results using the ibus engine. What exactly am I looking for? More suggestions when typing a word? Is there a way to describe the change as a metric (for example, 20% better suggestions)? I've always wondered how to estimate things like this.
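One possible way to put a number on "better suggestions" is to check how often the intended word shows up among the first few suggestions, before and after the change, and report the relative difference. The sketch below assumes exactly that. `suggest` is a hypothetical callable standing in for whatever returns the suggestion list; it is not the varnam or ibus API, and the choice of top-k hit rate as the metric is an assumption, not an established varnam benchmark.

```python
def hit_rate_at_k(suggest, test_cases, k=5):
    """Fraction of test cases whose intended word is in the top-k suggestions.

    `suggest` is a hypothetical callable: typed input -> ordered list of words.
    `test_cases` is a list of (typed_input, intended_word) pairs.
    """
    if not test_cases:
        return 0.0
    hits = sum(1 for typed, intended in test_cases
               if intended in suggest(typed)[:k])
    return hits / len(test_cases)


def relative_improvement(before, after):
    """Relative change of `after` over `before`, e.g. 0.2 means 20% better."""
    if before == 0:
        return float("inf") if after > 0 else 0.0
    return (after - before) / before


# Toy stand-ins for the suggestion engine before and after stemming is enabled.
before_table = {"a": ["x"], "b": ["y"], "c": [], "d": []}
after_table = {"a": ["x"], "b": ["y"], "c": ["z"], "d": []}
cases = [("a", "x"), ("b", "y"), ("c", "z"), ("d", "q")]

before = hit_rate_at_k(lambda typed: before_table.get(typed, []), cases)
after = hit_rate_at_k(lambda typed: after_table.get(typed, []), cases)
print("hit rate before: %.2f, after: %.2f, relative change: %.0f%%"
      % (before, after, 100 * relative_improvement(before, after)))
```

Running the same fixed list of (typed input, intended word) pairs against both builds keeps the comparison fair, since only the learning behaviour changes between runs.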
