[silpa-discuss] Re: Usage of Soundex algorithm in Spellchecker module


From: Santhosh Thottingal
Subject: [silpa-discuss] Re: Usage of Soundex algorithm in Spellchecker module
Date: Sat, 24 Apr 2010 16:34:43 +0530

On Sat, Apr 24, 2010 at 1:22 PM, Vasudev Kamath <address@hidden> wrote:
> Hi,
> I've seen that soundex is used to filter the candidates obtained after applying
> the Levenshtein algorithm. But as per this discussion, soundex is pretty old and
> was basically developed to compare American names (or words):
> http://stackoverflow.com/questions/42013/levenshtein-distance-based-methods-vs-soundex
> Now my point is: how well can it help us in Indic languages? In my research
> I found one more algorithm, called Metaphone, developed in the 1990s. Can we try to
> include this in silpa?


The algorithm used in silpa is not the Soundex algorithm designed for
American names. We call it soundex only because it is a sounds-like
search algorithm. In the original Soundex, vowels are ignored unless they
appear as the first letter. In our version, Indic soundex, we do not
ignore vowels or vowel signs. The code we assign to each letter follows
the pronunciation characteristics of Indian letters. We also treat all
Indian languages as a single category, thereby facilitating
cross-language comparison. In short, except for the name, everything
else is different, so there is no point in comparing our algorithm with
the original Soundex algorithm for English.
The truth is that we cannot use Metaphone or Soundex as-is for Indic
languages. We adopted the idea and wrote our own implementation based
on the linguistic features of our languages.
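
To make the idea concrete, here is a rough sketch of a script-neutral,
vowel-preserving key. It is not the silpa code table. It assumes only
that the main Indic Unicode blocks (Devanagari through Malayalam) share
the same ISCII-derived layout, so a letter's offset inside its
128-code-point block is the same across scripts. The names indic_key,
script_neutral_offset and ASPIRATED_OFFSETS, and the choice to fold
aspirated consonants onto their unaspirated neighbours, are hypothetical
and only for illustration.

```python
# Illustrative sketch only: NOT the code table used in silpa.
# Assumption: Devanagari..Malayalam blocks share one ISCII-derived layout,
# so (code point - block start) names the same letter in every script.

INDIC_START = 0x0900   # first code point of the Devanagari block
INDIC_END = 0x0D7F     # last code point of the Malayalam block
BLOCK_SIZE = 0x80      # each of these script blocks spans 128 code points

# Hypothetical "sounds-like" grouping: offsets of the aspirated stops
# (kha, gha, cha, jha, ttha, ddha, tha, dha, pha, bha). Each one sits
# directly after its unaspirated partner in the shared layout.
ASPIRATED_OFFSETS = {0x16, 0x18, 0x1B, 0x1D, 0x20, 0x22, 0x25, 0x27, 0x2B, 0x2D}


def script_neutral_offset(ch):
    """Return the offset of ch inside its Indic block, or None if not Indic."""
    cp = ord(ch)
    if INDIC_START <= cp <= INDIC_END:
        return (cp - INDIC_START) % BLOCK_SIZE
    return None


def indic_key(word):
    """Build a script-neutral key; vowels and vowel signs are kept."""
    key = []
    for ch in word:
        offset = script_neutral_offset(ch)
        if offset is None:
            continue  # skip anything outside the Indic blocks
        if offset in ASPIRATED_OFFSETS:
            offset -= 1  # fold aspirated stop onto its unaspirated partner
        key.append(offset)
    # Digits and other signs inside the blocks are kept as well;
    # a real table would be more selective about what it encodes.
    return tuple(key)


if __name__ == "__main__":
    # The same word written in Kannada and in Devanagari gets the same key.
    print(indic_key("ಕಮಲ") == indic_key("कमल"))
    # A vowel sign changes the key, unlike in the original Soundex.
    print(indic_key("कमल") == indic_key("कामल"))
```

With this sketch, the same word in two scripts compares equal, while
adding a vowel sign produces a different key. The real implementation
assigns its codes by pronunciation characteristics, as described above.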

More on this: http://thottingal.in/blog/2009/07/26/indicsoundex/

-Santhosh



