silpa-discuss

Re: [silpa-discuss] Improving Transliteration System


From: Santhosh Thottingal
Subject: Re: [silpa-discuss] Improving Transliteration System
Date: Thu, 18 Apr 2013 15:31:27 +0530




2013/4/18 Yash Sinha <address@hidden>
Hello!
I wish to improve the transliteration system in the following ways:

Add Hindi as an intermediate language for transliteration.


That may not work, because no single Indian language can represent all the sounds of every other Indian language. The intermediate representation I have in mind is IPA or ISO 15919.
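
To illustrate the idea of a script-neutral intermediate representation, here is a minimal sketch that routes Devanagari to Malayalam through ISO 15919-style tokens. The tables cover only three consonants, ignore vowel signs and virama, and every name in the code (`DEVANAGARI_TO_PIVOT`, `to_pivot`, `transliterate`, and so on) is hypothetical; it is not taken from the SILPA code base.

```python
# Toy pivot tables: each letter maps to an ISO 15919-style token.
# Only three consonants (with their inherent vowel "a") are covered;
# a real table would need vowels, vowel signs, virama and more.
DEVANAGARI_TO_PIVOT = {"क": "ka", "म": "ma", "ल": "la"}
MALAYALAM_TO_PIVOT = {"ക": "ka", "മ": "ma", "ല": "la"}


def invert(table):
    """Build the pivot -> script table from a script -> pivot table."""
    return {token: letter for letter, token in table.items()}


def to_pivot(word, script_to_pivot):
    """Convert a word to a list of pivot tokens; unknown letters pass through."""
    return [script_to_pivot.get(ch, ch) for ch in word]


def from_pivot(tokens, pivot_to_script):
    """Render pivot tokens in the target script; unknown tokens pass through."""
    return "".join(pivot_to_script.get(tok, tok) for tok in tokens)


def transliterate(word, source_table, target_table):
    """Source script -> pivot tokens -> target script."""
    return from_pivot(to_pivot(word, source_table), invert(target_table))


print(transliterate("कमल", DEVANAGARI_TO_PIVOT, MALAYALAM_TO_PIVOT))
```

With a pivot, each script needs only one table to and from the intermediate form, rather than one table per language pair.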


The second way is a Compressed Word Format (CWF) mapping algorithm that uses a modified Levenshtein algorithm.

1. CWF will convert a word to its basic phonetic letters.
For example: musharraf -> musaraf, chidhambaram -> cidambaram

Please also read about the existing Indic Soundex algorithm: http://thottingal.in/blog/2009/07/26/indicsoundex/
 

3. We will then rank the words according to their CWF + MLev (modified Levenshtein) distance.
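
Steps 1 and 3 as quoted could be sketched roughly as follows. The thread does not spell out the CWF rules or the modification to Levenshtein, so both are assumptions here: `cwf` uses two made-up rules (drop an "h" after a consonant, collapse doubled letters) chosen only because they reproduce the two examples above, and `levenshtein` is the standard, unmodified edit distance used as a stand-in. All function names are hypothetical.

```python
import re


def cwf(word):
    """Hypothetical compressed form: lowercase, drop 'h' after a
    consonant (sh -> s, ch -> c, dh -> d), collapse repeated letters."""
    word = word.lower()
    word = re.sub(r"([bcdgjkpst])h", r"\1", word)
    return re.sub(r"(.)\1+", r"\1", word)


def levenshtein(a, b):
    """Standard edit distance (insert, delete, substitute, all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def rank(query, candidates):
    """Sort candidates by edit distance between compressed forms."""
    key = cwf(query)
    return sorted(candidates, key=lambda c: levenshtein(key, cwf(c)))


print(cwf("musharraf"), cwf("chidhambaram"))
print(rank("musharaf", ["chidhambaram", "musharraf"]))
```

Comparing compressed forms rather than raw spellings means that variants such as "musharaf" and "musharraf" end up at distance zero from each other.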


Can you write up the approach in a bit more detail somewhere, maybe in our wiki?

Thanks
Santhosh

