silpa-discuss

[silpa-discuss] Re: Optimization idea


From: Vasudev Kamath
Subject: [silpa-discuss] Re: Optimization idea
Date: Sat, 24 Apr 2010 23:05:24 +0530
User-agent: KMail/1.12.4 (Linux/2.6.33-2.slh.6-sidux-686; KDE/4.3.4; i686; ; )

OK then, let me summarize our discussion:
1. Dictionary files will have an index file that maps each letter to the line 
number where the words beginning with that letter start. (Assumption: the 
dictionary has one word per line.) Doubt: will there be a single index file 
containing information on all dictionaries, or one index file per dictionary?
2. Format of the index file
     a=1
     b=2000
     .....
3. How will the file be saved? As a normal file or as Python pickles?
4. A generic Python program for generating this index file should be written.
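
To make points 2 and 4 concrete, here is a minimal sketch of what such a generator could look like. It is only an illustration under the assumptions above (one word per line, words grouped by first letter, UTF-8 text); the names build_index and write_index are hypothetical and not part of silpa, and it writes a normal text file rather than a pickle.

```python
def build_index(dict_path):
    """Map each initial letter to the 1-based line number where it first appears.

    Assumes one word per line and that words sharing a first letter are
    stored contiguously (for example, a sorted dictionary).
    """
    index = {}
    with open(dict_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            word = line.strip()
            if word and word[0] not in index:
                index[word[0]] = line_no
    return index


def write_index(index, index_path):
    """Write the index as plain text, one 'letter=line' entry per line."""
    with open(index_path, "w", encoding="utf-8") as f:
        # Order entries by line number so the file follows the dictionary order.
        for letter, line_no in sorted(index.items(), key=lambda kv: kv[1]):
            f.write("%s=%d\n" % (letter, line_no))
```

Because the script only takes a dictionary path and an output path, it could be rerun whenever a dictionary is updated.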

Once the index file is created, how will it be used within silpa? Currently, 
when train is called, the dictionary file is read completely (if it has not 
been read already) and placed in a dictionary with the language as key. What 
about the new approach?
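
One possible answer, sketched here only as an assumption and not as the agreed design: read the small index file first, then pull in only the block of lines for the letter we need. The function names read_index and words_starting_with are hypothetical, and the index is assumed to use the letter=line format from point 2.

```python
from itertools import islice


def read_index(index_path):
    """Parse 'letter=line' entries into a dict of letter -> 1-based line number."""
    index = {}
    with open(index_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # rpartition splits on the last '=', so a letter that is itself
            # '=' would still parse correctly.
            letter, _, line_no = line.rpartition("=")
            index[letter] = int(line_no)
    return index


def words_starting_with(dict_path, index, letter):
    """Return only the dictionary words that begin with the given letter."""
    if letter not in index:
        return []
    start = index[letter]
    # The block ends where the next indexed letter begins, if there is one.
    later = [n for n in index.values() if n > start]
    stop = min(later) if later else None  # None means read to end of file
    with open(dict_path, encoding="utf-8") as f:
        # islice positions are 0-based; index line numbers are 1-based.
        block = islice(f, start - 1, stop - 1 if stop is not None else None)
        return [w.strip() for w in block if w.strip()]
```

Note that with line numbers the file is still scanned from the top up to the starting line; only the memory held is reduced. Storing byte offsets instead would allow a direct seek, but that would be a change to the format in point 2.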

If the format of the index file is the same as I mentioned in point 2, I'll 
start working on the index file creation script. Let me know if there are any 
changes to the points I mentioned above.

Thanks and Regards
Vasudev Kamath
> On Sat, Apr 24, 2010 at 10:33 PM, Vasudev Kamath <address@hidden> wrote:
> > Yeah, that clarifies all my doubts, so we can go ahead with your idea, which
> > will definitely improve the efficiency. We need to finalize the format
> > of the index file, and we should have some way to create this file
> > automatically by just giving the dictionary name as input. Let me
> > research this once the format of the index file is finalized.
> 
> Great. It would be good if we could measure and quantify the
> performance change between the old and new approaches.
> Writing a program to create this index file will not be difficult, I
> guess. The program should be generic, so that we are able
> to regenerate the index when we update the dictionaries.
> 
> Thanks
> Santhosh
> 



