silpa-discuss

[silpa-discuss] Dictionary Index Generator


From: Vasudev Kamath
Subject: [silpa-discuss] Dictionary Index Generator
Date: Sun, 25 Apr 2010 17:00:06 +0530
User-agent: KMail/1.12.4 (Linux/2.6.33-2.slh.10-sidux-686; KDE/4.3.4; i686; ; )

Hi,
I have finally written a script that generates an index file for a given
dictionary.
Here are some assumptions:
1. If the file is an English dictionary, it is opened with the normal
open(), since the English dictionary encoding is ISO-8859; otherwise files are
opened with UTF-8 encoding.
2. For English, lowercase and uppercase letters are treated differently, since
words starting with a and A begin at different locations in the dictionary. To
fix this, the dictionary itself needs to be fixed.
3. I used cPickle instead of saving the index as a plain file. cPickle with
protocol 2 is used for efficiency, so the index file won't be human-readable.
The reason for using Python pickles as the file format is simply to reduce the
complexity of processing the index file. If desired, we can create the index as
a plain file instead. I need suggestions on this.
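
To make the three assumptions above concrete, here is a minimal sketch. It is
not the attached indexer.py. It assumes a hypothetical dictionary layout with
one entry per line, and it assumes ISO-8859-1 for English (the post only says
ISO-8859). The names build_index, save_index and load_index are made up for
illustration. It is written for Python 3, where cPickle has been folded into
the pickle module.

```python
import pickle


def build_index(dict_path, is_english):
    """Map each entry's first character to the byte offset where it first appears.

    Assumes one entry per line (hypothetical layout, not confirmed by the post).
    """
    # Assumption: "ISO-8859" is taken to mean ISO-8859-1 for English files.
    encoding = "iso-8859-1" if is_english else "utf-8"
    index = {}
    offset = 0
    # Read raw bytes so that offsets are exact, then decode line by line.
    with open(dict_path, "rb") as handle:
        for raw_line in handle:
            text = raw_line.decode(encoding).strip()
            if text:
                # Case is preserved, so "a" and "A" get separate offsets.
                index.setdefault(text[0], offset)
            offset += len(raw_line)
    return index


def save_index(index, index_path):
    """Write the index as a binary pickle using protocol 2."""
    with open(index_path, "wb") as handle:
        pickle.dump(index, handle, protocol=2)


def load_index(index_path):
    """Read back an index written by save_index."""
    with open(index_path, "rb") as handle:
        return pickle.load(handle)
```

A lookup would then seek to index[letter] in the dictionary file opened in
binary mode. If a human-readable index is preferred, the same mapping could be
written one "letter offset" pair per line instead of pickled, at the cost of a
small parsing step when loading.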

I'm attaching the dictionary indexing script. Please test it and let me know of
any changes that need to be made.

Thanks and Regards
Vasudev Kamath

Attachment: indexer.py
Description: Text Data

