Re: [Varnamproject-discuss] Varnam GSOC ideas

From: Navaneeth K N
Subject: Re: [Varnamproject-discuss] Varnam GSOC ideas
Date: Sun, 22 Feb 2015 17:49:06 +0530

Hi Kevin,

Urdu is something we are trying to add. It is in the early stages, so I 
don’t know yet whether we will run into any issues. But if Urdu works well, 
that will be good confirmation that Varnam can work with non-Indian languages.

Varnam should compile and work fine on Windows, although I haven’t tried it for 
the last few months. Let me know if you run into any issues when doing so.

> On 17-Feb-2015, at 2:22 pm, Kevin Martin <address@hidden> wrote:
> I've been thinking about this idea. It's true that it is relevant only until 
> Varnam gets support for all languages. But the rules for the stemmer also go 
> into the scheme file. To improve or add a stemmer for other languages we have 
> to edit the scheme file again. For someone not well versed in Ruby, the syntax 
> (all the square brackets and curlies) might look a bit intimidating. I 
> remember Soorej having difficulty reading the scheme file. So if we come up 
> with an editor/GUI tool that makes editing the scheme file more intuitive for 
> the end user, I think it will eventually take the burden of adding 
> language/stemmer support off the developer. No matter how many comments we 
> include in the scheme file, an end user will always prefer a GUI to a 
> command-line interface.
> Also, can Varnam be adapted to handle non-Indic languages? I know that the 
> database now contains Sanskrit-based entries like swaras and viramas. But if 
> someone wants to, say, add support for Arabic, will it need changes to the 
> underlying logic?
> Also, does Varnam now compile under Windows? 
> On Fri, Feb 13, 2015 at 10:51 AM, Navaneeth K N <address@hidden> wrote:
> We learned more words in Kannada. The whole Wikipedia dump was fed into 
> Varnam. I haven’t released it yet.
> ---
> Navaneeth
> > On 13-Feb-2015, at 10:48 am, Kevin Martin <address@hidden> wrote:
> >
> > 21 GB? That's way too much! But how come Kannada takes so much space when 
> > Malayalam is less than 1 GB?
> >
> > But I think we should get varnam into mobile. That will result in the 
> > project getting the popularity it deserves.
> >
> > On Fri, Feb 13, 2015 at 10:34 AM, Navaneeth K N <address@hidden> wrote:
> > Hi Kevin,
> >
> > Yes. That is a good idea. But it is useful only until Varnam gets support 
> > for all languages. I think as part of GSOC, we should target something 
> > that can serve long term, like the stemmer implementation.
> >
> > I was thinking of getting Varnam into the mobile space and integrating it 
> > with Indic Keyboard, or a new Varnam keyboard altogether. What do you think 
> > about this?
> >
> > Or rewrite the learning algorithm so that the learned data takes less space. 
> > This is critical when getting into the mobile space and offline editing. 
> > Currently, the Kannada learned file takes about 21 GB of space.
> >
> > thoughts?
> >
> > —
> > Navaneeth
> >
> > > On 12-Feb-2015, at 9:20 pm, Kevin Martin <address@hidden> wrote:
> > >
> > > Hi,
> > >
> > > I was wondering about a few project ideas related to Varnam for students 
> > > this year. When Soorej and I were working on the InScript support, Soorej 
> > > had to modify the scheme file. He mentioned that it would be nice if we 
> > > had a GUI tool that makes it easy to write the scheme file. Can this be a 
> > > GSOC project idea, perhaps of a lower priority? Maybe we can add a few 
> > > more related tasks to this.