guile-user

Re: [HELP] a search engine in GNU Guile


From: Amirouche Boubekki
Subject: Re: [HELP] a search engine in GNU Guile
Date: Fri, 09 Sep 2016 20:10:30 +0200
User-agent: Roundcube Webmail/1.1.2

On 2016-09-09 16:05, Ralf Mattes wrote:
On Fri, Sep 09, 2016 at 09:39:24AM -0500, Christopher Allan Webber wrote:
Amirouche Boubekki writes:

> - port whoosh/lucene to guile to improve text search

Sorry, but I don't see the point of this.

I meant "to improve on the text search of my previous attempt at writing a search engine". The previous iteration of this project does not support boolean search.
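To make "boolean search" concrete, here is a minimal sketch of AND/OR/NOT evaluation over an inverted index. It is written in Python (the language Whoosh is written in) purely for illustration; it is not code from wsh.scm, and the names `build_index` and `search`, the tuple-based query format, and the sample documents are all assumptions made up for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, all_ids, query):
    """Evaluate a query: a term string, or a tuple ("and" | "or" | "not", ...)."""
    if isinstance(query, str):
        # A bare term matches the documents listed under it (possibly none).
        return set(index.get(query.lower(), set()))
    op, *args = query
    results = [search(index, all_ids, arg) for arg in args]
    if op == "and":
        return set.intersection(*results)
    if op == "or":
        return set.union(*results)
    if op == "not":
        # NOT is the complement relative to the whole collection.
        return all_ids - results[0]
    raise ValueError(f"unknown operator: {op}")


# Hypothetical sample collection, only for the example.
docs = {
    1: "guile is a scheme implementation",
    2: "lucene is a search library",
    3: "a search engine in guile",
}
index = build_index(docs)
all_ids = set(docs)

# Documents that mention "search" but not "lucene".
hits = search(index, all_ids, ("and", "search", ("not", "lucene")))
```

A real engine would add tokenization, ranking and a persistent store for the posting lists, which is where the choice of storage engine comes in.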

At least Lucene has a http-based
interface that can be accessed by any kind of client language.

That is trivial to do with guile too.

Why reinvent the wheel

Because it's a hobby.

(and, in the case of Lucene, a rather well working,

It's not possible to use a custom storage engine with Lucene.

extremely mature

My theory is that some search engine businesses, such as Algolia, forked Lucene to build it on top of something similar to WiredTiger and can now claim impressive performance.

What I mean to say, basically, is that wsh.scm is innovation. I read here and there that big players are actually using storage engines similar to WiredTiger to build search engines...
So it's not a bad idea, it's just an idea that is not common.

and complex wheel)?

How complex? That's what I am trying to understand. AFAIK it's not as complex as OpenCog, since I can rewrite more of its features.



This is something I'd love to see generally.  It would be nice to have
an indexing library, either by writing bindings to Xapian (which
unfortunately couldn't use the FFI since it's C++),

But almost all of Xapian's bindings are Swig-generated (and that seems to be the preferred way of generating bindings). IIRC I used the Swig Guile bindings years ago (I'm pretty sure that code got lost in a hard disk crash, but I'm too lazy to google it up ...).

or natively porting
something like Whoosh, for Guile.

I've seen similar approaches for Common Lisp (search for montezuma) but in the
end it seems to be way too much work - remember that not a small part of
Lucene's success is based on the existing ecosystem (Solr, excellent language
parsers et al.)

If you are thinking about stemming, wsh does not support it at all yet. It's an area
I'd like to improve.

I agree that if someone wants to create a business using Guile, they would be up and running faster using ES or Solr. Still, I think wsh will be a good contribution to the Guile ecosystem. I am not building a business, I'm studying the free software zoo. wsh is basically notes in the form of code on the road to what I actually want to reach, which is concept search, cf. https://en.wikipedia.org/wiki/Concept_search


