Re: [ANN] guile-snowball-stemmer 0.1.0


From: amirouche
Subject: Re: [ANN] guile-snowball-stemmer 0.1.0
Date: Tue, 07 May 2019 22:36:21 +0200
User-agent: Roundcube Webmail/1.3.8

On 2019-05-07 20:30, address@hidden wrote:
> On 2019-05-07 15:28, address@hidden wrote:
>> I am pleased to announce the immediate availability of guile-snowball-stemmer.


I made (yet another) toy search engine. It is a small command-line
tool, attached to this mail. The code can be found at:

  https://git.sr.ht/~amz3/guile-gotofish

Here is an example run:

$ mkdir ~/.gotofish  # Database is stored there
$ guile -L . gotofish.scm search gnu guile  # Nothing yet!

# Let's index a couple of articles

$ curl https://en.wikipedia.org/wiki/GNU_Guile | html2text | guile -L . gotofish.scm index "GNU Guile"
Done!
$ curl https://en.wikipedia.org/wiki/Scheme_%28programming_language%29 | html2text | guile -L . gotofish.scm index "Scheme"
Done!
$ curl https://en.wikipedia.org/wiki/GNU | html2text | guile -L . gotofish.scm index "GNU"
Done!
$ curl https://en.wikipedia.org/wiki/Tf%E2%80%93idf | html2text | guile -L . gotofish.scm index "tf-idf"
Done!

# Let's search

$ guile -L . gotofish.scm search gnu guile
** Scheme
** GNU Guile

$ guile -L . gotofish.scm search gnu
** GNU
** GNU Guile
** Scheme

$ guile -L . gotofish.scm search science
** GNU
** GNU Guile
** Scheme

$ guile -L . gotofish.scm search retrieval

# Even if the exact word "retrieval" is not in those pages,
# "retrieved" has the same stem as "retrieval", so all of
# them match:

** GNU
** tf-idf
** GNU Guile
** Scheme

$ guile -L . gotofish.scm search idf
** tf-idf
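
To illustrate why "retrieval" matches pages that only contain "retrieved", here is a minimal sketch in Python. It uses a toy suffix-stripping function as a stand-in for a real Snowball stemmer; the suffix list and the names toy_stem and matches are made up for this example and are not what guile-snowball-stemmer or gotofish.scm actually do.

```python
# Toy suffix-stripping stemmer: a stand-in for a real Snowball stemmer,
# NOT the algorithm used by guile-snowball-stemmer.
SUFFIXES = ("ing", "ed", "al", "es", "s")  # hypothetical, illustrative only


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, document):
    """True if any stemmed word of the document equals the stemmed query."""
    target = toy_stem(query)
    return any(toy_stem(w) == target for w in document.split())


print(toy_stem("retrieval"), toy_stem("retrieved"))  # retriev retriev
print(matches("retrieval", "Page retrieved on 7 May 2019"))  # True
```

Both words reduce to the same stem, so a lookup on one finds documents containing the other. A real stemmer applies language-specific rules rather than a fixed suffix list.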


One can also use multiple words in a single lookup.
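
A multi-word lookup can be sketched with a small inverted index, here in Python. This is only an assumption about how such a tool might work: the TinyIndex class, the AND semantics (a title must contain every query word), and the sample sentences are all invented for illustration and are not taken from gotofish.scm or from the Wikipedia pages.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Inverted index: normalized word -> set of document titles."""

    def __init__(self, normalize=str.lower):
        # normalize could be swapped for a stemming function.
        self.normalize = normalize
        self.postings = defaultdict(set)

    def index(self, title, text):
        """Record every word of text as pointing at title."""
        for word in re.findall(r"\w+", text):
            self.postings[self.normalize(word)].add(title)

    def search(self, *words):
        """Return titles containing every query word (assumed AND semantics)."""
        if not words:
            return []
        sets = [self.postings.get(self.normalize(w), set()) for w in words]
        return sorted(set.intersection(*sets))


idx = TinyIndex()
# Made-up sample sentences, not the real page contents.
idx.index("GNU Guile", "GNU Guile is an implementation of Scheme")
idx.index("Scheme", "Scheme is a dialect of Lisp used by GNU Guile")
idx.index("GNU", "GNU is an operating system")

print(idx.search("gnu", "guile"))  # ['GNU Guile', 'Scheme']
print(idx.search("gnu"))           # ['GNU', 'GNU Guile', 'Scheme']
```

Results are sorted alphabetically here for determinism; the real tool may order them differently, for instance by a relevance score.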

This is very primitive, but hopefully it will help me get going
tomorrow on building my great app!

Attachment: gotofish.scm
Description: Text document

Attachment: README.md
Description: Text document

