Re: [ANN] guile-snowball-stemmer 0.1.0


From: amirouche
Subject: Re: [ANN] guile-snowball-stemmer 0.1.0
Date: Tue, 07 May 2019 20:30:31 +0200
User-agent: Roundcube Webmail/1.3.8

On 2019-05-07 15:28, address@hidden wrote:
I am pleased to announce the immediate availability of guile-snowball-stemmer.

This is a binding library that computes the stems of words in various languages. The list of supported languages appears in the REPL session below.

It binds libstemmer, the C library of the Snowball project, whose official website is https://snowballstem.org/

It is mostly useful in the context of information retrieval.

The code is at https://git.sr.ht/~amz3/guile-snowball-stemmer

The path to the libstemmer shared library is hardcoded to the library's Guix store path. A Guix package definition for the C library is available in my Guix channel at:

  https://git.sr.ht/~amz3/guix-amz3-channel

That said, there is no Guix package for the bindings. Just include the file
attached to this mail in your project.
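
If the hardcoded path does not match your system, one possible workaround is to resolve the library location at load time. The following is a minimal sketch, not the code in the attached file: the LIBSTEMMER_PATH environment variable and the procedure names are assumptions made up for illustration. It only relies on Guile's dynamic-link, dynamic-func and pointer->procedure, and on libstemmer's sb_stemmer_new entry point.

```scheme
(use-modules (system foreign))

;; Assumption: LIBSTEMMER_PATH is a hypothetical environment variable,
;; not something the attached binding reads.  When it is unset, fall
;; back to the bare library name and let the dynamic linker search for it.
(define (libstemmer-path)
  (or (getenv "LIBSTEMMER_PATH") "libstemmer"))

;; Return a handle on the shared library, or #f when it cannot be loaded.
(define (open-libstemmer)
  (false-if-exception (dynamic-link (libstemmer-path))))

;; Wrap sb_stemmer_new, which takes two C strings (algorithm name and
;; character encoding) and returns a pointer to the stemmer.
(define (stemmer-constructor lib)
  (pointer->procedure '*
                      (dynamic-func "sb_stemmer_new" lib)
                      (list '* '*)))
```

Returning #f from open-libstemmer instead of raising lets the caller decide how to report a missing library.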

Here is a demo:

scheme@(guile-user)> (import (snowball-stemmer))

scheme@(guile-user)> (stemmers)
$1 = ("turkish" "swedish" "spanish" "russian" "romanian" "portuguese"
"porter" "norwegian" "italian" "hungarian" "german" "french" "finnish"
"english" "dutch" "danish")

scheme@(guile-user)> (make-stemmer "amazigh")
ERROR: In procedure scm-error:
ERROR: snowball-stemmer "Oops! Stemmer not found" "amazigh"

scheme@(guile-user)> (define english (make-stemmer "english"))
scheme@(guile-user)> (stem english "cycling")
$2 = "cycl"
scheme@(guile-user)> (stem english "ecology")
$3 = "ecolog"
scheme@(guile-user)> (stem english "library")
$4 = "librari"
scheme@(guile-user)> (stem english "virtual")
$5 = "virtual"
scheme@(guile-user)> (stem english "environment")
$6 = "environ"

scheme@(guile-user)> (define french (make-stemmer "french"))
scheme@(guile-user)> (stem french "environnement")
$7 = "environ"
scheme@(guile-user)> (stem french "bibliotheque")
$8 = "bibliothequ"
scheme@(guile-user)> (stem french "gazette")
$9 = "gazet"
scheme@(guile-user)> (stem french "constituant")
$10 = "constitu"
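
To illustrate the information retrieval use case, here is a minimal sketch of an inverted index keyed by stems, so that a query matches any document containing a word with the same stem. The procedure names index-documents and search are hypothetical and not part of the binding; the stemming procedure is passed in as an argument, so with the binding it could be something like (lambda (word) (stem english word)).

```scheme
;; Build a hash table mapping each stem to the list of ids of the
;; documents that contain a word with that stem.
;; STEM-WORD is any procedure from string to string.
;; DOCUMENTS is an alist of (id . text) pairs.
(define (index-documents stem-word documents)
  (let ((index (make-hash-table)))
    (for-each
     (lambda (doc)
       (for-each
        (lambda (word)
          (let* ((key (stem-word (string-downcase word)))
                 (ids (hash-ref index key '())))
            ;; Record each document at most once per stem.
            (unless (member (car doc) ids)
              (hash-set! index key (cons (car doc) ids)))))
        ;; Split the text into maximal runs of letters.
        (string-tokenize (cdr doc) char-set:letter)))
     documents)
    index))

;; Return the ids of the documents matching a single-word QUERY.
;; The query is stemmed the same way as the indexed words.
(define (search index stem-word query)
  (hash-ref index (stem-word (string-downcase query)) '()))
```

The important design point is that the same stemmer is applied at indexing time and at query time; otherwise the keys would not line up.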


Small update: I forgot to actually register new stemmers with the guardian.

Here is the patch:

diff --git a/snowball-stemmer.scm b/snowball-stemmer.scm
index b754808..603a97e 100644
--- a/snowball-stemmer.scm
+++ b/snowball-stemmer.scm
@@ -67,6 +67,7 @@
       (let ((out (proc (string->pointer algorithm) NULL)))
         (when (eq? out NULL)
           (error 'snowball-stemmer "Oops! Stemmer not found" algorithm))
+        (stemmers-guardian out)
         out))))

 (define (reap-stemmers)
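
For readers unfamiliar with Guile guardians, the pattern that the patch completes can be sketched as follows. This is an illustration under assumptions, not the contents of the attached file: plain lists stand in for the foreign pointers, make-fake-stemmer and release-stemmer! are hypothetical names, and the real binding would presumably free each pointer with libstemmer's sb_stemmer_delete at the point where this sketch increments a counter.

```scheme
;; A guardian is a procedure: calling it with an object registers the
;; object; calling it with no argument returns a registered object that
;; the garbage collector found unreachable, or #f when there is none.
(define stemmers-guardian (make-guardian))

;; Number of stemmers released so far, for demonstration only.
(define released 0)

;; Hypothetical cleanup hook standing in for the foreign free call.
(define (release-stemmer! stemmer)
  (set! released (+ released 1)))

;; Stand-in constructor.  Without the registration call, the guardian
;; never learns about the object and reap-stemmers has nothing to do.
(define (make-fake-stemmer algorithm)
  (let ((out (list 'stemmer algorithm)))
    (stemmers-guardian out)
    out))

;; Drain the guardian, releasing every stemmer that became unreachable.
(define (reap-stemmers)
  (let loop ((stemmer (stemmers-guardian)))
    (when stemmer
      (release-stemmer! stemmer)
      (loop (stemmers-guardian)))))
```

A stemmer that is still referenced is never handed back by the guardian, so calling reap-stemmers at any time is safe for stemmers in use.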

You will find attached to this mail the fixed version.

Attachment: snowball-stemmer.scm
Description: Text document

