Re: [ANN] guile-snowball-stemmer 0.1.0
From: amirouche
Subject: Re: [ANN] guile-snowball-stemmer 0.1.0
Date: Tue, 07 May 2019 20:30:31 +0200
User-agent: Roundcube Webmail/1.3.8
On 2019-05-07 15:28, address@hidden wrote:
I am pleased to announce the immediate availability of
guile-snowball-stemmer.
This is a binding library that computes the stems of words in various
languages. The list of supported languages is shown in the REPL run
below.
The official Snowball website is at https://snowballstem.org/
It is mostly useful in the context of information retrieval.
The code is at https://git.sr.ht/~amz3/guile-snowball-stemmer
The path to the libstemmer shared library is hardcoded to the Guix path
of the library.
A guix package definition of the C library is available in my guix
channel at:
https://git.sr.ht/~amz3/guix-amz3-channel
That said, there is no Guix package for the bindings. Just include the
file attached to this mail in your project.
Here is a demo:
scheme@(guile-user)> (import (snowball-stemmer))
scheme@(guile-user)> (stemmers)
$1 = ("turkish" "swedish" "spanish" "russian" "romanian" "portuguese"
"porter" "norwegian" "italian" "hungarian" "german" "french" "finnish"
"english" "dutch" "danish")
scheme@(guile-user)> (make-stemmer "amazigh")
ERROR: In procedure scm-error:
ERROR: snowball-stemmer "Oops! Stemmer not found" "amazigh"
scheme@(guile-user)> (define english (make-stemmer "english"))
scheme@(guile-user)> (stem english "cycling")
$2 = "cycl"
scheme@(guile-user)> (stem english "ecology")
$3 = "ecolog"
scheme@(guile-user)> (stem english "library")
$4 = "librari"
scheme@(guile-user)> (stem english "virtual")
$5 = "virtual"
scheme@(guile-user)> (stem english "environment")
$6 = "environ"
scheme@(guile-user)> (define french (make-stemmer "french"))
scheme@(guile-user)> (stem french "environnement")
$7 = "environ"
scheme@(guile-user)> (stem french "bibliotheque")
$8 = "bibliothequ"
scheme@(guile-user)> (stem french "gazette")
$9 = "gazet"
scheme@(guile-user)> (stem french "constituant")
$10 = "constitu"
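To illustrate the information retrieval use mentioned above, here is a minimal sketch that groups words under their stem, so that a query for one form can find the others. It is not part of the library: `group-by-stem` is a hypothetical helper, and the stemming procedure is passed in as an argument so the sketch runs without libstemmer.

```scheme
;; Hypothetical helper, not part of guile-snowball-stemmer.
;; STEM-PROC is any one-argument procedure mapping a word to its stem.
(define (group-by-stem stem-proc words)
  (let ((table (make-hash-table)))
    (for-each
     (lambda (word)
       (let ((key (stem-proc word)))
         ;; hash-ref and hash-set! compare keys with equal?, so string
         ;; stems work as keys.
         (hash-set! table key (cons word (hash-ref table key '())))))
     words)
    table))
```

With the bindings loaded, the stemming procedure could be `(lambda (word) (stem english word))`, using the `english` stemmer defined in the demo.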
Small update: I forgot to actually register the stemmer with the guardian.
Here is the patch:
diff --git a/snowball-stemmer.scm b/snowball-stemmer.scm
index b754808..603a97e 100644
--- a/snowball-stemmer.scm
+++ b/snowball-stemmer.scm
@@ -67,6 +67,7 @@
(let ((out (proc (string->pointer algorithm) NULL)))
(when(eq? out NULL)
(error 'snowball-stemmer "Oops! Stemmer not found" algorithm))
+ (stemmers-guardian out)
out))))
(define (reap-stemmers)
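For readers unfamiliar with Guile guardians, the pattern the added line relies on can be sketched roughly as follows. This is an illustration under assumptions, not the code from the attached file: plain lists stand in for the foreign stemmer pointers, and `make-guarded`, `reap-guarded` and `release!` are hypothetical names.

```scheme
;; Sketch only: plain lists stand in for the foreign stemmer pointers, and
;; RELEASE! is a hypothetical stand-in for whatever frees a stemmer.
(define stemmers-guardian (make-guardian))

(define (make-guarded name)
  (let ((out (list 'stemmer name)))
    ;; Register OUT so the guardian hands it back once it is unreachable.
    (stemmers-guardian out)
    out))

(define (reap-guarded release!)
  ;; Calling the guardian with no argument returns one collected object,
  ;; or #f when none is pending. Returns the number of objects released.
  (let loop ((count 0))
    (let ((dead (stemmers-guardian)))
      (if dead
          (begin (release! dead) (loop (+ count 1)))
          count))))
```

Without the registration step the guardian never sees the object, so the reaping procedure has nothing to release.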
You will find attached to this mail the fixed version.
Attachment: snowball-stemmer.scm
Description: Text document