aspell-user

[aspell-user] Recommendations for non-interactive use


From: Greg Ward
Subject: [aspell-user] Recommendations for non-interactive use
Date: Wed Mar 20 13:10:04 2002
User-agent: Mutt/1.3.27i

I'm trying to figure out the best tool for non-interactive,
just-show-me-the-misspelled-words-and-go-away use.  The specific
context is a web crawler that spellchecks each page, so the ability to
parse HTML would be spiffy.

ispell works, but its HTML parser appears broken, so I have to parse the
HTML myself and feed ispell the non-tag text.  This is implemented and
working, so if nobody has a better idea, it's what I'll stick with.
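
For illustration only, the parse-it-myself approach could look roughly
like the Python sketch below.  It is not the crawler's actual code: the
names TextExtractor, html_to_text and misspelled_words are made up for
this example, and it assumes an ispell binary on the PATH whose -l
option lists misspelled words read from standard input.

```python
import subprocess
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the non-tag text of an HTML string, whitespace-normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join with spaces so words in adjacent elements do not run together.
    return " ".join(" ".join(parser.chunks).split())


def misspelled_words(html, checker=("ispell", "-l")):
    """Pipe the extracted text to the checker; return unique flagged words.

    Assumption: the checker prints one misspelled word per line and exits 0.
    """
    result = subprocess.run(
        list(checker),
        input=html_to_text(html),
        capture_output=True,
        text=True,
        check=True,
    )
    return sorted(set(result.stdout.split()))
```

Joining text nodes with a space keeps "<p>foo</p><p>bar</p>" from
becoming "foobar", at the cost of splitting a word that has inline
markup in the middle of it.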

I've just tried aspell .33.7.1, and its HTML parser is definitely
better, but it's two orders of magnitude slower than ispell.  (On one
50 KB HTML file, ispell takes 0.025 sec and aspell takes 2.2 sec.)  IMHO
this is a showstopper, but I wonder if it's possible my aspell is
miscompiled or misconfigured or something.  Or is aspell just 100x
slower than ispell in general?  This is on Debian Linux 3.0 (unstable).
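
A comparison like this can be reproduced with a small timing harness.
The sketch below is one hedged way to do it, not how the numbers above
were measured: best_time is a hypothetical helper, and the command
lines passed to it are whatever the installed checkers accept.

```python
import subprocess
import time


def best_time(argv, data, repeats=5):
    """Run argv `repeats` times with `data` (bytes) on stdin.

    Returns the fastest wall-clock time in seconds.  Taking the minimum
    reduces noise from disk caching and other processes.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        best = min(best, time.perf_counter() - start)
    return best
```

For example, with the page loaded as bytes into data, one could compare
best_time(["ispell", "-l"], data) against the equivalent aspell command
line.  The exact list-mode and HTML-mode options differ between aspell
versions, so check the installed manual rather than trusting this note.
The measurement includes process startup and dictionary loading, which
matters when every page is checked by a fresh process.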

Finally, if anyone knows of another tool that simply detects and reports
misspelled words, without bothering to suggest alternatives, I'd love to
hear about it.  Did a quick freshmeat search this morning (which is what
reminded me of aspell), but didn't find anything.

Thanks --

        Greg
-- 
Greg Ward - software developer                address@hidden
MEMS Exchange                            http://www.mems-exchange.org


