
Re: A couple of lisp questions

From: Stefan Monnier
Subject: Re: A couple of lisp questions
Date: Wed, 12 Nov 2003 18:28:27 GMT
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

Stefan> Take a look at how flyspell does it.  Or maybe auto-fill.

> I will. I think auto-fill cheats though, as it's tied directly into
> the command loop. I seem to remember reading that somewhere.

Not the command loop, just the self-insert-command command (which is
implemented in C).  You can hijack the auto-fill-function for
your own non-auto-fill use.
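For instance, a minimal sketch of that hijack (the function name
`my-word-boundary-hook' is made up; `auto-fill-function' really is
called by `self-insert-command' after auto-fill characters such as
space are inserted, provided `auto-fill-mode'-style setup is in place):

```elisp
;; Sketch: repurpose `auto-fill-function' as a word-boundary hook.
;; `self-insert-command' calls it after inserting a fill character,
;; so it fires roughly once per completed word.
(defun my-word-boundary-hook ()
  "Hypothetical hook run when a word-ending character is typed."
  (when (eq (char-before) ?\s)
    ;; A word was just completed before point; record it here.
    (message "word boundary at %d" (point))))

(setq-local auto-fill-function #'my-word-boundary-hook)
```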

> usage-hash: "the"  -->  ("the" . 4)
>             "and"  -->  ("and" . 6)

Why not just

   "the" --> 4
   "and" --> 6
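That is, skip the cons cell entirely and map the string straight to its
count; a sketch (variable and function names are illustrative):

```elisp
;; Sketch of the simpler layout: word string -> count, no cons cell.
(defvar my-usage-hash (make-hash-table :test 'equal)
  "Maps a word string to the number of times it has been seen.")

(defun my-count-word (word)
  "Increment WORD's usage count, defaulting a new word to 0."
  (puthash word (1+ (gethash word my-usage-hash 0)) my-usage-hash))

(my-count-word "the")
(my-count-word "the")
(gethash "the" my-usage-hash)  ; => 2
```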

> Then a suffix hash

> suffix-hash: "t"   --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
>              "th"  --> (("the" . 4) etc )
>              "the" --> (("the" . 4) etc )

Is `try-completion' too slow (because the usage-hash is too large?) to
build the suffixes on the fly ?
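In other words, compute the prefix matches on demand rather than storing
a second hash; a sketch, assuming the plain word -> count table from
above (note that passing a hash table directly to the completion
functions requires a sufficiently recent Emacs):

```elisp
;; Sketch: derive the candidate list on the fly instead of keeping a
;; prefix hash.  Assumes `my-usage-hash' maps word -> count and that
;; this Emacs accepts a hash table as a completion collection.
(defun my-prefix-candidates (prefix)
  "Return (WORD . COUNT) pairs for words starting with PREFIX."
  (mapcar (lambda (w) (cons w (gethash w my-usage-hash)))
          (all-completions prefix my-usage-hash)))
```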

> In this case the cons cells for each word are shared between the
> hashes, so this is not as massive a memory waste as the written
> version makes it appear.

Each word of N letters has:
- one string (i.e. N + 16 bytes)
- one cons-cell (8 bytes)
- one hash-table entry (16 bytes)
in usage-hash, plus:
- N cons-cells (N*8 bytes)
- N hash entries shared with other words (at least 16 bytes).
For a total of 9*N + 56 bytes per word.  Probably not a big deal.

> Ideally I would want to build up these word usage statistics as they
> are typed, but as you say it's hard to do this. I think a flyspell-like
> approach combined with text properties should work okay.

How do you avoid counting the same instance of a word several times?  Oh,
you mark them with a text-property, I see.  More like font-lock than flyspell.
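Something along these lines, say (the property name `my-counted' and
helper `my-count-word' are hypothetical; the helper is assumed to
increment the word's count in the usage hash):

```elisp
;; Sketch: font-lock-style marking so the same word instance is
;; never counted twice.  Already-counted text carries the (made-up)
;; `my-counted' property and is skipped on the next pass.
(defun my-count-region (beg end)
  "Count each not-yet-counted word between BEG and END once."
  (save-excursion
    (goto-char beg)
    (while (re-search-forward "\\w+" end t)
      (unless (get-text-property (match-beginning 0) 'my-counted)
        (my-count-word (downcase (match-string 0)))
        (put-text-property (match-beginning 0) (match-end 0)
                           'my-counted t)))))
```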

> Anyway the idea with the weakness is that I want to garbage collect
> the dictionary periodically, throwing away old, or rarely used words.

I don't think weakness gives you that.  It seems difficult to use
weakness here to get even a vague approximation of what you want.

You can use a gc-hook to flush stuff every once in a while, but you
could just as well use an idle-timer for that.
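An idle-timer version might look like this sketch (the five-minute
interval and the threshold of 2 are arbitrary; names are made up):

```elisp
;; Sketch: prune rarely used words with an idle timer instead of
;; relying on hash-table weakness.
(defun my-prune-usage-hash ()
  "Drop words seen fewer than 2 times from `my-usage-hash'."
  (maphash (lambda (word count)
             (when (< count 2)
               (remhash word my-usage-hash)))
           my-usage-hash))

;; Run after 300 seconds of idle time, repeating each idle period.
(run-with-idle-timer 300 t #'my-prune-usage-hash)
```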

> The serialization would be to enable saving across sessions. Most of
> the packages I know that do this depend on their objects having a read
> syntax, which doesn't work with hashes. I think the solution here is
> to convert the thing into a big alist to save it, and then reconstruct
> the hashes on loading.

Why not reconstruct the suffix upon loading?  This way you have no sharing
to worry about and you can just dump the hash via maphash & pp.
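A sketch of that scheme, assuming the plain word -> count hash from
earlier (file handling kept minimal; names are illustrative):

```elisp
;; Sketch: serialize only the usage hash as an alist via `maphash'
;; and `pp'; the suffix data is rebuilt after loading, so there is
;; no shared structure to preserve in the dump.
(defun my-save-usage (file)
  "Write `my-usage-hash' to FILE as a readable alist."
  (let (alist)
    (maphash (lambda (w c) (push (cons w c) alist)) my-usage-hash)
    (with-temp-file file
      (pp alist (current-buffer)))))

(defun my-load-usage (file)
  "Rebuild `my-usage-hash' from the alist stored in FILE."
  (setq my-usage-hash (make-hash-table :test 'equal))
  (with-temp-buffer
    (insert-file-contents file)
    (dolist (pair (read (current-buffer)))
      (puthash (car pair) (cdr pair) my-usage-hash))))
```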

> Anyway the idea for all of this was to do a nifty version of
> abbreviation expansion, something like dabbrev-expand, but instead of
> searching local buffers, it would gather word stats as it goes, and
> use these to offer appropriate suggestions. I was thinking of a user
> interface a little bit like the buffer/file switching of ido.el, of
> which I have become a committed user.

Sounds neat.

> the way, building a decent UI around this will probably take 10 times
> as much code!

And even more time.

