Re: predictive.el -- predictive completion of words as you type in Emacs
From: Toby 'qubit' Cubitt
Subject: Re: predictive.el -- predictive completion of words as you type in Emacs
Date: Tue, 28 Feb 2006 16:34:55 +0100
User-agent: Mutt/1.5.11
On Sun, Feb 26, 2006 at 04:48:56PM -0000, Phillip Lord wrote:
> > The predictive mode package adds a predictive completion minor mode
> > to Emacs. The sources are too big to post here, but are available
> > from:
> >
> > http://www.dr-qubit.org/download.php?file=predictive/predictive.tar.gz
> >
> > The package's web page can be found at:
> >
> > http://www.dr-qubit.org/emacs.php
>
>
> Looks interesting, but rather like pabbrev.el which does much the same
> thing. Have you tried both?
Wish I'd found pabbrev.el when I was first looking for such a package!
I would probably have contributed code to it instead of writing my
own. But maybe it didn't exist at the time. (I was looking about two
years ago, and predictive.el has been around for almost that long. I
just wasn't confident that it worked well enough to post it to
gnu-sources before now.)
Still, given both packages do now exist, it does provide for an
interesting (at least for me) comparison of two different approaches
to the same problem. I've had a quick look at pabbrev.el, and here's
my summary of what I think the differences are. I've tried to make
this as unbiased as possible, but obviously I know predictive mode a
lot better, so let me know if I've got something wrong:
1) "Philosophically", predictive mode treats its dictionaries more like
static reference sources (although they are obviously updated as it
learns), whereas pabbrev mode treats them more as a dynamic analysis
of buffer contents. I think most of the differences between the two
packages stem from this slightly different way of thinking about the
dictionaries.
User-visible differences:
2) Predictive dictionaries are usually created from a complete list of
words in the language, e.g. it ships with word lists for English,
LaTeX, HTML etc. Pabbrev dictionaries are created dynamically from
the words used in buffers. Although it would just about be possible to
create predictive dictionaries dynamically, it is definitely not as
easy as just using pabbrev mode.
The disadvantage is that dictionaries have to be created as a separate
step before using them. The advantage is that all words in the
language are always available for completion; they don't need to
have been used in a buffer already.
3) The pabbrev dictionaries aren't persistent between Emacs
sessions. Predictive dictionaries are (though it is possible to
make non-persistent dictionaries). The advantage of persistent
dictionaries is obvious: word frequency information keeps
accumulating, so the dictionaries increasingly adapt to your
writing style.
Of course, it wouldn't be that difficult to make pabbrev
dictionaries persistent; it just hasn't been done yet.
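
As an illustration of what persistence in point 3 buys you, here is a
minimal sketch in Python (not the Emacs Lisp of either package). The
function names and the JSON file format are assumptions made up for
this example; neither package is claimed to store its dictionaries
this way.

```python
import json
import os


def load_frequencies(path):
    """Load word counts saved by an earlier session, or start empty."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def record_words(freq, words):
    """Add one count per occurrence of each word typed this session."""
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq


def save_frequencies(path, freq):
    """Write the counts back so the next session continues from them."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(freq, f)
```

Calling load_frequencies at startup and save_frequencies at exit means
the counts keep accumulating across sessions instead of starting from
zero each time.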
4) Predictive mode can automatically switch between different
dictionaries in different regions of a buffer. For example, it
automatically uses a dictionary of maths commands within a LaTeX
equation environment, or of HTML tags after "<".
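
The idea of choosing a dictionary by region can be sketched as
follows. This is a toy illustration in Python, not predictive mode's
actual mechanism: the function name, the dictionary names and the
crude region tests (an unclosed "<", an odd number of "$" signs) are
all assumptions for the example, and escaped "\$" is not handled.

```python
def active_dictionary(text, pos):
    """Guess which (hypothetical) dictionary applies at offset pos."""
    before = text[:pos]
    # An opening "<" with no ">" after it: we are typing an HTML tag.
    if before.rfind("<") > before.rfind(">"):
        return "html-tags"
    # An odd number of "$" so far: we are inside inline LaTeX maths.
    if before.count("$") % 2 == 1:
        return "latex-math"
    # Otherwise fall back to the ordinary language dictionary.
    return "english"
```

For example, active_dictionary("<p>Hello <b", 11) selects "html-tags",
while the same call at position 8 (just after "Hello") selects
"english".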
Along the same lines, I plan to interface predictive mode with the
semantic package so that it can use information from its
lexer/parser to suggest even more intelligent completions in
programming modes (making it into an enhanced version of the
"Intellisense" feature found in some IDEs).
5) There are differences in the user interface for predictive and
pabbrev mode. Some are down to nothing more than default key
bindings (e.g. punctuation characters accept completions in
predictive mode). Others are differences in the features provided:
e.g. predictive mode can display completions in a tooltip or
menu, and the most likely completions can be selected with
single-character hotkeys.
Low-level differences:
6) The data structures chosen for the dictionaries have different
trade-offs. Predictive dictionaries have O(log n) lookup for
completions, with an automatically updated O(1) cache of
results that took a long time to find. (The data structure for the
cache is in fact very similar to pabbrev's dictionary
structure). Pabbrev's dictionaries are O(1) lookup. In practice,
both are fast enough to type without completions causing delays.
I *think* that the predictive dictionaries have better space
(memory) scaling than the pabbrev dictionaries, with the trade-off
of the slower lookup (O(log n) instead of O(1)) described above. (I
need to look more carefully to work out exactly what the scalings
are.) I suspect that a reasonably complete dictionary of the most
common English words (say 40,000) would take up a lot of memory
using pabbrev's structures. Again, this reflects the difference in
philosophy described above.
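
The time/space trade-off in point 6 can be made concrete with a small
Python sketch. Both structures below are simplified stand-ins chosen
for illustration, not the real data structures of either package: a
sorted word list searched by bisection plays the role of the O(log n)
dictionary, and a table keyed by every prefix plays the role of the
O(1) one. The class name, the function name and the cache_threshold
parameter (a crude proxy for "results that took a long time to find")
are assumptions.

```python
import bisect


class SortedWordList:
    """O(log n) search for the first match, plus a cache of big results."""

    def __init__(self, words, cache_threshold=2):
        self.words = sorted(set(words))
        self.cache = {}
        self.cache_threshold = cache_threshold

    def complete(self, prefix):
        if prefix in self.cache:            # O(1) for cached prefixes
            return self.cache[prefix]
        start = bisect.bisect_left(self.words, prefix)
        result = []
        for word in self.words[start:]:     # matches are contiguous
            if not word.startswith(prefix):
                break
            result.append(word)
        if len(result) >= self.cache_threshold:
            self.cache[prefix] = result     # only cache expensive lookups
        return result


def build_prefix_table(words):
    """O(1) lookup: every prefix of every word maps to its completions."""
    table = {}
    for word in sorted(set(words)):
        for end in range(1, len(word) + 1):
            table.setdefault(word[:end], []).append(word)
    return table
```

The prefix table answers a lookup with a single hash access, but it
stores one entry per distinct prefix and repeats each word once per
prefix, which is where the extra memory goes for a large word list.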
7) Predictive doesn't keep the words sorted by frequency. It sorts
them on the fly when they're looked up. This adds even more
overhead to lookup than pabbrev's method (though it does make
inserting words faster).
It's amusing that both packages seem to have chosen the
"wrong" method, given their "philosophy". With its more static
dictionaries, why does predictive sort on the fly instead of
storing that information? With its dynamically generated
dictionaries, why does pabbrev have to keep them sorted in the data
structure? Of course, the answer is that lookup in predictive is
fast enough even when sorting on the fly, and word insertion in
pabbrev runs as an idle process so it too is already fast enough.
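
The two strategies in point 7 can be compared with a short Python
sketch. The class names are made up for this example and the code is
a simplified model rather than a transcription of either package: one
class counts words cheaply and sorts at lookup time, the other does
the ordering work on every insertion so that lookup only has to
filter.

```python
import bisect


class SortOnLookup:
    """Cheap insertion; completions are ranked when they are requested."""

    def __init__(self):
        self.freq = {}

    def add(self, word):
        self.freq[word] = self.freq.get(word, 0) + 1

    def completions(self, prefix):
        matches = [w for w in self.freq if w.startswith(prefix)]
        # Most frequent first; ties broken alphabetically.
        return sorted(matches, key=lambda w: (-self.freq[w], w))


class KeepSorted:
    """Insertion maintains the ranking; lookup just filters it."""

    def __init__(self):
        self.freq = {}
        self.ranked = []  # (-count, word) pairs, always in sorted order

    def add(self, word):
        old = self.freq.get(word, 0)
        if old:
            self.ranked.remove((-old, word))  # drop the stale entry
        self.freq[word] = old + 1
        bisect.insort(self.ranked, (-(old + 1), word))

    def completions(self, prefix):
        return [w for _, w in self.ranked if w.startswith(prefix)]
```

Both return the same ranking for the same input; they differ only in
whether the sorting cost is paid when a word is added or when
completions are requested.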
I've probably missed some things, since I haven't played with pabbrev
mode much yet. I don't know how much the packages could benefit from
each other. I dislike duplication, and would have preferred to
contribute to an existing project if I'd known it existed. But trying
to combine them into one package now doesn't look too likely.
Maybe it's nice to have two different approaches to learning and
storing word frequency information. No doubt each will have advantages
in different circumstances. Predictive mode is more "heavyweight",
and a "lighter-weight" package like pabbrev is probably better for the
majority of users.
The predictive user interface code could be useful for pabbrev
though. For example, if you call the `complete' function from
`predictive-completion.el', supplying it with a list of available
completions, it does all the work needed to provisionally insert the
completion in the buffer, to allow accepting, rejecting, cycling,
tab-completing and hotkey-selecting the provisional completion, and to
display completions in the echo area, in a tooltip, in a completion
menu or in a hierarchical completion browser. (Each feature can be
enabled or disabled via customizations.) If you wanted pabbrev mode to
provide similar features, it would make sense to reuse
`predictive-completion.el' so that development on it could benefit
both packages, and the same user customizations would apply to both
packages.
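
To show the kind of state such a shared interface layer has to track,
here is a hypothetical Python model of a provisional completion that
can be cycled, accepted, rejected or picked by hotkey. It is not the
interface of `predictive-completion.el'; the class and method names
are invented for illustration, and the display side (echo area,
tooltip, menu, browser) is left out entirely.

```python
class ProvisionalCompletion:
    """Tracks which candidate is provisionally shown for a typed prefix."""

    def __init__(self, prefix, candidates):
        self.prefix = prefix
        self.candidates = list(candidates)
        self.index = 0

    def current(self):
        """The candidate currently shown, or the bare prefix if none."""
        if not self.candidates:
            return self.prefix
        return self.candidates[self.index]

    def cycle(self):
        """Advance to the next candidate, wrapping around at the end."""
        if self.candidates:
            self.index = (self.index + 1) % len(self.candidates)
        return self.current()

    def accept(self):
        """Keep the provisional completion as the final text."""
        return self.current()

    def reject(self):
        """Discard the completion and keep only what was typed."""
        return self.prefix

    def select_hotkey(self, n):
        """Pick the n-th candidate directly (0-based)."""
        return self.candidates[n]
```

Any source of candidates, whether a predictive dictionary or a pabbrev
one, could feed the same object, which is the sense in which the
interface code is independent of how the completions were found.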
Let me know what you think, and feel free to continue privately if
this is getting too off-topic.
Toby
--
PhD Student
Quantum Information Theory group
Max Planck Institute for Quantum Optics
Garching, Germany
email: address@hidden
web: www.dr-qubit.org