bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash t


From: Andy Moreton
Subject: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables
Date: Mon, 25 Mar 2019 20:15:33 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1.92 (windows-nt)

On Mon 25 Mar 2019, Eric Abrahamsen wrote:

> Andy Moreton <andrewjmoreton@gmail.com> writes:
>> Other notes from reading the code:
>>
>> 1) In `gnus-gnus-to-quick-newsrc-format' you ignore the contents of
>>    `gnus-newsrc-alist' when saving "newsrc.eld", and replace it with the
>>    details from `gnus-newsrc-hashtb'. Why ? The rest of the gnus code
>>    appears to treat `gnus-newsrc-alist' as the single source of truth,
>>    with the hash tables being used only for faster access to it.
>
> Eventually I would like to reduce the number of data structures so that
> groups are held in `gnus-newsrc-hashtb', and ordering is kept in
> `gnus-group-list', and that's it. `gnus-newsrc-alist' would only be used
> when persisting to disk. My next proposed change (once I've recovered my
> confidence) is to turn groups into actual objects, in which case the
> alist would really just be a kind of serialization format.

ok, but if the hash table and the alist reference the same lisp objects
then the existing setup is not so bad.

> The hash table ought to be in sync with the rest of the data structure
> -- if it isn't, that's another bug.
>
>> 2) In `gnus-gnus-to-quick-newsrc-format' you dropped the code to remove
>>    the dummy group from `gnus-newsrc-alist'. Why ? This internal dummy
>>    group is now saved in "newsrc.eld", which is not needed.
>
> This was an error. (Though in my case, I've had the dummy group in my
> newsrc.eld for months, and it hasn't done any harm. I don't know why
> it's necessary.)

I agree that it's harmless, it just seemed to be an unnecessary change.
I expect it is there to ensure that the alist is never empty to avoid
the need to check that everywhere.

>> 3) The format of the entries in `gnus-newsrc-hashtb' has changed,
>>    removing the second element. Why ?
>
> Because the old `gnus-gethash' call returned a slice of
> `gnus-newsrc-alist', where the second element was actually the group
> *before* the group you wanted, and the third element was the cdr of
> `gnus-newsrc-alist', starting with the group you wanted. This was
> undocumented, and took a bit to figure out. Now, the gethash call just
> gives you the group. Ideally, in the next set of changes, it will give
> you an object.

ok, so it sounds like the old code was trying hard to use the same lisp
objects in both the hash table and the alist. That avoids alloacting
twice the storage, and ensures that the hash table and the alist stay in
sync.

>> 4) You changed several hash tale sizesfrom 4096 to 4000, and 1024 to
>>    1000. Why ?
>
> My understanding is that using a prime number is significant when it
> comes to vector access, but that the hash table implementation is
> higher-level, where a prime number is no longer significant. If that's
> incorrect I would like to know!

None of these numbers are prime. The old numbers are powers of two,
which are reasonable for allocation sizes on a binary machine. Again,
not a terribly important change, but not really needed either.

>> Your patch contains several logical changes that would be easier to
>> understand (and bisect) as a series of patches with one logical change
>> in each patch:
>>  - code layout changes
>>  - add missing doc strings and code comments
>>  - change hash table implementation
>>  - change format of `gnus-newsrc-hashtb' entries
>>  - change usage of `gnus-group-change-level'
>>  - change coding of group names
>> While it can take extra work to split things up, the end result is much
>> easier to understand.
>
> In principle I agree with this completely. In practice I found it
> extraordinarily difficult to touch one part of Gnus without running into
> knock-on repercussions.

Agreed. I find that this sort of work usually goes in two phases: some
exploratory programming to decide on the path forward and the eventual
goal, followed by rearranging the changes into a logical set of
(bisectable) patches that get to that goal in small, self-contained
steps. The second half is extra work, but worth it to make it easier for
other people to review the patches (and to simplify any archaeology
needed by a later maintainer).

If changing gnus is hard, then perhaps writing new tests to document
what you have discovered about the code will help to ensure that later
changes do not change the observed behaviour.

> The ultimate goal of the changes I have in mind for Gnus is to address
> exactly this: to make it more modular, to improve isolation of code
> paths, and to reduce the number of semi-redundant data structures. But
> the process is evidently even messier than I thought. I held back
> another commit to group name encoding in an attempt to keep things
> simple, but that seems to have made things even worse.

A worthwhile goal - please do not get discouraged.

    AndyM






reply via email to

[Prev in Thread] Current Thread [Next in Thread]