bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: HAMT iterators


From: Marc Nieper-Wißkirchen
Subject: Re: HAMT iterators
Date: Sun, 11 Oct 2020 14:44:17 +0200

Am So., 11. Okt. 2020 um 12:29 Uhr schrieb Bruno Haible <bruno@clisp.org>:

> The recursion across hamt_do_while / entry_do_while / subtrie_do_while /
> bucket_do_while builds up a stack whose essential contents is a set of
> indices, each at most 5 bits large, with a total of SIZE_WIDTH bits.
> In other words, the iterator object can be a

Yes, this is roughly how the implementation I worked on this morning
works. However, I haven't yet implemented the 5-bit compression.

> struct
>   {
>     const Hamt *hamt;
>     size_t position;
>     int depth;
>     const Hamt_entry *entry;
>   };
>
> where during the iteration
>   - hamt stays unmodified,
>   - position is incremented by some amount at each round,
>   - depth is the recursion depth.
>   - entry is a cache of hamt->nodes[(position >> 59) & 31][(position >> 54) & 
> 31]...
>     with depth-1 indexing operations, and is changed at each round,
>
> HASH-TABLE-ITERATOR sets position, depth, entry to point to the first entry.
> HASH-TABLE-ITERATE works as follows:
>   - in a bucket, it increments position and fetches the next entry of the 
> bucket,

For a bucket, in the worst case, we need the full size_t range for
position, so we have to store the path in the tree that leads to the
bucket in another size_t variable.

>   - if the bucket is done, it decreases depth and goes to
>     hamt->nodes[(position >> 59) & 31][(position >> 54) & 31]... (with depth-1
>     indexing operations) until it finds a subtrie that is not yet done,

This means a lot of pointer operations if large subtries are processed
that are deep in the trie. The hamt_do_while operation stores that
chain of entries from the root implicitly on the stack. The same
should be done for the iterator, meaning that an array of MAX_DEPTH
entries has to be cached. On a 32 bit system, this means 28 bytes,
and, on a 64 bit system, 130 bytes.

>     then it increases depth, entry to point to the first entry in that 
> subtrie.
>
> If done this way, the iterator will fit into the gl_set_iterator_t and
> gl_map_iterator_t that we already have in gl_set.h and gl_map.h.
>
> Bruno



reply via email to

[Prev in Thread] Current Thread [Next in Thread]