[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)
From: |
Marc Nieper-Wißkirchen |
Subject: |
Re: [New module] Persistent Hash Array Mapped Tries (HAMTs) |
Date: |
Sun, 11 Oct 2020 10:20:37 +0200 |
Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible <bruno@clisp.org>:
>
> Hi Marc,
>
> Some comments:
>
> * GL_HAMT_THREAD_SAFE can be defined to 1 not only if
> __STDC_VERSION__ >= 201112 && ! defined __STD_NO_ATOMICS__
> but also if
> __GNUC__ + (__GNUC_MINOR >= 9) > 4
Fixed.
> (see the other mail).
>
> * Still '#ifdef GL_HAMT_THREAD_SAFE'.
Fixed.
> * Typo s/comparision/comparison/
Fixed.
> * Hamt_processor: The comment does not say what the return value means. Maybe
> it
> would be clearer if this type was moved down to the "Iteration" section.
Moved.
> * hamt_lookup: If the caller is allowed to modify the payload stored in the
> returned entry, this function should return a 'Hamt_entry *', not a
> 'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
> not a 'const char *'.
Unless the caller knows what it does, modifying the payload is not a
good idea because the entry is shared between different hamts. If the
caller really wants to modify the payload, it has to do an explicit
type cast (which is safe).
> * In trienode_count, you don't need count_one_bits_l, only count_one_bits.
> It's OK to assume that 'uint32_t' and 'unsigned int' are the same type.
Okay.
> * If subtrie_lookup was inlined in entry_lookup, you could get rid of the
> forward declaration of entry_lookup, and compilers would be more likely
> to turn the recursion of entry_lookup into a loop (tail-recursion
> elimination).
I would prefer to have smaller functions. Are there any relevant
compilers who do not inline the function at higher recursion levels?
> * If subtrie_do_while was inlined in entry_do_while, you could get rid of the
> forward declaration of entry_do_while.
> * It would be useful to be able to construct modules 'hamt-set' and
> 'hamt-map',
> analogous to 'hash-set' and 'hash-map', but the hamt API currently has two
> issues that block it:
>
> 1) The iterator API is such that you cannot write the iteration using a
> loop;
> it involves a function call that is active during the entire iteration.
> This is also problematic for two other reasons:
>
> - Creating a closure in C (unlike Scheme or Lisp) is tedious hand-work.
> (The GNU C extension that allows local functions inside functions [1]
> is not a solution, because
> - it is unportable,
> - it forces an executable stack on many platforms, which is
> considered bad for security.)
>
> - In Common Lisp, iteration through a hash table is available not only
> through a function-call interface (MAPHASH [2]) but also through an
> iteration-loop interface (WITH-HASH-TABLE-ITERATOR [3], LOOP [4]).
>
> For example, in GNU clisp, LOOP expands to
>
> [1]> (macroexpand-1 '(loop for x being each hash-key of h do (foo
> x)))
> (MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR)))
> (BLOCK NIL
> (LET ((#:WHTI-3214 (SYSTEM::HASH-TABLE-ITERATOR H)))
> (MULTIPLE-VALUE-BIND (#:MORE?3215 #:HASH-KEY-3216
> #:HASH-VALUE-3217)
> (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214)
> (DECLARE (IGNORE #:HASH-VALUE-3217))
> (LET ((X NIL))
> (LET NIL
> (MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP)))
> (TAGBODY SYSTEM::BEGIN-LOOP (UNLESS #:MORE?3215 (LOOP-FINISH))
> (SETQ X #:HASH-KEY-3216) (PROGN (PROGN (FOO X)))
> (MULTIPLE-VALUE-SETQ
> (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
> (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214))
> (GO SYSTEM::BEGIN-LOOP) SYSTEM::END-LOOP
> (MACROLET
> ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-WARN)
> '(GO SYSTEM::END-LOOP)))))))))))) ;
>
> You can see that there is an initial function call
> HASH-TABLE-ITERATOR
> upfront, and then a function call to HASH-TABLE-ITERATE in each round
> of the loop.
>
> This principle makes it possible to iterate e.g. through a hash table
> and a list in parallel.
In Scheme, it is even more relevant because you have call/cc so any
iteration can be suspended and restored any number of times.
I thought this could be solved by storing a "next" pointer in the
payload but this doesn't work with hamts sharing entries... So I will
add some iterator interface as well. Preferably one that can also be
used to implement Hamts in Scheme.
> 2) We have a dilemma regarding use of malloc vs. xmalloc. While modules
> meant for use in programs can readily use xmalloc, modules meant for use
> (also) in libraries cannot use xmalloc, since a library is not supposed
> to terminate the program, even when memory is tight.
GMP more or less has this behavior.
> The solution we found for this dilemma is that the *-set and *-map
> modules
> use just malloc, and we have thin wrapper modules (xset, xmap) that do
> call xalloc_die() in case of out-of-memory.
>
> Unfortunately, the API of many of the functions need to be adjusted to
> cope with the possibility of an ENOMEM failure. That is tedious work, and
> the code does not look so pretty afterwards... But I see no other way to
> make it fit for use in libraries.
The bigger problem is that it mustn't make the code slower for the
99,99999% of the use cases where there is either enough memory or
calling xalloc_die is the only reasonable action.
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), (continued)
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Bruno Haible, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Paul Eggert, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Bruno Haible, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs),
Marc Nieper-Wißkirchen <=
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterators, Bruno Haible, 2020/10/11
- Re: HAMT iterators, Marc Nieper-Wißkirchen, 2020/10/11