Re: [Tinycc-devel] struct bug: identical named struct members
From: Michael Matz
Subject: Re: [Tinycc-devel] struct bug: identical named struct members
Date: Sun, 29 Nov 2020 00:11:17 +0100 (CET)
User-agent: Alpine 2.21 (LSU 202 2017-01-01)
Hello,
On Sat, 28 Nov 2020, Tyge Løvset wrote:
> Yes I started looking in struct_decl. Is there a reason why it doesn't use
> an efficient unordered map for lookup, other than the extra code weight?
Mostly because of the T in tiny c compiler and because no profiling shows
field lookup to be a problem :)
> If that (and the cstr string type) is stripped down to its bare minimum, it
> would be perfect for general symbol tables.
It's not general symbol tables in this case, but fairly specific: the set
must be ordered (for struct layout), at least at some point; the set is
looked up by small integers (aka identifier name); the set tends to be
small.
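To make those three properties concrete, here is a minimal sketch of such a set: an ordered, singly linked list keyed by an integer token id. The names `Field`, `field_append`, `field_find` and `field_free` are hypothetical and chosen for illustration only; this is not TCC's actual `Sym` machinery.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical field record, for illustration only (not TCC's Sym). */
typedef struct Field {
    int tok;              /* small integer standing in for the member name */
    int offset;           /* byte offset assigned at layout time */
    struct Field *next;   /* list order is declaration order */
} Field;

/* Append at the tail so that list order equals declaration order.
   Returns the new node, or NULL if allocation fails. */
Field *field_append(Field **head, int tok, int offset)
{
    Field *f;

    assert(head != NULL);
    f = malloc(sizeof *f);
    if (!f)
        return NULL;
    f->tok = tok;
    f->offset = offset;
    f->next = NULL;
    while (*head)               /* walk to the end of the list */
        head = &(*head)->next;
    *head = f;
    return f;
}

/* Linear lookup by token id; returns NULL if there is no such member. */
Field *field_find(Field *head, int tok)
{
    for (; head; head = head->next)
        if (head->tok == tok)
            return head;
    return NULL;
}

/* Release every node of the list. */
void field_free(Field *head)
{
    while (head) {
        Field *next = head->next;
        free(head);
        head = next;
    }
}
```

With only a handful of members per struct, each lookup touches a few nodes at most, and the list keeps the declaration order needed for layout without any extra bookkeeping.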
> The only other fast C map I know
> of is khash (https://attractivechaos.github.io/klib), however not memory
> efficient, and the codebase is somewhat bigger.
I guess there are as many map implementations as there are C developers :-)
> But, looking at tccgen.c, it may be too ambitious to integrate?
I personally would not integrate a full generally capable hashmap without
measurements on realistic sources (i.e. not sources that artificially use
structs with 1000 members and 10.000 accesses to the last member :) ).
It's simply that the number of struct members in C sources tends to
be a dozen max, on average, where a linked list is fairly okay. (This
implies that the quadratic checking of duplicates at struct decl time
might be completely acceptable; eventually it will be overshadowed by
normal member lookups.)
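As a rough illustration of that cost, the sketch below checks for a duplicate with a linear scan before each member is added and counts the comparisons. All names (`MemberList`, `member_exists`, `member_declare`, `MAX_MEMBERS`) are hypothetical and the fixed-size array is an assumption made for brevity; TCC's real struct_decl and find_field work on Syms and look different.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMBERS 64   /* arbitrary limit for this sketch */

/* Hypothetical member table; token ids stand in for member names. */
typedef struct {
    int toks[MAX_MEMBERS];
    size_t count;
    unsigned long comparisons;   /* instrumentation only */
} MemberList;

/* Linear scan: returns 1 if tok is already declared, 0 otherwise. */
int member_exists(MemberList *m, int tok)
{
    size_t i;

    for (i = 0; i < m->count; i++) {
        m->comparisons++;
        if (m->toks[i] == tok)
            return 1;
    }
    return 0;
}

/* Declare a member after checking for a duplicate.
   Returns 0 on success, -1 if the name is a duplicate or the table is full. */
int member_declare(MemberList *m, int tok)
{
    assert(m != NULL);
    if (m->count == MAX_MEMBERS || member_exists(m, tok))
        return -1;
    m->toks[m->count++] = tok;
    return 0;
}
```

Declaring n distinct members this way costs 0 + 1 + ... + (n - 1) = n(n - 1)/2 comparisons in total, which is 66 for a dozen members.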
But do try, if the implementation turns out to not add memory overhead and
many source lines, and is generally in the spirit of TCC, why not :)
(what's the spirit? I don't know, you'll eventually get a feeling for
it.)
(you will probably see in the course of such an experiment that various
things aren't that straightforward to add, e.g. currently all parser
structures are Syms, and they are generically freed per scope no matter if
they are types, symbols, cleanups, or anything else; you would have to
free the hash tables somewhere, which would exist only for struct types,
which would mean at least different handling for these and the other Syms,
i.e. you'll probably see that it reduces elegance somewhat)
> ps: I haven't really looked much at the core code yet;
Keep reading then, it's a quirky, dense, capable and satisfying source
base :)
Ciao,
Michael.
> I do have some compiler tech experience way back from creating an external
> syntax checker for www.autoitscript.com, using flex and yacc.
> (http://www.google.com/search?q=au3check)
> Cheers,
> Tyge
On Sat, 28 Nov 2020 at 00:32, Michael Matz <matz.tcc@frakked.de> wrote:
Hello,
On Fri, 27 Nov 2020, Tyge Løvset wrote:
> Is this a known bug, or regression?
Known bug.
> I tried to follow the code in parse_btype() in tccgen.c for the missing
> struct member symbol lookup check, but didn't succeed so far:
>
> } else {
>     c = 0;
>     flexible = 0;
>     while (tok != '}') {
>         if (!parse_btype(&btype, &ad1)) {
>             skip(';');
>             continue;
>         }
Member lookup is linear, so checking for duplicates is quadratic, so TCC
doesn't bother to do it. The check would belong to struct_decl, not
parse_btype, probably involving find_field before adding it.
Ciao,
Michael.