bug#41615: [feature/native-comp] Dump prettier C code.


From: Andrea Corallo
Subject: bug#41615: [feature/native-comp] Dump prettier C code.
Date: Sun, 31 May 2020 16:57:40 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Nicolas Bértolo <nicolasbertolo@gmail.com> writes:

>> I like this considerably less :)
>
> Ok, let's say goodbye to this patch.
>
>> It introduces quite some complexity and the same advantage in
>> debuggability can be achieved with something like the attached 8 line
>> patch (untested).
>
> Sounds good, I haven't tested it either.
>
>> Generally speaking I want to try to keep our back-end as simple as we
>> manage to.
>
> I initially wrote this patch chasing the reason for slow compile times. I think
> that a 10k line C file should be compiled much faster than what gccjit achieves.
> I thought that "uncommon" (for C) ways of doing thing were causing gccjit to get
> stuck trying to optimize them hard, until it gave up. I thought that filling the
> static data using memcpy() and constant strings would help GCC recognize this as
> a constant initialization and hopefully just store a completely initialized copy
> in memory.
>
> I found that GCC would inline memcpy() and the static initialization would turn
> into a very long unrolled loop with SSE instructions. I tested this with -O3
> only in gccjit to force maximum optimization. I found this super strange
> considering that -ftree-loop-distribute-patterns is enabled at -O3 and it should
> recognize the naive_memcpy() function as an implementation of memcpy() and issue
> calls to libc's implementation. Instead, it was inlining and unrolling it.

OK, so you confirm the suspicions I wrote about in the other mail!

I've used your patch as a base; apart from minor changes here and there,
I've stripped out the definitions of bzero and memcpy.

I believe bzero is unnecessary given that these are statically allocated.
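
To illustrate the point about bzero: C guarantees that an object with static
storage duration and no explicit initializer starts out zeroed, so clearing it
again is redundant.  A minimal sketch, not taken from the attached patch; the
names `static_area` and `static_area_is_zeroed` and the size 64 are
hypothetical:

```c
#include <stddef.h>

/* Hypothetical stand-in for a statically allocated data area in the
   generated code; name and size are illustrative only.  */
static unsigned char static_area[64];

/* Return 1 if every byte of static_area is zero, 0 otherwise.
   No bzero()/memset() call has been made before this check.  */
int
static_area_is_zeroed (void)
{
  for (size_t i = 0; i < sizeof static_area; i++)
    if (static_area[i] != 0)
      return 0;
  return 1;
}
```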

For memcpy we can just use the standard library implementation, given
that elns are linked against it.  The other advantage is that, doing it
this way (here at least), memcpy is not inlined even at speed 3, so we
don't fall into the optimizer issue!
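
The general shape of the approach (static data emitted as a constant string
and copied into place with the libc memcpy) can be sketched as follows.  This
is only an illustration under assumptions, not the code the patch generates:
`payload`, `data_area`, `init_data_area`, `get_data_area` and `data_area_len`
are hypothetical names, and the payload bytes are made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical payload emitted as a string literal.  The embedded NUL
   byte is harmless because the length is passed explicitly.  */
static const char payload[] = "\001\002\000\003" "hello";

/* Length of the payload without the terminating NUL of the literal.  */
#define PAYLOAD_LEN (sizeof payload - 1)

/* Hypothetical statically allocated destination.  */
static char data_area[PAYLOAD_LEN];

/* Fill the static area with one call to the standard memcpy().  */
void
init_data_area (void)
{
  memcpy (data_area, payload, PAYLOAD_LEN);
}

/* Accessors so that a caller can inspect the result.  */
const char *
get_data_area (void)
{
  return data_area;
}

size_t
data_area_len (void)
{
  return sizeof data_area;
}
```

Whether the compiler inlines such a memcpy call depends on its version and
flags; the sketch only shows the source-level pattern.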

All in all, it is even a little faster than the stock patch and closer to
the one with the specific GCC blob support.

Let me know if you like the attached patch and if it does the job for you too.

Thanks

  Andrea

-- 
akrl@sdf.org

Attachment: 0001-Cut-down-compile-time-emitting-static-data-as-string.patch
Description: Text Data