Re: [lmi] Benchmarking: gcc-8 beats gcc-10 soundly?


From: Greg Chicares
Subject: Re: [lmi] Benchmarking: gcc-8 beats gcc-10 soundly?
Date: Sat, 19 Sep 2020 23:41:53 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 2020-09-19 20:37, Greg Chicares wrote:
> On 2020-09-19 15:48, Vadim Zeitlin wrote:
>> On Sat, 19 Sep 2020 15:15:48 +0000 Greg Chicares <gchicares@sbcglobal.net> 
>> wrote:
>> 
>> GC> It looks like gcc-10 gives us slower lmi binaries. Picking
>> GC> the third '--selftest' scenario as an index of performance
>> GC> (results in microseconds--less is better):
>> GC> 
>> GC>      gcc-10   gcc-8  ratio
>> GC>      ------   -----  -----
>> GC>      102659   84947   1.21  32-bit
>> GC>       50121   37410   1.34  64-bit
>> GC> The fourth scenario is even worse:
>> GC> 
>> GC>       33250   20654   1.61  32-bit
>> GC>       24616   13009   1.89  64-bit
> 
> With -O3, the 64-bit build performs thus on those two scenarios:
>   naic, ee prem solve : 5.001e-02 s mean;      49710 us least of  20 runs
>   finra, no solve     : 2.483e-02 s mean;      24580 us least of  41 runs
> Thus, the -O3 to -O2 speed ratio is
>   49710 / 50121 = .992
>   24580 / 24616 = .999
> which isn't worth the extra build time (82.89 vs 72.76 seconds).
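
To make the tradeoff quoted above concrete, here is a small sketch that recomputes the two run-time ratios and sets them beside the build-time ratio. It assumes that 82.89 s is the -O3 build and 72.76 s the -O2 build, which is how I read the sentence above; the variable names are illustrative only.

```python
# Run times in microseconds ("least of N runs"), 64-bit gcc-10 build,
# copied from the figures quoted above.
o2_us = {"naic, ee prem solve": 50121, "finra, no solve": 24616}
o3_us = {"naic, ee prem solve": 49710, "finra, no solve": 24580}

# Build times in seconds. Assumption: the larger figure is the -O3 build.
build_o3_s = 82.89
build_o2_s = 72.76

# A ratio below 1 means -O3 ran faster than -O2.
speed_ratios = {name: o3_us[name] / o2_us[name] for name in o2_us}
# A ratio above 1 means -O3 took longer to build than -O2.
build_ratio = build_o3_s / build_o2_s

for name, ratio in speed_ratios.items():
    print(f"{name:20s}: -O3/-O2 run time   = {ratio:.3f}")
print(f"{'build':20s}: -O3/-O2 build time = {build_ratio:.3f}")
```

Under that reading, the build takes roughly 14% longer while the run-time gain is under 1% in both scenarios.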


'-O3 -march=native' seems actually worse than '-O3', at least
for 32-bit binaries:

  -march     above
  103175 vs 102659 [worse]
   33517 vs  33250 [worse]

Of course that seems counterintuitive: detecting my CPU and
generating optimized code for it has to be better--but only
if the promised improvements are for real. There's something
really wrong here.

Here's how I got the '-O3 -march=native' numbers:

/opt/lmi/src/lmi[0]$git checkout -- workhorse.make
/opt/lmi/src/lmi[0]$sed -i workhorse.make -e's/O2/O3 -march=native/'
/opt/lmi/src/lmi[0]$grep 'O[1-3]' workhorse.make
  optimization_flag := -O3 -march=native -fno-omit-frame-pointer

/opt/lmi/src/lmi[0]$env |grep LMI_
LMI_COMPILER=gcc
LMI_TRIPLET=i686-w64-mingw32

/opt/lmi/src/lmi[0]$make clean
rm --force --recursive /opt/lmi/gcc_i686-w64-mingw32/build/ship
/opt/lmi/src/lmi[0]$time make $coefficiency --output-sync=recurse install check_physical_closure 2>&1 | tee eraseme | less -SN
make $coefficiency --output-sync=recurse install check_physical_closure 2>&1  1814.65s user 80.32s system 41% cpu 1:16:32.36 total
tee eraseme  0.01s user 0.00s system 0% cpu 1:16:33.49 total
less -SN  0.07s user 0.00s system 0% cpu 1:16:44.26 total

/opt/lmi/src/lmi[0]$wine /opt/lmi/bin/lmi_cli_shared.exe --accept --data_path=/opt/lmi/data --selftest
Test speed:
  naic, no solve      : 6.830e-02 s mean;      66448 us least of  15 runs
  naic, specamt solve : 1.127e-01 s mean;     111507 us least of   9 runs
  naic, ee prem solve : 1.041e-01 s mean;     103175 us least of  10 runs
  finra, no solve     : 3.472e-02 s mean;      33517 us least of  29 runs
  finra, specamt solve: 7.767e-02 s mean;      73799 us least of  13 runs
  finra, ee prem solve: 7.001e-02 s mean;      69129 us least of  15 runs
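
For anyone who wants to repeat this comparison without copying numbers by hand, here is a rough sketch that pulls the "least of N runs" figures out of '--selftest' output and compares two of them with the earlier gcc-10 32-bit figures. The regular expression is only my guess at the output format, based on the six lines above; the function and variable names are hypothetical and not part of lmi.

```python
import re

# The '--selftest' output shown above, pasted verbatim.
selftest_output = """\
Test speed:
  naic, no solve      : 6.830e-02 s mean;      66448 us least of  15 runs
  naic, specamt solve : 1.127e-01 s mean;     111507 us least of   9 runs
  naic, ee prem solve : 1.041e-01 s mean;     103175 us least of  10 runs
  finra, no solve     : 3.472e-02 s mean;      33517 us least of  29 runs
  finra, specamt solve: 7.767e-02 s mean;      73799 us least of  13 runs
  finra, ee prem solve: 7.001e-02 s mean;      69129 us least of  15 runs
"""

# Assumed line format: "<name> : <mean> s mean; <least> us least of <n> runs".
TIMING_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<mean>[0-9.eE+-]+) s mean;\s*"
    r"(?P<least>\d+) us least of\s+(?P<runs>\d+) runs\s*$"
)

def parse_selftest(text):
    """Map each scenario name to its least observed time in microseconds."""
    results = {}
    for line in text.splitlines():
        match = TIMING_LINE.match(line)
        if match:  # header lines such as "Test speed:" do not match
            results[match["name"]] = int(match["least"])
    return results

timings = parse_selftest(selftest_output)

# Earlier gcc-10 32-bit figures for the third and fourth scenarios.
baseline_us = {"naic, ee prem solve": 102659, "finra, no solve": 33250}

# A ratio above 1 means the '-march=native' build was slower.
ratios = {name: timings[name] / base for name, base in baseline_us.items()}
for name, ratio in ratios.items():
    print(f"{name:20s}: {timings[name]:7d} vs {baseline_us[name]:7d}  ratio {ratio:.3f}")
```

Both ratios come out slightly above 1, which matches the "[worse]" annotations in the small table earlier in this message.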

>>  I've already seen performance regressions in newer g++ versions, but I
>> don't think I've seen anything nearly like 89% slowdown, so it's indeed
>> very astonishing.
> 
> I had the thought that perhaps this is a MinGW-w64 snafu, which
> would explain why they haven't officially released anything
> beyond 8.x yet. Yet the bugzilla report doesn't seem to specify
> a platform, while the phoronix link in that report specifies:
> | Ubuntu 20.04 with the Linux 5.8 kernel
> 
> I guess I'd better try the flags phoronix tested:
> | "-O3 -march=native", and "-O3 -march=native -flto"
> Right now, lmi looks like the "SciMark" benchmark here:
>   
> https://www.phoronix.com/scan.php?page=article&item=gcc-10900k-compiler&num=2
> so maybe this will resolve the anomaly.

Nope.

With LTO, the 'product_files' binary fails, but that's not too
surprising given its historical problems documented in
'workhorse.make'. Here's the first and last of several errors
for the record (but ignore them and skip to the next section):

i686-w64-mingw32-g++ -o product_files.exe alert_cli.o generate_product_files.o 
main_common.o main_common_non_wx.o my_db.o my_fund.o my_prod.o my_proem.o 
my_rnd.o my_tier.o liblmi.dll -L . -L /opt/lmi/local/gcc_i686-w64-mingw32/lib 
-L /opt/lmi/local/gcc_i686-w64-mingw32/bin    -lexslt -lxslt -lxml2      
-Wl,-Map,product_files.exe.map 
/usr/bin/i686-w64-mingw32-ld: 
/tmp/product_files.exe.f0GLip.ltrans12.ltrans.o:<artificial>:(.text+0x17dc): 
undefined reference to `std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
>::pair(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >&&) [clone .lto_priv.0]'

/usr/bin/i686-w64-mingw32-ld: 
/tmp/product_files.exe.f0GLip.ltrans12.ltrans.o:<artificial>:(.text+0x2b64): 
undefined reference to 
`std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, 
std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > >, 
std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
> > > 
>::vector(std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char 
const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > >, 
std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char 
const*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > > > > const&) [clone .lto_priv.0]'
collect2: error: ld returned 1 exit status

[Next section] However, 'skeleton.dll' also fails to build with LTO,
and that's where lmi's calculations reside so it's crucial:

i686-w64-mingw32-g++ -o skeleton.dll -shared about_dialog.o alert_wx.o 
census_document.o census_view.o database_document.o database_view.o 
database_view_editor.o default_view.o docmanager_ex.o file_command_wx.o 
gpt_document.o gpt_view.o group_quote_pdf_gen_wx.o icon_monger.o 
illustration_document.o illustration_view.o input_sequence_entry.o 
main_common.o mec_document.o mec_view.o msw_workarounds.o multidimgrid_any.o 
multidimgrid_tools.o mvc_controller.o mvc_view.o pdf_command_wx.o 
pdf_writer_wx.o policy_document.o policy_view.o preferences_view.o 
previewframe_ex.o product_editor.o progress_meter_wx.o rounding_document.o 
rounding_view.o rounding_view_editor.o 
single_choice_popup_menu.o skeleton.o system_command_wx.o text_doc.o 
text_view.o tier_document.o tier_view.o tier_view_editor.o transferor.o 
view_ex.o wx_checks.o wx_table_generator.o wx_utility.o liblmi.dll wx_new.dll 
-L . -L /opt/lmi/local/gcc_i686-w64-mingw32/lib -L 
/opt/lmi/local/gcc_i686-w64-mingw32/bin -lwxcode_mswu_pdfdoc-3.1  -L 
/opt/lmi/local/gcc_i686-w64-mingw32/lib -L 
/opt/lmi/local/gcc_i686-w64-mingw32/lib  -lwx_mswu-3.1-i686-w64-mingw32 
-mwindows    -lexslt -lxslt -lxml2      -Wl,-Map,skeleton.dll.map 
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 multiple definition of `wxTextCtrlBase::SetValue(wxString const&)'; 
census_view.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 first defined here
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 multiple definition of `non-virtual thunk to wxTextCtrlBase::SetValue(wxString 
const&)'; census_view.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 first defined here
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 multiple definition of `non-virtual thunk to wxTextCtrlBase::SetValue(wxString 
const&)'; census_view.o (symbol from 
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
 first defined here
collect2: error: ld returned 1 exit status
make[1]: *** [/opt/lmi/src/lmi/workhorse.make:931: skeleton.dll] Error 1

It looks like gcc's LTO is brittle.

