Re: [gnugo-devel] paul_3_13.5.gz

From: Dave Denholm
Subject: Re: [gnugo-devel] paul_3_13.5.gz
Date: 06 Dec 2002 10:25:18 +0000

Arend Bayer <address@hidden> writes:

> Paul wrote:
> > i wrote:
> > > > the patch cuts off around 600k (!) from the executable and must give a
> > > > measureable speedup as well (haven't checked).
> > >
> > > today i checked the speed and got an unexpected result: the 1D
> > > version is a tiny little bit slower than the 2D one - the
> > > difference is 1 - 2 seconds on nngs.tst (which runs for a bit
> > > more than 20 minutes on my computer). i did the check three
> > > times and the results are very consistent.
> > >
> > > either something went wrong or the rather large transformation[][]
> > > array (almost 43k) is not doing very fast. i'm going to look
> > > through the whole patch in search for a mistake. i'll post the
> > > results soon.
> Well, it sounds very plausible that a lookup in a 43k array will be
> significantly slower than doing a matrix multiplication by hand.
> Memory latency is really slow, compared to fast processors today.
> (I even wonder whether the DFA matching is still faster, at all, on
> 2GHz+ processors.)

Not only that, but using a lookup table to get one word means that
a whole cache line must be loaded, and another one discarded to
make room for it.

Something I've been meaning to follow up on is that I believe
that the latest Pentiums have a way of reading a word without
dragging the cache line in. So if a table is so large that we
don't believe it is useful to keep it in the cache, it may be
possible to read straight from secondary cache without affecting
primary cache. DFA seems to be a good candidate for this sort of
thing.

Another trick would be to request the data long before it is
needed, so that the CPU can get on with useful work while the
memory is being read. (Recent Pentiums are superscalar.)

Unfortunately, it is probably not possible to play this sort of
trick from C.
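
That said, gcc offers a non-standard extension,
__builtin_prefetch(addr, rw, locality), which might cover both
ideas: it asks for the data ahead of time, and a locality of 0
hints that the line need not stay in the cache afterwards. Below
is a minimal sketch, not GNU Go code. The function sum_lookups,
the idx[] array and the AHEAD distance are all made up for
illustration, and the prefetch is only a hint that the compiler
or CPU may ignore, so any gain would have to be measured.

```c
#include <stddef.h>

/* How many lookups ahead to prefetch. Hypothetical value: it would
 * have to be tuned by measurement on the target machine. */
#define AHEAD 8

/* Sum table[idx[i]] for i in [0, n). While working on entry i, ask
 * for the entry that will be needed AHEAD iterations later. */
static long sum_lookups(const int *table, const unsigned *idx, size_t n)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* gcc extension, not standard C.
         * rw = 0: we only intend to read.
         * locality = 0: no need to keep the line cached afterwards. */
        if (i + AHEAD < n)
            __builtin_prefetch(&table[idx[i + AHEAD]], 0, 0);
#endif
        sum += table[idx[i]];
    }
    return sum;
}
```

The result is the same with or without the prefetch. Only the
timing can change, and with other compilers the loop falls back
to plain lookups.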

