[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gnugo-devel] paul_3_13.5.gz

From: Paul Pogonyshev
Subject: Re: [gnugo-devel] paul_3_13.5.gz
Date: Fri, 6 Dec 2002 13:24:19 +0200

Arend wrote:
> Paul wrote:
> [...]
> > > either something went wrong or the rather large transformation[][]
> > > array (almost 43k) is not doing very fast. i'm going to look
> > > through the whole patch in search for a mistake. i'll post the
> > > results soon.
> Well, it sounds very plausible that a lookup in a 43k array will be
> significantly slower than doing a matrix multiplication by hand.
> Memory latency is really slow, compared to fast processors today.
> (I even wonder whether the DFA matching is still faster, at all, on
> 2GHz+ processors.)
> [...]
> > - looks like `p' array is really not needed with the patch. not
> >   that copying board to p takes any noticeable time, but accessing
> >   board[] instead of p[] can also improve caching.

we don't only do the dfa matching, but standard matching as well.
yesterday i checked that removing the p[] array only (without any
other possible optimization) speeds pattern matching up by 5 - 7
seconds on nngs.tst. so, it already becomes faster than the current
2D implementation.

i've come up with another means of speeding up. first of all, we
can diminish the transformation[] array significantly. since
we always try to place pattern anchor somewhere near the center
of a pattern, it is unlikely that offset will be something
larger than (19/2, 19/2) or (-19/2, -19/2). that will decrease
array size from 43k to something near 12k, which must do great
things to caching. whether the offset fall into the range can
be checked during pattern creation.

another thing, instead of passing `int ll' and constantly evaluating
`transformation[ll][...]', we can precompute `transformation[ll]'
and pass it as argument wherever possible. we can even declare an
array of structures like

struct transformation {
  int trans1[MAX_OFFSET];
  int trans2[2][2];

struc transformation transformations[8];

and then pass a pointer to its elements around (since we still need
2D transformations in rare cases).

if no one has any objections, i think we can safely add the patch
to cvs. it doesn't break anything in regression, so it must be
bug-free. and i already know how to make it faster than the
current implementation (even without much work). it might in
principal break some parts of code which are not used currently
(like tree pattern matcher) but they will be easy to fix as soon
as anybody ressurect them.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]