lilypond-devel

Re: Should we be touching goops?


From: Jean Abou Samra
Subject: Re: Should we be touching goops?
Date: Sun, 5 Jun 2022 14:12:38 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1



On 04/06/2022 at 12:01, Luca Fascione wrote:
I can't say that I completely follow all that's being discussed here, but I
have seen a few statements pass by that I found confusing, and it might be
of some use if I share some comments based on what I myself have
experienced so far. I'll admit I lost track of who said what, so I'll focus
on the themes that came up as I heard them, without quotes. Do correct me
where I've misunderstood what was actually going on.

First off, there seems to be some confusion about the whole associativity
situation in computer arithmetic:

    - first things first: when using floats you can't rely on (a+b)+c ==
    a+(b+c). The reason is that every time you execute an arithmetic operation
    you need to round the result, and in the two expressions the roundings are
    not equivalent. Here's a classic counterexample: imagine that b has an
    exponent that is k bits lower than a's, with k > 3. If the k least
    significant bits of b are 101xxx (x meaning "any"), the rounder(*) will round
    "away from zero" when a and b have the same sign (up, if they're positive).
    Now, if c just so happens to knock off the 1xxx part of b, you can see the
    rounding will change its mind on you depending on how you associate. In
    particular, if c exactly zeroes that part out you'll even get round-to-even
    behaviour (which is slightly more convoluted to reason about). A concrete
    C++ demonstration follows this list.
    - (*) this assumes round-to-nearest; there are equivalent discussions
    for the other rounding modes, but I don't believe we'd employ them.
    - So: for the compilers I'm familiar with (say an obvious set: gcc,
    clang, msvc, nvcc, icc, the usual suspects) you need to be at -O3 (or enable
    specific optimizations with their flags) for this to happen. Otherwise
    a+b+c will be evaluated in source-code order, i.e. (a+b)+c; anything else
    is a bug. (It's common for compilers to have -O2 mean "only safe maths
    optimizations" and -O3 mean "play a little looser with maths".)
    - If you're using integers, none of the above applies.
    - Compilers translating a+b+c as a+c+b (integers only, as discussed):
    this happens for latency hiding or for register-pressure reasons. Different
    compilers reason about this differently, but in general either they ran
    out of registers and are trying not to spill (which on x86 is not that
    big a deal if it stays in L1), or there is evidence that b is coming straight
    from memory and its load cannot be hoisted up far enough to guarantee the
    stall won't happen (this again speaks to register pressure). Note that the
    reorder only saves you a cycle or two on a reasonably recent processor
    (10 yrs?), so yes there's "a difference", but I'd have trouble believing it'll
    be a material one (an L3 or main-memory load costs you tens of these on a
    good day). This applies to +, -, *. Division is different, but that one
    doesn't reorder as often and it's relatively rare anyway.
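A minimal standalone C++ example of the rounding point above (not LilyPond
code; no fast-math-style flags assumed), where the two associations of the
same three doubles give different results:

    #include <cstdio>

    int main ()
    {
      // 1e-16 is below half a unit in the last place of 1.0, so adding it to
      // 1.0 on its own gets rounded away; adding the two small terms first
      // lets them combine into 2e-16, which survives the rounding against 1.0.
      double a = 1.0;
      double b = 1e-16;
      double c = 1e-16;

      double left = (a + b) + c;   // each 1e-16 is rounded away: exactly 1
      double right = a + (b + c);  // 2e-16 survives: 1.0000000000000002
      std::printf ("(a+b)+c = %.17g\n", left);
      std::printf ("a+(b+c) = %.17g\n", right);
      std::printf ("equal? %d\n", left == right);  // prints 0
      return 0;
    }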



As David already said, the part of LilyPond we're discussing is using
rationals. Furthermore, (a + b) + c being close but not equal to
a + (b + c) for floats is not really an issue for most parts of LilyPond.

"a + (b + c) is close but not equal to "(a + b) + c" is different
from "a + (b + c)" works whereas "(a + b) + c" errors out (in Scheme)
or doesn't compile (in C++)".
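A hypothetical C++ sketch of that second situation (made-up types, not
LilyPond's actual Moment interface): only some pairwise additions are
defined, so the association decides whether the expression type-checks at all.

    // Hypothetical types for illustration only.
    struct GraceAdjust { long num, den; };  // a grace-time tweak
    struct MainLength  { long num, den; };  // a main-time length
    struct TimePoint   { long num, den; };  // an absolute time

    // MainLength + GraceAdjust is provided...
    MainLength operator+ (MainLength m, GraceAdjust g)
    {
      return { m.num * g.den + g.num * m.den, m.den * g.den };
    }

    // ...and TimePoint + MainLength is provided...
    TimePoint operator+ (TimePoint t, MainLength m)
    {
      return { t.num * m.den + m.num * t.den, t.den * m.den };
    }

    // ...but TimePoint + GraceAdjust deliberately is not.

    int main ()
    {
      TimePoint a {1, 4};
      MainLength b {1, 8};
      GraceAdjust c {1, 16};

      TimePoint ok = a + (b + c);      // compiles: b + c is a MainLength
      // TimePoint bad = (a + b) + c;  // error: no TimePoint + GraceAdjust
      (void) ok;
      return 0;
    }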


To the issue that languages don't come with built-in interesting object
systems/algebras and units: no, they don't, and for good reason, except that
the reason is the opposite of what was implied earlier.
It's not that such systems are useless or harmful, quite the contrary: the
languages expect you, the user, to write your own libraries to deal with this.
That's one of the key reasons for having object orientation: making objects
belong to classes of behaviour so that your code behaves
as close as is useful to the entities you're modeling.

Adding rich algebraic/units systems does happen, and I wouldn't say it's
particularly rare: here's a reference from my field,
https://dl.acm.org/doi/10.1111/j.1467-8659.2010.01722.x, together with a
study of how much it helped.
Another piece of evidence that folks use this is
https://www.boost.org/doc/libs/1_65_0/doc/html/boost_units.html; if it made
it into Boost, it's reasonable to infer that people care about this.
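For a flavour of what the Boost.Units style looks like in practice, here is a
minimal sketch along the lines of the library's introductory examples
(illustrative only, not LilyPond code):

    #include <boost/units/quantity.hpp>
    #include <boost/units/io.hpp>
    #include <boost/units/systems/si/length.hpp>
    #include <boost/units/systems/si/time.hpp>
    #include <boost/units/systems/si/velocity.hpp>
    #include <iostream>

    int main ()
    {
      using namespace boost::units;

      quantity<si::length> d = 3.0 * si::meters;
      quantity<si::time> t = 2.0 * si::seconds;

      // Dimensions live in the type system: a length divided by a time is a
      // velocity, and mixing incompatible dimensions refuses to compile.
      quantity<si::velocity> v = d / t;
      std::cout << v << std::endl;  // prints something like: 1.5 m s^-1

      // quantity<si::length> nonsense = d + t;  // would not compile
      return 0;
    }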

Further, unless I'm confused, this discussion is about Scheme algebra, not
C++ algebra; is that right?


The Moment type is one of our numerous "smob" types. Smobs
are the (now legacy) foreign object interface of Guile. Moments
can be (and are) manipulated from both C++ and Scheme.
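For readers who haven't met smobs: the legacy interface looks roughly like
the following, a stripped-down hypothetical sketch with a made-up payload
rather than LilyPond's actual Moment code (the real Moment stores exact
rationals).

    #include <libguile.h>

    // Hypothetical payload for illustration.
    struct my_moment
    {
      long num;
      long den;
    };

    static scm_t_bits my_moment_tag;  // type tag handed out by Guile

    static SCM
    make_my_moment (SCM num, SCM den)
    {
      my_moment *m = static_cast<my_moment *> (
        scm_gc_malloc (sizeof (my_moment), "my-moment"));
      m->num = scm_to_long (num);
      m->den = scm_to_long (den);
      SCM_RETURN_NEWSMOB (my_moment_tag, m);
    }

    static int
    print_my_moment (SCM smob, SCM port, scm_print_state *)
    {
      const my_moment *m
        = reinterpret_cast<my_moment *> (SCM_SMOB_DATA (smob));
      scm_puts ("#<my-moment ", port);
      scm_display (scm_from_long (m->num), port);
      scm_puts ("/", port);
      scm_display (scm_from_long (m->den), port);
      scm_puts (">", port);
      return 1;
    }

    void
    init_my_moment ()
    {
      my_moment_tag = scm_make_smob_type ("my-moment", sizeof (my_moment));
      scm_set_smob_print (my_moment_tag, print_my_moment);
      scm_c_define_gsubr ("make-my-moment", 2, 0, 0,
                          (scm_t_subr) make_my_moment);
    }

In this sketch, (make-my-moment 3 4) on the Scheme side returns an object
printed as #<my-moment 3/4>, while C++ code reaches the same payload through
SCM_SMOB_DATA.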


If that's the case, surely the truckloads of boxing/unboxing (or whatever
you call variable-to-value dereferencing in Guile) you're doing to
implement the language semantics drown out any of these considerations?
Also, do we have evidence that the implementation even can evaluate (+ a
(+ b c)) as if it were (+ (+ a b) c)?
It seems like it could only do that if it had a fairly intimate understanding
of the implementation of +, and again, it's a complete mystery to me how this
can possibly affect performance measurably.


From what I've heard, GOOPS used to be inefficient at dispatching
virtual calls. This problem is apparently gone now.

Boxing and unboxing have a certain cost, but LilyPond is not optimized
to the point where thinking about them would yield significant savings. The
most worthwhile optimizations are at a higher level than that.



For these reasons, I feel this conversation is trying to make a decision
based on aspects that are difficult to show to be very material.
However, I do feel there is a very material angle that should be discussed,
which only a couple of folks brought up (and which seems to have been fairly
unceremoniously shot down).

If you look at source code implemented with one class system or the other,
which one is clearer in its meaning for a user who is _moderately_
familiar with the ontology at hand?

I feel this is the more important aspect here, and I'll share what I have
observed when facing a similar choice, because the answer in the end was
not what I had expected at first.
I have worked a fair bit with systems that deal with geometric entities
(points, planes, triangles, vectors, rays, lines, curves, that sort of
stuff).
In our field there are two schools of thought: the mathematicians (like me)
want affine algebra class systems (points, vectors and normals are captured
by different classes) and the software engineers want plain vector
algebras (everything is a vector) and "you can keep it in your head what's
what". Inevitably, if you do this long enough, you end up working with both
systems, and the reality is that in the overwhelming majority of cases
only having vectors is fine. The reason is that in practice affine
algebra systems end up being more pedantic and more in your way than they
are worth. They do keep (certain) bugs away, but they cause so much extra
typing, and so many allocations that are very difficult to optimize away
reliably, that the final balance is not very good.
However, what's unbelievably confusing, and very fertile ground for
difficult-to-find bugs, is mixing covariant vectors with contravariant ones
(i.e. normals and vectors).
We've had to fix many bugs of this kind, notwithstanding our efforts at
careful and diligent naming conventions for variables and functions, to
make sure we had code that looked like it was doing what it was actually
doing. Folks with years of experience in the field, doing this all day,
got caught in mistakes of this kind.
And the real issue with these bugs is that they were subtle, because
the source didn't do enough to make it clear to the reader what was what.
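To make the covariant/contravariant point concrete for readers outside
graphics: a surface normal has to be transformed by the inverse-transpose of
the matrix that transforms directions, and a separate type is what lets the
compiler, rather than the reader, remember which rule applies. A hypothetical
sketch:

    #include <array>

    using Mat3 = std::array<std::array<double, 3>, 3>;

    // Hypothetical distinct types: same data, different transformation rule.
    struct Vec3    { double x, y, z; };  // direction: transforms by M
    struct Normal3 { double x, y, z; };  // normal: transforms by M's inverse-transpose

    Vec3 transform (const Mat3 &m, const Vec3 &v)
    {
      return { m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z,
               m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z,
               m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z };
    }

    Normal3 transform (const Mat3 &m_inv_t, const Normal3 &n)
    {
      // The caller supplies the inverse-transpose; the separate type at least
      // guarantees a Normal3 can never silently take the Vec3 path above.
      return { m_inv_t[0][0] * n.x + m_inv_t[0][1] * n.y + m_inv_t[0][2] * n.z,
               m_inv_t[1][0] * n.x + m_inv_t[1][1] * n.y + m_inv_t[1][2] * n.z,
               m_inv_t[2][0] * n.x + m_inv_t[2][1] * n.y + m_inv_t[2][2] * n.z };
    }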

For me, one lesson learned from this is: there is a cost to what you have to
keep in your head while you're reading a piece of code, no matter how small.
Your job as the designer of an ontology is to make sure that this cost is
spent in the way that is most useful to the community working on the codebase.
You want to maximize the usefulness of code reviews, and from this comes the
observation above:
folks need to be _moderately_ versed in the ontology at hand to be able to
spot bugs; they don't need to be deep experts.
And there are several reasons for this:
  - if you have this, you enlarge the group that can usefully comment on a
commit during review
  - in turn this means these people will participate in reviewing important
code, which will help them learn how the system is put together, and in
places where it actually matters
  - this helps them in two ways: they learn what the system does (and where
that happens), and they pick up good patterns for writing their new code
  - re-applying these good patterns makes the whole codebase look more
regular, which lowers the cognitive overhead when you're reading code
  - and all of the above creates serendipity and a very valuable
self-sustaining loop (*)
  - on top of that, this frees up the deep experts to work on the harder
problems: when faced with a challenge, finding a place to go and drawing
the path to get there is where you want to spend your money. Once the path
is traced, you'll know you have a good-quality ontology when walking this
new path is a walk in the park for everyone else. People reading a
well-thought-out solution to a hard problem should go "of course!", not
"my brain hurts...".

(*) code quality goes up, folks write more relevant code because fewer bugs
are introduced, bugs are caught early so the fix is not particularly
involved (and the relevant code is still fresh in the author's head), folks
feel like they're contributing in meaningful ways, more time is spent on
new functionality instead of hammering at old material, which makes people
more satisfied and fulfilled ... you get the idea



I think we all agree that these are good things in
any software project. The question is whether a
given change will contribute enough to these goals
to be worth its costs and downsides.


Best,
Jean



