bug-bison

Re: Comments in %union processed incorrectly


From: Hans Aberg
Subject: Re: Comments in %union processed incorrectly
Date: Tue, 1 Jan 2002 12:43:43 +0100

At 13:40 -0800 2001/12/31, Paul Eggert wrote:
> I would prefer not making this change
>just before a release.

I also think that you should just finish off the current version and not
worry about C++ beyond the very crude "compile C as C++" support, as
genuine C++ support will take more extensive changes. It is better to start
working on C++ later, which I think could start with 1.31: I have now
written two skeleton files that use only standard C++ constructs and thus
avoid the malloc/union problems.

In the first C++ skeleton file, I merely replace the stacks using C++
standard containers (or a custom stack). In the second C++ skeleton file, I
merge the two or three parser stacks into a single one. I think this might
provide a faster parser, as one then only has to make one allocation
instead of two or three.

A way back to a common C/C++ interface might be to replace these C++
constructs with macros, which then can be replaced by C constructs in the C
version.

>> It just happens that I do have a Location class, which
>> does have ctors.  But now, because of this single union, this is no
>> longer proper C++: classes with ctors cannot be stored in a union.
>
>Sorry, I'm not a C++ expert.  I don't know what a "ctors" is.  What is
>the relationship between ctors and yymemcpy?  For example, can one
>safely use yymemcpy to copy the bytes of a Location class?  If not,
>your code won't work anyway, even if it does compile.

In C++, if one has a class, one can define constructors for it ("ctor" is
short for constructor); the class is then said to have non-trivial
constructors. It may look like
  class A : public B {
    C c;
  public:
    A(C c0) : B(c0), c(c0) {}
    ...
  };
Here, ": B(c0), c(c0)" is called the "ctor-initializer".

But the main point in our discussion is that the compiler no longer can
ignore these constructors. It poses a problem in a union, because it is not
possible for the compiler to know when these constructors should be applied
-- by contrast, under C, there are no constructors to apply. And it is not
possible to know whether it is OK to merely use raw memory copying, as with
memcpy.

>If your code does safely work with yymemcpy, perhaps you could explain
>the C++ issues to me, to help me propose a solution that doesn't cause
>problems with C++.

So one cannot rely on memcpy to copy an arbitrary C++ class.
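As a rough illustration of the point, here is a minimal sketch. It assumes a compiler with the C++11 <type_traits> header, which postdates this message; PlainLocation, Location, safe_to_memcpy and copy_elements are hypothetical names invented for the example and are not part of Bison.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical C-style location record: plain data only.
struct PlainLocation {
  int line;
  int column;
};

// Hypothetical Location class with user-defined constructors. It owns a
// std::string, so a correct copy must go through the copy constructor or
// assignment operator rather than duplicate raw bytes.
class Location {
public:
  Location() : file_("<unknown>"), line_(0) {}
  Location(const std::string& file, int line) : file_(file), line_(line) {}
  const std::string& file() const { return file_; }
  int line() const { return line_; }
private:
  std::string file_;
  int line_;
};

// True when copying a T with memcpy is well defined.
template <typename T>
constexpr bool safe_to_memcpy() {
  return std::is_trivially_copyable<T>::value;
}

// Element-by-element copy through operator=, valid for any assignable T.
template <typename T>
void copy_elements(T* dst, const T* src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
}
```

With these definitions, safe_to_memcpy<PlainLocation>() is true and safe_to_memcpy<Location>() is false, whereas copy_elements works for both types because it lets the class do its own copying.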

>Here's an idea: we could avoid yymemcpy entirely, and simply use
>for-loops that copy the stack elements one by one.

The solution I proposed (for 1.31+) is to build an entirely new stack macro
interface:
  #define YYSTACK_POP
  #define YYSTACK_PUSH(x)
  ...
It would not be difficult for me (or you) to extract such an interface from
the C++ skeleton files I made.

Then one can build either a C or a C++ interface onto that.

I need to tweak the Bison sources a little, to write out some more macros.
But these tweaks are relatively minor.
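To make the idea concrete, here is a minimal sketch of what such a macro layer could look like. Only the names YYSTACK_POP and YYSTACK_PUSH(x) come from this message; the macro bodies, the additional macros YYSTACK_DECL, YYSTACK_TOP and YYSTACK_DEPTH, the switch YYSTACK_USE_CONTAINER, and the driver yystack_demo are assumptions made for illustration, not the contents of the actual skeleton files.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical switch: 1 selects container-backed macros, 0 array-backed ones.
#ifndef YYSTACK_USE_CONTAINER
# define YYSTACK_USE_CONTAINER 1
#endif

#if YYSTACK_USE_CONTAINER
// C++ backing: the container handles growth and element construction.
# define YYSTACK_DECL(T)  std::vector<T> yystack
# define YYSTACK_PUSH(x)  (yystack.push_back(x))
# define YYSTACK_POP      (yystack.pop_back())
# define YYSTACK_TOP      (yystack.back())
# define YYSTACK_DEPTH    (yystack.size())
#else
// C-style backing: fixed array plus a depth counter (no overflow check).
# define YYSTACK_DECL(T)  T yystack[200]; std::size_t yydepth = 0
# define YYSTACK_PUSH(x)  (yystack[yydepth++] = (x))
# define YYSTACK_POP      (--yydepth)
# define YYSTACK_TOP      (yystack[yydepth - 1])
# define YYSTACK_DEPTH    (yydepth)
#endif

// Toy driver written only against the macros: pushes 1..n (n >= 2),
// pops once, and returns the new top element plus the remaining depth.
inline int yystack_demo(int n) {
  YYSTACK_DECL(int);
  for (int i = 1; i <= n; ++i) YYSTACK_PUSH(i);
  YYSTACK_POP;
  return YYSTACK_TOP + static_cast<int>(YYSTACK_DEPTH);
}
```

The driver never names a container or an array directly, so the same parser code could sit on top of either backing; compiling with -DYYSTACK_USE_CONTAINER=0 switches to the array version.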

>  We could use
>__builtin_memcpy if using GNU C (not GNU C++), for performance reasons
>on GNU C hosts; but otherwise we could just use the for-loops.
>Perhaps this would avoid the problem with C++ constructors that Hans
>Aberg mentioned.

For genuine C++ support, I think that only using C++ container classes
would do. The C++ skeleton files I made can use the standard C++ containers
std::deque (default), std::vector, std::list, which would suffice for a
while in most uses.
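A minimal sketch of how one skeleton could switch between these containers follows. The class parser_stack and the helper drain_sum are hypothetical names for this example and are not taken from the skeleton files; the standard std::stack adaptor works along the same lines and also defaults to std::deque.

```cpp
#include <deque>
#include <list>
#include <vector>

// Hypothetical stack adaptor: any sequence container offering push_back,
// pop_back, back, empty and size can serve as the underlying storage.
template <typename T, typename Container = std::deque<T> >
class parser_stack {
public:
  void push(const T& x) { data_.push_back(x); }
  void pop() { data_.pop_back(); }
  T& top() { return data_.back(); }
  bool empty() const { return data_.empty(); }
  typename Container::size_type size() const { return data_.size(); }
private:
  Container data_;
};

// Pops every element and returns the sum; works for any parser_stack of int.
template <typename Stack>
int drain_sum(Stack& s) {
  int total = 0;
  while (!s.empty()) {
    total += s.top();
    s.pop();
  }
  return total;
}
```

Changing the second template argument, for example parser_stack<int, std::vector<int> >, is then the only edit needed to try a different container.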

The C stacks you have implemented are in principle equivalent to
std::vector, with two differences: the C stacks do not apply C++
constructors to the data as they should, and the C++ container classes
cannot use alloca (because they allocate inside a function call, which is
where alloca would reserve its space, so that space is gone by the time the
call returns and the class starts to operate).

It does not seem worth the effort to write working C++ stacks in C using
memcpy. But if you really want to try, I think it is better to do that
later on.

>> However, I think of merging these different stacks into one. Then one would
>> instead use:
>> struct yyalloc
>> {
>>   short yyss;
>>   YYSTYPE yyvs;
>> # if YYLSP_NEEDED
>>   YYLTYPE yyls;
>> # endif
>> };
>> and use that type for a single stack.
>
>That might make sense, though I would prefer not making this change
>just before a release.

In 1.31+. However, I have already decided to change this to:

// A type that can be used for a single stack (instead of the original two
// or three).
struct yystack_type
{
  typedef short state_type;
  struct value_type {
    YYSTYPE semantic;   // Semantic value.
#if !YYLSP_NEEDED
    value_type() : semantic() {}
    value_type(const YYSTYPE& x) : semantic(x) {}
#else
    YYLTYPE location;
    value_type() : semantic(), location() {}
    value_type(const YYSTYPE& x, const YYLTYPE& l)
     : semantic(x), location(l) {}
#endif
  };

  state_type state;
  value_type value;

  yystack_type() : state(0), value() {}
  yystack_type(state_type s, const value_type& v)
   : state(s), value(v) {}

#if !YYLSP_NEEDED
  yystack_type(state_type s, const YYSTYPE& x)
   : state(s), value(x) {}
#else
  yystack_type(state_type s, const YYSTYPE& x, const YYLTYPE& l)
   : state(s), value(x, l) {}
#endif
};

I didn't want to cause confusion by reusing the name yyalloc, and it seemed
logical to have a state value plus a combined semantic/location value. One
may wonder whether this setup is optimal for performance, but the first step
is always to get something that works.
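For comparison, here is a minimal sketch of the two layouts side by side, assuming std::vector as the container. The types semantic_type and location_type are hypothetical stand-ins for YYSTYPE and YYLTYPE, and separate_stacks, entry and combined_stack are names invented for this example; none of it is taken from the skeleton files.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for YYSTYPE and YYLTYPE.
typedef int semantic_type;
struct location_type { int line; int column; };

// Separate layout: three containers, each one grown on its own.
struct separate_stacks {
  std::vector<short> states;
  std::vector<semantic_type> values;
  std::vector<location_type> locations;

  void push(short s, semantic_type v, location_type l) {
    states.push_back(s);
    values.push_back(v);
    locations.push_back(l);
  }
  void pop() {
    states.pop_back();
    values.pop_back();
    locations.pop_back();
  }
  short top_state() const { return states.back(); }
  std::size_t depth() const { return states.size(); }
};

// Combined layout: one entry per stack slot, hence one container to grow.
struct entry {
  short state;
  semantic_type value;
  location_type location;
};

struct combined_stack {
  std::vector<entry> entries;

  void push(short s, semantic_type v, location_type l) {
    entry e = {s, v, l};
    entries.push_back(e);
  }
  void pop() { entries.pop_back(); }
  short top_state() const { return entries.back().state; }
  std::size_t depth() const { return entries.size(); }
};
```

Both structs expose the same push/pop/top_state/depth operations, so a timing loop can be run against either one without changing the calling code.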

>  I suspect that it was done with separate
>stacks originally for performance reasons.  Those reasons may no
>longer apply these days, but I would measure any performance
>degradation (if any) due to this change before installing it.

I think that performance might go up if one uses only one stack, but
perhaps it was different on the machines of the past with little memory. (I
think that in the original Bison there was no stack policy implemented, so
that fixed separate stacks would not hurt performance; there is then little
extra time spent in incrementing more than one stack pointer.)

Anyway, I made both types of skeleton files, using separate/combined
stacks, so it should be easy for people to profile it: Just change the
skeleton file.

  Hans Aberg




