bug-guile
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#19883: Correction for backtrace


From: David Kastrup
Subject: bug#19883: Correction for backtrace
Date: Thu, 26 Feb 2015 16:30:28 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)

address@hidden (Ludovic Courtès) writes:

> David Kastrup <address@hidden> skribis:
>
>> Is there a reason you are not using the test program provided with this
>> bug report?
>
> Sorry, I had overlooked it.  Please assume good faith.
>
> With this patch:
>
> diff --git a/smobs.tcc b/smobs.tcc
> index c8b9dfe..b1d61b7 100644
> --- a/smobs.tcc
> +++ b/smobs.tcc
> @@ -132,9 +132,6 @@ void Smob_base<Super>::init ()
>    // types in order to be able to compare them.  The other comparisons
>    // are for static member functions and thus are ordinary function
>    // pointers which work without those contortions.
> -  if (static_cast<SCM (Super::*)()>(&Super::mark_smob) !=
> -      static_cast<SCM (Super::*)()>(&Smob_base<Super>::mark_smob))
> -    scm_set_smob_mark (smob_tag_, Super::mark_trampoline);
>    scm_set_smob_print (smob_tag_, Super::print_trampoline);
>    if (&Super::equal_p != &Smob_base<Super>::equal_p)
>      scm_set_smob_equalp (smob_tag_, Super::equal_p);
>
>
> “./test 1000 1000 10000” works fine.

That's not really surprising: this specifies a "fork factor" of 1000 and
1000 points of data which means that the data structures are nested only
one level deep.

Try ./test 2 2000 200

which will cause a nesting of about 10 levels, making it much more
likely for collection/allocation problems to strike.

> However, it’s not representative since the C++ objects in this test do
> not hold ‘SCM’ references, AFAICS.

Why isn't it representative?  That's _very_ much the situation in
LilyPond's code base.

>> LilyPond's GUILEv2 branch is currently out of order again since
>> 2.0.11 changed encoding mechanisms _again_
>
> Please submit a different bug report.

There were separate bug reports.  This is considered a feature.

>> It would not help since many of the references are stored in STL
>> containers (like std::vector <Grob *>) which have their data
>> allocated/deallocated separately from the memory area of the
>> structure itself.
>
> Oh, OK.  Still, I don’t think this is a problem because each C++
> object has a corresponding SMOB, and said SMOB is GC-protected; thus
> the C++ object itself is also GC-protected until the SMOB is
> unprotected.

The code given in test.cc is representative for LilyPond: most of the
C++ objects refer to other C++ objects via pointers, and the protection
of SMOB and C++ objects is managed through the mark callbacks.  Complex
C++ objects contain their own SCM value as part of the Smob base class.
Simple C++ objects (derived from Simple_smob) don't and are only
optionally managed by GUILE.

The test program currently does not demonstrate use of Simple_smob: at
the current point of time getting the crashes of Smob under control is
the first target.

> Here’s the patch I’ve ended up with:
>
> diff --git a/smobs.hh b/smobs.hh
> index 3701280..a41a645 100644
> --- a/smobs.hh
> +++ b/smobs.hh
> @@ -263,6 +263,20 @@ private:
>  protected:
>    Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { };
>  public:
> +  static void *operator new (std::size_t size)
> +  {
> +    /* This C++ object is referred to by the corresponding SMOB, which is
> +       itself GC-protected.  Thus, no need to protect the C++ object.  */
> +    return scm_gc_malloc (size, "lily-smob");
> +  }
> +
> +  static void operator delete (void *thing)
> +  {
> +    /* Nothing to do: the GC will reclaim memory for THING when it deems
> +       appropriate.  */
> +    // printf ("delete %p\n", thing);
> +  }
> +

As I stated: this will not help with STL containers which are
extensively used in pretty much every data structure of LilyPond.

>> Frankly, I don't get the current strategy of GUILE: basically any use
>> of scm_set_smob_mark will result in a function that can be called
>> with garbage from a smob that has already been deallocated via the
>> function registered with scm_set_smob_free.
>
> That’s not true.

Shrug.  I posted the backtrace and the example program.

> Let me explain what my personal expectations are.  First, like many
> others, I’m a volunteer with limited bandwidth.  Thus, I really prefer
> bug reports that are as concise as possible and to-the-point; I can’t
> afford to read so much.

That's what I consider the primary reason that every offer to help with
LilyPond so far has gone dead the moment I provided a branch and recipe
to work on.  LilyPond is a large and complex application.  That's why
I went to the pain of providing standalone code demonstrating the memory
allocation and management problems so that this could be worked on
independently from other problems.

> Second, I don’t want to mix issues: this bug report is about a
> specific GC issue, not about an encoding issue.  It’s better for me
> (and everyone I guess) when a bug report remains focused.

Which is why I provided a separate test program that does not break
whenever the "stable" Guile version changes its encoding APIs again.
Several weeks ago I promised RMS to update the GUILEv2 branch to the
current version of LilyPond as he got a renewed promise of cooperation.
It turned out that in the mean time Ubuntu had updated Guilev2 from
2.0.9 (or something) to 2.0.11, and the encoding issues I got under
control for 2.0.9 were back with a vengeance where a speedy solution was
not in sight.

So I spent the time to create test code _only_ involving the memory
management issues.  Now the complaint appears to be that the code is not
representative for LilyPond.  It most certainly is.

> Third, we must fix these LilyPond vs. Guile 2 issues.  You and others
> have spent a lot of time on this already.

In the last two years, nobody except myself spent any time on
GUILEv2/LilyPond.  Previous to that, Ian Hulin spent time on
reorganizing our code base and startup code in order not to get the byte
compilation, module system and other interpreter/compiler differences of
GUILEv2 mess up the basic operation, mostly by disabling a lot of stuff.
There will be a lot of followup work to get code management and
performance into reasonable shape once it works at all.

When it is not possible to run the test suites without random crashes,
there is no reasonably effective way for people to work on those aspects
however.

> I think it would help to get everyone involved on both sides.  Thus,
> could you Cc: this bug report to the LilyPond developer list, or the
> corresponding LilyPond bug report?  This is really important to me.

Shrug.  I'll put a link to this bug report to a suitable LilyPond issue.
It will not likely trigger a response or other feedback, however.
Nobody but myself is working on GUILEv2 migration.  The current C++
template class implementation/abstraction of the GUILE memory management
has been written by me.  No other currently active developer has had
either significant involvement with the current code, or the
functionally similar previous code that used C preprocessor macros
instead of language features to do essentially the same.

> Please don’t take it personally or anything.  I’m really trying hard
> to see how we can improve the collaboration between the two projects.

For better or worse, LilyPond has no other developer to offer than
myself for that task.  There is nobody else available who even has had
exposure to GUILEv2 or even GUILE1 internals.

-- 
David Kastrup





reply via email to

[Prev in Thread] Current Thread [Next in Thread]