emacs-devel

Re: Suppressing native compilation (short and long term)


From: Eli Zaretskii
Subject: Re: Suppressing native compilation (short and long term)
Date: Sat, 15 Oct 2022 11:51:06 +0300

> From: Liliana Marie Prikler <liliana.prikler@gmail.com>
> Cc: rlb@defaultvalue.org, emacs-devel@gnu.org
> Date: Fri, 14 Oct 2022 21:46:09 +0200
> 
> > > > When you encounter bugs in native compilation, please report them
> > > > to us, so we could fix them.  As of now, we are not aware of any
> > > > such bugs that were reported and haven't been fixed.  So if you
> > > > still have such problems, please report them ASAP.
> > > Of course, that's the intention, but this fix will only make it
> > > into the next Emacs release.  Thus, if you're between releases, you
> > > still need a workaround.
> > 
> > If the fix is urgent, why can't you patch the sources when you
> > prepare your distribution?
> Guix prides itself in being a package manager that can work around many
> failures (even as the proper workaround to bugs is discussed in mailing
> lists).  The fact that the solutions to this issue are "compile 28.1
> without native-comp" or "use Emacs 27" does not reflect that
> particularly well.

I think this answers a different question.  I asked why you cannot
patch the Emacs you distribute when you consider a fix to be important
enough to not wait until the next Emacs release.  My point is that
reporting bugs in a timely fashion will help us fix them early on, and
you will then have a possibility of backporting the fixes to a
released Emacs and distributing an updated package with the fix, if
you think that's important enough.

> > > A particular candidate known to cause issues with the currently
> > > packaged 28.1 is [1].
> > 
> > Where's the description of the actual problem with natively compiling
> > that package?  And would you please submit a bug report with the
> > details, if you know them?
> I am not personally affected, so I can't.  I could direct people to the
> Emacs mailing lists, but it seems people in other threads have already
> started debugging.  Do you still wish me to do so? 

Which threads are you alluding to here?  Your [1] is just a reference
to ido-completing-read-plus package, and I don't see the description
of the problems with native-compilation on that site.  So yes, I'd
like to hear a description of the problem in that case.

> > > > Why isn't it sufficient to use no-native-compile?  It just means
> > > > that on some architectures the corresponding file will be loaded
> > > > as byte-compiled, and thus will be slightly slower (how much
> > > > slower depends on the code, so if you are worried, my
> > > > recommendation is first to measure the difference -- you might be
> > > > surprised).
> > > Because it'd require a distro-wide fix to address something that
> > > e.g. only happens on some AMD CPUs.
> > 
> > I'm asking why doing so is a problem?  Did you measure the effect on
> > performance and find it to be unacceptable in some cases?
> Isn't performance one of the main reasons to use native compilation? 

On average, yes.  But it depends on what the original Lisp code does.
We've found that in some cases the performance gains are minimal, and
in at least one very special case we found that native-compilation
produces slightly slower code.  Which is why I asked the question
above: it is quite possible that the (hopefully, few) packages where
you need to avoid native-compilation for now don't gain performance
from using native-compilation enough to justify any more elaborate
measures.  And this is a temporary measure anyway, because those
problems will eventually be fixed, whether in the packages themselves
or in Emacs core.
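
As a minimal sketch of the file-level measure discussed above:
no-native-compile is a file-local variable, so it can be set in the
first-line cookie of the affected file.  The file name and the function
below are hypothetical and exist only for illustration.

```elisp
;;; example-package.el --- Hypothetical package  -*- lexical-binding: t; no-native-compile: t; -*-

;; With the cookie above, this file is not natively compiled; it can
;; still be byte-compiled and loaded as a .elc file as usual.

(defun example-package-hello ()
  "Hypothetical function, present only to make the file non-empty."
  (message "hello"))

(provide 'example-package)
;;; example-package.el ends here
```

A distribution could add such a cookie as a patch to the one affected
package, leaving every other package natively compiled.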

> Note that I am talking in hypotheticals here when mentioning the AMD
> thing, i.e. we could very well imagine a performance-critical Emacs
> package having a native-compilation bug (I imagine those to be
> particularly likely for those trailing unreleased Emacs versions,
> though thankfully I don't think we've encountered one so far.)

Let's not be bothered by hypothetical cases until they actually
emerge.  When there are specific situations where this happens and
performance gains from native-compilation are critical, we can always
look for specific solutions for those cases, something that is
impossible without concrete cases.

> > OK, so why is this relevant to the issue of disabling?  Those who
> > choose ahead-of-time compilation will never see async JIT
> > compilation, and those who selected not to do ahead-of-time will
> > naturally see JIT compilation, as they've chosen.  What is the
> > problem here?
> The problem is that I can't meaningfully choose the "I don't want JIT
> for stuff I haven't AOT'd" option, especially not combined with "but I
> do want to load what I have AOT'd".

As I already explained, this mode of operation doesn't make sense to
me, and is currently not supported for that reason.  I fail to see why
people would want native-compilation for some parts of Emacs, but not
for others.  I haven't yet seen a valid use case where that would make
sense as the desired, clean, and non-kludgey solution.

Only one valid use case was brought up to this date, where it would be
desirable to delay JIT native-compilation temporarily: when the user
runs a laptop on batteries.  We will probably provide a solution for
that, which will automatically re-enable JIT compilation when AC power
is connected.  This would be a clean, non-kludgey solution for that
case.

None of the problems you describe are of that nature.  They all sound
like someone wants to arbitrarily disable native-compilation in some
cases, but not others, where reasonable solutions already exist.

And if you still disagree, then let's agree to disagree, because we
are just repeating the same arguments over and over again.

> > > > If a package is a single file or a small number of files, those
> > > > users can add the no-native-compile cookies in those files.
> > > This is not trivial in the case where the Elisp code is placed in
> > > system-managed storage and thus requires elevated privileges to
> > > modify (as is the default in most package managers, I assume).  Of
> > > course, you can copy the file to your $HOME, but editing it with a
> > > broken Emacs is rather painful.
> > 
> > Using broken packages is always painful, and native compilation
> > doesn't change that.
> Using broken packages normally doesn't result in the OOM killer firing
> off.

It could, rarely.

And which problem of native-compilation caused the OOM killer?  Where
is that problem described in enough detail for us to investigate it?
Was it reported to the Emacs bug-tracker, and if so, what is the bug
number, please?

IOW, we'd definitely want to avoid such catastrophic failures, but we
need the details to investigate and fix them.  I can tell you that I've
been using Emacs 28 with JIT native-compilation enabled for the best part
of this year, and have yet to see any problems even approaching the
one mentioned above.  So such problems are quite exceptional, and need
to be reported with every possible detail for us to be able to fix
them quickly.  They are definitely not a reason to disable
native-compilation.  We generally try to provide at least a workaround
for critical problems, once we have enough detail to understand what's
going on, so reporting a problem quickly will in many cases yield a
quick solution that doesn't hamper unrelated parts of the user's usage
patterns.

> > Packages provided by a distribution and installed into directories
> > where users cannot easily write should be well tested by the
> > distributor.  
> I think you're underestimating the number of breakages that can happen
> in a rolling release model.  Not every distro is as stable as Debian,
> but the joke's still on you because despite Debian's hard requirements,
> they still ended up encountering this bug.

Sure, that's understandable.  But each new problem that is found and
reported should cause the corresponding package to be updated with a
fix.  I don't see why such problems are deemed as reasons to disable
native-compilation for the entire Emacs session, or for requirements
that they be "fixed" in core.  Bugs should be fixed where their root
cause is.

> > You mean, you find the loading of preloaded *.eln files at startup
> > annoying?  Then you should know that this is the best solution we
> > found for dumping Emacs with natively-compiled preloaded code.
> No, I find it annoying that Emacs supposes it has a writable eln-cache
> always.

The user's home directory should always be writable.  This is required
by many Emacs features regardless of native-compilation.  For example,
saving customizations writes to a subdirectory of the user's home
directory, as does desktop.el or save-place etc.

If this is a problem during installation of packages, which run at
root level, the installation procedure can tweak
native-comp-eln-load-path to make sure there's a writable directory
there, or point HOME to a non-existent directory.

> This is not the case in typical package manager scenarios and
> it also isn't the case when users choose to make (parts of) their $HOME
> read-only, which is a supported configuration in Nix and Guix.

Users make ~/.emacs.d/ read-only?  Then how do they use all the
features, some of which are mentioned above, that write to that directory?

> I can't think of a good reason why one would want to assume this
> invariant.

If this use case is supported by pointing the relevant variables, like
save-place-file, eshell-directory-name, desktop-dirname, etc., to
non-default places, then they can do the same with
native-comp-eln-load-path.  If this is not what you mean, please
describe how Nix and Guix support this use case where parts of $HOME
are read-only, and let's see how native-compilation should support it.
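
A rough sketch of what pointing these variables elsewhere could look
like, assuming a writable directory outside the read-only part of
$HOME.  The name my-state-dir and its path are hypothetical; the other
variables are the ones named above.  Whether an early init file is
early enough for the eln-cache entry to take effect in a given Emacs
version is also an assumption that should be checked.

```elisp
;; Sketch for an early init file.  `my-state-dir' is a hypothetical
;; name; pick any directory that is actually writable.
(defvar my-state-dir (expand-file-name "~/.local/state/emacs/")
  "Hypothetical writable directory for files Emacs creates at run time.")

(make-directory my-state-dir t)

;; Features that normally write under ~/.emacs.d/.
(setq save-place-file (expand-file-name "places" my-state-dir)
      eshell-directory-name (expand-file-name "eshell/" my-state-dir)
      desktop-dirname my-state-dir)

;; Put a writable directory at the front of the eln load path, so that
;; newly produced *.eln files have somewhere to go.  The variable is
;; only meaningful in builds with native-compilation support.
(when (boundp 'native-comp-eln-load-path)
  (push (expand-file-name "eln-cache/" my-state-dir)
        native-comp-eln-load-path))
```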

> > If you know of a better solution that doesn't suffer from any fatal
> > issues we found with the alternatives, please suggest such solutions,
> > and we will definitely consider them.
> I haven't read the discussions around the alternatives, but couldn't
> you just generate one trampoline per function which you use as soon as
> it's advised?

And then re-generate it again each time the advised function is called
again?

> Also, how come advice isn't breaking byte-compilation in exactly the
> same manner?

Andrea, can you please answer that?  I have only a very general
understanding of why trampolines are needed for native-compilation.

> > As I said earlier, disabling loading of native code made no sense to
> > us while Emacs 28 was in development; it still doesn't.  Either one
> > wants native-compilation, or one doesn't.  Making Emacs code more
> > complicated and harder to maintain due to features that make no sense
> > to us is a non-starter.  I see no problem with having to use a
> > separate build, since building a release tarball takes a minute or so
> > on a modern system.  And distros should definitely have a build
> > without native-compilation on offer, for a variety of valid reasons.
> I don't think that asking distros to package every Emacs variant twice
> is a great idea.  At Guix, we prefer to offer the most complete version
> of a package, so we ship with native compilation enabled.

I think this is a mistake.  Native-compilation is not for everyone.
It requires GCC and Binutils to be installed, and who says every Emacs
user wants that?

More generally, when we add optional features, we don't consider
whether having them all in the same build will make sense.  For
example, ImageMagick support has some advantages and some (quite
serious, IMO) disadvantages, so always providing it because it's "the
most complete version" doesn't necessarily make sense for the users.

> > > While bytecode performance on such machines might too be slow (but
> > > perhaps tolerable for the task), ahead-of-time compilation, perhaps
> > > with offloading, is preferable.
> > 
> > I recommend against this, because it is impossible to rely on AOT
> > installations to never compile at run time.  Users cannot rely on
> > that, and should be advised accordingly.
> But why can't they?

Trampolines are one reason.  I'm sure there are others.  Again, we
didn't design native-compilation support in Emacs to be switchable on
and off at run time, so it's small wonder that it doesn't work
reliably.  It would be a surprise if it did.

> > > For another, it can cause bugs like [2].
> > 
> > That bug by itself (the cause of massive launching of async
> > subprocesses) was never explored or described in that thread?  It
> > seems like the discussion switched to looking for ways of disabling
> > native-compilation right away, without a good understanding of what was
> > happening.  Or did I miss something?  Async compilation by default
> > never launches more subprocesses than half the execution units of the
> > CPU, so what is described there should be carefully investigated and
> > the findings described.
> It'd be weird if someone found a counterexample to the above statement.

I don't understand this comment, sorry.
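
For reference, the limit mentioned in the quoted text is governed by
native-comp-async-jobs-number.  A minimal sketch of lowering it, where
the value 1 is only an illustrative choice:

```elisp
;; 0 (the default) means: use half the number of the CPU's execution
;; units, or one if there is just one execution unit.  A positive
;; integer sets the number of async compilation subprocesses directly.
(setq native-comp-async-jobs-number 1)
```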

> > The other problem in that discussion, with warnings during async JIT
> > compilation is well-known, was reported several times, and the
> > culprit is always in the 3rd-party packages being compiled, which
> > should be fixed.  In any case, those are just warnings in almost all
> > cases, so their only adverse effect is annoyance (that can be
> > suppressed by clicking the button in the message).
> I read no such problem in that discussion.  Do we read the same thread?

I hope so.  I referred to this:

  https://issues.guix.gnu.org/issue/57878#13
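
As a sketch of how the annoyance can be reduced without disabling
anything: the defcustom native-comp-async-report-warnings-errors
controls how warnings from async compilation are reported.  This only
hides the warnings; the packages that trigger them still need fixing.

```elisp
;; `silent' logs warnings from async native compilation in the
;; *Warnings* buffer without popping that buffer up.
(setq native-comp-async-report-warnings-errors 'silent)
```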

> > Again, I see no reason to blame the upstream project for these
> > issues.  They should be solved by the offending 3rd-party packages,
> > and the distro should ideally uncover and fix them before they get to
> > users (I presume that you build and compile the add-on packages you
> > offer?).
> I'd like to tap at the "rolling release distro is not Debian" sign, but
> again, stable distros like Debian are experiencing issues with native
> compilation.

Once again: no one expects all the issues to be found in advance, but
when a new issue in a package is found, I do expect the distro to fix
it and publish an updated package.  I do not expect the distro to come
back to the upstream project and ask for knobs to deal with bugs in
3rd-party packages uncovered by latest Emacs features.

> > > Which defcustom?
> > 
> > Begin with those described in the ELisp manual, in the
> > "Native-Compilation Variables" node.  And my recommendation is to
> > review _all_ of the defcustoms in comp.el
> The only one I found is setting native-comp-speed to -1.  Is that the
> solution?  It doesn't appear to be.

It _is_ a solution, for one class of problems with native-compilation.
Another solution is to tweak native-comp-eln-load-path.
Yet another one is to temporarily point HOME to another, perhaps
non-existent, directory.
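
A minimal sketch of the first of these, assuming it is placed in an
init file that is read before any compilation is triggered:

```elisp
;; With native-comp-speed at -1, functions are kept in bytecode form
;; and no native compilation is performed.  The `boundp' guard keeps
;; this harmless where the variable is not defined.
(when (boundp 'native-comp-speed)
  (setq native-comp-speed -1))
```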

> > To summarize: native compilation in a build which supports it is
> > ubiquitous, and is not designed to be disabled except by
> > no-native-compile on a file by file level.  If a more general
> > disabling is needed for some reason, users should simply use a build
> > without native-compilation.  It's the same as various toolkit builds:
> > if the toolkit is broken or doesn't fit the user's needs, those users
> > should install a build with a different toolkit.
> Pardon my French, but that thinking in and of itself is broken.  Native
> compilation is not a choice in which you pick the one that most suits
> your fancy from a range of options – it could be that if you allowed
> the user to choose between libgccjit, clang and some other compilers
> that shall not be named, not that I recommend you implement this.  As
> such, I think users who do want to use native compilation should get
> some more say in when, where, and what to compile.

We hear the users and their complaints, and fix stuff that we think
belongs to Emacs.  This was and will always be the case.  In this
case, I'm not convinced that the issues you describe justify a new
knob in Emacs.  You've described several issues which either already
have solutions (which you reject), or should be solved elsewhere, not
in Emacs.

And if that still doesn't convince you, let's agree to disagree.


