bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#39413: 26.2; Emacs gets hung


From: ISHIKAWA,chiaki
Subject: bug#39413: 26.2; Emacs gets hung
Date: Mon, 16 Aug 2021 09:20:40 +0900
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 2021/08/16 8:27, ISHIKAWA,chiaki wrote:
On 2021/08/12 16:10, Eli Zaretskii wrote:
Cc: larsi@gnus.org, npostavs@gmail.com, 39413@debbugs.gnu.org,
  chiaki.ishikawa@ubin.jp
From: "ISHIKAWA,chiaki" <ishikawa@yk.rim.or.jp>
Date: Wed, 11 Aug 2021 22:41:56 +0900

What about the dump below, which was in the *Messages* buffer following
dump I sent.
I am afraid I picked up the wrong dump in my previous e-mail.

Now, is "gc-elapsed" the elapsed time spent in the GC routine?
If so, the two samples at the end of the dump here certainly look
suspicious.
Yes, they show a couple of GCs that took about 150 to 200 sec. That
is indeed excessive, but gc-elapsed reports the wall-clock time, not
the CPU time, so if Emacs wasn't getting processing resources to run,
which could happen on a stressed system, that could explain the long
GC times.  Note that other GC cycles, both before and after the 2
problematic ones, took much less time.  And the statistics returned by
memory-use-counts doesn't show any excessive number of Lisp objects;
in fact, my current Emacs session, which runs for 9.5 days, has an
order of magnitude more objects than what you show, and yet GC cycles
here are barely perceptible.

Hi,

Thank you for your reply.

So gc-elapsed is the elapsed wall-clock time.
Is there a function to obtain CPU time in Emacs Lisp?

On BSD, many years ago, I wrote a short routine that uses
resource-related system call. I am using Emacs under linux now, and I assume similar calls exist under linux. I checked the manual. I will use getrusage(2), which seems to be BSD-compatible, to obtain the resource usage statistics and include the numbers in my post-gc-hook result.

OTOH, I have always assumed that Emacs DID eats up CPU while longish GC. You may recall the graphics that showed the usage of a CPU core going all the way to 1000 in my previous e-mail (xosview). I am attaching it again. I used the graphics to show the memory usage (hand-drawn red circle) in my previous e-mai.

But if you look slightly above, you will notice CPU core #4 is used 100% (!). That is emacs process. No other process is running that earnestly at that moment. Thanks to many cores (actually I allocate 7 cores), the linux desktop was responding swiftly and xosview was recording system resource usage as the graphics shows. Only Emacs process could not respond to window repaint request from X when I re-arranged the windows layout, and that is why the Emacs window is garbled (hand-drawn blue circle was placed at the upper-left of the un-repainted area that remained black).

I believe Emacs indeed was eating up the CPU time in a busy loop or deeply nested calls. (maybe a loop that could be handled in a for loop with a relatively shallow stack was handled by a recursive call that can be nested as deep as the list length?)

Numbers from getrusage() will give us additional insight on this matter hopefully.

Thank you again for all your tips.


Chiaki


PS: I found profiler-cpu-* functions, but I don't think it is wise to run them during GC since they seem to allocate vector tables. However, taking a snapshot of strack trace every now and then during GC seems attractive for my investigation to figure out WHERE in GC, the excessive time is spent.

Aha, I realized that Emacs configure has a snipet to check for the presence of getrusage system call. Now I know there is a function in src/sysdep.c called get-internal-run-time which would be useful for my purpose except it calls make_lisp_time(make_timespec(secs, usecs * 1000)) at the end.
I want to obtain precise time spent by GC by doing the following.
   geturasge(RUSAGE_SELF, &usage_before_GC);

   GC

   geturasge(RUSAGE_SELF, &usage_after_GC);

obtain the CPU time usage difference between two rusage structure
to know the CPU time spent by GC.

Well, get-interla-run-time is useful for obtain the AFTER-GC time, but I am not sure if we can call get-internal-run-time within the garbage-collect, at the very first line of its call due to the use of make_lisp_time.

Well, I guess I have to create a function to record the CPU usage at the start of GC and then before exiting GC, calculate the difference and put it into a variable (sys time and usr time added for simplicity).

I think I will basically follow how Vgc_elapsed is updated at the latter part of garbage_collection().
I think I will return a vector of numbers from getrusage().
If you have any suggestions about how the return value/variable should look like, please let me know.

If I can find a reasonably portable way to obtain cache miss and cache hit numbers of a single core process like Emacs, I may want to add it just in case.

TIA.

Chiaki








reply via email to

[Prev in Thread] Current Thread [Next in Thread]