qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/3] chardev: report blocked write to chardev backend


From: Daniel P . Berrangé
Subject: Re: [PATCH 2/3] chardev: report blocked write to chardev backend
Date: Wed, 22 Nov 2023 10:42:45 +0000
User-agent: Mutt/2.2.10 (2023-03-25)

On Wed, Nov 22, 2023 at 11:38:28AM +0100, Thomas Huth wrote:
> On 21/11/2023 12.47, Marc-André Lureau wrote:
> > Hi
> > 
> > On Tue, Nov 21, 2023 at 1:45 PM Thomas Huth <thuth@redhat.com> wrote:
> > > 
> > > On 21/11/2023 10.39, Marc-André Lureau wrote:
> > > > Hi
> > > > 
> > > > On Mon, Nov 20, 2023 at 5:36 PM Nicholas Piggin <npiggin@gmail.com> 
> > > > wrote:
> > > > > 
> > > > > On Mon Nov 20, 2023 at 10:06 PM AEST, Marc-André Lureau wrote:
> > > > > > Hi
> > > > > > 
> > > > > > On Thu, Nov 16, 2023 at 3:54 PM Nicholas Piggin <npiggin@gmail.com> 
> > > > > > wrote:
> > > > > > > 
> > > > > > > If a chardev socket is not read, it will eventually fill and QEMU
> > > > > > > can block attempting to write to it. A difficult bug in avocado
> > > > > > > tests where the console socket was not being read from caused this
> > > > > > > hang.
> > > > > > > 
> > > > > > > warn if a chardev write is blocked for 100ms.
> > > > > > > 
> > > > > > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > > > > > > ---
> > > > > > > This is not necessary for the fix but it does trigger in the
> > > > > > > failing avocado test without the previous patch applied. Maybe
> > > > > > > it would be helpful?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Nick
> > > > > > > 
> > > > > > >    chardev/char.c | 6 ++++++
> > > > > > >    1 file changed, 6 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/chardev/char.c b/chardev/char.c
> > > > > > > index 996a024c7a..7c375e3cc4 100644
> > > > > > > --- a/chardev/char.c
> > > > > > > +++ b/chardev/char.c
> > > > > > > @@ -114,6 +114,8 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > > > > >    {
> > > > > > >        ChardevClass *cc = CHARDEV_GET_CLASS(s);
> > > > > > >        int res = 0;
> > > > > > > +    int nr_retries = 0;
> > > > > > > +
> > > > > > >        *offset = 0;
> > > > > > > 
> > > > > > >        qemu_mutex_lock(&s->chr_write_lock);
> > > > > > > @@ -126,6 +128,10 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > > > > >                } else {
> > > > > > >                    g_usleep(100);
> > > > > > >                }
> > > > > > > +            if (++nr_retries == 1000) { /* 100ms */
> > > > > > > +                warn_report("Chardev '%s' write blocked for > 
> > > > > > > 100ms, "
> > > > > > > +                            "socket buffer full?", s->label);
> > > > > > > +            }
> > > > > > 
> > > > > > That shouldn't happen, the frontend should poll and only write when 
> > > > > > it
> > > > > > can. What is the qemu command being used here?
> > > > > 
> > > > > You can follow it through the thread here
> > > > > 
> > > > > ZVT-bY9YOr69QTPX@redhat.com/">https://lore.kernel.org/qemu-devel/ZVT-bY9YOr69QTPX@redhat.com/
> > > > > 
> > > > > In short, a console device is attached to a socket pair and nothing
> > > > > ever reads from it. It eventually fills, and writing to it fails
> > > > > indefinitely here.
> > > > > 
> > > > > It can be reproduced with:
> > > > > 
> > > > > make check-avocado
> > > > > AVOCADO_TESTS=tests/avocado/reverse_debugging.py:test_ppc64_pseries
> > > > > 
> > > > > 
> > > > 
> > > > How reliably? I tried 10/10.
> > > 
> > > It used to fail for me every time I tried - but the fix has already been
> > > merged yesterday (commit cd43f00524070c026), so if you updated today, 
> > > you'll
> > > see the test passing again.
> > 
> > Ok so the "frontend" is spapr-vty and there:
> > 
> > void vty_putchars(SpaprVioDevice *sdev, uint8_t *buf, int len)
> > {
> >      SpaprVioVty *dev = VIO_SPAPR_VTY_DEVICE(sdev);
> > 
> >      /* XXX this blocks entire thread. Rewrite to use
> >       * qemu_chr_fe_write and background I/O callbacks */
> >      qemu_chr_fe_write_all(&dev->chardev, buf, len);
> > }
> > 
> > (grep "XXX this blocks", we have a lot...)
> > 
> > Can H_PUT_TERM_CHAR return the number of bytes written?
> 
> You can find the definition of the hypercall in the LoPAPR spec:
> 
>  https://elinux.org/images/a/a4/LoPAPR_DRAFT_v11_24March2016.pdf
> 
> ... and if I get it right, it does not have a way to tell the guest the
> amount of accepted characters. But it could return H_BUSY if it is not able
> to enqueue all characters at once. As far as I can see, this will make the
> guest spin until it can finally send out the characters... not sure whether
> that's so much better...?

If the rest of the guest can get on with useful work that's better. If we
block in QEMU, the entire guest hardware emulation is blocked so nothing
can make progress if it exits from vCPU run context.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




reply via email to

[Prev in Thread] Current Thread [Next in Thread]