bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] user_intr: a lock to protect main_intr_queue


From: Samuel Thibault
Subject: Re: [PATCH] user_intr: a lock to protect main_intr_queue
Date: Wed, 5 Aug 2020 11:47:27 +0200
User-agent: NeoMutt/20170609 (1.8.3)

Junling Ma, le mar. 04 août 2020 17:59:57 -0700, a ecrit:
> > On Aug 4, 2020, at 5:30 PM, Samuel Thibault <samuel.thibault@gnu.org> wrote:
> > 
> > Junling Ma, le mar. 04 août 2020 17:08:03 -0700, a ecrit:
> >>>> static user_intr_t *
> >>>> search_intr (struct irqdev *dev, ipc_port_t dst_port)
> >>>> {
> >>>>  user_intr_t *e;
> >>>> -  queue_iterate (dev->intr_queue, e, user_intr_t *, chain)
> >>>> +  simple_lock(&dev->lock);
> >>>> +  queue_iterate (&dev->intr_queue, e, user_intr_t *, chain)
> >>>>    {
> >>>>      if (e->dst_port == dst_port)
> >>>> -        return e;
> >>>> +        {
> >>>> +          simple_unlock(&dev->lock);
> >>>> +          return e;
> >>>> +        }
> >>>>    }
> >>>> +  simple_unlock(&dev->lock);
> >>>>  return NULL;
> >>>> }
> >>> 
> >>> I believe you want to make the *caller* lock, here. Otherwise another
> >>> CPU could be removing the entry you have just found. That's actually the
> >>> point of splhigh/splx in the callers.
> >> 
> >> I think splhigh/splx disable interrupts, but the interrupt handler will 
> >> not touch the queue any more, only the intr_thread does. Wouldn’t a simple 
> >> lock be sufficient?
> > 
> > You not only have the interrupt handler, but also other processors doing
> > stuff.
> > 
> > Note: simple_lock is optimized to void in uni-processor builds. simple
> > locks are there only to handle multi-processor cases. They just cannot
> > protect at all against an interrupt, that's what splhigh is for.
> 
> Please bear with me as I am slow here. The interrupt handler does not touch 
> the queue. It works on the user_intr_t pointer. So we use simple_lock to 
> protect again other threads, but we do not need splhigh/splx to protect us 
> against the interrupt handler.

There's a misunderstanding. When I wrote "That's actually the point
of splhigh/splx in the callers." I meant that it is the point of
splhigh/splx being in the caller, and not in the callee. I am not
saying splhigh/splx should necessarily be kept. I am saying that
simple_lock/unlock should be in the caller, not the callee, for the same
reason why splhigh/splx was in the caller and not in the callee.

> >>>> @@ -64,7 +77,7 @@ irq_acknowledge (ipc_port_t receive_port)
> >>>>  if (irqtab.irqdev_ack)
> >>>>    (*(irqtab.irqdev_ack)) (&irqtab, e->id);
> >>>> 
> >>>> -  __enable_irq (irqtab.irq[e->id]);
> >>>> +  PROTECT(irqtab.lock, __enable_irq (irqtab.irq[e->id]));
> >>> 
> >>> I don't think we need a lock here. Yes, we need to protect
> >>> __enable_irq's stuff, but I don't think it's device/intr.c's duty to do
> >>> this. __enable_irq actually already has some of it, we'd just need to
> >>> add a simple lock in addition (and possibly introduce a simple+splhigh
> >>> helper to do this)
> >> 
> >> You are right. Actually, we probably do not need to lock at all, These ack 
> >> calls are serialized at the msg server, so they cannot mess each other. 
> >> The intr_thread does not enable irq unless the port is dead, in which case 
> >> there would be no ack calls.
> > 
> > I don't think we want to base safety over the precise behavior of other
> > code, but only over simple conventions that are properly documented.
> 
> We can protect __enable_irq and __disable_irq with a lock. But that is 
> tricky, because __disable_irq is called in the interrupt handler. Another 
> thread holding a lock while interrupt happens causes a deadlock.

That is why you cannot use only a simple lock, but also splhigh/splx.
Please read the LDD https://lwn.net/Kernel/LDD3/ and more precisely its
chapter on concurrency https://static.lwn.net/images/pdf/LDD3/ch05.pdf
and the spin_lock_irq helper that does both the equivalent of
simple_lock *and* splhigh.

> >>>>    if ((!e->dst_port || e->dst_port->ip_references == 1) && e->n_unacked)
> >>>> @@ -231,6 +244,7 @@ intr_thread (void)
> >>>>        s = splhigh ();
> >>>>      }
> >>>>  }
> >>>> +      simple_unlock(&irqtab.lock);
> >>> 
> >>> We probably want to unlock earlier than this, before we call
> >>> deliver_intr. Otherwise we keep the lock busy while doing the delivery.
> >>> Even worse with the kfree call. Again, basically follow the existing
> >>> splhigh/splx, it already provides the uniprocessor concerns which
> >>> already have the problem of interrupts happening concurrently.
> >> 
> >> Yes we only need to protect the dead port deletion part to prevent device 
> >> registratio from appending to the queue while we delete here. The delivery 
> >> loop should be safe without locking.
> > 
> > No, it's not completely safe, you need to protect against other
> > processors doing other stuff.
> 
> What other stuff though?

That can be anything. Registering a new irq, killing a process, etc. If
you expect only intr_thread to be removing entries from the queue, then
add it as a fat comment on the queue declaration, and then yes, you can
start assuming it, but not before.

Just one thing: your patch removes the deletion of dead port within the
while (irqtab.tot_num_intr) loop. But then you have a horrible scenario:
you get an interrupt for the first and the second entry of the
queue, you start delivering the first, and in the meanwhile the user
process for the second entry dies. You'll be stuck within the while
(irqtab.tot_num_intr) loop.

I'm beginning to think that we should perhaps just drop that while
(irqtab.tot_num_intr) optimization, and just iterate over the whole
queue each time, using clear_wait() each time something is done, to make
sure we don't miss a wake.

Samuel



reply via email to

[Prev in Thread] Current Thread [Next in Thread]