From: Alistair Francis
Subject: Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
Date: Wed, 19 Jul 2023 19:53:17 +1000

On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > >
> > > > > > > > OpenSBI v1.3
> > > > > > > >    ____                    _____ ____ _____
> > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > >         | |
> > > > > > > >         |_|
> > > > > > > >
> > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > >
> > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > >
> > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > time.
> > > > > > > >
> > > > > >
> > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > DT passed to OpenSBI 1.3.
> > > > > >
> > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > the E-core disabled.
> > > > > > >
> > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > >
> > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > obviously not.
> > > > >
> > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > >
> > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > with.
> > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > with QEMU, as I am sure you already know :)
> > > > >
> > > > > > At this point, you can either:
> > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > >
> > > > I forgot to reply to this point, wondering what should be done with
> > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > of whether I can go and build a fixed version of OpenSBI.
> > > >
> > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > user using the latest kernel (> v6.4)
> > > may hit those random linear map related issues (in hibernation or EFI
> > > booting path).
> > >
> > > There are three possible scenarios:
> > >
> > > > 1. Upgrade to OpenSBI v1.3: Any user of the microchip-icicle-kit or
> > > > sifive fu540 machines
> > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > 2. No upgrade (stay on OpenSBI v1.2): Any user using hibernation or UEFI may
> > > > have issues [1]
> > > > 3. Include a non-release version of OpenSBI in QEMU with the fix as an exception.
> > >
> > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > advocating for it though ;)
> > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > -bios argument instead of the stock one.
> > > I could be wrong but my guess is the number of users facing #2 would
> > > be higher than #1.
> >
> > Thanks for that info Atish!
> >
> > We are stuck in a bad situation.
> >
> > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > do you think you could do that?
>
> OpenSBI has a major number and minor number in the version but it does
> not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> using sbi_get_impl_version().
>
> There are only three commits between the ACLINT fix and OpenSBI v1.3
> so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing only
> four commits on-top of OpenSBI v1.3
>
> Does this sound okay ?

That sounds fine to me. It fixes the issue for the Microchip board and
it's a very small change between 1.3 and 1.3.1.
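
For reference, a minimal sketch of why supervisor software would see the
same value for v1.3 and v1.3.1. It assumes the value returned by
sbi_get_impl_version() packs the major number into the upper 16 bits and
the minor number into the lower 16 bits, which is my reading of OpenSBI's
sbi_version.h. The helper names below are hypothetical and only for
illustration.

```python
# Assumed layout of the OpenSBI implementation version:
# bits 31..16 = major, bits 15..0 = minor. There is no patch field.
MAJOR_SHIFT = 16
MINOR_MASK = 0xFFFF


def encode_impl_version(major, minor):
    """Pack major/minor the way the assumed layout describes (hypothetical helper)."""
    return (major << MAJOR_SHIFT) | (minor & MINOR_MASK)


def decode_impl_version(value):
    """Split a packed version back into (major, minor) (hypothetical helper)."""
    return value >> MAJOR_SHIFT, value & MINOR_MASK


# v1.3 and v1.3.1 both encode as major=1, minor=3 because the patch level
# has nowhere to go in this layout.
v1_3 = encode_impl_version(1, 3)
v1_3_1 = encode_impl_version(1, 3)

print(hex(v1_3), decode_impl_version(v1_3))  # 0x10003 (1, 3)
print(v1_3 == v1_3_1)                        # True
```

So a guest that needs to know whether the ACLINT fix is present cannot
rely on the version number alone under this assumption.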

Alistair

>
> >
> > Otherwise I think we should stick with OpenSBI 1.3. Considering that
> > it fixes UEFI boot issues for the virt board (which would be the most
> > > used) it seems like the best call to make. People using the other boards
> > are unfortunately stuck building their own OpenSBI release.
> >
> > If there is no OpenSBI 1.3.1 release we should add something to the
> > release notes. @Conor Dooley are you able to give a clear sentence on
> > how the boot fails?
> >
> > Alistair
> >
> > >
> > > [1] 
> > > https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > > >
> > > > > > Will OpenSBI disable it? If not, I think option 2) needs to be to remove
> > > > > the DT node. I'll just use tip-of-tree myself & up to the
> > > >
> > > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > > maintainers what they want to do on the QEMU side of things".
> > > >
> > > > Thanks,
> > > > Conor.
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Atish
> > >
>
> Regards,
> Anup


