qemu-riscv



From: Anup Patel
Subject: Re: Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
Date: Wed, 19 Jul 2023 11:09:16 +0530

On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > >
> > > > > > > OpenSBI v1.3
> > > > > > >    ____                    _____ ____ _____
> > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > >         | |
> > > > > > >         |_|
> > > > > > >
> > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > >
> > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > >
> > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > time.
> > > > > > >
> > > > >
> > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > DT passed to OpenSBI 1.3.
> > > > >
> > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > the E-core disabled.
> > > > >
> > > > > > I had discovered this issue in a totally different context after the
> > > > > > OpenSBI 1.3 release happened. This issue is already fixed in the latest
> > > > > > OpenSBI by the following commit c6a35733b74aeff612398f274ed19a74f81d1f37
> > > > > > ("lib: utils: Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > >
> > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > obviously not.
> > > >
> > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > >
> > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > with.
> > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > with QEMU, as I am sure you already know :)
> > > >
> > > > > At this point, you can either:
> > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > >
> > > I forgot to reply to this point, wondering what should be done with
> > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > of whether I can go and build a fixed version of OpenSBI.
> > >
> > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > user using the latest kernel (> v6.4)
> > may hit those random linear map related issues (in hibernation or EFI
> > booting path).
> >
> > There are three possible scenarios:
> >
> > 1. Upgrade to OpenSBI v1.3: Any user of the microchip-icicle-kit machine
> > or the sifive fu540 machine may hit this issue if the device tree has
> > the disabled hart (e core).
> > 2. No upgrade, i.e. stay on OpenSBI v1.2: Any user using hibernation or
> > UEFI may have issues [1]
> > 3. Include a non-release version of OpenSBI in QEMU with the fix as an
> > exception.
> >
> > #3 probably deviates from policy and sets a bad precedent, so I am not
> > advocating for it ;)
> > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > -bios argument instead of the stock one.
> > I could be wrong but my guess is the number of users facing #2 would
> > be higher than #1.
>
> Thanks for that info Atish!
>
> We are stuck in a bad situation.
>
> The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> do you think you could do that?

OpenSBI has a major number and a minor number in its version but it does
not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
using sbi_get_impl_version().
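
To make the point about sbi_get_impl_version() concrete, here is a minimal
sketch in C. It assumes the implementation version is packed with the major
number shifted left by 16 bits and the minor number in the low 16 bits; that
layout and all helper names below are assumptions for illustration, not code
taken from OpenSBI or the kernel. With only these two fields there is no
place to carry a ".1", so a v1.3.1 build would report the same value as v1.3.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout (illustrative): major above bit 16, minor in the low 16 bits. */
#define IMPL_VERSION_MAJOR_SHIFT 16
#define IMPL_VERSION_MINOR_MASK  0xffffUL

/* Hypothetical helper: pack major/minor the way the firmware is assumed to. */
static unsigned long make_impl_version(unsigned long major, unsigned long minor)
{
    assert(minor <= IMPL_VERSION_MINOR_MASK);
    return (major << IMPL_VERSION_MAJOR_SHIFT) | minor;
}

/* Hypothetical helpers: what supervisor software could recover from the value. */
static unsigned long impl_version_major(unsigned long v)
{
    return v >> IMPL_VERSION_MAJOR_SHIFT;
}

static unsigned long impl_version_minor(unsigned long v)
{
    return v & IMPL_VERSION_MINOR_MASK;
}

/* Print the only two fields that can be decoded; there is no patch field. */
static void print_impl_version(unsigned long v)
{
    printf("OpenSBI v%lu.%lu\n", impl_version_major(v), impl_version_minor(v));
}
```

Calling print_impl_version(make_impl_version(1, 3)) would print the same
major.minor pair whether the firmware was built from v1.3 or from a v1.3.1
bug-fix release, which is why the latter should be treated as fixes on top
of the former.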

There are only three commits between the ACLINT fix and OpenSBI v1.3,
so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
only four commits on top of OpenSBI v1.3.

Does this sound okay?

>
> Otherwise I think we should stick with OpenSBI 1.3. Considering that
> it fixes UEFI boot issues for the virt board (which would be the most
> used) it seems like the best call to make. People using the other boards
> are unfortunately stuck building their own OpenSBI release.
>
> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?
>
> Alistair
>
> >
> > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > >
> > > > Will OpenSBI disable it? If not, I think option 2) needs to be to remove
> > > > the DT node. I'll just use tip-of-tree myself & up to the
> > >
> > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > maintainers what they want to do on the QEMU side of things".
> > >
> > > Thanks,
> > > Conor.
> >
> >
> >
> > --
> > Regards,
> > Atish
> >

Regards,
Anup


