Re: [PATCH] efi: Fix stack protector issues


From: Glenn Washburn
Subject: Re: [PATCH] efi: Fix stack protector issues
Date: Wed, 3 Jan 2024 12:18:39 -0600

On Wed, 3 Jan 2024 16:36:57 +0100
Ard Biesheuvel <ardb@kernel.org> wrote:

> On Mon, 1 Jan 2024 at 03:52, Glenn Washburn
> <development@efficientek.com> wrote:
> >
> > On Sun, 31 Dec 2023 11:56:18 +0100
> > Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > > Hi Glenn,
> > >
> > > On Thu, 28 Dec 2023 at 03:26, Glenn Washburn
> > > <development@efficientek.com> wrote:
> > > >
> > > > On Sat, 23 Dec 2023 12:45:35 +0100
> > > > Ard Biesheuvel via Grub-devel <grub-devel@gnu.org> wrote:
> > > >
> > > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > > >
> > > > > The 'ground truth' stack protector cookie value is kept in a global
> > > > > variable, and loaded in every function prologue and epilogue to store
> > > > > it into resp. compare it with the stack slot holding the cookie.
> > > > >
> > > > > If the comparison fails, the program aborts, and this might occur
> > > > > spuriously when the global variable changes values between the entry and
> > > > > exit of a function. This implies that assigning the global variable at
> > > > > boot must not involve any instrumented function calls.
> > > >
> > > > Not quite true, I had an alternative patch that searched the stack for
> > > > the old canary and replaced it with the new canary. This implementation
> > > > is better, though less general (which doesn't add any value).
> > > >
> > >
> > > I don't think exposing an API that circumvents the C runtime and
> > > 'fixes' a call stack by replacing invalid cookie values with valid
> > > ones belongs in production code. It is not trivial to implement
> > > generically, and has the potential to do more harm than good if an
> > > attacker manages to access it.
> >
> > Agreed, which is why I didn't submit it. As it doesn't seem that you
> > disagree with me, my point still stands.
> >
> 
> Not sure what point you are trying to make here. We both agree that
> doctoring a live call stack to get the stack cookies in sync is a bad
> idea. The upshot of that is that we must never enter any functions
> with the old stack cookie value and return from them with another,
> which implies that assigning the global variable at boot must not
> involve any instrumented function calls.

I was pointing out a hidden assumption in your commit message, which is
that we don't want to doctor live call stacks. Without that assumption
your use of the word "must" is technically incorrect (and I'm assuming
a common definition of must). Perhaps add a clause such as "Since
circumventing the C runtime to modify the canary on the callstack
directly is considered a bad idea, this implies ...".
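
To make the constraint concrete, here is a minimal sketch of the failure
mode under discussion. It is not GRUB code: it only models what a
compiler-inserted prologue and epilogue do, and every demo_* name is
hypothetical.

```c
#include <stdint.h>

/* Stand-in for the global __stack_chk_guard (hypothetical name). */
uintptr_t demo_guard = 0x1111;

/* Body that changes the global guard while a caller's frame is live. */
void demo_reseed (void) { demo_guard = 0x2222; }

/* Body that leaves the guard alone. */
void demo_noop (void) { }

/*
 * Model of an instrumented function: the "prologue" copies the guard
 * into a stack slot, the "epilogue" compares the slot with the guard.
 * Returns 1 if the check passes, 0 if a real build would abort.
 */
int
demo_instrumented_call (void (*body) (void))
{
  uintptr_t slot = demo_guard;  /* prologue: store the cookie */

  body ();

  return slot == demo_guard;    /* epilogue: compare the cookie */
}
```

Calling demo_instrumented_call (demo_reseed) reports a mismatch because the
guard changed between entry and exit. The two ways out are the ones named
above: rewrite the stale slot on the stack, or never return through a frame
that was entered with the old value.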

> > >
> > > > > Note that the use of __attribute__((optimize)) is described as
> > > > > unsuitable for production use in the GCC documentation, so let's drop
> > > > > this as well now that it is no longer needed.
> > > >
> > > > Good to know. I didn't particularly like that change either.
> > > >
> > > > The added benefit of this change is that it looks like it will make it
> > > > easier to add stack protection to other targets.
> > > >
> > >
> > > Indeed.
> > >
> > > > >
> > > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > > > ---
> > > > >  grub-core/kern/efi/init.c      | 27 ++++++++-------------------
> > > > >  grub-core/kern/main.c          |  5 +++++
> > > > >  include/grub/stack_protector.h | 13 +++++++++++++
> > > > >  3 files changed, 26 insertions(+), 19 deletions(-)
> > > > >
> > > > > diff --git a/grub-core/kern/efi/init.c b/grub-core/kern/efi/init.c
> > > > > index 6c54af6e7..1637077e1 100644
> > > > > --- a/grub-core/kern/efi/init.c
> > > > > +++ b/grub-core/kern/efi/init.c
> > > > > @@ -39,12 +39,6 @@ static grub_efi_char16_t stack_chk_fail_msg[] =
> > > > >
> > > > >  static grub_guid_t rng_protocol_guid = GRUB_EFI_RNG_PROTOCOL_GUID;
> > > > >
> > > > > -/*
> > > > > - * Don't put this on grub_efi_init()'s local stack to avoid it
> > > > > - * getting a stack check.
> > > > > - */
> > > > > -static grub_efi_uint8_t stack_chk_guard_buf[32];
> > > > > -
> > > > >  /* Initialize canary in case there is no RNG protocol. */
> > > > > >  grub_addr_t __stack_chk_guard = (grub_addr_t) GRUB_STACK_PROTECTOR_INIT;
> > > > >
> > > > > @@ -77,8 +71,8 @@ __stack_chk_fail (void)
> > > > >    while (1);
> > > > >  }
> > > > >
> > > > > -static void
> > > > > -stack_protector_init (void)
> > > > > +grub_addr_t
> > > > > +grub_stack_protector_init (void)
> > > > >  {
> > > > >    grub_efi_rng_protocol_t *rng;
> > > > >
> > > > > @@ -87,23 +81,20 @@ stack_protector_init (void)
> > > > >    if (rng != NULL)
> > > > >      {
> > > > >        grub_efi_status_t status;
> > > > > +      grub_addr_t guard = 0;
> > > > >
> > > > > -      status = rng->get_rng (rng, NULL, sizeof (stack_chk_guard_buf),
> > > > > -                          stack_chk_guard_buf);
> > > > > +      status = rng->get_rng (rng, NULL, sizeof (guard) - 1,
> > > > > +                          (grub_efi_uint8_t *) &guard);
> > > > >        if (status == GRUB_EFI_SUCCESS)
> > > > > -     grub_memcpy (&__stack_chk_guard, stack_chk_guard_buf, sizeof 
> > > > > (__stack_chk_guard));
> > > > > +     return guard;
> > > > >      }
> > > > > -}
> > > > > -#else
> > > > > -static void
> > > > > -stack_protector_init (void)
> > > > > -{
> > > > > +  return 0;
> > > > >  }
> > > > >  #endif
> > > > >
> > > > >  grub_addr_t grub_modbase;
> > > > >
> > > > > -__attribute__ ((__optimize__ ("-fno-stack-protector"))) void
> > > > > +void
> > > > >  grub_efi_init (void)
> > > > >  {
> > > > >    grub_modbase = grub_efi_section_addr ("mods");
> > > > > @@ -111,8 +102,6 @@ grub_efi_init (void)
> > > > >       messages.  */
> > > > >    grub_console_init ();
> > > > >
> > > > > -  stack_protector_init ();
> > > > > -
> > > > >    /* Initialize the memory management system.  */
> > > > >    grub_efi_mm_init ();
> > > > >
> > > > > diff --git a/grub-core/kern/main.c b/grub-core/kern/main.c
> > > > > index 731c07c29..1244fa84f 100644
> > > > > --- a/grub-core/kern/main.c
> > > > > +++ b/grub-core/kern/main.c
> > > > > @@ -270,6 +270,11 @@ grub_main (void)
> > > > >
> > > > >    grub_boot_time ("After machine init.");
> > > > >
> > > > > +#ifdef GRUB_STACK_PROTECTOR
> > > > > > +  /* This call can only be made from a function that does not return. */
> > > > > +  grub_update_stack_guard ();
> > > > > +#endif
> > > > > +
> > > >
> > > > Why don't we do this before grub_machine_init () so that
> > > > grub_efi_init() is covered?
> > >
> > > Because we use locate_protocol(), which I don't think we should be
> > > using before grub_efi_init(), even if we could theoretically (but I
> > > haven't checked whether this is the case)
> >
> > What might the problem be?
> >
> 
> There is no problem, in fact. (now that I checked). But using EFI
> calls before doing any kind of init seems fragile to me.

You probably have a better intuition than me on this, but it would seem
to me that as long as one adheres to the mandates of EFI any EFI calls
should be fine because EFI does its own init that does not rely on the
EFI application.

> > > The important thing is to initialize this before accepting any user
> > > input, so I don't think grub_machine_init() is something that really
> > > needs the stack protector to begin with.
> >
> > I don't quite agree. There are various potential attack vectors, not
> > just user input. The threat model here is that the attacker has physical
> > access. So what if a malicious disk is added or existing disk has
> > firmware exploited such that it can send replies that overflow a buffer?
> > I also have a patch in the pipeline that will add in grub_efi_init()
> > functionality to populate the GRUB environment from a UEFI variable.
> > Other stuff may be added later. It seems prudent to me to have stack
> > guard initialized before any kind of input from the system.
> >
> 
> I agree with that in principle, but I think that 'machine init' needs
> a better definition then.

Perhaps you mean "name" instead of "definition". I'm unaware of
anywhere that "machine init" is defined. I think it's reasonable to have
this outside of grub_machine_init() as a special case, even if a
reasonable definition of machine init might include it. A comment could
be added describing why this is the case.

> > > >  grub_update_stack_guard() and
> > > > __stack_chk_fail() do not appear to rely on anything done in
> > > > grub_efi_init().
> > > >
> > >
> > > IMO after grub_machine_init is early enough, and this makes it more
> > > likely that the stack protector support can be ported to other
> > > platforms too.
> >
> > I also have this suspicion. However, I think it makes sense to verify
> > if it can work for EFI to have stack guard setup before
> > grub_efi_init(), and do that if so. If, down the line, a target has a
> > problem with this, we can either move it to where it is now or even
> > better do ifdefery to have it later just for those targets. Does it
> > make sense to make some targets less secure because the implementation
> > doesn't work on all targets? (I think this very feature answers that in
> > the negative)
> >
> 
> 'less secure against an unspecified threat' is not a great
> justification for adding intricate init code.

What makes the current patch less intricate? Also, I specified a couple
of potential attack vectors, so I'd say 'unspecified' is unwarranted. If
you mean there is not a known existing exploit, then that reasoning
can be used against a lot of security related code. One hopes that code
to improve security helps to prevent classes of attacks, not just
specified ones. My suggestion helps to close potential gaps in that
protection.

> But the real problem here is that GRUB is required to interface with
> the wider system to obtain randomness, in this case from the
> EFI_RNG_PROTOCOL but there could be other sources on other platforms.
> 
> So perhaps we should introduce a separate 'machine' hook for the sole
> purpose of obtaining randomness, which is defined as being callable
> before machine_init(), and required not to initialize any part of the
> platform that is not needed for this purpose.
>
> But for the purposes of this patch, I think we can just go with your
> suggestion, and move the stack protector init before
> grub_machine_init(), given that only EFI platforms implement support
> for it, and improve upon that once the need arises.

This all seems reasonable to me and we can revisit this when the next
target/platform is implementing stack guard.
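
For reference when that happens, a rough sketch of what a randomness-only
hook could look like. This is an illustration, not a proposed API: the
demo_* names, the hook type, and the default guard value are all
hypothetical. It only mirrors the behaviour of the patch, where a return
value of 0 means "no randomness available" and leaves the build-time
default in place.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t demo_addr_t;

/* Hypothetical per-platform hook: returns 0 when no RNG is available. */
typedef demo_addr_t (*demo_random_hook_t) (void);

/* Placeholder default; not the value GRUB actually uses. */
demo_addr_t demo_stack_guard = 0x5a5a5a5a;

/* Example hooks: a platform without an RNG and one with a fixed "RNG". */
demo_addr_t demo_hook_none (void) { return 0; }
demo_addr_t demo_hook_fixed (void) { return 0x1234; }

/*
 * Replace the guard only if the hook produced a non-zero value.
 * Returns the guard in effect afterwards.
 * Assumption: in an instrumented build this logic would have to be
 * inlined into a caller that never returns, as in the patch.
 */
demo_addr_t
demo_update_guard (demo_random_hook_t hook)
{
  demo_addr_t guard = hook != NULL ? hook () : 0;

  if (guard != 0)
    demo_stack_guard = guard;

  return demo_stack_guard;
}
```

A platform with no randomness source would pass NULL or a hook returning 0
and keep the default canary, which matches what the current patch does when
the RNG protocol is missing.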

> 
> 
> > >
> > > > > >    /* This breaks flicker-free boot on EFI systems, so disable it there. */
> > > > >  #ifndef GRUB_MACHINE_EFI
> > > > >    /* Hello.  */
> > > > > > diff --git a/include/grub/stack_protector.h b/include/grub/stack_protector.h
> > > > > index c88dc00b5..9212bb4a6 100644
> > > > > --- a/include/grub/stack_protector.h
> > > > > +++ b/include/grub/stack_protector.h
> > > > > @@ -25,6 +25,19 @@
> > > > >  #ifdef GRUB_STACK_PROTECTOR
> > > > >  extern grub_addr_t EXPORT_VAR (__stack_chk_guard);
> > > > > >  extern void __attribute__ ((noreturn)) EXPORT_FUNC (__stack_chk_fail) (void);
> > > > > +
> > > > > +grub_addr_t
> > > > > +grub_stack_protector_init (void);
> > > > > +
> > > > > +static inline __attribute__((__always_inline__))
> > > > > +void grub_update_stack_guard (void)
> > > > > +{
> > > > > +  grub_addr_t guard;
> > > > > +
> > > > > +  guard = grub_stack_protector_init ();
> > > > > +  if (guard)
> > > > > +     __stack_chk_guard = guard;
> > > > > +}
> > > > >  #endif
> > > > >
> > > > >  #endif /* GRUB_STACK_PROTECTOR_H */


