grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] ieee1275: Avoiding many unnecessary open/close


From: Eric Snowberg
Subject: Re: [RFC] ieee1275: Avoiding many unnecessary open/close
Date: Fri, 20 Nov 2020 15:52:47 -0700

> On Nov 16, 2020, at 10:29 AM, diegodo <diegodo@linux.vnet.ibm.com> wrote:
> 
> On 2020-09-24 20:58, Eric Snowberg wrote:
>>> On Sep 24, 2020, at 11:11 AM, diegodo <diegodo@linux.vnet.ibm.com> wrote:
>>> Hi,
>>> we are facing some performance issues with ieee1275 platform (ppc64le 
>>> architecture to be more specific) and I would like to suggest a couple of 
>>> changes to try to avoid them.
>>> Sometimes the system can take more than 10 minutes to boot (in fact, it can 
>>> takes more than 30 and 40 minutes in some cases). From my investigations, I 
>>> noticied the following points:
>>> 1. Some types of storage devices (NVMe, Fibre Channel, ...) can have a long 
>>> time for shutdown notification. Our investigation here showed us this 
>>> shutdown increase with the size of the storage. For example, 6.4TB NVMe can 
>>> takes something like 7 seconds just to issue a shutdown notification.
>>> 2. The grub calls a lot for OPEN/READ/CLOSE for each device. I ran some 
>>> tests with just one disk (NVMe 6.4TB) and found that, even with one disk, 
>>> grub call the OPEN/CLOSE functions more than 60 times. Considering the 
>>> opening time of the device something like 1.5 seconds, we have almost 9 
>>> minutes just opening and closing the disk.
>>> I did some changes in the code and the time during the boot dropped 
>>> drastically - from 492s to only 15s.
>>> The idea is to handle the _actual close_ while _opening_ the disk. So, 
>>> during the _grub_ofdisk_open_ function, we check if the requested disk is 
>>> the same as the already opened and close the previous one if we are dealing 
>>> with a different disk. Yet, I removed the OPEN/CLOSE calls inside the 
>>> _grub_ofdisk_get_block_size_ function and let it just checking for the 
>>> block size.
>>> What are your suggestions? Is there a better way to handle this? I really 
>>> appreciate your help here.
>> I ran into the same issue with the ieee1275 platform on SPARC when using
>> ofdisk. It would take 20+ minutes to get to the GRUB menu.  I found the same
>> problem, open and close can take a long time and upper level grub code would
>> issue hundreds of them. I ended up writing obdisk.c. Within it, each disk is
>> only opened once. I tried to make it as generic as possible. You might 
>> consider
>> moving from ofdisk to obdisk.  If you did, ppc64le device specific code would
>> need to be included in iterate_devtree.
>> It also includes new features not found in ofdisk, such as hot-plug support 
>> and
>> the ability to find disks that don’t have a device alias.
>>> Thanks
>>> ---
>>> grub-core/disk/ieee1275/ofdisk.c | 63 +++++++++++++++++---------------
>>> 1 file changed, 34 insertions(+), 29 deletions(-)
>>> diff --git a/grub-core/disk/ieee1275/ofdisk.c 
>>> b/grub-core/disk/ieee1275/ofdisk.c
>>> index d887d4b..d0ee4ae 100644
>>> --- a/grub-core/disk/ieee1275/ofdisk.c
>>> +++ b/grub-core/disk/ieee1275/ofdisk.c
>>> @@ -44,7 +44,7 @@ struct ofdisk_hash_ent
>>> };
>>> static grub_err_t
>>> -grub_ofdisk_get_block_size (const char *device, grub_uint32_t *block_size,
>>> +grub_ofdisk_get_block_size (grub_uint32_t *block_size,
>>>                         struct ofdisk_hash_ent *op);
>>> #define OFDISK_HASH_SZ      8
>>> @@ -461,6 +461,7 @@ grub_ofdisk_open (const char *name, grub_disk_t disk)
>>>  grub_ssize_t actual;
>>>  grub_uint32_t block_size = 0;
>>>  grub_err_t err;
>>> +  struct ofdisk_hash_ent *op;
>>>  if (grub_strncmp (name, "ieee1275/", sizeof ("ieee1275/") - 1) != 0)
>>>      return grub_error (GRUB_ERR_UNKNOWN_DEVICE,
>>> @@ -471,6 +472,34 @@ grub_ofdisk_open (const char *name, grub_disk_t disk)
>>>  grub_dprintf ("disk", "Opening `%s'.\n", devpath);
>>> +  op = ofdisk_hash_find (devpath);
>>> +  if (!op)
>>> +    op = ofdisk_hash_add (devpath, NULL);
>>> +  if (!op)
>>> +    {
>>> +      grub_free (devpath);
>>> +      return grub_errno;
>>> +    }
>>> +
>>> +  /* Check if the call to open is the same to the last disk already opened 
>>> */
>>> +  if (last_devpath && !grub_strcmp(op->open_path,last_devpath))
>>> +  {
>>> +      goto finish;
>>> +  }
>>> +
>>> + /* If not, we need to close the previous disk and open the new one */
>>> +  else {
>>> +    if (last_ihandle){
>>> +        grub_ieee1275_close (last_ihandle);
>>> +    }
>>> +    last_ihandle = 0;
>>> +    last_devpath = NULL;
>>> +
>>> +    grub_ieee1275_open (op->open_path, &last_ihandle);
>>> +    if (! last_ihandle)
>>> +      return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "can't open device");
>>> +  }
>>> +
>>>  if (grub_ieee1275_finddevice (devpath, &dev))
>>>    {
>>>      grub_free (devpath);
>>> @@ -491,25 +520,18 @@ grub_ofdisk_open (const char *name, grub_disk_t disk)
>>>      return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "not a block device");
>>>    }
>>> +
>>> +  finish:
>>>  /* XXX: There is no property to read the number of blocks.  There
>>>     should be a property `#blocks', but it is not there.  Perhaps it
>>>     is possible to use seek for this.  */
>>>  disk->total_sectors = GRUB_DISK_SIZE_UNKNOWN;
>>>  {
>>> -    struct ofdisk_hash_ent *op;
>>> -    op = ofdisk_hash_find (devpath);
>>> -    if (!op)
>>> -      op = ofdisk_hash_add (devpath, NULL);
>>> -    if (!op)
>>> -      {
>>> -        grub_free (devpath);
>>> -        return grub_errno;
>>> -      }
>>>    disk->id = (unsigned long) op;
>>>    disk->data = op->open_path;
>>> -    err = grub_ofdisk_get_block_size (devpath, &block_size, op);
>>> +    err = grub_ofdisk_get_block_size (&block_size, op);
>>>    if (err)
>>>      {
>>>        grub_free (devpath);
>>> @@ -532,13 +554,6 @@ grub_ofdisk_open (const char *name, grub_disk_t disk)
>>> static void
>>> grub_ofdisk_close (grub_disk_t disk)
>>> {
>>> -  if (disk->data == last_devpath)
>>> -    {
>>> -      if (last_ihandle)
>>> -   grub_ieee1275_close (last_ihandle);
>>> -      last_ihandle = 0;
>>> -      last_devpath = NULL;
>>> -    }
>>>  disk->data = 0;
>>> }
>>> @@ -685,7 +700,7 @@ grub_ofdisk_init (void)
>>> }
>>> static grub_err_t
>>> -grub_ofdisk_get_block_size (const char *device, grub_uint32_t *block_size,
>>> +grub_ofdisk_get_block_size (grub_uint32_t *block_size,
>>>                         struct ofdisk_hash_ent *op)
>>> {
>>>  struct size_args_ieee1275
>>> @@ -698,16 +713,6 @@ grub_ofdisk_get_block_size (const char *device, 
>>> grub_uint32_t *block_size,
>>>      grub_ieee1275_cell_t size2;
>>>    } args_ieee1275;
>>> -  if (last_ihandle)
>>> -    grub_ieee1275_close (last_ihandle);
>>> -
>>> -  last_ihandle = 0;
>>> -  last_devpath = NULL;
>>> -
>>> -  grub_ieee1275_open (device, &last_ihandle);
>>> -  if (! last_ihandle)
>>> -    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "can't open device");
>>> -
>>>  *block_size = 0;
>>>  if (op->block_size_fails >= 2)
>>> --
>>> 2.21.3
>>> _______________________________________________
>>> Grub-devel mailing list
>>> Grub-devel@gnu.org
>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> 
> 
> Thanks for the help here Eric and good job on writing this new module.
> 
> Well, I'm still considering this patch because:
> 
> 1. We are facing some problems with powerpc and it seems the code of ofdisk.c 
> is somewhat bogus. IMO would be nice to have this code at least working well 
> on community. It would help us almost instantly with our problems regarding 
> grub on powerpc.
> 2. Although we are considering/discussing to switch to obdisk module, this 
> will need an extra effort and time on writing specific code, testing, etc.
> 
> From our tests here with powerpc, this code is promising to solve many 
> problems that we are currently facing.
> 
> So, since we already have obdisk for SPARC, do you think it could be a 
> problem to update this code as suggested in the patch?
> 

Years ago I tried submitting a patch very similar to yours [1].  The 
response I got back from the maintainers was there was concern it may 
impact some of the other ieee1275 platforms.  I didn’t have the resources 
to test them all.  Personally I don’t see how anyone can use ofdisk.  It 
contains a terrible dangling pointer bug that prevented me from adding any
new code [2].


1. https://lists.gnu.org/archive/html/grub-devel/2016-06/msg00038.html
2. https://lists.gnu.org/archive/html/grub-devel/2016-06/msg00035.html






reply via email to

[Prev in Thread] Current Thread [Next in Thread]