RE: parted - larger logical and physical block sizes on GPT disks


From: Elliott, Robert (Server Storage)
Subject: RE: parted - larger logical and physical block sizes on GPT disks
Date: Fri, 4 Nov 2005 08:51:42 -0600

There is a web site with presentations from Hitachi, Intel, LSI,
Microsoft, Seagate, and WD discussing the topic:
http://www.bigsector.org.

--
Rob Elliott, address@hidden
Hewlett-Packard Industry Standard Server Storage Advanced Technology
https://ecardfile.com/id/RobElliott


 

> -----Original Message-----
> From: K.G. [mailto:address@hidden 
> Sent: Friday, November 04, 2005 8:22 AM
> To: Elliott, Robert (Server Storage); address@hidden; 
> Sven Luther
> Cc: address@hidden; address@hidden
> Subject: Re: parted - larger logical and physical block sizes 
> on GPT disks
> 
> Hi,
> 
> Only my opinion:
> 
> On Wed, 2 Nov 2005 15:18:53 -0600 "Elliott, Robert (Server 
> Storage)" <address@hidden> wrote:
> > I noticed a few things in parted's source code that might warrant
> > fixing, particularly for the GUID Partition Table (GPT) partition format
> > used by Extensible Firmware Interface (EFI) systems.  The Unified EFI
> > Specification will discuss these issues.
> > 
> > 1. Logical block sizes are not necessarily 512 bytes; they could be 1024,
> > 2048, or 4096 bytes (at least).  Both the ATA and the SCSI block command
> > sets support this.  ATA devices typically do not implement it; SCSI
> > logical units sometimes do.  The code has "512" sprinkled throughout,
> > which will probably cause problems.
> 
> Well, the sector size is a known problem in Parted. Until recently we
> didn't receive many reports about it, probably because about 99.9999999%
> of people are using 512-byte sectors. But now multiples of 512 bytes
> are beginning to be seen sometimes (in RAID systems? very big disks?) and
> Parted is mostly unusable when that happens.
> I believe this should be fixed in the whole program, but unfortunately
> this would probably involve a lot of work.
> Also, some file systems and disk labels are only described for 512-byte
> sectors, so this might be a problem. In the disk_atari.c I've
> recently written, I explicitly discard Atari disklabel probing if the
> sector size isn't 512; I guess we should probably add those kinds of
> tests for problematic FS/disklabels.
> 
> > 2. Even if the logical block size is 512 bytes, the underlying physical
> > block size may be a multiple of that. The drive performs
> > read-modify-write when a full physical block is not accessed, incurring
> > a performance hit but maintaining compatibility with software that uses
> > 512 byte logical blocks.
> > 
> > Serial ATA disks are expected to start doing this soon; their physical
> > block may contain 1, 2, 4, or 8 logical blocks (the ATA IDENTIFY DEVICE
> > command indicates how many).  SCSI doesn't have a way to report this
> > type of behavior yet (it has always assumed that software would support
> > a larger logical block size) but it might be added to match ATA.
> 
> Interesting. I guess this divides data structures into 2 sets: old ones
> which aren't aware of the logical vs physical disk block size issue will
> only consider logical sizes - and new ones like GPT which handle it fine
> with size fields and backward compatibility with systems that don't probe
> the physical sector size, right?
> (indeed there's a third set: the ones that just assume 512-byte sectors)
> 
> > In this situation, it is important to align key structures like
> > partition boundaries on the physical block boundaries; if they are
> > unaligned, then accesses that are aligned to the start of the partition
> > will actually result in excessive read-modify-writes by the disk.
> > 
> > For the GPT partition format, the first partition naturally starts on
> > LBA 34, which is fine for 512 and 1024 byte physical block sizes but not
> > good for 2048 or 4096 byte physical block sizes.  Partition tools like
> > parted should, unless specifically requested otherwise by a
> > knowledgeable user, start aligning their GPT partitions on larger
> > boundaries (e.g. 128KiB would suffice for many years).
> 
> I believe this could be easily done with a constraint in *_partition_align
> functions. I think that when we get the logical and physical disk block
> size and handle the logical size cleanly, we should put that kind of
> alignment for most disklabels (even if they know nothing about
> logical/physical sector sizes) but this might be a problem if "cylinder"
> alignment is needed.
> 
> > Excerpts from disk_gpt.c that might have problems:
> > typedef struct _GuidPartitionTableHeader_t {
> >     uint64_t Signature;
> >     uint32_t Revision;
> >     uint32_t HeaderSize;
> >     uint32_t HeaderCRC32;
> >     uint32_t Reserved1;
> >     uint64_t MyLBA;
> >     uint64_t AlternateLBA;
> >     uint64_t FirstUsableLBA;
> >     uint64_t LastUsableLBA;
> >     efi_guid_t DiskGUID;
> >     uint64_t PartitionEntryLBA;
> >     uint32_t NumberOfPartitionEntries;
> >     uint32_t SizeOfPartitionEntry;
> >     uint32_t PartitionEntryArrayCRC32;
> >     uint8_t Reserved2[512 - 92];
> > } __attribute__ ((packed)) GuidPartitionTableHeader_t;
> > 
> > Comment: The header (and its Reserved2 field) actually fills up the
> > entire logical block, not just 512.
> > 
> >     data_start = 2 + GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
> >     data_end = dev->length - 2
> >                - GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
> > 
> > Comment: The logical block size is not always 512 bytes.
> > Comment: This probably leads to the first partition starting at LBA 34,
> > which is not aligned for 2048 or 4096 byte sectors.
> > 
> > 
> >     if (!ped_device_read (dev, gpt, sector,
> >                           sizeof (GuidPartitionTableHeader_t) / 512))
> > 
> > Comment: The logical block size is not always 512 bytes.
> > 
> >             if ((PedSector) PED_LE64_TO_CPU (gpt.AlternateLBA)
> >                             < disk->dev->length - 1) {
> >                     char zeros[512];
> > 
> > #ifndef DISCOVER_ONLY
> >                     if (ped_exception_throw (
> >                             PED_EXCEPTION_ERROR,
> >                             PED_EXCEPTION_FIX | PED_EXCEPTION_CANCEL,
> >             _("The backup GPT table is not at the end of the disk, as it "
> >               "should be.  This might mean that another operating system "
> >               "believes the disk is smaller.  Fix, by moving the backup "
> >               "to the end (and removing the old backup)?"))
> >                                     == PED_EXCEPTION_CANCEL)
> >                             goto error;
> > 
> >                     write_back = 1;
> >                     memset (zeros, 0, 512);
> >                     ped_device_write (disk->dev, zeros,
> >                                       PED_LE64_TO_CPU (gpt.AlternateLBA),
> >                                       1);
> > #endif /* !DISCOVER_ONLY */
> > Comment: The logical block size is not always 512 bytes.
> > 
> > ...
> > etc. (search on "512" to find likely problems)
> 
> As I said before, disk_gpt.c is only a small part of the problem... :/
> 
> Cheers,
> Guillaume Knispel
> 
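
To illustrate the two disk_gpt.c comments in the quoted message (dividing
by a hard-coded 512, and the first partition landing on LBA 34), here is a
minimal sketch in C. It is not parted's code: the function names and the
ENTRY_ARRAY_BYTES constant are hypothetical, and the 16 KiB entry array
size is an assumption inferred from the "2 + ... / 512" expression yielding
LBA 34. The idea is to derive block counts from the device's logical block
size and then round the first partition start up to an alignment boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the GPT partition entry array (16 KiB); hypothetical
 * stand-in for GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE. */
#define ENTRY_ARRAY_BYTES (16u * 1024u)

/* Round value up to the next multiple of align (align must be nonzero). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0);
    return (value + align - 1) / align * align;
}

/* Number of logical blocks needed to hold `bytes` bytes, rounding up. */
static uint64_t blocks_for(uint64_t bytes, uint64_t logical_block_size)
{
    assert(logical_block_size != 0);
    return (bytes + logical_block_size - 1) / logical_block_size;
}

/* First LBA at which a partition may start: LBA 0 (protective MBR) and
 * LBA 1 (GPT header) are skipped, then the entry array, and the result
 * is rounded up to a multiple of align_bytes expressed in blocks. */
static uint64_t first_aligned_lba(uint64_t logical_block_size,
                                  uint64_t align_bytes)
{
    uint64_t first_usable = 2 + blocks_for(ENTRY_ARRAY_BYTES,
                                           logical_block_size);
    uint64_t align_blocks = blocks_for(align_bytes, logical_block_size);
    return align_up(first_usable, align_blocks);
}
```

With 512-byte logical blocks and no extra alignment
(first_aligned_lba(512, 512)) this gives LBA 34, matching the current
behaviour; with a 128 KiB alignment it gives LBA 256 for 512-byte blocks
and LBA 32 for 4096-byte blocks.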
