bug-parted

Re: parted - larger logical and physical block sizes on GPT disks


From: K.G.
Subject: Re: parted - larger logical and physical block sizes on GPT disks
Date: Fri, 4 Nov 2005 15:22:03 +0100

Hi,

Only my opinion:

On Wed, 2 Nov 2005 15:18:53 -0600 "Elliott, Robert (Server Storage)" 
<address@hidden> wrote:
> I noticed a few things in parted's source code that might warrant
> fixing, particularly for the GUID Partition Table (GPT) partition format
> used by Extensible Firmware Interface (EFI) systems.  The Unified EFI
> Specification will discuss these issues.
> 
> 1. Logical block sizes are not necessarily 512 bytes; they could be 1024,
> 2048, or 4096 bytes (at least).  Both the ATA and the SCSI block command
> sets support this.  ATA devices typically do not implement it; SCSI
> logical units sometimes do.  The code has "512" sprinkled throughout,
> which will probably cause problems.

Well, sector size is a known problem in Parted. Until recently we
didn't receive many reports about it, probably because about 99.9999999%
of people use 512-byte sectors. But sector sizes that are multiples of
512 bytes are now beginning to show up (in RAID systems? very big disks?),
and Parted is mostly unusable when that happens.
I believe this should be fixed throughout the whole program, but
unfortunately that would probably involve a lot of work.
Also, some file systems and disk labels are only specified for 512-byte
sectors, so this might be a problem. In the disk_atari.c I've
recently written, I explicitly skip Atari disklabel probing if the
sector size isn't 512; I guess we should add that kind of test for
other problematic FS/disklabels.
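
To make the idea concrete, here is a minimal sketch of such a guard. It is
not the actual disk_atari.c code: ExampleDevice, EXAMPLE_LABEL_SECTOR_SIZE
and example_label_can_probe are hypothetical names used only for
illustration, and the struct is a stand-in that carries just the fields the
check needs.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device description (assumption, not the
 * real libparted type); only the fields used by the check are present. */
typedef struct {
    long long length;       /* device length, in logical sectors */
    long      sector_size;  /* logical sector size, in bytes */
} ExampleDevice;

/* The only sector size this (hypothetical) label format is defined for. */
#define EXAMPLE_LABEL_SECTOR_SIZE 512

/* Returns 1 if probing for the label may go on, 0 if the label cannot be
 * present because the device does not use 512-byte logical sectors. */
int example_label_can_probe(const ExampleDevice *dev)
{
    if (dev == NULL)
        return 0;
    return dev->sector_size == EXAMPLE_LABEL_SECTOR_SIZE;
}
```

A probe function would call this first and report "not this label" right
away when it returns 0, instead of reading sectors with the wrong size.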

> 2. Even if the logical block size is 512 bytes, the underlying physical
> block size may be a multiple of that. The drive performs
> read-modify-write when a full physical block is not accessed, incurring
> a performance hit but maintaining compatibility with software that uses
> 512 byte logical blocks.  
> 
> Serial ATA disks are expected to start doing this soon; their physical
> block may contain 1, 2, 4, or 8 logical blocks (the ATA IDENTIFY DEVICE
> command indicates how many).  SCSI doesn't have a way to report this
> type of behavior yet (it has always assumed that software would support
> a larger logical block size) but it might be added to match ATA.

Interesting. I guess this divides data structures into two sets: old ones
that aren't aware of the logical vs. physical block size issue and only
consider logical sizes, and new ones like GPT that handle it fine with
size fields and stay backward compatible with systems that don't probe
the physical sector size, right?
(Actually there's a third set: the ones that just assume 512-byte sectors.)
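
For the ATA side, a rough sketch of how the "how many logical blocks per
physical block" value could be decoded. This assumes the layout I know
from the ATA/ATAPI-7 IDENTIFY DEVICE data, word 106: bit 15 clear and
bit 14 set mean the word is valid, bit 13 means several logical sectors
per physical sector, and bits 3:0 hold the exponent X of 2^X. Please check
it against the spec before relying on it; logical_per_physical is a
hypothetical helper name, and fetching the IDENTIFY data is left out.

```c
#include <stdint.h>

/* Returns the number of logical sectors per physical sector described by
 * IDENTIFY DEVICE word 106 (layout assumed as stated above), or 1 when
 * the word is not valid or the device does not report a larger physical
 * sector. */
unsigned logical_per_physical(uint16_t word106)
{
    /* The word is only meaningful when bit 15 is 0 and bit 14 is 1. */
    if ((word106 & 0xC000) != 0x4000)
        return 1;

    /* Bit 13: the device has several logical sectors per physical one. */
    if (!(word106 & 0x2000))
        return 1;

    /* Bits 3:0 hold X, the count is 2^X. */
    return 1u << (word106 & 0x000F);
}
```

With that, a 512-byte logical sector and a result of 8 would mean a
4096-byte physical sector.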

> In this situation, it is important to align important structures like
> partition boundaries on the physical block boundaries; if they are
> unaligned, then accesses that are aligned to the start of the partition
> will actually result in excessive read-modify-writes by the disk.
> 
> For the GPT partition format, the first partition naturally starts on
> LBA 34, which is fine for 512 and 1024 byte physical block sizes but not
> good for 2048 or 4096 byte physical block sizes.  Partition tools like
> parted should, unless specifically requested otherwise by a
> knowledgeable user, start aligning their GPT partitions on larger
> boundaries (e.g. 128KiB would suffice for many years).

I believe this could easily be done with a constraint in the
*_partition_align functions. I think that once we obtain the logical and
physical block sizes and handle the logical size cleanly, we should apply
that kind of alignment to most disklabels (even those that know nothing
about logical/physical sector sizes), but this might be a problem if
"cylinder" alignment is needed.
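
The arithmetic behind such a constraint is simple. Below is a small
sketch, independent of Parted's constraint code: align_up and
first_aligned_lba are hypothetical helpers, and the alignment is given in
bytes so that it can be either the physical block size or a larger value
like the 128KiB suggested above.

```c
#include <stdint.h>

/* Round lba up to the next multiple of grain (both in logical sectors).
 * A grain of 0 is treated as "no alignment". */
uint64_t align_up(uint64_t lba, uint64_t grain)
{
    uint64_t rem;

    if (grain == 0)
        return lba;
    rem = lba % grain;
    return rem == 0 ? lba : lba + (grain - rem);
}

/* First LBA at or after first_usable_lba that starts on an align_bytes
 * boundary, for a device with logical_size bytes per logical sector.
 * Returns first_usable_lba unchanged if logical_size is 0. */
uint64_t first_aligned_lba(uint64_t first_usable_lba,
                           uint32_t logical_size,
                           uint32_t align_bytes)
{
    uint64_t grain;

    if (logical_size == 0)
        return first_usable_lba;

    grain = align_bytes / logical_size;
    if (grain == 0)
        grain = 1;  /* alignment smaller than one sector: nothing to do */
    return align_up(first_usable_lba, grain);
}
```

With 512-byte logical sectors, LBA 34 stays at 34 for a 1024-byte
alignment, moves to 36 for 2048 bytes, to 40 for 4096 bytes and to 256
for 128KiB.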

> Excerpts from disk_gpt.c that might have problems:
> typedef struct _GuidPartitionTableHeader_t {
>       uint64_t Signature;
>       uint32_t Revision;
>       uint32_t HeaderSize;
>       uint32_t HeaderCRC32;
>       uint32_t Reserved1;
>       uint64_t MyLBA;
>       uint64_t AlternateLBA;
>       uint64_t FirstUsableLBA;
>       uint64_t LastUsableLBA;
>       efi_guid_t DiskGUID;
>       uint64_t PartitionEntryLBA;
>       uint32_t NumberOfPartitionEntries;
>       uint32_t SizeOfPartitionEntry;
>       uint32_t PartitionEntryArrayCRC32;
>       uint8_t Reserved2[512 - 92];
> } __attribute__ ((packed)) GuidPartitionTableHeader_t;
> 
> Comment: The header (and its Reserved2 field) actually fills up the
> entire logical block, not just 512.
> 
>       data_start = 2 + GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
>       data_end = dev->length - 2
>                  - GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
> 
> Comment: The logical block size is not always 512 bytes.
> Comment: This probably leads to the first partition starting at LBA 34,
> which is not aligned for 2048 or 4096 byte sectors.  
> 
> 
>       if (!ped_device_read (dev, gpt, sector,
>                             sizeof (GuidPartitionTableHeader_t) / 512))
> 
> Comment: The logical block size is not always 512 bytes.
> 
>               if ((PedSector) PED_LE64_TO_CPU (gpt.AlternateLBA)
>                               < disk->dev->length - 1) {
>                       char zeros[512];
> 
> #ifndef DISCOVER_ONLY
>                       if (ped_exception_throw (
>                               PED_EXCEPTION_ERROR,
>                               PED_EXCEPTION_FIX | PED_EXCEPTION_CANCEL,
>               _("The backup GPT table is not at the end of the disk, as it "
>                 "should be.  This might mean that another operating system "
>                 "believes the disk is smaller.  Fix, by moving the backup "
>                 "to the end (and removing the old backup)?"))
>                                       == PED_EXCEPTION_CANCEL)
>                               goto error;
> 
>                       write_back = 1;
>                       memset (zeros, 0, 512);
>                       ped_device_write (disk->dev, zeros,
>                                         PED_LE64_TO_CPU (gpt.AlternateLBA),
>                                         1);
> #endif /* !DISCOVER_ONLY */
> Comment: The logical block size is not always 512 bytes.
> 
> ...
> etc. (search on "512" to find likely problems)

As I said before, disk_gpt.c is only a small part of the problem... :/
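
Just to illustrate what replacing one of those "512" would look like, here
is a sketch of the data_start/data_end computation with the sector size as
a parameter. It is not a patch: the function names are hypothetical, and I
assume a partition entry array of 16384 bytes (128 entries of 128 bytes),
which is what gives LBA 34 with 512-byte sectors.

```c
#include <stdint.h>

/* Assumed size of the partition entry array: 128 entries of 128 bytes. */
#define EXAMPLE_GPT_ENTRY_ARRAY_BYTES (128u * 128u)

/* Number of logical sectors needed to hold the entry array, rounded up.
 * sector_size must be non-zero. */
static uint64_t entry_array_sectors(uint32_t sector_size)
{
    return (EXAMPLE_GPT_ENTRY_ARRAY_BYTES + sector_size - 1) / sector_size;
}

/* First sector after the protective MBR (LBA 0), the GPT header (LBA 1)
 * and the primary entry array. */
uint64_t gpt_data_start(uint32_t sector_size)
{
    return 2 + entry_array_sectors(sector_size);
}

/* Same shape as the quoted data_end expression, with the sector size
 * taken from the device instead of a hard-coded 512. */
uint64_t gpt_data_end(uint64_t dev_length, uint32_t sector_size)
{
    return dev_length - 2 - entry_array_sectors(sector_size);
}
```

With 512-byte sectors this still yields 34 for data_start; with 4096-byte
sectors it yields 6.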

Cheers,
Guillaume Knispel



