Re: [PATCH] improve device probing heuristics

From: Andreas Dilger
Subject: Re: [PATCH] improve device probing heuristics
Date: Wed, 6 Feb 2002 23:21:27 -0700
User-agent: Mutt/

On Feb 06, 2002  19:41 -0800, Andrew Clausen wrote:
> --- Andreas Dilger <address@hidden> wrote:
> > Well, I had considered this also.  In fact, I might
> > make a separate
> > file (lib?) which is only concerned with iterating
> > whole block devices
> > and/or partitions.  I think this is used in many
> > places, so it might
> > be a benefit to keep this separable from blkid as a
> > whole.
> Obviously, this means you have to traverse the
> entire cache, but don't you need to do this
> anyway?
> (And iterating the cache should be cheap?)

Well, libblkid still needs a class of functions that I was talking about
above, namely, to find all of the block devices on that system.  Before
it has a cache, it has to have a list of devices to add to the cache.

That code is replicated in so many places, because you can't guarantee
that /proc/partitions is available, nor that it holds all of the block
devices that are actually accessible - only those that have already
been detected by the kernel.  It could very well be that for iSCSI or
such that it is blkid which is telling the kernel about the existence
of a device.
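
To make the fallback idea concrete, here is a minimal sketch, not the actual libblkid code. It reads a /proc/partitions-style file when one exists and otherwise falls back to a fixed list of names to probe. The function names and the fallback list are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one data line of /proc/partitions ("major minor #blocks name").
 * Returns 1 and fills the outputs on success; returns 0 for the header
 * line, blank lines, or anything else that does not match. */
int parse_partitions_line(const char *line, int *major, int *minor,
                          unsigned long long *blocks, char name[64])
{
    assert(line && major && minor && blocks && name);
    return sscanf(line, "%d %d %llu %63s", major, minor, blocks, name) == 4;
}

/* Hypothetical fallback list, used only when the file cannot be opened. */
static const char *const fallback_names[] = { "hda", "hdb", "sda", "sdb" };

/* Call fn (if non-NULL) for each candidate device name.
 * Returns the number of names visited. */
int list_candidate_devices(const char *path,
                           void (*fn)(const char *name, void *priv),
                           void *priv)
{
    FILE *f = fopen(path, "r");
    char line[256], name[64];
    int major, minor, count = 0;
    unsigned long long blocks;

    if (!f) {
        /* No kernel-provided list: walk the names we know to look for. */
        size_t i, n = sizeof(fallback_names) / sizeof(fallback_names[0]);
        for (i = 0; i < n; i++, count++)
            if (fn)
                fn(fallback_names[i], priv);
        return count;
    }
    while (fgets(line, sizeof(line), f)) {
        if (parse_partitions_line(line, &major, &minor, &blocks, name)) {
            if (fn)
                fn(name, priv);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

A real version would still have to open each fallback name to see whether the device exists, which is the part every program currently reimplements.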

> OTOH, both libparted and libblkid have
> an abstraction of what a "device" is.
> This seems fairly important.  (For example,
> libext2fs also has one... the io channel
> thing)
> Perhaps libblkid should provide a standard
> that everyone can use?
> In other words, maybe libparted/device.c,
> libparted/linux.c, libparted/gnu.c should
> be part of libblkid?

Well, yes and no.  The code I refer to above is essentially what is
in linux_probe_all(), but enhanced with some of the changes I am making
(i.e. skipping removable devices, probing RAID devices if /proc/partitions
is not available, LVM (for blkid only at this point), etc.).

Yes, maybe I could add a void * (or maybe even a list, in case there are
multiple layers like foo->libparted->libblkid) to each device struct which
the calling application can use to associate private data with the device.
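
As a rough sketch of the "list" variant, each layer could key its private data by an owner pointer of its own choosing. The struct layout, the names, and the fixed slot count are all assumptions for illustration, not anything that exists in libblkid or libparted.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_LAYERS 4   /* arbitrary limit for this sketch */

/* One slot per layer; "owner" is any unique address the layer uses as key. */
struct demo_priv_slot {
    const void *owner;
    void *data;
};

/* Hypothetical device record; only the private-data part is shown. */
struct demo_device {
    const char *name;
    struct demo_priv_slot priv[DEMO_MAX_LAYERS];
};

/* Attach (or replace) the data for one owner.
 * Returns 0 on success, -1 if every slot belongs to another owner. */
int demo_device_set_private(struct demo_device *dev, const void *owner,
                            void *data)
{
    size_t i;

    assert(dev && owner);
    for (i = 0; i < DEMO_MAX_LAYERS; i++) {
        if (dev->priv[i].owner == owner || dev->priv[i].owner == NULL) {
            dev->priv[i].owner = owner;
            dev->priv[i].data = data;
            return 0;
        }
    }
    return -1;
}

/* Look up the data for one owner; NULL if that owner never set any. */
void *demo_device_get_private(const struct demo_device *dev,
                              const void *owner)
{
    size_t i;

    assert(dev && owner);
    for (i = 0; i < DEMO_MAX_LAYERS; i++)
        if (dev->priv[i].owner == owner)
            return dev->priv[i].data;
    return NULL;
}
```

With a single void * the layers would overwrite each other; the owner key is what lets foo, libparted and libblkid each keep their own data on the same device.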

> > Basically, you would want something like:
> > 
> > int blkid_iterate_devname(blkid_cache *cache,
> >                       int (*fn)(char *devname, long flags,
> >                                 void *private),
> >                       long flags, void *private);
> > 
> > and it calls (*fn) for each disk/partition that it
> > detects, passes the
> > application private data through, and flags.
> Either that, or a "get_next" function...

Either is fine.  The above is what I'm used to from e2fsprogs, but
I'm flexible.
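
For comparison, here is a small sketch of both styles over a fixed in-memory table. The table, the demo_ names, and the flag values are placeholders invented for the example; a real implementation would walk probed devices instead.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DISK 0x1L   /* placeholder flag values */
#define DEMO_PART 0x2L

struct demo_entry {
    const char *devname;
    long flags;
};

/* Stand-in for the devices a real probe would find. */
static const struct demo_entry demo_table[] = {
    { "/dev/hda",  DEMO_DISK },
    { "/dev/hda1", DEMO_PART },
    { "/dev/sda",  DEMO_DISK },
};
#define DEMO_COUNT (sizeof(demo_table) / sizeof(demo_table[0]))

typedef int (*demo_iter_fn)(const char *devname, long flags, void *priv);

/* Callback style: call fn for every entry and pass priv through.
 * A nonzero return from fn stops the walk and is returned to the caller. */
int demo_iterate(demo_iter_fn fn, void *priv)
{
    size_t i;
    int rc;

    assert(fn);
    for (i = 0; i < DEMO_COUNT; i++) {
        rc = fn(demo_table[i].devname, demo_table[i].flags, priv);
        if (rc)
            return rc;
    }
    return 0;
}

/* get_next style: the caller owns the cursor (start it at 0).
 * Returns NULL once the table is exhausted. */
const struct demo_entry *demo_get_next(size_t *cursor)
{
    assert(cursor);
    if (*cursor >= DEMO_COUNT)
        return NULL;
    return &demo_table[(*cursor)++];
}

/* Example callback: count whole-disk entries in an int passed via priv. */
int demo_count_disks(const char *devname, long flags, void *priv)
{
    (void)devname;
    if (flags & DEMO_DISK)
        ++*(int *)priv;
    return 0;
}
```

The callback form keeps the iteration state inside the library, while get_next hands it to the caller, which is easier to use from a loop that needs to break out or interleave other work.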

> BTW: will we need fancy data structures for
> big systems?

I don't know.  I assume that no matter what sort of things happen with
a fancy device, in the end it will present a block device node which I
can read/write.   If it does other things, I'm not worried about that.

> > For input flags, I see:
> > 
> > BLKID_ITER_DISK  - call fn for whole disk devices (flag to fn also)
> > BLKID_ITER_PART  - call fn for partitions (flag to fn also)
> > BLKID_ITER_FULL  - don't get fancy with trying to probe a reduced
> >                    set of apparently available devices (e.g. don't
> >                    use the /proc/partitions or /devfs/discs data), but
> >                    do a probe of all devices that we know to look for
> > BLKID_ITER_CACHE - call fn for devices listed in the blkid cache even
> >                    if they are not found via regular probe
> I think I prefer the following:
> * each device has a bit-field of:
>    - BLKID_STALE   (entry invalidated?)

Well, depending on how you set up the scanning code, these could be
either input flags or output flags.  On input they mean "I'm only
interested in getting devices with the given property", and on output
they mean "the device has this property".

> > I need the blkid_cache parameter only to support the
> > last flag.  The question also arises on whether we would
> > want to allow specifying iteration over a subset of devices
> > (e.g. only IDE, SCSI, (specific?) RAID)?
> This is another field...?
> (dev->bus?)

Yes, it could be.  Some of this information is encoded in the device
name and/or major/minor numbers, but it would also be nice for some
applications to not care.  Again, this could be either an input and/or
an output parameter.
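
A minimal sketch of the "same bits as input or output" idea, assuming a hypothetical property bit-field and bus enum (none of these names exist in libblkid or libparted). On the device they describe what it is; passed to the match function they describe what the caller wants.

```c
#include <assert.h>

/* Property bits: output when stored on a device, input when used as a mask. */
enum {
    DEMO_PROP_DISK  = 1 << 0,
    DEMO_PROP_PART  = 1 << 1,
    DEMO_PROP_STALE = 1 << 2
};

enum demo_bus { DEMO_BUS_ANY = 0, DEMO_BUS_IDE, DEMO_BUS_SCSI };

struct demo_dev_info {
    unsigned props;       /* "the device has this property" */
    enum demo_bus bus;    /* derived from name or major/minor in real code */
};

/* Input side: "want" is an any-of mask, and 0 means "don't care".
 * DEMO_BUS_ANY likewise means the caller does not care about the bus. */
int demo_dev_matches(const struct demo_dev_info *dev, unsigned want,
                     enum demo_bus bus)
{
    assert(dev);
    if (want && !(dev->props & want))
        return 0;
    if (bus != DEMO_BUS_ANY && dev->bus != bus)
        return 0;
    return 1;
}
```

An application that does not care about the bus passes DEMO_BUS_ANY and a zero mask and simply sees every device.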

> > I do not currently have iteration functions for all of
> > the structs which I export via libblkid, but looking at
> > libparted has really shown me that I need to do that.
> In the end, you need it because the user wants to see a list of
> devices, and click on the one she wants.

Well, I'm not so much worried about that.  Most of the code I'm writing
is not directly interfacing with a user, but is helper code for other
apps (mostly consolidating the duplicated and slightly mutated code from
many programs into a single place).

> > I have also considered having accessor functions for each data field
> > in each struct, so it hides struct internals from the applications,
> > but that is a lot of work to code...
> Does this ever gain you anything?
> I think it boils down to: 'How "internal" are
> the internals?'.  If the internals are so
> internal that you need to hide them, you
> probably got the ontology ("being") of the
> object wrong.  i.e. the wrong fields in the
> struct.

Well, I worked on libpng for a long time and we eventually moved to a
model where the struct is supposed to be totally opaque and all data
is gotten from accessor functions (even if they are just returning a
single field in the struct).  The reason we went to this model is that
doing anything else prevents you from ever making changes to the internals
of the struct (reorgs, name changes).  There were also problems with
struct alignment (we had a struct longjump in there, which changes size
depending on whether you compile with BSD or SYSV flags).

That said, it is a lot of work/code/hassle to do this if you don't need it.
We needed a get/set for each struct field, we needed an interface to tell us
if a given feature is actually available before we try to access it, etc.
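
For illustration, the opaque-struct pattern looks roughly like the sketch below. The demo_handle type and its fields are hypothetical and are not libpng's or libblkid's API. In a real library only the typedef and the prototypes would be in the public header, with the struct definition kept private.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Public side: applications only ever see this incomplete type. */
typedef struct demo_handle demo_handle;
#define DEMO_VALID_LABEL 0x1u   /* "is this field available?" bit */

/* Private side: free to be reorganized without breaking callers. */
struct demo_handle {
    char name[32];
    char label[32];
    unsigned valid;   /* which optional fields have been set */
};

demo_handle *demo_handle_new(const char *name)
{
    demo_handle *h;

    assert(name);
    h = calloc(1, sizeof(*h));
    if (h)
        snprintf(h->name, sizeof(h->name), "%s", name);
    return h;
}

void demo_handle_free(demo_handle *h)
{
    free(h);
}

const char *demo_handle_get_name(const demo_handle *h)
{
    assert(h);
    return h->name;
}

/* Feature check: nonzero only if every bit in "field" has been set. */
int demo_handle_has(const demo_handle *h, unsigned field)
{
    assert(h);
    return (h->valid & field) == field;
}

void demo_handle_set_label(demo_handle *h, const char *label)
{
    assert(h && label);
    snprintf(h->label, sizeof(h->label), "%s", label);
    h->valid |= DEMO_VALID_LABEL;
}

/* Returns NULL when no label was ever set, instead of an empty string. */
const char *demo_handle_get_label(const demo_handle *h)
{
    assert(h);
    return demo_handle_has(h, DEMO_VALID_LABEL) ? h->label : NULL;
}
```

Even for two fields this is six functions, which is the work/code/hassle cost in a nutshell.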

> > Yes, I would call the "low-level" devices as
> > whole-disk or unpartitioned
> > devices.  You would probably want to ignore MD
> > devices and loop, while
> > others might not.  Likewise with CDROMs.  Argh, this
> > is getting big.
> Indeed.

I think it boils down to being able to accurately specify what classes
of device you are interested in.  Parted, for example, currently wants
all unpartitioned devices (exclude disk partitions, MD, LVM, loop).
Conversely, blkid is _currently_ only handling partitioned or whole
disk devices (exclude partitioned devices) but may change in the future.

> The distro folks use libparted's list *grin*.
> (which is how the code got so complicated in
> the first place)

Well, I had thought there were tools/libs in the installers which
could give you a list of, say, disks, serial ports, etc.  Maybe
their code is too huge for some needs, or maybe it is dying to use
"the" block device iteration library, who knows.

> :(  I could redirect my .gnu.org mail spool,
> just I subscribe directly to most things
> address@hidden I could also
> download from there, but that leaves a mess
> to sort out later.  (Probably not too hard...
> and worth doing if I'm going to be without
> a server for a week).  OTOH, since I'm having
> plenty of problems at this end also, and I'm
> going to Sao Paulo soon...

Yes, I still get email from many sources as well (at least 3 or 4 on a
regular basis), but in general my total email volume is so high that I
don't miss it for a week or so if one is down for some reason.  Did I
ever mention that I didn't get email from one of my low-volume accounts
for _months_ and I didn't notice it?  Because I got some email CC'd to
that account, but also got it via a mailing list to another account,
I still got to see 90% of my email to that account, and I was _sure_
it was working correctly because I would receive emails addressed to
that account.

I don't really want to consolidate my emails to a single account, as
that becomes a single point of failure (whether system or job related).
I suppose the (increasingly popular) solution is to have a personal
domain which you control, and then you can move that around as you need
so you always keep the same email address.

Cheers, Andreas
Andreas Dilger
