Re: Shared-Memory for the Hurd?

From: Farid Hajji
Subject: Re: Shared-Memory for the Hurd?
Date: Fri, 1 Dec 2000 04:01:47 +0100

[Sorry for excessive verbosity here...]

I'm trying to figure out how to implement shared memory for the Hurd.
I already have experience with Lites/Mach, but I'm still very new to
Hurd-programming, so please be patient with silly questions. One motivation
for doing this is to get familiar with Hurd internals.

Several approaches come to mind (ordered by how much Hurd knowledge they need):

1. mmap()-based shm*() library implementation.
2. using libpager and a special external pager.
3. using a shm-server that vm_map()s own memory objects to shm-clients.
4. using a trivfs-translator that will vm_map() own memory on requests
   of shm-clients.
5. using a fullsized translator that will vm_map() own memory
   to shm-clients, as well as providing pseudo-files mapping
   shm-segment contents.

1.) mmap()-based shm*() library:

   * Each memory segment with key KEY will be mmap()ped to a file
     /sysv/shm/{KEY} of specified length and permissions.
     shmget() would create and zero-fill the file if needed, open
     it and return a descriptor to the caller. Flags and permissions
     will map nearly 1:1 to file permissions and open() O_* flags.
   * shmat() will simply mmap() the file descriptor returned
     by shmget(). Since mmap() uses vm_map() internally, this will
     make the file contents appear in the address-space of the
     caller in obvious ways.
   * shmdt() will simply munmap() the address mmap()-ped before,
     again in obvious ways.
   * shmctl() will e.g. unlink the file or do other stuff.

   * This approach is so generic that it has almost certainly
     been written before on plain old Unix systems that provide
     mmap() syscalls but no shm*() interface. It can be reimplemented
     on the Hurd as well, if nobody knows of an existing free library.
   * The implementation will provide shm segments that persist
     across reboots (!) if the files are backed by a disk-based
     store (e.g. in the ext2fs filesystem), which is even more than
     what classical SysV shm requires.
   * Accessing the files directly (read(), lseek(), write()) is yet
     another way to change the shm segments and can't be that bad.
   * ipcs and ipcrm shm functionality is trivial to implement.
   * It can be costly to create files through a filesystem translator,
     both in time and space.
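
A minimal sketch of approach 1, using only POSIX calls. All names
(xshm_get, xshm_at, xshm_dt, xshm_rm, XSHM_DIR) are made up for this
illustration, and the directory defaults to /tmp rather than the
/sysv/shm proposed above so that the sketch can be tried anywhere.
Unlike the real shmget(), the helper returns a plain file descriptor
and the caller has to remember the segment size.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names throughout; XSHM_DIR stands in for /sysv/shm. */
#ifndef XSHM_DIR
#define XSHM_DIR "/tmp"
#endif

/* Build the backing file name for KEY. */
static void xshm_path(char *buf, size_t len, long key)
{
    int n = snprintf(buf, len, "%s/xshm-%ld", XSHM_DIR, key);
    assert(n > 0 && (size_t)n < len);
}

/* shmget() analogue: open (and optionally create) the backing file,
   grow it to at least SIZE bytes, and return a file descriptor. */
int xshm_get(long key, size_t size, int oflags, mode_t mode)
{
    char path[256];
    struct stat st;
    int fd;

    xshm_path(path, sizeof path, key);
    fd = open(path, oflags, mode);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    /* ftruncate() zero-fills the part by which it extends the file. */
    if ((size_t)st.st_size < size && ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* shmat() analogue: map the segment shared, readable and writable. */
void *xshm_at(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* shmdt() analogue: drop one mapping of the segment. */
int xshm_dt(void *addr, size_t size)
{
    return munmap(addr, size);
}

/* shmctl(IPC_RMID) analogue: remove the name from the filesystem. */
int xshm_rm(long key)
{
    char path[256];

    xshm_path(path, sizeof path, key);
    return unlink(path);
}
```

Two callers that xshm_get() the same key and xshm_at() the resulting
descriptors see each other's stores, because both mappings are
MAP_SHARED views of the same file.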

2.) libpager with special external pager.

   * Is this really feasible? What happens when the reference count
     of a memory object drops to zero? Mach notifies the pager that
     the MO is no longer valid and the pager is free to reuse the
     swap-space for other purposes. The MO was already dropped by
     Mach, so there is no chance in keeping it, right?

   -> special pagers wouldn't be of much help here, unless there
      is a way to tell mach to keep memory objects active (e.g.
      by transferring their port-rights to the special pager?),
      vm_wire() and so on...
      Any ideas?

3.) shm-server vm_map()ping own memory to requesting clients

   * a shm-server holds vm_allocate()ed pages and vm_map()s them
     to requesting shm-clients. This way, the memory object won't
     be reclaimed by mach if no shm-client is present. They (the
     memory objects) are still being held by the shm-server, so
     they persist as sysv shm semantics require.
   * a shm-client calls shmget() to open/create a shm-segment.
     shmget() would send a send-right to the shm-client's task
     control port to the shm-server. The shm-server vm_allocate()s
     a memory object big enough to satisfy the client, saves a
     pointer/port to it as well as the task control port of the
     client in an internal table and returns a reply [code | port?]
     back to shm-client (back to shmget()).
   * the shm-client calls shmat(), specifying an address where to
     map the segment. shmat() sends a message to shm-server,
     specifying the requested address (as well as the number/port
     of the returned shmget() value). shm-server vm_map()s the
     page into the address space of the shm-client, using shm-client's
     previously sent-in task control port for this.
   * other shm-clients could also use shmget()/shmat() to access the
     same memory object in the shm-server in the same way.
   * shm-client calls shmdt() sending the shm-server the address
     of the shm-segment. shm-server vm_unmap()s the pages from
     the shm-client (using their task control port) but keeps
     the pages in its own task!
     [I use here vm_unmap() for clarity. In mach, it is of course
      a call to vm_deallocate(target_task, address, size), which
      will affect only one task].
   * some shm-client calls shmctl() to remove the shm-segment.
     shmctl() contacts shm-server and sends a message specifying
     the request. shm-server vm_deallocate()s the segment on its
     side, freeing memory.
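
The "internal table" from the steps above could be sketched as
follows. This is portable C only: calloc() stands in for
vm_allocate(), the Mach messaging and the task control ports are left
out completely, and all names (struct segment, seg_get, seg_attach,
seg_detach, seg_remove, KEY_PRIVATE) are invented for the illustration.
The point it shows is that a segment survives while no client is
attached, because the server itself keeps the memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SEG_MAX     16
#define KEY_PRIVATE 0L          /* stand-in for IPC_PRIVATE */

struct segment {
    int    in_use;
    long   key;
    size_t size;
    void  *mem;                 /* calloc()ed here, vm_allocate()d in a server */
    int    nattach;             /* number of clients currently attached */
};

static struct segment table[SEG_MAX];

/* Server side of shmget(): look KEY up, or create the segment when
   CREATE is set. Returns a segment number, or -1 on failure. */
int seg_get(long key, size_t size, int create)
{
    int i, free_slot = -1;

    if (size == 0)
        return -1;
    for (i = 0; i < SEG_MAX; i++) {
        if (table[i].in_use) {
            if (key != KEY_PRIVATE && table[i].key == key)
                return table[i].size >= size ? i : -1;
        } else if (free_slot < 0) {
            free_slot = i;
        }
    }
    if (!create || free_slot < 0)
        return -1;
    void *mem = calloc(1, size);
    if (mem == NULL)
        return -1;
    table[free_slot] = (struct segment){ 1, key, size, mem, 0 };
    return free_slot;
}

/* Server side of shmat(): count the client and hand out the memory. */
void *seg_attach(int id)
{
    if (id < 0 || id >= SEG_MAX || !table[id].in_use)
        return NULL;
    table[id].nattach++;
    return table[id].mem;
}

/* Server side of shmdt(): the segment itself stays in the table. */
int seg_detach(int id)
{
    if (id < 0 || id >= SEG_MAX || !table[id].in_use || table[id].nattach == 0)
        return -1;
    table[id].nattach--;
    return 0;
}

/* Server side of shmctl() removal: free the memory and the slot.
   Anything still attached is left dangling in this simple model. */
int seg_remove(int id)
{
    if (id < 0 || id >= SEG_MAX || !table[id].in_use)
        return -1;
    free(table[id].mem);
    memset(&table[id], 0, sizeof table[id]);
    return 0;
}
```

A client that attaches, writes, detaches and later looks the same key
up again gets the same segment number and finds its data still there.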

   * Since the Hurd doesn't provide a snames/netnameserver and
     netname_check_in()/netname_look_up() functions to contact
     shm-server, the clients need to inherit a port to that
     server in some way.
   * How to get a port-right to access the shm-server [if classic
     Mach netnameserver doesn't work]?
       + inherit the port-right across task_create()/fork()?
       + obtain the port-right through proc?
       + obtain the port-right through the filesystem
         (a.k.a. shm-server is translator, at least to rendez-vous
          on the filesystem, like pfinet on /servers/socket/2?)
     Of course, the filesystem would be more hurd-ish.
   * The task control port (send-right) of shm-clients is being
     transmitted to shm-server, so shm-server does not have to look
     this port up itself. Is it safe to do so? I currently see
     no other way to use vm_map() here.

4.) shm-server as trivfs-based translator:

   * shm-server manages vm_allocate()ed shm-segments for shm-clients just
     like in 3.).
   * shm-server implemented as trivfs-based translator.
     This is not so hard, hurd/trans/hello[-mt].c is a good
     example to start with.
   * shmget() will have to be split in two operations:
       open()  to contact the shm-server translator [/servers/ipc/shm]
               (get a file_t), then
       write() to transmit the desired key-id, size and flags, so that
               shm-server can vm_allocate() the shm-segment if necessary.
   * shmat() will:
       write() to transmit the desired target address, the shm-segment-#
         and other flags.
       -> Problem here: How can the task control port [send-right] of the
            shm-client be transmitted to the shm-server translator?
            Is it generally possible to transfer port-rights through the
            trivfs interface? How? Is the file_t indirectly returned by
            an open() suitable for sending port-rights to the trivfs-
            translator? Hmmm...
       shm-server will vm_map() the shm-segment in the address space of
       the requesting task.
       -> Same problem: how to get the task control port of the caller
            of write()? We only get some other port here... (?).
   * shmdt() will:
       write() to transmit the desired address + indication to unmap it.
       shm-server will simply vm_unmap() the shm-segment from the
       address-space of the task.
   * shmctl(IPC_RMID) will write() to shm-server, sending a shm-segment-#
       so that shm-server can vm_deallocate() it internally.
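
The write()-based requests above need some fixed layout that both
sides agree on. A minimal sketch of one possible layout follows; the
struct, the operation codes and the function names are all invented
here and are not an existing Hurd interface. It only covers the plain
data (key, size, flags, address). It deliberately says nothing about
the task control port, which is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request format; client and server run on the same
   machine, so the struct is copied as-is without byte swapping. */
enum shm_op { SHM_OP_GET = 1, SHM_OP_AT, SHM_OP_DT, SHM_OP_RMID };

struct shm_req {
    uint32_t op;        /* one of enum shm_op */
    uint32_t flags;     /* shmget()/shmat() flags */
    uint64_t key;       /* key for GET, shm-segment-# otherwise */
    uint64_t size;      /* requested size, GET only */
    uint64_t addr;      /* target address, AT and DT only */
};

/* Client side: fill BUF with the bytes that would be handed to write().
   Returns the number of bytes to write, or 0 if BUF is too small. */
size_t shm_req_pack(const struct shm_req *req, void *buf, size_t buflen)
{
    assert(req != NULL && buf != NULL);
    if (buflen < sizeof *req)
        return 0;
    memcpy(buf, req, sizeof *req);
    return sizeof *req;
}

/* Translator side: check what a write() delivered before acting on it.
   Returns 0 on success, -1 for a malformed request. */
int shm_req_unpack(const void *data, size_t len, struct shm_req *out)
{
    if (len != sizeof *out)
        return -1;                      /* short or oversized write */
    memcpy(out, data, sizeof *out);
    if (out->op == 0 || out->op > (uint32_t)SHM_OP_RMID)
        return -1;                      /* unknown operation */
    if (out->op == SHM_OP_GET && out->size == 0)
        return -1;                      /* a segment needs a size */
    return 0;
}
```

The translator's write handler would call shm_req_unpack() on the
incoming data and reject anything that does not validate, so a stray
write() to the node cannot trigger a half-parsed request.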

  * vm_map() and vm_unmap() need the task control port of the target task
    so that they can modify that address space. A shm-client can call
    mach_task_self() to get it, but how can it be transmitted to the
    trivfs-translator (e.g. in a write() request)? Can it be sent to
    the file_t returned by open()? Will it be inserted in the translator?
    How does the translator get this port-right (receive it)?
  * The translator could create a receive port and advertise this port
    in read()-requests of clients. (How? Is it possible to send port-rights
    this way with trivfs?). Clients could now send their task control port
    send-right to this port.
  * Is a full .defs handshake necessary?

5.) fullsized translator, providing pseudo-filesystem:

  * each shm-segment is represented by a pseudo-file managed by a
    fullsized translator settrans on e.g. /shm. The translator manages
    vm_allocate()ed pages just like in 3.) and 4.), mapping them to
    the pseudo-files [that is: shm-segment 34343 is mapped to /shm/34343].
  * shmget(IPC_PRIVATE) will open /shm/0, a special file meaning: create
    a new segment. Same for IPC_CREAT ... shm-server vm_allocate()s the
    memory, creates the pseudo-file and returns a port to that file to
    the caller.
  * shmat() calls write() on the specified file, sending the requested
    address, flags [and task control port, how?] to the /shm translator.
    the translator vm_map()s the memory to the client on request.
  * shmdt() calls write() on the specified file, sending the requested
    address to /shm translator. translator gets filename (shm-segment),
    [and task control port] and calls vm_unmap() to remove the memory
    from the caller's address space.
  * shmctl(IPC_RMID) simply unlink()s the pseudo-file /shm/key-id,
    effectively triggering shm-translator to vm_deallocate() it.
    -> Problem: clients that have already vm_map()ped this region will
       still hold a reference to that memory, until the last one
       vm_deallocate()s it.

  * shm-segments are visible as plain [pseudo-] files and could also
    be mmap()-ed, or read()/write()... from other processes that have
    enough permissions.
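
One way to handle the removal problem noted above is to treat the
removal request as a mark only and to release the memory when the last
client detaches. The sketch below is portable C with calloc()/free()
standing in for vm_allocate()/vm_deallocate(); the names (struct rseg,
rseg_init, rseg_attach, rseg_detach, rseg_remove) are invented for the
illustration. Refusing new attachments after the mark is a choice made
in this sketch, not something the text above requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct rseg {
    void *mem;          /* NULL once the backing memory is released */
    int   nattach;      /* clients currently attached */
    bool  doomed;       /* set when removal has been requested */
};

/* Create the backing memory; returns false if allocation fails. */
bool rseg_init(struct rseg *s, size_t size)
{
    s->mem = calloc(1, size);
    s->nattach = 0;
    s->doomed = false;
    return s->mem != NULL;
}

/* Release the memory once it is both doomed and unused. */
static void rseg_reap(struct rseg *s)
{
    if (s->doomed && s->nattach == 0 && s->mem != NULL) {
        free(s->mem);
        s->mem = NULL;
    }
}

/* shmat(): refused once the segment is gone or marked for removal. */
void *rseg_attach(struct rseg *s)
{
    if (s->mem == NULL || s->doomed)
        return NULL;
    s->nattach++;
    return s->mem;
}

/* shmdt(): the last detach of a doomed segment frees it. */
void rseg_detach(struct rseg *s)
{
    assert(s->nattach > 0);
    s->nattach--;
    rseg_reap(s);
}

/* Removal request: only marks the segment if clients are attached. */
void rseg_remove(struct rseg *s)
{
    s->doomed = true;
    rseg_reap(s);
}
```

With two clients attached, rseg_remove() leaves the memory in place;
it is freed by the second rseg_detach().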

1. is probably very easy to implement.
2. is probably impossible.
3. is possible, if shm-clients can get an advertised port of shm-server
   to start talking. Where to get that? Via procfs (a.k.a. via
   pid_2task() or something like this?)? 3. uses many Mach syscalls
   to send/receive messages with port-rights. Shouldn't that be
   isolated through higher Hurd layers?
4. if it is possible to transmit port-rights to a trivfs-based
   translator (e.g. through the file_t port open()ed), then it
   is possible and relatively easy to do shm this way.
5. same as in 4, but much harder to implement, since there
   is no good template/hello for a full filesystem translator in
   the Hurd right now (besides full-blown ext2, ufs).
   -> What about /procfs translator? Someone wrote such a beast,
      but it is not in CVS. Where to get it please?

General question: Is it possible to transmit port-rights to
translators via ioctl()s? What callback function handles ioctl()s
on open file descriptors? [ioctl() is a channel separate from write()
to send data to a file in Unix. Is this true for the Hurd too?]

Does all/most of this make sense, or am I now completely lost?

Hopefully my questions are now more easily understandable.

> > >   1. how to write (non-trivfs) translators
> The source (look at say ext2 and use that as a template for new ones).
Okay, I'll bite the bullet and will look at ext2 in more detail. What
I need right now is simply the list of callbacks that must be provided.
BTW, hurd/trans/hello[-mt].c was a very nice example for trivfs and a
similar example for non-trivfs translators would be nice to have too.
Do you know where procfs translator is? AFAIK, it's not in CVS yet.

> > >   2. how to use stores
> The source (and it is not easy).
That's the reason I asked in the first place ;-). Right now, I'll probably
avoid stores completely and keep vm_allocate()ed regions in the shm-trans
or shm-server directly.

> > >   3. how the port rights are passed from task to task
> The CMU documentation.  But this is actually quite easy, there is two
> ways:  a task inherits ports from its parent (the bootstrap port, the
> root, etc).  Second, port rights are passed along inside of messages:
Sorry for not being more precise. I just wanted to know how to get
a send-right to the task control port of the shm-requesting clients,
probably via proc (pid_2task() or somesuch). Or the other way round:
how to send that task control port to a shm-translator (file_t?).



Farid Hajji -- Unix Systems and Network Admin | Phone: +49-2131-67-555
Broicherdorfstr. 83, D-41564 Kaarst, Germany  |
- - - - - - - - - - - - - - - - - - - - - - - + - - - - - - - - - - - -
Murphy's Law fails only when you try to demonstrate it, and thus succeeds.
