bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] fully enable rpctrace to trace multitask programs.


From: olafBuddenhagen
Subject: Re: [PATCH] fully enable rpctrace to trace multitask programs.
Date: Mon, 8 Jun 2009 08:32:54 +0200
User-agent: Mutt/1.5.18 (2008-05-17)

Hi,

On Sun, May 31, 2009 at 11:32:25AM +0800, Zheng Da wrote:

> I rewrite rpctrace so that it can trace multitask programs.

I wonder, is this inherently a single large change, or would it somehow
be possible to split it into several distinct patches, which could be
applied individually?...

> 1. tracing the source and destination of the RPC requests/replies. It
> is required when we move the port rights and it also allows rpctrace
> to show the caller of RPCs (I think it is a nice feature).

It only shows the caller when it is also a traced task, i.e. this is
only meaningful when indeed tracing multitasked stuff, right?

> The patch can also fix some bugs reported in savannah, for example,
> the traced task hangs when the it gets signals.

Are these fixes a side effect of the new functionality, or have you done
fixes explicitely? In the latter case, they should be split out into
distinct patches...

> 1. rpctrace hangs when it traces itself. I think the reason is that
> the tracer and tracee both need to set the kernel task port and thread
> port.

This sounds plausible. I guess it could only be fixed by proxying the
RPCs for setting the kernel ports...

However, AIUI the original rpctrace doesn't use these -- so there must
be other problems as well?...

It seems to me that it would be useful to add some consistency checks at
various places, making sure that things are still in the state rpctrace
believes them to be.

> 2. pfinet dies when it is traced by rpctrace. The reason is unknown.

> 3. gcc/mig reports "server type check failure" when it is traced by
> rpctrace.

Would be useful if these could be tracked down... Even if it turns out
that the actual issues are not easily fixable.

> The patch is below. Comments and tests are very welcome.

I don't know enough about rpctrace to do a full review... So this is
only going to be a lot of nitpicking and some strange questions :-)

> diff --git a/utils/rpctrace.c b/utils/rpctrace.c
> index d80f41d..eb33af7 100644
> --- a/utils/rpctrace.c
> +++ b/utils/rpctrace.c
> @@ -39,6 +39,7 @@
>  #include <stdbool.h>
>  #include <stddef.h>
>  #include <argz.h>
> +#include <sys/mman.h>
> 
>  const char *argp_program_version = STANDARD_HURD_VERSION (rpctrace);
> 
> @@ -59,6 +60,9 @@ static const struct argp_option options[] =
>    {0}
>  };
> 
> +#define UNKNOWN_TASK -1
> +#define UNKNOWN_NAME -1

Not a good idea to define these to an arbitrary value -- there is no
guarantee that this value won't appear in normal usage...

You could use one of the reserved values (MACH_PORT_NULL,
MACH_PORT_DEAD) perhaps, if there are no clashes to be expected with the
normal use of these values. Or you could simply allocate a dummy port
that will never be used, except as a placeholder for these values...

Note that according to the GNU Mach documentation, MACH_PORT_DEAD is
actually ~0, i.e. -1 -- so if you use MACH_PORT_DEAD, the effect will be
the same, but the overloading will be explicit then...

> +
>  static const char args_doc[] = "COMMAND [ARG...]";
>  static const char doc[] = "Trace Mach Remote Procedure Calls.";
>  
> @@ -82,6 +86,49 @@ msgid_ihash_cleanup (void *element, void *arg)
>  static struct hurd_ihash msgid_ihash
>    = HURD_IHASH_INITIALIZER (HURD_IHASH_NO_LOCP);
> 
> +/* all traced tasks. */
> +task_t *traced_tasks;
> +size_t nb_traced_tasks;
> +
> +void
> +add_task (task_t task)
> +{
> +  static size_t capability = 100;

"capability" is a rather confusing name... Commonly such things are
called alloc_size or something like that.

> +
> +  if (traced_tasks == NULL)
> +    traced_tasks = malloc (sizeof (task_t) * capability);

No need to special-case this: realloc() applied on NULL is equivalent to
malloc().

Of course, this requires the start value for alloc_size to be 0, and
consequently the incrementation algorithm needs to change. (e.g.
alloc_size = alloc_size * 2 + 100)

> +  else if (nb_traced_tasks == capability)
> +    {
> +      capability *= 2;
> +      traced_tasks = realloc (traced_tasks, capability);
> +    }
> +  assert (traced_tasks);
> +
> +  traced_tasks[nb_traced_tasks++] = task;
> +}

actually I don't think it's a good idea to use chunked allocation here
at all... Probably just adds unnecessary complexity. Simply realloc()
each time -- it's not *that* inefficient. Or use a proper dynamic data
structure; like a linked list, or perhaps even better an ihash.

> +
> +void
> +remove_task (task_t task)
> +{
> +  int i,j;
> +  for (i = 0, j = 0; i < nb_traced_tasks && j < nb_traced_tasks; i++, j++)
> +    {
> +      if (traced_tasks[j] == task)
> +     j++;
> +      traced_tasks[i] = traced_tasks[j];
> +    }
> +  nb_traced_tasks = i;
> +}

Yeah, this definitely looks like a job for a dynamic data structure...

> +
> +task_t
> +get_task (int idx)
> +{
> +  if (idx >= nb_traced_tasks)
> +    return -1;
> +  assert (traced_tasks);
> +  return traced_tasks[idx];
> +}
> +
>  /* Parse a file of RPC names and message IDs as output by mig's -list
>     option: "subsystem base-id routine n request-id reply-id".  Put each
>     request-id value into `msgid_ihash' with the routine name as its value.  
> */
> @@ -190,21 +237,40 @@ msgid_trace_replies (const struct msgid_info *info)
>  /* We keep one of these structures for each port right we are tracing.  */
>  struct traced_info
>  {

I had various doubts about this structure, until I realized that the
real issue goes deeper: It is no longer appropriate to use a single
structure here, when tracing multiple tasks.

I hope you are familiar with relational databases a bit, as it's easiest
to explain using that terminology: in the original case the situation
was simple, as for every traced port, there was exactly one sender and
one receiver; one wrapper port, and one forward port. (The role
alternating between the traced task and the "outside", depending on what
end the traced task holds). There was a 1:1 relation, meaning everything
can be stored in a single table.

With multiple traced tasks, the situation is different though: we still
have exactly one receiver for each port, but any number of senders -- a
1:m relationship. You really need two tables now.

You have tried still to keep it in a single data structure, but this is
wrong, and it shows: The bits of data required for the sender part and
the receiver parts are very different. Consequently, you get lots of
fields that are only meaningful in some cases, or ever worse, have
overloaded meaning in different cases. In database-speak, the table is
heavily denormalized.

The solution is simple: Use two different data structures. One for
receiver_info, holding the forward port and other information about the
receiver; and another one for sender_info, holding the wrapper port and
other information about the sender. Many things will become much cleaner
and more obvious this way.

(I'm not quite sure what best to do about send-once wrappers, though...)

On a related note, you are calling the data structures holding receiver
info "receive wrapper"; in every message, you have a "send wrapper" and
a "receive wrapper" involved. This doesn't really make sense: the port
is only wrapped once.

The receiver-side data structure doesn't actually do any wrapping -- it
only holds information about the receiver and the forward port. The
sender side structure holds the wrapper port, so the term "wrapper" is
more appropriate here -- but still, I suggest using the term only for
the actual wrapper port structure, not for all of the sender_info
structure...

I think it would indeed be useful to do the restructuring in separate
patches, and let the actual support for multiple tasks go in last,
building on that. (It seems to me that the bulk of the changes is
actually in the restructuring...)

> +  /* 'pi' is only used for send right or send-once right,
> +   * no task can get the send right or send-right to
> +   * the wrapper for the receive right.*/
>    struct port_info pi;
> 
> -  mach_port_t forward;               /* real port */
> +  /* real port.
> +     it is only used by send-once right and receive right. */
> +  mach_port_t forward;
>    mach_msg_type_name_t type;

Putting a comment above a block of several lines like this, suggests
that the comment applies to the whole block -- which is probably not
intended here?...

Use a blank line to make it clear that the comment doesn't also apply to
the second line.

Also, serious nitpicking mode: Sentences begin with a capital letter :-)
(Only exception is if the sentence starts with an identifier, in which
case it should not be capitalized of course...)

> -
> +  /* the task who has the right.
> +     it is only used by the receive right and send right */
> +  task_t task;
> +  /* the port name in the owner task.
> +   * it is only used by the receive right. */
> +  mach_port_t portname;
> +
> +  /* It is only used by the send right and the receive right.
> +   * For the send right, it points to the corresponding receive right.
> +   * For the receive right, it links to other receive rights. */
> +  struct traced_info *receive_right;

Why does it need to link to other receive rights?...

> +  /* it is used to form the list of send rights for different tasks.
> +   * the head is the receive right. */
> +  struct traced_info *next;

Another case of problematic overloading: the head pointer should use a
distinct name from the next pointers; it's very confusing otherwise. But
this issue, like many other ones in this data structure, will go away
when having distinct structers for sender and receiver...

> +                             
>    char *name;                        /* null or a string describing this */
> 
>    union
>    {
>      struct traced_info *nextfree; /* Link when on free list.  */
> 
> -    struct                   /* For a send right wrapper.  */
> +    struct                   /* For a receive right wrapper.  */
>      {
>        hurd_ihash_locp_t locp;        /* position in the traced_names hash 
> table */
> -    } send;
> +    } receive;

This change looks very strange... But I guess it's just another artifact
from the unhappy conflation of the information in a single data
structure, having a completely different meaning than originally.

> 
>      struct                   /* For a send-once right wrapper.  */
>      {
> @@ -220,11 +286,70 @@ struct traced_info
>  };
>  #define INFO_SEND_ONCE(info) ((info)->type == MACH_MSG_TYPE_MOVE_SEND_ONCE)
> 
> +struct req_info
> +{
> +  boolean_t valid;
> +  mach_msg_id_t req_id;
> +  mach_port_t reply_port;
> +  task_t from;
> +  task_t to;
> +  struct req_info *next;
> +};
> +
> +static struct req_info *req_head = NULL;

Perhaps using ihash would be appropriate here as well? It would avoid
the need for custom list handling.

> +
> +static struct req_info *
> +add_request (mach_msg_id_t req_id, mach_port_t reply_port,
> +          task_t from, task_t to)
> +{
> +  struct req_info *req = malloc (sizeof (*req));
> +  assert (req);
> +  req->req_id = req_id;
> +  req->from = from;
> +  req->to = to;
> +  req->reply_port = reply_port;
> +  req->valid = TRUE;
> +
> +  req->next = req_head;
> +  req_head = req;
> +
> +  return req;
> +}
> +
> +static struct req_info *
> +remove_request (mach_msg_id_t req_id, mach_port_t reply_port)
> +{
> +  struct req_info **prev;
> +  struct req_info *req;
> +
> +  if (req_head == NULL)
> +    return NULL;

No need to special-case this -- the loop and final check will handle
this case just fine.

> +
> +  prev = &req_head;
> +  while (*prev)
> +    {
> +      if ((*prev)->req_id == req_id && (*prev)->reply_port == reply_port)
> +     break;
> +      prev = &(*prev)->next;
> +    }

I prefer FOR loops when traversing lists -- IMHO they are more elegant
and easier to read...

But I guess that's a matter of taste, so feel free to ignore this remark
:-)

> +  if (*prev == NULL)
> +    return NULL;
> +
> +  req = *prev;
> +  *prev = req->next;
> +  return req;
> +}
> +
> +struct port_info *notify_pi;

Is this actually related to tracing multiple tasks?...

> +/* The list of receive wrappers, but only the ones for the traced tasks. */
> +struct traced_info *receive_right_list;
> +static struct traced_info dummy_wrapper;
>  static struct traced_info *freelist;
> 
>  struct hurd_ihash traced_names
> -  = HURD_IHASH_INITIALIZER (offsetof (struct traced_info, u.send.locp));
> +  = HURD_IHASH_INITIALIZER (offsetof (struct traced_info, u.receive.locp));
>  struct port_class *traced_class;
> +struct port_class *other_class;
>  struct port_bucket *traced_bucket;
>  FILE *ostream;
> 
> @@ -247,42 +372,115 @@ static void print_data (mach_msg_type_name_t type,
>                       const void *data,
>                       mach_msg_type_number_t nelt,
>                       mach_msg_type_number_t eltsize);
> +
>  
>  /*** Mechanics of tracing messages and interposing on ports ***/
> 
> -

As always, please avoid gratuitous whitespace changes.

> -/* Create a new wrapper port and do `ports_get_right' on it.  */
> +/* Create a new wrapper for the receive right.
> + * The wrapper lives until the traced receive right dies. */
>  static struct traced_info *
> -new_send_wrapper (mach_port_t right, mach_port_t *wrapper_right)
> +new_receive_wrapper (mach_port_t right, mach_port_t owner)

Perhaps putting the functions in a different order would have made the
patch less messy...

BTW, Git implements the patience diff algorithm now! The patch becomes
somewhat more readable if you generate it with git diff --patience. I
wonder why it isn't default... (And it isn't even possible to make it
default through the config file, as far as I can see :-( )

Unfortunately, the new_send_wrapper() vs. new_receive_wrapper() stuff is
still messy even with patience diff. I believe the problem is again the
fact that we have a completely different structure in the wrapping
process now, and the old send_wrapper/receive_wrapper naming simply
doesn't reflect this appropriately -- as explaind when talking about the
traced_info structure above.

(The moving of the locp from "sender" to "receiver" in the traced_info
structure probably is indeed also related to this.)

I hope that when using distinct data structures, and function names
better reflecting the new constellation, things will become clearer...

>  {
>    error_t err;
>    struct traced_info *info;
> +  mach_port_t foo;
> 
> -  /* Use a free send-once wrapper port if we have one.  */
> -  if (freelist)
> +  info = malloc (sizeof (*info));
> +  assert (info);

Never assert() on malloc() errors!

malloc() failures are perfectly possible in normal use; they do *not*
indicate program errors. Use normal error reporting for them.

(Note that when building with NDEBUG, assert()s are not even compiled at
all.)

> +  info->forward = right;
> +  info->type = MACH_MSG_TYPE_PORT_RECEIVE;
> +  info->task = owner;
> +  info->portname = UNKNOWN_NAME;
> +  info->receive_right = NULL;
> +  if (owner != UNKNOWN_TASK)
>      {
> -      info = freelist;
> -      freelist = info->u.nextfree;
> +      info->receive_right = receive_right_list;
> +      receive_right_list = info;
>      }
> -  else
> +  info->next = 0;
> +  info->name = 0;
> +
> +  /* Request the dead-name notification,
> +   * so if the receive right is destroyed,
> +   * we can destroy the wrapper. */
> +  err = mach_port_request_notification (mach_task_self (), right,
> +                                     MACH_NOTIFY_DEAD_NAME, 1,
> +                                     notify_pi->port_right,
> +                                     MACH_MSG_TYPE_MAKE_SEND_ONCE, &foo);
> +  assert_perror (err);

Are you sure that mach_port_request_notification() will only fail when
the program messed up, and never in normal use?...

> +  mach_port_deallocate (mach_task_self (), foo);
> +
> +  err = hurd_ihash_add (&traced_names, info->forward, info);
> +  assert_perror (err);

Same here -- and in many other places too...

Admittedly, the original code does the same... Not sure what to do about
this :-(

> +  return info;
> +}
> +
> +static void
> +destroy_receive_wrapper (struct traced_info *info)
> +{
> +  struct traced_info *send_wrapper;
> +  struct traced_info **prev;
> +
> +  mach_port_deallocate (mach_task_self (), info->forward);
> +  /* Remove it from the receive right list. */
> +  prev = &receive_right_list;
> +  if (*prev)
>      {
> -      /* Create a new wrapper port that forwards to *RIGHT.  */
> -      err = ports_create_port (traced_class, traced_bucket,
> -                            sizeof *info, &info);
> -      assert_perror (err);
> -      info->name = 0;
> +      while (*prev != info && *prev)
> +     prev = &((*prev)->receive_right);
> +      /* If we find the receive wrapper in the list. */
> +      if (*prev)
> +     *prev = info->receive_right;
> +    }

The outer IF is pointless -- again, the loop and final IF will handle
the emply list case just fine.

> +
> +  send_wrapper = info->next;
> +  while (send_wrapper)
> +    {
> +      struct traced_info *next = send_wrapper->next;
> +      /* we have to complete destroy the port,
> +       * so the tasks who have the send right can be notified. */
> +      assert (send_wrapper->pi.weakrefcnt == 1);
> +      ports_port_deref_weak (send_wrapper);
> +      assert (send_wrapper->pi.refcnt == 1);
> +      ports_destroy_right (send_wrapper);
> +      send_wrapper = next;
>      }
> 
> -  info->forward = right;
> -  info->type = MACH_MSG_TYPE_MOVE_SEND;
> +  hurd_ihash_locp_remove (&traced_names, info->u.receive.locp);
> +  free (info);
> +}
> 
> -  /* Store it in the reverse-lookup hash table, so we can
> -     look up this same right again to find the wrapper port.
> -     The entry in the hash table holds a weak ref on INFO.  */
> -  err = hurd_ihash_add (&traced_names, info->forward, info);
> +/* Create a new wrapper port and do `ports_get_right' on it.
> + *
> + * The wrapper lives until there is no send right to it,
> + * or the corresponding receive wrapper is destroyed.
> + */
> +static struct traced_info *
> +new_send_wrapper (struct traced_info *receive, task_t task,
> +               mach_port_t *wrapper_right)
> +{
> +  error_t err;
> +  struct traced_info *info;
> +
> +  /* Create a new wrapper port that forwards to *RIGHT.  */
> +  err = ports_create_port (traced_class, traced_bucket,
> +                        sizeof *info, &info);
>    assert_perror (err);
> +  info->name = 0;
> +
> +  assert ((info->pi.flags & PORT_HAS_SENDRIGHTS) == 0);
> +  asprintf (&info->name, "  %d<--%d(pid%d)",
> +         receive->forward, info->pi.port_right, (int) task2pid (task));

I don't think it's really necessary to explicitely cast to int?...

> +  info->forward = 0;
> +  info->type = MACH_MSG_TYPE_MOVE_SEND;
> +  info->task = task;
> +  info->portname = UNKNOWN_NAME;
> +  info->receive_right = receive;
> +  info->next = receive->next;
> +  receive->next = info;
> +  info->u.nextfree = NULL;
> +
>    ports_port_ref_weak (info);
> -  assert (info->u.send.locp != 0);
> 
>    *wrapper_right = ports_get_right (info);
>    ports_port_deref (info);
> @@ -302,6 +500,7 @@ new_send_once_wrapper (mach_port_t right,
> mach_port_t *wrapper_right)

Your mail client wrapped this line (and many more like this), breaking
the patch :-(

>      {
>        info = freelist;
>        freelist = info->u.nextfree;
> +      assert (info->pi.weakrefcnt == 0);
>      }
>    else
>      {
> @@ -314,6 +513,10 @@ new_send_once_wrapper (mach_port_t right,
> mach_port_t *wrapper_right)
> 
>    info->forward = right;
>    info->type = MACH_MSG_TYPE_MOVE_SEND_ONCE;
> +  info->task = UNKNOWN_TASK;
> +  info->portname = UNKNOWN_NAME;
> +  info->receive_right = NULL;
> +  info->next = NULL;
> 
>    /* Send-once rights never compare equal to any other right (even
>       another send-once right), so there is no point in putting them
> @@ -332,6 +535,32 @@ new_send_once_wrapper (mach_port_t right,
> mach_port_t *wrapper_right)
>    return info;
>  }
> 
> +static void
> +traced_clean (void *pi)
> +{

No comment for this function?...

> +  struct traced_info **prev;
> +  struct traced_info *info = pi;
> +
> +  assert (info->type == MACH_MSG_TYPE_MOVE_SEND);
> +  if (info->name)
> +    free (info->name);

The condition is unnecessary -- if the pointer is NULL, free() will
handle it just fine.

> +
> +  if (info->receive_right)
> +    {
> +      /* Remove it from the send right list. */
> +      prev = &info->receive_right->next;
> +      if (*prev)
> +     {
> +       while (*prev != info && *prev)
> +         prev = &((*prev)->next);
> +       assert (*prev);
> +       *prev = info->next;
> +     }

Another case of unnecessary outer IF...

This code is actually incorrect I believe: surely the assert() should
also trigger when the list is empty?

> +
> +      info->next = NULL;
> +      info->receive_right = NULL;
> +    }
> +}
> 
>  /* This gets called when a wrapper port has no hard refs (send rights),
>     only weak refs.  The only weak ref is the one held in the reverse-lookup
> @@ -339,74 +568,253 @@ new_send_once_wrapper (mach_port_t right,
> mach_port_t *wrapper_right)
>  static void
>  traced_dropweak (void *pi)
>  {
> +  struct traced_info **prev;
>    struct traced_info *const info = pi;
> 
>    assert (info->type == MACH_MSG_TYPE_MOVE_SEND);
> -  assert (info->u.send.locp);
> 
> -  /* Remove INFO from the hash table.  */
> -  hurd_ihash_locp_remove (&traced_names, info->u.send.locp);
> +  free (info->name);
> +  info->name = 0;
> +
> +  assert (info->pi.port_right);
> +
> +  /* Remove it from the send right list. */
> +  prev = &info->receive_right->next;
> +  assert (*prev);
> +  if (*prev)

Here, it's even more obvious that the extra IF is pointless: it follows
right after an assert() for the same condition!

> +    {
> +      while (*prev != info && *prev)
> +     prev = &((*prev)->next);
> +      assert (*prev);
> +      *prev = info->next;
> +    }

Haven't I seen this code before?... ;-)

Even if it's very small, it's probably still good to put it in an extra
function -- redundancy is always problematic. Actually the problem shows
in this case, as you introduced a difference in the code, see above...

> +  /* This function is called because the send wrapper doesn't have
> +   * any send right. If we find the all send wrappers are gone,
> +   * it means that our traced port doesn't have send right.
> +   * We notify the owner of the receive right by deallocating
> +   * the forward port. */

I find this comment really hard to understand... The original one in the
old code was clearer, though of course it needs to be adapted.

What you are trying to say is simply, that if this send wrapper was the
last one for the port, we give up the forward port as well, right?

> +  if (info->receive_right->next == NULL)
> +    {
> +      fprintf (ostream, "no senders\n");
> +      destroy_receive_wrapper (info->receive_right);
> +    }
> +
> +  info->next = NULL;
> +  info->receive_right = NULL;
> +
>    ports_port_deref_weak (info);
> +}
> 
> -  /* Deallocate the forward port, so the real port also sees no-senders.  */
> -  mach_port_deallocate (mach_task_self (), info->forward);
> +boolean_t
> +seen_receive_right (task_t task, mach_port_t name)
> +{
> +  struct traced_info *info = receive_right_list;
> +  while (info)
> +    {
> +      if (info->task == task && info->portname == name)
> +     return TRUE;
> +      info = info->next;
> +    }
> +  return FALSE;
> +}
> 
> -  /* There are no rights to this port, so we can reuse it.
> -     Add a hard ref and put INFO on the free list.  */
> -  ports_port_ref (info);
> +/* This function is to find the receive right for the send right 'send'
> + * among traced tasks. I assume that all receive rights are moved
> + * under the control of rpctrace.
> + *
> + * Note: 'send' shouldn't be the send right to the wrapper.
> + *
> + * Note: the traced_info returned from the function
> + * might not be the receive right in the traced tasks.
> + * */
> +struct traced_info *
> +discover_receive_right (mach_port_t send)
> +{

I'm rather sceptical about this whole function. Is it really necessary
to do such a brute-force search for the receive right? Shouldn't we
actually always know about all receive rights in all traced tasks, as we
are tracing their task ports?

BTW, I don't understand the last part of the comment at all...

> +  int i;
> +  error_t err;
> +  struct traced_info *info;
> 
> -  free (info->name);
> -  info->name = 0;
> +  info = hurd_ihash_find (&traced_names, send);
> +  /* If we have seen the send right or send once right. */
> +  if (info
> +      /* If the receive right is in one of traced tasks,
> +       * but we don't know its name
> +       * (probably because the receive name has been moved),
> +       * we need to find it out. */

As I said above, I don't see how this could ever happen...

> +      && !(info->task != UNKNOWN_TASK
> +       && info->portname == UNKNOWN_NAME))
> +    return info;
> +
> +  for (i = 0; i < nb_traced_tasks; i++)
> +    {
> +      int j;
> +      mach_port_t *portnames = NULL;
> +      mach_msg_type_number_t nportnames = 0;
> +      mach_port_type_t *porttypes = NULL;
> +      mach_msg_type_number_t nporttypes = 0;
> +      struct traced_info *receive_wrapper = NULL;
> +      task_t task = traced_tasks[i];
> +      err = mach_port_names (task, &portnames, &nportnames,
> +                          &porttypes, &nporttypes);

Missing blank line between declarations and statements...

> +      if (err)
> +     {
> +       remove_task (task);
> +       continue;
> +     }

Is this really a good way to handle errors here? If a task unexpectedly
disappears (or otherwise becomes unaccessible) without our noticing,
doesn't that mean something went seriously wrong, and we better complain
loudly rather than just silently dropping it?...

> 
> -  info->u.nextfree = freelist;
> -  freelist = info;
> +      for (j = 0; j < nportnames; j++)
> +     {
> +       mach_port_t send_right;
> +       mach_msg_type_name_t type;
> +       if (!(porttypes[j] & MACH_PORT_TYPE_RECEIVE) /* not a receive right */
> +           || (porttypes[j] & MACH_PORT_TYPE_RECEIVE

Checking again for the same condition in the || part is redundant -- it
must be true here, or the first part would have triggered.

> +               && seen_receive_right (task, portnames[j])))
> +         continue;
> +
> +       err = mach_port_extract_right (task, portnames[j],
> +                                      MACH_MSG_TYPE_MAKE_SEND,
> +                                      &send_right, &type);
> +       assert_perror (err);
> +
> +       receive_wrapper = hurd_ihash_find (&traced_names, send_right);
> +       /* If we have seen this send right before. */
> +       if (receive_wrapper)
> +         mach_port_deallocate (mach_task_self (), send_right);
> +       else
> +         receive_wrapper = new_receive_wrapper (send_right, task);

Do I get it right, that this function will create forward ports for any
receive rights it finds along the way, as a side effect?

Seems questionable to me. At least it should be documented.

> +
> +       receive_wrapper->portname = portnames[j];
> +
> +       /* We have found the wrapper for the receive right. */
> +       if (send_right == send)
> +         break;
> +       receive_wrapper = NULL;
> +     }
> +      if (portnames)
> +     vm_deallocate (mach_task_self (), (vm_address_t) portnames,
> +                    nportnames * sizeof (*portnames));
> +      if (porttypes)
> +     vm_deallocate (mach_task_self (), (vm_address_t) porttypes,
> +                    nporttypes * sizeof (*porttypes));
> +
> +      if (receive_wrapper)
> +     return receive_wrapper;
> +    }
> +  return NULL;
>  }
> 
> +/* get_send_wrapper searches for the wrapper
> +   of the send right for the target task.
> +   if it doesn't exist, create a new one. */
> +struct traced_info *get_send_wrapper (struct traced_info *receive_wrapper,
> +                                   mach_port_t task, mach_port_t *right)
> +{
> +  struct traced_info *info = receive_wrapper->next;
> +
> +  while (info)
> +    {
> +      if (info->task == task)
> +     {
> +       *right = ports_get_right (info);
> +       assert (info->pi.refcnt == 1);
> +       return info;
> +     }
> +      info = info->next;
> +    }
> +  /* no send wrapper is found. */
> +  return new_send_wrapper (receive_wrapper, task, right);
> +}
> 
>  /* Rewrite a port right in a message with an appropriate wrapper port.  */
>  static struct traced_info *
> -rewrite_right (mach_port_t *right, mach_msg_type_name_t *type)
> +rewrite_right (mach_port_t *right, mach_msg_type_name_t *type,
> +            struct req_info *req)
>  {
>    error_t err;
> -  struct traced_info *info;
> +  struct traced_info *receive_wrapper;
> +  struct traced_info *send_wrapper;
> +  task_t dest = UNKNOWN_TASK;
> +  task_t source = UNKNOWN_TASK;
> 
>    /* We can never do anything special with a null or dead port right.  */
>    if (!MACH_PORT_VALID (*right))
>      return 0;
> 
> -  switch (*type)
> +  if (req)
>      {
> -    case MACH_MSG_TYPE_PORT_SEND:
> -      /* See if we are already tracing this port.  */
> -      info = hurd_ihash_find (&traced_names, *right);
> -      if (info)
> +      if (req->valid)    /* it's a RPC request */
>       {
> -       /* We are already tracing this port.  We will pass on a right
> -          to our existing wrapper port.  */
> -       *right = ports_get_right (info);
> -       *type = MACH_MSG_TYPE_MAKE_SEND;
> -       return info;
> +       source = req->from;
> +       dest = req->to;
> +     }
> +      else
> +     {
> +       source = req->to;
> +       dest = req->from;
>       }
> +    }
> +
> +  switch (*type)
> +    {
> +    case MACH_MSG_TYPE_PORT_SEND:
> +      assert (req);
> 
>        /* See if this is already one of our own wrapper ports.  */
> -      info = ports_lookup_port (traced_bucket, *right, 0);
> -      if (info)
> +      send_wrapper = ports_lookup_port (traced_bucket, *right, 0);
> +      if (send_wrapper)
>       {
> -       /* This is a send right to one of our own wrapper ports.
> -          Instead, send along the original send right.  */
> +       /* This is a send right to one of our own wrapper ports. */
>         mach_port_deallocate (mach_task_self (), *right); /* eat msg ref */
> -       *right = info->forward;
> -       err = mach_port_mod_refs (mach_task_self (), *right,
> -                                 MACH_PORT_RIGHT_SEND, +1);
> -       assert_perror (err);
> -       ports_port_deref (info);
> -       return info;
> +       /*  If the send right is moved to the task with the receive right,
> +        * translate it.*/

"translate it" is not very specific...

Also, you haven't commented at all on the else case.

> +       assert (send_wrapper->receive_right);
> +       if (dest == send_wrapper->receive_right->task)
> +         {
> +           *right = send_wrapper->receive_right->forward;
> +           err = mach_port_mod_refs (mach_task_self (), *right,
> +                                     MACH_PORT_RIGHT_SEND, +1);
> +           assert_perror (err);
> +           ports_port_deref (send_wrapper);
> +         }
> +       else
> +         {
> +           struct traced_info *send_wrapper2
> +             = get_send_wrapper (send_wrapper->receive_right, dest, right);
> +           ports_port_deref (send_wrapper);
> +           *type = MACH_MSG_TYPE_MAKE_SEND;
> +           send_wrapper = send_wrapper2;
> +         }
> +       return send_wrapper;
>       }
> 
> -      /* We have never seen this port before.  Create a new wrapper port
> -      and replace the right in the message with a right to it.  */
> -      *type = MACH_MSG_TYPE_MAKE_SEND;
> -      return new_send_wrapper (*right, right);
> +      receive_wrapper = discover_receive_right (*right);
> +      if (receive_wrapper == NULL)
> +     {
> +       /* It's unusual to see an unknown send right from a traced task.
> +        * We ignore it. */
> +       if (source != UNKNOWN_TASK)
> +         {
> +           error (0, 0, "get an unknown send right from process %d",
> +                  (int) task2pid (source));
> +           return &dummy_wrapper;
> +         }

"Unusual"? What does that mean?...

Isn't this something that should never happen; and if it does, something
must be seriously screwed? In that case, is it even useful to go on at
all?...

> +       /* the receive right is owned by an unknown task. */
> +       receive_wrapper = new_receive_wrapper (*right, UNKNOWN_TASK);
> +       mach_port_mod_refs (mach_task_self (), *right,
> +                           MACH_PORT_RIGHT_SEND, 1);
> +     }
> +      /* if the send right is moved to the task with the receive right,
> +       * don't do anything. */

Again, you didn't explain the else case anywhere...

> +      if (dest == receive_wrapper->task)
> +     return receive_wrapper;
> +      else
> +     {
> +       assert (*right == receive_wrapper->forward);
> +       mach_port_deallocate (mach_task_self (), *right);
> +       send_wrapper = get_send_wrapper (receive_wrapper, dest, right);
> +       *type = MACH_MSG_TYPE_MAKE_SEND;
> +       return send_wrapper;
> +     }
> 
>      case MACH_MSG_TYPE_PORT_SEND_ONCE:
>        /* There is no way to know if this send-once right is to the same
> @@ -429,49 +837,86 @@ rewrite_right (mach_port_t *right,
> mach_msg_type_name_t *type)
>        to A that were sent through and replaced with our wrapper (B).
>        If not, we create a new receive right.  */

This comment needs to be rewritten... While the basic idea still holds,
the way it is, it does not really fit the situation with multiple tasks.

The explanation about receive rights from the beginning of your mail
would fit here, although it took me a while to understand it... (And the
explanation about send rights would fit for the send rights case above.)

>        {
> -     mach_port_t rr;         /* B */
> -     char *name;
> -
> -     info = hurd_ihash_find (&traced_names, *right);
> -     if (info)
> +     assert (req);
> +     receive_wrapper = hurd_ihash_find (&traced_names, *right);
> +     if (receive_wrapper)

Avoid using different variable names when changing code, unless the old
name really does not make sense anymore... Such renames make the patch
larger and harder to follow.

If you think that renaming is useful, you can do that in an extra
cleanup patch. But most of the time, it's better to avoid it
alltogether, as it breaks diffing between versions before and after the
rename. (And many operations that depend on diffing, like cherry-picking
for example.)

This actually applies not only to variable names, but also other kinds
of non-functional changes...

>         {
> -         /* This is a receive right that we have been tracing sends to.  */

You dropped the old comment, but haven't added a new one for this code
block at all...

> -         name = info->name;
> -         rr = ports_claim_right (info);
> -         /* That released the refs on INFO, so it's been freed now.  */
> +         struct traced_info *send_wrapper2;
> +         char *name;
> +         mach_port_t rr;

Why did you move the declarations of these two variables down here?

Looks like another change that could go in an extra cleanup patch, but
better should not go into a patch that does non-trivial changes...

> +
> +         /* The port has at least one send right - the one
> +          * in receive_wrapper->forward.
> +          * We wrap the port.
> +          * if the source task doesn't have the send right, the port
> +          * will be destroyed after we deallocate the only send right. */

I don't see what this comment refers to...

> +         /* We have to deallocate the send right in
> +          * receive_wrapper->forward before we import the port to
> +          * port_info. So the reference count in the port info will be 1,
> +          * if it doesn't have any other send rights. */
> +         mach_port_deallocate (mach_task_self (), receive_wrapper->forward);

This comment is very hard to understand; I don't think it is helpful at
all :-(

I can only *guess*, from my understanding of the code, that what you are
trying to say is: the task just gave up the receive right, so we won't
be forwarding to it anymore, and thus need to drop the send right for
the forward port.

> +         err = ports_import_port (traced_class, traced_bucket,
> +                                  *right, sizeof *send_wrapper,
> +                                  &send_wrapper);
> +         assert_perror (err);
> +         send_wrapper->forward = 0;
> +         send_wrapper->type = MACH_MSG_TYPE_MOVE_SEND;
> +         send_wrapper->task = source;
> +         send_wrapper->portname = UNKNOWN_NAME;
> +         send_wrapper->name = receive_wrapper->name;
> +         /* Initialize them in case that the source task doesn't
> +          * have the send right to the port, and the port will
> +          * be destroyed immediately. */
> +         send_wrapper->receive_right = NULL;
> +         send_wrapper->next = NULL;
> +         ports_port_ref_weak (send_wrapper);
> +         ports_port_deref (send_wrapper);
> +
> +         hurd_ihash_locp_remove (&traced_names, 
> receive_wrapper->u.receive.locp);
> +
> +         send_wrapper2 = get_send_wrapper (receive_wrapper, dest, &rr);
> +         assert (send_wrapper2->pi.refcnt == 1);
> +         name = send_wrapper2->name;
> +         send_wrapper2->name = NULL;
> +         /* The port has been removed from the send wrapper,
> +          * so we cannot put it in the free list. */

I don't understand this comment at all :-(

> +         ports_port_deref_weak (send_wrapper2);
> +         rr = ports_claim_right (send_wrapper2);
> +         /* Get us a send right that we will forward on.  */
> +         err = mach_port_insert_right (mach_task_self (), rr, rr,
> +                                       MACH_MSG_TYPE_MAKE_SEND);
> +         assert_perror (err);
> +         receive_wrapper->forward = rr;
> +         receive_wrapper->task = dest;
> +         if (dest != UNKNOWN_TASK)
> +           {
> +             receive_wrapper->receive_right = receive_right_list;
> +             receive_right_list = receive_wrapper;
> +           }
> +         /* the port name will be discovered
> +          * when we search for this receive right. */
> +         receive_wrapper->portname = UNKNOWN_NAME;
> +         receive_wrapper->name = name;
> +
> +         send_wrapper->receive_right = receive_wrapper;
> +         send_wrapper->next = receive_wrapper->next;
> +         receive_wrapper->next = send_wrapper;
> +
> +         err = hurd_ihash_add (&traced_names, receive_wrapper->forward,
> +                               receive_wrapper);
> +         assert_perror (err);
> +         *right = rr;
>         }
>       else
>         {
> -         /* This is a port we know nothing about.  */
> -         rr = mach_reply_port ();
> -         name = 0;
> +         /* weird? no send right for the port. */

Again, is that something that could happen in normal operation?...

> +         err = mach_port_insert_right (mach_task_self (), *right, *right,
> +                                       MACH_MSG_TYPE_MAKE_SEND);
> +         assert_perror (err);
> +         receive_wrapper = new_receive_wrapper (*right, dest);
>         }
> 
> -     /* Create a new wrapper object that receives on this port.  */
> -     err = ports_import_port (traced_class, traced_bucket,
> -                              *right, sizeof *info, &info);
> -     assert_perror (err);
> -     info->name = name;
> -     info->type = MACH_MSG_TYPE_MOVE_SEND; /* XXX ? */
> -
> -     /* Get us a send right that we will forward on.  */
> -     err = mach_port_insert_right (mach_task_self (), rr, rr,
> -                                   MACH_MSG_TYPE_MAKE_SEND);
> -     assert_perror (err);
> -     info->forward = rr;
> -
> -     err = hurd_ihash_add (&traced_names, info->forward, info);
> -     assert_perror (err);
> -     ports_port_ref_weak (info);
> -
> -     /* If there are no extant send rights to this port, then INFO will
> -        die right here and release its send right to RR.
> -        XXX what to do?
> -     */
> -     ports_port_deref (info);
> -
> -     *right = rr;
> -     return info;
> +     return receive_wrapper;
>        }
> 
>      default:
> @@ -482,7 +927,7 @@ rewrite_right (mach_port_t *right,
> mach_msg_type_name_t *type)
> 
>  static void
>  print_contents (mach_msg_header_t *inp,
> -             void *msg_buf_ptr)
> +             void *msg_buf_ptr, struct req_info *req)
>  {
>    error_t err;
> 
> @@ -523,7 +968,7 @@ print_contents (mach_msg_header_t *inp,
>        else
>       msg_buf_ptr += ((nelt * eltsize + sizeof(natural_t) - 1)
>                       & ~(sizeof(natural_t) - 1));
> -
> +

Something very strage must have happened here... Perhaps you
inadvertantly added a space, but your mail client ate it when sending
the patch?

>        if (first)
>       first = 0;
>        else
> @@ -549,20 +994,58 @@ print_contents (mach_msg_header_t *inp,
>         poly = 0;
>         for (i = 0; i < nelt; ++i)
>           {
> +           mach_port_t orig_port = portnames[i];
>             newtypes[i] = name;
> 
> -           if (inp->msgh_id == 3215) /* mach_port_insert_right */
> +           ti = rewrite_right (&portnames[i], &newtypes[i], req);
> +
> +           if (inp->msgh_id == 2161 /* the reply message for thread_create */
> +               || inp->msgh_id == 2107 /* for task_create */)

I don't like constructs of the form "if (foo == a || foo == b) { if (foo
== a) {} else {} }"... It's much clearer to use "switch (foo) {}" or "if
(foo == a) {} else if (foo == b) {}" IMHO.

But I guess this is also a matter of taste...

>               {
> -               /* XXX
> -                */
> -               fprintf (ostream,
> -                        "\t\t[%d] = pass through port %d, type %d\n",
> -                        i, portnames[i], name);
> -               continue;
> +               if (inp->msgh_id == 2161)
> +                 {
> +                   err = mach_port_insert_right (mach_task_self (),
> +                                                 portnames[i], portnames[i],
> +                                                 MACH_MSG_TYPE_MAKE_SEND);
> +                   assert_perror (err);
> +                   err = thread_set_kernel_port (orig_port, portnames[i]);
> +                   assert_perror (err);
> +                   if (ti->name)
> +                     free (ti->name);
> +                   asprintf (&ti->name, "thread%d(pid%d)",
> +                             (int) orig_port, (int) task2pid (req->from));
> +                 }
> +               else
> +                 {
> +                   pid_t pid;
> +                   struct traced_info *receive_wrapper;
> +                   struct traced_info *task_wrapper;
> +                   mach_port_t pseudo_task;
> +                   add_task (orig_port);

Missing blank line...

> +                   receive_wrapper = hurd_ihash_find (&traced_names, 
> orig_port);
> +                   assert (receive_wrapper == ti->receive_right);
> +                   task_wrapper = new_send_wrapper (receive_wrapper,
> +                                                    orig_port, &pseudo_task);
> +                   err = mach_port_insert_right (mach_task_self (),
> +                                                 pseudo_task, pseudo_task,
> +                                                 MACH_MSG_TYPE_MAKE_SEND);
> +                   assert_perror (err);
> +                   err = task_set_kernel_port (orig_port, pseudo_task);
> +                   assert_perror (err);
> +
> +                   pid = task2pid (orig_port);
> +                   if (ti->name)
> +                     free (ti->name);

Unnecessary IF...

> +                   asprintf (&ti->name, "pid%d(pid%d)",
> +                             (int) pid, (int) task2pid (req->from));
> +                   free (task_wrapper->name);
> +                   asprintf (&task_wrapper->name, "pid%d(pid%d)",
> +                             (int) pid, (int) pid);
> +                 }
> +
> +               mach_port_deallocate (mach_task_self (), portnames[i]);
>               }
> 
> -           ti = rewrite_right (&portnames[i], &newtypes[i]);
> -
>             putc ((i == 0 && nelt > 1) ? '{' : ' ', ostream);
> 
>             if (portnames[i] == MACH_PORT_NULL)
> @@ -632,19 +1115,24 @@ print_contents (mach_msg_header_t *inp,
>               type->msgt_name = name;
>           }
>         else if (newtypes[0] != name)
> -         if (type->msgt_longform)
> -           lt->msgtl_name = newtypes[0];
> -         else
> -           type->msgt_name = newtypes[0];
> +         {
> +           if (type->msgt_longform)
> +             lt->msgtl_name = newtypes[0];
> +           else
> +             type->msgt_name = newtypes[0];
> +         }

Putting it in braces is definitely preferable, and in fact recommended
by the GNU Coding Standards... However, like all cleanups, this should
go in an extra patch -- it is totally unrelated to the other changes.

>       }
>        else
>       print_data (name, data, nelt, eltsize);
> +      fflush (ostream);

I guess this is not related to tracing multiple tasks, but rather a
general fix?...

>      }
>  }
> 
>  int
>  trace_and_forward (mach_msg_header_t *inp, mach_msg_header_t *outp)
>  {
> +  mach_port_t reply_port;
> +
>    const mach_msg_type_t RetCodeType =
>    {
>      MACH_MSG_TYPE_INTEGER_32,        /* msgt_name = */
> @@ -671,19 +1159,22 @@ trace_and_forward (mach_msg_header_t *inp,
> mach_msg_header_t *outp)
>       with a send-once right, even if there have never really been any.  */
>    if (MACH_MSGH_BITS_LOCAL (inp->msgh_bits) == MACH_MSG_TYPE_MOVE_SEND_ONCE)
>      {
> -      if (inp->msgh_id == MACH_NOTIFY_DEAD_NAME)
> +      if (inp->msgh_id == MACH_NOTIFY_DEAD_NAME && info == (void *) 
> notify_pi)
>       {
> -       /* If INFO is a send-once wrapper, this could be a forged
> -          notification; oh well.  XXX */
> -
> +       struct traced_info *receive_wrapper;
>         const mach_dead_name_notification_t *const n = (void *) inp;
> 
> -       assert (n->not_port == info->forward);
>         /* Deallocate extra ref allocated by the notification.  */
>         mach_port_deallocate (mach_task_self (), n->not_port);
> -       ports_destroy_right (info);
> -       ports_port_deref (info);
> +       receive_wrapper = hurd_ihash_find (&traced_names, n->not_port);
> +       /* The receive wrapper might have been destroyed. */

How?

Also, this comment is rather confusing: it looks like it would explain
the purpose of the whole IF block, while actually it only explains why
the block needs to be conditional...

> +       if (receive_wrapper)
> +         {
> +           assert (n->not_port == receive_wrapper->forward);
> +           destroy_receive_wrapper (receive_wrapper);
> +         }
>         ((mig_reply_header_t *) outp)->RetCode = MIG_NO_REPLY;
> +       ports_port_deref (info);
>         return 1;
>       }
>        else if (inp->msgh_id == MACH_NOTIFY_NO_SENDERS
> @@ -698,43 +1189,109 @@ trace_and_forward (mach_msg_header_t *inp,
> mach_msg_header_t *outp)
>         ((mig_reply_header_t *) outp)->RetCode = MIG_NO_REPLY;
>         return 1;
>       }
> +      /* Get some unexpected notification for rpctrace itself,
> +       * TODO ignore them for now. */
> +      else if (info == (void *) notify_pi)
> +     {
> +       ports_port_deref (info);
> +       ((mig_reply_header_t *) outp)->RetCode = MIG_NO_REPLY;
> +       return 1;
> +     }
>      }
> 
> +  assert (info != (void *) notify_pi);
>    assert (MACH_MSGH_BITS_LOCAL (inp->msgh_bits) == info->type);
> 
>    complex = inp->msgh_bits & MACH_MSGH_BITS_COMPLEX;
> 
>    msgid = msgid_info (inp->msgh_id);
> 
> +  /* If it's the request of exec_startup_get_info,
> +   * it means that the traced process starts to run */

What's the purpose of this? I thought all task/thread creation is
intercepted anyways?...

> +  if (inp->msgh_id == 30500)
> +    {
> +      struct traced_info *thread_send_wrapper;
> +      struct traced_info *thread_receive_wrapper;
> +      thread_t *threads = NULL;
> +      size_t nthreads = 0;
> +      err = task_threads (get_task(0), &threads, &nthreads);

Missing blank line...

> +      if (err)
> +     error (1, err, "task_threads");
> +
> +      for (int i = 0; i < nthreads; i++)
> +     {
> +       /* If we hasn't seen the port. */

"Haven't seen the port yet."

Also, it's probably clearer to put the comment immediately before or
even after the IF...

> +       thread_receive_wrapper = hurd_ihash_find (&traced_names, threads[i]);
> +       if (thread_receive_wrapper == NULL)
> +         {
> +           mach_port_t new_thread_port;
> +           thread_receive_wrapper = new_receive_wrapper (threads[i], 
> UNKNOWN_TASK);
> +           thread_send_wrapper = new_send_wrapper (thread_receive_wrapper,
> +                                                   info->task, 
> &new_thread_port);
> +           free (thread_send_wrapper->name);
> +           asprintf (&thread_send_wrapper->name, "thread%d(pid%d)",
> +                     (int) threads[i], (int) task2pid (info->task));
> +     
> +           err = mach_port_insert_right (mach_task_self (),
> +                                         new_thread_port, new_thread_port,
> +                                         MACH_MSG_TYPE_MAKE_SEND);
> +           if (err)
> +             error (0, err, "mach_port_insert_right");

Is it really useful to go on after this operation fails?...

> +
> +           err = thread_set_kernel_port (threads[i], new_thread_port);
> +           if (err)
> +             error (0, err, "thread_set_kernel_port");
> +
> +           mach_port_deallocate (mach_task_self (), new_thread_port);
> +         }
> +     }
> +      munmap (threads, nthreads * sizeof (thread_t));

Using vm_deallocate() is clearer.

> +    }
> +
>    /* Swap the header data like a crossover cable. */
>    {
>      mach_msg_type_name_t this_type = MACH_MSGH_BITS_LOCAL (inp->msgh_bits);
>      mach_msg_type_name_t reply_type = MACH_MSGH_BITS_REMOTE (inp->msgh_bits);
> +
> +    /* Save the original reply port in the RPC request. */
> +    reply_port = inp->msgh_remote_port;
> 
>      inp->msgh_local_port = inp->msgh_remote_port;
>      if (reply_type && msgid_trace_replies (msgid))
>        {
> -     struct traced_info *info;
> -     info = rewrite_right (&inp->msgh_local_port, &reply_type);
> -     assert (info);
> -     if (info->name == 0)
> -       {
> -         if (msgid == 0)
> -           asprintf (&info->name, "reply(%u:%u)",
> -                     (unsigned int) info->pi.port_right,
> -                     (unsigned int) inp->msgh_id);
> -         else
> -           asprintf (&info->name, "reply(%u:%s)",
> -                     (unsigned int) info->pi.port_right, msgid->name);
> -       }
> -     if (info->type == MACH_MSG_TYPE_MOVE_SEND_ONCE)
> +     struct traced_info *reply_pi;

Another gratuitous rename I believe?...

> +     /* It must be send-once right. */
> +     reply_pi = rewrite_right (&inp->msgh_local_port, &reply_type, NULL);
> +     /* The reply port might be dead, e.g., the traced task has died.
> +      * If so, we don't need to trace the reply message. */
> +     if (reply_pi)
>         {
> -         info->u.send_once.sent_to = info->pi.port_right;
> -         info->u.send_once.sent_msgid = inp->msgh_id;
> +         assert (inp->msgh_local_port);
> +         if (reply_pi->name == 0)
> +           {
> +             if (msgid == 0)
> +               asprintf (&reply_pi->name, "reply(%u:%u)",
> +                         (unsigned int) reply_pi->pi.port_right,
> +                         (unsigned int) inp->msgh_id);
> +             else
> +               asprintf (&reply_pi->name, "reply(%u:%s)",
> +                         (unsigned int) reply_pi->pi.port_right, 
> msgid->name);
> +           }
> +         if (reply_pi->type == MACH_MSG_TYPE_MOVE_SEND_ONCE)
> +           {
> +             reply_pi->u.send_once.sent_to = reply_pi->pi.port_right;
> +             reply_pi->u.send_once.sent_msgid = inp->msgh_id;
> +           }
>         }
>        }
> 
> -    inp->msgh_remote_port = info->forward;
> +    if (info->type == MACH_MSG_TYPE_MOVE_SEND_ONCE)
> +      inp->msgh_remote_port = info->forward;
> +    else
> +      {
> +     assert (info->receive_right);
> +     inp->msgh_remote_port = info->receive_right->forward;
> +      }
>      if (this_type == MACH_MSG_TYPE_MOVE_SEND_ONCE)
>        {
>       /* We have a message to forward for a send-once wrapper object.
> @@ -744,6 +1301,7 @@ trace_and_forward (mach_msg_header_t *inp,
> mach_msg_header_t *outp)
>          we are consuming its `forward' right in the message we send.  */
>       free (info->name);
>       info->name = 0;
> +     info->forward = 0;

Is this related to tracing multiple tasks?

>       info->u.nextfree = freelist;
>       freelist = info;
>        }
> @@ -762,22 +1320,57 @@ trace_and_forward (mach_msg_header_t *inp,
> mach_msg_header_t *outp)
>         && info->type == MACH_MSG_TYPE_MOVE_SEND_ONCE
>         && inp->msgh_size >= sizeof (mig_reply_header_t)
>         && (*(int *) &((mig_reply_header_t *) inp)->RetCodeType
> -           == *(int *)&RetCodeType))
> +           == *(int *)&RetCodeType)
> +       /* The notification message is considered as a request. */
> +       && (inp->msgh_id > 72 || inp->msgh_id < 64))
>       {
> +       struct req_info *req = remove_request (inp->msgh_id - 100,
> +                                              inp->msgh_remote_port);
> +       assert (req);
> +       req->valid = FALSE;
>         /* This sure looks like an RPC reply message.  */
>         mig_reply_header_t *rh = (void *) inp;
>         print_reply_header (info, rh);
>         putc (' ', ostream);
> -       print_contents (&rh->Head, rh + 1);
> +       fflush (ostream);
> +       print_contents (&rh->Head, rh + 1, req);
>         putc ('\n', ostream);
> +       free (req);
>       }
>        else
>       {
> +       task_t to = 0;
> +       struct req_info *req;
> +
>         /* Print something about the message header.  */
>         print_request_header (info, inp);
> -       print_contents (inp, inp + 1);
> +       fflush (ostream);
> +       /* If it's a nofication message,
> +        * I hope there is no port right in the message body. */

"hope"?...

Isn't this comment redundant anyways with the one below?

> +       if (inp->msgh_id <= 72 && inp->msgh_id >= 64)
> +         {
> +           assert (info->type == MACH_MSG_TYPE_MOVE_SEND_ONCE);
> +           /* mach_notify_port_destroyed message has a port,
> +            * TODO how do I handle it? */
> +           assert (inp->msgh_id != 69);
> +         }
> +       /* if it's mach_port RPC,
> +        * the port rights in the message will be moved to the target task. */
> +       else if (inp->msgh_id >= 3200 && inp->msgh_id <= 3218)
> +         to = info->receive_right->forward;
> +       else
> +         to = info->receive_right->task;
> +       req = add_request (inp->msgh_id, reply_port, info->task, to);
> +       print_contents (inp, inp + 1, req);
>         if (inp->msgh_local_port == MACH_PORT_NULL) /* simpleroutine */
> -         fprintf (ostream, ");\n");
> +         {
> +           /* If it's a simpleroutine,
> +            * we don't need the request information any more. */
> +           req = remove_request (inp->msgh_id, reply_port);
> +           assert (req);
> +           free (req);
> +           fprintf (ostream, ");\n");
> +         }
>         else
>           /* Leave a partial line that will be finished later.  */
>           fprintf (ostream, ")");
> @@ -1014,8 +1607,6 @@ print_data (mach_msg_type_name_t type,
>  
>  /*** Main program and child startup ***/
> 
> -task_t traced_task;
> -
> 
>  /* Run a child and have it do more or else `execvpe (argv, envp);'.  */
>  pid_t
> @@ -1024,7 +1615,9 @@ traced_spawn (char **argv, char **envp)
>    error_t err;
>    pid_t pid;
>    mach_port_t task_wrapper;
> +  task_t traced_task;
>    struct traced_info *ti;
> +  struct traced_info *receive_ti;
>    file_t file = file_name_path_lookup (argv[0], getenv ("PATH"),
>                                      O_EXEC, 0, 0);
> 
> @@ -1038,6 +1631,7 @@ traced_spawn (char **argv, char **envp)
>                    0, &traced_task);
>    assert_perror (err);
> 
> +  add_task (traced_task);
>    /* Declare the new task to be our child.  This is what a fork does.  */
>    err = proc_child (getproc (), traced_task);
>    if (err)
> @@ -1046,9 +1640,12 @@ traced_spawn (char **argv, char **envp)
>    if (pid < 0)
>      error (2, errno, "task2pid");
> 
> +  receive_ti = new_receive_wrapper (traced_task, UNKNOWN_TASK);
>    /* Create a trace wrapper for the task port.  */
> -  ti = new_send_wrapper (traced_task, &task_wrapper);/* consumes ref */
> -  asprintf (&ti->name, "task%d", (int) pid);
> +  ti = new_send_wrapper (receive_ti, traced_task, &task_wrapper);

Why did you drop the original comment from this line?...

> +  ti->task = traced_task;
> +  free (ti->name);
> +  asprintf (&ti->name, "pid%d(pid%d)", (int) pid, (int) pid);
> 
>    /* Replace the task's kernel port with the wrapper.  When this task calls
>       `mach_task_self ()', it will get our wrapper send right instead of its
> @@ -1124,6 +1721,7 @@ main (int argc, char **argv, char **envp)
>    bool nostdinc = FALSE;
>    const char *outfile = 0;
>    char **cmd_argv = 0;
> +  error_t err;
> 
>    /* Parse our options...  */
>    error_t parse_opt (int key, char *arg, struct argp_state *state)
> @@ -1196,7 +1794,11 @@ main (int argc, char **argv, char **envp)
>    setlinebuf (ostream);
> 
>    traced_bucket = ports_create_bucket ();
> -  traced_class = ports_create_class (0, &traced_dropweak);
> +  traced_class = ports_create_class (&traced_clean, &traced_dropweak);
> +  other_class = ports_create_class (0, 0);
> +  err = ports_create_port (other_class, traced_bucket,
> +                        sizeof (*notify_pi), &notify_pi);
> +  assert_perror (err);
> 
>    hurd_ihash_set_cleanup (&msgid_ihash, msgid_ihash_cleanup, 0);
> 
> @@ -1222,6 +1824,8 @@ main (int argc, char **argv, char **envp)
>      else
>        fprintf (ostream, "Child %d %s\n", pid, strsignal (WTERMSIG (status)));
>    }
> +
> +  ports_destroy_right (notify_pi);
> 
>    return 0;
>  }

BTW, have you checked that no changes are required to the
documentation?...

-antrik-




reply via email to

[Prev in Thread] Current Thread [Next in Thread]