Aloha -
I've run into a real kernel-level problem with 'netmsg'.
It's related to the libpager issue. The problem arises when a process gets an unworkable memory object and tries to vm_map it. This causes the mach_msg() that sent the vm_map to block indefinitely, even though I've specified MACH_SEND_TIMEOUT with a zero timeout.
More specifically, the process in question is the exec server. It gets a memory object from the file server to read a file, then uses it to map the file into a remote task. This causes a vm_map to go across the network connection. The kernel, upon receiving the vm_map, sends a memory_object_init message, and then blocks waiting for the reply.
The block occurs in vm_object_copy_strategically(), which is labeled in its comments "[t]his operation may block". Almost the first thing it does is wait for the memory object to become ready.
In our case, libpager already has a different client, so the memory object never becomes ready.
The big problem, as I see it, is that mach_msg() is blocking, and that hangs my entire thread. It seems to me that low-level RPC operations like vm_map can't be allowed to block; otherwise they defeat the purpose of MACH_SEND_TIMEOUT. So vm_map() should record the mapping and then return, putting the copy operation on some kind of queue. I guess.
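To make the "record the mapping and queue the copy" idea concrete, here is a rough user-space model in C. It is only a sketch of the shape I have in mind, not kernel code: every name in it (mem_object, mapping, pending_queue, map_request, service_pending) is made up for illustration and does not exist in Mach, and the fixed-size queue is an arbitrary simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: none of these names exist in Mach. */
#define MAX_PENDING 16

/* Stand-in for a memory object; 'ready' models "the pager has answered
 * memory_object_init". */
struct mem_object {
    bool ready;
};

/* Stand-in for a recorded mapping whose copy may still be outstanding. */
struct mapping {
    struct mem_object *obj;
    bool copied;
};

/* Fixed-size queue of mappings still waiting on their memory object. */
struct pending_queue {
    struct mapping *items[MAX_PENDING];
    size_t count;
};

/* Record the mapping and return at once. This never waits on obj->ready,
 * so the caller cannot be hung by an object that never becomes ready.
 * Returns false only if the queue is full. */
bool map_request(struct pending_queue *q, struct mapping *m,
                 struct mem_object *obj)
{
    if (q->count == MAX_PENDING)
        return false;
    m->obj = obj;
    m->copied = false;
    q->items[q->count++] = m;
    return true;
}

/* Meant to run later, away from the requesting thread: complete the copy
 * for every queued mapping whose object is ready, and keep the rest
 * queued. Returns the number of copies completed on this pass. */
size_t service_pending(struct pending_queue *q)
{
    size_t done = 0;
    size_t kept = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct mapping *m = q->items[i];
        if (m->obj->ready) {
            m->copied = true;        /* the real copy would happen here */
            done++;
        } else {
            q->items[kept++] = m;    /* still not ready, leave it queued */
        }
    }
    q->count = kept;
    return done;
}
```

In this model a mapping against an object that never becomes ready simply stays on the queue, and map_request() still returns right away. What the real kernel should do with such an entry (time it out, fail the mapping, and so on) is left open here.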
Any thoughts on how to resolve this?
agape
brent