Re: Hurd lecture


From: Brent W. Baccala
Subject: Re: Hurd lecture
Date: Fri, 19 Jan 2018 12:42:09 -0500

On Fri, Jan 19, 2018 at 7:35 AM, Ricardo Wurmus <rekado@elephly.net> wrote:

> Hi Brent,
>
> > I put a screencast of the lecture on youtube:
> >
> > https://www.youtube.com/watch?v=JwsuAEF2FYE
>
> thank you.  This was very interesting.  The introduction to Mach IPC and
> memory management was especially good.  I wonder if a shorter variant of
> this part of the lecture could be used by new contributors as an
> alternative to reading the Mach kernel postscript books.

That's a good idea, and I can leverage the work I've already done with the graphics.  I'll have to think about what we might want in a second video that isn't in the first one.  Any suggestions?

> Personally, I’m very interested in a Single System Image Hurd cluster; I
> still have a bunch of unused Sun cluster nodes with x86_64 CPUs, but
> sadly there is no high-speed network to connect them all (just regular
> old 1G network cards).
>
> In your experience, is high-speed network very important or are there
> ways to make it unlikely that memory has to be transferred across nodes?

My experience is that we're nowhere close to 1 Gbps vs 40 Gbps being an issue.

My primary test environment is a virtual machine that talks to itself as the "remote", and the performance problems there are severe enough that running remote programs results in a delay that is noticeable to the human running the program.

Actually, we're barely even at that point.  Stock Hurd can't even execute remote programs.  You either need to patch the exec server so that it reads binaries and shared libraries instead of memory-mapping them, or use the multi-client libpager that I've been working on for the past six months, which is still failing some test cases.

See, for example, http://lists.gnu.org/archive/html/bug-hurd/2016-08/msg00099.html, or do a reverse chronological search on bug-hurd for "libpager".
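
To make the "reads instead of memory-maps" idea a little more concrete, here is a rough POSIX-flavored sketch of the difference.  The real exec server goes through Mach and Hurd RPCs rather than these libc calls, so treat this purely as an analogy for why reading the image up front takes the file's pager out of the fault path:

#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the image: every later page fault goes back to the file's pager,
   which is exactly what hurts when that pager lives on another node.  */
static void *
load_by_mapping (int fd, size_t size)
{
  return mmap (NULL, size, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
}

/* Read the image: one up-front copy into anonymous memory; after this
   the file server is no longer involved in servicing faults.  */
static void *
load_by_reading (int fd, size_t size)
{
  void *buf = mmap (NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED)
    return NULL;
  for (size_t done = 0; done < size; )
    {
      ssize_t n = pread (fd, (char *) buf + done, size - done, done);
      if (n <= 0)
        {
          munmap (buf, size);
          return NULL;
        }
      done += n;
    }
  if (mprotect (buf, size, PROT_READ | PROT_EXEC) < 0)
    {
      munmap (buf, size);
      return NULL;
    }
  return buf;
}

The read path pays for an extra copy of the whole binary at exec time, which is the trade-off the exec-server patch makes to avoid depending on a remote pager at fault time.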

At the very end of that video lecture, I suggested how we might address the performance problems in "netmsg", after we've got a multi-client libpager, and after we've got a 64-bit user space, and after we've got SMP support.  (Who cares about a cluster where you can only use 4 GB and one core on each node?)

First, netmsg needs to be rewritten so that it doesn't serialize all the Mach traffic over a single TCP session.  That's probably its biggest bottleneck.
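
Just to sketch what "not a single TCP session" might look like (hypothetical, not code from netmsg; POOL_SIZE, NETMSG_PORT, and these helpers are made up for the example): keep a small pool of connections to each peer and hash each Mach port name onto one of them, so messages for any one port stay ordered on a single stream, while a bulk transfer on one port no longer head-of-line blocks RPCs on every other port:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

#define POOL_SIZE   8        /* number of parallel TCP sessions per peer */
#define NETMSG_PORT 2345     /* invented port number, just for the example */

static int conn_pool[POOL_SIZE];

/* Open POOL_SIZE connections to the remote node instead of one.  */
static int
pool_init (struct in_addr peer)
{
  for (int i = 0; i < POOL_SIZE; i++)
    {
      struct sockaddr_in addr = { .sin_family = AF_INET,
                                  .sin_port = htons (NETMSG_PORT),
                                  .sin_addr = peer };
      int one = 1;
      conn_pool[i] = socket (AF_INET, SOCK_STREAM, 0);
      if (conn_pool[i] < 0
          || setsockopt (conn_pool[i], IPPROTO_TCP, TCP_NODELAY,
                         &one, sizeof one) < 0
          || connect (conn_pool[i], (struct sockaddr *) &addr,
                      sizeof addr) < 0)
        return -1;
    }
  return 0;
}

/* Pick a connection by hashing the port name (a mach_port_t on the real
   system), so each port keeps its ordering on one stream while different
   ports can make progress in parallel.  */
static int
conn_for_port (uint32_t port_name)
{
  return conn_pool[(port_name * 2654435761u) % POOL_SIZE];
}

Since Mach delivers messages to a given port in order, pinning each port to one stream preserves that property while still letting different ports ride different connections.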

After that, I think we should move our networking drivers into a shared library, so that netmsg can access the PCI device directly.  That would avoid a lot of context switches, like netmsg <-> TCP/IP stack <-> network driver.

And it's not as crazy as it might first seem.  Networking devices are increasingly virtualized.  Not only can you just fiddle a few software options to add a virtual PCI device to a virtual machine, but Cisco's vNIC cards can present themselves as anywhere from 1 to 16 PCI devices.  So fiddle a few options in your management console, and even your hardware can make a new PCI device appear out of thin air.

So, it makes sense to allocate an entire PCI device to netmsg, and let all your inter-node Mach traffic run over one interface, while your normal TCP/IP stack has a separate interface.  Of course, we also need to support configurations where you can't do that, but I think raw PCI access is going to be the way to go if you really want performance.
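
For flavor, here is roughly what raw PCI access from user space looks like on Linux via sysfs.  The Hurd would get there through its own PCI arbitration rather than these paths, so this is only an illustration of the payoff: once the BAR is mapped, the driver pokes the NIC's registers with ordinary loads and stores in its own address space, with no hop through a separate TCP/IP stack or driver for each packet:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

int
main (void)
{
  /* "0000:03:00.0" is a made-up bus address for the example.  */
  int fd = open ("/sys/bus/pci/devices/0000:03:00.0/resource0", O_RDWR);
  if (fd < 0)
    return 1;

  /* Map the device's first memory BAR (assumed here to be 4 KiB of
     registers) straight into this process's address space.  */
  volatile uint32_t *regs =
    mmap (NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (regs == MAP_FAILED)
    {
      close (fd);
      return 1;
    }

  uint32_t id = regs[0];        /* read a device register directly */
  (void) id;

  munmap ((void *) regs, 4096);
  close (fd);
  return 0;
}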

Those are my current thoughts on Hurd cluster performance.  In short, priorities are:

1. multi-client libpager (nearly usable)
2. 64-bit user space
3. SMP
4. rewrite netmsg to avoid TCP serialization (and other issues)
5. raw PCI device access

    agape
    brent

