qemu-devel

Re: socket.c added support for unix domain socket datagram transport


From: Stefano Brivio
Subject: Re: socket.c added support for unix domain socket datagram transport
Date: Tue, 27 Apr 2021 23:51:52 +0200

On Mon, 26 Apr 2021 13:14:48 +0200
Ralph Schmieder <ralph.schmieder@gmail.com> wrote:

> > On Apr 23, 2021, at 18:39, Stefano Brivio <sbrivio@redhat.com>
> > wrote:
> > 
> > [...]
> >
> > Okay, so it doesn't seem to fit your case, but this specific point
> > is where you actually have a small advantage using a stream-oriented
> > socket. If you receive a packet and have a smaller receive buffer,
> > you can read the length of the packet from the vnet header and then
> > read the rest of the packet at a later time.
> > 
> > With a datagram-oriented socket, you would need to know the maximum
> > packet size in advance, and use a receive buffer that's large
> > enough to contain it, because if you don't, you'll discard data.  
> 
> For me, the maximum packet size is a jumbo frame (e.g. 9x1024) anyway
> -- everything must fit into an atomic write of that size.

Well, the day you want to do some batching... ;) but sure, I see your
point.
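
To make the difference concrete, here is a minimal Python sketch (not
QEMU code). It assumes a hypothetical 4-byte big-endian length prefix
as a stand-in for the length you would read from the vnet header. The
function names and the 16-byte "small" buffer are made up for
illustration.

```python
import socket
import struct

JUMBO = 9 * 1024  # maximum frame size mentioned in this thread


def datagram_truncation(small_buf=16):
    """Send two datagrams, read the first with a buffer that is too small."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        a.send(b"x" * 100)
        a.send(b"next")
        first = b.recv(small_buf)  # excess bytes of this datagram are dropped
        second = b.recv(JUMBO)     # this is already the following datagram
        return first, second
    finally:
        a.close()
        b.close()


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the stream")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def stream_framed(payload, small_buf=16):
    """Send one length-prefixed frame, then read it back in small pieces."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Hypothetical framing: 4-byte length, then the payload.
        a.sendall(struct.pack("!I", len(payload)) + payload)
        (length,) = struct.unpack("!I", recv_exact(b, 4))
        data = b""
        while len(data) < length:
            # The remainder stays queued in the socket until we ask for it.
            data += recv_exact(b, min(small_buf, length - len(data)))
        return data
    finally:
        a.close()
        b.close()
```

With the datagram pair, the first recv() returns only the first 16
bytes and the remaining 84 are gone. With the stream pair, the same
16-byte reads eventually collect the whole frame.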

> > [...]
> > 
> > On a side note, I wonder why you need two named sockets instead of
> > one -- I mean, they're bidirectional...  
> 
> Hmm... each peer needs to send unsolicited frames/packets to the
> other end... and thus needs to bind to their socket.  Pretty much for
> the same reason as the UDP transport requires you to specify a local
> and a remote 5-tuple.  Even though for AF_INET, the local port does
> not have to be specified, the OS would assign an ephemeral port to
> make it unique. Am I missing something?

I see your point now. Well, I think it's different from the AF_INET case
due to the way AF_UNIX works: UNIX domain sockets don't necessarily
need to make the endpoint known or visible, see a more detailed
explanation at:
https://comp.unix.admin.narkive.com/AhAOKP1s/lsof-find-both-endpoints-of-a-unix-socket

Nowadays on Linux, though:

$ nc -luU my_path & (sleep 1; nc.openbsd -uU my_path & lsof +E -aUc nc)
[1] 373285
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        373285 sbrivio    3u  unix 0x000000004076431a      0t0 3957568 my_path type=DGRAM ->INO=3956394 373288,nc.openbs,4u
nc.openbs 373288 sbrivio    4u  unix 0x00000000f5b2e2e1      0t0 3956394 /tmp/nc.XXXXC0whUu type=DGRAM ->INO=3957568 373285,nc,3u

for datagram sockets, the endpoint is exported, and lsof can report the
endpoint for "my_path" here (-luU binds to a UNIX domain datagram
socket, -uU connects to it). With a stream socket, by the way:

$ nc -lU my_path & (sleep 1; nc.openbsd -U my_path & lsof +E -aUc nc)
[1] 375445
COMMAND      PID    USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
nc        375445 sbrivio    3u  unix 0x0000000053abf57c      0t0 3969787 my_path type=STREAM
nc        375445 sbrivio    4u  unix 0x000000001960c1ef      0t0 3969788 my_path type=STREAM ->INO=3970624 375448,nc.openbs,3u
nc.openbs 375448 sbrivio    3u  unix 0x000000000538fa63      0t0 3970624 type=STREAM ->INO=3969788 375445,nc,4u

so I think the second named socket should be optional. Even with
datagram sockets, as in the first example above (I'm not suggesting
that you do this, it's just another possible choice), only one peer
needs to bind to a named socket, and yet they can exchange data.
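
As a rough sketch of that datagram case in Python (again, not QEMU
code): the user picks only one path, and the connecting side binds to
a throwaway name so that replies can reach it, which is what the
/tmp/nc.XXXX... entry in the lsof output above suggests nc.openbsd
does. The function name and the "client.tmp" file name are
hypothetical.

```python
import os
import socket
import tempfile

BUF = 9 * 1024  # large enough for the jumbo frames discussed in this thread


def one_named_path_exchange():
    """Exchange datagrams both ways while the user names only one socket."""
    with tempfile.TemporaryDirectory() as tmp:
        server_path = os.path.join(tmp, "my_path")     # the only user-chosen name
        client_path = os.path.join(tmp, "client.tmp")  # throwaway reply address

        server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            server.bind(server_path)
            client.bind(client_path)      # gives the server something to reply to
            client.connect(server_path)   # fixes the destination for send()

            client.send(b"ping")
            data, peer = server.recvfrom(BUF)  # peer is the client's bound path
            server.sendto(b"pong", peer)       # unsolicited traffic can go this way
            reply = client.recv(BUF)
            return data, reply, peer
        finally:
            server.close()
            client.close()
```

The listening side learns the peer address from recvfrom(), so it
never needs a second path configured in advance.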

> Another thing: on Windows, there's an AF_UNIX/SOCK_STREAM
> implementation... So, technically it should be possible to use that
> code path on Windows, too.  Not a Windows guy, though... So, can't
> say whether it would simply work or not:
> 
> https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/

Thanks for the pointer. I can't test this, so I wouldn't remove that
#ifndef, but perhaps I could add a link to this, in case somebody needs
it and stumbles upon this code path.

-- 
Stefano



