Re: O_DIRECT "packet mode" pipes on Linux

From: Vito Caputo
Subject: Re: O_DIRECT "packet mode" pipes on Linux
Date: Wed, 23 Sep 2020 23:18:36 -0700

On Thu, Sep 24, 2020 at 12:48:14PM +0700, Robert Elz wrote:
>     Date:        Wed, 23 Sep 2020 21:47:10 -0700
>     From:        Vito Caputo <vcaputo@pengaru.com>
>     Message-ID:  <20200924044710.xpltp22bpxoxieos@shells.gnugeneration.com>
>   | It's useful if you're doing something like say, aggregating data from
>   | multiple piped sources into a single bytestream.  With the default
>   | pipe behavior, you'd have the output interleaved at random boundaries.
> If that's happening, then either the pipe implementation is badly broken,
> or the applications using it aren't doing what you'd like them to do.
> Writes (<= the pipe buffer size) have always (since ancient unix, probably
> since pipes were first created) been atomic - nothing will randomly split
> the data.
> What the new option is offering (as best I can tell from the discussion
> here, I am not a linux user) is passing those boundaries through the pipe
> to the reader - that hasn't been a pipe feature, but it is exactly what a
> unix domain datagram socket provides (these days pipes are sometimes
> implemented using unix domain connection oriented sockets ... I'm guessing
> that the option simply changes the transport protocol used with an
> implementation that works that way).

Apparently I was incomplete in describing my conjured example.

The aggregator in this case is a process connected to multiple pipes,
not a pipe with multiple writer processes.

What you describe is correct WRT multiple writers to a shared pipe.

In my example, the aggregator can trivially read the separate records
at the write boundaries from each of the connected packetized pipes.
The reads return at the write boundaries.  Without packetized pipes
you'd need to parse the contents to search for record boundaries.
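To make the difference concrete, here is a minimal sketch, assuming Linux 3.4 or later and Python's os.pipe2().  The helper name drain_after_writes is made up for illustration.  It queues a few small writes in a pipe and then records what each read() returns, once with O_DIRECT and once without:

```python
import os


def drain_after_writes(chunks, packet_mode, bufsize=4096):
    """Queue every chunk in a fresh pipe, then list what each read() returns.

    Hypothetical helper for illustration; Linux-only because of O_DIRECT pipes.
    """
    flags = os.O_DIRECT if packet_mode else 0
    r, w = os.pipe2(flags)
    try:
        for chunk in chunks:
            # Each write is far below PIPE_BUF, so it is written whole.
            os.write(w, chunk)
        os.close(w)
        w = -1
        reads = []
        while True:
            data = os.read(r, bufsize)
            if not data:  # EOF: the write end is closed and the pipe is empty
                return reads
            reads.append(data)
    finally:
        os.close(r)
        if w != -1:
            os.close(w)


if __name__ == "__main__":
    records = [b"alpha\n", b"beta\n"]
    print("packet mode:", drain_after_writes(records, packet_mode=True))
    print("byte stream:", drain_after_writes(records, packet_mode=False))
```

In packet mode each read should hand back exactly one of the original writes; on the ordinary pipe the same two writes come back merged in a single read.  Note that in packet mode a read buffer smaller than the packet discards the remainder of that packet, so the reader's buffer should be at least PIPE_BUF bytes.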

Imagine it's like an inverted `tee` for input instead of output.
Without packetized pipes, this hypothetical program couldn't
interleave the collected inputs at record boundaries without parsing
the contents.  Presumably this is *why* we don't already have an input
version of `tee`.  I'd like to work towards changing that.
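A rough sketch of what such an input-side `tee` might look like, assuming every source pipe was created with O_DIRECT and every record fits in PIPE_BUF (4096 bytes on Linux).  The function name aggregate and the emit callback are invented for this example; this is not an existing utility:

```python
import os
import selectors


def aggregate(read_fds, emit, bufsize=4096):
    """Forward one record per read() from several packet-mode pipes.

    Hypothetical sketch: read_fds are read ends of O_DIRECT pipes and
    emit is any callable taking the bytes of a single record.
    """
    sel = selectors.DefaultSelector()
    for fd in read_fds:
        sel.register(fd, selectors.EVENT_READ)
    remaining = len(read_fds)
    try:
        while remaining:
            for key, _events in sel.select():
                # In packet mode one read() returns exactly one write().
                data = os.read(key.fd, bufsize)
                if data:
                    emit(data)
                else:
                    # EOF on this source: stop watching it.
                    sel.unregister(key.fd)
                    remaining -= 1
    finally:
        sel.close()


if __name__ == "__main__":
    sources = [os.pipe2(os.O_DIRECT) for _ in range(2)]
    for i, (_r, w) in enumerate(sources):
        os.write(w, b"source %d line 1\n" % i)
        os.write(w, b"source %d line 2\n" % i)
        os.close(w)
    aggregate([r for r, _w in sources], lambda rec: os.write(1, rec))
    for r, _w in sources:
        os.close(r)
```

The aggregator never looks inside the data.  The order in which sources are serviced is up to the selector, but each emitted chunk is one whole record, and records from a given source stay in order.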

>   | With packetized pipes, if your sources write say, newline-delimited
>   | text records, kept under PIPE_BUF length, the aggregated output would
>   | always interleave between the lines, never in the middle of them.
> That happens with regular pipes.

See above, the aggregator is a process, not a shared pipe.

>   | If we added this to the shell, I suppose the next thing to explore
>   | would be how to get all the existing core shell utilities to detect a
>   | packetized pipe on stdout and switch to a line-buffered mode instead
>   | of block-buffered, assuming they're using stdio.
> I suspect that is really all you need - a mechanism to request line
> buffered output rather than blocksize buffered.   You don't need to
> go fiddling with pipes for that, and abusing the pipe interface as a
> way to pass a "line buffer this output please" request to the application
> seems like the wrong way to achieve that to me.

This is probably true, though if a packetized pipe were introspectable
we could request the behavior via the ||| construction, while
simultaneously enabling record boundaries regardless of how the
contents are delimited.  If consumers knew about packetized pipes,
they could treat the separately returned reads as records
independently of what's inside.

> This isn't a criticism of the datagram packet pipe idea - there are
> applications for that (pipe is easier to use than manually setting up
> a pair of unix domain datagram sockets) but that is for specialised
> applications, where for whatever reason the receiver needs to read just
> one packet at a time (usually because of a desire to have multiple
> reading applications, each taking the next request, and then processing
> it ... if there is just one receiving process all that is needed is
> to stick a record length before each packet sent to a normal pipe, and
> let the receiver process the records from the aggregations it receives).
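For comparison, the length-prefix alternative described above for a single receiver on a normal pipe could look roughly like this.  The 4-byte big-endian header and the function names are assumptions made for the sketch, not an established format:

```python
import os
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length


def write_record(fd, payload):
    """Send the header and payload in a single write().

    On a pipe this stays atomic only while the total is <= PIPE_BUF.
    """
    os.write(fd, HEADER.pack(len(payload)) + payload)


def _read_exact(fd, n):
    """Read exactly n bytes; return None on EOF before the first byte."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            if not buf:
                return None
            raise EOFError("pipe closed in the middle of a record")
        buf += chunk
    return buf


def read_records(fd):
    """Collect length-prefixed records from fd until EOF."""
    records = []
    while True:
        header = _read_exact(fd, HEADER.size)
        if header is None:
            return records
        (length,) = HEADER.unpack(header)
        body = _read_exact(fd, length)
        if body is None:
            raise EOFError("pipe closed after a record header")
        records.append(body)
```

This works on any byte stream, but the receiver has to understand the framing, whereas a packet-mode reader gets the boundaries without parsing anything.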

Thanks for the thoughtful response,
Vito Caputo
