Re: Storing NUL in variables

bug-bash

From:	Pierre Gaston
Subject:	Re: Storing NUL in variables
Date:	Sat, 10 Jun 2017 19:33:10 +0300

On Sat, Jun 10, 2017 at 2:06 AM, George <tetsujin@scope-eye.net> wrote:

> On Fri, 2017-06-09 at 20:58 +0300, Pierre Gaston wrote:
>
> On Fri, Jun 9, 2017 at 8:40 PM, Peter & Kelly Passchier 
> <peterkelly@passchier.net> wrote:
>
>
>
> On 09/06/2560 23:38, L A Walsh wrote:
>
>
> Chet Ramey wrote:
>
>
>
>  Should mapfile silently drop the NULs?
>
>
>
> Maybe add a flag to ignore NUL bytes that could be used in the 'read'
> statement as well?  If not specified, keep same behavior?
>
>
>
> That sounds like it might be useful.
> It might be more desirable to change it to a newline instead of dropping
> it? (Or both, with different flags??)
>
>
>
>
> I feel this kind of magic behavior would result in hackish scripts or fill
> a somewhat rare niche at best.
> I'd rather have bash to fully handle arrays of byte, or nothing.
>
>
>
> I think allowing shell variables to contain NUL would be lovely. How about
> we make that happen?  :)
> (I would be up for writing a patch to do it, of course, though I have a
> few other things in the pipeline... A feature like this could take a fair
> bit of work depending on how far the implementation goes in supporting
> various things.)
>
> Of course such variables couldn't be exported (the NULs would be lost if
> the data were stored in an environment variable) and for compatibility,
> variables should probably support containment of NULs only if the caller
> specifically requests it with an argument to "declare" or "read".
>
> ...And then there is the problem of how to use such variables. They can't
> be exported as environment variables or passed as command arguments, or
> used as file names...  Essentially they'd be limited to use in I/O within
> the shell, and within a handful of built-in commands or shell functions
> equipped to properly handle that data.
>
> I think that this approach, capturing an arbitrary byte stream and then
> taking further actions to process or encode it, is preferable to the
> alternative of capturing the byte stream and simultaneously encoding it
> into a text format. In principle commands like "read" shouldn't transform
> the data they're given, they should just store it. (I think the fact that
> read requires the option "-r" to read data without transforming it is kind
> of unfortunate...)
>
> (That said, one could argue that it would be equally reasonable, or even
> more reasonable to implement an operation that simultaneously reads and
> encodes the data, and another that decodes the data and writes it out, and
> then any commands designed to perform operations on byte stream data in the
> shell (re-encode it in a different format, etc.) should simply use that
> first encoding as a common format for exchanging the data..  Given the
> limitations of the shell with respect to its ability to handle NUL in
> various contexts, I think it's a reasonable argument. I tend to prefer the
> idea of providing true shell support for capturing a byte stream because it
> makes it easier to write code that handles the data without having to
> build-in a parser to interpret the data first.)
>
> One option that might make a feature like this integrate into the shell
> better would be to store a captured byte stream as an integer array rather
> than as an atomic variable. The back-end implementation in this case could
> be very efficient, and the stored data would be manipulable using existing
> array syntax. The main limitation perhaps would be that one could not
> create an array of these arrays.
>

Without thinking about it too much, I'd propose something like this:

- extend readarray (or maybe provide another builtin) to read bytes with
an interface like that of dd (block size, offset, skip) and store the
bytes in the array, e.g.: readarray -b bs=1024 cs=100 byte_array <file

- provide another builtin to write the array to an fd, e.g. "write bs=1024
cs=100 byte_array" (I don't really see a good way to extend printf or echo
for this)

- setting an array element directly would store the byte, e.g. a[0]=0 would
put a NUL byte at the first index

- conversely, ${a[0]} would expand to "0"

- I think the current array type could even be used, but that would not be
very efficient, and there is the question of what to do with sparse arrays

- I think I would like some efficient way to copy a range of bytes from one
array to another; maybe this could be done by reusing the "write" builtin
above, like: write offset=100 seek=50 -v dest_array -s source_array


I think that could even be done with loadable builtins, making all the
byte arrays "special_variables"
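
For illustration only, the behaviour described in the list above can be
approximated in today's bash with od and printf. This is a rough sketch,
not the proposed builtins: bytes_read and bytes_write are hypothetical
helper names, there is no bs/offset/skip interface, and going through an
ordinary indexed array is slow, as noted above.

```bash
#!/usr/bin/env bash
# Sketch only: emulate byte arrays with an ordinary indexed array.
# bytes_read and bytes_write are hypothetical helpers, not bash builtins.

# Read all of stdin into the named array, one decimal byte value per element.
bytes_read() {
    local -n _dest=$1    # nameref, requires bash 4.3 or later
    # od -An -v -tu1 prints every input byte as an unsigned decimal number;
    # the unquoted command substitution splits that output into elements.
    _dest=( $(od -An -v -tu1) )
}

# Write each argument (a decimal value 0-255) to stdout as one raw byte.
bytes_write() {
    local b
    for b in "$@"; do
        # The inner printf yields three octal digits (e.g. 000); the outer
        # printf interprets the resulting \000 escape and emits the byte.
        printf "\\$(printf '%03o' "$b")"
    done
}

# The NUL survives as the element value 0.
bytes_read buf < <(printf 'a\0b')
echo "${#buf[@]} bytes: ${buf[*]}"    # prints: 3 bytes: 97 0 98

# Setting an element stores a byte value; expanding it gives the number back.
buf[0]=0
echo "${buf[0]}"                      # prints: 0

# Existing array slicing can stand in for the range copy: dump the last
# two bytes of buf in hex.
bytes_write "${buf[@]:1:2}" | od -An -tx1
```

Passing the values as arguments to bytes_write keeps the sketch simple,
but it means a large array is limited by how quickly bash can expand it;
a real builtin would presumably work on the array storage directly.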

My 2 cents... of course I probably should have checked what zsh does, as I
think it supports NUL bytes in variables.
Pierre
