bug-bash


From: Chet Ramey
Subject: Re: Bash scripting and large files: input with the read builtin from a redirection gives unexpected result with files larger than 2GB.
Date: Sun, 04 Mar 2012 14:03:49 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20111105 Thunderbird/8.0

On 3/2/12 6:47 AM, Jean-François Gagné wrote:

> uname output: Linux xxxxxxxx 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 
> 2011 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
> 
> Bash Version: 4.1
> Patch Level: 5
> Release Status: release
> 
> Description:
> When reading data with the 'read' builtin from a redirection, read has 
> unexpected behavior after reading 2 GB of data.
> 
> Repeat-By:
> 
> 
> yes "0123456789abcdefghijklmnopqrs" | head -n 100000000 > file
> while read line; do file=${line:0:10}; echo $file; done < file | uniq -c
> 
> 
> results in
> 
> 
> 71582790 0123456789
>       1 mnopqrs
>       3 0123456789
>       1 mnopqrs
>       3 0123456789
>       1 mnopqrs
>       3 0123456789
>       1 mnopqrs
>       3 0123456789
> ...
> 
> So the problem happens after reading 71.582.790 x30 = 2.147.483.700 bytes of 
> data, just a little over 2^31.

Compile and run the attached program.  If it prints out `4', which it does
on all of the Debian systems I've tried, file offsets are limited to 32
bits, and accessing files greater than 2 GB is going to be unreliable.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    address@hidden    http://cnswww.cns.cwru.edu/~chet/

Attachment: offt.c
Description: Text document

