Re: while read line; do; done; cannot handle big files
From: Chet Ramey
Subject: Re: while read line; do; done; cannot handle big files
Date: Wed, 29 Apr 2009 17:08:25 -0400
User-agent: Thunderbird 2.0.0.21 (Macintosh/20090302)
schomake@ai.rug.nl wrote:
> Bash Version: 3.1
> Patch Level: 17
> Release Status: release
>
> Description:
> [Detailed description of the problem, suggestion, or complaint.]
>
> The script
>
> while read line
> do
> echo "$line"
> done < larger-than-2GB-file.txt
>
> will fail, returning erratic (shifted content) in $line after the 2GB has
> been read.
>
> Repeat-By:
> [Describe the sequence of events that causes the problem
> to occur.]
> The input file is: big.dat
> with 4357 white-space delimited items on each text record.
> The first item is a long (>80 bytes) alphanumeric ASCII string, the
> remaining 4356 items are ASCII floating-point values. With exception
> of the NEWLINEs (ASCII=10,dec) the file does not contain byte values
> below decimal 32 or above 127, decimal.
>
> wc big.dat
> 115813 504597241 2528513389 big.dat
>  lines     items      bytes
>
>
> Fix:
> [Description of how to fix the problem. If you don't know a
> fix for the problem, don't include this section.]
>
> Wrote a program in C to check whether the file was corrupt.
> It was not.
Not sure what to tell you. The code that the read builtin uses to read
from files grabs only 128 bytes at a time, uses the correct types
(size_t/ssize_t/off_t), and should be compiled with whatever options
AC_SYS_LARGEFILE selects.
Output uses stdio and file descriptors opened with open(2) (well, in your
example, it just sends data to stdout).
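One way to narrow this down, offered as a minimal sketch rather than
anything taken from the bash sources: copy the file through the same kind
of loop and let cmp report the first byte that differs. The function name
check_read_loop is hypothetical, and the sketch assumes the input ends
with a newline (otherwise the final partial line is not copied).

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a file line by line with the read builtin
# and report the first byte offset at which the copy differs.
check_read_loop() {
    local input=$1 copy rc
    copy=$(mktemp) || return 2
    # IFS= and -r keep each line verbatim (no whitespace trimming,
    # no backslash processing), so the copy should be byte-identical.
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done < "$input" > "$copy"
    # cmp prints the first differing byte and line; it is silent on a match.
    cmp "$input" "$copy"
    rc=$?
    rm -f "$copy"
    return "$rc"
}
```

Running `check_read_loop big.dat` should print nothing if the loop reads
the file correctly. If cmp reports a first difference close to byte
2147483648 (2^31), that would point at a 32-bit offset somewhere in the
read path. The sketch uses IFS= and -r, unlike the loop in the report, so
that whitespace trimming and backslash handling cannot account for any
difference.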
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/