bug-bash



From: Jean-François Gagné
Subject: Bash scripting and large files: input with the read builtin from a redirection gives unexpected result with files larger than 2GB.
Date: Fri, 2 Mar 2012 03:47:32 -0800 (PST)

Configuration Information [Automatically generated, do not change]:

Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include -I../bash/lib   -g -O2 -Wall
uname output: Linux xxxxxxxx 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 2011 
x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.1
Patch Level: 5
Release Status: release

Description:
When reading data with the 'read' builtin from a redirection, read behaves 
unexpectedly after about 2 GB (2^31 bytes) of data has been read.

Repeat-By:


yes "0123456789abcdefghijklmnopqrs" | head -n 100000000 > file
while read line; do file=${line:0:10}; echo $file; done < file | uniq -c


results in


71582790 0123456789
      1 mnopqrs
      3 0123456789
      1 mnopqrs
      3 0123456789
      1 mnopqrs
      3 0123456789
      1 mnopqrs
      3 0123456789
...

So the problem appears after reading 71,582,790 x 30 = 2,147,483,700 bytes of 
data (each line is 30 bytes including the newline), just over 
2^31 = 2,147,483,648.
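
The arithmetic can be checked directly in bash. This is only a sketch: the variable names are illustrative, and the line count is the one reported by uniq -c above.

```bash
#!/usr/bin/env bash
# Each line written by `yes` is the 29-character string plus a newline.
line="0123456789abcdefghijklmnopqrs"
bytes_per_line=$(( ${#line} + 1 ))   # 29 + 1 = 30

good_lines=71582790                  # lines counted before the first bad one
total=$(( good_lines * bytes_per_line ))
limit=$(( 1 << 31 ))                 # 2^31

echo "bytes read before failure: $total"
echo "2^31:                      $limit"
echo "overshoot:                 $(( total - limit ))"
```

On a bash with 64-bit shell arithmetic this reports 2147483700 bytes, 52 bytes past 2^31, so the first bad line is the one being read as the offset crosses 2^31.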

But the following:

cat file | while read line; do file=${line:0:10}; echo $file; done | uniq -c

works fine:

100000000 0123456789
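
For comparison, here is a scaled-down sketch that runs both forms on the same small file and checks that they agree. The function name compare_variants and the line count are hypothetical, and the loop uses read -r and quoting, so it is not byte-for-byte the command above. At this size the 2^31 boundary is never reached, so both forms are expected to match. To exercise the bug, the count would need to exceed 71582790 lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run the redirection form and the pipe form on N lines
# of the same test data and report whether their output is identical.
compare_variants() {
    local n=$1 tmp redirect_out pipe_out
    tmp=$(mktemp) || return 2

    yes "0123456789abcdefghijklmnopqrs" | head -n "$n" > "$tmp"

    # Form 1: read from a redirected regular file.
    redirect_out=$(while read -r line; do echo "${line:0:10}"; done < "$tmp" | uniq -c)

    # Form 2: read from a pipe fed by cat.
    pipe_out=$(cat "$tmp" | while read -r line; do echo "${line:0:10}"; done | uniq -c)

    rm -f "$tmp"

    if [ "$redirect_out" = "$pipe_out" ]; then
        echo "same"
    else
        echo "differ"
    fi
}

compare_variants 1000
```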

 


Jean-François Gagné

