Re: Problem with reading file and executing other stuffs?

From: Hugh Sasse
Subject: Re: Problem with reading file and executing other stuffs?
Date: Mon, 12 Nov 2007 15:22:19 +0000 (WET)

On Mon, 12 Nov 2007, Horinius wrote:

> Hugh Sasse wrote:
> > 
> > OK, if it is in fields, like /etc/passwd, then awk is probably more
> > suited to this problem than reading it directly with shell script.
> > 
> > If it has some delimited keyword, but each line has variable structure,
> > then you'd be better using sed.
> > 
> The files contain something like:
> aaa xxx xxx xxxxx xxxxx xxxx
> bbb xxx xxx xxx xxx xx xxxx xx
> ccc xx xxxxx xxxx xxxxx xxxx
> aaa, bbb, ccc are the known unique elements.  No, they don't have a fixed
> size.  And no, there's no delimited keyword except the first space after
> them.  Those xxx are sequences of characters that can be anything, from
> numbers to letters and different length.
> The elements are known and unique, and I need to extract the whole line
> beginning with such elements.  That's why I used the example of "database
> table".  Is awk suitable?  I know nothing about awk.

# Here is an example using gawk with its input in a here document.

gawk '
/^aaa/ { print "got aaa: " $0 }
/^bbb/ {
    print "got bbb"
    print "$1 is " $1
    print "$2 is " $2
    print "$3 is " $3
}
/^ccc/ { print "got ccc " NF " fields" }
' <<END
aaa xxx xxx xxxxx xxxxx xxxx
bbb xxx xxx xxx xxx xx xxxx xx
ccc xx xxxxx xxxx xxxxx xxxx
END

> Hugh Sasse wrote:
> > 
> > Both of these operate linewise on their input, and can use regular
> > expressions and actions in braces to produce some textual response.
> > You can pass that response to `xargs -n 1` or  something.
> > 
> I'm not sure I understand since I know nothing about awk.  But this could be
> postponed to a later time for discussion if adequate.

As you can see from the above, it is easy to print text from awk.
That output can be piped to xargs, which reads items from its
standard input, appends them to the command you supply (if any),
and executes the result.
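
A minimal sketch of that awk-to-xargs pipeline follows. The sample lines, the select_keys helper, and the "echo processing" command are placeholders I made up for illustration; substitute whatever command you really need to run.

```shell
# Sketch only: the sample lines and the "echo processing" command are
# placeholders, not taken from the original files.
select_keys() {
    # Print the first field of every line that starts with aaa or ccc.
    awk '/^(aaa|ccc) / { print $1 }'
}

printf '%s\n' 'aaa 1 2' 'bbb 3 4' 'ccc 5 6' |
    select_keys |
    xargs -n 1 echo processing
# prints:
#   processing aaa
#   processing ccc
```

With -n 1, xargs runs the command once per item, and items are split on whitespace, so this works best when awk prints a single word per line.
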
> Hugh Sasse wrote:
> > 
> > Yes, agreed.  Throw us a few example lines, fictionalised, then we may
> > be able to give you an example of an approach with greater simplicity.
> > 
> Put it in a simple way, the pseudo algo of extracting lines is like this:
> n = number of lines in the file (which is also the number of elements to
> process)
> element = array(1 to n) of known elements
> for i = 1 to n
>    use grep or whatever to extract a whole line beginning with element(i)
>    //process the line
> end

Perfect for awk.

> Here, grep has to parse the whole file to extract one line.  In other words,
> if there're 3 elements, grep has to parse 3 lines for every element.  Thus
> it has to parse 9 lines during the whole algo.  Therefore, if there're n
> elements, grep has to parse n lines for n times.  Thus O(n^2).

Awk reads the file once, line by line, so it is a single pass: O(n).

> Even if grep stops at the first occurrence of the element, grep has to parse
> n/2 lines on average.  So the time is proportional to n^2/2, so the
> complexity is still O(n^2).
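
One way to keep this a single pass even with many known elements is to load them into an awk array and test the first field of each line against it. This is only a sketch: the extract_known helper, the key list, and the sample lines are assumptions made up for illustration, and the print statement stands in for the real processing.

```shell
# Sketch only: the key list and sample lines are made up for illustration.
extract_known() {
    # $1 is a space-separated list of known first fields.
    awk -v keys="$1" '
        BEGIN {
            # Build a lookup table of the known elements once.
            n = split(keys, list, " ")
            for (i = 1; i <= n; i++) wanted[list[i]] = 1
        }
        # One array lookup per line, so the file is read only once.
        $1 in wanted { print "processing " $1 ": " $0 }
    '
}

printf '%s\n' 'aaa x1 x2' 'bbb y1 y2 y3' 'zzz q1' |
    extract_known 'aaa bbb ccc'
# prints:
#   processing aaa: aaa x1 x2
#   processing bbb: bbb y1 y2 y3
```

Comparing the whole first field, rather than matching /^aaa/, also avoids picking up lines whose first word merely begins with a known element.
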

