bug-gawk

Re: use of TZ by mktime()/strftime()


From: Ed Morton
Subject: Re: use of TZ by mktime()/strftime()
Date: Wed, 10 Aug 2022 17:36:10 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.1.0

On 8/10/2022 3:13 PM, Neil R. Ormos wrote:
> Ed Morton wrote:
>> Neil R. Ormos wrote:
>>> I make an external call to date(1), instead of
>>> mktime(), when I can't be sure that the input
>>> is well-behaved.  I'm sure it's more expensive
>>> than mktime(), but the overhead seems a
>>> tolerable price to pay [...]
>> Thanks Neil, yeah I've done the same at times
>> for small input files but it is an order of
>> magnitude slower than using builtin time
>> functions so if/when I don't NEED to do that
>> then I avoid it. [...]
> To put some real-world numbers to this...
>
> Benchmarking my conversion function on a pair of old, slow systems, I get ~600
> and ~900 calls per second.
>
> I can see how the external call approach would be unsuitable for operating on
> larger input files.


Right. Some more numbers, if anyone cares: take the 10-line input file I provided earlier in this thread, create a 1000-line input file from it by doing `awk '{for (i=1;i<=100;i++) print}' file > file1k`, then time the execution of these 2 scripts on it, the first doing the calculation internally and the second calling `date`:

   --------
   $ cat internal.awk
   BEGIN {
        tzmap["EST"] = "US/Eastern"
        tzmap["EDT"] = "-04:00"
        tzmap["BST"] = "+01:00"
        tzmap["IST"] = "Asia/Calcutta"
   }
   {
        dt = gensub(/\s+\S+$/,"",1); gsub(/[-:T]/," ",dt)
        tz = ( $NF in tzmap ? tzmap[$NF] : $NF )
        if ( match(tz,/^([-+]?)([0-9]{2}):?([0-9]{2})$/,a) ) {
            tz = (a[1] == "-" ? "+" : "-") a[2] ":" a[3]
        }
        ENVIRON["TZ"] = tz

        epochSecs = mktime(dt)

        print $0, epochSecs
   }

   --------
   $ cat external.awk
   {
        cmd = "date -d \047" $0 "\047 +%s"
        epochSecs = ( (cmd | getline line) > 0 ? line : -1 )
        close(cmd)
        print $0, epochSecs
   }
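
As an aside on the offset handling in internal.awk: the script inverts the sign of a numeric offset before assigning it to TZ, presumably because POSIX-style TZ values count offsets west of Greenwich as positive, the opposite of the ISO 8601 convention used in the input. The following is only a sketch of that one step in isolation. The names `flip_offset` and `flip` are hypothetical and appear in neither script, and it is written in portable awk (no gawk extensions) on the assumption that it should be runnable with any awk.

```shell
# Hypothetical helper, not part of internal.awk or external.awk.
# Reads one time zone token per line and prints the value that the
# sign-flipping step in internal.awk would assign to TZ.
flip_offset() {
    awk '
    # Turn an ISO 8601 style offset (+01:00, -0400, 0530) into the same
    # offset with the opposite sign; any other token is returned untouched.
    function flip(tz,    sign, digits) {
        if (tz !~ /^[-+]?[0-9][0-9]:?[0-9][0-9]$/)
            return tz
        # A missing sign is treated as "+", so it becomes "-".
        sign = (substr(tz, 1, 1) == "-") ? "+" : "-"
        digits = tz
        gsub(/[-+:]/, "", digits)      # keep only the HHMM digits
        return sign substr(digits, 1, 2) ":" substr(digits, 3, 2)
    }
    { print flip($1) }'
}

# Example: numeric offsets are flipped, a zone name passes through unchanged.
printf '%s\n' +01:00 -04:00 0530 US/Eastern | flip_offset
```

Unlike internal.awk, this sketch does not consult the tzmap abbreviation table first; it only shows the sign inversion.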

Here's the timing when run on my Mac (both scripts produced the same output):

   $ time awk -f internal.awk file1k > file_out1k.int

   real       0m0.024s
   user       0m0.014s
   sys        0m0.005s

   $ time awk -f external.awk file1k > file_out1k.ext

   real       0m10.140s
   user       0m3.606s
   sys        0m3.815s
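
For a rough sense of scale, the wall-clock ("real") figures above can be reduced to a per-line cost and a ratio. This is only a sketch of the arithmetic; the helper name `summarize_timings` is hypothetical, and it assumes all elapsed time is attributable to the 1000 input lines, ignoring awk startup.

```shell
# Hypothetical helper: summarize_timings LINES INTERNAL_SECS EXTERNAL_SECS
# Prints the per-line cost of each approach and the approximate slowdown.
summarize_timings() {
    awk -v n="$1" -v t_int="$2" -v t_ext="$3" 'BEGIN {
        printf "internal: %.3f ms per line\n", t_int / n * 1000
        printf "external: %.2f ms per line\n", t_ext / n * 1000
        printf "slowdown: roughly %dx\n", int(t_ext / t_int)
    }'
}

# The "real" times reported above for the 1000-line file.
summarize_timings 1000 0.024 10.140
```

On these numbers the external approach costs about 10 ms per `date` call, or roughly 100 calls per second, and is a little over 400 times slower than the internal one in this run.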

Regards,

    Ed.




