awk portability notes

From: Ralf Wildenhues
Subject: awk portability notes
Date: Sun, 03 Dec 2006 10:04:51 +0100

The Autoconf change to use portable awk in config.status made us aware
of a number of portability issues that I have not found in gawk.texi
yet and that could be documented there.

1) Solaris awk does not support the syntax `if (index in array)', but
only `for (index in array)'.  SVID describes this feature; does that
imply that SVR3.1 (or SVR4) awk had it? How can I find out for sure?
2) Solaris awk does not support regexps as the value of `FS', which is
documented in the V7/SVR3.1 node.  However, it may be useful to know
that this awk accepts a string value for `FS', of which only the first
character is used.
3) `$0' is not assignable in Solaris awk, and `$ 0' is not the same as
`$0' for it; see this message for more information:
4) next is defined by POSIX; the awkcard seems to imply otherwise by
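
To illustrate items 1) and 2), here is a minimal sketch of how a script
might avoid both constructs.  It assumes `$AWK' falls back to plain `awk'
when unset; the input data and the names `seen', `want' and `found' are
made up for this example.  I have not run it on Solaris awk; it only stays
within the subset described above (a one-character `FS', and the `for'
loop form of `in').

```sh
#!/bin/sh
# Sketch only: AWK falls back to plain awk when unset (an assumption).
AWK=${AWK-awk}

result=`printf 'a:1\nb:2\n' | $AWK '
BEGIN { FS = ":" }            # one-character FS; no regexp involved
{ seen[$1] = $2 }             # remember each key
END {
  want = "b"                  # hypothetical key to look up
  found = 0
  for (k in seen)             # loop form instead of "if (want in seen)"
    if (k == want)
      found = 1
  if (found) {
    print want " is present"
  } else {
    print want " is absent"
  }
}'`
echo "$result"
```

The loop is linear in the size of the array, which is the price of not
using the membership expression.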

Further, the Autoconf manual currently lists a number of issues to be
expected with some awk implementations, not all of which are listed in
the gawk manual.  For reference, here are the items for which I did not
find equivalent information in gawk.texi.  Would you be interested in
listing them as well, and if so, where?
|   Some Awk implementations, such as HP-UX 11.0's native one,
|   mishandle anchors:
|        $ echo xfoo | $AWK '/foo|^bar/ { print }'
|        $ echo bar | $AWK '/foo|^bar/ { print }'
|        bar
|        $ echo xfoo | $AWK '/^bar|foo/ { print }'
|        xfoo
|        $ echo bar | $AWK '/^bar|foo/ { print }'
|        bar
|   Either do not depend on such patterns (i.e., use `/^(.*foo|bar)/'),
| or use a simple test to reject such implementations.
|   AIX version 5.2 has an arbitrary limit of 399 on the length of
| regular expressions and literal strings in an Awk program.
|   Traditional Awk has a limit of 99 fields in a record.  You may be
| able to circumvent this problem by using `split'.
|   Traditional Awk has a limit of at most 99 bytes in a number
|   formatted by `OFMT'; for example, `OFMT="%.300e"; print 0.1;'
| typically dumps core.
|   The original version of Awk had a limit of at most 99 bytes per
|   `split' field, 99 bytes per `substr' substring, and 99 bytes per
|   run of non-special characters in a `printf' format, but these bugs
| have been fixed on all practical hosts that we know of.
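
The "simple test" mentioned in the quoted text for the anchor problem
could look roughly like the sketch below.  It is not taken from Autoconf;
the function name `awk_anchors_ok' and the variable `anchors' are
hypothetical, and `$AWK' is assumed to fall back to plain `awk'.  It just
runs the four commands from the quoted example and compares each result
with what a correct implementation prints.

```sh
#!/bin/sh
# Sketch only: AWK falls back to plain awk when unset (an assumption).
AWK=${AWK-awk}

# awk_anchors_ok is a hypothetical helper; $1 is the awk command to probe.
awk_anchors_ok () {
  test "`echo xfoo | $1 '/foo|^bar/ { print }'`" = xfoo &&
  test "`echo bar  | $1 '/foo|^bar/ { print }'`" = bar &&
  test "`echo xfoo | $1 '/^bar|foo/ { print }'`" = xfoo &&
  test "`echo bar  | $1 '/^bar|foo/ { print }'`" = bar
}

if awk_anchors_ok "$AWK"; then
  anchors=ok
else
  anchors=broken
fi
echo "anchors: $anchors"
```

An implementation with the bug shown above prints nothing for the first
command, so the first comparison fails and the probe reports `broken'.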

Here's a suggested patch for the list above.  I don't have access to old
awks other than the Solaris one, so corrections are very welcome.
2006-12-03 Ralf Wildenhues <address@hidden>
        * next is POSIX.
        * gawk.texi: V7/SVR3.1: Mention assignable `$0', `var in index'
as expression. Specify `FS' limitation.
Index: doc/
RCS file: /sources/gawk/gawk-stable/doc/,v
retrieving revision
diff -u -r1.1.1.1
--- doc/      11 Aug 2006 12:05:48 -0000
+++ doc/      3 Dec 2006 08:52:54 -0000
@@ -1168,7 +1168,7 @@
co-process pipe into \*(FCgetline\*(FR; set \*(FIv\*(FR.
.ti -.2i
stop processing the current input
record. Read next input record and
Index: doc/gawk.texi
RCS file: /sources/gawk/gawk-stable/doc/gawk.texi,v
retrieving revision 1.5
diff -u -r1.5 gawk.texi
--- doc/gawk.texi       15 Sep 2006 13:49:28 -0000      1.5
+++ doc/gawk.texi       3 Dec 2006 08:53:18 -0000
@@ -22776,10 +22776,17 @@
and @code{SUBSEP} built-in variables (@pxref{Built-in Variables}).
+Assignable @code{$0}.
The conditional expression using the ternary operator @samp{?:}
(@pxref{Conditional Exp}).
+The expression @samp{@var{index} in @var{array}} outside of @samp{for}
+statements (@pxref{Reference to Elements}).
The exponentiation operator @samp{^}
(@pxref{Arithmetic Ops}) and its assignment operator
form @samp{^=} (@pxref{Assignment Ops}).
@@ -22792,7 +22799,8 @@
Regexps as the value of @code{FS}
(@pxref{Field Separators}) and as the
third argument to the @code{split} function
-(@pxref{String Functions}).
+(@pxref{String Functions}), rather than using only the first character
+of @code{FS}.
Dynamic regexps as operands of the @samp{~} and @samp{!~} operators
