bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/2] maint.mk: Split long argument lists


From: Roman Bolshakov
Subject: Re: [PATCH v2 1/2] maint.mk: Split long argument lists
Date: Thu, 13 Dec 2018 00:13:42 +0300
User-agent: NeoMutt/20180716

On Tue, Dec 04, 2018 at 04:10:08PM -0600, Eric Blake wrote:
> On 12/3/18 9:00 AM, Roman Bolshakov wrote:
> > $(VC_LIST_EXCEPT) is usually expanded into arguments for a command.
> > When a project contains too many, some operating systems can't pass all
> > the arguments because they hit the limit of arguments. FreeBSD and macOS
> > are known to have the limit of 256k of arguments.
> > 
> > More on the issue:
> > http://lists.gnu.org/archive/html/bug-gnulib/2015-08/msg00019.html
> > https://www.redhat.com/archives/libvir-list/2015-August/msg00758.html
> > 
> > xargs without flags can be used to limit number of arguments. The
> > default number of arguments (max-args for "-n" flag) is
> > platform-specific. If argument length exceeds default value for "-s"
> > flag (max-chars), xargs will feed less arguments than max-args.
> > 
> > Signed-off-by: Roman Bolshakov <address@hidden>
> > ---
> >   top/maint.mk | 53 +++++++++++++++++++++++++++++++++-------------------
> >   1 file changed, 34 insertions(+), 19 deletions(-)
> > 
> > diff --git a/top/maint.mk b/top/maint.mk
> > index 4889ebacc..36a5df262 100644
> > --- a/top/maint.mk
> > +++ b/top/maint.mk
> > @@ -303,18 +303,22 @@ define _sc_search_regexp
> >                                                                     \
> >      : Filter by content;                                                   
> > \
> >      test -n "$$files" && test -n "$$containing"                            
> > \
> > -     && { files=$$(grep -l "$$containing" $$files); } || :;                
> > \
> 
> If $$files is large enough to overflow the command-line limit for execv*()
> here...
> 
> > +     && { files=$$(echo "$$files"                                  \
> 
> ...then why is it not large enough to overflow the limit for 'echo' here?
> Or is it because you are implicitly relying on 'echo' being a shell-builtin
> that does not suffer from the same command-line limit because there is no
> execv*() involved?
> 

Yes, the built-in doesn't suffer from execve limitations :)

> > +       | xargs grep -l "$$containing" /dev/null); } || :;          \
> 
> At any rate, as long as the rewrite worked for you, it does look like you
> are benefitting from 'echo' being a shell builtin that does not suffer from
> the same limits as an external program.  If command-line arguments start
> affecting echo, we'd have to come up with something hairier like:
> 
> xargs <<EOF grep -l "$$containing" /dev/null
> $$files
> EOF
> 

AFAIK it could happen only if build host is very very low on memory. I
don't know if heredocs would help with that.

> >      test -n "$$files" && test -n "$$non_containing"                        
> > \
> > -     && { files=$$(grep -vl "$$non_containing" $$files); } || :;   \
> > +     && { files=$$(echo "$$files"                                  \
> > +       | xargs grep -vl "$$non_containing" /dev/null); } || :;             
> > \
> 
> Did you need /dev/null on this conversion?
> 

No, I've dropped /dev/null for grep invocations with -l flag in v3.

> > @@ -323,9 +327,10 @@ define _sc_search_regexp
> >   endef
> >   sc_avoid_if_before_free:
> > -   @$(srcdir)/$(_build-aux)/useless-if-before-free                 \
> > -           $(useless_free_options)                                 \
> > -       $$($(VC_LIST_EXCEPT) | grep -v useless-if-before-free) &&   \
> > +   @$(VC_LIST_EXCEPT) | grep -v useless-if-before-free             \
> > +            | xargs                                                        
> > \
> > +         $(srcdir)/$(_build-aux)/useless-if-before-free            \
> > +         $(useless_free_options)            &&                     \
> 
> Spacing before the && looks odd here.
> 

Ok, I've left the first tab there and placed | and && like you advised
below. Hope this helps.

> > @@ -845,7 +853,10 @@ sc_prohibit_always-defined_macros:
> >       case $$(echo all: | grep -l -f - Makefile) in Makefile);; *)  \
> >         echo '$(ME): skipping $@: you lack GNU grep' 1>&2; exit 0;; \
> >       esac;                                                         \
> > -     $(def_sym_regex) | grep -E -f - $$($(VC_LIST_EXCEPT))         \
> > +     sym_regexes=$$(mktemp);                                       \
> 
> Are we guaranteed that 'mktemp' is portably present on all developer's
> machines?
> 

That's a good question. It's not present on Solaris 9. What platforms
gnulib should cover?

> > +     $(def_sym_regex) > $$sym_regexes;                             \
> > +     $(VC_LIST_EXCEPT) | xargs                                     \
> > +       grep -E -f $$sym_regexes /dev/null                          \
> >         && { echo '$(ME): define the above via some gnulib .h file' \
> >               1>&2;  exit 1; } || :;                                \
> 
> Even worse, you aren't cleaning up your temporary file.  This conversion
> needs to be rewritten in a better manner.  I suggest:
> 
> $(VC_LIST_EXCEPT) | xargs sh -c \
>    '$(def_sym_regex) | grep -E -f - "$$@"' dummy /dev/null
> 
> which uses 'sh' as an intermediary to let you still feed $(def_sym_regex) as
> the stdin to the grep process (the dummy argument supplies $0 to sh, then
> the /dev/null is the usual placeholder to ensure grep sees more than one
> file at a time to avoid its output changing styles).
> 

Ok, I was struggling to find a portable solution. Thank you for
proposing this one (and explaining it). The macro can't be expanded
without breaking shell parser though. I don't know... the approach with
mktemp is more readable but I'm not sure if that's what we want to use
because of portability concerns.

> > @@ -927,7 +938,8 @@ require_exactly_one_NL_at_EOF_ =                        
> >                 \
> >       }                                                                     
> > \
> >     END { exit defined $$fail }
> >   sc_prohibit_empty_lines_at_EOF:
> > -   @perl -le '$(require_exactly_one_NL_at_EOF_)' $$($(VC_LIST_EXCEPT)) \
> > +   @$(VC_LIST_EXCEPT) | xargs -I{}                                 \
> > +     perl -le '$(require_exactly_one_NL_at_EOF_)' {}               \
> 
> Why did you need -I{} here? If your only use of {} in the 'initial-args' to
> xargs is in the final position, this should be the same as:
> 
> $(VC_LIST_EXCEPT) | xargs perl -le '$(require_exactly_one_NL_at_EOF_)'
> 

Right, there's no need, thanks.

> > @@ -1033,7 +1046,8 @@ sc_prohibit_test_double_equal:
> >   # definition of LDADD from the appropriate Makefile.am and exits 0
> >   # when it contains "ICONV".
> >   sc_proper_name_utf8_requires_ICONV:
> > -   @progs=$$(grep -l 'proper_name_utf8 ''("' $$($(VC_LIST_EXCEPT)));\
> > +   @progs=$$($(VC_LIST_EXCEPT) | xargs                             \
> > +          grep -l 'proper_name_utf8 ''("' /dev/null);                      
> > \
> 
> You don't need /dev/null here because of grep -l.
> 

Fixed, thanks.

> > @@ -1155,8 +1169,8 @@ sc_po_check:
> >     @if test -f $(po_file); then                                    \
> >       grep -E -v '^(#|$$)' $(po_file)                               \
> >         | grep -v '^src/false\.c$$' | sort > address@hidden;                
> >         \
> > -     files=$$(perl $(perl_translatable_files_list_)                \
> > -       $$($(VC_LIST_EXCEPT)) $(generated_files));                  \
> > +     files=$$($(VC_LIST_EXCEPT) | xargs -I{}                       \
> > +       perl $(perl_translatable_files_list_) {} $(generated_files)); \
> >       grep -E -l '$(_gl_translatable_string_re)' $$files            \
> 
> This one is wrong; if xargs splits the list, it ends up running perl on the
> tail end of $(generated_files) more than once.  Better would be:
> 
> { $(VC_LIST_EXCEPT); echo $(generated_files); } | \
>   xargs perl $(perl_translatable_files_list_)) | \
>   xargs grep -E -l '$(_gl_translatable_string_re)' \
> 
> >         | $(SED) 's|^$(_dot_escaped_srcdir)/||' | sort -u > address@hidden; 
> > \
> 

I agree that's much better, thanks.

> Also, this points out that we tend to use the style:
> 
>    command 1 \
>      | xargs command 2 \
>      | command 3
> 
> rather than
> 
>    command 1 | xargs \
>      command 2 | \
>      command 3
> 
> so while I didn't point out many spots where your style is different (by
> splitting xargs from its arguments), it may be worth reformatting to include
> xargs on the same line as its arguments rather than separately.
> 

Ok, sure I've changed a bit with formatting to this one (it also
illustrates that I split command 2 because otherwise I have to move
backslash for every macro/routine I'm changing, but in some cases I have
to move backslash even beyond 79th character):

    command 1       \
      | xargs       \
          command 2 \
      | command 3

> > @@ -1230,7 +1244,8 @@ sc_cross_check_PATH_usage_in_tests:
> >              exit 1; };                                             \
> >       good=$$(grep -E '$(_hv_regex_strong)' $(_hv_file));           \
> >       grep -LFx "$$good"                                            \
> > -           $$(grep -lE '$(_hv_regex_weak)' $$($(VC_LIST_EXCEPT)))  \
> > +           $$($(VC_LIST_EXCEPT) | xargs                            \
> > +             grep -lE '$(_hv_regex_weak)' /dev/null)               \
> 
> Are you sure that the inner $$() won't produce too many arguments for the
> outer grep -LFx, or is this another case where you'll want to rewrite to:
> 
> $(VC_LIST_EXCEPT) | xargs grep -lE '$(_hv_regx_weak)' \
>   | xargs grep -LFx "$$good"
> 

It doesn't produce too many arguments (otherwise I'd catch this) but I
rewrote it as you advised because it's easier to properly indent.

Thank you,
Roman



reply via email to

[Prev in Thread] Current Thread [Next in Thread]