bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 4/5] Efficient license compatibility checks with --test/--create-


From: Ralf Wildenhues
Subject: [PATCH 4/5] Efficient license compatibility checks with --test/--create-testdir.
Date: Sun, 17 Jan 2010 10:51:33 +0100
User-agent: Mutt/1.5.20 (2009-10-28)

For licence compatibility testing of requested modules in --test
or --create-testdir modes, walk the directed dependency graph;
in postorder traversal, annotate the set of potential license
incompatibilites in the respective subtrees.  Warn about circular
dependencies, where the algorithm does not currently give precise
answers.

Cache as many data as possible, and (ab)use function positional
parameters as cheap emulation of function-local variables for
recursion.

* gnulib-tool (func_license_compatibility): New function.
(func_create_testdir): Use it, simplify accordingly.  For
license incompatibilities, output the names of all modules
in the problematic dependency subgraph, not just the offending
modules.

Signed-off-by: Ralf Wildenhues <address@hidden>
---

Hello,

The --test/--create-testdir code currently tries to find license
incompatibilities in the following manner:

For each requested module, it looks at the transitive closure of its
dependencies (excluding tests and their deps), and checks whether any of
the licenses conflict.  This is done in order to expose, for example, an
LGPL module depending on a GPLed build tool depending on a GPL module
(we have to assume that the LGPL module may be using code from the
latter GPL module).

With lots of requested modules (or all of them), the code walks the
dependency graph many times; very expensive, and the prime offender for
<http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/20351/focus=20361>.

The directed graph of git gnulib modules and dependencies currently
consists of roughly 1900 modules and 4300 listed dependencies, so it
is fairly sparse.  It is almost a tree: it contains 13 cycles, of which
7 are typos, and the rest are shown here:

  close -> fclose -> close

  mbrtowc -> mbsinit -> mbrtowc

  xalloc -> xalloc-die -> xalloc

                                 ,-> chdir-long ----------v
         ,-------------> save-cwd -> xgetcwd -> getcwd -> openat
  openat -> fdopendir ---^                                ^
         \           `-> openat-die ----------------------'
          `--------------^

My approach to license compatibility testing assumes a tree; it warns
about, and produces less precise results in the presence of, cycles.

With respect to the cycles involving xalloc and openat, there is no
problem in this approach: these two modules are GPL, thus any
incompatibility would still be noted, but not all GPL dependencies
might be listed.

However, maybe the first three cycles can be eliminated or somehow
avoided nonetheless:

- close depending on fclose looks like it could be avoided if
  gl_REPLACE_FCLOSE is removed from close.m4,
- xalloc-die probably doesn't need to depend on xalloc if some files
  are listed twice.
- modules in small cycles might just be merged (and one be made an
  empty alias of the other).


I further chose to change the output of the license checking, to produce
not only the pairs (requested_module, incompatible_dependency) but all
modules in the subgraph containing the incompatibility; it might be
easier to find the dependency chain leading to the issue that way.
(This can trivially be reverted if it is undesirable.)

The new output currently looks like this:

  warning: cannot determine license compatibility: circular dependency of 
mbrtowc
  warning: cannot determine license compatibility: circular dependency of xalloc
  warning: cannot determine license compatibility: circular dependency of close
  warning: cannot determine license compatibility: circular dependency of openat
  warning: cannot determine license compatibility: circular dependency of openat
  warning: cannot determine license compatibility: circular dependency of openat
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u16-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u16-u16-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u32-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u32-u32-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u8-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/u8-u8-sprintf
  warning: cannot determine license compatibility: circular dependency of 
unistdio/ulc-sprintf
  warning: module euidaccess depends on a module with an incompatible license 
through:
    euidaccess
    exit
    exitfail
    getgroups
    group-member
    xalloc
    xalloc-die
  warning: module fts-lgpl depends on a module with an incompatible license 
through:
    chdir-long
    chown
    d-ino
    exit
    exitfail
    fdopendir
    fts-lgpl
    getcwd
    lchown
    openat
    openat-die
    save-cwd
    unistd-safer
    xalloc
    xalloc-die
    xgetcwd
  warning: module obstack depends on a module with an incompatible license 
through:
    exit
    exitfail
    obstack
  warning: module regex depends on a module with an incompatible license 
through:
    langinfo
    nl_langinfo
    regex

(BTW, speak up if you want me to post some of the debugging code that I
used for this.)

Cheers,
Ralf

 ChangeLog   |   16 ++++++
 gnulib-tool |  157 +++++++++++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 137 insertions(+), 36 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 260821a..a7d0163 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,21 @@
 2010-01-17  Ralf Wildenhues  <address@hidden>
 
+       Efficient license compatibility checks with --test/--create-testdir.
+       For licence compatibility testing of requested modules in --test
+       or --create-testdir modes, walk the directed dependency graph;
+       in postorder traversal, annotate the set of potential license
+       incompatibilites in the respective subtrees.  Warn about circular
+       dependencies, where the algorithm does not currently give precise
+       answers.
+       Cache as many data as possible, and (ab)use function positional
+       parameters as cheap emulation of function-local variables for
+       recursion.
+       * gnulib-tool (func_license_compatibility): New function.
+       (func_create_testdir): Use it, simplify accordingly.  For
+       license incompatibilities, output the names of all modules
+       in the problematic dependency subgraph, not just the offending
+       modules.
+
        gnulib-tool: small sed optimization.
        * gnulib-tool (func_get_automake_snippet): Merge multiple sed
        invocations.
diff --git a/gnulib-tool b/gnulib-tool
index 5ab640f..00b4b08 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -3996,6 +3996,101 @@ s,//*$,/,'
   echo "  - invoke ${macro_prefix}_INIT in $configure_ac."
 }
 
+
+# func_license_compatibility module...
+# Annotate module graph according to which licenses they are compatible with.
+# This is not precise in presence of circular dependencies.
+#
+# NOTE: This function calls itself recursively.  Since sh doesn't have
+#       local variables (without unportable `local'), the only variables
+#       that make it across the recursion are the positional parameters.
+#
+# Input:
+#       $1 ...                  list of modules to check
+#
+# Output:
+#       ${module}_gpl2_incomp   nonempty if its transitive dependency closure
+#                               contains a module incompatible with GPLv2+
+#       ${module}_lgpl_incomp   likewise for LGPL
+#       ${module}_lgpl2_incomp  likewise for LGPLv2+
+#       ${module}_license_seen  nonempty if this module has already been seen
+func_license_compatibility ()
+{
+  for lic_module
+  do
+    func_cache_var $lic_module
+    lic_cache=$cachevar
+    eval "lic_seen=\$${lic_cache}_license_seen
+          lic_circular=\$${lic_cache}_circular
+          ${lic_cache}_circular=yes"
+    if test -n "$lic_seen"; then
+      continue
+    fi
+    if test -n "$lic_circular"; then
+      echo "warning: cannot determine license compatibility: circular 
dependency of $lic_module"
+      continue
+    fi
+    # Don't consider $module to depend on $module-tests here.  This is needed
+    # becauses tests are implicitly GPL and may depend on GPL modules; we don't
+    # want a warning in this case.
+    func_get_dependencies $lic_module
+
+    # Save the module name, cache var name, and list of dependencies in the
+    # positional parameters, for recursion.
+    lic_deps=$module_deps
+    set $lic_cache $lic_module $lic_deps
+    func_license_compatibility $lic_deps
+    lic_cache=$1
+    lic_module=$2
+    shift; shift
+    lic_deps=$*
+    gpl2_incomp= lgpl_incomp= lgpl2_incomp=
+    for lic_dep in $lic_deps; do
+      func_cache_var $lic_dep
+      eval '
+        if test -n "$'$cachevar'_gpl2_incomp"; then
+          func_append gpl2_incomp "$nl$'$cachevar'_gpl2_incomp"
+        fi
+        if test -n "$'$cachevar'_lgpl_incomp"; then
+          func_append lgpl_incomp "$nl$'$cachevar'_lgpl_incomp"
+        fi
+        if test -n "$'$cachevar'_lgpl2_incomp"; then
+          func_append lgpl2_incomp "$nl$'$cachevar'_lgpl2_incomp"
+        fi'
+    done
+    func_get_license $lic_module
+    case $module_license in
+      'GPLed build tool') ;;
+      'public domain' | 'unlimited' | 'unmodifiable license text') ;;
+      'LGPLv2+') ;;
+      'GPLv2+')
+        func_append lgpl_incomp "$nl$lic_module"
+        func_append lgpl2_incomp "$nl$lic_module"
+        ;;
+      'LGPL')
+        func_append lgpl2_incomp "$nl$lic_module"
+        ;;
+      *)
+        func_append gpl2_incomp "$nl$lic_module"
+        func_append lgpl_incomp "$nl$lic_module"
+        func_append lgpl2_incomp "$nl$lic_module"
+        ;;
+    esac
+    if test -n "$gpl2_incomp"; then
+      eval ${lic_cache}'_gpl2_incomp=`echo "$lic_module$gpl2_incomp" | sort 
-u`'
+    fi
+    if test -n "$lgpl_incomp"; then
+      eval ${lic_cache}'_lgpl_incomp=`echo "$lic_module$lgpl_incomp" | sort 
-u`'
+    fi
+    if test -n "$lgpl2_incomp"; then
+      eval ${lic_cache}'_lgpl2_incomp=`echo "$lic_module$lgpl2_incomp" | sort 
-u`'
+    fi
+    eval "${lic_cache}_license_seen=yes
+          ${lic_cache}_circular="
+  done
+}
+
+
 # func_create_testdir testdir modules
 # Input:
 # - local_gnulib_dir  from --local-dir
@@ -4021,46 +4116,36 @@ func_create_testdir ()
   # $module-tests. Need this becauses tests are implicitly GPL and may depend
   # on GPL modules - therefore we don't want a warning in this case.
   inctests=""
+
+  func_license_compatibility $saved_modules
   for requested_module in $saved_modules; do
     func_get_license "$requested_module"
     requested_license=$module_license
     if test "$requested_license" != GPL; then
-      # Here we use func_modules_transitive_closure, not just
-      # func_get_dependencies, so that we also detect weird situations like
-      # an LGPL module which depends on a GPLed build tool module which depends
-      # on a GPL module.
-      modules="$requested_module"
-      func_modules_transitive_closure
-      for module in $modules; do
-        func_get_license "$module"
-        license=$module_license
-        case "$license" in
-          'GPLed build tool') ;;
-          'public domain' | 'unlimited' | 'unmodifiable license text') ;;
-          *)
-            case "$requested_license" in
-              GPLv2+)
-                case "$license" in
-                  GPLv2+ | LGPLv2+) ;;
-                  *) echo "warning: module $requested_module depends on a 
module with an incompatible license: $module" 1>&2 ;;
-                esac
-                ;;
-              LGPL)
-                case "$license" in
-                  LGPL | LGPLv2+) ;;
-                  *) echo "warning: module $requested_module depends on a 
module with an incompatible license: $module" 1>&2 ;;
-                esac
-                ;;
-              LGPLv2+)
-                case "$license" in
-                  LGPLv2+) ;;
-                  *) echo "warning: module $requested_module depends on a 
module with an incompatible license: $module" 1>&2 ;;
-                esac
-                ;;
-            esac
-            ;;
-        esac
-      done
+      func_cache_var $requested_module
+      eval "gpl2_state=\$${cachevar}_gpl2_incomp
+            lgpl_state=\$${cachevar}_lgpl_incomp
+            lgpl2_state=\$${cachevar}_lgpl2_incomp"
+      case "$requested_license" in
+        GPLv2+)
+          if test -n "$gpl2_state"; then
+            echo "warning: module $requested_module has an incompatible 
license dependency subgraph:" 1>&2
+            echo "$gpl2_state" | sed 's/^/  /' 1>&2
+          fi
+          ;;
+        LGPL)
+          if test -n "$lgpl_state"; then
+            echo "warning: module $requested_module has an incompatible 
license dependency subgraph:" 1>&2
+            echo "$lgpl_state" | sed 's/^/  /' 1>&2
+          fi
+          ;;
+        LGPLv2+)
+          if test -n "$lgpl2_state"; then
+            echo "warning: module $requested_module has an incompatible 
license dependency subgraph:" 1>&2
+            echo "$lgpl2_state" | sed 's/^/  /' 1>&2
+          fi
+          ;;
+      esac
     fi
   done
   modules="$saved_modules"
-- 
1.6.5.1.31.gad12b





reply via email to

[Prev in Thread] Current Thread [Next in Thread]