bug-gnulib

Re: new module: update-copyright [Re: copyright years: mass-update every


From: Joel E. Denny
Subject: Re: new module: update-copyright [Re: copyright years: mass-update every January 1
Date: Fri, 31 Jul 2009 09:52:12 -0400 (EDT)
User-agent: Alpine 1.00 (DEB 882 2007-12-20)

On Thu, 30 Jul 2009, Joel E. Denny wrote:

> On Thu, 30 Jul 2009, Jim Meyering wrote:

> > There remains at least one infelicity: if someone discusses
> > the Copyright (C) notation (e.g., as on this line), and later
> > has the copyright-with-dates line, the prefixes may not match.
> > We could require that the Copyright...holder (and hence prefix)
> > are all in proximity, but that may not be worth the trouble.

In the first patch below, I ended up documenting that limitation instead 
of fixing it.  It should be very straightforward to add a proximity check 
in the initial search for the copyright.  However, I haven't yet found a 
real-world test case, so I'm waiting.  For now, the worst case is that an 
affected file just won't be updated, and we'll be warned.
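
For illustration only (this is not part of the patches below), here is a minimal sketch of what such a proximity check might look like. The helper name find_prefix and the $max_gap parameter are hypothetical; the idea is simply to skip any "Copyright (C)" that is not followed closely by the FSF holder name, so that an earlier mention in prose does not fix the wrong prefix.

```perl
use strict;
use warnings;

# Hypothetical helper (not in update-copyright): return the line prefix
# of the first "Copyright (C)" that is followed by the FSF holder name
# within $max_gap characters, or undef if there is none.
sub find_prefix {
    my ($text, $max_gap) = @_;
    my $copyright = 'Copyright \([cC]\)';
    while ($text =~ /^(.{0,5})$copyright/mg) {
        my $prefix = $1;
        # Allow the holder's words to be split across lines, as long as
        # each continuation line starts with the same prefix.
        my $ws = '(?:[ \t]*(?:[ \t]|\n' . quotemeta($prefix) . ')[ \t]*)';
        my $holder = join $ws, qw(Free Software Foundation);
        my $rest = substr($text, pos($text), $max_gap);
        return $prefix if $rest =~ /$holder/;
    }
    return undef;
}

my $text = "the Copyright (C) notation is discussed here.\n\n"
         . "# Copyright (C) 2008 Free Software\n"
         . "# Foundation, Inc.\n";
my $prefix = find_prefix($text, 40);
print defined $prefix ? "prefix: '$prefix'\n" : "no FSF statement found\n";
```

With a 40-character window, the first "Copyright (C)" in the sample is rejected because no holder follows it closely, and the prefix is taken from the real statement instead.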

> The current behavior is a little unintuitive in other similar ways as 
> well.  For example, if "Copyright (C)" isn't recognized anywhere and thus 
> no comment sequence is recognized, the script still looks for an 
> occurrence of the copyright year and holder to adjust.  Thus, the script 
> handles "Copyright @copyright{}" only when it's not in comments.  I'd 
> rather it be consistent by not handling it at all or handling it 
> completely.

Also, it bothered me that 2009 might be added to the phrase " 98 Free 
Software Foundation", for example, even if it had nothing to do with a 
copyright statement.  I've fixed this in the first patch below.
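
To make the difference concrete, the following sketch contrasts a simplified form of the old year-plus-holder match with a match anchored on the whole statement. Both patterns are cut-down stand-ins for the ones in the script (they ignore line prefixes), and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Simplified stand-in for the old match: any year followed by the holder.
my $unanchored = qr/([- ])((?:\d\d)?\d\d)(\s+Free\s+Software\s+Foundation)/;

# Simplified stand-in for the new match: "Copyright (C)", a year list,
# and then the holder, recognized as one unbroken whole.
my $anchored = qr/Copyright \([cC]\)\s+(?:(?:\d\d)?\d\d(?:,\s*|-))*((?:\d\d)?\d\d)\s+Free\s+Software\s+Foundation, Inc\./;

my $prose  = "In 98 Free Software Foundation staff met.\n";
my $notice = "Copyright (C) 1998, 2007-2008 Free Software Foundation, Inc.\n";

print "unanchored matches prose\n"       if $prose =~ $unanchored;
print "anchored ignores prose\n"         if $prose !~ $anchored;
print "anchored final year in notice: $1\n" if $notice =~ $anchored;
```

Only the anchored pattern requires the year to belong to a copyright statement, so unrelated text mentioning a year and the FSF is left alone.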

> I'm working on a patch to fix all of the above and to automatically format 
> the result to 72 columns.  I'll try to post a patch soon.

I've also implemented reformatting in the first patch.
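
As a rough standalone sketch of that reformatting step, the function below wraps an already-normalized statement so that each line, including its prefix, fits within a margin. It mirrors the loop in the first patch below, but wrap_statement and its parameters are names invented for this example, and it always joins lines with "\n".

```perl
use strict;
use warnings;

# Hypothetical wrapper around the wrapping loop: break $text at spaces
# so that prefix plus text stays within $margin columns where possible.
sub wrap_statement {
    my ($text, $prefix, $margin) = @_;
    my $width = $margin - length($prefix);
    my @lines;
    while (length $text) {
        # Take as many characters as fit, ending at a space or the end;
        # failing that, take one overlong word by itself.
        if ($text =~ s/^(.{1,$width})(?: |$)//
            || $text =~ s/^(\S+)(?: |$)//) {
            push @lines, "$prefix$1";
        }
        else {
            die "cannot wrap: $text";
        }
    }
    return join "\n", @lines;
}

my $statement =
  'Copyright (C) 1990-2005, 2007-2009 Free Software Foundation, Inc.';
print wrap_statement($statement, '# ', 40), "\n";
```

A 40-column margin is used here only to force a line break in a short example; the script itself uses 72.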

In the next two patches below, I've added handling for DOS EOLs and for 
leading tabs in, for example, ChangeLogs.  These are helpful in Bison, at 
least.
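
The two adjustments can be sketched in isolation as follows. The expressions are taken from the patches below, but the function names detect_eol and text_margin are assumptions introduced here so the logic can be run on its own.

```perl
use strict;
use warnings;

my $margin = 72;
my $tab_width = 8;

# Use "\r\n" only if every newline in the text is preceded by "\r".
sub detect_eol {
    my ($text) = @_;
    return $text =~ /(?:^|[^\r])\n/ ? "\n" : "\r\n";
}

# Columns left for text after the prefix, counting each leading tab
# as $tab_width columns rather than one character.
sub text_margin {
    my ($prefix) = @_;
    my $width = $margin - length($prefix);
    if ($prefix =~ /^(\t+)/) {
        $width -= length($1) * ($tab_width - 1);
    }
    return $width;
}

print "DOS file uses CRLF\n"  if detect_eol("a\r\nb\r\n") eq "\r\n";
print "mixed file uses LF\n"  if detect_eol("a\nb\r\n") eq "\n";
print "text width after a tab prefix: ", text_margin("\t"), "\n";
```

A ChangeLog entry indented by one tab therefore gets 64 columns of text, while a "# " prefix leaves 70.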

From 2727188acb1f45b83f5072bfdac740715c78444c Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Fri, 31 Jul 2009 09:11:53 -0400
Subject: [PATCH] update-copyright: automatically format copyright statements

* build-aux/update-copyright: Implement that.
Also, be a little more predictable and safer by always failing
when the full copyright format is not perfectly recognized as an
unbroken whole.  Discussed at
<http://lists.gnu.org/archive/html/bug-gnulib/2009-07/msg00131.html>.
Rewrite documentation.
---
 ChangeLog                  |   10 +++
 build-aux/update-copyright |  160 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 130 insertions(+), 40 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 239faa6..a9e8cfc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2009-07-31  Joel E. Denny  <address@hidden>
+
+       update-copyright: automatically format copyright statements
+       * build-aux/update-copyright: Implement that.
+       Also, be a little more predictable and safer by always failing
+       when the full copyright format is not perfectly recognized as an
+       unbroken whole.  Discussed at
+       <http://lists.gnu.org/archive/html/bug-gnulib/2009-07/msg00131.html>.
+       Rewrite documentation.
+
 2009-07-29  Matt Kraai  <address@hidden>
 
        getloadavg: check whether n_name is a pointer, for QNX 6.4.1
diff --git a/build-aux/update-copyright b/build-aux/update-copyright
index e35d51b..59ce6b6 100755
--- a/build-aux/update-copyright
+++ b/build-aux/update-copyright
@@ -1,7 +1,7 @@
 #!/usr/bin/perl -0777 -pi
 # Update an FSF copyright year list to include the current year.
 
-my $VERSION = '2009-07-30.13:24'; # UTC
+my $VERSION = '2009-07-31.12:44'; # UTC
 
 # Copyright (C) 2009 Free Software Foundation
 #
@@ -20,63 +20,114 @@ my $VERSION = '2009-07-30.13:24'; # UTC
 
 # Written by Jim Meyering
 
-# In the copyright statement in each file, "Copyright (C)" must appear
-# at the beginning of the line except that it may be preceded by any
-# sequence (e.g., a comment) of no more than 5 characters.  Iff that
-# prefix is present, the same prefix should appear at the beginning
-# of each remaining line within the copyright statement so that it
-# can be parsed correctly.
+# The arguments to this script should be names of files that contain FSF
+# copyright statements to be updated.  For example, you may wish to
+# place a target like the following in your top-level makefile in your
+# project:
 #
-# For example, these are fine:
+#   .PHONY: update-copyright
+#   update-copyright:
+#       if test -d .git; then                                   \
+#         git grep -l -w Copyright                              \
+#           | grep -v -E '(^|/)(COPYING|ChangeLog)'             \
+#           | xargs $(srcdir)/build-aux/$@;                     \
+#       fi
+#
+# In the second grep, you can build a list of files to skip within your
+# project.
+#
+# Iff an FSF copyright statement is discovered in a file and the final
+# year is not the current year, the statement is updated for the new
+# year and reformatted to fit within 72 columns.  A warning is printed
+# for every file for which no FSF copyright statement is discovered.
+#
+# Each file's FSF copyright statement must be formatted correctly in
+# order to be recognized, and it must appear before other text that
+# looks like the start of a copyright statement.  For example, each of
+# these by itself is fine:
+#
+#   Copyright (C) 1990-2005, 2007-2009 Free Software Foundation,
+#   Inc.
 #
 #   # Copyright (C) 1990-2005, 2007-2009 Free Software
 #   # Foundation, Inc.
 #
 #   /*
-#    * Copyright (C) 1990-2005, 2007-2009 Free Software
+#    * Copyright (C) 90,2005,2007-2009 Free Software
 #    * Foundation, Inc.
 #    */
 #
-# The following format is not recognized:
+# However, the following format is not recognized because the line
+# prefix changes after the first line:
 #
 #   /* Copyright (C) 1990-2005, 2007-2009 Free Software
 #    * Foundation, Inc.  */
 #
-# A warning is printed for every file for which the copyright format is
-# not recognized.  The culprit may be that the above preconditions are
-# not obeyed as in the previous example, or it may simply be that the
-# stated copyright holder is not the Free Software Foundation.
+# The following copyright statement is not recognized because the
+# copyright holder is not the FSF:
 #
-# You may wish to place a target like the following in your top-level
-# makefile in your project:
+#   Copyright (C) 1990-2005, 2007-2009 Acme, Inc.
 #
-#   .PHONY: update-copyright
-#   update-copyright:
-#       if test -d .git; then                                   \
-#         git grep -l -w Copyright                              \
-#           | grep -v -E '(^|/)(COPYING|ChangeLog)'             \
-#           | xargs $(srcdir)/build-aux/$@;                     \
-#       fi
+# Moreover, any FSF copyright statement following either of the previous
+# copyright statements might not be recognized.
 #
-# You can build a list of files to skip in the second grep.
+# The exact conditions that a file's FSF copyright statement must meet
+# to be recognized are listed below.  They may seem slightly complex,
+# but you need not worry if some file in your project accidentally
+# breaks one.  The worse that can happen is a warning that the file was
+# not updated.
+#
+#   1. The format is "Copyright (C)" (where "(C)" can be "(c)"), then a
+#      list of copyright years, and then the name of the copyright
+#      holder, which is "Free Software Foundation, Inc.".
+#   2. "Copyright (C)" appears at the beginning of a line except that it
+#      may be prefixed by any sequence (e.g., a comment) of no more than
+#      5 characters.
+#   3. The prefix of "Copyright (C)" is the same as the prefix on the
+#      file's first occurrence of "Copyright (C)" that matches condition
+#      #2.  Stated more simply, if something that looks like the start
+#      of a copyright statement appears earlier than the FSF copyright
+#      statement, the FSF copyright statement might not be recognized.
+#      This condition might be removed in the future.
+#   4. Iff a prefix is present before "Copyright (C)", the same prefix
+#      appears at the beginning of each remaining line within the FSF
+#      copyright statement.
+#   5. Blank lines, even if preceded by the prefix, do not appear
+#      within the FSF copyright statement.
+#   6. Each copyright year is 2 or 4 digits, and years are separated by
+#      commas or dashes.  Whitespace may occur after commas.
 
 use strict;
 use warnings;
 
 my ($sec, $min, $hour, $mday, $month, $year) = localtime (time());
 my $this_year = $year + 1900;
-my $holder = 'Free Software Foundation';
-
-my $prefix = '';
-if (/(?:^|\n)(.{0,5})Copyright \([cC]\)/) {
-  $prefix = quotemeta $1;
-}
-$holder = " $holder";
-$holder =~ s/\s/\\s*(?:\\s|\\n$prefix)\\s*/g;
+my $copyright = 'Copyright \([cC]\)';
+my $holder = 'Free Software Foundation, Inc.';
+my $prefix_max = 5;
+my $margin = 72;
 
-if (/([- ])((?:\d\d)?\d\d)($holder)/s)
+my $leading;
+my $prefix;
+my $ws;
+my $old;
+if (/(^|\n)(.{0,$prefix_max})$copyright/)
+  {
+    $leading = $1;
+    $prefix = $2;
+    $ws = '[ \t\r\f]'; # \s without \n
+    $ws = "(?:$ws*(?:$ws|\\n" . quotemeta($prefix) . ")$ws*)";
+    $holder =~ s/\s/$ws/g;
+    $old =
+      quotemeta("$leading$prefix") . "($copyright$ws"
+      . "(?:(?:\\d\\d)?\\d\\d(,$ws?|-))*"
+      . "((?:\\d\\d)?\\d\\d)$ws$holder)";
+  }
+if (defined($old) && /$old/)
   {
-    my ($sep, $last_c_year, $rest) = ($1, $2, $3);
+    my $new = $1;
+    my $sep = $2 ? $2 : "";
+    my $last_c_year = $3;
 
     # Handle two-digit year numbers like "98" and "99".
     $last_c_year <= 99
@@ -84,24 +135,53 @@ if (/([- ])((?:\d\d)?\d\d)($holder)/s)
 
     if ($last_c_year != $this_year)
       {
+        # Update the year.
         if ($sep eq '-' && $last_c_year + 1 == $this_year)
           {
-            s//-$this_year$rest/;
+            $new =~ s/$last_c_year/$this_year/;
           }
-        elsif ($sep eq ' ' && $last_c_year + 1 == $this_year)
+        elsif ($sep ne '-' && $last_c_year + 1 == $this_year)
           {
-            s// $last_c_year-$this_year$rest/;
+            $new =~ s/$last_c_year/$last_c_year-$this_year/;
           }
         else
           {
-            s//$sep$last_c_year, $this_year$rest/;
+            $new =~ s/$last_c_year/$last_c_year, $this_year/;
           }
+
+        # Normalize all whitespace including newline-prefix sequences.
+        $new =~ s/$ws/ /g;
+
+        # Put spaces after commas.
+        $new =~ s/, ?/, /g;
+
+        # Format within margin.
+        my $new_wrapped;
+        my $text_margin = $margin - length($prefix);
+        while (length($new))
+          {
+            if (($new =~ s/^(.{1,$text_margin})(?: |$)//)
+                || ($new =~ s/^([\S]+)(?: |$)//))
+              {
+                my $line = $1;
+                $new_wrapped .= $new_wrapped ? "\n" : $leading;
+                $new_wrapped .= "$prefix$line";
+              }
+            else
+              {
+                # Should be unreachable, but we don't want an infinite
+                # loop if it can be reached.
+                die;
+              }
+          }
+
+        # Replace the old copyright statement.
+        s/$old/$new_wrapped/;
       }
   }
 else
   {
-    print STDERR
-      "$ARGV: warning: external copyright holder or parse failure\n";
+    print STDERR "$ARGV: warning: FSF copyright statement not found\n";
   }
 
 # Local variables:
-- 
1.5.4.3

From 573b87cde2a8907d7d5b0eff02b2463ccb2b811f Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Fri, 31 Jul 2009 09:32:30 -0400
Subject: [PATCH] update-copyright: support EOL=\r\n

* build-aux/update-copyright: Implement that.
---
 ChangeLog                  |    5 +++++
 build-aux/update-copyright |    5 ++++-
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index a9e8cfc..e215618 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2009-07-31  Joel E. Denny  <address@hidden>
 
+       update-copyright: support EOL=\r\n
+       * build-aux/update-copyright: Implement that.
+
+2009-07-31  Joel E. Denny  <address@hidden>
+
        update-copyright: automatically format copyright statements
        * build-aux/update-copyright: Implement that.
        Also, be a little more predictable and safer by always failing
diff --git a/build-aux/update-copyright b/build-aux/update-copyright
index 59ce6b6..777d3e5 100755
--- a/build-aux/update-copyright
+++ b/build-aux/update-copyright
@@ -107,6 +107,9 @@ my $holder = 'Free Software Foundation, Inc.';
 my $prefix_max = 5;
 my $margin = 72;
 
+# Unless the file consistently uses "\r\n" as the EOL, use "\n" instead.
+my $eol = /(?:^|[^\r])\n/ ? "\n" : "\r\n";
+
 my $leading;
 my $prefix;
 my $ws;
@@ -164,7 +167,7 @@ if (defined($old) && /$old/)
                 || ($new =~ s/^([\S]+)(?: |$)//))
               {
                 my $line = $1;
-                $new_wrapped .= $new_wrapped ? "\n" : $leading;
+                $new_wrapped .= $new_wrapped ? $eol : $leading;
                 $new_wrapped .= "$prefix$line";
               }
             else
-- 
1.5.4.3

From 95cccf0e13c5b9d852a0b7d7361386a064e0d352 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Fri, 31 Jul 2009 09:38:05 -0400
Subject: [PATCH] update-copyright: handle leading tabs in line prefix

* build-aux/update-copyright: Count leading tabs as 8 spaces
when computing margin.  This helps with the formatting of
ChangeLogs, for example.
Fix documentation a little.
---
 ChangeLog                  |    8 ++++++++
 build-aux/update-copyright |   10 +++++++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e215618..ef6e40d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,13 @@
 2009-07-31  Joel E. Denny  <address@hidden>
 
+       update-copyright: handle leading tabs in line prefix
+       * build-aux/update-copyright: Count leading tabs as 8 spaces
+       when computing margin.  This helps with the formatting of
+       ChangeLogs, for example.
+       Fix documentation a little.
+
+2009-07-31  Joel E. Denny  <address@hidden>
+
        update-copyright: support EOL=\r\n
        * build-aux/update-copyright: Implement that.
 
diff --git a/build-aux/update-copyright b/build-aux/update-copyright
index 777d3e5..5b2a465 100755
--- a/build-aux/update-copyright
+++ b/build-aux/update-copyright
@@ -22,7 +22,7 @@ my $VERSION = '2009-07-31.12:44'; # UTC
 
 # The arguments to this script should be names of files that contain FSF
 # copyright statements to be updated.  For example, you may wish to
-# place a target like the following in your top-level makefile in your
+# place a target like the following in the top-level makefile in your
 # project:
 #
 #   .PHONY: update-copyright
@@ -74,8 +74,8 @@ my $VERSION = '2009-07-31.12:44'; # UTC
 # The exact conditions that a file's FSF copyright statement must meet
 # to be recognized are listed below.  They may seem slightly complex,
 # but you need not worry if some file in your project accidentally
-# breaks one.  The worse that can happen is a warning that the file was
-# not updated.
+# breaks one.  The worst that can happen is that a file is not updated
+# and a warning is issued.
 #
 #   1. The format is "Copyright (C)" (where "(C)" can be "(c)"), then a
 #      list of copyright years, and then the name of the copyright
@@ -106,6 +106,7 @@ my $copyright = 'Copyright \([cC]\)';
 my $holder = 'Free Software Foundation, Inc.';
 my $prefix_max = 5;
 my $margin = 72;
+my $tab_width = 8;
 
 # Unless the file consistently uses "\r\n" as the EOL, use "\n" instead.
 my $eol = /(?:^|[^\r])\n/ ? "\n" : "\r\n";
@@ -161,6 +162,9 @@ if (defined($old) && /$old/)
         # Format within margin.
         my $new_wrapped;
         my $text_margin = $margin - length($prefix);
+        if ($prefix =~ /^(\t+)/) {
+          $text_margin -= length($1) * ($tab_width-1);
+        }
         while (length($new))
           {
             if (($new =~ s/^(.{1,$text_margin})(?: |$)//)
-- 
1.5.4.3




