bug-guile
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#17296: Uh, wrong?


From: David Kastrup
Subject: bug#17296: Uh, wrong?
Date: Tue, 03 Jun 2014 19:30:23 +0200

I see that a patch has been committed.  It is conflicting with another
patch of mine and it does not follow from the srfi-1 specification.

To wit:

<URL:http://srfi.schemers.org/srfi-1/srfi-1.html#ImproperLists>

    Most procedures are defined only on proper lists -- that is, finite,
    nil-terminated lists. The procedures that will also handle circular
    or dotted lists are specifically marked. While this design decision
    restricts the domain of possible arguments one can pass to these
    procedures, it has the benefit of allowing the procedures to catch
    the error cases where programmers inadvertently pass scalar values
    to a list procedure by accident, e.g., by switching the arguments to
    a procedure call.

Note "allowing to catch".

Then we have

    Errors

    Note that statements of the form "it is an error" merely mean "don't
    do that." They are not a guarantee that a conforming implementation
    will "catch" such improper use by, for example, raising some kind of
    exception. Regrettably, R5RS Scheme requires no firmer guarantee
    even for basic operators such as car and cdr, so there's little
    point in requiring these procedures to do more. Here is the relevant
    section of the R5RS:

        When speaking of an error situation, this report uses the phrase
        "an error is signalled" to indicate that implementations must
        detect and report the error. If such wording does not appear in
        the discussion of an error, then implementations are not
        required to detect or report the error, though they are
        encouraged to do so. An error situation that implementations are
        not required to detect is usually referred to simply as "an
        error."

        For example, it is an error for a procedure to be passed an
        argument that the procedure is not explicitly specified to
        handle, even though such domain errors are seldom mentioned in
        this report. Implementations may extend a procedure's domain of
        definition to include such arguments.

So let's see how we have defined stuff here:

    list        A proper (finite, nil-terminated) list
    clist       A proper or circular list
    flist       A finite (proper or dotted) list 

For length+, we have

    length  list -> integer
    length+ clist -> integer or #f

    Both length and length+ return the length of the argument. It is an
    error to pass a value to length which is not a proper list (finite
    and nil-terminated). In particular, this means an implementation may
    diverge or signal an error when length is applied to a circular
    list.

    length+, on the other hand, returns #F when applied to a circular
    list.

    The length of a proper list is a non-negative integer n such that
    cdr applied n times to the list produces the empty list.

So the behavior for length+ on a dotted list is strictly unspecified.
It is not even stated "it is an error".

Functions like fold state:

     fold kons knil clist1 clist2 ... -> value

    The fold operation terminates when the shortest list runs out of values:

    (fold cons* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1)

    At least one of the list arguments must be finite.

Note the wording: "at least one of the list arguments must be finite",
not "at least one of the list arguments must be proper".  The definition
of the recursion for the single-list case again leaves the case of a
dotted list unspecified.

If we take a look at the reference implementation at
<URL:http://srfi.schemers.org/srfi-1/srfi-1-reference.scm>, we get

    (define (length+ x)                 ; Returns #f if X is circular.
      (let lp ((x x) (lag x) (len 0))
        (if (pair? x)
            (let ((x (cdr x))
                  (len (+ len 1)))
              (if (pair? x)
                  (let ((x   (cdr x))
                        (lag (cdr lag))
                        (len (+ len 1)))
                    (and (not (eq? x lag)) (lp x lag len)))
                  len))
            len)))

which _clearly_ treats dotted lists like regular lists with regard to
giving the length of the spine.

Incidentally, the reference implementation also contains

    ;;; R4RS, so commented out.
    ;(define (length x)                 ; LENGTH may diverge or
    ;  (let lp ((x x) (len 0))          ; raise an error if X is
    ;    (if (pair? x)                  ; a circular list. This version
    ;        (lp (cdr x) (+ len 1))             ; diverges.
    ;        len)))

which again would work on dotted lists just fine.  But I am not all that
interested in meddling with that.

At any rate, what I am getting at is that I was going to submit the
following patch as a part of a series fixing other bugs, bugs that I
need a working "get the length of a dotted list" operator for.  We don't
have any such operator in GUILE, and that's awkward.

commit 9539272d26f2954a253ed1365a6704ed197a79be
Author: David Kastrup <address@hidden>
Date:   Mon Jun 2 15:05:55 2014 +0200

    Let length+ return the length of dotted lists rather than #f
    
    * libguile/srfi-1.c (scm_srfi1_length_plus): Previously, length+
      returned #f for dotted lists.  This leaves the user with no efficient
      means for determining the length of dotted lists.  While the Scheme
      standard does not prescribe a behavior here, the reference
      implementation at
      <URL:http://srfi.schemers.org/srfi-1/srfi-1-reference.scm> indeed
      returns the spine length (number of successive pairs in the cdr-chain)
      of dotted lists rather than #f, providing a good endorsement of this
      behavior.
    
      As one consequence, the multi-list implementations for map, fold, and
      for-each will happen to accept dotted lists as the shortest list.
      Previously, this caused an error late during processing.

diff --git a/libguile/srfi-1.c b/libguile/srfi-1.c
index aaa3efe..0db6388 100644
--- a/libguile/srfi-1.c
+++ b/libguile/srfi-1.c
@@ -614,8 +614,32 @@ SCM_DEFINE (scm_srfi1_length_plus, "length+", 1, 0, 0,
            "circular.")
 #define FUNC_NAME s_scm_srfi1_length_plus
 {
-  long len = scm_ilength (lst);
-  return (len >= 0 ? SCM_I_MAKINUM (len) : SCM_BOOL_F);
+  /* This uses the "tortoise and hare" algorithm to detect "infinitely
+     long" lists (i.e. lists with cycles in their cdrs), and returns #f
+     if it does find one.
+
+     Dotted lists are treated just like regular lists, returning the
+     length of the spine.  This is in conformance with the reference
+     implementation though not explicitly defined in the standard. */
+  long i = 0;
+  SCM tortoise = lst;
+  SCM hare = lst;
+
+  do {
+    if (!scm_is_pair (hare)) return scm_from_long (i);
+    hare = SCM_CDR(hare);
+    i++;
+    if (!scm_is_pair (hare)) return scm_from_long (i);
+    hare = SCM_CDR(hare);
+    i++;
+    /* For every two steps the hare takes, the tortoise takes one.  */
+    tortoise = SCM_CDR(tortoise);
+  }
+  while (!scm_is_eq (hare, tortoise));
+
+  /* If the tortoise ever catches the hare, then the list must contain
+     a cycle.  */
+  return SCM_BOOL_F;
 }
 #undef FUNC_NAME
 
diff --git a/module/srfi/srfi-1.scm b/module/srfi/srfi-1.scm
index 0806e73..bc72048 100644
--- a/module/srfi/srfi-1.scm
+++ b/module/srfi/srfi-1.scm
@@ -474,7 +474,7 @@ that result.  See the manual for details."
                      (or len1 len2))))
        (unless len
          (scm-error 'wrong-type-arg "fold"
-                    "Args do not contain a proper (finite) list: ~S"
+                    "Args do not contain a finite list: ~S"
                     (list (list list1 list2)) #f))
        (let fold2 ((knil knil) (list1 list1) (list2 list2) (len len))
          (if (zero? len)
@@ -601,7 +601,7 @@ has just one element then that's the return value."
                      (or len1 len2))))
        (unless len
          (scm-error 'wrong-type-arg "map"
-                    "Args do not contain a proper (finite) list: ~S"
+                    "Args do not contain a finite list: ~S"
                     (list (list l1 l2)) #f))
        (let map2 ((l1 l1) (l2 l2) (len len))
          (if (zero? len)
@@ -620,7 +620,7 @@ has just one element then that's the return value."
                       rest)))
        (if (not len)
            (scm-error 'wrong-type-arg "map"
-                      "Args do not contain a proper (finite) list: ~S"
+                      "Args do not contain a finite list: ~S"
                       (list (cons l1 rest)) #f))
        (let mapn ((l1 l1) (rest rest) (len len))
          (if (zero? len)
@@ -649,7 +649,7 @@ has just one element then that's the return value."
                      (or len1 len2))))
        (unless len
          (scm-error 'wrong-type-arg "for-each"
-                    "Args do not contain a proper (finite) list: ~S"
+                    "Args do not contain a finite list: ~S"
                     (list (list l1 l2)) #f))
        (let for-each2 ((l1 l1) (l2 l2) (len len))
          (unless (zero? len)
@@ -667,7 +667,7 @@ has just one element then that's the return value."
                       rest)))
        (if (not len)
            (scm-error 'wrong-type-arg "for-each"
-                      "Args do not contain a proper (finite) list: ~S"
+                      "Args do not contain a finite list: ~S"
                       (list (cons l1 rest)) #f))
        (let for-eachn ((l1 l1) (rest rest) (len len))
          (if (> len 0)
diff --git a/test-suite/tests/srfi-1.test b/test-suite/tests/srfi-1.test
index d40f8e1..9364ea2 100644
--- a/test-suite/tests/srfi-1.test
+++ b/test-suite/tests/srfi-1.test
@@ -1187,19 +1187,21 @@
     (pass-if-exception "proc arg count 4" exception:wrong-num-args
       (fold (lambda (x y z prev) x) 1 '(1 2 3) '(1 2 3)))
 
-    (pass-if-exception "improper first 1" exception:wrong-type-arg
-      (fold + 1 1 '(1 2 3)))
-    (pass-if-exception "improper first 2" exception:wrong-type-arg
-      (fold + 1 '(1 . 2) '(1 2 3)))
-    (pass-if-exception "improper first 3" exception:wrong-type-arg
-      (fold + 1 '(1 2 . 3) '(1 2 3)))
-
-    (pass-if-exception "improper second 1" exception:wrong-type-arg
-      (fold + 1 '(1 2 3) 1))
-    (pass-if-exception "improper second 2" exception:wrong-type-arg
-      (fold + 1 '(1 2 3) '(1 . 2)))
-    (pass-if-exception "improper second 3" exception:wrong-type-arg
-      (fold + 1 '(1 2 3) '(1 2 . 3)))
+    ;; For multiple list arguments, dotted lists are permitted by this
+    ;; implementation and a non-list is a zero-length dotted list
+    (pass-if "improper first 1"
+      (= 1 (fold + 1 1 '(1 2 3))))
+    (pass-if "improper first 2"
+      (= 3 (fold + 1 '(1 . 2) '(1 2 3))))
+    (pass-if "improper first 3"
+      (= 7 (fold + 1 '(1 2 . 3) '(1 2 3))))
+
+    (pass-if "improper second 1"
+      (= 1 (fold + 1 '(1 2 3) 1)))
+    (pass-if "improper second 2"
+      (= 3 (fold + 1 '(1 2 3) '(1 . 2))))
+    (pass-if "improper second 3"
+      (= 7 (fold + 1 '(1 2 3) '(1 2 . 3))))
 
     (pass-if (= 6 (fold + 1 '(2) '(3))))
     (pass-if (= 15 (fold + 1 '(2 3) '(4 5))))
I am not saying that your implementation of count+ is wrong.  It meets
the specs of srfi-1.  But so did the previous implementation.  And so
does mine.

And since there is otherwise no actual length operator of any kind
working for dotted lists, I want one.  The previous implementation was
actually trivially the same as

(define (length* lst)
  (catch 'wrong-type-arg
    (lambda () (length lst))
    (lambda _ #f)))

and consequently was trivial to write using existing length functions in
the C API.

And I need my variant of length+ to fix things like take-right which
currently bomb out on large lists with a VM error and are _required_ to
deal with _both_ proper and dotted lists.

Which is awkward to do without a length operator working on both proper
and dotted lists.

-- 
David Kastrup

reply via email to

[Prev in Thread] Current Thread [Next in Thread]