Syntax locations are ambiguous: can we track source 'offset' and 'length'?

From: Vivien Kraus
Subject: Syntax locations are ambiguous: can we track source 'offset' and 'length'?
Date: Tue, 03 Aug 2021 00:50:24 +0200
Dear guilers,

I’m playing with syntaxes as first-class objects, and I notice that the
syntax source location is ambiguous:

(syntax-case (call-with-input-string "(a\r b)" read-syntax) ()
  ((a b)
   (values (syntax-source #'a) (syntax-source #'b))))


$1 = ((line . 0) (column . 1))
$2 = ((line . 0) (column . 1))

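This is obviously because of #\return: it resets the column to 0
without starting a new line, so 'a' and 'b' end up with the same
source position.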
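
To make the effect visible character by character, here is a minimal sketch that records the port column just before each character is read. The helper name columns-before-each-char is made up for this illustration; port-column, read-char and call-with-input-string are the usual Guile procedures.

```scheme
;; Hypothetical helper: pair every character of STR with the column the
;; port reports just before that character is read.
(define (columns-before-each-char str)
  (call-with-input-string str
    (lambda (port)
      (let loop ((acc '()))
        (let* ((col (port-column port))   ; column before reading
               (c (read-char port)))
          (if (eof-object? c)
              (reverse acc)
              (loop (cons (cons c col) acc))))))))

;; #\a and #\b should both be paired with column 1, matching $1 and $2.
(columns-before-each-char "(a\r b)")
```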

I am trying to use the Guile reader to read Scheme comments, in
addition to the syntax elements. I know with syntax-source where a
syntax object starts, and I can know where it ends by using a spying
soft port and re-reading it. However, the #\return ambiguity makes all
my efforts pointless.
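
For reference, the "spying soft port" idea might look roughly like the sketch below. It is only an illustration under assumptions: make-counting-input-port and read-with-count are hypothetical names, plain read stands in for read-syntax, and the count is only an upper bound on the end offset, since the reader (or the port's buffering) may pull characters beyond the datum.

```scheme
;; Hypothetical helper: wrap PORT in an input-only soft port that counts
;; every character pulled from it.  Returns two values: the wrapper port
;; and a thunk reporting the number of characters pulled so far.
(define (make-counting-input-port port)
  (let ((count 0))
    (values
     (make-soft-port
      (vector (lambda (c) #f)        ; write-char: unused for input
              (lambda (s) #f)        ; write-string: unused for input
              (lambda () #f)         ; flush: unused for input
              (lambda ()             ; read-char: count, then forward
                (let ((c (read-char port)))
                  (unless (eof-object? c)
                    (set! count (+ count 1)))
                  c))
              (lambda () (close-port port)))
      "r")
     (lambda () count))))

;; Read one datum from STR and return (datum . characters-pulled).
(define (read-with-count str)
  (call-with-input-string str
    (lambda (port)
      (call-with-values (lambda () (make-counting-input-port port))
        (lambda (spy chars-pulled)
          (let ((datum (read spy)))
            (cons datum (chars-pulled))))))))

(read-with-count "(a\r b) rest")
```

Unlike the line and column pair, such a character count keeps growing across #\return, which is why an offset would remove the ambiguity.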

In (system base lalr), the source location contains a filename, line
and column, but also an offset and length. However, these last 2 get
dropped in source-location->source-properties. I could not find out
whether this is relevant to read-syntax.
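
A small sketch of what gets lost, assuming the constructor takes its arguments in the order input, line, column, offset, length (the file name and numbers are arbitrary example values):

```scheme
(use-modules (system base lalr))

;; Assumed argument order: input, line, column, offset, length.
(define loc (make-source-location "example.scm" 0 1 4 1))

;; The record itself still carries the offset and the length.
(source-location-offset loc)
(source-location-length loc)

;; The converted source properties are expected to contain only the
;; filename, line and column entries.
(define props (source-location->source-properties loc))
props
```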

So, is there a way to track source offset and length for syntax
objects?

Best regards,


