bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Hurd term server (was: Hurd GCC ping)


From: Thomas Schwinge
Subject: Hurd term server (was: Hurd GCC ping)
Date: Thu, 9 Oct 2014 14:02:39 +0200
User-agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/24.3.1 (i586-pc-linux-gnu)

Hi!

[CCing the Hurd developers having written or worked on the term server.
Would appreciate your comments, if you have any.]

On Wed, 20 Aug 2014 01:24:36 +0200, I wrote:
> Matthias Klose has recently re-enable the GCC testsuite for GNU/Hurd, and
> while it now runs to completion (hooray!) there are a number of
> unexpected test failures (search for »FAIL:«):
> 
> On Fri, 15 Aug 2014 14:32:01 +0200, Matthias Klose <doko@ubuntu.com> wrote:
> > https://buildd.debian.org/status/fetch.php?pkg=gcc-snapshot&arch=hurd-i386&ver=20140814-1&stamp=1408093330
> 
> I once began analyzing this.  After one system upgrade, many months ago
> (exact timing lost; hardware failure, system set up again, etc.), most of
> these FAILs suddenly appeared (especially those testing the basic
> functionality of GCC, which he rightly considered worrying).  When
> running the FAILing test cases manually, there are no failures.  What
> strikes out is that often it's only the later checks for warnings/errors
> of one test case that FAIL (where the previous ones have PASSed), and I
> think I concluded that GCC's output on stdout gets truncated once it's
> reached a limit of 1 KiB of data (or similar) -- but only if running the
> testing through »make check«, DejaGnu (runtest), and not when running GCC
> (xgcc) manually, where everything works fine.  Also. I got different
> results whether I was running in a screen session or not, and/or when I
> had been running the testsuite and/or a screen session on the same PTY
> before, or a fresh one.  (To sum it up: a mess to diagnose.)

The failure mode is that the (expected) errors output (as "seen" by the
testing framework) is truncated:

    spawn [xgcc]
    
/media/erich/home/thomas/tmp/gcc/trunk/gcc/testsuite/gcc.dg/cpp/pr33466.c:8:18: 
error: invalid suffix "rh" on floating constant
    [...]
    
/media/erich/home/thomas/tmp/gcc/trunk/gcc/testsuite/gcc.dg/cpp/pr33466.c:53:19:
 error: invalid suffix "ddf"[truncated here]

This is expected to show additional errors until line 64, but is cut off
after 6000-something characters.

> It may be something "simple" like the SA_RESTART bug we recently fixed in
> dash: maybe something similar to that in GCC, or DejaGnu (runtest,
> expect, TCL), or screen, or something "funny" happening in the Hurd's PTY
> machinery (or FIFO?)...

Turns out it is an issue more of the latter kind...  That is, an
"incompatibility" of some kind, deep in TCL's buffering implementation
when reading from PTYs and/or the expect program's usage of these TCL
interfaces -- I cannot claim to understand this code.  Anyhow, what can
be observed with the Linux kernel, when stracing the following code:

    #!/usr/bin/expect -f
    
    # Doesn't seem to matter.
    #stty -cooked
    stty cooked
    
    #spawn sh -c "/media/erich/home/thomas/tmp/gcc/755295.build/gcc/xgcc.real 
-B/media/erich/home/thomas/tmp/gcc/755295.build/gcc/ 
/media/erich/home/thomas/tmp/gcc/755295/gcc/testsuite/gcc.dg/cpp/pr33466.c  
-fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99 -S -o 
pr33466.s 2> /tmp/e; cat < /tmp/e"
    spawn sh -c "cat < /tmp/e"
    #spawn sh -c "for i in \$(seq 1 99); do echo \$i \$(seq 0 50); done > 
/tmp/d; cat < /tmp/d"
    #spawn sh -c "for i in \$(seq 1 99); do echo \$i 
--------------------------------------------------------------------------; 
done > /tmp/d; cat < /tmp/d"
    #spawn printf "%4095d\r\nabc" 1
    #spawn printf "%4096d%4096d\r\nabc" 1 2
    interact

..., is that the read syscall (reading from the spawned process) never
returns more than 4095 bytes (that is, does a "short read"), even though
4096 bytes have been requested.  The buffering implementation in TCL
recognizes this, and presumably assumes that there is a chance for the
next read syscall to block, and so first returns that data for the except
script to process.  On GNU Hurd in turn, the term server returns the full
4096 bytes, and the buffering implementation in TCL continues to read
another 2000-something bytes (its buffer having been configured for
6000-something bytes), and then returns all that data to the except
script, which that does process fine, but then fails to continue reading
the next chunk of data.  This is what is causing the truncation of the
error messages.

Now, my knowledge about Unix TTY/PTYs is terribly limited.  However, I
have read somewhere that indeed not the full 4096 bytes, even if
available, can be returned for there must be one character reserved for a
trailing newline (or similar...).  If there is such a protocol to be
obeyed, then our non-conformance might be what is confusing the buffering
implementation in TCL.  (Alternatively, if every other Unix but GNU Hurd
always returns "short reads", then maybe there really is a bug in the
buffering implementation in TCL that has not been noticed until now.)
(Also I cannot tell what change in GNU Hurd it is that this issue now
appears -- as an experiment, I downgraded all related packages to old
versions, that I used before ever noticing this, and this didn't help, so
it can't be a regression in TCL or the expect program itself, for
example.)

With the following hack applied to Hurd's term server, the GCC testsuite
again works as expected, and no regressions are seen with the GDB
testsuite, another heavy user of the expect program.  Obviously, this
doesn't quite look like a proper fix...

diff --git term/ptyio.c term/ptyio.c
index 211e70a..ac7fb85 100644
--- term/ptyio.c
+++ term/ptyio.c
@@ -26,6 +26,8 @@
 #include "term.h"
 #include "tioctl_S.h"
 
+#define READ_MAX 4095
+
 /* Set if we need a wakeup when tty output has been done */
 static int pty_read_blocked = 0;
 
@@ -350,6 +352,10 @@ pty_io_read (struct trivfs_protid *cred,
        size++;
     }
 
+  if (!packet_mode && !user_ioctl_mode)
+    if (amount > READ_MAX)
+      amount = READ_MAX;
+
   if (size > amount)
     size = amount;
   if (size > *datalen)
@@ -446,6 +452,7 @@ pty_io_write (struct trivfs_protid *cred,
 }
 
 /* Validation has already been done by trivfs_S_io_readable */
+//TODO: have to consider READ_MAX here?
 error_t
 pty_io_readable (size_t *amt)
 {
diff --git term/users.c term/users.c
index 9bd51d0..e2ab473 100644
--- term/users.c
+++ term/users.c
@@ -1545,6 +1545,7 @@ S_tioctl_tiocsti (struct trivfs_protid *cred,
 }
 
 /* TIOCOUTQ -- return output queue size */
+//TODO: have to consider ptyio.c:READ_MAX here?
 kern_return_t
 S_tioctl_tiocoutq (struct trivfs_protid *cred,
                   int *queue_size)


Grüße,
 Thomas

Attachment: pgp7oRGcRLcT7.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]