bug-groff
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug #59817] [PATCH] src/roff/troff/input.cpp: Flatten .ec parser


From: G. Branden Robinson
Subject: [bug #59817] [PATCH] src/roff/troff/input.cpp: Flatten .ec parser
Date: Mon, 4 Jan 2021 21:03:39 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Update of bug #59817 (project groff):

                  Status:                    None => Need Info              
             Assigned to:                    None => gbranden               

    _______________________________________________________

Follow-up Comment #1:

Hi Ivan!

[comment #0 original submission:]
> Using local temporary `int c` contributes to readability

I'm afraid I don't agree.  It's possible I've spent enough time looking at
input.cpp that I've grown accustomed to James Clark's idioms (and the GNU
coding style :-O ), but I don't find the logic of this short function
difficult to follow.

> and saves from some extra assignments.
> Might have improved code generation as well.

GNU troff is not a high-performance application.  As you may know, for many
years it did not even have a test suite, and even now it has only a handful of
tests that directly exercise certain features (we have some fairly effective
de facto integration tests thanks to some complex documents that are rendered
as part of the build process, on the other hand).

I'm more concerned with verifying the correctness of the formatter's operation
than optimizing its performance at this point.  Knuth's famous maxim about
premature optimization applies here, I think.

Furthermore, performance claims should always be quantified, or at least
concretized.  If a person is unwilling or unable to show the better assembly
language that is generated by a patch, I am reflexively skeptical about claims
of improved code generation.  (Doing so opens the can of worms that is the
variable impact of such changed for different instruction set architectures,
which I suspect is one reason people prefer to hand-wave this issue.  On the
gripping hand, I wouldn't be surprised to learn that for pure user-mode code
like this, almost every ISA in use by POSIX systems is comparable after
adjusting for implementation issues like instruction width, clock speed,
number of cores, pipelining, and so forth.)

If this were C, you'd have an easier time convincing me, because you're right
that tok.ch() appears twice, and as a general rule I try to minimize function
calls to obtain a value when that value is not expected to change.  In C,
saving the value to a temporary makes sense, because a thing that looks like a
function call usually _is_ one, and function calls are inherently more
expensive than basic blocks.  In C++, compilers have to be "smarter" and
there's a deeper idiom of getters and setters.  C++ is a higher-level language
than C, and one of the things we trade away for higher-level abstractions is
performance, despite much emphasis given to "zero-cost abstractions" by C++
advocates.

In the instant case, there may be no function call at all because token.ch()
is defined as an inline function (see src/roff/troff/token.h).

> Comments (copied from groff(7)) are supposed to help travelers navigate
> their way through the code base.

More helpful I think would simply be a comment above the function giving it a
specification.  Individual lines of code should require commenting only if
they're having to work around a bug elsewhere, or if they're doing something
"clever", for which a programmer usually owes the reader an apology.

> For future contributions, what's the right way to send patches here? Can I
use git send-email? How?

I don't know of a way to mail attachments to Savannah, but I know little of
Savannah's ticket tracking system.  `git --format-patch` with appropriate
arguments seems to work fine for our contributors, and can be reliably used
with `git am`.  It looks like that's what you already did here, so that's
fine.

Here's the meat of your patch for web/email reader convenience.


diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index 742447f6..16fa69c4 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -168,16 +168,18 @@ void chop_macro();                // declare to avoid 
friend name
injection
 
 void set_escape_char()
 {
-  if (has_arg()) {
-    if (tok.ch() == 0) {
-      error("bad escape character");
-      escape_char = '\\';
-    }
-    else
-      escape_char = tok.ch();
+  int c = '\\';
+  if (!has_arg()) {
+    // .ec     Reset escape character to '\'.
+  } else if (tok.ch() == 0) {
+    error("bad escape character");
+    // Also reset to '\'.
+  } else {
+    // .ec c    Set escape character to c.
+    c = tok.ch();
   }
-  else
-    escape_char = '\\';
+
+  escape_char = c;
   skip_line();
 }


Thanks for the report.  While I'm leaning against merging it, I am open to
rebuttals, either yours or those of others.

Regards,
Branden

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59817>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]