bug-texinfo
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Further reducing special case code for XS structuring


From: Gavin Smith
Subject: Re: Further reducing special case code for XS structuring
Date: Thu, 4 Apr 2024 22:23:13 +0100

On Wed, Apr 03, 2024 at 11:53:09PM +0200, Patrice Dumas wrote:
> To avoid that, you could change the special call to $document->tree
> with an argument, like $document->tree(1) to be $document->tree(), as
> the argument, which is only taken in into account in XS is such that an
> handler only is returned, similar to calling get_document instead of
> build_document, while it is not what you want, you want the tree to be
> built at that point such that Texinfo::Transformations has a real tree
> to work with and not hash reference containing only the information to
> retrieve the tree in XS/C.
> 
> Only doing that is likely to lead to the Perl tree being incorrectly
> built if TEXINFO_XS_CONVERT is set to 1, while we want the tree not
> to be built at that point in that case.  So probably, $XS_convert need to be
> determined as in Texinfo/Convert/HTML.pm and be passed to
> $document->tree where a 1 is passed currently, so something like l 1528
> of texi2any.pl (and similar in test_utils.pl)
> 
>   my $tree = $document->tree($XS_convert);
> 
> I have not tested that part, so I may be wrong.

I am working on changes for this, but it still doesn't seem right to me,
in the case of XS structuring being on, and XS conversion being off.

As of commit 5f6800edc (2024-04-03 13:29:19 +0100), in texi2any.pl the
tree is retrieved on line 1526:

      goto NEXT;
    }
  
    my $tree = $document->tree(1);
    if ($tree_transformations{'fill_gaps_in_sectioning'}) {
      Texinfo::Transformations::fill_gaps_in_sectioning($tree);
    }


The changes involved changes to this part of the code:


@@ -1523,7 +1529,14 @@ while(@input_files) {
     goto NEXT;
   }
 
-  my $tree = $document->tree(1);
+  my $tree;
+  if (!$XS_convert) {
+    $tree = $document->tree();
+  } else {
+    # The argument prevents the tree being built as a Perl structure in
+    # case XS code is being used all the way through to conversion.
+    $tree = $document->tree(1);
+  }
   if ($tree_transformations{'fill_gaps_in_sectioning'}) {
     Texinfo::Transformations::fill_gaps_in_sectioning($tree);
   }


However, this is before several transformations take place on the
structure.  So if XS structuring is on, these transformations will (or
should) take place on the C tree, and the building of the Perl tree at
this stage seems needless.

For example, in the first transformation:

  if ($tree_transformations{'fill_gaps_in_sectioning'}) {
    Texinfo::Transformations::fill_gaps_in_sectioning($tree);
  }

'fill_gaps_in_sectioning' in structuring_transfo/StructuringTransfoXS.xs
calls 'get_sv_tree_document' in main/get_perl_info.c, looking for a
"tree_document_descriptor" hash value, which is indeed set (even though the
rest of the Perl tree is there as well), so it proceeds correctly on the
C version of the tree.  So it does appear to be done correctly, in spite
of some unnecessary processing.

(It also seems a bit confusing that some transformations use a "tree"
object and some use a "document" object.)

So I thought it should depend on "XS structuring" being on, rather than
"XS conversion":

    my $tree;
    if (!$XS_structuring) {
      $tree = $document->tree();
    } else {
      # The argument prevents the tree being built as a Perl structure at
      # this stage; only a "handle" is returned.
      $tree = $document->tree(1);
    }

Does this look right?

(Admittedly, this is still special-casing for XS, which I was trying to
eliminate, although it's still simpler overall, in my opinion.)

"make check" passed with this, both with TEXINFO_XS_STRUCTURE=0 and with
TEXINFO_XS_STRUCTURE=1.

Then it should be possible to simplify parse_texi_file in Parsetexi.pm
not to take the extra argument ($no_build).  This would be good, as it
would match parse_texi_file in ParserNonXS.pm, which doesn't have the
extra argument.  I feel that XS and pure Perl functions should present
the same interface as far as is practicable.  (Anyone reading a call
to parse_texi_file in the Perl code might find it harder to delve into
the XS code to understand what the extra argument does.  There's an
immediate barrier in working out that Texinfo::Parser::parse_texi_file
is actually implemented in Texinfo/XS/parsetexi/Parsetexi.pm - not that
this is avoidable, of course.  Then they may have to understand the
role of *.xs files in defining XS subroutines, like 'get_document' or
'build_document' if they want to understand what these do.)

The current state of the change is below.  I'd also plan on reviewing
the other parse_texi_* functions in Parsetexi.pm (such as parse_texi_piece)
to see if the $no_build and $no_store arguments are necessary.  I expect
I'll do this in the next few days.

(I had to change t/protect_character_in_texinfo.t to keep the call to
Texinfo::Document::rebuild_tree working - this is the only place in the
code base this function is called.)

diff --git a/tp/Texinfo/Document.pm b/tp/Texinfo/Document.pm
index 4b8ceb1c99..b5335dcfb6 100644
--- a/tp/Texinfo/Document.pm
+++ b/tp/Texinfo/Document.pm
@@ -41,12 +41,6 @@ my $XS_parser = ((not defined($ENV{TEXINFO_XS})
                  and (not defined($ENV{TEXINFO_XS_PARSER})
                       or $ENV{TEXINFO_XS_PARSER} eq '1'));
 
-# XS parser and not explicitely unset
-my $XS_structuring = ($XS_parser
-                      and (not defined($ENV{TEXINFO_XS_STRUCTURE})
-                           or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
-
-# needed by parser
 our %XS_overrides = (
   "Texinfo::Document::remove_document"
     => "Texinfo::DocumentXS::remove_document",
@@ -54,10 +48,6 @@ our %XS_overrides = (
     => "Texinfo::DocumentXS::set_document_global_info",
   "Texinfo::Document::errors"
     => "Texinfo::DocumentXS::document_errors",
-);
-
-# needed by structure code
-our %XS_structure_overrides = (
   "Texinfo::Document::rebuild_document"
     => "Texinfo::DocumentXS::rebuild_document",
   "Texinfo::Document::rebuild_tree"
@@ -95,11 +85,6 @@ sub import {
       for my $sub (keys %XS_overrides) {
         Texinfo::XSLoader::override ($sub, $XS_overrides{$sub});
       }
-      if ($XS_structuring) {
-        for my $sub (keys %XS_structure_overrides) {
-          Texinfo::XSLoader::override ($sub, $XS_structure_overrides{$sub});
-        }
-      }
     }
     $module_loaded = 1;
   }
diff --git a/tp/Texinfo/XS/parsetexi/Parsetexi.pm 
b/tp/Texinfo/XS/parsetexi/Parsetexi.pm
index e7f9a65779..39aa31a2c2 100644
--- a/tp/Texinfo/XS/parsetexi/Parsetexi.pm
+++ b/tp/Texinfo/XS/parsetexi/Parsetexi.pm
@@ -231,11 +231,10 @@ sub _get_parser_info($$;$$) {
   return $document;
 }
 
-sub parse_texi_file ($$;$)
+sub parse_texi_file ($$)
 {
   my $self = shift;
   my $input_file_path = shift;
-  my $no_build = shift;
   my $tree_stream;
 
   # the file is already a byte string, taken as is from the command
@@ -259,7 +258,7 @@ sub parse_texi_file ($$;$)
     return undef;
   }
 
-  my $document = _get_parser_info($self, $document_descriptor, $no_build);
+  my $document = _get_parser_info($self, $document_descriptor, 1);
 
   return $document;
 }
diff --git a/tp/t/protect_character_in_texinfo.t 
b/tp/t/protect_character_in_texinfo.t
index 9df2d7ffa5..90c2a4a771 100644
--- a/tp/t/protect_character_in_texinfo.t
+++ b/tp/t/protect_character_in_texinfo.t
@@ -13,6 +13,14 @@ use Texinfo::Common qw(protect_comma_in_tree 
protect_colon_in_tree
 use Texinfo::Convert::Texinfo;
 use Texinfo::Document;
 
+# XS parser and not explicitly unset
+my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
+                        or $ENV{TEXINFO_XS} ne 'omit')
+                       and (not defined($ENV{TEXINFO_XS_PARSER})
+                            or $ENV{TEXINFO_XS_PARSER} eq '1')
+                       and (not defined($ENV{TEXINFO_XS_STRUCTURE})
+                            or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
+
 ok(1);
 
 #my %avoided_keys_tree;
@@ -55,7 +63,9 @@ sub run_test($$$$)
   # rebuild tree
   $tree_as_text = $document->tree();
 
-  $tree_as_line = Texinfo::Document::rebuild_tree($tree_as_line);
+  if ($XS_structuring) {
+    $tree_as_line = Texinfo::Document::rebuild_tree($tree_as_line);
+  }
 
   my $texi_result_as_text
      = Texinfo::Convert::Texinfo::convert_to_texinfo($tree_as_text);
diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl
index 035da2be3f..de5e703699 100644
--- a/tp/t/test_utils.pl
+++ b/tp/t/test_utils.pl
@@ -1013,10 +1013,10 @@ sub test($$)
   if (!$test_file) {
     if ($full_document) {
       print STDERR "  TEST FULL $test_name\n" if ($self->{'DEBUG'});
-      $document = $parser->parse_texi_text($test_text, undef, $XS_structuring);
+      $document = $parser->parse_texi_text($test_text, undef, 1);
     } else {
       print STDERR "  TEST $test_name\n" if ($self->{'DEBUG'});
-      $document = $parser->parse_texi_piece($test_text, undef, 
$XS_structuring);
+      $document = $parser->parse_texi_piece($test_text, undef, 1);
       if (defined($test_input_file_name)) {
         warn "ERROR: $self->{'name'}: $test_name: piece of texi with a file 
name\n";
       }
@@ -1029,9 +1029,17 @@ sub test($$)
     }
   } else {
     print STDERR "  TEST $test_name ($test_file)\n" if ($self->{'DEBUG'});
-    $document = $parser->parse_texi_file($test_file, $XS_structuring);
+    $document = $parser->parse_texi_file($test_file, 1);
+  }
+
+  my $tree;
+  if (!$XS_structuring) {
+    $tree = $document->tree();
+  } else {
+    # The argument prevents the tree being built as a Perl structure at
+    # this stage; only a "handle" is returned.
+    $tree = $document->tree(1);
   }
-  my $tree = $document->tree(1);
 
   if (not defined($tree)) {
     print STDERR "ERROR: parsing result undef\n";
diff --git a/tp/texi2any.pl b/tp/texi2any.pl
index 7f6327f504..3efff24c3f 100755
--- a/tp/texi2any.pl
+++ b/tp/texi2any.pl
@@ -1422,7 +1422,7 @@ die _encode_message(
    .sprintf(__("Try `%s --help' for more information.\n"), $real_command_name))
      unless (scalar(@input_files) >= 1);
 
-# XS parser and not explicitely unset
+# XS parser and not explicitly unset
 my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
                         or $ENV{TEXINFO_XS} ne 'omit')
                        and (not defined($ENV{TEXINFO_XS_PARSER})
@@ -1430,6 +1430,12 @@ my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
                        and (not defined($ENV{TEXINFO_XS_STRUCTURE})
                             or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
 
+my $XS_convert = 0;
+$XS_convert = 1 if ($XS_structuring
+                    and defined $ENV{TEXINFO_XS_CONVERT}
+                    and $ENV{TEXINFO_XS_CONVERT} eq '1');
+
+
 if (defined($ENV{TEXINFO_XS_EXTERNAL_CONVERSION})
     and $ENV{TEXINFO_XS_EXTERNAL_CONVERSION}) {
   set_from_cmdline('XS_EXTERNAL_CONVERSION', 1);
@@ -1490,7 +1496,7 @@ while(@input_files) {
           @prepended_include_directories;
 
   my $parser = Texinfo::Parser::parser($parser_file_options);
-  my $document = $parser->parse_texi_file($input_file_name, $XS_structuring);
+  my $document = $parser->parse_texi_file($input_file_name, 1);
 
   if (defined($document)
       and (defined(get_conf('DUMP_TREE'))
@@ -1523,7 +1529,14 @@ while(@input_files) {
     goto NEXT;
   }
 
-  my $tree = $document->tree(1);
+  my $tree;
+  if (!$XS_structuring) {
+    $tree = $document->tree();
+  } else {
+    # The argument prevents the tree being built as a Perl structure at
+    # this stage; only a "handle" is returned.
+    $tree = $document->tree(1);
+  }
   if ($tree_transformations{'fill_gaps_in_sectioning'}) {
     Texinfo::Transformations::fill_gaps_in_sectioning($tree);
   }






reply via email to

[Prev in Thread] Current Thread [Next in Thread]