texinfo-commits

From: Patrice Dumas
Subject: branch master updated: * tp/Texinfo/Structuring.pm (sort_indices): add comments on how to test compatibility of Unicode::Collate with tests.
Date: Fri, 30 Dec 2022 17:58:43 -0500

This is an automated email from the git hooks/post-receive script.

pertusus pushed a commit to branch master
in repository texinfo.

The following commit(s) were added to refs/heads/master by this push:
     new 0f55f22ef0 * tp/Texinfo/Structuring.pm (sort_indices): add comments on how to test compatibility of Unicode::Collate with tests.
0f55f22ef0 is described below

commit 0f55f22ef019579f6d531557e46ece72c9057dde
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Fri Dec 30 23:58:32 2022 +0100

    * tp/Texinfo/Structuring.pm (sort_indices): add comments on
    how to test compatibility of Unicode::Collate with tests.
    
    * configure.ac, tp/defs.in (PERL_UNICODE_COLLATE_OK), tp/t/09indices.t,
    tp/tests/run_parser_all.sh (check_unicode_collate_ok),
    tp/tests/layout/list-of-tests, tp/tests/tex_html/list-of-tests:
    Add a configure.ac test for perl 5.18.1 which seems to be the
    first compatible version with the tests results for test using Unicode
    collation.  Use that test result in run_parser_all.sh to skip tests
    with 'Need collation compatibility' put on their command line.
    Skip similarly tests in tp/t/09indices.t based on perl version too.
---
 ChangeLog                       | 14 ++++++++++++++
 configure.ac                    | 10 ++++++++++
 tp/Texinfo/Structuring.pm       | 26 ++++++++++++++++++++------
 tp/defs.in                      |  1 +
 tp/t/09indices.t                | 18 ++++++++++++------
 tp/tests/layout/list-of-tests   | 40 ++++++++++++++++++++--------------------
 tp/tests/run_parser_all.sh      | 14 +++++++++++++-
 tp/tests/tex_html/list-of-tests |  2 +-
 8 files changed, 91 insertions(+), 34 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bdfddc0ff5..c2d34372ec 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2022-12-30  Patrice Dumas  <pertusus@free.fr>
+
+       * tp/Texinfo/Structuring.pm (sort_indices): add comments on
+       how to test compatibility of Unicode::Collate with tests.
+
+       * configure.ac, tp/defs.in (PERL_UNICODE_COLLATE_OK), tp/t/09indices.t,
+       tp/tests/run_parser_all.sh (check_unicode_collate_ok),
+       tp/tests/layout/list-of-tests, tp/tests/tex_html/list-of-tests:
+       Add a configure.ac test for perl 5.18.1 which seems to be the
+       first compatible version with the tests results for test using Unicode
+       collation.  Use that test result in run_parser_all.sh to skip tests
+       with 'Need collation compatibility' put on their command line.
+       Skip similarly tests in tp/t/09indices.t based on perl version too.
+
 2022-12-30  Patrice Dumas  <pertusus@free.fr>
 
        * tp/Texinfo/Structuring.pm (sort_indices): give the 'UCA' => 22
diff --git a/configure.ac b/configure.ac
index 9bad0223e1..2f11ff69f9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -463,6 +463,16 @@ AC_ARG_ENABLE([perl-api-texi-build],
      fi
     ])
 
+AC_MSG_CHECKING([Perl version for tests requiring unicode collation])
+if $PERL -e "use 5.018_001; use Unicode::Collate" >/dev/null 2>&1; then
+  perl_unicode_collation_requirement='yes'
+else
+  perl_unicode_collation_requirement='no'
+fi
+AC_MSG_RESULT($perl_unicode_collation_requirement)
+PERL_UNICODE_COLLATE_OK=$perl_unicode_collation_requirement
+AC_SUBST([PERL_UNICODE_COLLATE_OK])
+
 AC_MSG_CHECKING([Perl Texinfo API texinfo documentation build requirements])
 AC_MSG_RESULT([$build_perl_api_texi])
 AM_CONDITIONAL([BUILD_PERL_API_TEXI], [test "z$build_perl_api_texi" = 'zyes'])
diff --git a/tp/Texinfo/Structuring.pm b/tp/Texinfo/Structuring.pm
index 64d7b3df36..7e3ef042a0 100644
--- a/tp/Texinfo/Structuring.pm
+++ b/tp/Texinfo/Structuring.pm
@@ -1834,12 +1834,26 @@ sub sort_indices($$$;$)
   #my $collator = Unicode::Collate::Locale->new('locale' => $documentlanguage,
   #                                             'variable' => 'Non-Ignorable');
   # The Unicode::Collate sorting changes often, based on the UCA version.
-  # To get a reproducible sorting 'UCA' => 22 corresponding to the 6.0.0
-  # version of the DUCET/allkeys.txt file which is the reference for
-  # Unicode::Collate.  This version is chosen to be old and may also correspond
-  # to a rather important change compared to theprevious versions.
-  my $collator = Unicode::Collate->new('variable' => 'Non-Ignorable',
-                                       'UCA' => 22);
+  # To test the result with a specific version, the UCA_Version should be set,
+  # and, more importantly the table should correspond to that version.
+  # To test a specific table, in tp, do
+  # wget -N http://www.unicode.org/Public/UCA/6.2.0/allkeys.txt
+  # mkdir -p Unicode/Collate/
+  # mv allkeys.txt Unicode/Collate/allkeys-6.2.0.txt
+  # The table argument leads to a very important slowdown, so the argument
+  # should only be used for checks.
+  # The test results seem to be consistent with 6.2.0, corresponding
+  # to the perl 5.18.0 Unicode::Collate
+  my $collator = Unicode::Collate->new('variable' => 'Non-Ignorable');
+  # to test for 6.2.0
+  #my $collator = Unicode::Collate->new('variable' => 'Non-Ignorable',
+  #                                     'UCA_Version' => 24,
+  #                                     'table' => 'allkeys-6.2.0.txt');
+  # To test files affected for UCA corresponding to perl 5.8.1
+  # wget -N http://www.unicode.org/Public/UCA/3.1.1/allkeys-3.1.1.txt
+  #my $collator = Unicode::Collate->new('variable' => 'Non-Ignorable',
+  #                                     'UCA_Version' => 9,
+  #                                     'table' => 'allkeys-3.1.1.txt');
   my $sorted_index_entries;
   my $index_entries_sort_strings = {};
   return $sorted_index_entries, $index_entries_sort_strings
diff --git a/tp/defs.in b/tp/defs.in
index d995dd1c17..51b0202667 100644
--- a/tp/defs.in
+++ b/tp/defs.in
@@ -2,6 +2,7 @@ PERL="@PERL@"
 DIFF_U_OPTION="@DIFF_U_OPTION@"
 DIFF_A_OPTION="@DIFF_A_OPTION@"
 HOST_IS_WINDOWS_VARIABLE="@HOST_IS_WINDOWS_VARIABLE@"
+PERL_UNICODE_COLLATE_OK="@PERL_UNICODE_COLLATE_OK@"
 if [ "z$srcdir" = 'z' ]; then
   srcdir="@srcdir@"
 fi
diff --git a/tp/t/09indices.t b/tp/t/09indices.t
index 88a5e3c812..06a6dbe270 100644
--- a/tp/t/09indices.t
+++ b/tp/t/09indices.t
@@ -912,31 +912,37 @@ undef,
 @setfilename encoding_index_ascii.info
 @documentencoding us-ascii
 '.$encoding_index_text,
-{'ENABLE_ENCODING' => 0, 'full_document' => 1}
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'ENABLE_ENCODING' => 0, 'full_document' => 1}
 ],
 ['encoding_index_latin1',
 undef,
-{'test_file' => 'encoding_index_latin1.texi', 'ENABLE_ENCODING' => 0}, 
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'test_file' => 'encoding_index_latin1.texi', 'ENABLE_ENCODING' => 0}, 
 ],
 ['encoding_index_utf8',
 undef,
-{'test_file' => 'encoding_index_utf8.texi', 'ENABLE_ENCODING' => 0}, 
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'test_file' => 'encoding_index_utf8.texi', 'ENABLE_ENCODING' => 0}, 
 ],
 ['encoding_index_ascii_enable_encoding',
 '
 @setfilename encoding_index_ascii_enable_encoding.info
 @documentencoding us-ascii
 '.$encoding_index_text,
-{'ENABLE_ENCODING' => 1, 'full_document' => 1},
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'ENABLE_ENCODING' => 1, 'full_document' => 1},
 ],
 ['encoding_index_latin1_enable_encoding',
 undef,
-{'test_file' => 'encoding_index_latin1.texi', 'ENABLE_ENCODING' => 1}, 
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'test_file' => 'encoding_index_latin1.texi', 'ENABLE_ENCODING' => 1}, 
 {'ENABLE_ENCODING' => 1, 'OUTPUT_CHARACTERS' => 1}
 ],
 ['encoding_index_utf8_enable_encoding',
 undef,
-{'test_file' => 'encoding_index_utf8.texi', 'ENABLE_ENCODING' => 1}, 
+{'skip' => ($] < 5.018) ? 'Perl too old incompatible Unicode collation' : undef,
+'test_file' => 'encoding_index_utf8.texi', 'ENABLE_ENCODING' => 1},
 {'ENABLE_ENCODING' => 1, 'OUTPUT_CHARACTERS' => 1}
 ],
 );
diff --git a/tp/tests/layout/list-of-tests b/tp/tests/layout/list-of-tests
index 9f1fdd339f..a0fef7be84 100644
--- a/tp/tests/layout/list-of-tests
+++ b/tp/tests/layout/list-of-tests
@@ -11,10 +11,10 @@ formatting_xml formatting.texi --xml
 formatting_html formatting.texi --html --no-split
 formatting_html_nodes formatting.texi --html --split node --node-files -c 'TOP_FILE index.html'
 # this is the default html output
-formatting_html_no_texi2html formatting.texi --html --no-split -c TEXI2HTML=undef
-formatting_info formatting.texi --info -c ASCII_PUNCTUATION=1
-formatting_info_disable_encoding formatting.texi --info --disable-encoding
-formatting_plaintext formatting.texi -c FORMAT_MENU=nomenu --plaintext -c ASCII_PUNCTUATION=1
+formatting_html_no_texi2html formatting.texi -D 'needcollationcompat Need collation compatibility' --html --no-split -c TEXI2HTML=undef
+formatting_info formatting.texi -D 'needcollationcompat Need collation compatibility' --info -c ASCII_PUNCTUATION=1
+formatting_info_disable_encoding formatting.texi -D 'needcollationcompat Need collation compatibility' --info --disable-encoding
+formatting_plaintext formatting.texi -D 'needcollationcompat Need collation compatibility' -c FORMAT_MENU=nomenu --plaintext -c ASCII_PUNCTUATION=1
 formatting_latex formatting.texi --latex
 
 # used to remove commands, for instance to count words
@@ -23,28 +23,28 @@ formatting_textcontent formatting.texi -c TEXINFO_OUTPUT_FORMAT=textcontent
 formatting_rawtext formatting.texi -c TEXINFO_OUTPUT_FORMAT=rawtext
 
 # count words
-formatting_sort_element_counts formatting.texi -c SORT_ELEMENT_COUNT=@OUT_DIR@formatting_elt_counts.txt
+formatting_sort_element_counts formatting.texi -D 'needcollationcompat Need collation compatibility' -c SORT_ELEMENT_COUNT=@OUT_DIR@formatting_elt_counts.txt
 
 # formats present in the documentation not tested: debugtree and texinfosxml
 #formatting_sxml formatting.texi -c TEXINFO_OUTPUT_FORMAT=texinfosxml
 
-formatting_nodes formatting.texi --split node
-formatting_mathjax formatting.texi --html -c HTML_MATH=mathjax
+formatting_nodes formatting.texi -D 'needcollationcompat Need collation compatibility' --split node
+formatting_mathjax formatting.texi -D 'needcollationcompat Need collation compatibility' --html -c HTML_MATH=mathjax
 #formatting_mediawiki formatting.texi --init mediawiki.pm
-formatting_weird_quotes formatting.texi -c 'OPEN_QUOTE_SYMBOL @' -c "CLOSE_QUOTE_SYMBOL '&lsquo;"
-formatting_html32 formatting.texi --init html32.pm
-formatting_regions formatting_regions.texi
-formatting_numerical_entities formatting.texi -c 'USE_NUMERIC_ENTITY 1'
-formatting_enable_encoding formatting.texi --enable-encoding -c OUTPUT_CHARACTERS=1
-formatting_xhtml formatting.texi -c DOCTYPE='<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">' -c 'USE_XML_SYNTAX 1' -c 'NO_CUSTOM_HTML_ATTRIBUTE 1' -c 'HTML_ROOT_ELEMENT_ATTRIBUTES xmlns="http://www.w3.org/1999/xhtml"'
-formatting_exotic formatting.texi --split section --no-header --no-number-sections -c 'TOC_LINKS 1' -c 'DEF_TABLE 1' -c 'XREF_USE_NODE_NAME_ARG 1' --footnote-style=end --css-ref http://www.environnement.ens.fr/perso/dumas/background-color.css --internal-links=/dev/null -c 'USE_TITLEPAGE_FOR_TITLE 0'
+formatting_weird_quotes formatting.texi -D 'needcollationcompat Need collation compatibility' -c 'OPEN_QUOTE_SYMBOL @' -c "CLOSE_QUOTE_SYMBOL '&lsquo;"
+formatting_html32 formatting.texi -D 'needcollationcompat Need collation compatibility' --init html32.pm
+formatting_regions formatting_regions.texi -D 'needcollationcompat Need collation compatibility'
+formatting_numerical_entities formatting.texi -D 'needcollationcompat Need collation compatibility' -c 'USE_NUMERIC_ENTITY 1'
+formatting_enable_encoding formatting.texi -D 'needcollationcompat Need collation compatibility' --enable-encoding -c OUTPUT_CHARACTERS=1
+formatting_xhtml formatting.texi -D 'needcollationcompat Need collation compatibility' -c DOCTYPE='<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">' -c 'USE_XML_SYNTAX 1' -c 'NO_CUSTOM_HTML_ATTRIBUTE 1' -c 'HTML_ROOT_ELEMENT_ATTRIBUTES xmlns="http://www.w3.org/1999/xhtml"'
+formatting_exotic formatting.texi -D 'needcollationcompat Need collation compatibility' --split section --no-header --no-number-sections -c 'TOC_LINKS 1' -c 'DEF_TABLE 1' -c 'XREF_USE_NODE_NAME_ARG 1' --footnote-style=end --css-ref http://www.environnement.ens.fr/perso/dumas/background-color.css --internal-links=/dev/null -c 'USE_TITLEPAGE_FOR_TITLE 0'
 # use of the doctype is to be able to use W3C old validator, it
 # could be removed when validation can be done differently
-formatting_inline_css formatting.texi -c 'INLINE_CSS_STYLE 1' -c DOCTYPE='<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">'
-formatting_fr_icons formatting.texi --document-language fr --init icons.init
-formatting_chm formatting.texi -c FORMAT_MENU=nomenu --init chm.pm
-formatting_epub formatting.texi --epub3 -c 'EPUB_CREATE_CONTAINER_FILE 0'
-formatting_epub_nodes formatting.texi --split node --init epub3.pm -c 'EPUB_CREATE_CONTAINER_FILE 0' -c INFO_JS_DIR=js
-formatting formatting.texi --internal-links=@OUT_DIR@internal_links_formatting.txt
+formatting_inline_css formatting.texi -D 'needcollationcompat Need collation compatibility'  -c 'INLINE_CSS_STYLE 1' -c DOCTYPE='<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">'
+formatting_fr_icons formatting.texi -D 'needcollationcompat Need collation compatibility' --document-language fr --init icons.init
+formatting_chm formatting.texi -D 'needcollationcompat Need collation compatibility' -c FORMAT_MENU=nomenu --init chm.pm
+formatting_epub formatting.texi -D 'needcollationcompat Need collation compatibility' --epub3 -c 'EPUB_CREATE_CONTAINER_FILE 0'
+formatting_epub_nodes formatting.texi -D 'needcollationcompat Need collation compatibility' --split node --init epub3.pm -c 'EPUB_CREATE_CONTAINER_FILE 0' -c INFO_JS_DIR=js
+formatting formatting.texi -D 'needcollationcompat Need collation compatibility' --internal-links=@OUT_DIR@internal_links_formatting.txt
 
 #lightweight_markups_mediawiki lightweight_markups.texi --init mediawiki.pm
diff --git a/tp/tests/run_parser_all.sh b/tp/tests/run_parser_all.sh
index d2c68bb838..17cd5d8939 100755
--- a/tp/tests/run_parser_all.sh
+++ b/tp/tests/run_parser_all.sh
@@ -30,12 +30,23 @@ check_need_command_line_unicode ()
   if echo "$remaining" | grep 'Need command-line unicode' >/dev/null; then
     if test "z$HOST_IS_WINDOWS_VARIABLE" = 'zyes' ; then
       echo "S: (no reliable command-line Unicode) $current"
-       return 1
+      return 1
     fi
   fi
   return 0
 }
 
+check_unicode_collate_ok ()
+{        
+  if echo "$remaining" | grep 'Need collation compatibility' >/dev/null; then
+    if test "z$PERL_UNICODE_COLLATE_OK" = 'zno' ; then
+      echo "S: (no compatible unicode collation) $current"
+     return 1
+    fi
+  fi 
+  return 0
+}
+
 check_latex2html_and_tex4ht ()
 {
     use_latex2html=no
@@ -390,6 +401,7 @@ while read line; do
     check_need_recoded_file_names || skipped_test=yes
     check_need_command_line_unicode || skipped_test=yes
     check_latex2html_and_tex4ht || skipped_test=yes
+    check_unicode_collate_ok || skipped_test=yes
     if [ "$skipped_test" = 'yes' ] ; then
       if test $one_test = 'yes' ; then
         return_code=77
diff --git a/tp/tests/tex_html/list-of-tests b/tp/tests/tex_html/list-of-tests
index e1a2434b49..5d19e3ae8f 100644
--- a/tp/tests/tex_html/list-of-tests
+++ b/tp/tests/tex_html/list-of-tests
@@ -31,7 +31,7 @@ tex_encoded_utf8_l2h tex_encodé_utf8.texi -c 'HTML_MATH l2h' --iftex -c 'COMMAN
 tex_encoded_utf8_httex tex_encodé_utf8.texi --init tex4ht.pm --iftex -c 'COMMAND_LINE_ENCODING UTF-8' -c OUTPUT_FILE_NAME_ENCODING=UTF-8
 tex_encoded_latin1_l2h tex_encode_latin1.texi -c 'HTML_MATH l2h' --iftex
 tex_encoded_latin1_httex tex_encode_latin1.texi --init tex4ht.pm --iftex
-formatting_singular ../layout/formatting.texi --init-file t2h_singular.init -c 'HTML_MATH l2h' -c 'EXTENSION htm' -c 'PREFIX sing' -c 'TOP_FILE index.htm' --no-verbose
+formatting_singular ../layout/formatting.texi -D 'needcollationcompat Need collation compatibility' --init-file t2h_singular.init -c 'HTML_MATH l2h' -c 'EXTENSION htm' -c 'PREFIX sing' -c 'TOP_FILE index.htm' --no-verbose
 # The following could be added, mainly to test the full 
 # ../layout/formatting.texi processing with tex4ht, but
 # also to check that singular style is compatible with tex4ht
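
The skip logic added to tp/tests/run_parser_all.sh can be tried in isolation. The following is only a sketch: the function body is the one from the patch, but the sample values of `current` and `remaining` and the hand-set default for `PERL_UNICODE_COLLATE_OK` are assumptions made for illustration. In the real driver the variable comes from tp/defs as substituted by configure, and how the driver splits a list-of-tests line into those two variables is not shown in this diff.

```shell
#!/bin/sh
# Standalone sketch of the check added by this patch.
# Assumption: PERL_UNICODE_COLLATE_OK is set by hand here; in the
# test suite it is defined in tp/defs from the configure result.
PERL_UNICODE_COLLATE_OK=${PERL_UNICODE_COLLATE_OK:-no}

check_unicode_collate_ok ()
{
  # Only tests tagged with the marker string are candidates for skipping.
  if echo "$remaining" | grep 'Need collation compatibility' >/dev/null; then
    if test "z$PERL_UNICODE_COLLATE_OK" = 'zno' ; then
      echo "S: (no compatible unicode collation) $current"
      return 1
    fi
  fi
  return 0
}

# Hypothetical entry: test name and the rest of its command line.
current='formatting_info'
remaining="formatting.texi -D 'needcollationcompat Need collation compatibility' --info"

skipped_test=no
check_unicode_collate_ok || skipped_test=yes
echo "skipped_test=$skipped_test"
```

Running it with `PERL_UNICODE_COLLATE_OK=yes` in the environment leaves `skipped_test` at `no`. The tests in tp/t/09indices.t do not use this variable; they skip on `$] < 5.018` instead.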


