branch master updated: * tp/Texinfo/Convert/HTML.pm (_css_string_accent)
From: Patrice Dumas
Subject: branch master updated: * tp/Texinfo/Convert/HTML.pm (_css_string_accent) (_simplify_text_for_comparison, _default_format_element_footer): use Unicode properties and character classes that match non ascii letters and spaces when in regex where this is what is relevant and not ascii text only.
Date: Fri, 19 Aug 2022 18:13:41 -0400
This is an automated email from the git hooks/post-receive script.
pertusus pushed a commit to branch master
in repository texinfo.
The following commit(s) were added to refs/heads/master by this push:
new 4fa03abd72 * tp/Texinfo/Convert/HTML.pm (_css_string_accent)
(_simplify_text_for_comparison, _default_format_element_footer): use Unicode
properties and character classes that match non ascii letters and spaces when
in regex where this is what is relevant and not ascii text only.
4fa03abd72 is described below
commit 4fa03abd7278d888b0a56441f0b0559c7004e281
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Sat Aug 20 00:13:30 2022 +0200
* tp/Texinfo/Convert/HTML.pm (_css_string_accent)
(_simplify_text_for_comparison, _default_format_element_footer):
use Unicode properties and character classes that match non
ascii letters and spaces when in regex where this is what is relevant
and not ascii text only.
---
ChangeLog | 8 ++++++++
tp/Texinfo/Convert/HTML.pm | 8 +++++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 3f999d81a9..8b09364e93 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2022-08-19 Patrice Dumas <pertusus@free.fr>
+
+ * tp/Texinfo/Convert/HTML.pm (_css_string_accent)
+ (_simplify_text_for_comparison, _default_format_element_footer):
+ use Unicode properties and character classes that match non
+ ascii letters and spaces when in regex where this is what is relevant
+ and not ascii text only.
+
2022-08-19 Patrice Dumas <pertusus@free.fr>
Use gnulib wcwidth in tp/Texinfo/XS/xspara.c
diff --git a/tp/Texinfo/Convert/HTML.pm b/tp/Texinfo/Convert/HTML.pm
index af7fd4565c..1e649e03f0 100644
--- a/tp/Texinfo/Convert/HTML.pm
+++ b/tp/Texinfo/Convert/HTML.pm
@@ -2968,7 +2968,7 @@ sub _css_string_accent($$$;$)
my $accent = $command->{'cmdname'};
- if ($in_upper_case and $text =~ /^\w$/) {
+ if ($in_upper_case and $text =~ /^\p{Word}$/) {
$text = uc ($text);
}
if (exists($Texinfo::Convert::Unicode::unicode_accented_letters{$accent})
@@ -5634,7 +5634,7 @@ $default_css_string_types_conversion{'text'} =
\&_css_string_convert_text;
sub _simplify_text_for_comparison($)
{
my $text = shift;
- $text =~ s/[^\w]//g;
+ $text =~ s/[^\p{Word}]//g;
return $text;
}
@@ -6364,7 +6364,9 @@ sub _default_format_element_footer($$$$)
if ($self->get_conf('HEADERS')) {
my $no_footer_word_count;
if ($self->get_conf('WORDS_IN_PAGE')) {
- my @cnt = split(/\W*\s+\W*/, $content);
+ # FIXME it seems that NO-BREAK SPACE and NEXT LINE (NEL) may
+ # not be in \h and \v in some case, but not sure which case it is
+ my @cnt = split(/\P{Word}*[\h\v]+\P{Word}*/, $content);
if (scalar(@cnt) < $self->get_conf('WORDS_IN_PAGE')) {
$no_footer_word_count = 1;
}
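
For illustration only, the two new patterns can be tried outside of HTML.pm. This is a minimal sketch, not code from the commit: the helper names `simplify` and `count_words` and the sample string are hypothetical, and only the regular expressions are copied from the diff above. The `\p{Word}` property is matched with Unicode rules, whereas whether `\w` matches a non-ASCII letter can depend on how the string is stored and on the regex modifiers or features in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: same substitution as the new
# _simplify_text_for_comparison body in the diff.
sub simplify {
    my $text = shift;
    $text =~ s/[^\p{Word}]//g;
    return $text;
}

# Hypothetical helper: same split as the new
# _default_format_element_footer code in the diff.
sub count_words {
    my $content = shift;
    my @cnt = split(/\P{Word}*[\h\v]+\P{Word}*/, $content);
    return scalar(@cnt);
}

# Non-ASCII letters are written as \x{...} escapes so this file stays ASCII.
my $sample = "caf\x{e9} -- d\x{e9}j\x{e0} vu\n";

my $simplified = simplify($sample);
my $words = count_words($sample);

binmode(STDOUT, ':encoding(UTF-8)');
print "simplified: $simplified\n";
print "words: $words\n";
```

With this sample, the accented letters survive the substitution, and the ` -- ` run is consumed as a single separator by the split, so it does not count as a word.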