[groff] 01/01: Sanitize text for use in PDF document outlines.
From: Keith Marshall
Subject: [groff] 01/01: Sanitize text for use in PDF document outlines.
Date: Sat, 4 Sep 2021 07:37:03 -0400 (EDT)
keithmarshall pushed a commit to branch master
in repository groff.
commit 058b63ce3d614479a64d65d9272cbaa3e2f4b4d1
Author: Keith Marshall <keith.d.marshall@ntlworld.com>
AuthorDate: Sat Sep 4 12:35:26 2021 +0100
Sanitize text for use in PDF document outlines.
---
contrib/pdfmark/ChangeLog | 30 ++++++++
contrib/pdfmark/pdfmark.am | 3 +-
contrib/pdfmark/pdfmark.ms | 12 +--
contrib/pdfmark/sanitize.tmac | 170 ++++++++++++++++++++++++++++++++++++++++++
contrib/pdfmark/spdf.tmac | 33 +++++---
5 files changed, 230 insertions(+), 18 deletions(-)
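
The character scan that sanitize.tmac performs can be hard to follow in roff syntax, so here is a rough Python model of it. This is only a sketch under stated assumptions, not a translation of the macro: the function name, the `discardable` parameter, and the handling of a lone trailing backslash are illustrative choices. It copies characters one at a time, and when it meets a backslash followed by a registered escape id (only `F` in this commit), it skips the `\Xc`, `\X(cc`, or `\X[...]` form; any other escape is copied through unchanged.

```python
def sanitize(text, discardable=("F",)):
    """Model of the scan loop: drop registered escapes, keep the rest."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        i += 1
        # Ordinary character (or a lone trailing backslash, which this
        # sketch simply keeps; that case is an assumption).
        if ch != "\\" or i >= n:
            out.append(ch)
            continue
        esc_id = text[i]
        i += 1
        if esc_id not in discardable:
            # Unrecognized escape: reinstate the held backslash and its id.
            out.append(ch + esc_id)
            continue
        if i >= n:
            break
        form = text[i]
        i += 1
        if form == "(":
            # Two-character name: skip the next two characters.
            i += 2
        elif form == "[":
            # Arbitrary-length name: skip to the matching "]",
            # tracking nested brackets, or stop at end of text.
            depth = 1
            while depth and i < n:
                c = text[i]
                i += 1
                if c == "[":
                    depth += 1
                elif c == "]":
                    depth -= 1
        # Otherwise the "\Xc" form: the single character is already consumed.
    return "".join(out)


print(sanitize(r"The \F[C]pdfmark\F[] Operator"))
```

Passing a different tuple as `discardable` corresponds loosely to aliasing further `sanitize:esc-X` handlers to the generic one, as the final `.als` request in sanitize.tmac does for `F`.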
diff --git a/contrib/pdfmark/ChangeLog b/contrib/pdfmark/ChangeLog
index 65ab4a0..ab034fe 100644
--- a/contrib/pdfmark/ChangeLog
+++ b/contrib/pdfmark/ChangeLog
@@ -1,3 +1,33 @@
+2021-09-03 Keith Marshall <keith.d.marshall@ntlworld.com>
+
+ Sanitize text for use in PDF document outlines.
+
+ * sanitize.tmac: New file; it implements...
+ (sanitize): ...this new macro; interprets its first argument as a
+ string name, and copies its remaining arguments to the named string,
+ discarding specific embedded troff escape sequences; currently...
+ (\F): ...only this is identified as "specifically discardable".
+
+ * pdfmark.am (TMACFILES): Add sanitize.tmac
+
+ * spdf.tmac (mso): Include sanitize.tmac
+ (xn*ref, xn*argc): Rename all occurrences...
+ (spdf:refname, spdf:argc): ...to these, respectively.
+ (XN): Stop inserting $* directly into PDF outlines; instead, use...
+ (spdf:bm.text): ...this new string; this is locally defined by...
+ (spdf:bm.define): ...this new macro; passed the original $* from
+ XN, this itself, is locally defined as a redirectable alias for...
+ (spdf:bm.basic): ...this new local macro; it simply copies $*,
+ passed from XN, to the string named by its first argument, (which is
+ always spdf:bm.text), so reproducing previous behaviour.
+ (opt*XN-S): New macro; defined for internal use only, it adds a "-S"
+ option to XN, such that, when specified, it temporarily redirects...
+ (spdf:bm.define): ...this macro mapping alias to...
+ (sanitize): ...this.
+
+ * pdfmark.ms (XN): Add "-S" option for all headings which include...
+ (\F[C]...\F[]): ...this escape sequence.
+
2021-08-21 Keith Marshall <keith.d.marshall@ntlworld.com>
Define, and use registered trade mark strings.
diff --git a/contrib/pdfmark/pdfmark.am b/contrib/pdfmark/pdfmark.am
index d56dd9b..9a2d030 100644
--- a/contrib/pdfmark/pdfmark.am
+++ b/contrib/pdfmark/pdfmark.am
@@ -1,4 +1,4 @@
-# Copyright (C) 2005-2020 Free Software Foundation, Inc.
+# Copyright (C) 2005-2021 Free Software Foundation, Inc.
# Written by Keith Marshall (keith.d.marshall@ntlworld.com)
# Automake migration by Bertrand Garrigues
#
@@ -27,6 +27,7 @@ bin_SCRIPTS += pdfroff
# Files installed in $(tmacdir)
TMACFILES = \
contrib/pdfmark/pdfmark.tmac \
+ contrib/pdfmark/sanitize.tmac \
contrib/pdfmark/spdf.tmac
pdfmarktmacdir = $(tmacdir)
dist_pdfmarktmac_DATA = $(TMACFILES)
diff --git a/contrib/pdfmark/pdfmark.ms b/contrib/pdfmark/pdfmark.ms
index fdd3e44..2abe022 100644
--- a/contrib/pdfmark/pdfmark.ms
+++ b/contrib/pdfmark/pdfmark.ms
@@ -349,7 +349,7 @@ of their choice, to format their documents, while also using the
macros to add PDF features.
.
.NH 2
-.XN -N pdfmark-operator -- The \F[C]pdfmark\F[] Operator
+.XN -S -N pdfmark-operator -- The \F[C]pdfmark\F[] Operator
.LP
All PDF features are implemented by embedding instances of the
.B \F[C]pdfmark\F[]
@@ -1178,7 +1178,7 @@ which extend through a page transition;
.QE
.
.NH 3
-.XN Optional Features of the \F[C]pdfhref\F[] Macro
+.XN -S -- Optional Features of the \F[C]pdfhref\F[] Macro
.LP
The behaviour of a number of the
.CW pdfhref
@@ -2340,7 +2340,7 @@ illustrates how this may be accomplished:\(en
.XN -N add-note -- Annotating a PDF Document using Pop-Up Notes
.
.NH 2
-.XN -N pdfsync -- Synchronizing Output and \F[C]pdfmark\F[] Contexts
+.XN -S -N pdfsync -- Synchronizing Output and \F[C]pdfmark\F[] Contexts
.LP
It has been noted previously, that the
.CW pdfview
@@ -2493,7 +2493,7 @@ as to how the
macros may be employed with their chosen primary macro package.
.
.NH 2
-.XN -N using-spdf -- Using \F[C]pdfmark\F[] Macros with the \F[C]ms\F[] Macro Package
+.XN -S -N using-spdf -- Using \F[C]pdfmark\F[] Macros with the \F[C]ms\F[] Macro Package
.LP
The use of the binding macro package,
.CW spdf.tmac ,
@@ -2544,7 +2544,7 @@ and the issues they are intended to address,
are described below.
.
.NH 3
-.XN \F[C]ms\F[] Section Headings in PDF Documents
+.XN -S -- \F[C]ms\F[] Section Headings in PDF Documents
.LP
Traditionally,
.CW ms
@@ -2572,7 +2572,7 @@ to be used in conjunction with the
macro.
.
.NH 4
-.XN -N xn-macro -- The \F[C]XN\F[] Macro
+.XN -S -N xn-macro -- The \F[C]XN\F[] Macro
.
.NH 1
.XN The PDF Publishing Process
diff --git a/contrib/pdfmark/sanitize.tmac b/contrib/pdfmark/sanitize.tmac
new file mode 100644
index 0000000..4efa785
--- /dev/null
+++ b/contrib/pdfmark/sanitize.tmac
@@ -0,0 +1,170 @@
+.ig
+
+sanitize.tmac
+
+Copyright (C) 2021 Free Software Foundation, Inc.
+ Written by Keith Marshall (keith.d.marshall@ntlworld.com)
+
+This file is part of groff.
+
+groff is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation, either version 3 of the License, or
+(at your option) any later version.
+
+groff is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+..
+.eo
+.de sanitize
+.\" Usage: .sanitize name text ...
+.\"
+.\" Remove designated formatting escape sequences from "text ..."; return
+.\" the sanitized text in a string register, identified by "name".
+.\"
+.\" Begin by initializing the named result as an empty string, bind it to
+.\" an internal reference name, and discard the "name" argument, to leave
+.\" only the text which is to be sanitized, as residual arguments.
+.\"
+. ds \$1
+. als sanitize:result \$1
+. shift
+.
+.\" Initialize a working string register, which we will cyclically reduce
+.\" until it becomes empty, after starting with all of the text passed as
+.\" the residual arguments, and establish its initial length.
+.\"
+. ds sanitize:residual "\$*\"
+. length sanitize:residual.length "\$*\"
+.
+.\" Begin the cyclic reduction loop...
+.\"
+. while \n[sanitize:residual.length] \{\
+. \"
+. \" ...assuming, at the start of each cycle, that the next character
+. \" will not be skipped, and that it will be moved from the residual,
+. \" to the result, as the character-by-character scan proceeds.
+. \"
+. nr sanitize:skip.count 0
+. sanitize:scan.execute
+.
+. \" For each character scanned, we need to check if it matches the
+. \" normal escape character; the check is most readily performed, if
+. \" an alternative escape character is introduced, and when a match
+. \" is found, we prepare to skip an escape sequence.
+. \"
+. ec !
+. if '!*[sanitize:scan.char]'\' .nr sanitize:skip.count 1
+. ec
+. ie \n[sanitize:skip.count] \{\
+. \"
+. \" When a possible escape sequence has been detected, we back it
+. \" up, (in case it isn't recognized, and we need to reinstate its
+. \" content into the result string), then scan ahead to check for
+. \" an identifiable escape sequence...
+. \"
+. rn sanitize:scan.char sanitize:hold
+. sanitize:scan.execute
+. ie d sanitize:esc-\*[sanitize:scan.char] \
+. \"
+. \" ...which we delegate to its appropriate handler, to skip...
+. \"
+. sanitize:esc-\*[sanitize:scan.char]
+.
+. \" ...but, in the case of an unrecognized escape sequence, we copy
+. \" its backed-up content, followed by the character retrieved from
+. \" the current scan cycle, to the result string.
+. \"
+. el .as sanitize:result "\*[sanitize:hold]\*[sanitize:scan.char]\"
+. \}
+.
+. \" When the current scan cycle has retrieved a character, which isn't
+. \" part of any possible escape sequence, we simply copy that character
+. \" to the result string.
+. \"
+. el .as sanitize:result "\*[sanitize:scan.char]\"
+. \}
+.
+.\" Clean up the register space, by deleting all of the string registers,
+.\" and numeric registers, which are designated as temporary, for private
+.\" use within this macro only.
+.\"
+. rm sanitize:hold sanitize:scan.char sanitize:residual sanitize:result
+. rr sanitize:residual.length sanitize:skip.count
+..
+.de sanitize:scan.execute
+.\" Usage (internal): .sanitize:scan.execute
+.\"
+.\" Perform a single-character reduction of sanitize:residual, by copying
+.\" its initial character to sanitize:scan.char, and then deleting it from
+.\" sanitize:residual itself. (Note that we use arithmetic decrementation
+.\" of sanitize:residual.length, rather than repeating the length request
+.\" on sanitize:residual, because reduction WILL fail when there is only
+.\" one character remaining).
+.\"
+. nr sanitize:residual.length -1
+. ds sanitize:scan.char "\*[sanitize:residual]\"
+. substring sanitize:scan.char 0 0
+. substring sanitize:residual 1
+..
+.de sanitize:skip-(
+.\" Usage (internal): .sanitize:skip-(
+.\"
+.\" For any identified escape sequence, with a two-character property name,
+.\" simply skip over the next two characters in the residual string.
+.\"
+. nr sanitize:residual.length -2
+. substring sanitize:residual 2
+..
+.de sanitize:skip-[
+.\" Usage (internal): .sanitize:skip-[
+.\"
+.\" For any identified escape sequence, with an arbitrary-length property
+.\" name, skip following characters in the residual string, until we find
+.\" a terminal "]" character, or we exhaust the residual.
+.\"
+. while \n[sanitize:skip.count] \{\
+. sanitize:scan.execute
+. ie \n[sanitize:residual.length] \{\
+. \" We haven't yet exhausted the residual; if we find a nested "["
+. \" character, increment the nesting level, otherwise decrement it
+. \" for each "]"; it will become zero at the terminal "]".
+. \"
+. ie '\*[sanitize:scan.char]'[' .nr sanitize:skip.count +1
+. el .if '\*[sanitize:scan.char]']' .nr sanitize:skip.count -1
+. \}
+. \" Stop unconditionally, if we do exhaust the residual.
+. \"
+. el .nr sanitize:skip.count 0
+. \}
+..
+.de sanitize:esc-generic
+.\" Usage: .sanitize:esc-X
+.\"
+.\" (X represents any legitimate single-character escape sequence id).
+.\"
+.\" Handler for skipping "\X" sequences, in text which is to be sanitized;
+.\" this will automatically detect sequences conforming to any of the forms
+.\" "\Xc", "\X(cc", or "\X[...]", and will handle each appropriately. The
+.\" implementation is generic, and may be aliased to handle any specific
+.\" escape sequences, which exhibit similar semantics.
+.\"
+. sanitize:scan.execute
+. if d sanitize:skip-\*[sanitize:scan.char] \
+. sanitize:skip-\*[sanitize:scan.char]
+..
+.\" Map the generic handler to specific escape sequences, as required.
+.\"
+.als sanitize:esc-F sanitize:esc-generic
+.ec
+.\" Local Variables:
+.\" mode: nroff
+.\" End:
+.\" vim: filetype=groff:
+.\" sanitize.tmac: end of file
diff --git a/contrib/pdfmark/spdf.tmac b/contrib/pdfmark/spdf.tmac
index 767f5ee..33591d0 100644
--- a/contrib/pdfmark/spdf.tmac
+++ b/contrib/pdfmark/spdf.tmac
@@ -2,7 +2,7 @@
spdf.tmac
-Copyright (C) 2004-2020 Free Software Foundation, Inc.
+Copyright (C) 2004-2021 Free Software Foundation, Inc.
Written by Keith Marshall (keith.d.marshall@ntlworld.com)
This file is part of groff.
@@ -25,6 +25,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
.if !rOPMODE .nr OPMODE 1
.\"
.mso s.tmac
+.mso sanitize.tmac
.mso pdfmark.tmac
.\"
.\" Omitted Sections
@@ -82,16 +83,18 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
.\" additional spacing parameters may be set relative to the current
.\" document line spacing, as set by \n[VS]).
.\"
-.rm xn*ref
+.rm spdf:refname
+.als spdf:bm.define spdf:bm.basic
.while dopt*XN\\$1 \{\
. opt*XN\\$1 \\$*
-. shift \\n[xn*argc]
+. shift \\n[spdf:argc]
. \}
-.rr xn*argc
+.rr spdf:argc
.if '\\$1'--' .shift
-.if dxn*ref .XM -N \\*[xn*ref] -- \\$@
-.rm xn*ref
-.pdfhref O \\n[nh*hl] "\\*(SN \\$*"
+.if dspdf:refname .XM -N \\*[spdf:refname] -- \\$@
+.rm spdf:refname
+.spdf:bm.define spdf:bm.text "\\$*"
+.pdfhref O \\n[nh*hl] "\\*(SN \\*[spdf:bm.text]"
.XS
.if rtc*hl \{\
. if !dXNVS1 .ds XNVS1 1.0v \" default leading for top level
@@ -119,12 +122,20 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
\&\\$*
..
.de opt*XN-N
-.nr xn*argc 2
-.ds xn*ref \\$2
+.ds spdf:refname \\$2
+.nr spdf:argc 2
+..
+.de opt*XN-S
+.als spdf:bm.define sanitize
+.nr spdf:argc 1
..
.de opt*XN-X
-.nr xn*argc 1
-.if !dxn*ref .ds xn*ref \\\\$1
+.if !dspdf:refname .ds spdf:refname \\\\$1
+.nr spdf:argc 1
+..
+.de spdf:bm.basic
+.shift
+.ds spdf:bm.text "\\$*\"
..
.de LU
.LP
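
The way the "-S" option redirects spdf:bm.define in the spdf.tmac changes above can be modelled in a few lines of Python. This is a hedged sketch rather than a rendering of the macro: the function names are hypothetical, `bm_sanitized` is a regex stand-in that handles only the `\F[...]` form (not the full sanitize macro), the "-X" option is left out, and the section-number prefix that XN adds to the outline entry is omitted.

```python
import re


def bm_basic(words):
    # Models spdf:bm.basic: copy the heading text unchanged.
    return " ".join(words)


def bm_sanitized(words):
    # Stand-in for the sanitize macro; strips only \F[...] sequences.
    return re.sub(r"\\F\[[^\]]*\]", "", " ".join(words))


def xn_outline_text(args):
    """Return (refname, outline_text) for a list of XN arguments."""
    define = bm_basic  # default binding, restored on every call
    refname = None
    args = list(args)
    # Consume leading options, as the "while dopt*XN..." loop does.
    while args and args[0] in ("-N", "-S"):
        if args[0] == "-N":
            refname = args[1] if len(args) > 1 else None
            args = args[2:]
        else:
            define = bm_sanitized  # "-S" rebinds the alias for this call
            args = args[1:]
    # An optional "--" separates options from the heading text.
    if args and args[0] == "--":
        args = args[1:]
    return refname, define(args)


heading = ["-S", "-N", "xn-macro", "--", "The", r"\F[C]XN\F[]", "Macro"]
print(xn_outline_text(heading))
```

Because the binding is reset at the start of each call, a heading without "-S" keeps the previous behaviour and passes its text through unchanged.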