pspp-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 00/18] rewrite PSPP lexer


From: John Darrington
Subject: Re: [PATCH 00/18] rewrite PSPP lexer
Date: Tue, 22 Mar 2011 11:00:50 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

To build this, I had to add the gnulib module memrchr to Smake.

J'
On Sat, Mar 19, 2011 at 05:09:46PM -0700, Ben Pfaff wrote:
     These commits, taken together, rip out the PSPP lexer and replace
     it by one that is more flexible, easier to use, and better at handling
     character encodings.
     
     Ben Pfaff (18):
       data-reader: Remove unreachable "return" statements.
       file-name: Do not make output files line-buffered in fn_open().
       sys-file-reader: Refactor to clean up character encoding support.
       output: New function text_item_create_nocopy().
       str: New function ss_realloc().
       str: Rename ss_chomp() to ss_chomp_byte(), ds_chomp() to
         ds_chomp_byte().
       str: New functions for checking for and removing string suffixes.
       hash-functions: New function hash_case_bytes().
       i18n: New function uc_name().
       i18n: New function recode_string_len().
       i18n: New functions for truncating strings in an arbitrary encoding.
       identifier: Rename token_type_to_string() and make a new version.
       i18n: New functions and data structure for obtaining encoding info.
       encoding-guesser: New library to guess the encoding of a text file.
       u8-istream: New library for reading a text file and recoding to
         UTF-8.
       segment: New library for low-level phase of lexical syntax analysis.
       scan: New library for high-level PSPP syntax lexical analysis.
       lexer: Reimplement for better testability and internationalization.
     
      NEWS                                               |   46 +-
      Smake                                              |    8 +-
      doc/dev/concepts.texi                              |   25 +-
      doc/flow-control.texi                              |   33 +-
      doc/invoking.texi                                  |   20 +-
      doc/language.texi                                  |   96 +-
      doc/utilities.texi                                 |   69 +-
      perl-module/PSPP.xs                                |   14 +-
      perl-module/t/Pspp.t                               |    4 +-
      src/data/automake.mk                               |    1 +
      src/data/dictionary.c                              |  174 +-
      src/data/dictionary.h                              |   14 +-
      src/data/file-handle-def.c                         |    8 +-
      src/data/file-name.c                               |   11 +-
      src/data/gnumeric-reader.h                         |    8 +-
      src/data/identifier.c                              |  124 +-
      src/data/identifier.h                              |   12 +-
      src/data/identifier2.c                             |  133 ++
      src/data/mrset.c                                   |   33 +-
      src/data/mrset.h                                   |    7 +-
      src/data/por-file-reader.c                         |   25 +-
      src/data/por-file-writer.c                         |    5 +-
      src/data/procedure.c                               |   19 +
      src/data/procedure.h                               |    4 +-
      src/data/sys-file-reader.c                         | 2084 
+++++++++++---------
      src/data/sys-file-reader.h                         |    2 +-
      src/data/sys-file-writer.c                         |   20 +-
      src/data/variable.c                                |  139 +-
      src/data/variable.h                                |    7 +-
      src/data/vector.c                                  |   15 +-
      src/data/vector.h                                  |    4 +-
      src/language/automake.mk                           |    6 -
      src/language/command.c                             |  158 +-
      src/language/command.def                           |   11 +-
      src/language/control/automake.mk                   |    3 +-
      src/language/control/do-if.c                       |   12 +-
      src/language/control/loop.c                        |    4 +-
      src/language/control/repeat.c                      |  714 +++-----
      src/language/control/temporary.c                   |    6 +-
      src/language/data-io/combine-files.c               |   21 +-
      src/language/data-io/data-list.c                   |    5 +-
      src/language/data-io/data-parser.c                 |    9 +-
      src/language/data-io/data-reader.c                 |   58 +-
      src/language/data-io/file-handle.q                 |   29 +-
      src/language/data-io/get-data.c                    |    8 +-
      src/language/data-io/inpt-pgm.c                    |   30 +-
      src/language/data-io/save-translate.c              |    2 +
      src/language/data-io/trim.c                        |    5 +-
      src/language/dictionary/apply-dictionary.c         |   11 +-
      src/language/dictionary/attributes.c               |   69 +-
      src/language/dictionary/missing-values.c           |   44 +-
      src/language/dictionary/modify-variables.c         |    4 +-
      src/language/dictionary/mrsets.c                   |   27 +-
      src/language/dictionary/numeric.c                  |   12 +-
      src/language/dictionary/rename-variables.c         |    3 +-
      src/language/dictionary/split-file.c               |    4 +-
      src/language/dictionary/sys-file-info.c            |   18 +-
      src/language/dictionary/value-labels.c             |   22 +-
      src/language/dictionary/variable-label.c           |   17 +-
      src/language/dictionary/vector.c                   |   13 +-
      src/language/dictionary/weight.c                   |    4 +-
      src/language/expressions/parse.c                   |   34 +-
      src/language/expressions/private.h                 |    5 +-
      src/language/lexer/automake.mk                     |    8 +
      src/language/lexer/include-path.c                  |   89 +
      .../msg-ui.h => language/lexer/include-path.h}     |   20 +-
      src/language/lexer/lexer.c                         | 2143 
+++++++++++---------
      src/language/lexer/lexer.h                         |  158 ++-
      src/language/lexer/q2c.c                           |    6 +-
      src/language/lexer/scan.c                          |  596 ++++++
      src/language/lexer/scan.h                          |   93 +
      src/language/lexer/segment.c                       | 1631 +++++++++++++++
      src/language/lexer/segment.h                       |  122 ++
      src/language/lexer/token.c                         |  173 ++
      src/language/{prompt.h => lexer/token.h}           |   36 +-
      src/language/lexer/value-parser.c                  |    2 +-
      src/language/lexer/variable-parser.c               |   17 +-
      src/language/lexer/variable-parser.h               |   10 +-
      src/language/prompt.c                              |   75 -
      src/language/stats/aggregate.c                     |   23 +-
      src/language/stats/autorecode.c                    |    3 +-
      src/language/stats/descriptives.c                  |   23 +-
      src/language/stats/flip.c                          |    8 +-
      src/language/stats/frequencies.q                   |   25 +-
      src/language/stats/npar.c                          |   77 +-
      src/language/stats/rank.q                          |   12 +-
      src/language/stats/sort-cases.c                    |    2 +-
      src/language/syntax-file.c                         |  144 --
      src/language/syntax-file.h                         |   25 -
      src/language/syntax-string-source.c                |  151 --
      src/language/syntax-string-source.h                |   33 -
      src/language/tests/format-guesser-test.c           |    2 +-
      src/language/tests/moments-test.c                  |    2 +-
      src/language/tests/paper-size.c                    |    2 +-
      src/language/utilities/cache.c                     |    4 +-
      src/language/utilities/cd.c                        |    8 +-
      src/language/utilities/date.c                      |    4 +-
      src/language/utilities/host.c                      |   14 +-
      src/language/utilities/include.c                   |  198 +-
      src/language/utilities/permissions.c               |   13 +-
      src/language/utilities/set.q                       |   13 +-
      src/language/utilities/title.c                     |   92 +-
      src/language/xforms/compute.c                      |    6 +-
      src/language/xforms/count.c                        |   26 +-
      src/language/xforms/fail.c                         |    8 +-
      src/language/xforms/recode.c                       |   45 +-
      src/language/xforms/sample.c                       |    4 +-
      src/language/xforms/select-if.c                    |    2 +-
      src/libpspp/automake.mk                            |   10 +-
      src/libpspp/encoding-guesser.c                     |  289 +++
      src/libpspp/encoding-guesser.h                     |  126 ++
      src/libpspp/getl.c                                 |  271 ---
      src/libpspp/getl.h                                 |  113 -
      src/libpspp/hash-functions.c                       |   18 +-
      src/libpspp/hash-functions.h                       |    1 +
      src/libpspp/i18n.c                                 |  345 ++++-
      src/libpspp/i18n.h                                 |   87 +-
      src/libpspp/message.c                              |  118 +-
      src/libpspp/message.h                              |   23 +-
      src/libpspp/msg-locator.c                          |   87 -
      src/libpspp/msg-locator.h                          |   34 -
      src/{ui/terminal/msg-ui.c => libpspp/prompt.c}     |   41 +-
      src/{language => libpspp}/prompt.h                 |   17 +-
      src/libpspp/str.c                                  |   54 +-
      src/libpspp/str.h                                  |   11 +-
      src/libpspp/u8-istream.c                           |  475 +++++
      src/libpspp/u8-istream.h                           |   45 +
      src/output/driver.c                                |   56 +-
      src/output/text-item.c                             |   12 +-
      src/output/text-item.h                             |    3 +-
      src/ui/gui/automake.mk                             |    2 -
      src/ui/gui/comments-dialog.c                       |   13 +-
      src/ui/gui/executor.c                              |   19 +-
      src/ui/gui/executor.h                              |    4 +-
      src/ui/gui/main.c                                  |   39 +-
      src/ui/gui/psppire-data-window.c                   |   16 +-
      src/ui/gui/psppire-dict.c                          |    9 +-
      src/ui/gui/psppire-syntax-window.c                 |   16 +-
      src/ui/gui/psppire-syntax-window.h                 |    1 -
      src/ui/gui/psppire-var-store.c                     |    7 +-
      src/ui/gui/psppire.c                               |   34 +-
      src/ui/gui/psppire.h                               |    6 +-
      src/ui/gui/syntax-editor-source.c                  |  130 --
      src/ui/gui/syntax-editor-source.h                  |   34 -
      src/ui/gui/text-data-import-dialog.c               |    4 +-
      src/ui/source-init-opts.c                          |   20 +-
      src/ui/source-init-opts.h                          |    4 +-
      src/ui/terminal/automake.mk                        |   13 +-
      src/ui/terminal/main.c                             |  113 +-
      src/ui/terminal/read-line.h                        |   31 -
      src/ui/terminal/terminal-opts.c                    |   58 +-
      src/ui/terminal/terminal-opts.h                    |    8 +-
      src/ui/terminal/{read-line.c => terminal-reader.c} |  310 ++--
      .../repeat.h => ui/terminal/terminal-reader.h}     |   11 +-
      tests/automake.mk                                  |   43 +
      tests/data/data-in.at                              |   64 +-
      tests/data/sys-file-reader.at                      |  295 +--
      tests/dissect-sysfile.c                            |    8 +-
      tests/language/control/do-repeat.at                |  101 +-
      tests/language/data-io/data-list.at                |    6 +-
      tests/language/data-io/get.at                      |    2 -
      tests/language/data-io/inpt-pgm.at                 |    6 +-
      tests/language/data-io/print.at                    |    2 +-
      tests/language/dictionary/missing-values.at        |    2 +-
      tests/language/dictionary/sys-file-info.at         |    4 +-
      tests/language/expressions/evaluate.at             |    9 +-
      tests/language/expressions/parse.at                |    2 +-
      tests/language/lexer/lexer.at                      |   43 +
      tests/language/lexer/q2c.at                        |    2 +-
      tests/language/lexer/scan-test.c                   |  217 ++
      tests/language/lexer/scan.at                       |  818 ++++++++
      tests/language/lexer/segment-test.c                |  318 +++
      tests/language/lexer/segment.at                    | 1070 ++++++++++
      tests/language/stats/aggregate.at                  |    4 +-
      tests/language/stats/rank.at                       |    6 +-
      tests/language/utilities/insert.at                 |   29 +-
      tests/libpspp/encoding-guesser-test.c              |  102 +
      tests/libpspp/encoding-guesser.at                  |  143 ++
      tests/libpspp/i18n-test.c                          |   68 +-
      tests/libpspp/i18n.at                              |  125 +-
      tests/libpspp/u8-istream-test.c                    |  126 ++
      tests/libpspp/u8-istream.at                        |  142 ++
      182 files changed, 12211 insertions(+), 5364 deletions(-)
      create mode 100644 src/data/identifier2.c
      create mode 100644 src/language/lexer/include-path.c
      rename src/{ui/terminal/msg-ui.h => language/lexer/include-path.h} (66%)
      create mode 100644 src/language/lexer/scan.c
      create mode 100644 src/language/lexer/scan.h
      create mode 100644 src/language/lexer/segment.c
      create mode 100644 src/language/lexer/segment.h
      create mode 100644 src/language/lexer/token.c
      copy src/language/{prompt.h => lexer/token.h} (50%)
      delete mode 100644 src/language/prompt.c
      delete mode 100644 src/language/syntax-file.c
      delete mode 100644 src/language/syntax-file.h
      delete mode 100644 src/language/syntax-string-source.c
      delete mode 100644 src/language/syntax-string-source.h
      create mode 100644 src/libpspp/encoding-guesser.c
      create mode 100644 src/libpspp/encoding-guesser.h
      delete mode 100644 src/libpspp/getl.c
      delete mode 100644 src/libpspp/getl.h
      delete mode 100644 src/libpspp/msg-locator.c
      delete mode 100644 src/libpspp/msg-locator.h
      rename src/{ui/terminal/msg-ui.c => libpspp/prompt.c} (58%)
      rename src/{language => libpspp}/prompt.h (69%)
      create mode 100644 src/libpspp/u8-istream.c
      create mode 100644 src/libpspp/u8-istream.h
      delete mode 100644 src/ui/gui/syntax-editor-source.c
      delete mode 100644 src/ui/gui/syntax-editor-source.h
      delete mode 100644 src/ui/terminal/read-line.h
      rename src/ui/terminal/{read-line.c => terminal-reader.c} (53%)
      rename src/{language/control/repeat.h => ui/terminal/terminal-reader.h} 
(77%)
      create mode 100644 tests/language/lexer/scan-test.c
      create mode 100644 tests/language/lexer/scan.at
      create mode 100644 tests/language/lexer/segment-test.c
      create mode 100644 tests/language/lexer/segment.at
      create mode 100644 tests/libpspp/encoding-guesser-test.c
      create mode 100644 tests/libpspp/encoding-guesser.at
      create mode 100644 tests/libpspp/u8-istream-test.c
      create mode 100644 tests/libpspp/u8-istream.at
     
     -- 
     1.7.2.3
     
     
     _______________________________________________
     pspp-dev mailing list
     address@hidden
     http://lists.gnu.org/mailman/listinfo/pspp-dev

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

Attachment: signature.asc
Description: Digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]