gforth

New word suggestion - PARSE-NAMES


From: James Norris
Subject: New word suggestion - PARSE-NAMES
Date: Mon, 10 Aug 2020 11:00:20 -0400


If you add PARSE-NAMES to Forth, then the user can easily add additional state behavior to Forth.

PARSE-NAMES (or PARSE-WORDS) is a combination of PARSE-NAME and PARSE, with a few modifications.

PARSE-NAMES differs from PARSE-NAME in that it treats uenddelimiter as a delimiter for a name in addition to the <SPACE> character. The rules for parsing uenddelimiter are the same as for <SPACE>. This means no white space delimiters are needed before or after uenddelimiter. Also, after parsing uenddelimiter, the >IN offset will be on the character after uenddelimiter.

PARSE-NAMES differs from PARSE in that it is multi-line.
Multi-line means that:
  PARSE-NAMES treats line terminator characters as white space delimiters.
  If the current input buffer is a file, ufoundendflag is only true if the end of file was reached or uenddelimiter was found.
  If the current input buffer was from EVALUATE, ufoundendflag is only true if the end of the length passed to EVALUATE was reached or uenddelimiter was found.
  If the current input buffer was from the user from a terminal input device, then ufoundendflag is only true if the end of whatever packet was passed in from the user was reached or uenddelimiter was found. This packet may include line terminator characters.
  If the current input buffer was from a block: I do not support blocks and am not familiar with their use, so I suggest asking people who use blocks what they want. As an initial recommendation, I suggest making ufoundendflag true only when the end of the block is reached or uenddelimiter was found.
  Refill is done if your implementation needs to do it.

The reason for adding PARSE-NAMES (or PARSE-WORDS) to Forth is that you can make words like this:

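: VARIABLES{
   BEGIN
    [CHAR] } PARSE-NAMES   \ ( ufoundendflag c-addr ulength )
    DUP 0= IF
     2DROP                 \ no name: drop c-addr and ulength, keep the flag
    ELSE
     NEXTNAME CREATE       \ define the next word with the parsed name
     ALIGN 1 CELLS ALLOT   \ reserve one cell for the variable
    THEN
   UNTIL ;                 \ loop until ufoundendflag is true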

Which you can use like this:

VARIABLES{ x y z }

You could even make a word like CONSTANTS{ which would be the same as above except it tries to convert the name to a number first. If it's a number it pushes it to the data stack. If it's not a number, it uses the name as the name of a new constant. Which you can use like this:

CONSTANTS{ 3 x 4 y 5 z }

You could also make a word that initializes variables. If the name is a number it pushes it to the data stack. If it's not a number, it creates a new variable with the name and puts the top number on the data stack into it.
So something like this:

INITIALIZED-VARIABLES{ 3 x 4 y 5 z }
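The two words described above could be sketched as follows. This is only an illustration under stated assumptions: PARSE-NAMES does not exist yet, so the block starts with a simplified stand-in built on the standard PARSE-NAME (single line only, and the closing } must be surrounded by white space). The helper text>u? is a hypothetical name, it only handles unsigned numbers in the current BASE, and NEXTNAME is Gforth-specific.

```forth
\ Stand-in for the proposed PARSE-NAMES, built on standard PARSE-NAME.
\ Simplification: single line only, and the end delimiter must be
\ surrounded by white space, unlike the full proposal.
: PARSE-NAMES ( uenddelimiter -- ufoundendflag c-addr ulength )
   >R PARSE-NAME                                  ( c-addr u ) ( R: delim )
   DUP 0= IF  R> DROP  TRUE ROT ROT  EXIT  THEN   \ end of input buffer
   DUP 1 = IF  OVER C@ R@ = IF                    \ a lone end delimiter
     R> DROP  DROP 0  TRUE ROT ROT  EXIT
   THEN THEN
   R> DROP  FALSE ROT ROT ;                       \ an ordinary name

\ Hypothetical helper: convert a whole name to an unsigned number in BASE.
: text>u? ( c-addr u -- n TRUE | c-addr u FALSE )
   2DUP 0 0 2SWAP >NUMBER NIP      ( c-addr u ud unconverted )
   0= IF  DROP NIP NIP TRUE  ELSE  2DROP FALSE  THEN ;

: CONSTANTS{ ( "number name ... }" -- )
   BEGIN
     [CHAR] } PARSE-NAMES               ( ... flag c-addr u )
     DUP 0= IF
       2DROP                            \ nothing parsed, only the flag remains
     ELSE
       text>u? IF  SWAP                 \ number: keep it under the flag
       ELSE  NEXTNAME SWAP CONSTANT     \ name: constant from the pending number
       THEN
     THEN
   UNTIL ;

: INITIALIZED-VARIABLES{ ( "number name ... }" -- )
   BEGIN
     [CHAR] } PARSE-NAMES               ( ... flag c-addr u )
     DUP 0= IF
       2DROP
     ELSE
       text>u? IF  SWAP                 \ number: keep it under the flag
       ELSE  NEXTNAME CREATE SWAP ,     \ name: variable holding the pending number
       THEN
     THEN
   UNTIL ;

CONSTANTS{ 3 x 4 y 5 z }
INITIALIZED-VARIABLES{ 10 a 20 b }
x y z + + .      \ sums the three constants
a @ b @ + .      \ sums the two variables' initial values
```

Both words keep the pending number underneath the loop flag, so a name with no number in front of it would consume whatever happens to be on the stack. A real version would probably want to check for that.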

You could also make a word that compiles bytes: something that converts the names to numbers and appends the low byte of each converted number to the end of the current compile buffer. Something like:

HEX
COMPILE-U8S{ 37 82 FF 63 97 C4 }

Words like COMPILE-U8S{ might make initializing compile-time data easier and more readable.
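A minimal sketch of such a word, under assumptions: "the current compile buffer" is taken to mean data space at HERE, PARSE-NAMES is replaced by a simplified stand-in built on the standard PARSE-NAME (single line, closing } surrounded by white space), and text>u? and u8-data are hypothetical names used only for illustration.

```forth
\ Stand-in for the proposed PARSE-NAMES (simplified, single line only).
: PARSE-NAMES ( uenddelimiter -- ufoundendflag c-addr ulength )
   >R PARSE-NAME                                  ( c-addr u ) ( R: delim )
   DUP 0= IF  R> DROP  TRUE ROT ROT  EXIT  THEN   \ end of input buffer
   DUP 1 = IF  OVER C@ R@ = IF                    \ a lone end delimiter
     R> DROP  DROP 0  TRUE ROT ROT  EXIT
   THEN THEN
   R> DROP  FALSE ROT ROT ;                       \ an ordinary name

\ Hypothetical helper: convert a whole name to an unsigned number in BASE.
: text>u? ( c-addr u -- n TRUE | c-addr u FALSE )
   2DUP 0 0 2SWAP >NUMBER NIP      ( c-addr u ud unconverted )
   0= IF  DROP NIP NIP TRUE  ELSE  2DROP FALSE  THEN ;

: COMPILE-U8S{ ( "number ... }" -- )
   BEGIN
     [CHAR] } PARSE-NAMES                 ( flag c-addr u )
     DUP 0= IF
       2DROP                              \ nothing parsed, keep the flag
     ELSE
       text>u? 0= ABORT" COMPILE-U8S{ expects numbers"
       255 AND C,                         \ append the low byte at HERE
     THEN
   UNTIL ;

HEX
CREATE u8-data  COMPILE-U8S{ 37 82 FF 63 97 C4 }
HERE u8-data - CONSTANT #u8-data          \ number of bytes appended
DECIMAL
```

The numbers are converted in whatever BASE is current when COMPILE-U8S{ runs, which is why HEX is set before the data line and not inside the definition.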

PARSE-NAMES can also be used to implement LOCALS|.


// Stack action shorthand:
//  ( "<delimiters>word<delimiters>morestuff" |
//     "<delimiters>word<enddelimiter>morestuff"
//     -currentinputbuffer- "<delimiters>morestuff" )
//  ( uenddelimiter -- ufoundendflag c-addr ulength )
//
// Data stack in:
//  uenddelimiter                 a character (byte) that will end the parsing
//                                 in addition to the whitespace delimiter list
//
// Data stack out:
//  ufoundendflag                 FORTH_TRUE if the parse ended on uenddelimiter or
//                                 the parse reached the end of the current input
//                                 buffer (file)
//  c-addr                        start address of word in current input buffer
//  ulength                       length of word in characters (bytes) in the
//                                 current input buffer (file)
//
// Action:
//  Moves the current offset pointer (>IN) in the current input buffer to the character after
//   any leading delimiters or to the character after uenddelimiter or to the end of the buffer if
//   either of those come first to find the start of the next word.
//  If the end of the current input buffer or uenddelimiter was found then
//   ufoundendflag = FORTH_TRUE and ulength = 0 is returned.
//  Else this moves the current offset pointer in the current input buffer to after the
//   next occurrence of a delimiter or to the end of the buffer if that comes
//   first, to find the end of the word.
//  Then pushes TRUE to the data stack if an occurrence of uenddelimiter
//   or the end of the current input buffer was reached. Otherwise FALSE
//   is pushed to the data stack.
//  Then pushes a pointer to the address of the current offset at the start of the
//   word and the length of the word in characters (bytes) onto the data stack.
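The action described above can be sketched in Forth for a single input buffer. This is an illustration under assumptions, not a full implementation: it never calls REFILL, so it is not multi-line, and it treats every character code up to and including BL as white space. The helper names white?, at-end?, cur-char, cur-addr and the demo word .NAMES{ are hypothetical.

```forth
\ Assumption: every character code <= BL (space) counts as white space.
: white? ( c -- flag )  BL 1+ U< ;

\ True when >IN has reached the end of the current input buffer.
: at-end? ( -- flag )  >IN @ SOURCE NIP U< 0= ;

\ Address of the character at >IN.
: cur-addr ( -- c-addr )  SOURCE DROP >IN @ + ;

\ Character at >IN; only meaningful while at-end? is false.
: cur-char ( -- c )  cur-addr C@ ;

: PARSE-NAMES ( uenddelimiter -- ufoundendflag c-addr ulength )
   >R                                          ( R: delim )
   \ 1. Skip leading white space.
   BEGIN  at-end? IF FALSE ELSE cur-char white? THEN  WHILE
     1 >IN +!
   REPEAT
   \ 2. Buffer end or end delimiter first: report TRUE with length 0.
   at-end? IF  R> DROP  TRUE cur-addr 0  EXIT  THEN
   cur-char R@ = IF  R> DROP  1 >IN +!  TRUE cur-addr 0  EXIT  THEN
   \ 3. Scan to the next white space, end delimiter, or buffer end.
   cur-addr                                    ( c-addr )
   BEGIN
     at-end? IF FALSE ELSE cur-char DUP white? SWAP R@ = OR 0= THEN
   WHILE
     1 >IN +!
   REPEAT
   cur-addr OVER -                             ( c-addr u )
   \ 4. Flag is TRUE if the name ended on the end delimiter or the
   \    buffer end; otherwise step past the white space character.
   at-end? IF  TRUE  ELSE  cur-char R@ =  1 >IN +!  THEN  ( c-addr u flag )
   R> DROP  ROT ROT ;                          ( flag c-addr u )

\ Demo: print every name up to the closing brace.
: .NAMES{ ( "names }" -- )
   BEGIN
     [CHAR] } PARSE-NAMES  DUP IF  TYPE SPACE  ELSE  2DROP  THEN
   UNTIL ;

.NAMES{ x y z}
```

Note that the last name in the demo is followed directly by }, with no white space in between, which is the case that plain PARSE-NAME cannot handle.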

// Note:
// I suggest these as white space delimiters (this list should work with most editors):
//   c shorthand      ascii code    name
//   ' '              0x20          <space>
//   '\n'             0x0a          <line feed>
//   '\t'             0x09          <tab>
//   '\v'             0x0b          <vertical tab>
//   '\b'             0x08          <back space>
//   '\r'             0x0d          <carriage return>
//   '\f'             0x0c          <form feed>
//
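A test for exactly this set of characters can be written compactly in Forth, because in ASCII the six control characters are contiguous: back space 8, tab 9, line feed 10, vertical tab 11, form feed 12 and carriage return 13. The word name listed-white? is hypothetical; this is only a sketch of one way to write the check.

```forth
\ True for <space> and for the ASCII control characters 8 through 13.
: listed-white? ( c -- flag )
   DUP BL =  SWAP 8 14 WITHIN  OR ;
```

WITHIN includes its lower bound and excludes its upper bound, so 8 14 WITHIN accepts codes 8 through 13.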

Jim Norris author of DiaperGlu
http://www.rainbarrel.com




