New word suggestion - PARSE-NAMES
From: James Norris
Subject: New word suggestion - PARSE-NAMES
Date: Mon, 10 Aug 2020 11:00:20 -0400
User-agent: Roundcube Webmail/1.4.7
If you add PARSE-NAMES to Forth, then the user can easily add additional
state behavior to Forth.
PARSE-NAMES (or PARSE-WORDS) is a combination of PARSE-NAME and PARSE
with a few modifications.
PARSE-NAMES differs from PARSE-NAME in that it treats uenddelimiter as a
delimiter for a name in addition to the <SPACE> character. The rules for
parsing uenddelimiter are the same as for <SPACE>, so no white space
delimiters are needed before or after uenddelimiter. After uenddelimiter
has been parsed, the >IN offset is on the character after uenddelimiter.
PARSE-NAMES differs from PARSE in that it is multi-line.
Multi-line means that:
  - PARSE-NAMES treats line terminator characters as white space
    delimiters.
  - If the current input buffer is a file, ufoundendflag is only true if
    the end of the file was reached or uenddelimiter was found.
  - If the current input buffer came from EVALUATE, ufoundendflag is only
    true if the end of the length passed to EVALUATE was reached or
    uenddelimiter was found.
  - If the current input buffer came from the user at a terminal input
    device, ufoundendflag is only true if the end of whatever packet was
    passed in from the user was reached or uenddelimiter was found.
    This packet may include line terminator characters.
  - If the current input buffer came from a block: I don't support blocks
    and am not familiar with their use, so I suggest asking people who
    use blocks what they want. As an initial recommendation, I suggest
    making ufoundendflag true only when the end of the block was reached
    or uenddelimiter was found.
REFILL is performed whenever the implementation needs it to continue
parsing.
The reason for adding PARSE-NAMES (or PARSE-WORDS) to Forth is that you
can make words like this:
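: VARIABLES{ ( "name1 name2 ... }" -- )
  BEGIN
    [CHAR] } PARSE-NAMES      \ ( ufoundendflag c-addr ulength )
    DUP 0= IF
      2DROP                   \ empty name: discard c-addr and ulength
    ELSE
      NEXTNAME CREATE         \ NEXTNAME (Gforth) names the next definition
      ALIGN 1 CELLS ALLOT     \ reserve one cell for the variable
    THEN
  UNTIL ;                     \ stop once ufoundendflag is true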
Which you can use like this:
VARIABLES{ x y z }
You could even make a word like CONSTANTS{ which would be the same as
above except it tries to convert the name to a number first. If it's a
number it pushes it to the data stack. If it's not a number, it uses the
name as the name of a new constant.
Which you can use like this:
CONSTANTS{ 3 x 4 y 5 z }
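
A minimal sketch of CONSTANTS{ along the lines described above, assuming
Gforth (for NEXTNAME and PARSE-NAME). Since PARSE-NAMES does not exist
yet, the block starts with a hypothetical single-line stand-in that only
recognizes the end delimiter when it is surrounded by spaces; it is there
just so the example runs on its own. NAME>NUMBER? is also a made-up
helper and only handles unsigned numbers in the current BASE.

```forth
\ Hypothetical stand-in for the proposed word: single line only, and
\ the end delimiter must be separated from names by white space.
: PARSE-NAMES ( uenddelimiter -- ufoundendflag c-addr ulength )
  >r parse-name                            ( c-addr u )
  dup 0= if  r> drop true rot rot exit  then   \ end of line ends the list
  2dup 1 = swap c@ r> = and                ( c-addr u is-end? )
  if  drop 0 true  else  false  then       ( c-addr u' flag )
  rot rot ;

\ Hypothetical helper: true plus the value if the whole name converts.
: NAME>NUMBER? ( c-addr u -- n true | false )
  0 0 2swap >number nip nip                ( n unconverted-count )
  if  drop false  else  true  then ;

: CONSTANTS{ ( "n1 name1 n2 name2 ... }" -- )
  begin
    [char] } parse-names                   ( ... flag c-addr u )
    dup 0= if
      2drop                                \ nothing parsed, keep the flag
    else
      2dup name>number? if                 ( flag c-addr u n )
        nip nip swap                       \ leave n under the loop flag
      else                                 ( n flag c-addr u )
        nextname swap constant             \ define a constant holding n
      then
    then
  until ;

CONSTANTS{ 3 x 4 y 5 z }
```

The number is kept underneath the loop flag until a name arrives, so a
name with no preceding number would underflow the stack; a real version
would want to check for that.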
You could also make a word that initializes variables. If the name is a
number it pushes it to the data stack.
If it's not a number, it creates a new variable with the name and puts
the top number on the data stack into it.
So something like this:
INITIALIZED-VARIABLES{ 3 x 4 y 5 z }
You could also make a word that compiles bytes: something that converts
the names to numbers and appends the low byte of each converted number
to the end of the current compile buffer. Something like:
HEX
COMPILE-U8S{ 37 82 FF 63 97 C4 }
Words like COMPILE-U8S{ might make initializing compile time data easier
and more readable.
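
One way COMPILE-U8S{ could look, again as a sketch for Gforth. It reads
"the current compile buffer" as data space and appends with C, which is
an assumption on my part. The PARSE-NAMES at the top is a hypothetical
single-line stand-in (end delimiter must be surrounded by spaces), and
the name BYTES is only for illustration. Names that do not convert
cleanly are not reported here; only the digits converted so far are used.

```forth
\ Hypothetical stand-in for the proposed word: single line only, and
\ the end delimiter must be separated from names by white space.
: PARSE-NAMES ( uenddelimiter -- ufoundendflag c-addr ulength )
  >r parse-name                            ( c-addr u )
  dup 0= if  r> drop true rot rot exit  then   \ end of line ends the list
  2dup 1 = swap c@ r> = and                ( c-addr u is-end? )
  if  drop 0 true  else  false  then       ( c-addr u' flag )
  rot rot ;

: COMPILE-U8S{ ( "b1 b2 ... }" -- )
  begin
    [char] } parse-names                   ( flag c-addr u )
    dup if
      0 0 2swap >number 2drop drop         ( flag n )  \ convert in BASE
      255 and c,                           \ append the low byte
    else
      2drop                                \ nothing parsed
    then
  until ;

HEX
CREATE BYTES  COMPILE-U8S{ 37 82 FF 63 97 C4 }
DECIMAL
```

After this, BYTES returns the address of the six compiled bytes.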
PARSE-NAMES can also be used to implement LOCALS|
// Stack action shorthand:
// ( "<delimiters>word<delimiters>morestuff" |
//   "<delimiters>word<enddelimiter>morestuff"
//   -currentinputbuffer- "<delimiters>morestuff" )
// ( uenddelimiter -- ufoundendflag c-addr ulength )
//
// Data stack in:
//   uenddelimiter    a character (byte) that will end the parsing
//                    in addition to the whitespace delimiter list
//
// Data stack out:
//   ufoundendflag    FORTH_TRUE if the parse ended on uenddelimiter or
//                    the parse reached the end of the current input
//                    buffer (file)
//   c-addr           start address of word in current input buffer
//   ulength          length of word in characters (bytes) in the
//                    current input buffer (file)
//
// Action:
//   Moves the current offset pointer (>IN) in the current input buffer
//   to the character after any leading delimiters or to the character
//   after uenddelimiter or to the end of the buffer if either of those
//   come first to find the start of the next word.
//   If the end of the current input buffer or uenddelimiter was found
//   then ufoundendflag = FORTH_TRUE and ulength = 0 is returned.
//   Else this moves the current offset pointer in the current input
//   buffer to after the next occurrence of a delimiter or to the end of
//   the buffer if that comes first, to find the end of the word.
//   Then pushes TRUE to the data stack if an occurrence of uenddelimiter
//   or the end of the current input buffer was reached. Otherwise FALSE
//   is pushed to the data stack.
//   Then pushes a pointer to the address of the current offset at the
//   start of the word and the length of the word in characters (bytes)
//   onto the data stack.
// Note:
//   I suggest these as white space delimiters (this list should work
//   with most editors):
//     c shorthand   ascii code   name
//     ' '           0x20         <space>
//     '\n'          0x0a         <line feed>
//     '\t'          0x09         <tab>
//     '\v'          0x0b         <vertical tab>
//     '\b'          0x08         <back space>
//     '\r'          0x0d         <carriage return>
//     '\f'          0x0c         <form feed>
//
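
The action above could be sketched in Forth roughly as follows. This is
not the DiaperGlu implementation, only an illustration built on SOURCE,
>IN and REFILL. Assumptions: every character code up to and including
BL counts as white space (broader than the list above), the input
source hands over one line at a time so a line end behaves like white
space, and a name that ends exactly at the end of a line returns FALSE,
with the end of input reported by the following call (refilling first
would invalidate c-addr). The helper names AT-END?, CUR-CHAR, IN-ADDR,
SKIP-WHITE and COUNT-NAMES are made up for this sketch.

```forth
: at-end?  ( -- flag )    source nip >in @ u> 0= ;   \ >IN at end of buffer?
: cur-char ( -- c )       source drop >in @ + c@ ;   \ character at >IN
: in-addr  ( -- c-addr )  source drop >in @ + ;      \ address at >IN

\ Skip white space, refilling as needed. False if the input ran out.
: skip-white ( -- more? )
  begin
    begin  at-end? 0=  while
      cur-char bl u> if  true exit  then
      1 >in +!
    repeat
    refill 0=
  until
  false ;

: parse-names ( uenddelimiter -- ufoundendflag c-addr ulength )
  >r
  skip-white 0= if  r> drop  true in-addr 0  exit  then   \ end of input
  cur-char r@ = if                         \ end delimiter before any name
    1 >in +!  r> drop  true in-addr 0  exit
  then
  in-addr                                  ( c-addr )
  begin                                    \ scan to the end of the name
    at-end? if  false  else  cur-char dup bl u> swap r@ <> and  then
  while
    1 >in +!
  repeat
  in-addr over -                           ( c-addr u )
  at-end? if  false  else
    cur-char r@ =  1 >in +!                \ consume the delimiter
  then                                     ( c-addr u flag )
  r> drop  rot rot ;

\ Usage: count the names up to the closing brace.
: count-names ( "name1 name2 ... }" -- n )
  0 begin  [char] } parse-names nip swap >r  if 1+ then  r>  until ;

count-names a b c} .   \ prints 3
```

Note that the closing brace needs no white space in front of it here,
which is the main difference from a loop built on PARSE-NAME.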
Jim Norris author of DiaperGlu
http://www.rainbarrel.com