Re: Documentation
From: Mohammad-Reza Nabipoor
Subject: Re: Documentation
Date: Tue, 15 Sep 2020 13:44:22 +0430
Hi Jose
On Tue, Sep 15, 2020 at 08:10:23AM +0200, Jose E. Marchesi wrote:
> The manual is way far from being finished. I think publishing it could
> confuse people at this point.
Publishing of an unfinished **good thing** is not bad!
And it's already in an acceptable state. It's very helpful.
> Went thru it and have a few comments, but it would help if you could
> send learn-poke-in-y-minutes.pk as either inline or as an attachment so
> we can discuss it here. (The gitlab.com link you provided won't work
> with a Javascript-disabled browser, and wget'ting it gives a useless
> .html document.)
Sorry for the inconvenience. Every day I wish for a JavaScript-less world (and
in general a web-less world!) :D
Thanks,
Mohammad-Reza
---
/* Copyright (C) 2020, Mohammad-Reza Nabipoor */
/* SPDX-License-Identifier: GFDL-1.3-or-later */
/* GNU poke is an interactive editor for binary data. But it's not just an
* editor: it provides a full-fledged procedural, interactive programming
* language designed to describe data structures and to operate on them.
* The programming language is called Poke (with an upper-case P).
*
* Once the user has a description of binary data, they can *map* it onto
* the actual data and start poking at it, inspecting and modifying the data.
*/
/* First start with nomenclature:
*
* - poke The editor program (also called GNU poke)
* - Poke Domain-specific programming language that is used by `poke`
* - pickle A Poke source file. The filename extension is `.pk`
*/
/* Let's talk about Poke! */
/* Variables
*
* We can define variables in Poke using the `defvar` keyword:
*
* defvar NAME_OF_VARIABLE = VALUE
*/
defvar an_integer = 10;
defvar a_string = "hello, poke users!";
/* Values
*
* The Poke programming language has the following kinds of values:
*
* - Integer
* - String
* - Array
* - Offset
* - Struct
* - Union
* - Function
*/
/* Integer values */
defvar decimal = 10;
defvar hexadecimal = 0xff;
defvar binary = 0b1100;
defvar octal = 0o777;
defvar si8 = 1B; /* byte (8-bit) */
defvar si16 = 2H; /* short (16-bit) */
defvar si32 = 3; /* int (32-bit) */
defvar si64 = 4L; /* long (64-bit) */
defvar ui8 = 4UB; /* unsigned byte (8-bit) */
defvar ui16 = 5UH; /* unsigned short (16-bit) */
defvar ui32 = 6U; /* unsigned int (32-bit) */
defvar ui64 = 7UL; /* unsigned long (64-bit) */
/* String values (null-terminated) */
defvar foobar_string = "foo\nbar";
defvar empty_string = "";
/* Array values */
defvar arr1 = [1, 2, 3];
defvar arr2 = [[1, 2], [3, 4]];
defvar elem10 = arr1[0]; /* Arrays are indexed using the usual notation */
defvar elem12 = arr1[2]; /* This is the last element of `arr1`: 3 */
/* If you try to access elements beyond the bounds, you'll get an
* `E_out_of_bounds` exception.
*/
/* defvar elem1x = arr1[3]; */
/* defvar elem1y = arr1[-1]; */
/* Array trimming: Extraction of a subset of the array */
defvar arr3 = arr1[0:1]; /* arr3 == [1, 2] */
/* Offset values
*
* Poke does not use integers to specify offsets in binary data; it has a
* primitive type for that: offset!
*
* Offsets have two parts:
* - magnitude (an integer)
* - unit (b (bit), B (byte), etc.)
*
* Offsets are also useful for specifying sizes.
*/
/* Offsets with named units */
defvar off_8_bits = 8#b;
defvar off_23_bytes = 23#B;
defvar off_2000_bits = 2#Kb;
defvar off_2000_bytes = 2#KB;
defvar off_3_nibbles = 3#N; /* 3 nibbles (each nibble is 4 bits) */
defvar off_1_byte = #B; /* You can omit the magnitude if it's 1 */
/* Offsets with numeric units */
defvar off_8_8 = 8#8; /* magnitude: 8, unit: 8 bits */
defvar off_2_3 = 2#3; /* magnitude: 2, unit: 3 bits */
/* Offset arithmetic
*
* OFF +- OFF -> OFF
* OFF * INT -> OFF
* OFF / OFF -> INT
* OFF % OFF -> OFF
*/
defvar off_1_plus_2 = 1#B + 2#B; /* 3#B */
defvar off_1_minus_2 = 1#B - 2#B; /* -1#B */
defvar off_8_times_10 = 8#B * 10; /* 80#B */
defvar off_10_times_8 = 10 * 8#B; /* 80#B */
defvar off_7_div_1 = 7#B / 1#B; /* 7 */ /* This is an integer */
defvar off_7_mod_3 = 7#B % 3#B; /* 1#B */
/* The following units are pre-defined in poke:
*
* b, N, B, Kb, KB, Mb, MB, Gb, GB, Kib, KiB, Mib, MiB, Gib, GiB
*/
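/* A small sketch of converting between the pre-defined units: dividing two
* offsets yields an integer (see the arithmetic rules above). The variable
* names are only illustrative.
*/
defvar bytes_per_KB = 1#KB / 1#B; /* 1000 */
defvar bytes_per_KiB = 1#KiB / 1#B; /* 1024 */
defvar bits_per_nibble = 1#N / 1#b; /* 4 */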
/* Types
*
* Before talking about `struct` values, it'd be nice to first talk about types
* in Poke.
*/
/* Integer types
*
* Most general-purpose programming languages provide a small set of integer
* types. Poke, on the contrary, provides a rich set of integer types featuring
* different widths, in both signed and unsigned variants.
*
* `int<N>` is a signed integer with `N`-bit width. `N` can be an integer
* literal in the range `[1, 64]`.
*
* `uint<N>` is the unsigned variant.
*
* Examples:
*
* uint<1>
* uint<7>
* int<64>
*/
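/* A minimal sketch of values with unusual widths, built with the `as` cast
* operator (covered in the Casts section further down). The variable names
* are only illustrative.
*/
defvar one_bit = 1 as uint<1>;
defvar seven_bits = 100 as uint<7>; /* fits: the maximum of uint<7> is 127 */
defvar wide_minus_one = (-1) as int<64>;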
/* String type
*
* There is one string type in Poke: `string`
* Strings in Poke are null-terminated.
*/
/* Array types
*
* There are three kinds of array types:
*
* - Unbounded: arrays that have no explicit boundaries, like `int<32>[]`
* - Bounded by number of elements, like `int<64>[10]`
* - Bounded by size, like `uint<32>[8#B]`
*/
/* Offset types
*
* Offset types are denoted as `offset<BASE_TYPE,UNIT>`, where BASE_TYPE is
* an integer type and UNIT the specification of a unit.
*
* Examples:
*
* offset<int<32>,B>
* offset<uint<12>,Kb>
*/
/* Struct types
*
* Structs are the main abstraction that Poke provides to structure data. A
* struct is a collection of heterogeneous values.
*
* No padding or alignment is inserted between the fields of a struct.
*
* Examples:
*
* struct {
* uint<32> i32;
* uint<64> i64;
* }
*
* struct {
* uint<16> flags;
* uint<8>[32] data;
* }
*
* struct {
* int<32> code;
* string msg;
* int<32> exit_status;
* }
*/
/* User-declared types
*
* There's a mechanism to declare new types:
*
* deftype NAME = TYPE;
*
* where NAME is the name of the new type, and TYPE is either a type specifier
* or the name of some other type.
*
* The supported type specifiers are integral types, the string type, array
* types, struct types, function types, and `any` (the `any` type is used to
* implement polymorphism).
*/
deftype Bit = uint<1>;
deftype Int = int<32>;
deftype Ulong = uint<64>;
deftype String = string; /* Just to show that this is possible! */
deftype Buffer = uint<8>[]; /* Unbounded array of type uint<8> */
deftype Triple = int<32>[3]; /* Bounded array of 3 elements */
deftype Buf1024 = uint<8>[1024#B]; /* Bounded array with size of 1024 bytes */
deftype EmptyStruct = struct {};
deftype BufferStruct = struct
{
Buffer buffer;
};
deftype Pair_32_64 =
struct
{
uint<32> i32;
uint<64> i64;
};
deftype Packet34 =
struct
{
uint<16> flags;
uint<8>[32] data;
};
deftype Error =
struct
{
int<32> code;
string msg;
int<32> exit_status;
};
/* Now back to the values */
/* Struct values */
defvar empty_struct = EmptyStruct {};
deftype Packet =
struct
{
uint<16> flags;
uint<8>[8] data;
};
defvar packet_1 =
Packet
{
flags = 0xff00,
data = [0UB, 1UB, 2UB, 3UB, 4UB, 5UB, 6UB, 7UB],
};
defvar packet_2 =
Packet
{
flags = 1,
/* The following line is invalid because the type of the literals is
* `int<32>`, not `uint<8>`.
*/
/* data = [0, 1, 2, 3, 4, 5, 6, 7], */
/* The user cannot specify fewer than 8 elements because the `data` field is
* a fixed-size array, so the following line is a compilation error:
*/
/* data = [0UB, 1UB, ], */
};
defvar packet_3 =
Packet
{
/* flags = 0, */ /* Fields can be omitted */
/* The element at index 5 (counting from zero) is initialized to `128UB`;
* all uninitialized elements before it are initialized to `128UB`, too.
*/
data = [1UB, .[5] = 128UB, 2UB, 3UB],
};
/* packet_3 ==
Packet{flags=0UH,data=[1UB,128UB,128UB,128UB,128UB,128UB,2UB,3UB]}
*/
deftype Header =
struct
{
uint<8>[2] magic;
offset<uint<32>,B> file_size;
uint<16>; /* Reserved */
uint<16>; /* Reserved */
offset<uint<32>,B> data_offset;
};
deftype Payload =
struct
{
uint<8> magic;
uint<32> data_length;
/* Size of array depends on the `data_length` field */
uint<8>[data_length] data;
};
/* An interesting feature of Poke is that types can also be used as units for
* offsets. The only restriction is that the size of the type must be known at
* compile time.
*/
defvar off_23_packets = 23#Packet; /* magnitude: 23, unit: Packet */
/* Note that this is invalid and gives a compilation error:
*
* defvar off_buffer = 1#Buffer;
*
* because `Buffer` is an unbounded array and the size is unknown at
* compile-time.
*/
/* Offset arithmetic with types as unit of offsets
*/
defvar packet_size = 1#Packet / 1#B; /* 10 */
defvar two_packet_size = 2#Packet / #B; /* 20 */
/* Struct Field Constraints
*
* It is common for the values of struct fields to be constrained to satisfy
* some conditions. Obvious examples are magic numbers and
* specification-derived constraints.
*/
deftype HeaderWithMagic =
struct
{
uint<8> magic : magic == 100UB;
uint<8> version : version <= 3;
offset<uint<32>,B> data_length;
uint<8>[data_length] data;
};
/* The constraint expression should evaluate to an integer value; that value
* is interpreted as a boolean (zero is false, non-zero is true).
*/
/* The following variable definition will raise an exception:
* unhandled constraint violation exception
*/
/* defvar hdrmagic = HeaderWithMagic {}; */
/* This will work because all field constraints are satisfied */
defvar hdrmagic =
HeaderWithMagic
{
magic = 100UB,
};
/* There is another way to specify the constraints: field initializers */
/* Struct Field Initializers
*
* A field initializer has two roles:
* - Introduce constraint of the form: `field == initializer_expression`
* - Initialize the field with initializer expression
*/
deftype HeaderWithInit =
struct
{
uint<8> magic = 100UB;
uint<8> version = 3;
offset<uint<32>,B> data_length;
uint<8>[data_length] data;
};
/* With field initializers, this is possible: */
defvar hdrauto = HeaderWithInit {};
/* hdrauto.magic == 100UB && hdrauto.version == 3UB */
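/* A sketch that follows from the two roles listed above (not verified
* output): a value that differs from the initializer would presumably violate
* the implied constraint `magic == 100UB`, so that line is left commented
* out. The names `hdrbad` and `hdrsame` are only illustrative.
*/
/* defvar hdrbad = HeaderWithInit { magic = 1UB }; */
/* Repeating the initializer value explicitly should satisfy the constraint: */
defvar hdrsame = HeaderWithInit { magic = 100UB };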
/* The only limitation is that we cannot specify a constraint for initialized
* fields.
*/
/* Functions
*
* Functions are lexically scoped.
*/
defun func1 = (uint<32> arg0, uint<64> arg1) uint<32>:
{
return arg0 | arg1 .>> 32; /* `.>>` is bitwise shift right operator */
}
/* `**` is the power operator; `2UL` keeps 2**33 from overflowing 32 bits */
defvar three = func1 (1, 2UL ** 33); /* three == 3 */
defun awesome = (string name) void:
{
printf ("%s is awesome!\n", name);
}
awesome ("Poke"); /* Will print "Poke is awesome!" on terminal */
defvar N = 10;
defun Nsquare = int<32>: /* No input parameter */
{
/* The `N` variable is captured inside the `Nsquare` function */
return N * N;
}
defvar Nsq = Nsquare; /* Nsq == 100 */
N = 20;
defvar Nsq2 = Nsquare; /* Nsq2 == 400 */
/* Functions with optional arguments
*
* Note that the initialization value gets captured in the closure.
*/
defvar ten = 10;
defun double32 = (int<32> n = ten) uint<64>:
{
n = n * 2;
return n;
}
defvar twenty = double32 (); /* twenty == 20UL */
defvar another_twenty = double32; /* It's OK to omit the `()` */
defvar thirty = double32 (15); /* thirty == 30UL */
/* Function with no output (a procedure!) */
defun packet_toggle_flag = (Packet p) void:
{
p.flags = p.flags ^ 1;
}
packet_toggle_flag (packet_1); /* packet_1.flags == 0xff01 */
/* Struct Methods
*/
deftype Point =
struct
{
int<32> x;
int<32> y;
method norm_squared = int<32>:
{
return x*x + y*y;
}
};
defvar point = Point{ x = 10, y = -1 };
defvar point_nsq = point.norm_squared; /* point_nsq == 101 */
/* Unions
*
* Sometimes the structure of a binary format can differ depending on some
* earlier fields. To describe these kinds of formats, Poke provides `union`s.
*
* The first alternative of a `union` whose constraints are satisfied will be
* selected.
*/
deftype PacketU =
struct
{
uint<8> size;
union
{
struct
{
uint<8> type;
uint<8>[size] data;
} : size < 32;
struct
{
uint<16> type;
uint<8>[size - 1] data;
} : size < 128;
struct
{
uint<16> type;
uint<8> flags;
uint<8>[size - 3] data;
};
};
};
defvar packet_u_1 =
PacketU
{
size = 10,
};
defvar packet_u_2 =
PacketU
{
size = 64,
};
defvar packet_u_3 =
PacketU
{
size = 128,
};
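/* Walking through the selection rule for the three values above (a reading of
* the constraints, not captured output):
*
* - packet_u_1: size == 10, so `size < 32` holds and the first alternative
*   is selected.
* - packet_u_2: size == 64, so `size < 32` fails, `size < 128` holds, and
*   the second alternative is selected.
* - packet_u_3: size == 128, so both constraints fail and the unconstrained
*   third alternative is selected.
*/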
/* Casts
*/
defvar num_u32 = 1;
defvar num_u64 = num_u32 as uint<64>;
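/* A further sketch: casting to a narrower type keeps only the low-order bits.
* The variable names are only illustrative.
*/
defvar low_byte = 0x1234 as uint<8>; /* 0x34UB */
defvar minus_one_u8 = (-1) as uint<8>; /* 0xffUB */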
/* Attributes
*
* Each value has a set of attributes.
*/
/* `size` attribute */
defvar sizeof_num_u32 = num_u32'size; /* sizeof_num_u32 == 4#B */
defvar sizeof_num_u64 = num_u64'size; /* sizeof_num_u64 == 8#B */
defvar sbuf = BufferStruct{};
defvar sizeof_sbuf = sbuf'size; /* sizeof_sbuf == 0#B */
defvar sizeof_packet_1 = packet_1'size; /* sizeof_packet_1 == 10#B */
/* `length` attribute */
defvar nelem_arr1 = arr1'length; /* nelem_arr1 == 3 */
defvar nelem_arrx = [1, 2, 3, 4, 5, 6]'length; /* nelem_arrx == 6 */
/* For structs it's the number of fields */
defvar nfields_packet_1 = packet_1'length; /* nfields_packet_1 == 2 */
/* Conditionals
*
* - if-else
* - conditional expression
*/
if (num_u32 & 1) { /* This branch will be evaluated */
num_u32 = num_u32 | 2; /* 1 | 2 == 3 */
num_u64 = num_u64 | 4; /* 1 | 4 == 5 */
} else {
num_u32 = num_u32 | 8; /* 1 | 8 == 9 */
num_u64 = num_u64 | 16; /* 1 | 16 == 17 */
}
defvar a_true_value = num_u32 == 3 && num_u64 == 5;
defvar a_false_value = num_u32 == 9 || num_u64 == 17;
defvar hundred = a_true_value ? 100 : 200;
defvar thousand = a_false_value ? 200 : 1000;
/* Loops
*
* - while
* - for-in
*/
defvar i = 0;
while (1)
{
i = i + 1;
if (i == 10)
break;
}
/* i == 10 */
print "\nList of maintainers:\n";
for (i in ["egeyar", "jmd", "positron", "darnir", "dan.cermak", "bruno",
"ccaione", "eblake", "tim.ruehsen", "sdi1600195", "aaptel"])
{
printf " %v\n", i;
}
defvar digits = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0];
for (i in "0123456789")
{
digits[i - '0'] = i - '0';
}
/* digits == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] */
defvar digitsEven = [8, 6, 4, 2, 0];
for (i in "0123456789" where i % 2 == 0)
{
digitsEven[(i - '0') / 2] = i - '0';
}
/* digitsEven == [0, 2, 4, 6, 8] */
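/* One more for-in sketch: summing the elements of an array. The names
* `total` and `elem` are only illustrative.
*/
defvar total = 0;
for (elem in [1, 2, 3, 4])
{
  total = total + elem;
}
/* total == 10 */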
/* std.pk - Standard definitions for poke
*
* The following types are defined as Standard Integral Types:
* - bit
* - nibble
* - uint8, byte, char, int8
* - uint16, ushort, int16, short
* - uint32, uint, int32, int
* - uint64, ulong, int64, long
*
* Standard Offset Types:
* deftype off64 = offset<int64,b>;
* deftype uoff64 = offset<uint64,b>;
*
* Conversion Functions:
* - catos Character array to string
* - stoca String to character array
* - atoi String to integer
*
* String Functions:
* - strchr Index of first occurrence of the character in string
* - ltrim Left trim
* - rtrim Right trim
*
* Sorting Functions:
* - qsort
*
* CRC Functions:
* - crc32
*
* Date and Time Functions:
* - ptime Print human-readable datetime string given seconds since epoch
*
* Date and Time Types:
* - POSIX_Time32
* - POSIX_Time64
*
* Misc:
* defvar NULL = 0#B;
*/
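/* A small usage sketch for two of the conversion functions listed above. The
* argument lists shown here are assumptions based on the one-line
* descriptions; check std.pk for the exact signatures.
*/
defvar forty_two = atoi ("42"); /* string to integer */
defvar abc_string = catos (['a', 'b', 'c']); /* character array to string */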
/* Now we can talk about the most important concept in Poke: mapping! */
/* Mapping
*
* The purpose of poke is to edit "IO spaces", which are the files, devices,
* or memory areas being edited. This is achieved by **mapping** values.
*/
/* Using the `open` function one can open an IO space; Poke supports the following
* IO spaces:
*
* - Auto-growing memory buffer
* - Address-space of a process
* - File
* - Block device served by an NBD server
*
* It has the following prototype:
*
* defun open = (string HANDLER, uint<64> flags = 0) int<32>
*/
/* open an auto-growing memory buffer */
defvar memio = open("*Arbitrary Name*");
/* open a file */
defvar zeroio = open("/dev/zero");
/* close the IO space */
close(zeroio);
/* To access an IO space we can map a value to some area using this syntax:
*
* TYPE @ OFFSET
* or,
* TYPE @ IOS : OFFSET
*/
defvar ui32num = uint<32> @ 0#B;
defvar i32num = int<32> @ 4#B;
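/* Integers are simple values, so `ui32num` holds a copy of the data and
* assigning to the variable does not touch the IO space. To change the first
* 4 bytes in the IO space, assign through the map operator instead:
*/
uint<32> @ 0#B = 0xaabbccdd;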
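/* A sketch of the second form, `TYPE @ IOS : OFFSET`, with a struct type. A
* mapped struct stays tied to the IO space, so assigning to a field writes
* through to the underlying bytes. This assumes `memio` from above is still
* open; the offset 16#B and the variable names are arbitrary.
*/
defvar mapped_packet = Packet @ memio : 16#B;
mapped_packet.flags = 0xbeefUH;
defvar flags_readback = uint<16> @ memio : 16#B; /* 0xbeefUH */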
/* Endianness
*
* Big-endian is the default endianness. This can be verified by the following
* expression:
*
* get_endian == ENDIAN_BIG
*
* This can be changed using the `set_endian` function.
*/
set_endian(ENDIAN_LITTLE); /* get_endian == ENDIAN_LITTLE */
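/* A sketch of what the setting changes, assuming `memio` from above is still
* open; the offset 32#B and the variable names are arbitrary. With
* little-endian in effect the least significant byte is stored first; with
* big-endian the most significant byte is stored first.
*/
uint<16> @ memio : 32#B = 0x1234UH;
defvar first_byte_le = uint<8> @ memio : 32#B; /* 0x34UB */
set_endian (ENDIAN_BIG);
uint<16> @ memio : 32#B = 0x1234UH;
defvar first_byte_be = uint<8> @ memio : 32#B; /* 0x12UB */
set_endian (ENDIAN_LITTLE); /* restore the setting chosen above */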
/* WIP ... */
/* Based on
*
* https://kernel-recipes.org/en/2019/talks/gnu-poke-an-extensible-editor-for-structured-binary-data/
* GNU poke reference documentation (Texinfo file)
*/