poke-devel

Re: Documentation


From: Mohammad-Reza Nabipoor
Subject: Re: Documentation
Date: Tue, 15 Sep 2020 13:44:22 +0430

Hi Jose

On Tue, Sep 15, 2020 at 08:10:23AM +0200, Jose E. Marchesi wrote:

> The manual is way far from being finished.  I think publishing it could
> confuse people at this point.

Publishing of an unfinished **good thing** is not bad!
And it's already in an acceptable state. It's very helpful.


> Went thru it and have a few comments, but it would help if you could
> send learn-poke-in-y-minutes.pk as either inline or as an attachment so
> we can discuss it here.  (The gitlab.com link you provided won't work
> with a Javascript-disabled browser, and wget'ting it gives a useless
> .html document.)

Sorry for the inconvenience. Every day I wish for a JavaScript-less world (and
in general a web-less world!) :D


Thanks,
Mohammad-Reza

---

/* Copyright (C) 2020, Mohammad-Reza Nabipoor */
/* SPDX-License-Identifier: GFDL-1.3-or-later */

/* GNU poke is an interactive editor for binary data. But it's not just an
 * editor: it provides a full-fledged procedural, interactive programming
 * language designed to describe data structures and to operate on them.
 * The programming language is called Poke (with an upper-case P).
 *
 * When the user has a description of binary data, they can *map* it on
 * the actual data and start poking the data! The user can inspect and modify
 * data.
 */

/* First start with nomenclature:
 *
 *   - poke      The editor program (also called GNU poke)
 *   - Poke      Domain-specific programming language that is used by `poke`
 *   - pickle    A Poke source file. The extension of filename is `.pk`
 */

/* Let's talk about Poke! */

/* Variables
 *
 * We can define variables in Poke using the `defvar` keyword:
 *
 *   defvar NAME_OF_VARIABLE = VALUE
 */

defvar an_integer = 10;
defvar a_string = "hello, poke users!";

/* Values
 *
 * The Poke programming language has the following kinds of values:
 *
 *   - Integer
 *   - String
 *   - Array
 *   - Offset
 *   - Struct
 *   - Union
 *   - Function
 */


/* Integer values */
defvar decimal = 10;
defvar hexadecimal = 0xff;
defvar binary = 0b1100;
defvar octal = 0o777;

defvar si8  = 1B;     /* byte (8-bit)  */
defvar si16 = 2H;     /* short (16-bit) */
defvar si32 = 3;      /* int  (32-bit) */
defvar si64 = 4L;     /* long (64-bit) */

defvar ui8  = 4UB;    /* unsigned byte (8-bit)  */
defvar ui16 = 5UH;    /* unsigned short (16-bit) */
defvar ui32 = 6U;     /* unsigned int  (32-bit) */
defvar ui64 = 7UL;    /* unsigned long (64-bit) */


/* String values (null-terminated) */
defvar foobar_string = "foo\nbar";
defvar empty_string = "";


/* Array values */
defvar arr1 = [1, 2, 3];
defvar arr2 = [[1, 2], [3, 4]];

defvar elem10 = arr1[0];    /* Arrays are indexed using the usual notation */
defvar elem12 = arr1[2];    /* This is the last element of `arr1`: 3 */

/* If you try to access elements beyond the bounds, you'll get an
 * `E_out_of_bounds` exception.
 */
/* defvar elem1x = arr1[3]; */
/* defvar elem1y = arr1[-1]; */

/* Array trimming: Extraction of a subset of the array */
defvar arr3   = arr1[0:1];  /* arr3 == [1, 2] */


/* Offset values
 *
 * Poke does not use integers to specify offsets in binary data; it has a
 * primitive type for that: offset!
 *
 * Offsets have two parts:
 *  - magnitude (an integer)
 *  - unit      (b (bit), B (byte), etc.)
 *
 * Offsets are also useful for specifying the size.
 */

/* Offsets with named units */
defvar off_8_bits     = 8#b;
defvar off_23_bytes   = 23#B;
defvar off_2000_bits  = 2#Kb;
defvar off_2000_bytes = 2#KB;
defvar off_3_nibbles  = 3#N;    /* 3 nibbles (each nibble is 4 bits) */

defvar off_1_byte = #B;   /* You can omit the magnitude if it's 1 */

/* Offsets with numeric units */
defvar off_8_8 = 8#8;    /* magnitude: 8, unit: 8 bits */
defvar off_2_3 = 2#3;    /* magnitude: 2, unit: 3 bits */

/* Offset arithmetic
 *
 * OFF +- OFF -> OFF
 * OFF *  INT -> OFF
 * OFF /  OFF -> INT
 * OFF %  OFF -> OFF
 */
defvar off_1_plus_2   = 1#B + 2#B;    /* 3#B  */
defvar off_1_minus_2  = 1#B - 2#B;    /* -1#B */
defvar off_8_times_10 = 8#B * 10;     /* 80#B */
defvar off_10_times_8 = 10  * 8#B;    /* 80#B */
defvar off_7_div_1    = 7#B / 1#B;    /* 7    */  /* This is an integer */
defvar off_7_mod_3    = 7#B % 3#B;    /* 1#B  */

/* The following units are pre-defined in poke:
 *
 *   b, N, B, Kb, KB, Mb, MB, Gb, GB, Kib, KiB, Mib, MiB, Gib, GiB
 */
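
/* A small sketch of unit conversion using the OFF / OFF -> INT rule above:
 * dividing an offset by a one-unit offset gives the magnitude in that unit.
 * The variable names below are introduced here only for illustration.
 */
defvar bits_in_2_bytes = 2#B / 1#b;      /* 16   */
defvar bytes_in_1_KB   = 1#KB / 1#B;     /* 1000 */
defvar bytes_in_1_KiB  = 1#KiB / 1#B;    /* 1024 */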


/* Types
 *
 * Before talking about `struct` values, it'd be nice to first talk about types
 * in Poke.
 */

/* Integer types
 *
 * Most general-purpose programming languages provide a small set of integer
 * types. Poke, on the contrary, provides a rich set of integer types featuring
 * different widths, in both signed and unsigned variants.
 *
 * `int<N>` is a signed integer with `N`-bit width. `N` can be an integer
 * literal in the range `[1, 64]`.
 *
 * `uint<N>` is the unsigned variant.
 *
 * Examples:
 *
 *    uint<1>
 *    uint<7>
 *    int<64>
 */
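
/* A minimal sketch of values with unusual widths, using the `as` cast
 * operator (covered in the "Casts" section later in this file). The
 * variable names are hypothetical and exist only for this illustration.
 */
defvar one_bit     = 1 as uint<1>;
defvar twelve_bits = 0xfff as uint<12>;    /* Largest value that fits in 12 bits */
defvar wide_signed = -1 as int<64>;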

/* String type
 *
 * There is one string type in Poke: `string`
 * Strings in Poke are null-terminated.
 */

/* Array types
 *
 * There are three kinds of array types:
 *
 *   - Unbounded: arrays that have no explicit boundaries, like `int<32>[]`
 *   - Bounded by number of elements, like `int<64>[10]`
 *   - Bounded by size, like `uint<32>[8#B]`
 */

/* Offset types
 *
 * Offset types are denoted as `offset<BASE_TYPE,UNIT>`, where BASE_TYPE is
 * an integer type and UNIT is the specification of a unit.
 *
 * Examples:
 *
 *   offset<int<32>,B>
 *   offset<uint<12>,Kb>
 */

/* Struct types
 *
 * Structs are the main abstraction that Poke provides to structure data. A
 * struct is a collection of heterogeneous values.
 *
 * There is no padding or alignment between the fields of a struct.
 *
 * Examples:
 *
 *   struct {
 *     uint<32> i32;
 *     uint<64> i64;
 *   }
 *
 *   struct {
 *     uint<16> flags;
 *     uint<8>[32] data;
 *   }
 *
 *   struct {
 *     int<32> code;
 *     string msg;
 *     int<32> exit_status;
 *   }
 */


/* User-declared types
 *
 * There's a mechanism to declare new types:
 *
 *   deftype NAME = TYPE;
 *
 * where NAME is the name of the new type, and TYPE is either a type specifier
 * or the name of some other type.
 *
 * The supported type specifiers are integral types, string type, array types,
 * struct types, function types, and `any` (The `any` type is used to
 * implement polymorphism).
 */

deftype Bit   = uint<1>;
deftype Int   = int<32>;
deftype Ulong = uint<64>;

deftype String = string;    /* Just to show that this is possible! */

deftype Buffer  = uint<8>[];        /* Unbounded array of type uint<8> */
deftype Triple  = int<32>[3];       /* Bounded array of 3 elements */
deftype Buf1024 = uint<8>[1024#B];  /* Bounded array with size of 1024 bytes */

deftype EmptyStruct = struct {};
deftype BufferStruct = struct
  {
    Buffer buffer;
  };
deftype Pair_32_64 =
  struct
  {
    uint<32> i32;
    uint<64> i64;
  };
deftype Packet34 =
  struct
  {
    uint<16> flags;
    uint<8>[32] data;
  };
deftype Error =
  struct
  {
    int<32> code;
    string msg;
    int<32> exit_status;
  };


/* Now back to the values */


/* Struct values */

defvar empty_struct = EmptyStruct {};

deftype Packet =
  struct
  {
    uint<16> flags;
    uint<8>[8] data;
  };

defvar packet_1 =
  Packet
  {
    flags = 0xff00,
    data = [0UB, 1UB, 2UB, 3UB, 4UB, 5UB, 6UB, 7UB],
  };

defvar packet_2 =
  Packet
  {
    flags = 1,

    /* The following line is invalid, because the type of these literals is
     * `int<32>` and not `uint<8>`.
     */
    /* data = [0, 1, 2, 3, 4, 5, 6, 7], */

    /* The user cannot specify fewer than 8 elements, because the `data` field
     * is a fixed-size array. So the following line is a compilation error:
     */
    /* data = [0UB, 1UB, ], */
  };

defvar packet_3 =
  Packet
  {
    /* flags = 0, */    /* Fields can be omitted */

    /* The fifth element (counting from zero) is initialized to `128UB`;
     * and all uninitialized values before that will be initialized to `128UB`,
     * too.
     */
    data = [1UB, .[5] = 128UB, 2UB, 3UB],
  };
/* packet_3 ==
 *   Packet{flags=0UH,data=[1UB,128UB,128UB,128UB,128UB,128UB,2UB,3UB]}
 */

deftype Header =
  struct
  {
    uint<8>[2] magic;
    offset<uint<32>,B> file_size;
    uint<16>;    /* Reserved */
    uint<16>;    /* Reserved */
    offset<uint<32>,B> data_offset;
  };

deftype Payload =
  struct
  {
    uint<8> magic;
    uint<32> data_length;

    /* Size of array depends on the `data_length` field */
    uint<8>[data_length] data;
  };

/* An interesting feature of Poke is that types can also be used as units for
 * offsets. The only restriction is that the size of the type must be known
 * at compile-time.
 */
 */
defvar off_23_packets = 23#Packet;    /* magnitude: 23, unit: Packet */

/* Note that this is invalid and gives a compilation error:
 *
 *   defvar off_buffer = 1#Buffer;
 *
 * because `Buffer` is an unbounded array and the size is unknown at
 * compile-time.
 */

/* Offset arithmetic with types as unit of offsets
 */
defvar packet_size     = 1#Packet / 1#B;    /* 10 */
defvar two_packet_size = 2 #Packet/#B;      /* 20 */


/* Struct Field Constraints
 *
 * It is common for the values of struct fields to be constrained to
 * satisfy some conditions.  Obvious examples are magic numbers and
 * specification-derived constraints.
 */
deftype HeaderWithMagic =
  struct
  {
    uint<8> magic : magic == 100UB;
    uint<8> version : version <= 3;
    offset<uint<32>,B> data_length;
    uint<8>[data_length] data;
  };
/* The constraint expression should evaluate to an integer value; that value
 * is interpreted as a boolean (zero is false, anything else is true).
 */

/* The following variable definition will raise an exception:
 *   unhandled constraint violation exception
 */
/* defvar hdrmagic = HeaderWithMagic {}; */

/* This will work because all field constraints are satisfied */
defvar hdrmagic =
  HeaderWithMagic
  {
    magic = 100UB,
  };

/* There is another way to specify the constraints: field initializers  */

/* Struct Field Initializers
 *
 * A field initializer has two roles:
 *   - Introduce constraint of the form: `field == initializer_expression`
 *   - Initialize the field with initializer expression
 */
deftype HeaderWithInit =
  struct
  {
    uint<8> magic = 100UB;
    uint<8> version = 3;

    offset<uint<32>,B> data_length;
    uint<8>[data_length] data;
  };

/* With field initializers, this is possible: */
defvar hdrauto = HeaderWithInit {};
/* hdrauto.magic == 100UB && hdrauto.version == 3UB */

/* The only limitation is that we cannot specify a constraint for initialized
 * fields.
 */


/* Functions
 *
 * Functions are lexically scoped.
 */
defun func1 = (uint<32> arg0, uint<64> arg1) uint<32>:
  {
    return arg0 | arg1 .>> 32;    /* `.>>` is the bitwise right-shift operator */
  }

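/* `**` is the power operator. The `UL` suffix keeps the computation in 64
 * bits; a plain `2**33` would be computed as a 32-bit integer and overflow.
 */
defvar three = func1 (1, 2UL**33);   /* three == 3 */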

defun awesome = (string name) void:
  {
    printf ("%s is awesome!\n", name);
  }
awesome ("Poke");    /* Will print "Poke is awesome!" on terminal */

defvar N = 10;
defun Nsquare = int<32>:    /* No input parameter */
  {
    /* The `N` variable is captured inside the `Nsquare` function */
    return N * N;
  }

defvar Nsq = Nsquare;     /* Nsq == 100 */

N = 20;
defvar Nsq2 = Nsquare;    /* Nsq2 == 400 */


/* Functions with optional arguments
 *
 * Note that the value of initialization gets captured in the closure.
 */

defvar ten = 10;
defun double32 = (int<32> n = ten) uint<64>:
  {
    n = n * 2;
    return n;
  }

defvar twenty = double32 ();         /* twenty == 20UL */
defvar another_twenty = double32;    /* It's OK to omit the `()` */
defvar thirty = double32 (15);       /* thirty == 30UL */

/* Function with no output (a procedure!) */
defun packet_toggle_flag = (Packet p) void:
  {
    p.flags = p.flags ^ 1;
  }

packet_toggle_flag (packet_1);    /* packet_1.flags == 0xff01 */


/* Struct Methods
 */
deftype Point =
  struct
  {
    int<32> x;
    int<32> y;

    method norm_squared = int<32>:
      {
        return x*x + y*y;
      }
  };

defvar point = Point{ x = 10, y = -1 };
defvar point_nsq = point.norm_squared;    /* point_nsq == 101 */


/* Unions
 *
 * Sometimes the structure of a binary format can be different depending on
 * some earlier fields. To describe these kinds of formats, Poke provides
 * `union`s.
 *
 * The first field of the `union` whose constraints are satisfied will be
 * selected.
 */
deftype PacketU =
  struct
  {
    uint<8> size;

    union
    {
      struct
      {
        uint<8> type;
        uint<8>[size] data;
      } : size < 32;

      struct
      {
        uint<16> type;
        uint<8>[size - 1] data;
      } : size < 128;

      struct
      {
        uint<16> type;
        uint<8> flags;
        uint<8>[size - 3] data;
      };
    };
  };


defvar packet_u_1 =
  PacketU
  {
    size = 10,
  };
defvar packet_u_2 =
  PacketU
  {
    size = 64,
  };
defvar packet_u_3 =
  PacketU
  {
    size = 128,
  };


/* Casts
 */
defvar num_u32 = 1;
defvar num_u64 = num_u32 as uint<64>;


/* Attributes
 *
 * Each value has a set of attributes.
 */

/* `size` attribute */

defvar sizeof_num_u32 = num_u32'size;    /* sizeof_num_u32 == 4#B */
defvar sizeof_num_u64 = num_u64'size;    /* sizeof_num_u64 == 8#B */

defvar sbuf = BufferStruct{};
defvar sizeof_sbuf = sbuf'size;          /* sizeof_sbuf == 0#B */
defvar sizeof_packet_1 = packet_1'size;  /* sizeof_packet_1 == 10#B */

/* `length` attribute */

defvar nelem_arr1 = arr1'length;         /* nelem_arr1 == 3 */
defvar nelem_arrx = [1, 2, 3, 4, 5, 6]'length;    /* nelem_arrx == 6 */

/* For structs it's the number of fields */
defvar nfields_packet_1 = packet_1'length;      /* nfields_packet_1 == 2 */
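
/* A further sketch of the `length` attribute, applied to a nested array and
 * to a string literal. The variable names are introduced here only for
 * illustration. For a nested array, `length` counts the outer elements.
 */
defvar nelem_arr2 = arr2'length;       /* nelem_arr2 == 2 (two inner arrays) */
defvar nchars_poke = "poke"'length;    /* nchars_poke == 4 */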


/* Conditionals
 *
 *   - if-else
 *   - conditional expression
 */

if (num_u32 & 1) { /* This branch will be executed */
  num_u32 = num_u32 | 2;    /* 1 | 2 == 3 */
  num_u64 = num_u64 | 4;    /* 1 | 4 == 5 */
} else {
  num_u32 = num_u32 | 8;    /* 1 | 8 == 9 */
  num_u64 = num_u64 | 16;   /* 1 | 16 == 17 */
}

defvar a_true_value = num_u32 == 3 && num_u64 == 5;
defvar a_false_value = num_u32 == 9 || num_u64 == 17;

defvar hundred = a_true_value ? 100 : 200;
defvar thousand = a_false_value ? 200 : 1000;


/* Loops
 *
 *   - while
 *   - for-in
 */

defvar i = 0;
while (1)
{
  i = i + 1;
  if (i == 10)
    break;
}
/* i == 10 */

print "\nList of maintainers:\n";
for (i in ["egeyar", "jmd", "positron", "darnir", "dan.cermak", "bruno",
  "ccaione", "eblake", "tim.ruehsen", "sdi1600195", "aaptel"])
  {
    printf "  %v\n", i;
  }

defvar digits = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0];
for (i in "0123456789")
  {
    digits[i - '0'] = i - '0';
  }
/* digits == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] */

defvar digitsEven = [8, 6, 4, 2, 0];
for (i in "0123456789" where i % 2 == 0)
  {
    digitsEven[(i - '0') / 2] = i - '0';
  }
/* digitsEven == [0, 2, 4, 6, 8] */
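
/* A minimal sketch of a common for-in pattern: accumulating over the
 * elements of an array. The names `total` and `e` are hypothetical and are
 * used only in this illustration.
 */
defvar total = 0;
for (e in [1, 2, 3, 4])
  {
    total = total + e;
  }
/* total == 10 */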


/* std.pk - Standard definitions for poke
 *
 * The following types are defined as Standard Integral Types:
 *   - bit
 *   - nibble
 *   - uint8, byte, char, int8
 *   - uint16, ushort, int16, short
 *   - uint32, uint, int32, int
 *   - uint64, ulong, int64, long
 *
 * Standard Offset Types:
 *   deftype off64 = offset<int64,b>;
 *   deftype uoff64 = offset<uint64,b>;
 *
 * Conversion Functions:
 *   - catos  Character array to string
 *   - stoca  String to character array
 *   - atoi   String to integer
 *
 * String Functions:
 *   - strchr  Index of first occurrence of the character in string
 *   - ltrim   Left trim
 *   - rtrim   Right trim
 *
 * Sorting Functions:
 *   - qsort
 *
 * CRC Functions:
 *   - crc32
 *
 * Date and Time Functions:
 *   - ptime   Print human-readable datetime string given seconds since epoch
 *
 * Date and Time Types:
 *   - POSIX_Time32
 *   - POSIX_Time64
 *
 * Misc:
 *   defvar NULL = 0#B;
 */


/* Now we can talk about the most important concept in Poke: mapping! */


/* Mapping
 *
 * The purpose of poke is to edit "IO spaces", which are the files or devices,
 * or memory areas being edited.  This is achieved by **mapping** values.
 */

/* Using the `open` function one can open an IO space. Poke supports the
 * following IO spaces:
 *
 *   - Auto-growing memory buffer
 *   - Address-space of a process
 *   - File
 *   - Block device served by an NBD server
 *
 * It has the following prototype:
 *
 *   defun open = (string HANDLER, uint<64> flags = 0) int<32>
 */

/* open an auto-growing memory buffer */
defvar memio = open("*Arbitrary Name*");

/* open a file */
defvar zeroio = open("/dev/zero");

/* close the IO space */
close(zeroio);

/* To access an IO space we can map a value to some area using this syntax:
 *
 *     TYPE @ OFFSET
 * or,
 *     TYPE @ IOS : OFFSET
 */
defvar ui32num = uint<32> @ 0#B;
defvar i32num = int<32> @ 4#B;

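/* Integers are simple values, so assigning to the `ui32num` variable would
 * not touch the IO space. To change the first 4 bytes in the IO space, assign
 * to the map itself.
 */
uint<32> @ 0#B = 0xaabbccdd;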
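
/* A sketch of mapping a whole struct with the `TYPE @ IOS : OFFSET` form,
 * assuming `memio` (opened above) is still open and can hold a `Packet` at
 * offset 0#B. The name `mapped_packet` is introduced only for illustration.
 * Updating a field of a mapped struct is expected to update the underlying
 * bytes in the IO space.
 */
defvar mapped_packet = Packet @ memio : 0#B;
mapped_packet.flags = 0xffffUH;    /* Touches the first 2 bytes of `memio` */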

/* Endianness
 *
 * Big-endian is the default endian-ness. This can be verified by the following
 * expression:
 *
 *   get_endian == ENDIAN_BIG
 *
 * This can be changed using `set_endian` function.
 */
set_endian(ENDIAN_LITTLE);    /* get_endian == ENDIAN_LITTLE */


/* WIP ... */


/* Based on:
 *   - https://kernel-recipes.org/en/2019/talks/gnu-poke-an-extensible-editor-for-structured-binary-data/
 *   - GNU poke reference documentation (Texinfo file)
 */


