Re: What is the equivalent type in GM2?


From: Gaius Mulley
Subject: Re: What is the equivalent type in GM2?
Date: Mon, 30 Mar 2020 14:32:23 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Benjamin Kowarsch <address@hidden> writes:

> On Sat, 28 Mar 2020 at 17:10, Hưng Hưng <address@hidden>
> wrote:
>
>     The C version doesn't impose any max length.
>
> Strictly speaking, C the language considers char* to be a pointer to a
> single character, thus a maximum length of one.
>
> However, the C compiler doesn't care about type safety and allows you
> to read and write past the end of that character string of length one.
>
> This is the single most important reason why the major operating
> systems today are so vulnerable to cyber attacks. The vast majority of
> security vulnerabilities are based on buffer overflow exploits in C.
>
> By contrast, Modula-2 was specifically designed as a type safe
> language. For this reason, the compiler does not permit you to read
> and write past the declared capacity limit of a type.
>
> This is not a bug, but a feature. And a very important feature that we
> want to keep.
>
>     So I have to guess and pick a value for the max length that I find
>     suitable? It's like we impose an imaginary/artificial limitation
>     on our binding for no reason at all.
>     
>
> The use of static character arrays goes back to the 1960s. Nowadays we
> tend to use dynamic collection types where the capacity is determined
> during allocation at runtime. But this isn't built into the language.
> Instead, it has to be supplied in form of libraries.
>
> You may want to consider writing yourself a dynamic string library and
> then use that.
>
> Or perhaps you can find one that meets your requirements and use that.
>
> The indeterminate record types Gaius and I were talking about in the
> other thread are specifically designed to allow easy implementation of
> dynamic collection types but with type safety.
>
> TYPE DynString = POINTER TO RECORD
> length : LONGCARD;
> + string : ARRAY OF CHAR
> END;
>
> then ...
>
> VAR str : DynString;
>
> NEW str CAPACITY 1000;
>
> after which
>
> CAPACITY(str) will return 1000
>
> and LENGTH(str) will return 0.
>
> Alternatively, with initialisation string ...
>
> NEW str := "The quick brown fox jumps over the lazy dog.";
>
> after which
>
> CAPACITY(str) and LENGTH(str) will both return 44.
>
> However, that's not available yet. So, in the meantime, you will have
> to either use dangerous pointer arithmetic in your dynamic type
> implementation, or if you want to keep type safety you will need to be
> creative.
>
> I have implemented a dynamic string library for interned strings in
> one of my projects which is available at github.
>
> PIM version
> https://github.com/m2sf/m2pp/blob/master/src/String.pim.def
> https://github.com/m2sf/m2pp/blob/master/src/imp/String.pim.mod
>
> ISO version
> https://github.com/m2sf/m2pp/blob/master/src/String.iso.def
> https://github.com/m2sf/m2pp/blob/master/src/imp/String.iso.mod
>
> This uses a Passepartout, which is French for a key that matches
> multiple locks.
>
> TYPE Passepartout = POINTER TO StrBlank.Largest;
>
> TYPE StringDescriptor = RECORD
> length : CARDINAL;
> intern : Passepartout
> END;
>
> where StrBlank.Largest is defined in
>
> https://github.com/m2sf/m2pp/blob/master/src/StrBlank.def
> https://github.com/m2sf/m2pp/blob/master/src/imp/StrBlank.mod
>
> which contains a number of length specific character array types.
>
> Type Largest is the largest character array type available.
>
> When a new dynamic string is allocated, the library determines the
> character array that is the closest match for capacity and allocates a
> new dynamic string of that type, which is then linked to the intern
> field using a CAST since the formal type of field intern is of the
> largest character array type. However the benefit is that we can still
> use array subscript notation to address individual characters in the
> string instead of having to use pointer arithmetic. The casting
> between type Largest and the actually allocated character array type
> is confined to this one library and happens only in two or three
> places. Outside the library the strings are only accessible via the
> library's API.
>
> This is a reasonable compromise between readability, convenience and
> type safety. Besides, using pointer arithmetic would be less readable,
> less convenient and less safe. So it is the best you can do with
> classical Modula-2 at this time.
>
>
>     On Sat, 28 Mar 2020 at 14:47, Benjamin Kowarsch
>     <address@hidden> wrote:
>     
>         A pointer to char in C is not equivalent to a pointer to CHAR
>         in Modula-2.
>         
>         
>         In C a string may be either a char array or a pointer to a
>         single char where the lack of type safety is then EXPLOITED to
>         ignore the fact that the pointer type points to a single char,
>         not a character string, and with DEVASTATING CONSEQUENCES !!!
>         
>         
>         By contrast, in Modula-2 a string is a character array with a
>         maximum capacity associated to the type and type safety is
>         enforced, thus a pointer to a single character is always
>         interpreted correctly as having a payload of only one single
>         character.
>         
>         
>         Thus, the closest equivalent of
>         
>         
>         char* str;
>         
>         
>         in Modula-2 would be
>         
>         
>         POINTER TO ARRAY [0..MaxStrLen] OF CHAR;
>         
>         
>         where MaxStrLen must be a compile time constant, that is, it
>         cannot be changed dynamically at runtime.
>         
>         
>         And if you have a static character array string in Modula-2,
>         like
>         
>         
>         VAR str : ARRAY [0..80] OF CHAR;
>         
>         
>         then you can't just pass str to a char* parameter of a C
>         function. Instead you need to pass a pointer to it.
>         
>         
>         TYPE Str80 = ARRAY [0..80] OF CHAR;
>         VAR str : Str80;
>         
>         
>         TYPE Str80Ptr = POINTER TO Str80;
>         VAR strPtr : Str80Ptr;
>         
>         
>         then
>         
>         
>         str := "the quick brown fox jumps over the lazy dog.";
>         strPtr := VAL(Str80Ptr, ADR(str));
>         
>         
>         then
>         
>         
>         passToC(strPtr);
>         
>         
>         assuming
>         
>         
>         void passToC(const char* s);
>         
>         
>         That said, GM2 may already map an argument of a character
>         array type to char* when C functions are declared using the
>         DEFINITION MODULE FOR "C" syntax. Even if it does, it likely
>         won't do the same for char** and char***.
>         
>         
>         Thus, if the C function parameters are char** then you need
>         
>         
>         POINTER TO POINTER TO ARRAY [0..MaxStrLen] OF CHAR;
>         
>         
>         Likewise for char*** you need
>         
>         
>         POINTER TO POINTER TO POINTER TO ARRAY [0..MaxStrLen] OF CHAR;
>         
>         
>         As I have mentioned before, the best way to interface to C
>         APIs is to use a layered approach where the lowest level
>         interfaces directly with the C API and a user level provides a
>         wrapped Modula-2 representation that is independent of the C
>         API. In the lower level library you can then convert and cast
>         types as needed to pass between C and Modula-2.
>         
>         
>         On Sat, 28 Mar 2020 at 03:27, Hưng Hưng
>         <address@hidden> wrote:
>         
>         
>             
>             Let me add some additional information. If I use the
>             pointer trick, e.g. PChar, PPChar, PPPChar, then I can't
>             pass an M2 string to a C function, because it requires a C
>             string. If I try, the compiler complains because it expects
>             char to have a length of only 1. M2 and C process strings
>             very differently, so as far as I can see the pointer-to-char
>             trick from C does not work in M2.
>             
>             
>             There is a procedure in module DynamicStrings that converts
>             between an M2 string and a C string, but again, how do I
>             translate these data types correctly? If we use the pointer
>             trick, we then have to figure out how to represent PPChar
>             and PPPChar, because the procedure in DynamicStrings only
>             helps up to this point. It's circular reasoning. I feel
>             like my head is going to explode.
>             
>
>             
>             
>             On Sat, 28 Mar 2020 at 01:15, Hưng Hưng
>             <address@hidden> wrote:
>             
>             
>                 
>                 One C function returns or takes a C string as a
>                 parameter, which is an array of char or a pointer to
>                 unsigned char.
>                 
>                 
>                 Another function returns or takes an array of C
>                 strings as a parameter, which is a 2D array of char or
>                 a pointer to pointer to unsigned char.
>                 
>                 
>                 Another function returns or takes an array of arrays
>                 of C strings as a parameter, which is a 3D array of
>                 char or a pointer to pointer to pointer to unsigned
>                 char.
>                 
>                 
>                 It's too complex. C code tends to abuse pointers too
>                 much.
>                 
>                 
>                 e.g:
>                 
>                 
>                 void IupResetAttribute(Ihandle* ih, const char* name);
>                 
>                 
>                 int IupGetAllAttributes(Ihandle* ih, char** names, int n);
>                 
>                 
>                 int IupOpen (int *argc, char ***argv);

Hi,

it might also be worth examining DynamicStrings in gm2 - there is a
'string' procedure which will convert a String to a C string.

So you can do:

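FROM SYSTEM IMPORT ADDRESS ;
FROM DynamicStrings IMPORT String, InitString, KillString, string ;

VAR
   s : String ;
   cs: ADDRESS ;
BEGIN
   s := InitString ('hello world') ;  (* convert "hello world" into a
                                         dynamic string type.  *)
   cs := string (s) ;                 (* cs is a C compatible string
                                         attached to s.  *)
   s := KillString (s) ;              (* deconstruct s and cs; cs must
                                         not be used after this.  *)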
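
For illustration only, here is a minimal sketch of what the receiving C
side might look like. It assumes the pointer handed over (cs above) is
seen by C as a plain const char* to a NUL-terminated buffer. The helper
copy_bounded and the buffer received are hypothetical names, and the
body given for passToC (the prototype quoted earlier in the thread) is
an assumption, not an existing implementation. The point is that the C
side has to carry the capacity explicitly, because char* itself carries
none.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a NUL-terminated C string into a buffer of
   known capacity, truncating if necessary.  Returns the number of
   characters stored, not counting the terminating NUL.  */
static size_t copy_bounded(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    size_t n = strlen(src);
    if (n > cap - 1)
        n = cap - 1;          /* leave room for the terminating NUL */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Same storage size as ARRAY [0..80] OF CHAR, i.e. 81 characters.  */
static char received[81];

/* Stand-in body for the passToC prototype quoted in the thread; what
   it does with the string is an assumption made for this sketch.  */
void passToC(const char *s)
{
    copy_bounded(received, sizeof received, s);
}
```

Since the string is copied, the C side does not keep a pointer into
storage that KillString later releases.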



