bug-gnulib

supporting strings > 2 GB


From: Bruno Haible
Subject: supporting strings > 2 GB
Date: Sat, 12 Oct 2019 16:38:49 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-165-generic; KDE/5.18.0; x86_64; ; )

Hi Paul, Eric,

I'd like to lift the INT_MAX limit on string size for
  * the *printf family of functions,
  * the wcswidth, mbswidth functions,
as has been done for large files and regular expressions.

The benefits I expect from that are:
  - Support of strings > 2 GB or 4 GB without making applications more complex.
  - Since such strings occur rarely, these corner cases of the code are most
    often untested. The change would eliminate these untested corners, thus
    eliminating a number of bugs.

How was it done for regular expressions?
  1) POSIX introduced a type 'regoff_t' that is to be used instead of 'int',
     in the context of the regex APIs.
     https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/regex.h.html
  2) glibc introduced a preprocessor define _REGEX_LARGE_OFFSETS.
  3) gnulib defines _REGEX_LARGE_OFFSETS to 1.
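
To illustrate what this means for callers: code that is ready for large
offsets keeps match positions in 'regoff_t' instead of assuming 'int'. Below
is a minimal sketch that uses only the POSIX regcomp/regexec calls. The helper
name first_match_length is hypothetical, and the sketch deliberately does not
define _REGEX_LARGE_OFFSETS itself, since that is meant to be set by the build
environment (such as gnulib), not by application code.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return the length of the first match of PATTERN
   in TEXT, or -1 if there is no match or the pattern does not compile.
   All offsets are kept in regoff_t, so the code does not care whether
   regoff_t is 'int' or a wider signed type.  */
static regoff_t
first_match_length (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t match[1];
  regoff_t len = -1;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  if (regexec (&re, text, 1, match, 0) == 0)
    len = match[0].rm_eo - match[0].rm_so;   /* both fields are regoff_t */

  regfree (&re);
  return len;
}
```

A caller would write e.g. 'regoff_t n = first_match_length ("b+", text);'
and never store the result in an 'int'.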

In a similar vein, I think it could be done like this for *printf:
  1) Introduce a type 'printf_len_t' that is a signed type, either 'int' or
     'ptrdiff_t', along with a corresponding constant PRINTF_LEN_MAX.
  2) For each *printf function that returns 'int', define a similar function
     *printfl that returns 'printf_len_t'.
  3) Introduce %ln as a printf_len_t alternative to %n.
  4) If _PRINTF_LARGE is defined and non-zero, define xxxprintf as an alias
     of xxxprintfl (e.g. '#define xxxprintf xxxprintfl').
  5) Gnulib defines _PRINTF_LARGE to 1.
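
The steps above could look roughly like this in a header. This is only a
sketch under assumptions: it picks 'ptrdiff_t' for printf_len_t, shows a
single function (snprintfl), and implements it as a thin wrapper around the
existing vsnprintf, so it demonstrates the API shape but is still bound by
the 'int' limit internally. A real implementation would need a formatting
engine that counts in printf_len_t. Step 3 is left out, since in ISO C
'%ln' already designates a 'long *' argument.

```c
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Step 1: the length type and its maximum (assumption: ptrdiff_t).  */
typedef ptrdiff_t printf_len_t;
#define PRINTF_LEN_MAX PTRDIFF_MAX

/* Step 2: a variant of snprintf that returns printf_len_t.
   Stand-in only: it forwards to vsnprintf, which still returns 'int'.  */
static printf_len_t
snprintfl (char *buf, size_t size, const char *format, ...)
{
  va_list args;
  int ret;

  va_start (args, format);
  ret = vsnprintf (buf, size, format, args);
  va_end (args);
  return ret;   /* widened to printf_len_t; negative still means error */
}

/* Step 5: what gnulib would do on behalf of the application.  */
#ifndef _PRINTF_LARGE
# define _PRINTF_LARGE 1
#endif

/* Step 4: when enabled, the classic name maps to the new function.  */
#if _PRINTF_LARGE
# undef snprintf
# define snprintf snprintfl
#endif
```

With this in place, existing calls such as 'n = snprintf (buf, size, ...)'
compile unchanged, and 'n' can be declared as printf_len_t once the
application is ready for it.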

And similarly for wcswidth, with a new function wclswidth and a macro
_WCSWIDTH_LARGE.

This way, applications could switch from *printf to *printfl at their own
pace, without introducing uncaught overflow bugs at any moment.

Has this already been discussed in the Austin Group, or on the glibc list?

Bruno



