Re: JSON/YAML/TOML/etc. parsing performance

Paul Eggert <address@hidden> schrieb am Di., 3. Okt. 2017 um 22:52 Uhr:

On 10/03/2017 05:26 AM, Philipp Stephani wrote:

> > +#if __has_attribute (warn_unused_result)
> > +# define ATTRIBUTE_WARN_UNUSED_RESULT __attribute__
> ((__warn_unused_result__))
> > +#else
> > +# define ATTRIBUTE_WARN_UNUSED_RESULT
> > +#endif
>
> Hmm... why do we need this attribute? You use it with 2 static
> functions, so this sounds like a left-over from the development stage?
>
> It's not strictly needed (and if you don't like it, I can remove it),
> but it helps catching memory leaks.

I've found __warn_unused_result__ to be more trouble than it's worth in
library functions. Emacs has lib/ignore-value.h in order to work around
__warn_unused_result__ brain damage, for example. For static functions
the problem is less, but still, I mildly favor leaving it out.

OK, I'll remove it.

>
> And btw, how can size be greater than SIZE_MAX in this case? This is
> a valid Lisp object, isn't it? (There are more such tests in the
> patch, e.g. in lisp_to_json, and I think they, too, are redundant.)
>
>
> Depends on the range of ptrdiff_t and size_t. IIUC nothing in the C
> standard guarantees PTRDIFF_MAX <= SIZE_MAX. If we want to guarantee
> that, we should probably add at least a static assertion.

There should be no need for that. No Lisp object can exceed min
(PTRDIFF_MAX, SIZE_MAX) bytes; alloc.c guarantees this, so that Emacs
should work even on oddball platforms where SIZE_MAX < PTRDIFF_MAX, and
there is no need for a runtime check here.

Should I at least add an eassert to document this?

> > + if (buffer_and_size->size > PTRDIFF_MAX)
> > + xsignal1 (Qoverflow_error, build_string ("buffer too large"));
> > + insert (buffer_and_size->buffer, buffer_and_size->size);
>
> I don't think we need this test here, as 'insert' already has the
> equivalent test in one of its subroutines.
>
> It can't, because it takes the byte length as ptrdiff_t. We need to
> check before whether the size is actually in the valid range of ptrdiff_t.

A PTRDIFF_MAX test is needed if the JSON library can return strings
longer than PTRDIFF_MAX. Please just use buffer_overflow () to signal
the error, though.

Done.

> > + case JSON_INTEGER:
> > + {
> > + json_int_t value = json_integer_value (json);
> > + if (FIXNUM_OVERFLOW_P (value))
> > + xsignal1 (Qoverflow_error,
> > + build_string ("JSON integer is too large"));
> > + return make_number (value);
>
> This overflow test is also redundant, as make_number already does it.
>
> It can't, because json_int_t can be larger than EMACS_INT. Also,
> make_number doesn't contain any checks.

You're right that a test is needed. However, elsewhere we just use
xsignal0 (Qoverflow_error) for this sort of thing, and I suggest being
consistent and doing that here as well. Similarly for other calls to
xsignal1 with Qoverflow_error.

Done.

> > + case JSON_STRING:
> > + {
> > + size_t size = json_string_length (json);
> > + if (FIXNUM_OVERFLOW_P (size))
> > + xsignal1 (Qoverflow_error, build_string ("JSON string is
> too long"));
> > + return json_make_string (json_string_value (json), size);
>
> Once again, the overflow test is redundant, as make_specified_string
> (called by json_make_string) already includes an equivalent test.
>
> And once again, we need to check at least whether the size fits into
> ptrdiff_t.
You're right, a test is needed. However, I suggest using string_overflow
() to signal string overflows.

Done. I've attached a new patch (which currently segfaults on decode_coding_gap, but the call to that function doesn't seem to be required anyway).

From:	Philipp Stephani
Subject:	Re: JSON/YAML/TOML/etc. parsing performance
Date:	Sun, 08 Oct 2017 17:58:14 +0000