qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/5] qemu/qarray.h: introduce QArray


From: Christian Schoenebeck
Subject: Re: [PATCH v2 1/5] qemu/qarray.h: introduce QArray
Date: Tue, 28 Sep 2021 18:23:23 +0200

On Dienstag, 28. September 2021 15:04:36 CEST Daniel P. Berrangé wrote:
> On Sun, Aug 22, 2021 at 03:16:46PM +0200, Christian Schoenebeck wrote:
> > Implements deep auto free of arrays while retaining common C-style
> > squared bracket access.
> > 
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> > 
> >  include/qemu/qarray.h | 150 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 150 insertions(+)
> >  create mode 100644 include/qemu/qarray.h
> > 
> > diff --git a/include/qemu/qarray.h b/include/qemu/qarray.h
> > new file mode 100644
> > index 0000000000..9885e5e9ed
> > --- /dev/null
> > +++ b/include/qemu/qarray.h
> > @@ -0,0 +1,150 @@
> > 
> > +#ifndef QEMU_QARRAY_H
> > +#define QEMU_QARRAY_H
> > +
> > +/**
> > + * QArray provides a mechanism to access arrays in common C-style (e.g.
> > by
> > + * square bracket [] operator) in conjunction with reference variables
> > that + * perform deep auto free of the array when leaving the scope of
> > the auto + * reference variable. That means not only is the array itself
> > automatically + * freed, but also memory dynamically allocated by the
> > individual array + * elements.
> > + *
> > + * Example:
> > + *
> > + * Consider the following user struct @c Foo which shall be used as
> > scalar
> > + * (element) type of an array:
> > + * @code
> > + * typedef struct Foo {
> > + *     int i;
> > + *     char *s;
> > + * } Foo;
> > + * @endcode
> > + * and assume it has the following function to free memory allocated by
> > @c Foo + * instances:
> > + * @code
> > + * void free_foo(Foo *foo) {
> > + *     free(foo->s);
> > + * }
> > + * @endcode
> > + * Add the following to a shared header file:
> > + * @code
> > + * DECLARE_QARRAY_TYPE(Foo);
> > + * @endcode
> > + * and the following to a C unit file:
> > + * @code
> > + * DEFINE_QARRAY_TYPE(Foo, free_foo);
> > + * @endcode
> > + * Finally the array may then be used like this:
> > + * @code
> > + * void doSomething(int n) {
> > + *     QArrayRef(Foo) foos = NULL;
> > + *     QARRAY_CREATE(Foo, foos, n);
> > + *     for (size_t i = 0; i < n; ++i) {
> > + *         foos[i].i = i;
> > + *         foos[i].s = calloc(4096, 1);
> > + *         snprintf(foos[i].s, 4096, "foo %d", i);
> > + *     }
> > + * }
> 
> So essentially here the 'foos' variable is still a plain C array
> from POV of the caller ie it is a  'Foo *foos'

Yes, that's the main feature probably: getting rid of any (n times) manually 
written deep-free code branches while at the same time keeping the burden low, 
i.e. requiring to refactor as little existing code as possible, as it is still 
a "normal" C-array from user code perspective, just with meta info prepended 
in front of the C-array that the user code however does not need to care about 
at all:

  Contiguous Memory Space ->
  ----------------------------------------------------------
  | Array-Meta-Info | Scalar 0 | Scalar 1 | ... | Scalar n |
  ----------------------------------------------------------

  ^- QArrayFoo*     ^- Foo*

So switching between the two is just +- const_offset math.

Plus this has the positive side effect that a C-style array code is better to 
the human eye.

For now that "Array-Meta-Info" is really just the array size, in future this 
might be extended to contain a reference count for instance. And as this 
concept is actually generic code (a.k.a template code), you could also extend 
this in a way where you just pull in features that you really need in user 
code. I.e. if you don't need an atomic reference count at one place (and avoid 
the sync overhead), don't enable it for your use case. If you need it 
somewhere else, then enable it there.

> By QARRAY_CREATE() is actually allocating a 'QArrayFoo' which
> is a struct containing a 'size_t len' along with the 'Foo *foos'
> entry.
> 
> This is a clever trick, and it works in the example above,
> because the code already has the length available in a
> convenient way via the 'int n' parameter.
> 
> IMHO this makes the code less useful than it otherwise would
> be in the general case, because there are places where the code
> needs to pass around the array and its assoicated length, and
> so this with this magic hidden length, we're still left to pass
> around the length separately.
> 
> Consider this example:
> 
>   int open_conn(const char *hostname) {
>     SocketAddress *addrs;
>     size_t naddrs;
>     int ret = -1;
>     size_t i;
> 
>     if (resolve_hostname(hostname, &addrs, &naddrs) < 0)
>         return -1;
> 
>     for (i = 0; i < naddrs; i++) {
>         ...try to connect to addrs[i]...
>     }
> 
>     ret = 0
>    cleanup:
>     for (i = 0; i < naddrs; i++) {
>        qapi_free_SocketAddress(addrs[i])
>     }
>     return ret;
>   }
> 
> To simplify this it is deisrable to autofree the 'addrs'
> array.
> 
> If using QArray, it still has to keep passing around the
> 'size_t naddrs' value because QArray hides the length
> field from the code.

Well no, you don't need to pass around anything, as the array length is always 
accessible; it is always just (compile time) constant -wordsize (-padding) 
offset away from the C-array pointer. Maybe the phrasing "private" was a bit 
misleading in the QArray.h comments.

It is correct that my 9p use case so far did not need the array length info by 
means of accessing an API, for that reason I really just ommitted (yet) to add 
a separate patch for that. All it would take was extending QArray.h in a way 
like (roughly):

typedef struct _QArrayGeneric {
    size_t len;
    char first[];
} _QArrayGeneric;

/**
 * Returns the amount of scalar elements in the passed array.
 *
 * @param first - start of array
 */
size_t qarray_len(void* first)
{
    if (!first) {
        return 0;
    }
    _QArrayGeneric *arr = (_QArrayGeneric *) (
        ((char *)first) - offsetof(_QArrayGeneric, first)
    );
    return arr->len;
}

#define QARRAY_LEN(arr) qarray_len(arr)

And as this is generic code for all array scalar types, it would probably be 
partly placed in a separate qarray.c file.

After that change your user example would become:

  for (i = 0; i < QARRAY_LEN(addrs); i++) {
      ...try to connect to addrs[i]...
  }

If you want I can post a v3 with a formal patch (or two) to handle that array 
length API.

> If it instead just exposed the array struct explicitly, it can
> use the normal g_autoptr() declarator, and can also now just
> return the array directly since it is a single pointer
> 
> 
>   int open_conn(const char *hostname) {
>     g_autoptr(SocketAddressArray) addrs = NULL;
>     int ret = -1;
>     size_t i;
> 
>     if (!(addrs = resolve_hostname(hostname)))
>         return -1;
> 
>     for (i = 0; i < addrs.len; i++) {
>         ...try to connect to addrs.data[i]...
>     }
> 
>     ret = 0
>    cleanup:
>     return ret;
>   }
> 
> 
> In terms of the first example, it adds an indirection to access
> the array data, but on the plus side IMHO the code is clearer
> because it uses 'g_autoptr' which is what is more in line with
> what is expected for variables that are automatically freed.
> QArrayRef() as a name doesn't make it clear that the value will
> be freed.
> 
>    void doSomething(int n) {
>        g_autoptr(FooArray) foos = NULL;
>        QARRAY_CREATE(Foo, foos, n);
>        for (size_t i = 0; i < foos.len; ++i) {
>            foos.data[i].i = i;
>            foos.data[i].s = calloc(4096, 1);
>            snprintf(foos.data[i].s, 4096, "foo %d", i);
>        }
>    }

Well, that would destroy the intended major feature "as little refactoring as 
possible". The amount of locations where you define a reference variable is 
usually much smaller than the amount of code locations where you actually 
access arrays.

Personally I would not mix in this case macros of foreign libraries (glib) 
with macros of a local framework (QArray), because if for some reason one of 
the two deviate in future in a certain way, you would need to refactor a whole 
bunch of user code. By just separating those definitions from day one, you can 
avoid such future refactoring work right from the start.

As far as the terminology is concerned: probably a matter of taste. For me a 
"reference" implies (either unique or shared) ownership, a "pointer" IMO 
doesn't. And the usage of QArray alone makes it clear that an array without 
any references gets automatically freed.

> I would also suggest that QARRAY_CREATE doesn't need to
> exist as a macro - callers could just use the allocator
> function directly for clearer code, if it was changed to
> return the ptr rather than use an out parameter:
> 
>    void doSomething(int n) {
>        g_autoptr(FooArray) foos = foo_array_new(n);
>        for (size_t i = 0; i < foos.len; ++i) {
>            foos.data[i].i = i;
>            foos.data[i].s = calloc(4096, 1);
>            snprintf(foos.data[i].s, 4096, "foo %d", i);
>        }
>    }
> 
> For this it needs to pass 2 args into the DECLARE_QARRAY_TYPE
> macro - the struct name and desired method name - basically
> the method name is the struct name in lowercase with underscores.

As you can see with patch 2, one of the movations of making this a macro was 
the intention to increase strictness of type safety, e.g to make things like:

        void *p;
        ...
        QARRAY_CREATE(FooType, p, n);

to raise a compiler error immediately, but that's not all ...

> Overall I think the goal of having an convenient sized array for
> types is good, but I think we should make it look a bit less
> magic. I think we only need the DECLARE_QARRAY_TYPE and
> DEFINE_QARRAY_TYPE macros.

... actually making it appear anything like magic was not my intention. The 
actual main reason for wrapping these things into macros is because that's 
actually the only way to write generic code in C. Especially in larger 
projects like this one I favour clear separation of API ("how to use it") from 
its actual implementation ("how does it do it exactly").

So if you use macros for all those things from the beginning, it is far less 
likely that you will need to refactor a huge amount of user code with future 
changes of this array framework.

> Incidentally, I'd suggest naming to be QARRAY_DECLARE_TYPE
> and QARRAY_DEFINE_TYPE.

Also a matter of taste I guess. The suggested naming DECLARE_QARRAY_TYPE() and 
DEFINE_QARRAY_TYPE() reflect more natural language IMO.

> > + * @endcode
> > + */
> > +
> > +/**
> > + * Declares an array type for the passed @a scalar_type.
> > + *
> > + * This is typically used from a shared header file.
> > + *
> > + * @param scalar_type - type of the individual array elements
> > + */
> > +#define DECLARE_QARRAY_TYPE(scalar_type) \
> > +    typedef struct QArray##scalar_type { \
> > +        size_t len; \
> > +        scalar_type first[]; \
> > +    } QArray##scalar_type; \
> > +    \
> > +    void qarray_create_##scalar_type(scalar_type **auto_var, size_t len);
> > \ +    void qarray_auto_free_##scalar_type(scalar_type **auto_var); \ +
> > +/**
> > + * Defines an array type for the passed @a scalar_type and appropriate
> > + * @a scalar_cleanup_func.
> > + *
> > + * This is typically used from a C unit file.
> > + *
> > + * @param scalar_type - type of the individual array elements
> > + * @param scalar_cleanup_func - appropriate function to free memory
> > dynamically + *                              allocated by individual
> > array elements before + */
> > +#define DEFINE_QARRAY_TYPE(scalar_type, scalar_cleanup_func) \
> > +    void qarray_create_##scalar_type(scalar_type **auto_var, size_t len)
> > \
> > +    { \
> > +        qarray_auto_free_##scalar_type(auto_var); \
> > +        QArray##scalar_type *arr = g_malloc0(sizeof(QArray##scalar_type)
> > + \ +            len * sizeof(scalar_type)); \
> > +        arr->len = len; \
> > +        *auto_var = &arr->first[0]; \
> > +    } \
> > +    \
> > +    void qarray_auto_free_##scalar_type(scalar_type **auto_var) \
> > +    { \
> > +        scalar_type *first = (*auto_var); \
> > +        if (!first) { \
> > +            return; \
> > +        } \
> > +        QArray##scalar_type *arr = (QArray##scalar_type *) ( \
> > +            ((char *)first) - offsetof(QArray##scalar_type, first) \
> > +        ); \
> > +        for (size_t i = 0; i < arr->len; ++i) { \
> > +            scalar_cleanup_func(&arr->first[i]); \
> > +        } \
> > +        g_free(arr); \
> > +    } \
> > +
> > +/**
> > + * Used to declare a reference variable (unique pointer) for an array.
> > After + * leaving the scope of the reference variable, the associated
> > array is + * automatically freed.
> > + *
> > + * @param scalar_type - type of the individual array elements
> > + */
> > +#define QArrayRef(scalar_type) \
> > +    __attribute((__cleanup__(qarray_auto_free_##scalar_type)))
> > scalar_type* +
> > +/**
> > + * Allocates a new array of passed @a scalar_type with @a len number of
> > array + * elements and assigns the created array to the reference
> > variable + * @a auto_var.
> > + *
> > + * @param scalar_type - type of the individual array elements
> > + * @param auto_var - destination reference variable
> > + * @param len - amount of array elements to be allocated immediately
> > + */
> > +#define QARRAY_CREATE(scalar_type, auto_var, len) \
> > +    qarray_create_##scalar_type((&auto_var), len)
> > +
> > +#endif /* QEMU_QARRAY_H */
> 
> Regards,
> Daniel





reply via email to

[Prev in Thread] Current Thread [Next in Thread]