qemu-devel

Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface


From: Andrea Bolognani
Subject: Re: [RFC PATCH v1 0/8] qapi: add generator for Golang interface
Date: Mon, 2 May 2022 10:01:41 -0400

On Mon, May 02, 2022 at 01:46:23PM +0200, Markus Armbruster wrote:
> Andrea Bolognani <abologna@redhat.com> writes:
> >> > The wire protocol would still retain the unappealing name, but at
> >> > least client libraries could hide the ugliness from users.
> >>
> >> At the price of mild inconsistency between the library interface and
> >> QMP.
> >
> > That's fine, and in fact it already happens all the time when QAPI
> > names (log-append) are translated to C identifiers (log_append).
>
> There's a difference between trivial translations like "replace '-' by
> '_'" and arbitrary replacement like the one for enumeration constants
> involving 'prefix'.

Fair enough.

I still feel that 1) users of a language SDK will ideally not need to
look at the QAPI schema or wire chatter too often and 2) even when
that ends up being necessary, figuring out that LogAppend and
logappend are the same thing is not going to be an unreasonable
hurdle, especially when the status quo would be to work with
Logappend instead.

> > The point is that, if we want to provide a language interface that
> > feels natural, we need a way to mark parts of a QAPI symbol's name in
> > a way that makes it possible for the generator to know they're
> > acronyms and treat them in an appropriate, language-specific manner.
>
> It's not just acronyms.  Consider IAmALittleTeapot.  If you can assume
> that only beginning of words are capitalized, even for acronyms, you can
> split this into words without trouble.  You can't recover correct case,
> though: "i am a little teapot" is wrong.

Is there any scenario in which we would care, though? We're in the
business of translating identifiers from one machine representation
to another, so once it has been split up correctly into the words
that compose it (which in your example above it has) then we don't
really care about anything else unless acronyms are involved.

In other words, we can obtain the list of words "i am a little
teapot" programmatically both from IAmALittleTeapot and
i-am-a-little-teapot, and in both cases we can then generate
IAmALittleTeapot or I_AM_A_LITTLE_TEAPOT or i_am_a_little_teapot or
whatever is appropriate for the context and target language, but the
fact that in a proper English sentence "I" would have to be
capitalized doesn't really enter the picture.
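
To make the above concrete, here is a minimal Python sketch, not taken
from the series: the function names are made up for illustration, and
the splitting rule is the naive "new word at every capital letter" one.

```python
import re

def split_words(name):
    """Split a CamelCase or dash-separated name into lowercase words."""
    if "-" in name:
        return name.lower().split("-")
    # Naive rule: a new word starts at every capital letter.
    return [w.lower() for w in re.findall(r"[A-Z][^A-Z]*|[^A-Z]+", name)]

def camel(words):
    return "".join(w.capitalize() for w in words)

def upper_snake(words):
    return "_".join(w.upper() for w in words)

def lower_snake(words):
    return "_".join(words)

# Both spellings yield the same word list...
words = split_words("IAmALittleTeapot")
assert words == split_words("i-am-a-little-teapot")

# ...from which any target convention can be produced.
print(camel(words))        # IAmALittleTeapot
print(upper_snake(words))  # I_AM_A_LITTLE_TEAPOT
print(lower_snake(words))  # i_am_a_little_teapot
```

Nothing here needs to know that "I" is capitalized in English prose.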

> "Split before capital letter" falls apart when you have characters that
> cannot be capitalized: Point3d.
>
> Camel case is hopeless.

I would argue that it works quite well for most scenarios, but there
are some corner cases where it's clearly not good enough. If we can
define a way to clue in the generator about "Point3d" having to be
interpreted as "point 3d" and "VNCProps" as "vnc props", then we are
golden. That wouldn't be necessary for simple cases that are already
handled correctly.
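
A rough Python sketch of what I mean, assuming a naive "split before
capital letter" rule plus a small table of hints. The OVERRIDES table
and the function names are hypothetical, they don't exist anywhere in
the QAPI generator today.

```python
import re

def naive_split(name):
    """Split before every capital letter; lowercase the result."""
    return [w.lower() for w in re.findall(r"[A-Z][^A-Z]*|[^A-Z]+", name)]

# The naive rule is fine for the simple case...
print(naive_split("LogAppend"))  # ['log', 'append']
# ...but not for digits or acronyms.
print(naive_split("Point3d"))    # ['point3d']
print(naive_split("VNCProps"))   # ['v', 'n', 'c', 'props']

# Hypothetical hints, only needed for the names the naive rule gets wrong.
OVERRIDES = {
    "Point3d": ["point", "3d"],
    "VNCProps": ["vnc", "props"],
}

def split_words(name):
    """Prefer an explicit hint, fall back to the naive rule."""
    return OVERRIDES.get(name) or naive_split(name)

print(split_words("Point3d"))    # ['point', '3d']
print(split_words("VNCProps"))   # ['vnc', 'props']
print(split_words("LogAppend"))  # ['log', 'append']
```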

A more radical idea would be to start using dash-notation for types
too. That'd remove the word splitting issue altogether, at the cost
of the schema being (possibly) harder to read and further removed
from the generated code.

You'd still only be able to generate VncProps from vnc-props though.

> > The obvious way to implement this would be with an annotation along
> > the lines of the one I proposed. Other ideas?
>
> I'm afraid having the schema spell out names in multiple naming
> conventions could be onerous.  How many names will need it?

I don't have hard data on this. I could try extracting it, but that
might end up being a bigger job than I had anticipated.

My guess is that the number of cases where the naive algorithm can't
split words correctly is relatively small compared to the size of the
entire QAPI schema. Fair warning: I have made incorrect guesses in
the past ;)

> Times how many naming conventions?

Yeah, I don't think requiring all possible permutations to be spelled
out in the schema is the way to go. That's exactly why my proposal
was to offer a way to inject the semantic information that the parser
can't figure out itself.

Once you have a way to inform the generator that "VNCProps" is made
of the two words "vnc" and "props", and that "vnc" is an acronym,
then it can generate an identifier appropriate for the target
language without having to spell out anywhere that such an identifier
would be VNCProps for Go and VncProps for Rust.

By the way, while looking around I realized that we also have to take
into account things like D-Bus: the QAPI type ChardevDBus, for
example, would probably translate verbatim to Go but have to be
changed to ChardevDbus for Rust. Fun :)

Revised proposal for the annotation:

  ns:word-WORD-WoRD-123Word

Words are always separated by dashes; "regular" words are entirely
lowercase, while the presence of even a single uppercase letter in a
word denotes the fact that its case should be preserved when the
naming conventions of the target language allow that.
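
Here is a minimal Python sketch of how a generator might consume such
an annotation. Everything in it is an assumption made for the sake of
illustration: the function names are invented, the "ns:" prefix is
simply discarded, and the Go and Rust rules are reduced to "keep
case-preserved words verbatim" and "capitalize only the first letter of
each word" respectively.

```python
def parse_annotation(annotation):
    """Return (word, preserve_case) pairs from a dash-separated annotation."""
    # Drop an optional "ns:" style prefix; this sketch does not interpret it.
    _, _, body = annotation.rpartition(":")
    # A single uppercase letter marks a word whose case should be preserved.
    return [(w, any(c.isupper() for c in w)) for w in body.split("-")]

def go_name(annotation):
    # Assumed Go rule: case-preserved words are emitted verbatim.
    return "".join(w if keep else w.capitalize()
                   for w, keep in parse_annotation(annotation))

def rust_name(annotation):
    # Assumed Rust rule: only the first letter of each word is uppercase.
    return "".join(w.capitalize() for w, _ in parse_annotation(annotation))

def c_name(annotation):
    # Lowercase words joined by underscores.
    return "_".join(w.lower() for w, _ in parse_annotation(annotation))

print(go_name("VNC-props"), rust_name("VNC-props"), c_name("VNC-props"))
# VNCProps VncProps vnc_props
print(go_name("chardev-DBus"), rust_name("chardev-DBus"))
# ChardevDBus ChardevDbus
```

The schema only records where the words are and which ones are
case-sensitive; each language backend decides what to do with that.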

> Another issue: the fancier the translation from schema name to
> language-specific name gets, the harder it becomes to find one from the
> other.

That's true, but at least to me the trade-off feels reasonable.

-- 
Andrea Bolognani / Red Hat / Virtualization



