[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v11 2/7] QIOChannelSocket: Implement io_writev zero copy flag
From: |
Leonardo Bras Soares Passos |
Subject: |
Re: [PATCH v11 2/7] QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX |
Date: |
Thu, 5 May 2022 12:42:47 -0300 |
On Thu, May 5, 2022 at 5:05 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Wed, May 04, 2022 at 04:18:31PM -0300, Leonardo Bras wrote:
> > For CONFIG_LINUX, implement the new zero copy flag and the optional callback
> > io_flush on QIOChannelSocket, but enables it only when MSG_ZEROCOPY
> > feature is available in the host kernel, which is checked on
> > qio_channel_socket_connect_sync()
> >
> > qio_channel_socket_flush() was implemented by counting how many times
> > sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the
> > socket's error queue, in order to find how many of them finished sending.
> > Flush will loop until those counters are the same, or until some error
> > occurs.
> >
> > Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY:
> > 1: Buffer
> > - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid
> > copying,
> > some caution is necessary to avoid overwriting any buffer before it's sent.
> > If something like this happen, a newer version of the buffer may be sent
> > instead.
> > - If this is a problem, it's recommended to call qio_channel_flush() before
> > freeing
> > or re-using the buffer.
> >
> > 2: Locked memory
> > - When using MSG_ZERCOCOPY, the buffer memory will be locked after queued,
> > and
> > unlocked after it's sent.
> > - Depending on the size of each buffer, and how often it's sent, it may
> > require
> > a larger amount of locked memory than usually available to non-root user.
> > - If the required amount of locked memory is not available, writev_zero_copy
> > will return an error, which can abort an operation like migration,
> > - Because of this, when an user code wants to add zero copy as a feature, it
> > requires a mechanism to disable it, so it can still be accessible to less
> > privileged users.
> >
> > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> > Reviewed-by: Juan Quintela <quintela@redhat.com>
> > ---
> > include/io/channel-socket.h | 2 +
> > io/channel-socket.c | 120 ++++++++++++++++++++++++++++++++++--
> > 2 files changed, 118 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> > index e747e63514..513c428fe4 100644
> > --- a/include/io/channel-socket.h
> > +++ b/include/io/channel-socket.h
> > @@ -47,6 +47,8 @@ struct QIOChannelSocket {
> > socklen_t localAddrLen;
> > struct sockaddr_storage remoteAddr;
> > socklen_t remoteAddrLen;
> > + ssize_t zero_copy_queued;
> > + ssize_t zero_copy_sent;
> > };
> >
> >
> > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > index 696a04dc9c..ae756ce166 100644
> > --- a/io/channel-socket.c
> > +++ b/io/channel-socket.c
> > @@ -25,9 +25,25 @@
> > #include "io/channel-watch.h"
> > #include "trace.h"
> > #include "qapi/clone-visitor.h"
> > +#ifdef CONFIG_LINUX
> > +#include <linux/errqueue.h>
> > +#include <sys/socket.h>
> > +#endif
> >
> > #define SOCKET_MAX_FDS 16
> >
> > +/*
> > + * Zero-copy defines bellow are included to avoid breaking builds on
> > systems
> > + * that don't support MSG_ZEROCOPY, while keeping the functions more
> > readable
> > + * (without a lot of ifdefs).
> > + */
> > +#ifndef MSG_ZEROCOPY
> > +#define MSG_ZEROCOPY 0x4000000
> > +#endif
> > +#ifndef SO_ZEROCOPY
> > +#define SO_ZEROCOPY 60
> > +#endif
>
> Please put these behind CONFIG_LINUX to make it clear to readers that
> this is entirely Linux specific
>
>
> > +
> > SocketAddress *
> > qio_channel_socket_get_local_address(QIOChannelSocket *ioc,
> > Error **errp)
> > @@ -54,6 +70,8 @@ qio_channel_socket_new(void)
> >
> > sioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET));
> > sioc->fd = -1;
> > + sioc->zero_copy_queued = 0;
> > + sioc->zero_copy_sent = 0;
> >
> > ioc = QIO_CHANNEL(sioc);
> > qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
> > @@ -153,6 +171,16 @@ int qio_channel_socket_connect_sync(QIOChannelSocket
> > *ioc,
> > return -1;
> > }
> >
> > +#ifdef CONFIG_LINUX
> > + int ret, v = 1;
> > + ret = setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
> > + if (ret == 0) {
> > + /* Zero copy available on host */
> > + qio_channel_set_feature(QIO_CHANNEL(ioc),
> > + QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY);
> > + }
> > +#endif
> > +
> > return 0;
> > }
> >
> > @@ -533,6 +561,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel
> > *ioc,
> > char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)];
> > size_t fdsize = sizeof(int) * nfds;
> > struct cmsghdr *cmsg;
> > + int sflags = 0;
> >
> > memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS));
> >
> > @@ -557,15 +586,27 @@ static ssize_t qio_channel_socket_writev(QIOChannel
> > *ioc,
> > memcpy(CMSG_DATA(cmsg), fds, fdsize);
> > }
> >
> > + if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) {
> > + sflags = MSG_ZEROCOPY;
> > + }
>
> Also should be behind CONFIG_LINUX
Hello Daniel,
But what if this gets compiled in a Linux system without MSG_ZEROCOPY support?
As qapi will have zero-copy-send as an option we could have this scenario:
- User request migration using zero-copy-send
- multifd_save_setup() will set write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY
- In qio_channel_socket_connect_sync(): setsockopt() part will be
compiled-out, so no error here
- above part in qio_channel_socket_writev() will be commented-out,
which means write_flags will be ignored
- sflags will not contain MSG_ZEROCOPY, so sendmsg() will use copy-mode
- migration will succeed
In the above case, the user has all the reason to think migration is
using MSG_ZEROCOPY, but in fact it's quietly falling back to
copy-mode.
That's why I suggested creating a 'global' config usiing SO_ZEROCOPY &
MSG_ZEROCOPY & CONFIG_LINUX so we can use in qapi and have no chance
of even offering zero-copy-send if we don't have it.
Another local option is to do implement your suggestions, and also
change qio_channel_socket_connect_sync() so it returns an error if
MSG_ZEROCOPY && SO_ZEROCOPY is not present, such as:
+#ifdef CONFIG_LINUX
+#if defined(MSG_ZEROCOPY) && defined(SO_ZEROCOPY)
+ int ret, v = 1;
+ ret = setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &v, sizeof(v));
+ if (ret == 0) {
+ /* Zero copy available on host */
+ qio_channel_set_feature(QIO_CHANNEL(ioc),
+ QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY);
+ }
+#else
+ error_setg_errno(errp, errno,"MSG_ZEROCOPY not available");
+ return -1;
+#endif
+#endif
What do you think?
Best regards,
Leo
>
> > +
> > retry:
> > - ret = sendmsg(sioc->fd, &msg, 0);
> > + ret = sendmsg(sioc->fd, &msg, sflags);
> > if (ret <= 0) {
> > - if (errno == EAGAIN) {
> > + switch (errno) {
> > + case EAGAIN:
> > return QIO_CHANNEL_ERR_BLOCK;
> > - }
> > - if (errno == EINTR) {
> > + case EINTR:
> > goto retry;
> > + case ENOBUFS:
> > + if (sflags & MSG_ZEROCOPY) {
> > + error_setg_errno(errp, errno,
> > + "Process can't lock enough memory for
> > using MSG_ZEROCOPY");
> > + return -1;
> > + }
> > + break;
>
> And this ENOBUGS case behind CONFIG_LINUX
>
[...]
[PATCH v11 3/7] migration: Add zero-copy-send parameter for QMP/HMP for Linux, Leonardo Bras, 2022/05/04
[PATCH v11 4/7] migration: Add migrate_use_tls() helper, Leonardo Bras, 2022/05/04
[PATCH v11 5/7] multifd: multifd_send_sync_main now returns negative on error, Leonardo Bras, 2022/05/04
[PATCH v11 6/7] multifd: Send header packet without flags if zero-copy-send is enabled, Leonardo Bras, 2022/05/04
[PATCH v11 7/7] multifd: Implement zero copy write in multifd migration (multifd-zero-copy), Leonardo Bras, 2022/05/04