[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stre
From: |
Fabiano Rosas |
Subject: |
[RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability |
Date: |
Thu, 30 Mar 2023 15:03:20 -0300 |
From: Nikolay Borisov <nborisov@suse.com>
Implement 'fixed-ram' feature. The core of the feature is to ensure that
each ram page of the migration stream has a specific offset in the
resulting migration stream. The reason why we'd want such behavior are
two fold:
- When doing a 'fixed-ram' migration the resulting file will have a
bounded size, since pages which are dirtied multiple times will
always go to a fixed location in the file, rather than constantly
being added to a sequential stream. This eliminates cases where a vm
with, say, 1G of ram can result in a migration file that's 10s of
GBs, provided that the workload constantly redirties memory.
- It paves the way to implement DIO-enabled save/restore of the
migration stream as the pages are ensured to be written at aligned
offsets.
The feature requires changing the stream format. First, a bitmap is
introduced which tracks which pages have been written (i.e are
dirtied) during migration and subsequently it's being written in the
resulting file, again at a fixed location for every ramblock. Zero
pages are ignored as they'd be zero in the destination migration as
well. With the changed format data would look like the following:
|name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
* pc - refers to the page_size/mr->addr members, so newly added members
begin from "bitmap_size".
This layout is initialized during ram_save_setup so instead of having a
sequential stream of pages that follow the ramblock headers the dirty
pages for a ramblock follow its header. Since all pages have a fixed
location RAM_SAVE_FLAG_EOS is no longer generated on every migration
iteration but there is effectively a single RAM_SAVE_FLAG_EOS right at
the end.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
docs/devel/migration.rst | 36 +++++++++++++++
include/exec/ramblock.h | 8 ++++
migration/migration.c | 51 +++++++++++++++++++++-
migration/migration.h | 1 +
migration/ram.c | 94 +++++++++++++++++++++++++++++++++-------
migration/savevm.c | 1 +
qapi/migration.json | 2 +-
7 files changed, 176 insertions(+), 17 deletions(-)
diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 1080211f8e..84112d7f3f 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -568,6 +568,42 @@ Others (especially either older devices or system devices
which for
some reason don't have a bus concept) make use of the ``instance id``
for otherwise identically named devices.
+Fixed-ram format
+----------------
+
+When the ``fixed-ram`` capability is enabled, a slightly different
+stream format is used for the RAM section. Instead of having a
+sequential stream of pages that follow the RAMBlock headers, the dirty
+pages for a RAMBlock follow its header. This ensures that each RAM
+page has a fixed offset in the resulting migration stream.
+
+ - RAMBlock 1
+
+ - ID string length
+ - ID string
+ - Used size
+ - Shadow bitmap size
+ - Pages offset in migration stream*
+
+ - Shadow bitmap
+ - Sequence of pages for RAMBlock 1 (* offset points here)
+
+ - RAMBlock 2
+
+ - ID string length
+ - ID string
+ - Used size
+ - Shadow bitmap size
+ - Pages offset in migration stream*
+
+ - Shadow bitmap
+ - Sequence of pages for RAMBlock 2 (* offset points here)
+
+The ``fixed-ram`` capaility can be enabled in both source and
+destination with:
+
+ ``migrate_set_capability fixed-ram on``
+
Return path
-----------
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index adc03df59c..4360c772c2 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -43,6 +43,14 @@ struct RAMBlock {
size_t page_size;
/* dirty bitmap used during migration */
unsigned long *bmap;
+ /* shadow dirty bitmap used when migrating to a file */
+ unsigned long *shadow_bmap;
+ /*
+ * offset in the file pages belonging to this ramblock are saved,
+ * used only during migration to a file.
+ */
+ off_t bitmap_offset;
+ uint64_t pages_offset;
/* bitmap of already received pages in postcopy */
unsigned long *receivedmap;
diff --git a/migration/migration.c b/migration/migration.c
index 177fb0de0f..29630523e2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -168,7 +168,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
MIGRATION_CAPABILITY_XBZRLE,
MIGRATION_CAPABILITY_X_COLO,
MIGRATION_CAPABILITY_VALIDATE_UUID,
- MIGRATION_CAPABILITY_ZERO_COPY_SEND);
+ MIGRATION_CAPABILITY_ZERO_COPY_SEND,
+ MIGRATION_CAPABILITY_FIXED_RAM);
/* When we add fault tolerance, we could have several
migrations at once. For now we don't need to add
@@ -1341,6 +1342,28 @@ static bool migrate_caps_check(bool *cap_list,
}
#endif
+ if (cap_list[MIGRATION_CAPABILITY_FIXED_RAM]) {
+ if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+ error_setg(errp, "Directly mapped memory incompatible with
multifd");
+ return false;
+ }
+
+ if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
+ error_setg(errp, "Directly mapped memory incompatible with
xbzrle");
+ return false;
+ }
+
+ if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+ error_setg(errp, "Directly mapped memory incompatible with
compression");
+ return false;
+ }
+
+ if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+ error_setg(errp, "Directly mapped memory incompatible with
postcopy ram");
+ return false;
+ }
+ }
+
if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
/* This check is reasonably expensive, so only when it's being
* set the first time, also it's only the destination that needs
@@ -2736,6 +2759,11 @@ MultiFDCompression migrate_multifd_compression(void)
return s->parameters.multifd_compression;
}
+int migrate_fixed_ram(void)
+{
+ return
migrate_get_current()->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM];
+}
+
int migrate_multifd_zlib_level(void)
{
MigrationState *s;
@@ -4324,6 +4352,20 @@ fail:
return NULL;
}
+static int migrate_check_fixed_ram(MigrationState *s, Error **errp)
+{
+ if (!s->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM]) {
+ return 0;
+ }
+
+ if (!qemu_file_is_seekable(s->to_dst_file)) {
+ error_setg(errp, "Directly mapped memory requires a seekable
transport");
+ return -1;
+ }
+
+ return 0;
+}
+
void migrate_fd_connect(MigrationState *s, Error *error_in)
{
Error *local_err = NULL;
@@ -4390,6 +4432,12 @@ void migrate_fd_connect(MigrationState *s, Error
*error_in)
}
}
+ if (migrate_check_fixed_ram(s, &local_err) < 0) {
+ migrate_fd_cleanup(s);
+ migrate_fd_error(s, local_err);
+ return;
+ }
+
if (resume) {
/* Wakeup the main migration thread to do the recovery */
migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
@@ -4519,6 +4567,7 @@ static Property migration_properties[] = {
DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
/* Migration capabilities */
+ DEFINE_PROP_MIG_CAP("x-fixed-ram", MIGRATION_CAPABILITY_FIXED_RAM),
DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
DEFINE_PROP_MIG_CAP("x-rdma-pin-all", MIGRATION_CAPABILITY_RDMA_PIN_ALL),
DEFINE_PROP_MIG_CAP("x-auto-converge", MIGRATION_CAPABILITY_AUTO_CONVERGE),
diff --git a/migration/migration.h b/migration/migration.h
index 2da2f8a164..8cf3caecfe 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -416,6 +416,7 @@ bool migrate_zero_blocks(void);
bool migrate_dirty_bitmaps(void);
bool migrate_ignore_shared(void);
bool migrate_validate_uuid(void);
+int migrate_fixed_ram(void);
bool migrate_auto_converge(void);
bool migrate_use_multifd(void);
diff --git a/migration/ram.c b/migration/ram.c
index 96e8a19a58..56f0f782c8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1310,9 +1310,14 @@ static int save_zero_page_to_file(PageSearchStatus *pss,
int len = 0;
if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
- len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
- qemu_put_byte(file, 0);
- len += 1;
+ if (migrate_fixed_ram()) {
+ /* for zero pages we don't need to do anything */
+ len = 1;
+ } else {
+ len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
+ qemu_put_byte(file, 0);
+ len += 1;
+ }
ram_release_page(block->idstr, offset);
}
return len;
@@ -1394,14 +1399,20 @@ static int save_normal_page(PageSearchStatus *pss,
RAMBlock *block,
{
QEMUFile *file = pss->pss_channel;
- ram_transferred_add(save_page_header(pss, block,
- offset | RAM_SAVE_FLAG_PAGE));
- if (async) {
- qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
- migrate_release_ram() &&
- migration_in_postcopy());
+ if (migrate_fixed_ram()) {
+ qemu_put_buffer_at(file, buf, TARGET_PAGE_SIZE,
+ block->pages_offset + offset);
+ set_bit(offset >> TARGET_PAGE_BITS, block->shadow_bmap);
} else {
- qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+ ram_transferred_add(save_page_header(pss, block,
+ offset | RAM_SAVE_FLAG_PAGE));
+ if (async) {
+ qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
+ migrate_release_ram() &&
+ migration_in_postcopy());
+ } else {
+ qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+ }
}
ram_transferred_add(TARGET_PAGE_SIZE);
stat64_add(&ram_atomic_counters.normal, 1);
@@ -2731,6 +2742,8 @@ static void ram_save_cleanup(void *opaque)
block->clear_bmap = NULL;
g_free(block->bmap);
block->bmap = NULL;
+ g_free(block->shadow_bmap);
+ block->shadow_bmap = NULL;
}
xbzrle_cleanup();
@@ -3098,6 +3111,7 @@ static void ram_list_init_bitmaps(void)
*/
block->bmap = bitmap_new(pages);
bitmap_set(block->bmap, 0, pages);
+ block->shadow_bmap = bitmap_new(block->used_length >>
TARGET_PAGE_BITS);
block->clear_bmap_shift = shift;
block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
}
@@ -3287,6 +3301,33 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
if (migrate_ignore_shared()) {
qemu_put_be64(f, block->mr->addr);
}
+
+ if (migrate_fixed_ram()) {
+ long num_pages = block->used_length >> TARGET_PAGE_BITS;
+ long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned
long);
+
+ /* Needed for external programs (think analyze-migration.py) */
+ qemu_put_be32(f, bitmap_size);
+
+ /*
+ * The bitmap starts after pages_offset, so add 8 to
+ * account for the pages_offset size.
+ */
+ block->bitmap_offset = qemu_get_offset(f) + 8;
+
+ /*
+ * Make pages_offset aligned to 1 MiB to account for
+ * migration file movement between filesystems with
+ * possibly different alignment restrictions when
+ * using O_DIRECT.
+ */
+ block->pages_offset = ROUND_UP(block->bitmap_offset +
+ bitmap_size, 0x100000);
+ qemu_put_be64(f, block->pages_offset);
+
+ /* Now prepare offset for next ramblock */
+ qemu_set_offset(f, block->pages_offset + block->used_length,
SEEK_SET);
+ }
}
}
@@ -3306,6 +3347,18 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
return 0;
}
+static void ram_save_shadow_bmap(QEMUFile *f)
+{
+ RAMBlock *block;
+
+ RAMBLOCK_FOREACH_MIGRATABLE(block) {
+ long num_pages = block->used_length >> TARGET_PAGE_BITS;
+ long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+ qemu_put_buffer_at(f, (uint8_t *)block->shadow_bmap, bitmap_size,
+ block->bitmap_offset);
+ }
+}
+
/**
* ram_save_iterate: iterative stage for migration
*
@@ -3413,9 +3466,15 @@ out:
return ret;
}
- qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
- qemu_fflush(f);
- ram_transferred_add(8);
+ /*
+ * For fixed ram we don't want to pollute the migration stream with
+ * EOS flags.
+ */
+ if (!migrate_fixed_ram()) {
+ qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+ qemu_fflush(f);
+ ram_transferred_add(8);
+ }
ret = qemu_file_get_error(f);
}
@@ -3461,6 +3520,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
pages = ram_find_and_save_block(rs);
/* no more blocks to sent */
if (pages == 0) {
+ if (migrate_fixed_ram()) {
+ ram_save_shadow_bmap(f);
+ }
break;
}
if (pages < 0) {
@@ -3483,8 +3545,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
return ret;
}
- qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
- qemu_fflush(f);
+ if (!migrate_fixed_ram()) {
+ qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+ qemu_fflush(f);
+ }
return 0;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 92102c1fe5..1f1bc19224 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -241,6 +241,7 @@ static bool should_validate_capability(int capability)
/* Validate only new capabilities to keep compatibility. */
switch (capability) {
case MIGRATION_CAPABILITY_X_IGNORE_SHARED:
+ case MIGRATION_CAPABILITY_FIXED_RAM:
return true;
default:
return false;
diff --git a/qapi/migration.json b/qapi/migration.json
index c84fa10e86..22eea58ce3 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -485,7 +485,7 @@
##
{ 'enum': 'MigrationCapability',
'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
- 'compress', 'events', 'postcopy-ram',
+ 'compress', 'events', 'postcopy-ram', 'fixed-ram',
{ 'name': 'x-colo', 'features': [ 'unstable' ] },
'release-ram',
'block', 'return-path', 'pause-before-switchover', 'multifd',
--
2.35.3
- [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels, Fabiano Rosas, 2023/03/30
- [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability,
Fabiano Rosas <=
- Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability, Fabiano Rosas, 2023/03/31
Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability, Markus Armbruster, 2023/03/31
[RFC PATCH v1 11/26] migration: Refactor precopy ram loading code, Fabiano Rosas, 2023/03/30
[RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore, Fabiano Rosas, 2023/03/30
[RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration, Fabiano Rosas, 2023/03/30