From: Alexandre arents
Subject: [Bug 1912224] Re: qemu may freeze during drive-mirroring on fragmented FS
Date: Wed, 10 Feb 2021 13:36:48 -0000
Some more data for qcow2:
The qcow2 format itself is not impacted by the issue because it
does not do SEEK_DATA/SEEK_HOLE at all; as you said, it knows the
allocation status from its own metadata.
But qcow2 instances with a RAW backing file do.
So qcow2 instances also benefit from the patch
if they have a big RAW backing file (or a smaller but fragmented one).
The OpenStack default is a RAW backing file for the qcow2
instance images_type.
I ran some other tests on a local mirror:
* the 57 GiB disk is on an SSD, the mirror file is on an NVMe device
* the disk is a fragmented file
  (average size per extent 2125 KB, nb_extent: 28495, ext4)
* both the qcow2 and raw tests are run on the same fragmented file,
  only the content changes (format/data).
RESULTS
master____ RAW_________ 203s 291MB/s qemu freeze
master-fix RAW_________ 113s 523MB/s qemu stable (+56% perf)
master____ QCOW2_______ 115s 505MB/s qemu stable
master-fix QCOW2_______ 116s 505MB/s qemu stable
master____ QCOW2-SML_BF 113s 523MB/s qemu stable (1)
master-fix QCOW2-SML_BF 113s 523MB/s qemu stable (1)
master____ QCOW2-BIG_BF 201s 294MB/s qemu freeze (2)
master-fix QCOW2-BIG_BF 112s 523MB/s qemu stable (2) (+56% perf)
(1) qcow2 disk with a small RAW backing file (1.5GB);
    most of the data is in the qcow2 format
(2) qcow2 disk with all data in the RAW backing file (57GiB)
Here are some stap metrics collected during the tests
(see the script at the end):
find_allocation: total count of qemu find_allocation calls
iomap_seek_hole: count of SEEK_HOLE calls (fs/iomap.c)
iomap_apply: count of file extents mapped through iomap (fs/iomap.c)
iomap_seek_hole loops, calling iomap_apply on each file extent until
it finds a hole or EOF, so a single call triggers from 1 up to 28495
iomap_apply calls in the worst case on the test file.
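To make the SEEK_DATA/SEEK_HOLE probing and the idea of a "hole cache" concrete, here is a minimal user-space sketch in Python. It is not the QEMU patch: the class name DataRegionCache and the demo() helper are hypothetical, and the caching policy (remember only the last data region found) is an assumption made for illustration. It relies on os.lseek with os.SEEK_DATA and os.SEEK_HOLE, which are available on Linux.

```python
import errno
import os
import tempfile


class DataRegionCache:
    """Remember the last data region [start, end) found with lseek()."""

    def __init__(self, fd):
        self.fd = fd
        self.region = None        # (start, end) of the cached data region
        self.seek_hole_calls = 0  # SEEK_HOLE calls that reached the kernel

    def data_end(self, offset):
        """Return the end of the data region containing offset,
        or None if offset lies in a hole or at/after EOF."""
        if self.region and self.region[0] <= offset < self.region[1]:
            return self.region[1]  # cache hit: no syscall needed
        try:
            data = os.lseek(self.fd, offset, os.SEEK_DATA)
        except OSError as ex:
            if ex.errno == errno.ENXIO:  # no data at or after offset
                return None
            raise
        if data != offset:
            return None  # offset is inside a hole
        # This is the expensive call on a fragmented file: the kernel
        # walks the extents from offset until it finds a hole or EOF.
        self.seek_hole_calls += 1
        hole = os.lseek(self.fd, offset, os.SEEK_HOLE)
        self.region = (offset, hole)
        return hole


def demo(size_mib=4):
    """Query a fully written temp file once per MiB.

    Returns (list of region ends, number of SEEK_HOLE calls)."""
    mib = 1024 * 1024
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        for _ in range(size_mib):
            os.write(fd, b"\xaa" * mib)  # non-zero data, no holes
        cache = DataRegionCache(fd)
        ends = [cache.data_end(i * mib) for i in range(size_mib)]
        return ends, cache.seek_hole_calls


if __name__ == "__main__":
    print(demo())
```

On a file without holes, every query after the first falls inside the cached region, so SEEK_HOLE is issued only once however many offsets are probed.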
master____ RAW_________ 203s 291MB/s qemu freeze
find_allocation: 59139
iomap_seek_hole: 59139 (each find_allocation call does a SEEK_HOLE)
iomap_apply: 843330989 (this breaks qemu)
master-fix RAW_________ 113s 523MB/s qemu stable (+56% perf)
find_allocation: 59167
iomap_seek_hole: 4 (hole cache hit 99.993%)
iomap_apply: 113286 (avg iomap_apply call per SEEK_HOLE: 28321)
master____ QCOW2_______ 115s 505MB/s qemu stable
find_allocation: 0
iomap_seek_hole: 0
iomap_apply: 0
master-fix QCOW2_______ 116s 505MB/s qemu stable
find_allocation: 0
iomap_seek_hole: 0
iomap_apply: 0
master____ QCOW2-SML_BF 113s 523MB/s qemu stable (1)
find_allocation: 1418
iomap_seek_hole: 1297
iomap_apply: 7297
master-fix QCOW2-SML_BF 113s 523MB/s qemu stable (1)
find_allocation: 1418
iomap_seek_hole: 145 (hole cache hit 89.8%)
iomap_apply: 794
master____ QCOW2-BIG_BF 201s 294MB/s qemu freeze (2)
find_allocation: 59133
iomap_seek_hole: 59133
iomap_apply: 843172534
master-fix QCOW2-BIG_BF 112s 523MB/s qemu stable (2) (+56% perf)
find_allocation: 59130
iomap_seek_hole: 1 (hole cache hit 99.998%)
iomap_apply: 28494
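The derived figures can be rechecked from the raw counters with a few lines of Python. This is only a sketch: the hit rate is taken as 1 - iomap_seek_hole / find_allocation, which is an assumption about how the percentages were derived, and the function names are made up for the example. The counts themselves are copied from the stap output above.

```python
# (find_allocation, iomap_seek_hole, iomap_apply) per run, from the stap output
RUNS = {
    "master RAW": (59139, 59139, 843330989),
    "master-fix RAW": (59167, 4, 113286),
    "master-fix QCOW2-SML_BF": (1418, 145, 794),
    "master QCOW2-BIG_BF": (59133, 59133, 843172534),
    "master-fix QCOW2-BIG_BF": (59130, 1, 28494),
}
NB_EXTENT = 28495  # extents in the fragmented test file


def hit_rate_percent(find_allocation, iomap_seek_hole):
    """Share of find_allocation calls that did not need a SEEK_HOLE (assumed formula)."""
    return 100.0 * (1 - iomap_seek_hole / find_allocation)


def extents_per_seek(iomap_seek_hole, iomap_apply):
    """Average number of extents walked by one SEEK_HOLE call."""
    return iomap_apply / iomap_seek_hole


if __name__ == "__main__":
    for name, (find, seek, applied) in RUNS.items():
        print(f"{name}: hit rate {hit_rate_percent(find, seek):.3f}%, "
              f"{extents_per_seek(seek, applied):.1f} extents per SEEK_HOLE")
```

On master, the average number of extents walked per SEEK_HOLE comes out close to NB_EXTENT / 2. That is what one would expect if the calls are spread evenly over a file without holes and each one walks from its offset to EOF.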
################# seek_hole.stp ###############################
global iomap_apply, iomap_seek_hole, find_allocation
probe kernel.function("iomap_seek_hole").call {
    if (pid() == target()) {
        iomap_seek_hole ++
    }
}
probe kernel.function("iomap_apply").call {
    if (pid() == target()) {
        iomap_apply ++
    }
}
probe process("/usr/bin/qemu-system-x86_64").function("find_allocation").call {
    if (pid() == target()) {
        find_allocation ++
    }
}
probe end {
    printf ("find_allocation: %d\niomap_seek_hole: %d\niomap_apply: %d\n",
            find_allocation, iomap_seek_hole, iomap_apply)
}
https://bugs.launchpad.net/bugs/1912224
Title:
qemu may freeze during drive-mirroring on fragmented FS
Status in QEMU:
New
Bug description:
We see odd behavior in production where qemu freezes for several
seconds. We started a thread about this issue here:
https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05623.html
It happens at least during an OpenStack nova snapshot (qemu blockdev-mirror)
or a live block migration (which includes a network copy of the disk).
After further troubleshooting, it seems related to FS fragmentation on
the host.
Reproducible at least on:
Ubuntu 18.04.3/4.18.0-25-generic/qemu-4.0
Ubuntu 16.04.6/5.10.6/qemu-5.2.0-rc2
# Let's create a dedicated file system on an SSD/NVMe disk (60GB in my case):
$sudo mkfs.ext4 /dev/sda3
$sudo mount /dev/sda3 /mnt
$df -h /mnt
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 59G 53M 56G 1% /mnt
#Create a fragmented disk on it using 2MB Chunks (about 30min):
$sudo python3 create_fragged_disk.py /mnt 2
Filling up FS by Creating chunks files in: /mnt/chunks
We are probably full as expected!!: [Errno 28] No space left on device
Creating fragged disk file: /mnt/disk
$ls -lhs /mnt/disk
59G -rw-r--r-- 1 root root 59G Jan 15 14:08 /mnt/disk
$ sudo e4defrag -c /mnt/disk
Total/best extents 41971/30
Average size per extent 1466 KB
Fragmentation score 2
[0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
This file (/mnt/disk) does not need defragmentation.
Done.
# ^^^ the tool says the file is not fragmented enough to be defragmented.
# Inject an image into the fragmented disk
sudo chown ubuntu /mnt/disk
wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
qemu-img convert -O raw bionic-server-cloudimg-amd64.img \
bionic-server-cloudimg-amd64.img.raw
dd conv=notrunc iflag=fullblock if=bionic-server-cloudimg-amd64.img.raw \
of=/mnt/disk bs=1M
virt-customize -a /mnt/disk --root-password password:xxxx
# log in and run some console activity, e.g.: ping -i 0.3 127.0.0.1
$qemu-system-x86_64 -m 2G -enable-kvm -nographic \
-chardev socket,id=test,path=/tmp/qmp-monitor,server,nowait \
-mon chardev=test,mode=control \
-drive file=/mnt/disk,format=raw,if=none,id=drive-virtio-disk0,cache=none,discard \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on
$sync
$echo 3 | sudo tee -a /proc/sys/vm/drop_caches
#start drive-mirror via qmp on another SSD/nvme partition
nc -U /tmp/qmp-monitor
{"execute":"qmp_capabilities"}
{"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"/home/ubuntu/mirror","sync":"full","format":"qcow2"}}
^^^ qemu console may start to freeze at this step.
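The same QMP exchange can be scripted instead of typed into nc. The following is a minimal sketch, assuming the monitor socket path used above; the helper names drive_mirror_command and run_qmp are hypothetical. It reads the QMP greeting, negotiates capabilities, sends the command and skips asynchronous events until it sees a "return" or "error" reply.

```python
import json
import socket


def drive_mirror_command(device, target, sync="full", fmt="qcow2"):
    """Build the drive-mirror command used in the reproducer."""
    return {
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "target": target,
            "sync": sync,
            "format": fmt,
        },
    }


def run_qmp(sock_path, commands):
    """Send qmp_capabilities followed by commands; return one reply per command sent."""
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with sock.makefile("rw", encoding="utf-8") as stream:
            stream.readline()  # QMP greeting banner, one JSON object per line
            for cmd in [{"execute": "qmp_capabilities"}] + list(commands):
                stream.write(json.dumps(cmd) + "\n")
                stream.flush()
                while True:
                    line = stream.readline()
                    if not line:
                        raise ConnectionError("QMP socket closed")
                    msg = json.loads(line)
                    # Anything without "return"/"error" is an async event.
                    if "return" in msg or "error" in msg:
                        replies.append(msg)
                        break
    return replies


if __name__ == "__main__":
    cmd = drive_mirror_command("drive-virtio-disk0", "/home/ubuntu/mirror")
    print(run_qmp("/tmp/qmp-monitor", [cmd]))
```

Timing how long each reply takes to come back would be one way to observe the freeze from the monitor side, although the sketch does not do that.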
NOTE:
- The smaller the chunk size and the bigger the disk, the worse it gets.
In production we also hit the issue on a 400GB disk with an average of 13MB/extent
- Also reproducible on xfs
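Why chunk size and disk size both matter can be illustrated with a rough cost model. This is a back-of-the-envelope sketch, not a measurement, and it assumes three things: the mirror job probes the source about once per MiB (the mib_per_call parameter is an assumption), the file has no holes, and each probe makes the kernel walk every extent from its offset to EOF, which is about half of the extents on average.

```python
def estimated_extent_visits(disk_mib, avg_extent_mib, mib_per_call=1.0):
    """Rough estimate of the extents walked during a full mirror.

    disk_mib:       size of the source file in MiB
    avg_extent_mib: average extent size in MiB
    mib_per_call:   assumed amount of data covered per SEEK_HOLE probe
    """
    calls = disk_mib / mib_per_call        # number of probes
    extents = disk_mib / avg_extent_mib    # extents in the file
    return calls * extents / 2             # each probe walks ~half of them


if __name__ == "__main__":
    base = estimated_extent_visits(1000, 2)
    print("double the disk size:", estimated_extent_visits(2000, 2) / base)
    print("halve the extent size:", estimated_extent_visits(1000, 1) / base)
```

Under these assumptions the work grows with the square of the disk size and inversely with the extent size, which matches the observation that bigger disks and smaller chunks make the freeze worse.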
Expected behavior:
-------------------
QEMU should remain responsive; at worst, storage or mirroring
performance should decrease because of the fragmented FS.
Observed behavior:
-------------------
Mirroring performance is still quite good even on a fragmented FS,
but it breaks qemu.
###################### create_fragged_disk.py ############
import sys
import os
import tempfile
import glob
import errno

MNT_DIR = sys.argv[1]
CHUNK_SZ_MB = int(sys.argv[2])
CHUNKS_DIR = MNT_DIR + '/chunks'
DISK_FILE = MNT_DIR + '/disk'

if not os.path.exists(CHUNKS_DIR):
    os.makedirs(CHUNKS_DIR)

# 1 MiB of random data, reused for every write
with open("/dev/urandom", "rb") as f_rand:
    mb_rand = f_rand.read(1024 * 1024)

# Step 1: fill the file system with chunk files of CHUNK_SZ_MB each
print("Filling up FS by Creating chunks files in: ", CHUNKS_DIR)
try:
    while True:
        tp = tempfile.NamedTemporaryFile(dir=CHUNKS_DIR, delete=False)
        for x in range(CHUNK_SZ_MB):
            tp.write(mb_rand)
        tp.flush()  # push Python's buffer to the kernel before fsync
        os.fsync(tp)
        tp.close()
except Exception as ex:
    print("We are probably full as expected!!: ", ex)

chunks = glob.glob(CHUNKS_DIR + '/*')

# Step 2: free one chunk at a time and grow the disk file into the freed
# space, so the disk file ends up with roughly one extent per chunk
print("Creating fragged disk file: ", DISK_FILE)
with open(DISK_FILE, "w+b") as f_disk:
    for chunk in chunks:
        try:
            os.unlink(chunk)
            for x in range(CHUNK_SZ_MB):
                f_disk.write(mb_rand)
            f_disk.flush()
            os.fsync(f_disk)
        except IOError as ex:
            if ex.errno != errno.ENOSPC:
                raise
###########################################################