
Re: [Discuss-gnuradio] compressing I/Q files


From: Kristoff
Subject: Re: [Discuss-gnuradio] compressing I/Q files
Date: Thu, 14 Mar 2019 10:49:45 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1

Marcus, all,


Thx.

In the meantime, I did a little bit of testing.

I took a 256 MB piece of an I/Q file (a pass of NOAA-19), sampled at 240 ksps.
Gzip compressed this down to 40 MB. 7zip managed to get it down to 29 MB (but compressing took 10 to 20 times longer).

Now, after converting this file from float to short, you get a 128 MB file.
However, if you then compress that, the gain isn't as big anymore: gzip gets it to 33 MB, 7zip to 25 MB.


My guess is that gzip and 7zip compress by looking for repetitive patterns. That means converting 32-bit floats to 16-bit shorts does not really help if you plan to compress the files afterwards anyway.
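
For reference, a minimal NumPy sketch of that float-to-short conversion (file names and the scale factor are placeholders; it assumes interleaved complex float32 samples, as a GNU Radio file sink writes them):

    import numpy as np

    # Interleaved complex float32 I/Q, as written by a GNU Radio file sink.
    # File names here are placeholders.
    iq = np.fromfile("noaa19_pass.fc32", dtype=np.complex64)

    # Scale into the int16 range and convert; this assumes the float
    # samples stay within [-1.0, 1.0) -- adjust the scale to your recording.
    interleaved = iq.view(np.float32) * 32767.0
    np.clip(interleaved, -32768, 32767, out=interleaved)
    interleaved.astype(np.int16).tofile("noaa19_pass.sc16")

That halves the file (32 to 16 bits per component) before any compressor even runs.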



Kristoff


On 10/03/19 18:33, Marcus Müller wrote:
Hi Kristoff, Benny and Alban,

TL;DR:
Benny is spot on. Other than that, decimate your signal if you
know the bandwidth is less than your sampling rate, and don't put too
much hope in audio encoders.
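
(To illustrate the decimation point, a sketch using scipy; the rates
and file names are made up, and decimating by 10 only works if the
signal really fits within a tenth of the original bandwidth:)

    import numpy as np
    from scipy.signal import decimate

    # Made-up numbers: a 2.4 Msps recording whose signal occupies well
    # under 240 kHz can be decimated by 10 without losing the signal.
    iq = np.fromfile("capture.fc32", dtype=np.complex64)
    iq_dec = decimate(iq, 10, ftype="fir")  # low-pass, keep every 10th sample
    iq_dec.astype(np.complex64).tofile("capture_240k.fc32")

That alone shrinks the file by the decimation factor, before any
entropy coder is involved.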

Long Version:

The point is: the signal coming from your SDR device, whatever that
might be, has finite resolution – typically no more than 16 bits per
channel. Hence, the conversion from float to short (or directly getting
shorts, if your device driver allows that) is lossless. For example,
the USRPs' driver (UHD), and the GNU Radio USRP source, can be
configured to hand out the signed complex 16-bit conversion of the data
from the network or USB interface instead of the float32 conversion.
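
(In GNU Radio's Python API that looks roughly like this – the device
address and the tuning numbers are placeholders:)

    from gnuradio import uhd

    # Request sc16 both over the wire and as the CPU format, so the
    # samples are never converted to float32 at all.
    usrp = uhd.usrp_source(
        "addr=192.168.10.2",  # placeholder device address
        uhd.stream_args(cpu_format="sc16", otw_format="sc16"),
    )
    usrp.set_samp_rate(240e3)
    usrp.set_center_freq(137.1e6)

The source then emits complex int16 items, which you can write straight
to disk with a file sink of matching item size.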

Any other compression method can only do so much:
your signal recording is essentially random – meaning that all values
should be roughly equally likely. Maybe extremely high amplitudes are a
little rarer, since you'd typically avoid those to stay clear of
clipping.
That means the average information per sample is relatively high: from
seeing other samples, we learn very little about a given one, so the
surprise we get from its actual value is pretty high.
Information-theoretically, the expected information content per sample
is the entropy of the source. Information and entropy are both measured
in bits – a completely fair random decision between 0 and 1 ("flipping
a coin") is worth 1 bit, and picking one out of 2¹⁶ values perfectly
randomly is worth 16 bits.
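
(If you want to put a number on that for a concrete recording, you can
estimate the empirical entropy by treating each 16-bit sample value as
a symbol – a quick sketch, file name made up:)

    import numpy as np

    # Count how often each 16-bit value occurs and compute the Shannon
    # entropy H = -sum(p * log2(p)) over those frequencies.
    samples = np.fromfile("capture.sc16", dtype=np.int16)
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    print(f"{-np.sum(p * np.log2(p)):.2f} bits/sample, of 16 possible")

If that comes out close to 16, no lossless compressor will gain you
much.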

(Lossless) compression can, in the best case, get the number of bits
used per sample down to the entropy of the source.
Now, if your signal is somewhat noisy, and otherwise relatively
interesting (i.e. you're not observing a constant value), your source
entropy often approaches the limit given by the ADC – in my tests, even
on severely backed-off signals, standard Huffman- and Lempel-Ziv-family
compressors (zip, gzip, 7z, zstd, bz2, xz) achieved negligible
compression on radio recordings.
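
(Such a test is easy to reproduce with Python's standard library,
assuming an int16 recording – the file name is made up:)

    import lzma
    import zlib

    # Read the raw recording and see how much two standard lossless
    # compressors squeeze out of it.
    with open("capture.sc16", "rb") as f:
        raw = f.read()

    for name, packed in (("zlib", zlib.compress(raw, 9)),
                         ("lzma", lzma.compress(raw))):
        print(f"{name}: {len(packed) / len(raw):.1%} of original size")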

I've tried FLAC, too – FLAC doesn't allow setting the sampling rate as
high as what typical SDR hardware actually uses (i.e. the header field
for the sampling rate simply isn't large enough to hold 10⁷, for
example). But that's mainly a metadata problem that can be solved by
ignorance.
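
(Concretely, the ignorance workaround: pack I and Q as the two channels
of a "stereo" file, write a bogus but legal rate into the header, and
keep the real rate in your own notes. A sketch using the soundfile
package – file names and rates are made up:)

    import numpy as np
    import soundfile as sf  # pip install soundfile

    # Two "audio" channels: channel 0 = I, channel 1 = Q.
    iq = np.fromfile("capture.sc16", dtype=np.int16).reshape(-1, 2)
    # The real rate (say, 10 Msps) doesn't fit the header, so we lie
    # and record the true rate out of band.
    sf.write("capture.flac", iq, samplerate=96000, subtype="PCM_16")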
However, FLAC's linear predictive coding relies on signals having
a) a "small" deviation from a linear function over short time periods, and
b) residuals that follow a geometric distribution, which is what the
residual coder is built for –

and that's usually not a given, because
a) if you already know you'll need compression, you're probably not
significantly oversampling your signal, but are already decimating it
to a rate barely more than sufficient. Anything else would be a waste
of space – and has no benefit for later signal analysis, and
b) with assumption a) broken, only a zero-order linear precoder avoids
making things worse – i.e., simply handing the input samples through to
the residual coder. That residual coder, as said, depends on the
amplitude distribution following a specific statistic to work well.
Sadly, that statistic typically doesn't apply to I/Q signals.

My experience is that FLAC doesn't work well for anything that's not
massively oversampled AM audio – which is no surprise, because that
literally isn't very different from audio, which is what FLAC was
designed for.

However, my FLAC experiments are years in the past – maybe the encoder
has become more versatile since; Alban, does your experience differ?

Best regards,
Marcus
On Sun, 2019-03-10 at 11:54 +0000, Benny Alexandar wrote:
Yes, converting float 32-bit to short 16-bit is an option; compressing
using 7zip or gzip won't give good compression.
From: Discuss-gnuradio <address@hidden> on behalf of Kristoff <address@hidden>
Sent: Sunday, March 10, 2019 3:57 PM
To: address@hidden
Subject: [Discuss-gnuradio] compressing I/Q files
Subject: [Discuss-gnuradio] compressing I/Q files
Hi all,



Simple and short question:
What is the best way to compress a raw I/Q file? A generic compression
tool like gzip or zip? Or are there better, more specialised tools?


Is converting the data in the I/Q file from float to short an option?


Kristoff


_______________________________________________
Discuss-gnuradio mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio


