[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
From: |
Qiuhao Li |
Subject: |
Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection |
Date: |
Mon, 11 Jan 2021 10:19:57 +0800 |
User-agent: |
Evolution 3.36.4-0ubuntu1 |
On Sun, 2021-01-10 at 11:00 -0500, Alexander Bulekov wrote:
> On 210110 2110, Qiuhao Li wrote:
> > On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> > > On 201229 1240, Qiuhao Li wrote:
> > > > We spend much time waiting for the timeout program during the
> > > > minimization
> > > > process until it passes a time limit. This patch hacks the
> > > > CLOSED
> > > > (indicates
> > > > the redirection file closed) notification in QTest's output if
> > > > it
> > > > doesn't
> > > > crash.
> > > >
> > > > Test with quadrupled trace input at:
> > > > https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > > >
> > > > Original version:
> > > > real 1m37.246s
> > > > user 0m13.069s
> > > > sys 0m8.399s
> > > >
> > > > Refined version:
> > > > real 0m45.904s
> > > > user 0m16.874s
> > > > sys 0m10.042s
> > > >
> > > > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> > > > ---
> > > > scripts/oss-fuzz/minimize_qtest_trace.py | 41
> > > > ++++++++++++++++--
> > > > ------
> > > > 1 file changed, 28 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > index 5e405a0d5f..aa69c7963e 100755
> > > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally,
> > > > manually
> > > > set a string that idenitifes the
> > > > crash by setting CRASH_TOKEN=
> > > > """.format((sys.argv[0])))
> > > >
> > > > +deduplication_note = """\n\
> > > > +Note: While trimming the input, sometimes the mutated trace
> > > > triggers a different
> > > > +crash output but indicates the same bug. Under this situation,
> > > > our
> > > > minimizer is
> > > > +incapable of recognizing and stopped from removing it. In the
> > > > future, we may
> > > > +use a more sophisticated crash case deduplication method.
> > > > +\n"""
> > > > +
> > > > def check_if_trace_crashes(trace, path):
> > > > - global CRASH_TOKEN
> > > > with open(path, "w") as tracefile:
> > > > tracefile.write("".join(trace))
> > > >
> > > > - rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > > > + proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > >
> > > Why remove the -s 9 here? I ran into a case where the minimizer
> > > got
> > > stuck on one iteration. Adding back "sigkill" to the timeout can
> > > be a
> > > safety net to catch those bad cases.
> > > -Alex
> >
> > Hi Alex,
> >
> > After reviewed this patch again, I think this get-stuck bug may be
> > caused by code:
> >
> > - return CRASH_TOKEN in output
>
> Hi,
> Thanks for fixing this. Strangely, I was able to fix it by swapping
> the b'' for a ' ' when I was stuck on a testcase a few days ago.
> vvv
> > + for line in iter(rc.stdout.readline, b''):
> > + if "CLOSED" in line:
> > + return False
> > + if CRASH_TOKEN in line:
> > + return True
> >
>
> I think your proposed change essentially does the same?
> -Alex
Hi Alex,
It looks like I misused the bytes type. Instead of b'', '' (the str
type) should be used here:
- for line in iter(rc.stdout.readline, b''):
+ for line in iter(rc.stdout.readline, ''):
And you are right, if we use iter() with sentinel parameter '', it's
does the same as:
+ if line == "":
+ return False
But if we just fix the get-stuck bug here, we may fail
the assert(check_if_trace_crashes(newtrace, outpath)) check after
remove_lines() or clear_bits() since the same trace input may trigger a
different output between runs.
My solution is instead of using the whole second-to-last line as token,
we only use the the first three words which indicate the type of crash:
- CRASH_TOKEN = output.splitlines()[-2]
+ CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
Example: "SUMMARY: AddressSanitizer: stack-overflow"
And thus, we may a get a more trimmed input trace.
>
> > I assumed there are only two end cases in lines of stdout, but
> > while we
> > are trimming the trace input, the crash output (second-to-last
> > line)
> > may changes, in which case we will go through the output and fail
> > to
> > find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> >
> > To fix this bug and get a more trimmed input trace, we can:
> >
> > Use the first three words of the second-to-last line instead of the
> > whole string, which indicate the type of crash as the token.
> >
> > - CRASH_TOKEN = output.splitlines()[-2]
> > + CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> >
> > If we reach the end of a subprocess' output, return False.
> >
> > + if line == "":
> > + return False
> >
> > I fix it in [PATCH v7 1/7] and give an example. Could you review
> > again?
> > Thanks :-)
> >
> > FYI, I mentioned this situation firstly in [PATCH 1/4], where I
> > gave a
> > more detailed example:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html
> >
> > > > < {trace_path}".format(timeout=TIMEOUT,
> > > > qemu_path=QEMU_PATH,
> > > > qemu_args=QEMU_ARGS,
> > > > trace_path=path),
> > > > shell=True,
> > > > stdin=subprocess.PIPE,
> > > > - stdout=subprocess.PIPE)
> > > > - stdo = rc.communicate()[0]
> > > > - output = stdo.decode('unicode_escape')
> > > > - if rc.returncode == 137: # Timed Out
> > > > - return False
> > > > - if len(output.splitlines()) < 2:
> > > > - return False
> > > > -
> > > > + stdout=subprocess.PIPE,
> > > > + encoding="utf-8")
> > > > + global CRASH_TOKEN
> > > > if CRASH_TOKEN is None:
> > > > - CRASH_TOKEN = output.splitlines()[-2]
> > > > + try:
> > > > + outs, _ = proc.communicate(timeout=5)
> > > > + CRASH_TOKEN = outs.splitlines()[-2]
> > > > + except subprocess.TimeoutExpired:
> > > > + print("subprocess.TimeoutExpired")
> > > > + return False
> > > > + print("Identifying Crashes by this string:
> > > > {}".format(CRASH_TOKEN))
> > > > + global deduplication_note
> > > > + print(deduplication_note)
> > > > + return True
> > > >
> > > > - return CRASH_TOKEN in output
> > > > + for line in iter(proc.stdout.readline, b''):
> > > > + if "CLOSED" in line:
> > > > + return False
> > > > + if CRASH_TOKEN in line:
> > > > + return True
> > > > +
> > > > + return False
> > > >
> > > >
> > > > def minimize_trace(inpath, outpath):
> > > > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> > > > print("Crashed in {} seconds".format(end-start))
> > > > TIMEOUT = (end-start)*5
> > > > print("Setting the timeout for {}
> > > > seconds".format(TIMEOUT))
> > > > - print("Identifying Crashes by this string:
> > > > {}".format(CRASH_TOKEN))
> > > >
> > > > i = 0
> > > > newtrace = trace[:]
> > > > --
> > > > 2.25.1
> > > >