Re: Malformed sav file - need help for investigation


From: Alan Mead
Subject: Re: Malformed sav file - need help for investigation
Date: Wed, 10 Jun 2020 12:57:08 -0500
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0

Ben's a genius at understanding SPSS file formats, but he's not going to
have any insight into tiamo's PHP code. If tiamo's code has a source
file called sys-file-reader, it's probably a port of PSPP code (that
should be evident from examining the file; it should carry an FSF
copyright notice), and you would want to see what tiamo does for error
checking and reporting. Maybe they have turned off error reporting and
there's a way to turn it back on?

About PSPP code, I think you're going to need to ask more specific
questions. In pspp-1.2.0/src/data/sys-file-reader.c, the error seems to
be related to a variable index pointing into some inappropriate segment
of a very long string (I think the code at that point expects a variable
name and instead finds something else):
https://www.gnu.org/software/pspp/pspp-dev/html_node/Very-Long-String-Record.html#Very-Long-String-Record
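
To make that failure mode concrete, here is a minimal Python sketch. It
assumes the layout described in the PSPP developer documentation: a string
variable wider than 8 bytes is followed by continuation records with width
-1, and 1-based variable indexes count those continuation records too. The
names VarRecord, build_records and check_index are hypothetical; this is
neither PSPP's nor tiamo's code, and it ignores very long strings (wider
than 255 bytes). The second error message is modeled on the one quoted in
this thread; the first is only illustrative.

```python
from dataclasses import dataclass


@dataclass
class VarRecord:
    """Hypothetical stand-in for one variable record in the dictionary."""
    name: str
    width: int  # 0 = numeric, 1..255 = string width, -1 = continuation


def build_records(variables):
    """Expand (name, width) pairs into records, appending one continuation
    record (width -1) for each extra 8-byte segment of a string."""
    records = []
    for name, width in variables:
        records.append(VarRecord(name, width))
        if width > 8:
            extra = (width + 7) // 8 - 1
            records.extend(VarRecord("", -1) for _ in range(extra))
    return records


def check_index(records, idx):
    """Return the variable name at 1-based index idx, or raise ValueError."""
    if idx < 1 or idx > len(records):
        raise ValueError(
            f"Variable index {idx} not in valid range 1...{len(records)}.")
    rec = records[idx - 1]
    if rec.width == -1:
        raise ValueError(
            f"Variable index {idx} refers to long string continuation.")
    return rec.name


if __name__ == "__main__":
    # ID (numeric), NAME (20-byte string, 2 continuations), AGE (numeric)
    records = build_records([("ID", 0), ("NAME", 20), ("AGE", 0)])
    print(check_index(records, 5))  # AGE sits at record index 5
    try:
        check_index(records, 3)     # 3 is NAME's first continuation record
    except ValueError as err:
        print(err)
```

In this toy dictionary AGE is the third variable but the fifth record. A
writer that counted variables instead of records would emit index 3 for AGE
and trigger exactly this kind of error. Whether tiamo's package does that
is a guess, not something I have checked.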

All of this suggests that either the file is being corrupted (so it's
reading bad data) or else long labels are not being handled correctly.
Maybe only a part of the file is read because of PHP's limitation on
upload file size?
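
One cheap way to rule out truncation is to inspect the file right after it
is generated and again after any upload or transfer step, then compare the
results. The sketch below is a rough illustration only. It assumes the
176-byte file header layout from the PSPP developer documentation (magic
at offset 0, five 32-bit integers starting at offset 64), and the function
name read_sav_header is made up for this example. It only confirms that
the header is intact and reports the total size; it does not validate the
rest of the file.

```python
import os
import struct

HEADER_SIZE = 176  # size of the file header record, per the PSPP docs


def read_sav_header(path):
    """Return basic header fields plus the on-disk size of a .sav file."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError(
            f"file has only {len(header)} bytes; header needs {HEADER_SIZE}")
    magic = header[:4]
    if magic not in (b"$FL2", b"$FL3"):
        raise ValueError(f"unexpected magic {magic!r}")
    # layout_code should be 2 or 3; try both byte orders to find which fits.
    for name, prefix in (("little", "<"), ("big", ">")):
        layout, case_size, compression, weight_index, ncases = struct.unpack(
            prefix + "5i", header[64:84])
        if layout in (2, 3):
            return {
                "byte_order": name,
                "nominal_case_size": case_size,
                "compression": compression,
                "weight_index": weight_index,
                "ncases": ncases,
                "size_bytes": os.path.getsize(path),
            }
    raise ValueError("layout_code is neither 2 nor 3 in either byte order")
```

If size_bytes differs between the generated file and the copy that was
actually opened, the problem is in the transfer rather than in the writer.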

You may find this service that Ben provides to be of some assistance in
verifying sav files:  https://pspp.benpfaff.org/

One advantage (and potential gotcha) is that Ben runs his latest code at
the website above. Historically, there are features available online
that are not available in PSPP.

-Alan

On 6/10/2020 12:24 PM, Ignácz, István wrote:
> Thanks, Ben for the quick response!
>
> Yes, the ultimate goal is to add a patch to the package which fixes the
> problem or throws an error if the user of the package (in this case me)
> provides invalid input. I think I'm facing the latter.
>
> One significant difference between the test file and my case is that in my
> case no variables and no cases are recognized so the reading process
> basically crashes.
> I think I've found the part where the PSPP ends up in a 'goto error'
> (sys-file-reader.813) but I lack the context of what's going on here. I
> guess the value labels could be misconfigured? Maybe there's a value in a
> record that doesn't have a label in the variable settings?
>
> If I knew a little bit more about what happens in that part and why
> that causes an error, maybe I could force my (and tiamo's) code to reliably
> produce the error.
>
> Thanks,
> István
>
> On Wed, 10 Jun 2020 at 18:51, Ben Pfaff <blp@cs.stanford.edu> wrote:
>
>> I'm not sure exactly what you're looking for, but PSPP has a test case
>> that intentionally provokes this error message. I'm attaching the file it
>> produces.
>>
>> I guess the ultimate goal here should be to fix tiamo's package?
>>
>> On Wed, Jun 10, 2020 at 9:42 AM Ignácz, István <istvan.ignacz@neticle.com>
>> wrote:
>>
>>> Greetings,
>>>
>>> I think this is not the usual topic on this list, excuse me if you find it
>>> off-topic.
>>>
>>> I'm facing a tough bug in my software and I'm looking for some help to be
>>> able to take the next steps.
>>>
>>> Our software generates sav files via tiamo's package:
>>> https://packagist.org/packages/tiamo/spss
>>> Usually, everything works fine, but in some cases, we generate malformed
>>> files. In this case, SPSS crashes without any notice, however, PSPP at
>>> least gives a useful error message:
>>> Variable index n refers to long string continuation.
>>>
>>> I cannot reproduce this issue on my side, so I tried to reverse engineer
>>> the PSPP codebase to find any cases where it could occur. So far without
>>> much luck.
>>>
>>> Could you help me with some example cases?
>>>
>>> Thanks for your help,
>>> István
>>>

-- 

Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

http://www.alanmead.org


A lie can travel half way around the world while the truth is
putting on its shoes.

-- Mark Twain

