pspp-users

Re: Malformed sav file - need help for investigation


From: Ignácz, István
Subject: Re: Malformed sav file - need help for investigation
Date: Wed, 10 Jun 2020 20:26:31 +0200

Sorry if my earlier mail was ambiguous: I meant PSPP's sys-file-reader.c,
and I never expected anyone on this list to check the PHP code.

You shared a lot of really useful details about potential problems; they
help me a great deal to start/continue my investigation into where I went
wrong. This is exactly what I was looking for when I wrote my first email.

Thanks for your time and help!
István
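
For context, the error message quoted below can be illustrated with a
minimal sketch. It assumes the layout described in the PSPP developer
documentation: a string variable wider than 8 bytes is followed by one
continuation record per extra 8-byte segment, and value label records
refer to variables by a 1-based index that counts those continuation
records too. The function names build_slots and resolve_index are
hypothetical, this is not PSPP's or tiamo's code, and very long strings
(wider than 255 bytes) are ignored here.

```python
import math


def build_slots(widths):
    """Expand variable widths into one slot per variable record.

    widths: 0 for a numeric variable, 1..255 for a string width in bytes.
    Each slot holds the variable's position, or None for a continuation
    record that only pads out a string wider than 8 bytes.
    """
    slots = []
    for var_no, width in enumerate(widths):
        slots.append(var_no)
        if width > 8:
            # One continuation record per additional 8-byte segment.
            slots.extend([None] * (math.ceil(width / 8) - 1))
    return slots


def resolve_index(slots, index):
    """Map a 1-based record index to a variable position, or raise."""
    if not 1 <= index <= len(slots):
        raise ValueError(f"Variable index {index} is out of range.")
    var_no = slots[index - 1]
    if var_no is None:
        raise ValueError(
            f"Variable index {index} refers to long string continuation."
        )
    return var_no


# A numeric variable, a 20-byte string, and another numeric variable.
slots = build_slots([0, 20, 0])
```

With these widths the 20-byte string occupies indexes 2 to 4, so a writer
that counts variables instead of records would emit index 3 for the third
variable and land on a continuation record; the correct index is 5.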

On Wed, 10 Jun 2020 at 19:57, Alan Mead <amead@alanmead.org> wrote:

> Ben's a genius at understanding SPSS file formats, but he's not going to
> have any insight into tiamo's PHP code. If tiamo's code has a source file
> called sys-file-reader, it's probably a port of PSPP code (and that should
> be self-evident from examining the file; it should have an FSF copyright)
> and you would want to see what tiamo is doing for error checking and
> reporting. Maybe they have turned off error reporting and there's a way to
> turn it back on?
>
> About PSPP code, I think you're going to need to ask more specific
> questions. In pspp-1.2.0/src/data/sys-file-reader.c, the error seems to be
> related to a variable index pointing into some inappropriate segment of a
> very long string (I think the code at that point expects a variable name
> and instead finds something else):
> https://www.gnu.org/software/pspp/pspp-dev/html_node/Very-Long-String-Record.html#Very-Long-String-Record
>
> All of this suggests that either the file is being corrupted (so it's
> reading bad data) or else long labels are not being handled correctly.
> Maybe only a part of the file is read because of PHP's limitation on upload
> file size?
>
> You may find this service that Ben provides to be of some assistance in
> verifying sav files:  https://pspp.benpfaff.org/
>
> One advantage (and potential gotcha) is that Ben runs his latest code at
> the website above. Historically, there are features available online that
> are not available in PSPP.
>
> -Alan
>
> On 6/10/2020 12:24 PM, Ignácz, István wrote:
>
> Thanks, Ben for the quick response!
>
> Yes, the ultimate goal is to add a patch to the package which fixes the
> problem or throws an error if the user of the package (in this case me)
> provides invalid input. I think I'm facing the latter.
>
> One significant difference between the test file and my case is that in my
> case no variables and no cases are recognized so the reading process
> basically crashes.
> I think I've found the part where PSPP ends up in a 'goto error'
> (sys-file-reader.c:813) but I lack the context of what's going on here. I
> guess the value labels could be misconfigured? Maybe there's a value in a
> record that doesn't have a label in the variable settings?
>
> If I knew a little bit more about what happens in that part and why
> that causes an error, maybe I could force my (and tiamo's) code to reliably
> produce the error.
>
> Thanks,
> István
>
> On Wed, 10 Jun 2020 at 18:51, Ben Pfaff <blp@cs.stanford.edu> wrote:
>
>
> I'm not sure exactly what you're looking for, but PSPP has a test case
> that intentionally provokes this error message. I'm attaching the file it
> produces.
>
> I guess the ultimate goal here should be to fix tiamo's package?
>
> On Wed, Jun 10, 2020 at 9:42 AM Ignácz, István <istvan.ignacz@neticle.com>
> wrote:
>
>
> Greetings,
>
> I think this is not the usual topic on this list, excuse me if you find it
> off-topic.
>
> I'm facing a tough bug in my software and I'm looking for some help to be
> able to take the next steps.
>
> Our software generates sav files via tiamo's package:
> https://packagist.org/packages/tiamo/spss
> Usually, everything works fine, but in some cases, we generate malformed
> files. In those cases, SPSS crashes without any notice; PSPP, however, at
> least gives a useful error message:
> Variable index n refers to long string continuation.
>
> I cannot reproduce this issue on my side, so I tried to reverse engineer
> the PSPP codebase to find any cases when it could occur. So far without
> much luck.
>
> Could you help me with some example cases?
>
> Thanks for your help,
> István
>
>
>
> --
>
> Alan D. Mead, Ph.D.
> President, Talent Algorithms Inc.
>
> science + technology = better workers
> http://www.alanmead.org
>
>
> A lie can travel half way around the world while the truth is
> putting on its shoes.
>
> -- Mark Twain
>
>
>
>

-- 

*István Ignácz*
Software Engineer @ Neticle

+36 30 460 5356
istvan.ignacz@neticle.com
neticle.com

Neticle Labs Ltd.
Address: H-1082 Budapest, Leonardo da Vinci  41. (2nd floor/12)
Billing address: H-1213 Budapest, Veto street 10.

