From: anonymous
Subject: [Octave-patch-tracker] [patch #9980] JSON encoder and decoder, alternative to object2json
Date: Wed, 22 Dec 2021 00:33:09 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0

Follow-up Comment #55, patch #9980 (project octave):

Got it.  

On your 'xulstore.json' trial: this was an issue caused by a URL being used as
a struct key.  I found that in other files too and resolved it earlier today.

Likewise on the stricter Octave syntax checking.  I was able to get Octave 6.4
running and saw the same.  This too has been resolved.   I assume it will
likewise be OK in 7 and 8.

On the 'large-file.json', it's a good hard case, but as noted before, this
kind of JSON file is not the main use case for Octave or 'fromJSON', and will
definitely choke on 'fromJSON' recursions.  If, on the other hand, you were to
try with a numerical dataset (in Octave's wheelhouse), like 'numerical.json'
(19MB) found here:

https://gist.github.com/kmpatel/e37e4df0a1971f25fcf10fadd2d8368d

'fromJSON' is much, much, much, much faster (~3sec on a 4th gen quad core).
Would be curious to see the difference in speed and output with 'jsondecode'.

Also, regarding the earlier issue with Seamonkey profiles, I think I might
have found a similar file (attached).

In the attached file, 'dhs_keywords.json', the 3 leading bytes are neither
valid ASCII chars nor, AFAIK, a valid UTF-8 char.  As such, they do not even
display on screen (in Octave or any text editor).  Moreover, since they are
unquoted, they have no valid *JSON* related reason to be there and cause
problems in 'fromJSON'.   I assume it is some file byte code for govt spying.

Anyways, on the assumption that *THIS* is the "Seamonkey profile" problem, I
performed the necessary bug fix of simply detecting and ignoring any invalid
UTF-8 at the beginning of the string (on the premise that these are file
encoding chars accidentally read in by 'fread' or 'fileread').  Anywhere else
in the file, if they are quoted, valid (like emojis) and invalid UTF-8 chars
will pass thru harmlessly, but unquoted invalid UTF-8 chars will continue to
be a problem.  'fromJSON' will fail gracefully with a _<invalid frag>_
warning, showing the surrounding string, but the invalid UTF-8 chars will
remain invisible to the user.  Unless they carefully inspect the string as a
_uint8_ number array, the warning will appear bogus as far as they can tell.
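
For illustration, here is a minimal sketch of this kind of leading-byte
cleanup.  It is written in Python rather than Octave and is not taken from
'fromJSON'; the function names are hypothetical.  The sample assumes the three
stray bytes are EF BB BF, the UTF-8 byte order mark, which is one common source
of invisible leading bytes; the actual bytes in 'dhs_keywords.json' may differ.
The sketch relies on the fact that a JSON text always starts with an ASCII
character (whitespace, a bracket, a quote, a digit, '-', or a literal), so
any leading byte >= 0x80 can be skipped.

```python
import json

UTF8_BOM = b"\xef\xbb\xbf"  # assumed example of stray leading bytes


def strip_leading_non_ascii(raw: bytes) -> bytes:
    """Drop bytes >= 0x80 from the start of raw (hypothetical helper).

    Only the beginning is touched; bytes elsewhere are left alone.
    """
    i = 0
    while i < len(raw) and raw[i] >= 0x80:
        i += 1
    return raw[i:]


def leading_bytes(raw: bytes, n: int = 8) -> list:
    """Return the first n bytes as integers, like a uint8 view of the string."""
    return list(raw[:n])


sample = UTF8_BOM + b'{"keywords": ["a", "b"]}'

# The numeric view exposes bytes that a terminal or editor would not show.
print(leading_bytes(sample, 4))  # [239, 187, 191, 123]

cleaned = strip_leading_non_ascii(sample)
print(json.loads(cleaned.decode("utf-8")))
```

The same numeric inspection is what makes an otherwise bogus-looking warning
explainable: the offending bytes show up as values above 127.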

The attached 'fromJSON_v2_7_3.m' has been tested against ~980 random JSON
files found on my linux box.  Quite a few were not, strictly speaking, valid
JSON files.  Nevertheless, with the fixes to 'fromJSON' and a little
pre-processing, these JSON files now parse pretty well (I can account for
those that don't).  I have incorporated some of the extreme conditions found
(like your 'xulstore.json') in the BIST.  (At this point, there are 600 lines
of BIST for 200 lines of code.)

I'm also uploading an improved Octave script I used to find JSON files,
collect them into a temp directory, and test them against 'fromJSON'.  I think
it will work on Win and Mac, if you want, but you will have to change the
directory where all the JSONs are collected (or at least modify the script to
simply read in place).
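
As a rough idea of what such a test harness does, here is a sketch of the
"read in place" variant.  It is in Python, not the attached
'random_json_test.m', and uses Python's own 'json.loads' as a stand-in parser;
the function name 'check_json_tree' is hypothetical.

```python
import json
import os
import tempfile


def check_json_tree(root, parse=json.loads):
    """Parse every *.json file under root in place.

    Returns (passed, failed): a list of paths and a list of (path, error).
    """
    passed, failed = [], []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.lower().endswith(".json"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Undecodable bytes are replaced so that the parser,
                    # not the decoder, reports the problem.
                    parse(fh.read().decode("utf-8", errors="replace"))
                passed.append(path)
            except (OSError, ValueError) as err:
                failed.append((path, str(err)))
    return passed, failed


# Small demo on a throwaway directory with one valid and one truncated file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "good.json"), "w") as fh:
        fh.write('{"a": [1, 2, 3]}')
    with open(os.path.join(tmp, "bad.json"), "w") as fh:
        fh.write('{"a": ')
    ok, bad = check_json_tree(tmp)
    print(len(ok), "parsed,", len(bad), "failed")
```

Reading in place avoids the platform-specific temp directory, at the cost of
not having all the collected files in one spot for later inspection.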


K

(file #52551, file #52552, file #52553)
    _______________________________________________________

Additional Item Attachment:

File name: fromJSON_v2_7_3.m              Size:36 KB
    <https://file.savannah.gnu.org/file/fromJSON_v2_7_3.m?file_id=52551>

File name: dhs_keywords.json              Size:8 KB
    <https://file.savannah.gnu.org/file/dhs_keywords.json?file_id=52552>

File name: random_json_test.m             Size:1 KB
    <https://file.savannah.gnu.org/file/random_json_test.m?file_id=52553>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/patch/?9980>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



