Re: Trouble parsing a response (Was: Re: New library: guile-wikidata)


From: swedebugia
Subject: Re: Trouble parsing a response (Was: Re: New library: guile-wikidata)
Date: Thu, 3 Jan 2019 13:25:37 +0100

On 2018-12-13 23:03, Roel Janssen wrote:


On 13-12-18 17:06, address@hidden wrote:
On 2018-12-13 16:01, address@hidden wrote:
snip


I tried with the file attached but got this because the driver does not
support URIs but only host, port, type, token:

Ah, I saw now that you already implemented URI on master :)
https://github.com/roelj/guile-sparql/blob/master/sparql/driver.scm

When I try calling this
;; Example query to wikidata listing cats
(sparql-query
  "SELECT ?item
WHERE
{
?item wdt:P31 wd:Q146.
}
LIMIT 10
"
  #:uri "https://query.wikidata.org/sparql"
  ;; #:port 80
  #:type "application/sparql-results+json"
  ;;  #:token "..."
  #:store-backend 'blazegraph
  )

I get this fine result:
#<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers:
((date . #<date nanosecond: 0 second: 12 minute: 32 hour: 15 day: 13
month: 12 year: 2018 zone-offset: 0>) (content-type
application/sparql-results+json (charset . "utf-8")) (transfer-encoding
(chunked)) (connection close) (server . "nginx/1.13.6") (x-served-by .
"wdqs1005") (access-control-allow-origin . "*") (cache-control public
(max-age . 300)) (vary accept accept-encoding) (x-varnish . "644531744,
572094009, 417977651") (via "1.1 varnish (Varnish/5.1)" "1.1 varnish
(Varnish/5.1)" "1.1 varnish (Varnish/5.1)") (accept-ranges bytes) (age .
0) (x-cache . "cp1079 pass, cp3030 pass, cp3030 pass") (x-cache-status .
"pass") (server-timing . "cache;desc=\"pass\"")
(strict-transport-security . "max-age=106384710; includeSubDomains;
preload") (set-cookie .
"WMF-Last-Access=13-Dec-2018;Path=/;HttpOnly;secure;Expires=Mon, 14 Jan
2019 12:00:00 GMT") (set-cookie .
"WMF-Last-Access-Global=13-Dec-2018;Path=/;Domain=.wikidata.org;HttpOnly;secure;Expires=Mon,
14 Jan 2019 12:00:00 GMT") (x-analytics . "https=1;nocookies=1")
(x-client-ip . "83.185.90.53")) port: #<input-output: file 8c37e70>>

My problem now is that I don't know how to separate the header from the
port-file.

Ah, reading here
https://www.gnu.org/software/guile/manual/html_node/Responses.html#Responses
I found (response-port).
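
For reference, a minimal sketch of how the accessors in (web response) split a response object into its header fields and its port. The response here is built by hand with build-response so that the snippet runs without a network connection; the header values and the one-line body are assumptions for illustration only, not what Wikidata returns.

```scheme
(use-modules (web response)   ; build-response, response-code, response-port, ...
             (ice-9 rdelim))  ; read-line

;; Hand-made response standing in for the one returned by the query
;; (an assumption for illustration).
(define r
  (build-response
   #:code 200
   #:headers '((content-type . (text/csv (charset . "utf-8"))))
   #:port (open-input-string "item\r\n")))

;; Header side: plain accessors on the response record.
(format #t "code: ~a~%" (response-code r))
(format #t "type: ~s~%" (response-content-type r))

;; Body side: the port the body is read from.
(format #t "first line: ~s~%" (read-line (response-port r)))
```

Note that response-port hands back the underlying port as-is; any transfer encoding applied by the server is still present in what is read from it.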


Here's what I did in SPARQLing-genomics:
https://github.com/UMCUGenetics/sparqling-genomics/blob/master/web/www/pages/query-response.scm#L86

So basically:

(use-modules
   (ice-9 receive)
   (ice-9 rdelim)
   (web response))

(receive (header port)
  ;; Note: "text/csv" is the only format that is consistent for multiple SPARQL back-ends (Virtuoso, BlazeGraph, ...)
   (sparql-query ... #:type "text/csv")
   (if (= (response-code header) 200) ; This means the query went OK.
     (call-some-function port)
     #f)) ; Deal with errors at the #f.

(define (call-some-function port)
   (let ((line (read-line port)))
     (if (eof-object? line)
       #t
       (begin
         (format #t "Line: ~a~%" line)
         ;; Tail-recurse until we have processed each line.
         (call-some-function port)))))

The SPARQLing-genomics code deals with more error codes, and processes the lines in a more useful way.
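
As an illustration of processing the lines instead of printing them, here is a minimal sketch that collects the CSV lines into a list and reduces each entity URI to its identifier. It is not the SPARQLing-genomics code; read-csv-lines and entity-id are hypothetical helper names, and a string port with two sample rows stands in for the port returned by sparql-query.

```scheme
(use-modules (ice-9 rdelim))  ; read-line

;; Hypothetical helper: read PORT to EOF and return its lines as a list,
;; dropping the carriage return that precedes each newline in the CSV.
(define (read-csv-lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons (string-trim-right line #\return) lines))))))

;; Hypothetical helper: keep only the text after the last slash of URI.
(define (entity-id uri)
  (let ((slash (string-rindex uri #\/)))
    (if slash
        (substring uri (+ slash 1))
        uri)))

;; Sample data standing in for the real response port.
(define sample
  (open-input-string
   (string-append "item\r\n"
                  "http://www.wikidata.org/entity/Q28114532\r\n"
                  "http://www.wikidata.org/entity/Q28114535\r\n")))

;; The first line is the CSV column header, so skip it with cdr.
(define ids (map entity-id (cdr (read-csv-lines sample))))
(format #t "~s~%" ids)
```

Building the list with cons and reversing it once at the end keeps the loop tail-recursive, like call-some-function above.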

Unfortunately this only took me one step further, as I ran into this
instead when trying to parse the port with (json->scm):

Backtrace:
            7 (apply-smob/1 #<catch-closure 9769550>)
In ice-9/boot-9.scm:
     705:2  6 (call-with-prompt _ _ #<procedure default-prompt-handle…>)
In ice-9/eval.scm:
     619:8  5 (_ #(#(#<directory (guile-user) 9759910>)))
In ice-9/boot-9.scm:
    2312:4  4 (save-module-excursion _)
   3831:12  3 (_)
In sdb-test.scm:
      24:1  2 (_)
In json/parser.scm:
    311:18  1 (json-read-number _)
    148:28  0 (read-number _)

json/parser.scm:148:28: In procedure read-number:
Throw to key `json-invalid' with args `(#<json-parser port:
#<input-output: string 98dea80>>)'.

Maybe this is a bug in (json)?

It looks like the JSON response is not (only) JSON, or simply invalid.
Maybe the "text/xml" or "text/csv" content-type will work better for you.  I noticed that each back-end provides its own structure for XML and JSON, so I used the somewhat quirky CSV format as a work-for-all response type.

I hope this helps.

You were right!

I debugged away and got this in the end after much trial and error:

$ env |grep ssl
GIT_SSL_CAINFO=/home/egil/.guix-profile/etc/ssl/certs/ca-certificates.crt
SSL_CERT_DIR=/home/egil/.guix-profile/etc/ssl/certs
$ guile --version
guile (GNU Guile) 2.2.4
$ guix --version
guix (GNU Guix) 0.16.0 <- installed from binary 0.16 on parabola.

$ guile -s test2.scm
Line: 1aa
Line: item
Line: http://www.wikidata.org/entity/Q28114532
Line: http://www.wikidata.org/entity/Q28114535
Line: http://www.wikidata.org/entity/Q28665865
Line: http://www.wikidata.org/entity/Q28792126
Line: http://www.wikidata.org/entity/Q30600575
Line: http://www.wikidata.org/entity/Q42442324
Line: http://www.wikidata.org/entity/Q43260736
Line: http://www.wikidata.org/entity/Q48895080
Line: http://www.wikidata.org/entity/Q49581026
Line: http://www.wikidata.org/entity/Q50378472
Line:
Line: 0
Line:
Backtrace:
           9 (apply-smob/1 #<catch-closure 18835c0>)
In ice-9/boot-9.scm:
705:2 8 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
    619:8  7 (_ #(#(#<directory (guile-user) 18f2140>)))
In ice-9/boot-9.scm:
   2312:4  6 (save-module-excursion _)
  3831:12  5 (_)
In test2.scm:
    51:14  4 (read #<input-output: string 1c22d20>)
In ice-9/rdelim.scm:
   195:24  3 (read-line _ _)
In unknown file:
           2 (%read-line #<input-output: string 1c22d20>)
In web/client.scm:
142:24 1 (read! #vu8(48 13 10 13 10 103 47 101 110 116 105 116 121 47 81 52 56 56 57 53 48 56 48 13 10 104 116 116 112 58 47 47 119 119 119 46 119 ?) ?)
In unknown file:
           0 (get-bytevector-some #<input-output: string 1cf51c0>)

ERROR: In procedure get-bytevector-some:
Throw to key `gnutls-error' with args `(#<gnutls-error-enum The TLS connection was non-properly terminated.> read_from_session_record_port)'.

Can anyone replicate this? (run the attachment)
Is this a bug in guile?
How do I ignore this error?
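
One possible reading of the output above: the lines "1aa", the empty lines and "0" are not CSV data but the framing of "Transfer-Encoding: chunked", which appears in the response headers printed earlier. Hex 1aa is 426, which matches the "item" line plus ten 40-character URIs, each followed by CRLF (6 + 10 × 42). Reading the raw port therefore picks up the framing and then keeps reading past the final zero-length chunk, which may be where the TLS error is raised. A minimal sketch of the difference, assuming response-body-port from (web response) is available (it is documented for Guile 2.2); canned-response, port->lines and parse-canned are hypothetical names, and a string port stands in for the real connection:

```scheme
(use-modules (web response)   ; read-response, response-port, response-body-port
             (ice-9 rdelim))  ; read-line

;; Canned response standing in for the real connection (an assumption for
;; illustration).  The body is one 10-byte chunk (hex "a") followed by the
;; terminating zero-length chunk.
(define canned-response
  (string-append "HTTP/1.1 200 OK\r\n"
                 "Content-Type: text/csv; charset=utf-8\r\n"
                 "Transfer-Encoding: chunked\r\n"
                 "\r\n"
                 "a\r\nitem\r\nQ1\r\n\r\n"
                 "0\r\n\r\n"))

;; Hypothetical helper: read PORT to EOF, returning its lines without the CR.
(define (port->lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons (string-trim-right line #\return) lines))))))

;; Parse the status line and headers; the port is left at the start of the body.
(define (parse-canned)
  (read-response (open-input-string canned-response)))

;; Raw port: chunk sizes and separators leak into the data.
(define raw-lines (port->lines (response-port (parse-canned))))

;; Decoded port: only the payload, and reading stops at the final chunk.
(define body-lines (port->lines (response-body-port (parse-canned))))

(format #t "raw:     ~s~%" raw-lines)
(format #t "decoded: ~s~%" body-lines)
```

With the canned data the raw list should contain "a", "", "0" and "" around the two payload lines, mirroring the stray lines in the output above, while the decoded list holds only "item" and "Q1". Whether this also avoids the gnutls error against the live server has not been verified here.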

--
Cheers Swedebugia

Attachment: test2.scm
Description: Text Data

