Re: HTTP Request/Response questions


From: Ian Price
Subject: Re: HTTP Request/Response questions
Date: Sun, 06 Nov 2011 11:18:58 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

"R. P. Dillon" <address@hidden> writes:

> I'm currently working on a project to gather RSS data using Guile.  I've been
I've done that. I highly recommend sxpath for this job.

> working with both the stable 2.0.3 version and the latest git repository.  I'm
> fairly new to Guile, though, so I might be approaching this the wrong way.
>
> As a test, I wanted to make an HTTP request.  This is a series of commands I
> executed in the REPL to accomplish this (using Geiser in Emacs 24):
>
> (use-modules (web request) (web response) (web uri) (rnrs bytevectors))
>
> (define port (socket PF_INET SOCK_STREAM 0))
> (define address (addrinfo:addr (car (getaddrinfo "www.google.com" "http"))))
> (connect port address)
> (define request (build-request (build-uri 'http #:host "www.google.com")))
> (write-request request port)
> (define response (read-response port))
>
> (read-response ...) consistently fails with Google:
>
> web/http.scm:754:6: In procedure parse-asctime-date:
> web/http.scm:754:6: Throw to key `bad-header' with args `(date "-1")'.
I can confirm this with
(call-with-input-string "Date: -1\r\n\r\n" read-headers)

>
> The expiration is set to -1 in the headers, and this seems to cause a problem
> for the web libraries in Guile.
This is not, IIRC, a valid date value for that header, but is it a common
one? If so, it may be worth making an exception for it.

> This same request seems to work well for my own domain (killring.org).
>
> I attempted a very similar series of commands to get RSS data for Google News:
>
> (define port (socket PF_INET SOCK_STREAM 0))
> (define address (addrinfo:addr (car (getaddrinfo "news.google.com" "http"))))
> (connect port address)
> (define request (build-request (build-uri 'http #:host "news.google.com"
> #:path "/news?pz=1&cf=all&ned=us&hl=en&output=rss")))
> (write-request request port)
> (define response (read-response port))
> (define body-vec (read-response-body response))
>
> In this case, the (read-response-body ...) returns #f, although when I pulled
> the data manually, there was XML data present in the body of the response.
I have also experienced this problem. read-response-body returns #f if
there is no content-length header, which usually means chunked
encoding.

I have a patch to deal with this, but I have not received any
feedback on my proposed functions, so I haven't posted it
yet. Basically, I wanted to add four functions, including a
read-chunked-response-body, and to have (web client) handle
chunked encoding transparently.
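
To make the chunked case concrete, here is a minimal sketch of such a
reader. It is not the patch mentioned above: the names read-chunk-size,
bytevector-concat and read-chunked-body are hypothetical helpers written
for illustration, and the sketch assumes the port is positioned just
after the response headers. Chunk extensions and trailers are skipped
rather than interpreted.

```scheme
(use-modules (ice-9 rdelim)          ; read-line
             (ice-9 binary-ports)    ; get-bytevector-n
             (rnrs bytevectors))     ; make-bytevector, bytevector-copy!

;; Hypothetical helper: read one chunk header line and return the chunk
;; size. The header is a hexadecimal length, optionally followed by
;; ";extension", which is ignored here.
(define (read-chunk-size port)
  (let ((line (read-line port)))
    (when (eof-object? line)
      (error "unexpected end of input in chunk header"))
    (let* ((line (string-trim-right line #\return))
           (semi (string-index line #\;))
           (hex  (string-trim-both (if semi (substring line 0 semi) line))))
      (or (string->number hex 16)
          (error "bad chunk header" line)))))

;; Hypothetical helper: join a list of bytevectors into a single one.
(define (bytevector-concat bvs)
  (let ((out (make-bytevector (apply + (map bytevector-length bvs)))))
    (let loop ((bvs bvs) (pos 0))
      (if (null? bvs)
          out
          (let ((len (bytevector-length (car bvs))))
            (bytevector-copy! (car bvs) 0 out pos len)
            (loop (cdr bvs) (+ pos len)))))))

;; Read chunks until the zero-length terminator and return the whole
;; body as one bytevector.
(define (read-chunked-body port)
  (let loop ((chunks '()))
    (let ((size (read-chunk-size port)))
      (if (zero? size)
          (begin
            ;; Skip any trailer lines up to the final blank line.
            (let skip ((line (read-line port)))
              (unless (or (eof-object? line)
                          (string-null? (string-trim-right line #\return)))
                (skip (read-line port))))
            (bytevector-concat (reverse chunks)))
          (let ((chunk (get-bytevector-n port size)))
            (unless (and (bytevector? chunk)
                         (= (bytevector-length chunk) size))
              (error "truncated chunk" size))
            (read-line port)          ; consume the CRLF after the chunk data
            (loop (cons chunk chunks)))))))
```

With the headers already consumed by read-response, something like
(read-chunked-body (response-port response)) would then stand in for
read-response-body when the transfer-encoding header says chunked.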

>
> Similarly, when getting RSS information from Slashdot:
>
> (define port (socket PF_INET SOCK_STREAM 0))
> (define address (addrinfo:addr (car (getaddrinfo "rss.slashdot.org" "http"))))
> (connect port address)
> (define request (build-request (build-uri 'http #:host "rss.slashdot.org"
> #:path "/Slashdot/slashdot")))
> (write-request request port)
> (define response (read-response port))
>
> I get the following error when reading the response:
>
> web/http.scm:814:12: In procedure parse-entity-tag:
> web/http.scm:814:12: Throw to key `bad-header' with args `(qstring
> "F+oOJMkOlp2n1IUbAJmq+7qCGuk")'.
>
> which I haven't fully tracked down yet.
I came across this issue already, and in my case it was because some servers
(gws, I think) don't quote their Etags. Feedburner was a common
culprit. All in all, not common, but a nuisance. Using 'declare-header!'
from the (web http) library, you can cause Etags not to be parsed by doing

(declare-header! "Etag" values string? display)

Although I'd think it much nicer if Guile were to expose
declare-opaque-header! directly for these sorts of circumstances.
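
As a rough sketch of what that convenience could look like in user code,
the wrapper below builds on declare-header! from (web http). The name
declare-raw-header! is made up for this example (chosen so it cannot
clash with anything Guile itself defines), and the sketch assumes that
declare-header! takes a parser, a validator and a writer, as in the call
above.

```scheme
(use-modules (web http))

;; Hypothetical helper: declare NAME as a header whose value is kept as
;; the raw string, with no parsing or validation beyond string?.
(define (declare-raw-header! name)
  (declare-header! name
                   (lambda (str) str)                       ; parser: identity
                   string?                                  ; validator
                   (lambda (val port) (display val port)))) ; writer

;; Unquoted entity tags are then passed through untouched.
(declare-raw-header! "ETag")
```

After this, read-response should hand back the ETag value as a plain
string instead of throwing bad-header. The cost is that well-formed,
quoted ETags are no longer parsed either.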

>
> I have a feeling I'm using the API incorrectly, though I've pored over the
> documentation the best I can to figure out how to make these requests and
> parse the responses.  Short of writing my own implementation, is there
> anything I should be doing to make this work?
No no, you're using it right :) Although the (web client) module will be
more convenient usually. For example,

scheme@(guile-user)> ,use (web client)
scheme@(guile-user)> http-get
$11 = #<procedure http-get (uri #:key port version keep-alive? extra-headers
decode-body?)>
scheme@(guile-user)> (http-get (string->uri "http://www.google.com"))
$12 = #<<response> version: (1 . 1) code: 302 reason-phrase: "Found" headers:
((location . #<<uri> scheme: http userinfo: #f host: "www.google.co.uk" port:
#f path: "/" query: #f fragment: #f>) (cache-control private) (content-type
text/html (charset . "UTF-8")) (set-cookie .
"PREF=ID=3c2c9fc50c288823:FF=0:TM=1320578334:LM=1320578334:S=Gtrhd05V1tRopJyZ;
expires=Tue, 05-Nov-2013 11:18:54 GMT; path=/; domain=.google.com") (date .
#<date nanosecond: 0 second: 54 minute: 18 hour: 11 day: 6 month: 11 year: 2011
zone-offset: 0>) (server . "gws") (content-length . 221) (x-xss-protection .
"1; mode=block") (x-frame-options . "SAMEORIGIN") (connection close)) port:
#<closed: file 0>>
$13 = "<HTML><HEAD><meta http-equiv=\"content-type\"
content=\"text/html;charset=utf-8\">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF=\"http://www.google.co.uk/\">here</A>.\r
</BODY></HTML>\r
"
scheme@(guile-user)>

>
> Thanks,
> Rick
>

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"


