I tried to write a module for HTTP download tools. When I finished it, I realized that some of the procedures could be part of module/web/client.scm.
So I formatted a patch.
http-get-uri-head ==> get the response struct using the HTTP 'HEAD' method. The head is useful for getting information about the remote file.
http-client-get-block-from-uri ==> get a response body whose size is 'block', which can be specified by the user. If you don't specify 'block',
it will be the length of the target file.
http-client-get-ready-to-continue ==> returns (values pos fd)
If part of the download target already exists locally, pos will point to the position where the download broke off.
fd is the file port of the local target.
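To make the (values pos fd) idea concrete, here is a minimal sketch of how a resume position could be derived from a partial local file. It is not the code from the patch: the names resume-position and open-for-resume are hypothetical, and it assumes the broken position is simply the current size of the local file.

```scheme
;; Hypothetical helpers, not part of the patch.
;; Assumption: the resume position equals the size of the partial file.
(define (resume-position path)
  ;; Number of bytes already stored locally, or 0 if nothing exists yet.
  (if (file-exists? path)
      (stat:size (stat path))
      0))

(define (open-for-resume path)
  ;; Return (values pos port); the port is opened in binary append mode
  ;; so that new data lands after the bytes already downloaded.
  (let ((pos (resume-position path)))
    (values pos (open-file path "ab"))))
```

A caller would receive both values with call-with-values and start requesting data at pos.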
The remaining procedures may be useful. To avoid downloading a file incorrectly, I think a checksum or MD5 is useful, so I just used the ETag that
is part of the HTTP protocol. Each target file gets a "filename.etag" file containing the most recent ETag. When continuing a download, one may use these procedures to check the ETag.
http-client-get-check-string ==> get the ETag string from "filename.etag"
http-client-checkout-etag ==> check whether the ETag from HEAD and the ETag in "filename.etag" are equal.
http-client-remove-check-file ==> when you get a different ETag, you may delete the "filename.etag"
http-client-etag-stamp ==> when you first download a file, this procedure generates "filename.etag";
after the download has finished, you need to delete "filename.etag".
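The ETag bookkeeping described above can be sketched with plain file operations. This is only an illustration under my own assumptions, not the implementation in the patch: save-etag, saved-etag, etag-matches? and remove-etag are hypothetical names, and the ETag is assumed to be stored as a single line of text in "filename.etag".

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical helpers, not part of the patch.
(define (etag-file path)
  (string-append path ".etag"))

(define (save-etag path etag)
  ;; Write ETAG as one line into "PATH.etag".
  (call-with-output-file (etag-file path)
    (lambda (port)
      (display etag port)
      (newline port))))

(define (saved-etag path)
  ;; Return the stored ETag string, or #f if there is none.
  (let ((file (etag-file path)))
    (and (file-exists? file)
         (call-with-input-file file
           (lambda (port)
             (let ((line (read-line port)))
               (if (eof-object? line) #f line)))))))

(define (etag-matches? path current-etag)
  ;; #t only when a stored ETag exists and equals CURRENT-ETAG.
  (let ((old (saved-etag path)))
    (and old (string=? old current-etag))))

(define (remove-etag path)
  (let ((file (etag-file path)))
    (when (file-exists? file)
      (delete-file file))))
```

If etag-matches? returns #f for the ETag taken from a fresh HEAD response, the remote file has probably changed and the partial download should be restarted rather than continued.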
Here's a simple 'continue-to-download' example showing how to use these procedures:
----------------------------------code begin---------------------------------------
(define* (http-client-retrive-file-continue uri #:key (path (uri-path uri))
                                            (try 5))
  (let ([head (http-get-uri-head uri)])
    (call-with-values
        (lambda ()
          (http-client-get-ready-to-continue uri #:path path #:head head))
      (lambda (pos port)
        (catch #t
          (lambda ()
            (if (zero? pos)
                (begin
                  (display "download from beginning\n")
                  (http-client-etag-stamp uri #:path path))
                (format #t "continue from position ~a~%" pos))
            ;; Fetch 4096-byte blocks until no more data comes back.
            (let lp ([data (http-client-get-block-from-uri
                            uri #:start pos #:head head #:block 4096)]
                     [pos pos])
              (if data
                  (let* ([dl (bytevector-length data)]
                         [new-pos (+ pos dl)])
                    (put-bytevector port data)
                    (force-output port)
                    (format #t "~a-~a~%" pos new-pos)
                    (lp (http-client-get-block-from-uri
                         uri #:start new-pos #:head head #:block 4096)
                        new-pos))
                  (format #t "~a has already been done!~%" path))))
          (lambda e
            (case (car e)
              ((system-error)
               (let ([E (system-error-errno e)])
                 ;; Retry only on connection errors, and only while
                 ;; there are tries left.
                 (if (and (> try 0)
                          (or (= E ECONNABORTED)
                              (= E ECONNREFUSED)
                              (= E ECONNRESET)))
                     (begin
                       (format #t "~a, try again!~%left ~a times to try~%"
                               (car (cadddr e)) try)
                       (close port)
                       (http-client-retrive-file-continue uri #:path path
                                                          #:try (1- try))))))
              (else
               (display "some error occurred!\n")
               (newline)
               (format #t "~a : ~a~%" (car e) (cdr e))))))))))
---------------------code end-----------------------------
And you may try this:
#:path "mmr.tar.bz2" #:try 10)
#:path can be used to specify the local target. If you omit it, the original file name is used.
#:try is the number of times to try. When try decreases to 0 and the download is still unfinished, it'll quit anyway.
Besides, one may use these procedures to build one's own thread-based download tools, say, split the remote file into blocks and use 10 threads to download them separately.
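As a rough sketch of the splitting step for such a tool, the procedure below divides a file of a known length into inclusive byte ranges, one per worker. The name split-ranges is hypothetical and not part of the patch, and it assumes the total length is at least as large as the number of parts.

```scheme
;; Hypothetical helper, not part of the patch.
;; Split TOTAL bytes into PARTS inclusive (start . end) ranges.
;; Assumes TOTAL >= PARTS; the last range absorbs any remainder.
(define (split-ranges total parts)
  (let ((size (quotient total parts)))
    (let lp ((i 0) (acc '()))
      (if (= i parts)
          (reverse acc)
          (let ((start (* i size))
                (end (if (= i (1- parts))
                         (1- total)
                         (1- (* (1+ i) size)))))
            (lp (1+ i) (cons (cons start end) acc)))))))

;; (split-ranges 100 4) => ((0 . 24) (25 . 49) (50 . 74) (75 . 99))
```

Each pair could then be handed to a thread that downloads its range, for example by starting at the first position and stopping after the last.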