Re: HTTPURLConnection.connect() buffers its entire input.
From: Nic Ferrier
Subject: Re: HTTPURLConnection.connect() buffers its entire input.
Date: Fri, 09 Sep 2005 10:46:12 +0100
Chris Burdess <address@hidden> writes:
> David Daney wrote:
>> It seems that the current implementation of HTTPURLConnection.connect()
>> buffers the entire response before returning.
>>
>> Is that a correct analysis?
>
> Yes.
>
>> This can be problematical if the content is larger than the heap. It
>> is even worse than that as it makes a copy of the content, so the
>> content can only be half as large as the heap.
>>
>> Does anyone know the rationale behind doing it this way?
>
> Our implementation uses the inetlib HTTP client in order to leverage
> numerous HTTP features such as chunked and compressed transfer-codings,
> TLS, and HTTP 1.1.
>
> The design of the inetlib HTTP client is based on callbacks. You
> register a listener to receive notification of HTTP response data,
> rather than pulling the data yourself. This leaves the client in proper
> control of the stream and permits correct handling of HTTP persistent
> connections (reuse of the same TCP connection for multiple HTTP
> requests).
>
> The design of the URLConnection API is pull-based. Therefore we either
> have to buffer an entire response before returning, or use multiple
> threads, a pipe, and a much more complex implementation to manage
> cleanup of resources. Also note that with HTTP 1.1 chunked encoding,
> you can have headers after the response body, which is not something
> that most naive developers will expect. This means that in the
> non-buffered implementation you could have
>
> connection.getHeaderField("My-Header"); // null
> InputStream in = connection.getInputStream();
> // read from in until -1
> connection.getHeaderField("My-Header"); // non-null
>
> In practice I haven't seen this in many servers, but it is still a
> possibility.
>
> Tom Tromey and I have discussed the possibility of this non-buffered
> implementation and of a hybrid model which uses a heuristic based on
> the content length to decide which of these implementations to use, but
> we haven't really had time to thrash it all out yet.
>
> If you are dealing with streaming servers or with very large responses,
> you probably shouldn't be using the URLConnection API in any case -
> consider using the inetlib client directly as it will be more
> efficient.
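For illustration, the "multiple threads, a pipe" alternative Chris describes could look roughly like the sketch below. ResponseListener and FakePushClient are hypothetical stand-ins invented for this example; they are not the inetlib API. The point is only the shape: a producer thread runs the callback-driven client and writes into a pipe, while the caller pulls from the other end without the whole body ever being held in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical callback interface standing in for a push-style client API.
interface ResponseListener {
    void onData(byte[] buf, int off, int len) throws IOException;
    void onComplete() throws IOException;
}

// Hypothetical push-style client that "receives" a fixed body in 4-byte pieces.
class FakePushClient {
    private final byte[] body;

    FakePushClient(byte[] body) { this.body = body; }

    void fetch(ResponseListener listener) throws IOException {
        final int piece = 4;
        for (int off = 0; off < body.length; off += piece) {
            listener.onData(body, off, Math.min(piece, body.length - off));
        }
        listener.onComplete();
    }
}

class PipeBridge {
    // Returns a pull-style stream fed by a background thread that drives the client.
    static InputStream open(FakePushClient client) throws IOException {
        PipedInputStream in = new PipedInputStream(8192); // bounded buffer
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            try {
                client.fetch(new ResponseListener() {
                    public void onData(byte[] buf, int off, int len) throws IOException {
                        out.write(buf, off, len); // blocks while the pipe is full
                    }
                    public void onComplete() throws IOException {
                        out.close(); // reader sees -1 once the pipe drains
                    }
                });
            } catch (IOException e) {
                // Simplification: a failure looks like end-of-stream to the reader.
                // Reporting it properly is part of the cleanup complexity noted above.
                try { out.close(); } catch (IOException ignored) { }
            }
        }, "http-producer");
        producer.setDaemon(true);
        producer.start();
        return in;
    }

    // Convenience wrapper: pulls the whole stream and decodes it as UTF-8.
    static String fetchAsString(String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = open(new FakePushClient(bytes))) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            byte[] buf = new byte[16];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling PipeBridge.fetchAsString("some body") round-trips the text through the pipe. In a real implementation the hard parts are the ones this sketch skips: propagating errors to the reader, and stopping the producer when the reader closes the stream early.
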
I have spoken to Chris before about my own HTTP library, which uses
non-blocking IO. It would solve this problem, but it would also
require another thread (for the selector).
It does not yet have HTTP 1.1 features such as pipelining, though I will
add them if I get the time.
Nic
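
As a rough illustration of the hybrid model Chris mentions, a content-length heuristic might be sketched as follows. The 64 KiB threshold, the rule for unknown lengths, and all names here are assumptions made up for this example; nothing in the thread specifies them.

```java
// Hypothetical selector between the buffered and non-buffered implementations.
class TransferStrategy {
    enum Mode { BUFFERED, STREAMED }

    // Assumed threshold; a real implementation would need to tune or configure it.
    static final long BUFFER_LIMIT = 64 * 1024;

    // contentLength < 0 means the length is unknown, e.g. a chunked response.
    static Mode choose(long contentLength) {
        if (contentLength >= 0 && contentLength <= BUFFER_LIMIT) {
            return Mode.BUFFERED; // small and known: buffering is cheap
        }
        return Mode.STREAMED;     // large or unknown: avoid holding it all in the heap
    }
}
```

With this rule, TransferStrategy.choose(1024) selects BUFFERED, while an unknown length (-1) or a 10 MB response selects STREAMED. Treating unknown lengths as streamed is one possible choice; it means chunked responses would still show the late-header behaviour described in the quoted text.
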