[cp-patches] Re: Absolute URL parsing bug
From: Per Bothner
Subject: [cp-patches] Re: Absolute URL parsing bug
Date: Sat, 02 Jul 2005 10:08:35 -0700
User-agent: Mozilla Thunderbird 1.0.2-6 (X11/20050513)

Andrew Haley wrote:
Per Bothner writes:
> Andrew Haley wrote:
> > [ An absolute file URL can look like:
> >
> > absoluteURI = "file" ":" abs_path
> > abs_path = "/" path_segments
> >
> > That is, there is no need for "//".
>
> I don't see that in any of the specs.
I got it from RFC 2396, which I might have read wrongly, of course.

I read more carefully. In practice, it looks like "//" is optional.
I.e. "file:///foo/bar" is the same as "file:/foo/bar". One
subtle difference is that the former has an empty authority, while the
latter has a missing (undefined) authority. Thus resolving the relative
URIs "/tmp/help" vs "//gcc.gnu.org/help" against the base URI
"file://gnu.org/foo/" will yield "file://gnu.org/tmp/help" vs
"file://gcc.gnu.org/help" respectively. In practice, people seldom
use non-empty authority components with file: URIs and I'm not sure it's
well-defined. Perhaps when using NFS?
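
As an illustration only, the effect of a missing versus a present authority in a relative reference can be checked with java.net.URI. The class name AuthorityResolveSketch and the helper resolve are made up for this sketch, and the third reference ("help", a plain relative path) is an extra case added for contrast. It shows what the JDK's URI.resolve does, which need not be what any other implementation does.

```java
import java.net.URI;

final class AuthorityResolveSketch {
    /** Resolves a reference against a base URI and returns the result as text. */
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "file://gnu.org/foo/";

        // No authority in the reference: the base authority is kept and
        // the absolute path replaces the base path.
        System.out.println(resolve(base, "/tmp/help"));          // file://gnu.org/tmp/help

        // The reference carries its own authority: it replaces the base
        // authority, and the base path is not used at all.
        System.out.println(resolve(base, "//gcc.gnu.org/help")); // file://gcc.gnu.org/help

        // Relative path: merged with the base path.
        System.out.println(resolve(base, "help"));               // file://gnu.org/foo/help
    }
}
```

Only the relative-path reference keeps the "/foo/" directory of the base; a reference starting with "/" or "//" discards it.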
> Technically, "file:/tmp/foo.html" is not a valid URI, as far as I
> can tell. I notice that firefox rewrites it (in the navigation
> bar) to "file:///tmp/foo.html".
>
> Now in practice we may want to allow "file:/tmp/foo.html", but it should
> be viewed as an unofficial short-hand for "file:///tmp/foo.html".
>
> > And indeed, the URL spec in the
> > SDK docs says 'If the spec's path component begins with a slash
> > character "/" then the path is treated as absolute...' ]
>
> The *path* is absolute, but the URI isn't.
>
> This matters when resolving a relative URL against a base URI, such as
> the URL of the containing document.
>
> If we have a base URI "http://bar.com/baz/index.html" and a reference
> "/tmp/foo.html" then the resolved URI is "http://bar.com/tmp/foo.html".
>
> > But we parse the spec looking for "//" to determine if a URL is
> > absolute,
>
> A URL is absolute *only* if it has a "scheme".
I don't really understand what you're suggesting.

My comments are mostly applicable to the URI class rather than the URL
class. (I'm not sure why they're separate. I believe all URLs are
URIs according to the IETF, but not according to the Java class
hierarchy.) The URI class has a bunch of methods like resolve and
isAbsolute, which require more care about these fine points.

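
The distinction between an absolute path and an absolute URI can be made concrete with those two methods. This is a minimal sketch: the class name AbsoluteUriSketch and its helper methods are hypothetical, and the URIs are the ones from the quoted example.

```java
import java.net.URI;

final class AbsoluteUriSketch {
    /** True only when the reference carries a scheme such as "file:" or "http:". */
    static boolean isAbsoluteUri(String reference) {
        return URI.create(reference).isAbsolute();
    }

    /** True when the path starts with "/", whether or not a scheme is present. */
    static boolean hasAbsolutePath(String reference) {
        String path = URI.create(reference).getPath(); // null for opaque URIs
        return path != null && path.startsWith("/");
    }

    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // Absolute path, but no scheme, so not an absolute URI.
        System.out.println(hasAbsolutePath("/tmp/foo.html"));     // true
        System.out.println(isAbsoluteUri("/tmp/foo.html"));       // false

        // A scheme makes the URI absolute, with or without "//".
        System.out.println(isAbsoluteUri("file:/tmp/foo.html"));  // true

        String base = "http://bar.com/baz/index.html";
        // The schemeless reference inherits scheme and authority from the base.
        System.out.println(resolve(base, "/tmp/foo.html"));       // http://bar.com/tmp/foo.html
        // The absolute URI is returned unchanged.
        System.out.println(resolve(base, "file:/tmp/foo.html"));  // file:/tmp/foo.html
    }
}
```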
Would it be OK to
special-case "file" URIs so that "file:/" is rewritten to "file:///"?

Maybe for the URL class, but I don't think so for the URI class.
The difference is the getAuthority method. My reading is that it should
return null for "file:/" and "" for "file:///". But I haven't tested
what the JDK does, and it wouldn't surprise me if the URI and URL
classes are different in this respect.
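
One way to settle the open question is a small probe that prints what each class reports. The sketch below is hypothetical (the class AuthorityProbe and its methods are invented for illustration) and deliberately states no expected output for the two "file:" spellings, since the results depend on the implementation being tested.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class AuthorityProbe {
    /** Makes the null / empty / non-empty distinction visible in printed output. */
    static String describe(String authority) {
        if (authority == null) {
            return "undefined (null)";
        }
        if (authority.isEmpty()) {
            return "empty string";
        }
        return "\"" + authority + "\"";
    }

    /** Reports getAuthority() as seen by both java.net.URI and java.net.URL. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs
    static String probe(String spec) {
        try {
            String viaUri = describe(new URI(spec).getAuthority());
            String viaUrl = describe(new URL(spec).getAuthority());
            return spec + " -> URI: " + viaUri + ", URL: " + viaUrl;
        } catch (URISyntaxException | MalformedURLException e) {
            return spec + " -> could not be parsed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("file:/tmp/foo.html"));   // no "//" at all
        System.out.println(probe("file:///tmp/foo.html")); // "//" with nothing after it
        System.out.println(probe("file://gnu.org/foo/"));  // non-empty authority
    }
}
```

Running the same probe on two runtimes side by side shows directly whether they agree on the two spellings, and whether URI and URL agree with each other.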
--
--Per Bothner
address@hidden http://per.bothner.com/