classpath-patches

[cp-patches] Re: Absolute URL parsing bug


From: Per Bothner
Subject: [cp-patches] Re: Absolute URL parsing bug
Date: Sat, 02 Jul 2005 10:08:35 -0700
User-agent: Mozilla Thunderbird 1.0.2-6 (X11/20050513)

Andrew Haley wrote:
Per Bothner writes:
 > Andrew Haley wrote:
 > > [ An absolute file URL can look like:
 > >
 > >    absoluteURI = "file" ":" abs_path
 > >    abs_path = "/" path_segments
 > >
 > > That is, there is no need for "//".
 >
 > I don't see that in any of the specs.

I got it from RFC 2396.  Which I might have read wrongly, of course.

I read more carefully. In practice, it looks like "//" is optional, i.e. "file:///foo/bar" is the same as "file:/foo/bar". The subtle difference is that the former has an empty authority, while the latter has a missing (undefined) authority. Thus resolving the relative URIs "/tmp/help" vs "//gcc.gnu.org/help" against the base URI "file://gnu.org/foo/" will yield "file://gnu.org/tmp/help" vs "file://gcc.gnu.org/help" respectively. In practice, people seldom use non-empty authority components with file: URIs, and I'm not sure it's well-defined. Perhaps when using NFS?
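
A rough sketch of that resolution rule, assuming java.net.URI follows the reference-resolution algorithm of RFC 2396 section 5.2. The class name ResolveDemo and the helper resolve are made up for this illustration:

```java
import java.net.URI;

public class ResolveDemo {
    // Resolve a relative reference against a base URI and return the result as text.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        String base = "file://gnu.org/foo/";
        // The reference has no authority, so the base authority is inherited.
        System.out.println(resolve(base, "/tmp/help"));
        // The reference carries its own authority, which replaces the base one.
        System.out.println(resolve(base, "//gcc.gnu.org/help"));
    }
}
```

In both cases the path comes from the reference, because each reference has an absolute path; only the authority is treated differently.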


 > Technically, "file:/tmp/foo.html" is not a valid URI, as far as I
 > can tell.  I notice that firefox rewrites it (in the navigation
 > bar) to "file:///tmp/foo.html".
 >
 > Now in practice we may want to allow "file:/tmp/foo.html", but it should
 > be viewed as an unofficial short-hand for "file:///tmp/foo.html".
 >
 > > And indeed, the URL spec in the
 > > SDK docs says 'If the spec's path component begins with a slash
 > > character "/" then the path is treated as absolute...' ]
 >
 > The *path* is absolute, but the URI isn't.
 >
 > This matters when resolving a relative URL against a base URI, such as
 > the URL of the containing document.
 >
 > If we have a base URI "http://bar.com/baz/index.html" and a reference
 > "/tmp/foo.html" then the resolved URI is "http://bar.com/tmp/foo.html".
 >
 > > But we parse the spec looking for "//" to determine if a URL is
 > > absolute,
 >
 > A URL is absolute *only* if it has a "scheme".

I don't really understand what you're suggesting.

My comments are mostly applicable to the URI class rather than the URL class. (I'm not sure why they're separate. I believe all URLs are URIs, according to the IETF, but not according to the Java class hierarchy.) The URI class has a bunch of methods like resolve and isAbsolute, which require more care about these fine points.
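
To make the distinction between an absolute path and an absolute URI concrete, here is a minimal sketch using java.net.URI. The class name AbsoluteDemo and the helper hasScheme are invented for this example; the URIs are the ones quoted earlier in the thread:

```java
import java.net.URI;

public class AbsoluteDemo {
    // java.net.URI reports a URI as absolute exactly when it has a scheme component.
    static boolean hasScheme(String s) {
        return URI.create(s).isAbsolute();
    }

    public static void main(String[] args) {
        // The path starts with "/", but there is no scheme, so the URI is relative.
        System.out.println(hasScheme("/tmp/foo.html"));
        // A scheme is present, so the URI is absolute even without "//".
        System.out.println(hasScheme("file:/tmp/foo.html"));
        // A relative reference with an absolute path keeps the base scheme and host.
        URI base = URI.create("http://bar.com/baz/index.html");
        System.out.println(base.resolve("/tmp/foo.html"));
    }
}
```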

Would it be OK to special-case "file" URIs so that "file:/" is rewritten to "file:///"?

Maybe for the URL class, but I don't think so for the URI class.
The difference is the getAuthority method. My reading is that it should return null for "file:/" and "" for "file:///". But I haven't tested what the JDK does, and it wouldn't surprise me if the URI and URL classes are different in this respect.
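
A small probe along these lines would show what a given JDK returns from URI.getAuthority. This is only a sketch: the class name AuthorityDemo and the helper describeAuthority are hypothetical, and the result for the "file:///" form is deliberately not stated here, since it depends on the implementation under test:

```java
import java.net.URI;

public class AuthorityDemo {
    // Render the authority so that null and the empty string can be told apart.
    static String describeAuthority(String s) {
        String authority = URI.create(s).getAuthority();
        return authority == null ? "null" : "\"" + authority + "\"";
    }

    public static void main(String[] args) {
        String[] samples = {
            "file:/tmp/foo.html",    // no "//", so no authority component at all
            "file:///tmp/foo.html",  // "//" present with nothing before the path
            "file://gnu.org/foo/"    // explicit, non-empty authority
        };
        for (String s : samples) {
            System.out.println(s + " -> " + describeAuthority(s));
        }
    }
}
```

Running the same strings through java.net.URL would be the natural follow-up, to check whether the two classes agree.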
--
        --Per Bothner
address@hidden   http://per.bothner.com/



