[Monotone-devel] url schemes
From: Markus Schiltknecht
Subject: [Monotone-devel] url schemes
Date: Sat, 22 Mar 2008 16:19:03 +0100
User-agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080109)
Hi,
since I've been critiquing Timothy's current extensions of the URL
scheme, I think I need to try coming up with something better. Or at
least help in doing so. First of all, I've put together a list of URL
schemes we are using in and around monotone, including nuskool, which
probably is what we will use someday.
The first part of the URL obviously encodes the protocol and the
database location. Existing examples are:
* file:/path/to/monotone/db.mtn
* ssh://host[:port]/path/to/monotone/db.mtn
And for mtndumb, we already have:
* http[s]://host[:port]/path/to/repo
* [s]ftp://[user[:address@hidden:port]/path/to/repo
* file:/path (or file:///path??)
Upcoming URLs to specify a database location might be:
* mtn://host[:port] (as proposed for netsync)
* http://host[:port]/path/to/scgi (as in nuskool)
* xmpp:[//address@hidden/address@hidden
(as recently proposed on IRC - might somehow
work with nuskool, someday)
* pgsql://user:address@hidden:port/database/schema
(pipe dreaming...)
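To make the "first part" concrete, here is a minimal sketch that splits a few of the URLs above into protocol, host, port and path using Python's standard urllib.parse. The host name example.org and the port 2222 are placeholders, not anything monotone prescribes, and describe() is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def describe(url):
    """Split a repository URL into the pieces discussed above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # protocol, e.g. "file", "ssh", "mtn"
        "host": parts.hostname,   # None when the URL has no authority part
        "port": parts.port,       # None when no port is given
        "path": parts.path,       # database location plus any trailing components
    }

# example.org and port 2222 are placeholder values for illustration only.
for url in (
    "file:/path/to/monotone/db.mtn",
    "file:///path/to/monotone/db.mtn",
    "ssh://example.org:2222/path/to/monotone/db.mtn",
    "mtn://example.org",
):
    print(url, "->", describe(url))
```

With this parser, file:/path and file:///path yield the same path component, so either spelling could be accepted.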
Often enough, specifying a database isn't enough, because we want to
address only parts of the repository, i.e. only a certain branch, only a
revision or even only a single file delta.
Almost all of the above protocols support additional slashes and more
path components after the database. The only exception is pgsql,
which isn't really much of a standard URL scheme anyway, AFAICT. (In
case of an underlying filesystem - i.e. file and ssh - it should be
possible to walk down the path and use the first monotone database or
monotone dumb data directory you find. That would only prevent you from
accessing a monotone database file within a dumb data directory, but
that wouldn't make much sense anyway.)
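The "walk down the path" idea could be sketched roughly as follows. How a database or dumb data directory is actually recognised is left to a caller-supplied predicate; split_db_and_rest() and is_repository are hypothetical names, and the ".mtn" suffix check in the usage example is only a stand-in for a real test.

```python
from pathlib import PurePosixPath

def split_db_and_rest(path, is_repository):
    """Return (db_path, rest) for the first prefix of `path` that
    is_repository() accepts, or None if no prefix matches."""
    parts = PurePosixPath(path).parts
    for i in range(1, len(parts) + 1):
        candidate = PurePosixPath(*parts[:i])
        if is_repository(candidate):
            # Everything after the repository is the "rest" of the URL.
            return str(candidate), "/".join(parts[i:])
    return None

# Stand-in predicate (an assumption): treat any "*.mtn" component as a database.
looks_like_db = lambda p: p.suffix == ".mtn"

print(split_db_and_rest("/srv/mtn/db.mtn/branch/net.venge.monotone", looks_like_db))
```

Because the walk stops at the first match, anything nested below a matching directory is unreachable, which is the limitation mentioned above.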
Most protocol types also support an argument list (introduced by ? and
separated by &) - but not all of them. Exceptions are the dumb ones,
which cannot parse arguments, because there's no clever server to
process them. For pgsql, arguments are often used to specify options for
the database connection, but as mentioned above, it's not really a
standard - we could certainly use some monotone-specific arguments, if
needed.
Now, the question which started that discussion is: what should the rest
of the URL look like? IMO, we should take a look at existing and planned
use cases, then take care that they don't conflict with each other.
The only existing rest-URL-scheme is from mtndumb. However, that one
uses a rather meaningless scheme to retrieve data from a repository. It
looks like it was designed to resemble the merkle trie, while still
providing a good compromise on the number of round trips required:
$DB/DATA
$DB/HASHES_
$DB/HASHES_?? (multiple times, where ?? are the first two hex chars)
...
Then, there are the planned nuskool commands. Those are currently
encoded entirely in JSON. The HTTP client requests the same URL every
time, and encodes the query in JSON. ATM nuskool doesn't support branch
inclusion or exclusion patterns. The commands currently are:
* inquiring revisions: asks the server if it has certain revisions
* getting descendants: querying the ancestry map of the server
* getting (pulling) a revision
* putting (pushing) a revision
* getting file data
* putting file data
* getting file delta
* putting file delta
These are current facts and observations, or am I missing something
important?
Then, there are wishes and feature requests. I personally find the
following ones very compelling:
* mtn itself should be able to talk to dumb servers
* it should be possible to do checkouts from remote databases
* mtn should feature a simple API for 3rd party tools
* faster and firewall compatible protocol (covered by nuskool)
Taking all of that together, to me this smells very much like we need a
RESTful API. One which is easy to read, understand and remember, simple
to process and universally usable for all supported protocols (as far as
possible). What I have in mind would look somewhat like this:
* GET $DB/capabilities: inquire capabilities of that mtn repository
(i.e. if arguments are supported or not)
* GET/PUT $DB/revision/$HASH/data: pull or push a revision
* GET/PUT $DB/file_data/$HASH: pull or push file data
* GET/PUT $DB/file_delta/$HASH: pull or push file delta
* GET $DB/branch/$BRANCHNAME/heads: get heads of $BRANCHNAME
* GET $DB/revision/$HASH/inquire: inquire *one* revision
* GET $DB/revision/$HASH/descendants: fetch descendants of a revision
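The list above can be written down as a small routing table. This is only a sketch of how a server or client might match the part of the URL after $DB against these resources; ROUTES and match() are hypothetical names and nothing here is taken from existing monotone or nuskool code.

```python
# Each entry: (path pattern after $DB, allowed methods).
# Segments starting with "$" are placeholders, as in the list above.
ROUTES = [
    (("capabilities",), {"GET"}),
    (("revision", "$HASH", "data"), {"GET", "PUT"}),
    (("file_data", "$HASH"), {"GET", "PUT"}),
    (("file_delta", "$HASH"), {"GET", "PUT"}),
    (("branch", "$BRANCHNAME", "heads"), {"GET"}),
    (("revision", "$HASH", "inquire"), {"GET"}),
    (("revision", "$HASH", "descendants"), {"GET"}),
]

def match(method, rest):
    """Return (pattern, params) for the first matching route, else None."""
    segments = tuple(s for s in rest.split("/") if s)
    for pattern, methods in ROUTES:
        if len(pattern) != len(segments) or method not in methods:
            continue
        params = {}
        for want, got in zip(pattern, segments):
            if want.startswith("$"):
                params[want[1:]] = got   # capture placeholder value
            elif want != got:
                break                    # literal segment differs
        else:
            return pattern, params
    return None

print(match("GET", "branch/net.venge.monotone/heads"))
print(match("PUT", "branch/net.venge.monotone/heads"))  # not allowed: None
```

A dumb transport that only offers plain get and put could use the same table to build paths, without any server-side parsing.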
This might appear http-centric, but think about it: ftp, file and ssh,
maybe even xmpp, all of these provide put and get methods in a way.
(Even if pushing to dumb servers might not work - at least not without
some additional processing on the server side. Or maybe with proper
authentication support, so clients can update meta data on the dumb
server?) And since http is about the best-known protocol, what's bad
about being http-centric? ;-)
For browsable protocols which support index files (like http[s] and
ftp[s]) we could offer those for the following URLs:
* GET $DB/: a listing of branches in the repo, general purpose
repository information and statistics, etc.
* GET $DB/revision/$HASH/: a browsable directory tree
* GET $DB/branch/$BRANCHNAME/: some branch information, maybe a graph
with the most recent revisions, links to the branch
heads and to sub-branches
And others, but you get the point...
What's important for me is that these URL schemes should be compatible
with one another. I would find it a wasted opportunity if we now specified:
$DB/$BRANCHNAME[?$PATTERNS]
..or similar for the mtn (i.e. netsync) protocol, because it certainly
conflicts with future extensions for other protocols.
While the following is longer and more to type, it's certainly more
cross-protocol compatible and wouldn't prevent future extensions:
$DB/branch/$BRANCHNAME[?$PATTERNS]
In other words: omitting that "branch" in between there would restrict
us from providing other resources, or force us to use different URL
schemes for different protocols, i.e.:
$DB/$BRANCHNAME for mtn://
but:
$DB/branch/$BRANCHNAME for http://
..which would certainly confuse people.
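The conflict can be illustrated with a toy example. The set of resource kinds below is taken from the proposed list earlier in this mail; the branch name "revision" is a made-up worst case, and both functions are hypothetical helpers written only to show the ambiguity.

```python
# Resource kinds from the proposal above.
RESOURCE_KINDS = {"capabilities", "revision", "file_data", "file_delta", "branch"}

def flat_readings(rest):
    """Possible meanings of $DB/<rest> if branches live directly under $DB."""
    first = rest.strip("/").split("/")[0]
    readings = ["branch " + first]          # any first segment may be a branch
    if first in RESOURCE_KINDS:
        readings.append("resource kind " + first)
    return readings

def prefixed_readings(rest):
    """Possible meanings of $DB/<rest> if branches live under $DB/branch/."""
    segments = rest.strip("/").split("/")
    if segments[0] == "branch" and len(segments) >= 2:
        return ["branch " + segments[1]]
    if segments[0] in RESOURCE_KINDS:
        return ["resource kind " + segments[0]]
    return []

print(flat_readings("revision"))             # two readings: ambiguous
print(prefixed_readings("branch/revision"))  # one reading
print(prefixed_readings("revision"))         # one reading
```

With the explicit "branch" component, every path has at most one reading, no matter how a branch happens to be named.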
As another, minor point, IMO the second is also easier to read and
understand. A good (but admittedly deprecated) example might be:
http://venge.net/net.venge.monotone
That looks quite confusing to me, whereas:
http://venge.net/branch/net.venge.monotone
makes the thing easier to understand. Especially for newcomers, I think.
So, that got rather longish now. Thanks for bearing with me so far. I'm
curious about your opinions, thoughts and criticism.
Regards
Markus