RE: merge mode for XML

From: Glew, Andy
Subject: RE: merge mode for XML
Date: Mon, 13 May 2002 10:31:36 -0700

> > Motivation: schema changes in most existing relational databases are
> > onerous.
> For very good reason.

And what is that reason?

OK, I admit that some RDBMS applications in production
need stability - just like some systems software applications
(the kind Greg seems to work on, the kind I used to
work on) value stability above all else, and actively
want to make it hard to change things.

However, there are other application domains
- in programming, the domains attacked by agile
methodologies like XP (eXtreme Programming).
{Donning asbestos underwear, expecting Greg
to flame.}

An application area that I frequently work in nowadays
is experimental databases - databases for experimental data.
I want to archive all of my experimental data in a form that
allows me to do arbitrary SQL-like queries over it.

Problem is, as I continue my research, the format of
my records is continually changing.  For example, a few years
ago I might have recorded CPU MHz and Cache Size as 
configuration parameters - now I have to record at least
3 different cache sizes, as well as multiple clock domain 
frequencies. Not to mention that the observations that
I record are constantly changing.
        Rather than continually reformatting my database,
adding new fields which are "Unknown" or "Null" on old data,
I find it easier to add records containing fields that were not
known earlier.

I've tried to do this in a traditional RDBMS.
I've asked database experts like DeWitt and the guy who
invented transactions whose name I can't remember now...
and the answer always comes back that the traditional RDBMS way
is to create a database in fully normalized form,
with rows of the form Experiment#:Metric:Value.
Worse, it may be necessary to create several different tables
for each type.  It is impossible for ordinary humans
to write queries in such a form.
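
To make the complaint concrete, here is a minimal sketch of the
Experiment#:Metric:Value layout using Python's built-in sqlite3 module.
The table name, metric names, and numbers are invented for illustration;
they are not taken from any real data set. Note how a question that
mentions two metrics already needs a self-join, and each further metric
adds another.

```python
import sqlite3

# Hypothetical Experiment#:Metric:Value table; every name and number
# below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (experiment INTEGER, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        (1, "cpu_mhz", 500.0),   # older run: one clock, one cache size
        (1, "cache_kb", 512.0),
        (1, "ipc", 1.1),
        (2, "cpu_mhz", 2000.0),  # newer run: several cache sizes
        (2, "l1_kb", 8.0),
        (2, "l2_kb", 256.0),
        (2, "ipc", 1.4),
    ],
)

# "Which experiments ran above 1000 MHz, and what IPC did they get?"
# One self-join per metric mentioned in the question.
EAV_QUERY = """
SELECT mhz.experiment, mhz.value, ipc.value
FROM results AS mhz
JOIN results AS ipc ON ipc.experiment = mhz.experiment
WHERE mhz.metric = 'cpu_mhz' AND mhz.value > 1000
  AND ipc.metric = 'ipc'
ORDER BY mhz.experiment
"""
fast_runs = conn.execute(EAV_QUERY).fetchall()
print(fast_runs)
```

Against a conventional wide table the same question would be a single
WHERE clause, which is the gap being complained about here.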

Yet "self-schematization" makes it trivial to do.
All that is needed is more flexible handling of nulls
than most RDBMSes support - more like the handling
that Codd, Date, and Darwen advocate.
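
As a rough sketch of what that could look like, assume each record is
simply a set of field/value pairs and the "schema" is whatever fields
have been seen so far. The helper names `select` and `schema`, the field
names, and the rule that a condition on a missing field rejects the
record are all assumptions made for this example; it is one possible
reading of "flexible nulls", not a description of any particular
product or of any one author's proposal.

```python
# Hypothetical self-describing records: each one carries only the
# fields that were known when it was recorded.
records = [
    {"experiment": 1, "cpu_mhz": 500, "cache_kb": 512, "ipc": 1.1},
    {"experiment": 2, "cpu_mhz": 2000, "l1_kb": 8, "l2_kb": 256, "ipc": 1.4},
]


def select(rows, fields, where):
    """Return the requested fields from every row that satisfies `where`.

    Assumption: a condition that touches a field the row never had
    rejects the row; a requested output field the row lacks comes back
    as None.
    """
    out = []
    for row in rows:
        try:
            keep = where(row)
        except KeyError:
            keep = False  # condition referenced a field this row lacks
        if keep:
            out.append(tuple(row.get(name) for name in fields))
    return out


def schema(rows):
    """The schema is just the union of all field names seen so far."""
    return sorted(set().union(*(row.keys() for row in rows)))


# Old and new records are queried together, with no reformatting.
with_ipc = select(records, ["experiment", "l2_kb"], lambda r: r["ipc"] > 1.0)
big_l2 = select(records, ["experiment"], lambda r: r["l2_kb"] > 100)
print(with_ipc, big_l2, schema(records))
```

Adding a record with a field nobody has recorded before requires no
change to the older records.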

> I suspect Dewitt is thinking a little bit deeper than you suspect.
> Certainly data can be self-describing -- that's what OO is all about.
> OO databases can effectively be queried about their schemas...
> An RDBMS, however, is not an OODBMS.

Well, DeWitt is the big advocate of ORDBMS
- Object Relational DBMS.

> Whether an XML document without a DTD and/or schema can be considered
> self-describing enough to be independent like an object instance or a
> set of object instances, is probably what you're trying to
> argue, but I won't go any further since such a thing is strictly outside
> the scope of XML proper and is way outside the scope of what a common tool
> like CVS should ever deem worthy of dealing with.

Fair enough.

My original email was prompted by email from you,
Greg, that sounded like "CVS should not have support for
XML, like supporting file-format-specific diff and merge,
because XML without a DTD is meaningless."

I reject that as a specious argument.

Your remaining argument, that nobody has stepped up to do
external diff and merge, remains valid. (Ditto wrt file renaming,
multiple repositories, etc.)
