
Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp


From: Tuomas Lukka
Subject: Re: [Fenfire-dev] PEG swamp_easier--benja: An easier API for Swamp
Date: Mon, 22 Sep 2003 11:22:23 +0300
User-agent: Mutt/1.5.4i

I like the idea. Comments below.

        Tuomas

On Mon, Sep 22, 2003 at 05:09:44AM +0300, Benja Fallenstein wrote:
> 
> .. Issues
>    ======
> 
> A flavor of the API
> ===================
> 
> First of all, we need a good way for iterating
> through a set of triples. I propose the following
> interface::
> 
>     for(Triples t = graph.get(_, RDF.type, _); t.loop();) {
>         System.out.println(t.sub+" is instance of "+t.ob);
>     }

This is nice.

ISSUE: Name for that call: get(...)? We have find() so far.

> I.e., have our own iterator-like thing, which iterates
> through a set of *triples*-- rather than nodes-- but doesn't
> need to create objects for every one of these triples.

*VERY* nice.

However,

ISSUE: Name for the iterator-like thing that goes through triples.
"Triples" says it contains several triples while it has only one 
at a time. "TripleIterator", "TripleIter", ...?


> For good measure, here's how the above code would look
> in the current API::
> 
>     for(Iterator i=graph.findN_X1A(RDF.type); i.hasNext();) {
>         Object sub = i.next();
>         for(Iterator j=graph.findN_11X(sub, RDF.type); j.hasNext();) {
>             Object ob = j.next();
>             System.out.println(sub+" is instance of "+ob);
>         }
>     }
> 
> However, to be fair, my code isn't how it would look
> when efficiency is at a premium. (Then again, when I print
> to the console inside the loop, efficiency isn't at a
> premium anyway... but whatever...) The *fast* version
> would look like this::

Umm, you should note here that the efficiency difference is in the call,
not in the actual code: get() can be just a set of if clauses,
and I think HotSpot might actually be able to handle it.

However, there's another performance difference with the Triples objects
which you haven't mentioned: *all* three members (sub, pred, ob) need to be
fetched on each step, even if the loop only uses one of them.
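
To make the "set of if clauses" point concrete, here is a rough sketch of a
generic call that only does a few null checks and then hands over to a
specialized variant. Everything in it is an assumption for illustration: the
class name, the list-backed store, the find*() names (the naming is still an
open issue) and the ANY constant, which stands in for "_" because "_" is a
reserved word since Java 9. It also returns plain lists rather than a Triples
object, since it is only about the dispatch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy graph; each triple is stored as an Object[] {subject, predicate, object}. */
class DispatchSketch {
    /** Wildcard constant. Named ANY because `_` is a reserved word since Java 9. */
    static final Object ANY = null;

    private final List<Object[]> triples = new ArrayList<>();

    void add(Object s, Object p, Object o) {
        triples.add(new Object[] { s, p, o });
    }

    /** Generic call: a few null checks pick the specialized variant. */
    List<Object[]> find(Object s, Object p, Object o) {
        if (s == null && p != null && o == null) return find_A1A(p);
        if (s != null && p != null && o == null) return find_11A(s, p);
        return scan(s, p, o); // remaining patterns: plain scan in this sketch
    }

    // A real implementation would consult an index here instead of scanning.
    List<Object[]> find_A1A(Object p) { return scan(null, p, null); }
    List<Object[]> find_11A(Object s, Object p) { return scan(s, p, null); }

    private List<Object[]> scan(Object s, Object p, Object o) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] t : triples) {
            if (matches(s, t[0]) && matches(p, t[1]) && matches(o, t[2])) out.add(t);
        }
        return out;
    }

    /** A null pattern matches any node. */
    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }
}
```

The cost of the generic call over the specialized one is then just the null
checks at the top of find().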

>     for(Triples t = graph.get_A1A(RDF.type); t.loop()) {
>         System.out.println(t.sub+" is instance of "+t.ob);    
>     }

Note: missing a semicolon.

ISSUE: Naming. I'd think find_X1X_Triples would make more sense here.

> Changes
> =======
> 
> We'll make it a convention that classes using the API
> have this at the top::
> 
>     static final _ = null;

static final **Object** _ = null; ?

> You don't have to have this, but it makes things easier to read.


> ``ConstGraph``
> --------------
> 
> ``ConstGraph`` shall have the following API
> for getting triples::
> 
>     /** Get an iterator through all triples in the graph
>      *  matching a certain pattern.
>      *  If <code>subject</code>, <code>predicate</code> and/or
>      *  <code>object</code> are given, the triples must match these.
>      *  If any of the parameters is <code>null</code>,
>      *  any node will match it.
>      */
>     Triples get(Object subject, Object predicate, Object object);
> 
>     // Versions that don't allow wildcards (``null``)
>     Triples get_A11(Object predicate, Object object);
>     Triples get_1A1(Object subject, Object object);
>     ...
> 
>     /** Get the subject of the triple matching a certain pattern.
>      *  If <code>subject</code>, <code>predicate</code> and/or
>      *  <code>object</code> are given, the triple must match these.
>      *  If any of the parameters is <code>null</code>,
>      *  any node will match it.
>      *  @return The subject of the triple, if there is one,
>      *          or <code>null</code> if there is no such triple.
>      */
>     Object getSubject(Object subject, Object predicate, Object object);
> 
>     Object getSubject_A1A(Object predicate);
>     ...

ISSUE: If there is more than one?
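
Just to make the question concrete, one possible answer (not necessarily the
right one) would be to treat several different matching subjects as an error.
A minimal sketch under that assumption; the class, the list-backed store and
the choice of IllegalStateException are all made up here, and nodes are
assumed to be non-null:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy graph; each triple is stored as an Object[] {subject, predicate, object}. */
class SingleSubjectSketch {
    private final List<Object[]> triples = new ArrayList<>();

    void add(Object s, Object p, Object o) {
        triples.add(new Object[] { s, p, o });
    }

    /** A null pattern matches any node. */
    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }

    /**
     * Returns the subject shared by all matching triples, or null if
     * nothing matches. Throws if two matching triples have different subjects.
     */
    Object getSubject(Object s, Object p, Object o) {
        Object found = null;
        for (Object[] t : triples) {
            if (matches(s, t[0]) && matches(p, t[1]) && matches(o, t[2])) {
                if (found != null && !found.equals(t[0])) {
                    throw new IllegalStateException("more than one subject matches");
                }
                found = t[0];
            }
        }
        return found;
    }
}
```

The alternative, returning an arbitrary match, is simpler but hides data
errors from the caller.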

> Note: The reason for having ``subject`` as a parameter
> for ``getSubject()`` is that it's easier to read. It will
> almost always be "``_``" (i.e., ``null``). It shall work
> consistently, though: If a subject is given, and there is
> such a triple in the graph, return that subject; otherwise,
> return ``null``.
> 
>     /** Get the subjects of all triples matching a certain pattern.
>      *  If <code>subject</code>, <code>predicate</code> and/or
>      *  <code>object</code> are given, the triple must match these.
>      *  If any of the parameters is <code>null</code>,
>      *  any node will match it.
>      *  <p>
>      *  The set is immutable; it is <em>not</em> backed
>      *  by the graph (i.e., changing the graph does not
>      *  change the set.)
>      */
>     Set getSubjects(Object subject, Object predicate, Object object);
> 
> (Backing is harder to program and I don't see the pay-off,
> since the ``getXXXs`` functions won't be used that often.)
> 
>     Set getSubjects_AA1(Object object);
>     ...
> 
>     // getObject(), getObjects() similarly
>     // getPredicate(), getPredicates() similarly
> 
> ``getPredicate()`` is essentially useless, but we'll have it
> for symmetry. ``getPredicates()`` is useful, mostly for
> getting *all* predicates used in a graph.
> 
> Note that we don't have ``X`` in the function variants
> any more, just ``1`` and ``A``, with ``A`` being equivalent
> to passing ``null`` in that position to the generic method.
> 
> (E.g., ``getSubjects_AAA()`` is equivalent to
> ``getSubjects(_, _, _)``, returning the set of all subjects
> in the graph.)
> 
> 
> ``Triples``
> -----------
> 
> The iterator-like object, ``Triples``, shall have
> the following API::
> 
>     Object sub, pred, ob;

ISSUE: Names. subj, pred, obj would be more consistent, i.e.
up to the *end* of the second consonant group.


> (These are ``null`` when the object hasn't been
> initialized, i.e., ``next()`` hasn't been called yet.)
> 
>     /** Advance to the next triple. */
>     void next();
> 
>     /** Whether there are any more triples to iterate through. */
>     boolean hasNext();
> 
>     /** Indicate that this <code>Triples</code> object won't be
>      *  used any more.
>      *  This shall only be called by the code that has requested
>      *  this object from <code>ConstGraph</code> (through
>      *  <code>.get()</code>). Its purpose is to tell the
>      *  <code>ConstGraph</code> that it can be re-used for the
>      *  next <code>get()</code>; <code>ConstGraph</code> can then
>      *  cache <code>Triples</code> objects, making life easier
>      *  for the garbage collector.
>      *  <p>
>      *  Calling this method is not obligatory. (If you don't,
>      *  this object will be garbage-collected normally.)
>      */
>     void free();
> 
>     boolean loop() {
>         if(hasNext()) {
>             next();
>             return true;
>         } else {
>             free();
>             return false;
>         }
>     }
> 
> The purpose of ``loop()`` is to enable the common loop
> pattern, ::
> 
>     for(Triples t = graph.get(...); t.loop();) {
>         // ...
>     }
> 
> which would otherwise have to be written as::
> 
>     Triples t = graph.get(...);
>     while(t.hasNext()) {
>         t.next();
>         // ...
>     }
>     t.free();

This should go into the javadoc.
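
For reference while writing that javadoc, here is a minimal sketch of how such
an iterator-like object could behave. It is not the Swamp implementation: the
class name, the flat backing list (s0, p0, o0, s1, p1, o1, ...) and the
"freed" flag are assumptions for illustration. The point it shows is that one
object is reused for every triple, and that loop() calls free() once the
triples run out.

```java
import java.util.List;

/** Hypothetical cursor over a flat list laid out as s0, p0, o0, s1, p1, o1, ... */
class TripleCursor {
    Object sub, pred, ob;   // null until next() has been called
    boolean freed = false;  // set by free(); a real graph could recycle this object

    private final List<Object> flat;
    private int pos = 0;

    TripleCursor(List<Object> flat) {
        this.flat = flat;
    }

    /** Whether there are any more triples to iterate through. */
    boolean hasNext() {
        return pos + 3 <= flat.size();
    }

    /** Advance to the next triple by overwriting the three fields in place. */
    void next() {
        sub = flat.get(pos);
        pred = flat.get(pos + 1);
        ob = flat.get(pos + 2);
        pos += 3;
    }

    /** Indicate that this cursor won't be used any more. */
    void free() {
        freed = true;
    }

    /** Advance if possible; otherwise free the cursor and end the loop. */
    boolean loop() {
        if (hasNext()) {
            next();
            return true;
        }
        free();
        return false;
    }
}
```

A caller would then write `for (TripleCursor t = ...; t.loop();) { ... }` and
read t.sub, t.pred and t.ob inside the body, without any per-triple objects
being created.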

> This isn't just harder to read, it also scopes ``t``
> wrongly. With the ``loop()`` pattern, the scope of ``t``
> is the body of the loop, which is exactly the code
> executed before ``free()`` is called.
> 
> 
> ``Graph``
> ---------
> 
> For changing graphs, the following API shall be used::
> 
>     /** Add a triple to this graph. */
>     void add(Object subject, Object predicate, Object object);
> 
>     /** Remove all triples matching a certain pattern from this graph.
>      *  If <code>subject</code>, <code>predicate</code> and/or
>      *  <code>object</code> are given, the triple must match these.
>      *  If any of the parameters is <code>null</code>,
>      *  any node will match it.
>      */
>     void remove(Object subject, Object predicate, Object object);
> 
>     void remove_A1A(Object predicate);
>     void remove_1AA(Object subject);
>     ...
> 
>     /** Replace all triples with the given predicate and object
>      *  with the given triple.
>      */
>     void setSubject(Object subject, Object predicate, Object object);
> 
>     /** Replace all triples with the given subject and predicate
>      *  with the given triple.
>      */
>     void setObject(Object subject, Object predicate, Object object);
> 
> We don't have ``setPredicate()`` because it is essentially useless
> and potentially harmful-- someone using it almost certainly
> intended to do something else.

You're not marking exactly what the **diff** to current practice
is here, and why.
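
As a starting point for that comparison, this is how the proposed semantics
read when taken literally from the javadoc above: setObject() is a remove of
everything with the given subject and predicate, followed by an add, and
setSubject() is the mirror image. A minimal sketch only; the class, the
list-backed store and the size()/contains() helpers are assumptions for
illustration, and nodes are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy mutable graph; each triple is an Object[] {subject, predicate, object}. */
class MutableGraphSketch {
    private final List<Object[]> triples = new ArrayList<>();

    /** A null pattern matches any node. */
    private static boolean matches(Object pattern, Object node) {
        return pattern == null || pattern.equals(node);
    }

    void add(Object s, Object p, Object o) {
        triples.add(new Object[] { s, p, o });
    }

    /** Remove all triples matching the pattern; null acts as a wildcard. */
    void remove(Object s, Object p, Object o) {
        triples.removeIf(t -> matches(s, t[0]) && matches(p, t[1]) && matches(o, t[2]));
    }

    /** Replace all triples with the given subject and predicate by one triple. */
    void setObject(Object s, Object p, Object o) {
        remove(s, p, null);
        add(s, p, o);
    }

    /** Replace all triples with the given predicate and object by one triple. */
    void setSubject(Object s, Object p, Object o) {
        remove(null, p, o);
        add(s, p, o);
    }

    int size() {
        return triples.size();
    }

    boolean contains(Object s, Object p, Object o) {
        for (Object[] t : triples) {
            if (s.equals(t[0]) && p.equals(t[1]) && o.equals(t[2])) return true;
        }
        return false;
    }
}
```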


> This is never a problem because the ``setXXX()`` methods
> are only a convenience. You can always do::
> 
>     graph.remove(_, predicate, _);
>     graph.add(subject, predicate, object);
> 
> if you *do* happen to have some esoteric use for it.
> 
> 
> Conclusion
> ==========
> 
> I believe this API will be substantially simpler to use
> than the one we have at the moment, and not lose
> anything w.r.t. speed. In fact, it may speed things up
> in the future, because we can cache the ``Triples`` objects.


        Tuomas



