Re: [Texmacs-dev] More comments on David's document


From:	David Allouche
Subject:	Re: [Texmacs-dev] More comments on David's document
Date:	Thu, 11 Apr 2002 15:56:15 +0200
First I want to make one thing absolutely clear:

  I am talking of interfaces.

I do have some possible implementations in mind, but all I am currently 
saying is about designing the best possible interfaces to meet the modularity 
and performance requirements of the project.

I do not say "the class implements this method that way", but I say "the 
class defines this interface with that contract".

I am making this point because I have the feeling the discussion is 
getting fuzzy because you are introducing many more implementation 
considerations than are required for designing the interface. I agree that 
implementation strategies will strongly affect the interfaces since there are 
very high performance requirements, but mixing the two discussions too much 
just makes both points more obscure.

On Thursday 11 April 2002 11:35, Joris van der Hoeven wrote:
> On Thu, 11 Apr 2002, David Allouche wrote:
>
> > I think proper delegation mandates that stree be a subclass of ttree.
> > The stree class extends the tree operations methods to update the inverse
> > paths.
> >
> > Moreover ttree is a simple class encapsulating a data structure, but
> > stree implements in addition an observer registration interface, sends
> > notifications on edition operations, and defines a "commit" method.
> >
> > On the longer term, stree could also implement a complete transactional
> > interface, with concurrent access, locking and rollback. That would be
> > useful to allow real time collaborative editing of a document.
>
> No, all these additional methods are part of the editor,
> not of the source tree, which should really stay a *data* object.

I think those methods must be part of stree and not of editor for the 
following reasons:

  1. A classical object oriented design principle: classes implement 
concepts.  The stree class implements the parsed representation of the 
document main file, so it should define all the low-level methods which 
operate on the file structure. 

  2. When you have a class with two very different usage patterns, one of 
these requiring a lot of additional code, a subclass must be defined to 
encapsulate additional behaviour.

  3. There is a fundamental feedback loop in the system:

     stree->rewriter->typesetter->editor->stree

     In that loop, rewriters are perfectly decoupled from "stree" and 
"typesetter" by the use of a simple Facade-Observer design pattern. The 
typesetter/editor interface is out of the scope of my work, but you say 
yourself that it is clean. My point is that the last interface, editor/stree, 
must also be kept as clean as possible.

     One specific implementation I have in mind is the real time concurrent 
edition of a stree by several editors. That could be implemented by replacing 
stree by a proxy object handling a remote object protocol. In that 
implementation, there is no concept of a "current editor" which is in charge 
of performing concrete structure operations on the stree and sending 
notifications to rewriters.

     Anyway, rewriters are conceptually associated to a stree, not to an 
editor, so according to the "classes implement concepts" principle, rewriters 
must be notified by stree.

  4. There is no overhead. If you want a dumb data structure (with associated 
elementary operations) you can use ttree. The methods of stree and ttree will 
not be virtual, so there will be zero overhead induced by subclassing.

     The real-time concurrent edition system would indeed require some kind 
of abstraction, but that can be made with no time overhead by making the 
editor class a template class (though I think that would not give a 
significant performance gain over the other approach, which is defining 
virtual methods in stree and redefining them in remote_stree)
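
To make points 2 and 4 concrete, here is a minimal sketch of the subclassing 
I have in mind. It is only an illustration of the interface contract, not a 
proposed implementation: the member names (attach, notify_insert, node), the 
path typedef and the counting_observer helper are all assumptions made for 
the example, and the inverse path bookkeeping is left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical path type: child indices from the root.
using path = std::vector<int>;

// Plain data structure with elementary operations; nothing is virtual.
class ttree {
public:
  std::string label;
  std::vector<ttree> children;

  explicit ttree(std::string l = "") : label(std::move(l)) {}

  // Return the node addressed by p (throws if p is invalid).
  ttree& node(const path& p) {
    ttree* n = this;
    for (int i : p) n = &n->children.at(static_cast<std::size_t>(i));
    return *n;
  }

  // Insert t as child number pos of the node addressed by p.
  void insert(const path& p, std::size_t pos, ttree t) {
    ttree& n = node(p);
    n.children.insert(n.children.begin() + static_cast<std::ptrdiff_t>(pos),
                      std::move(t));
  }
};

// Abstract observer: the only place where virtual calls are needed.
struct observer {
  virtual ~observer() = default;
  virtual void notify_insert(const path& p, std::size_t pos) = 0;
  virtual void commit() = 0;
};

// Source tree: same data, plus observer registration and notification.
class stree : public ttree {
  std::vector<observer*> observers_;

public:
  void attach(observer* o) { observers_.push_back(o); }

  // Hides ttree::insert (no override); the call is resolved statically.
  void insert(const path& p, std::size_t pos, ttree t) {
    ttree::insert(p, pos, std::move(t));
    for (observer* o : observers_) o->notify_insert(p, pos);
  }

  // Tell every observer that one edition operation is complete.
  void commit() {
    for (observer* o : observers_) o->commit();
  }
};

// Hypothetical observer used only to show that notifications arrive.
struct counting_observer : observer {
  int inserts = 0;
  int commits = 0;
  void notify_insert(const path&, std::size_t) override { ++inserts; }
  void commit() override { ++commits; }
};
```

Code that only needs the dumb data structure uses ttree and never pays for 
the observer list. The editor would call stree::insert and stree::commit, so 
the notifications come from the stree itself rather than from a "current 
editor".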

> > > > Observers
> > > > ---------
> > >
> > > Like in the case of tree and stree, there is no real difference between
> > > the rewriter, tree_observer and typesetter classes: they all refer to
> > > the same abstract class.
> > >
> > > In the case of rewriter, tree_observer and typesetter,
> > > I nevertheless feel that the difference in purpose is greater,
> > > so you may use typedefs to let the different classes become synonyms.
> > > (I also prefer "observer" to "tree_observer", since we will never
> > > observe other types of objects). However, I am against doing
> > > the same thing for "stree".
> >
> > I really do not understand what you are thinking of when you are talking
> > of using "typedefs to let the different classes become synonyms". Why not
> > simply use inheritance with an abstract "observer" class which is
> > implemented by "rewriter" and "typesetter"???
>
> I want typedefs for the moment, because this is faster.
> I repeat: efficiency is a *major* design goal, whether you like that or
> not. In any case, in the foreseeable future, these classes will be perfect
> synonyms, which is better expressed using a typedef. If they ever turn out
> to be different, then this can always be changed later.

If I understand what you mean, you want to avoid the virtual method invocation 
overhead, right?

But you just cannot avoid it.

You cannot avoid the virtual method call when notifying rewriters because 
rewriter must be an abstract class in order to seamlessly support mixing 
several types of transformations in the same chain.

The overhead you could avoid is the virtual method calls from the last 
rewriter to the typesetter, but that would require defining a whole new 
interface to break the encapsulation of the last rewriter. That would make it 
obligatory that the last rewriter is a TMSL interpreter, but I see no 
justification for that constraint. Moreover the performance gain would be 
negligible since it would be ONE virtual method call for every elementary 
operation from the TMSL rewriter to the typesetter plus the commit virtual 
method call at the end of every edition operation.

But maybe you have some incremental implementation plan in mind that I just 
do not know, in which case that would make a lot more sense.
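
For illustration, here is a small sketch of the inheritance I am describing: 
an abstract "observer" implemented by both "rewriter" and "typesetter", with 
two different transformation types mixed in one chain. The two concrete 
rewriters (upcase_rewriter, bracket_rewriter) and the string payload are 
invented for the example; real notifications would carry tree operations.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Abstract interface shared by rewriters and the typesetter.
struct observer {
  virtual ~observer() = default;
  virtual void insert(const std::string& text) = 0;
  virtual void commit() = 0;
};

// A rewriter is an observer that forwards to the next link of the chain.
class rewriter : public observer {
public:
  explicit rewriter(observer& next) : next_(next) {}
  void commit() override { next_.commit(); }

protected:
  observer& next_;
};

// Hypothetical transformation 1: upper-case the inserted text.
class upcase_rewriter : public rewriter {
public:
  using rewriter::rewriter;
  void insert(const std::string& text) override {
    std::string out = text;
    for (char& c : out)
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    next_.insert(out);  // one virtual call per elementary operation
  }
};

// Hypothetical transformation 2: wrap the inserted text in brackets.
class bracket_rewriter : public rewriter {
public:
  using rewriter::rewriter;
  void insert(const std::string& text) override {
    next_.insert("[" + text + "]");
  }
};

// End of the chain: records what it is asked to typeset.
class typesetter : public observer {
public:
  std::vector<std::string> received;
  int commits = 0;
  void insert(const std::string& text) override { received.push_back(text); }
  void commit() override { ++commits; }
};
```

Because every link only knows the abstract observer, the last rewriter does 
not need to be of any particular kind, and the cost per link is one virtual 
call for each elementary operation plus one for the commit.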

> > Edit tree
> > ---------
> >
> > I think there is a need for more terminology here. Since the current edit
> > tree is very overloaded and we are going to separate it in several parts,
> > we need more words.
>
> You do not have to care about the edit tree anyway:
> you care about the source tree.

Today, as far as I know, the two are the same. The rewriters will separate 
the source and edit trees, so we must decide which responsibility goes 
where. So I care about the edit tree because it is the semantic complement of 
the source tree relative to the current editor's "et" data member.


> > I see TeXmacs as an interactive
> > structured typesetting system. The typesetting system is made of a
> > typesetter tightly coupled with an editor. That means that the
> > typesetting language not only has primitives for layout, but also for
> > controlling the editor.
>
> Wrong: the typesetting language has no primitives whatsoever for
> controlling the editor and this independence is a major advantage.
> This does not withhold the editor to *communicate* with typeset boxes,
> in order to associate logical paths to physical positions or
> to find a hyperlink associated to a region of text.
> In other words, there is a clear separation between structure and
> rendering on the implementation level.
>
> > For example, it could be useful to decouple the region concept (as
> > implemented by varexpand) from the environment concept (as implemented by
> > most structures). A region affects only the typesetter, while an
> > environment also affect the editor (infinitesimal positions).
>
> Yes, this will be a result of what you are doing.
> We are working towards a clean separation of
>
> 1. Structure
> 2. Rewriting and/or scripting
> 3. Rendering

So you do not agree with my generic formulation, but you do agree with my 
example... I guess that is a consequence of a divergence or a misunderstanding 
somewhere else...


[from earlier in your mail]
> > I think that the behaviour of the editor must not be directly affected by
> > the source tree, but only by the object tree.
>
> No: the source tree contains the structure; the object tree is only
> obtained after rewriting and the boxes after typesetting.
> The only sensible structure the editor operates on is the source tree.
[end of moved part]
> > Since the source tree structure is completely independent from the object
> > tree structure, we cannot rely on the source tree for controlling the
> > editor. Instead, the editor will only send edition notifications to the
> > stree.
>
> No, we can not rely on the object tree for controlling the editor,
> because the structure of the object tree does not directly correspond to
> the structure of the source tree, which we are editing.

I will try to make my point clear because I think there is an essential 
misunderstanding here.

  1. The behaviour of the editor must only be controlled by the object tree.

  2. Any edition operation is directly applied to the source tree, at the 
location provided by the inverse path tag of the innermost object tree, that 
is the source tree position corresponding to the cursor position, or the 
source tree position corresponding to the innermost object tree containing 
the whole selection.

Point 2 is only a consequence of point 1: it is the way to make sense of 
edition operations requested by the user on the object tree, that is, on what 
the user actually sees.

Point 1 is required to support generic transformations. For example, suppose 
you are editing a source document which has the following structure:

<addbk|\
  <persons|\
    <person|<first|Joris><last|Hoeven><part|van der><job|psud>>\
    <person|<first|Ralph><last|Treinen><job|psud>>\
    <person|<first|David><last|Allouche><job|eisti>>>\
  <jobs|\
    <job|psud|<name|Universite de Paris Sud><city|Orsay>>\
    <job|eisti|<name|EISTI|Ecole Internationale...><city|Cergy>>>>

Yes, that is essentially a relational database, and yes it is XML oriented.

And that your first rewriter gives a document which looks like:

<body\>
  <section|Jobs>
  <description\>
    <item*|<concat|Joris|van der|Hoeven>:><concat|Universite de Paris Sud>
    <item*|<concat|Ralph|Treinen>:><concat|Universite de Paris Sud>
    <item*|<concat|David|Allouche>:><concat|Ecole Internationale...>
  <description/>
<body/>

Yes, I know concat is not a feature of TeXmacs externalisation style, that is 
just an example.

As you see, the structure of the second tree (which is what the user sees) is 
completely different from the structure of the source tree.

I want to know how you intend to control the editor from the source tree in a 
way that makes sense for the user, who sees the second tree. It only makes 
sense to control the editor from the object tree (which is essentially the 
same as the second tree), and to give feedback to the user after 
modifications to the source tree have made their way to the object tree.
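
Here is a rough sketch of how points 1 and 2 fit together on the address 
book example above. Each item produced by the rewriter carries an inverse 
path back to the source node it came from, so the editor can work on what 
the user sees and still apply the operation to the source tree. The struct 
names, the flat representation and the path convention (child 0 of addbk is 
"persons") are assumptions made for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Source side: the relational structure of the address book.
struct person { std::string first, part, last, job; };
struct job { std::string id, name; };
struct addbk { std::vector<person> persons; std::vector<job> jobs; };

// Hypothetical path type: child indices from the root of the source tree.
using path = std::vector<int>;

// Object side: one description item, tagged with its inverse path.
struct item {
  std::string label;  // "First [part] Last:"
  std::string body;   // name of the job referenced by the person
  path source;        // where this item comes from in the source tree
};

// The "first rewriter" of the example: persons joined with their jobs.
std::vector<item> rewrite(const addbk& src) {
  std::map<std::string, std::string> job_name;
  for (const job& j : src.jobs) job_name[j.id] = j.name;

  std::vector<item> out;
  for (std::size_t i = 0; i < src.persons.size(); ++i) {
    const person& p = src.persons[i];
    std::string label = p.first;
    if (!p.part.empty()) label += " " + p.part;
    label += " " + p.last + ":";
    auto it = job_name.find(p.job);
    std::string body = (it == job_name.end()) ? std::string() : it->second;
    out.push_back({label, body, path{0, static_cast<int>(i)}});
  }
  return out;
}

// The data of the example above.
addbk sample() {
  addbk a;
  a.persons = {{"Joris", "van der", "Hoeven", "psud"},
               {"Ralph", "", "Treinen", "psud"},
               {"David", "", "Allouche", "eisti"}};
  a.jobs = {{"psud", "Universite de Paris Sud"},
            {"eisti", "Ecole Internationale..."}};
  return a;
}
```

When the cursor is in the third item, the editor reads items[2].source and 
applies the edition operation to that position of the source tree; the user 
gets feedback once the change has been rewritten again into the object tree.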

Dynamic validator
-----------------

> Yes, that is the next step, which I already mentioned in a discussion:
> I plan to incorporate DTD support in TeXmacs. At that point we will have
> four levels:
>
> 1. DTD and validated logical trees
> 2. Brute non validated trees (source trees for rewriting)
> 3. Rewriting and/or scripting
> 4. Rendering
>
> Most editor routines in the extension language will operate on level 1.
> Some routines will operate directly on level 2 for efficiency reasons.

That is another discussion, but I do not think we should do it that way. I 
have not yet really thought of it, but, at first, I think that things should 
be controlled by another feedback loop to allow the use of plugin validators 
(DTD, various schemas, scripting languages).

The feedback loop would look like:

              +------- pointer path --------+       
              v                             |
  stree -> validator -> user interface -> editor -> validator -> stree

The idea is to only use the validator where validation can be broken, that 
is during edition operations. That approach makes it unnecessary to make a 
distinction between validated and non-validated trees.

Again, a simple protocol will have to be designed between all the components.
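
As a very rough sketch of the editor -> validator -> stree half of that 
loop: the validator is a plugin behind an abstract interface, and the editor 
consults it only when an edition operation is about to be applied. I have 
not designed the protocol yet, so every name here (edit_op, accepts, 
allowed_children_validator) is a placeholder, and the "allowed children" 
rule merely stands in for a DTD or schema.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical edition request: insert a child_tag node under parent_tag.
struct edit_op {
  std::string parent_tag;
  std::string child_tag;
};

// Plugin interface: DTDs, schemas or scripts would all implement this.
struct validator {
  virtual ~validator() = default;
  virtual bool accepts(const edit_op& op) const = 0;
};

// Stand-in for a DTD: a table of allowed children for each parent tag.
class allowed_children_validator : public validator {
  std::map<std::string, std::set<std::string>> allowed_;

public:
  explicit allowed_children_validator(
      std::map<std::string, std::set<std::string>> allowed)
      : allowed_(std::move(allowed)) {}

  bool accepts(const edit_op& op) const override {
    auto it = allowed_.find(op.parent_tag);
    return it != allowed_.end() && it->second.count(op.child_tag) > 0;
  }
};

// The editor validates each operation before it reaches the source tree.
class editor {
  const validator& validator_;
  std::vector<edit_op> applied_;  // stands in for the stree being edited

public:
  explicit editor(const validator& v) : validator_(v) {}

  // Returns false, leaving the tree untouched, if the operation is refused.
  bool apply(const edit_op& op) {
    if (!validator_.accepts(op)) return false;
    applied_.push_back(op);
    return true;
  }

  const std::vector<edit_op>& applied() const { return applied_; }
};
```

Since an invalid operation never reaches the tree, the tree stays valid 
between edition operations and there is no separate "validated tree" type to 
maintain.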

> > Did you ever use different views of the same document using different
> > stylesheets? That is very useful, especially when one of the stylesheet
> > exposes otherwise invisible data. So we need that the complete
> > transformation chain as well as the object tree be local to an editor.
> > Things could indeed be optimised by using more independent abstractions
> > of editors and views so that the same rewriting would not be redone for
> > two views displaying the same document with the same transformation.
>
> Yes, I sometimes use this feature (although not very often).
> But I agree, we need to make this feature more and more powerful.

So, we agree that there should be one object tree for each editor. The 
disagreement seems only to be on what is really the edit tree. As I 
previously said, I think the edit tree must be the object tree.

Force method
------------

It looks like we are eventually coming to an agreement on that issue :)

> >
> > So my prototype "box force(tree, path context)" is the right one, because
> > it does not mandate anything but the existence of a rewriter for the
> > whole tree.
>
> I am still not completely convinced.
>
> Mainly, I do not see the purpose of the tree argument.
> If it is used to modify the rendering (like putting it in bold),
> then I feel that this should really be done in the tree structure itself,
> because we might rewrite that structure in a way that it does what you
> want anyway. The tree argument is only needed as a query,
> but this, in its turn, is only needed if we do not return a box,
> since the box already contains all information.
>
> Also, the path argument is not a context but a path to a subtree.

In the RFC, I said:
The tree will be typeset as if it was located at the given <var|context> path

What I mean is that the returned box is the result of typesetting the tree 
parameter after rewriting. We need to pass the tree to typeset as a parameter 
because that information cannot be inferred from the identity of the notified 
rewriter object, because rewriter is a Facade.

To rewrite and typeset the parameter properly, we need to specify a context. 
The path parameter specifies a position in the notional output ttree of the 
rewriter issuing the force message. The tree parameter will be rewritten and 
typeset as if it was inserted at that position.

That allows the forcing of a document fragment which is not actually present 
in the tree. Maybe you do not agree with that, but I think that approach is 
sounder from a functional programming perspective.
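
A toy sketch of that contract, to show what "as if it was inserted at that 
position" means. The box here only records the text and whether it would be 
rendered in bold, the environment is reduced to a single "bold" tag, and the 
convention that the context is the path to the parent followed by a child 
position is my assumption for the example; only the shape of the prototype 
"box force(tree, path context)" comes from the RFC.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct tree {
  std::string label;
  std::vector<tree> children;
};

// Hypothetical path type: child indices from the root.
using path = std::vector<int>;

// Toy result of typesetting: the text and the environment it was set in.
struct box {
  std::string text;
  bool bold;
};

class rewriter {
  tree output_;  // notional output tree of this rewriter

public:
  explicit rewriter(tree output) : output_(std::move(output)) {}

  const tree& output() const { return output_; }

  // Typeset t as if it were inserted at position `context` of the output
  // tree. The output tree itself is not modified (the method is const).
  box force(const tree& t, const path& context) const {
    const tree* n = &output_;
    bool bold = (n->label == "bold");
    // Walk down to the parent of the insertion point, collecting the
    // environment on the way; the last index is the child position.
    for (std::size_t k = 0; k + 1 < context.size(); ++k) {
      n = &n->children.at(static_cast<std::size_t>(context[k]));
      if (n->label == "bold") bold = true;
    }
    return box{t.label, bold};
  }
};
```

The same fragment forced at two different contexts gives two different 
boxes, and the fragment never has to be present in the tree, which is the 
functional behaviour I am after.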

First pass typesetting
----------------------

> > I am one of those people who feel comfortable when they know a bit more
> > than they strictly have to, so I assume the same for my readers. I know I
> > have not yet a full understanding of how the typesetter works, but I
> > think the little I said was correct. If it is not, I would really like
> > the correct version.
>
> Yes, I will give this to you, but I have no time for that right now,
> so I want to concentrate on the information that you really need most.

Thanks in advance. I will leave that section in place until I have more 
precise information. Since I had quite a look at the typesetting system 
before sending my last patch (about moving the caret out of multicols), I 
think it is not too wrong.

RFC layout
----------

> > I will distribute the document in A5 papyrus, since I think that A4
> > papyrus is too wide for comfortable on-screen reading. Given the
> > typesetting time of the document, automatic page type is not an option.
>
> That is fine. Please use 600 dpi fonts by the way.

If you really want me to, I will do it.

But I want to point out that the optimal anti-aliasing quality is obtained when 
using shrinking=3 (so, dpi=360). With shrinking=2 (dpi=240), the text is 
still ragged, but with shrinking=5 (dpi=600), the text has much less 
contrast. If you want to be convinced, just use xmag on the same text with 
different dpi and shrinking settings.

I agree that more antialiasing makes the page overall more beautiful, but 
more contrast makes the text less tiring to work with, since the eye can 
accommodate on the screen more easily.
-- 

                                  -- David --