Re: <reductions>

bison-patches

From:	Wojciech Polak
Subject:	Re: <reductions>
Date:	Tue, 09 Oct 2007 18:26:45 +0200

Hello,

On Sun, 30 Sep 2007, Akim Demaille wrote:
> I can be wrong, but I'd feel better if the XML file was
> without redundancy, even if that requires a bit more work
> from the XSLT tools.  Work that I guess can be factored with
> an XSLT library tailored to our XML format (I'm using words
> I understand, but which I never practiced for real, so I
> might suggest stupid things here :).

I'd feel better too, but sometimes redundancy can simplify
the processing. See below...

On Sun, 30 Sep 2007 20:57:40 -0400 Joel E. Denny wrote:
> In the automaton, instead of:
>      <itemset>
>         <rule number="0">
>           <lhs>$accept</lhs>
>           <rhs>
>             <symbol class="nonterminal">exp</symbol>
>             <symbol class="terminal">$end</symbol>
>             <point/>
>           </rhs>
>         </rule>
>       </itemset>
> 
> we could have:
> 
>       <itemset>
>         <item rule-number="0" marker="2" kernel="true" />
>       </itemset>

I wrote a patch that generates the XML shown above (except
for the kernel attribute, which I haven't finished). The
resulting XML is smaller, but the processing time is longer
(XSLT via xsltproc). File sizes:

Before (with --report=all):
212K    anubis.xml
1,4M    awk.xml
144K    bison.xml
28K     calc.xml
2,1M    c.xml
8,0K    errors.xml
1,1M    pascal.xml
964K    rewrite.xml
96K     sieve.xml

After (less redundancy, still with --report=all):
116K    anubis.xml
592K    awk.xml
108K    bison.xml
16K     calc.xml
796K    c.xml
8,0K    errors.xml
504K    pascal.xml
392K    rewrite.xml
60K     sieve.xml
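
The listings above can be turned into per-file reductions with a short
script. This is only a sketch: the sizes are copied from the two listings,
and it assumes they are du-style figures with a comma as the decimal
separator and binary units (1M = 1024K). Under that assumption the nine
files shrink by a bit under 60% overall.

```python
# File sizes as reported in the message (comma is the decimal separator).
BEFORE = {"anubis.xml": "212K", "awk.xml": "1,4M", "bison.xml": "144K",
          "calc.xml": "28K", "c.xml": "2,1M", "errors.xml": "8,0K",
          "pascal.xml": "1,1M", "rewrite.xml": "964K", "sieve.xml": "96K"}
AFTER = {"anubis.xml": "116K", "awk.xml": "592K", "bison.xml": "108K",
         "calc.xml": "16K", "c.xml": "796K", "errors.xml": "8,0K",
         "pascal.xml": "504K", "rewrite.xml": "392K", "sieve.xml": "60K"}

# Assumption: binary units, as printed by `du -h`.
UNITS = {"K": 1, "M": 1024}


def to_kib(size: str) -> float:
    """Convert a string such as '1,4M' or '212K' to KiB."""
    number, unit = size[:-1], size[-1]
    return float(number.replace(",", ".")) * UNITS[unit]


def reduction(name: str) -> float:
    """Fraction by which a file shrank (0.0 means unchanged)."""
    return 1.0 - to_kib(AFTER[name]) / to_kib(BEFORE[name])


def total_reduction() -> float:
    """Fraction by which the whole set of files shrank."""
    before = sum(to_kib(s) for s in BEFORE.values())
    after = sum(to_kib(s) for s in AFTER.values())
    return 1.0 - after / before


if __name__ == "__main__":
    for name in BEFORE:
        print(f"{name:12s} {reduction(name):6.1%}")
    print(f"{'total':12s} {total_reduction():6.1%}")
```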

And the processing time:

Before (processing the XML files above, generated with --report=state and --report=all):

$ time make text
for i in xml-state/*.xml; do \
 xsltproc xslt/xml2text.xsl $i >output-state-from-xml/`basename $i`.output; \
done
for i in xml-all/*.xml; do \
 xsltproc xslt/xml2text.xsl $i >output-all-from-xml/`basename $i`.output; \
done

real    0m6.994s
user    0m6.500s
sys     0m0.147s

After (with less redundancy):

$ time make text
for i in xml-state/*.xml; do \
 xsltproc xslt/xml2text.xsl $i >output-state-from-xml/`basename $i`.output; \
done
for i in xml-all/*.xml; do \
 xsltproc xslt/xml2text.xsl $i >output-all-from-xml/`basename $i`.output; \
done

real    0m21.371s
user    0m21.216s
sys     0m0.115s

(perhaps my xml2text.xsl is not perfect, but still...)
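
The extra step a consumer takes on with the compact form can be sketched in
Python (not XSLT, and not a reproduction of xml2text.xsl): every
<item rule-number="..." marker="..."/> has to be resolved against the rule
list before it can be printed. The <report>, <grammar> and <automaton>
wrapper elements and the function names are assumptions made for the
example; only <rule>, <lhs>, <rhs>, <symbol> and <item> come from the
thread. The sketch indexes the rules once so that each lookup is cheap. In
XSLT the analogous device would be xsl:key; whether the stylesheet uses one
is not stated in this message, and the sketch makes no claim about where
xsltproc actually spends its time.

```python
import xml.etree.ElementTree as ET

# Hypothetical document layout; only the rule and item elements
# follow the snippets quoted in the thread.
SAMPLE = """\
<report>
  <grammar>
    <rule number="0">
      <lhs>$accept</lhs>
      <rhs>
        <symbol class="nonterminal">exp</symbol>
        <symbol class="terminal">$end</symbol>
      </rhs>
    </rule>
  </grammar>
  <automaton>
    <itemset>
      <item rule-number="0" marker="2"/>
    </itemset>
  </automaton>
</report>
"""


def index_rules(root):
    """Map rule number -> (lhs, [rhs symbols]), built in one pass."""
    rules = {}
    for rule in root.iter("rule"):
        rhs = [sym.text for sym in rule.iter("symbol")]
        rules[rule.get("number")] = (rule.findtext("lhs"), rhs)
    return rules


def expand_items(root):
    """Render each compact <item> as 'lhs: sym ... . sym ...'."""
    rules = index_rules(root)
    lines = []
    for item in root.iter("item"):
        lhs, rhs = rules[item.get("rule-number")]
        marker = int(item.get("marker"))
        # The marker counts the symbols that precede the point.
        parts = rhs[:marker] + ["."] + rhs[marker:]
        lines.append(f"{lhs}: {' '.join(parts)}")
    return lines


if __name__ == "__main__":
    for line in expand_items(ET.fromstring(SAMPLE)):
        print(line)
```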

Although the XML is smaller, processing it with XSLT is a little
more difficult (I adjusted xml2text.xsl to generate exactly the
same output as the original one in CVS) and about three times
slower (21.4s vs. 7.0s real time). For me this is a trade-off
between disk space on one side and performance and ease of
processing on the other. Even if the XSLT is quite easy to
adjust, it can be very difficult or even impossible to process
the less redundant XML in a straightforward way with SAX,
because SAX is stream-oriented, event-driven processing.
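
To make the SAX concern concrete, here is a hedged sketch using Python's
xml.sax. The handler sees elements one at a time, so to print a compact
<item> it must keep every rule seen so far in memory, and it must queue any
item that arrives before the rule it refers to. The <report>, <grammar> and
<automaton> wrappers, the ItemExpander class and the expand() helper are
assumptions for illustration; only <rule>, <lhs>, <symbol> and <item> come
from the thread.

```python
import xml.sax

# Hypothetical layout; here the rules happen to precede the automaton.
SAMPLE = b"""\
<report>
  <grammar>
    <rule number="0">
      <lhs>$accept</lhs>
      <rhs><symbol>exp</symbol><symbol>$end</symbol></rhs>
    </rule>
  </grammar>
  <automaton>
    <itemset><item rule-number="0" marker="2"/></itemset>
  </automaton>
</report>
"""


class ItemExpander(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.rules = {}     # rule number -> (lhs, rhs); must stay in memory
        self.pending = []   # items seen before the rule they reference
        self.lines = []     # expanded items, in the order they were resolved
        self._rule = None   # number of the rule being read, if any
        self._lhs, self._rhs = "", []
        self._text = None   # character buffer while inside <lhs>/<symbol>

    def startElement(self, name, attrs):
        if name == "rule":
            self._rule, self._lhs, self._rhs = attrs["number"], "", []
        elif name in ("lhs", "symbol") and self._rule is not None:
            self._text = []
        elif name == "item":
            self._emit(attrs["rule-number"], int(attrs["marker"]))

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "lhs" and self._text is not None:
            self._lhs, self._text = "".join(self._text), None
        elif name == "symbol" and self._text is not None:
            self._rhs.append("".join(self._text))
            self._text = None
        elif name == "rule":
            self.rules[self._rule] = (self._lhs, self._rhs)
            self._rule = None

    def endDocument(self):
        # Retry items that referred to rules defined later in the stream.
        pending, self.pending = self.pending, []
        for number, marker in pending:
            self._emit(number, marker)

    def _emit(self, number, marker):
        if number not in self.rules:
            self.pending.append((number, marker))
            return
        lhs, rhs = self.rules[number]
        parts = rhs[:marker] + ["."] + rhs[marker:]
        self.lines.append(f"{lhs}: {' '.join(parts)}")


def expand(data: bytes):
    """Parse the XML in `data` and return the expanded item lines."""
    handler = ItemExpander()
    xml.sax.parseString(data, handler)
    return handler.lines


if __name__ == "__main__":
    print("\n".join(expand(SAMPLE)))
```

With the redundant form none of this bookkeeping is needed, since each
itemset already carries its own lhs and rhs symbols.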

My patch for the C code and the XSLT, made against yesterday's CVS
head, is attached, although it won't work after Joel's commits from
today.

On 30 Sep 2007, Joel E. Denny wrote:
> xml2xhtml.xsl and xml2text.xsl now share a template for computing 
> conflicts.  As Akim suggested, I've started a library.  I named it 
> bison.xsl. 

Very good idea!

> I committed the following.

One more thing: I thought bison-patches (and similar lists)
was for posting changes before committing them (so we can
discuss the best solutions, etc.), not after...
Anyway, good work, Joel.

> As we refactor the XML implementation to remove redundancies,
> this will make regression testing much easier.

Finally, I would be very careful about trying to remove all
redundancy from the XML. Disk space is cheap, but processing
time, performance, and/or ease of processing might not be (XSLT
is not the only way to process XML)... But of course we should
try to make the best of it :).

Regards,
Wojciech

bison-cvs.diff
Description: Text Data
