Re: idea for Google Summer of Code project: html-reading info
From: Gavin Smith
Subject: Re: idea for Google Summer of Code project: html-reading info
Date: Fri, 12 Feb 2016 07:25:17 +0000
On 10 February 2016 at 19:42, Per Bothner <address@hidden> wrote:
> I suggest the texinfo project sponsor the following proposed project
> for Google Summer of Code, under the GNU umbrella. Ideally we'd
> want two mentors. I can be one of them, but it would be good to
> have someone familiar with the internals of the info program.
>
> ** Enhance GNU info documentation reader to read html files **
I don't know a great deal about Google Summer of Code or whether it
would be useful. Does it have a good track record of producing
improvements that continue after the project is finished?
> There is no styling,
> and since lines are pre-wrapped, info can't adjust to different terminal
> widths.
Styling on a text terminal display is limited anyhow. The point about
text filling is valid. Info files can be generated with a different text
width when they are built, but this doesn't help later on if different
terminal widths are used. However, I expect that most of the time when
people are using various widths of terminal display, they are using a
graphical UI (X11), and they might prefer to use a graphical reader, as
long as it is good enough.
> We would like to deprecate the info format as a distribution format,
> while still using the texinfo source format and tool-chain. Instead,
> using html as the primary distribution format makes sense; we already
> have the tools to generate html.
It would make sense if the Info format didn't already exist and the
files were already being installed as HTML. Then it would make sense
to use the contents of those HTML files for displaying on a terminal.
But doing it this way now would add little to what we've already got,
apart from the text filling issue you mentioned.
The goal of having a reader that can read html files would be to take
advantage of graphical displays. Creating such a thing would be a
better use of someone's time (for example, Google Summer of Code). In
theory, it wouldn't have to process HTML files or be written in
JavaScript - that's just assumed to be the easiest way. (Could be in
FooML and FooScript for values of Foo.)
So here's the main question: what would be good ways to encourage
someone to write a graphical browser in JavaScript?
I'm not opposed to displaying HTML files on a text terminal, but it
seems like a waste of time at this stage. Maybe if/when HTML files are
regularly installed, that would be the right time to process HTML for the
text terminal, to avoid two separate lots of documentation files being
installed (HTML and Info).
(I've thought about ways of refilling Info, but haven't come up with
anything really reliable yet. I think Emacs has an option to do it,
but it didn't work when I tried it. I've had two main ideas. The first
is to mark re-fillable (or non-refillable) text with a trailing space on
each line. I believe there is an email transmission format that does
this, and that the Vim editor can also recognize it. The other idea
would be to detect non-refillable text by the fact that it didn't use
the whole of the available line, and possibly by the line indentation
as well. It's likely that any method would be unreliable, or at least
you could construct examples where it failed. Another complication is
detecting an end-of-sentence at the end of a line so that extra space
can be added after the full stop.)
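
To illustrate the trailing-space idea, here is a rough sketch in Python
(not the C of the Info reader). It assumes a convention, loosely in the
spirit of the format=flowed email convention (RFC 3676), where a line
ending in a space continues its paragraph and any other line is a hard
break. The function name refill and the choice to leave unmarked lines
untouched are my own assumptions, and the end-of-sentence spacing
problem is not handled.

```python
import textwrap


def refill(text, width):
    """Refill text whose soft line breaks are marked by a trailing space."""
    out = []    # finished output lines
    para = []   # pieces of the flowed paragraph being collected
    for line in text.split("\n"):
        if line.endswith(" "):
            # Soft break: the paragraph continues on the next line.
            para.append(line.strip())
            continue
        if para:
            # This unmarked line ends a flowed paragraph: join and rewrap.
            para.append(line.strip())
            out.extend(textwrap.wrap(" ".join(p for p in para if p), width))
            para = []
        else:
            # No marker and no pending paragraph: keep the line verbatim
            # (for example, preformatted text or a blank line).
            out.append(line)
    if para:
        # Input ended in the middle of a flowed paragraph.
        out.extend(textwrap.wrap(" ".join(p for p in para if p), width))
    return "\n".join(out)


sample = ("This is a paragraph that was \n"
          "wrapped narrowly when the \n"
          "file was built.\n"
          "    int x = 1;\n")
print(refill(sample, 80))
```

The weakness is the one noted above: it only works if whatever
generates the file marks every soft break, and any editor or tool that
strips trailing whitespace silently destroys the markers.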
> The task is to enhance the existing info program (which is part of the
> texinfo distribution) so it can search for and read either html-format
> files or info-format files. If it finds an html-format file, it needs to
> parse the html file and display it more-or-less the same way as if it
> found an info-format file, and respond to keystrokes in the same way.
>
> The task includes stripping out html tags; line-wrapping when
> appropriate; recognizing links, and applying minimal formatting. The
> easiest approach might be for the info program to convert on-the-fly each
> section ("node") of the html file to plain text similar to the info
> format, since the info program already knows how to handle that, and so
> you'd need minimal changes to the user interface. A slight complication
> is one might want to include ansi escape sequences for highlighting or
> colors.
>
> The converter does not need to handle generic html - just the html
> generated by the conversion program (makeinfo) from texinfo to html.
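
For concreteness, the conversion described in the quoted text (strip
tags, recognize links, wrap to the terminal width) might look roughly
like the sketch below. It is written in Python with the standard
html.parser and textwrap modules, not in the C of the Info program. The
class NodeText, the function render and the "[n]" link markers are
hypothetical, and only <p> and <a> are handled, which is far less than
what makeinfo actually emits.

```python
from html.parser import HTMLParser
import textwrap


class NodeText(HTMLParser):
    """Collect paragraph text and link targets from a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []    # finished paragraphs, whitespace-normalized
        self.links = []         # href values in document order
        self._buf = []          # text pieces of the current paragraph
        self._in_link = False   # inside an <a> element that has an href

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush_paragraph()
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
                self._in_link = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush_paragraph()
        elif tag == "a" and self._in_link:
            # Mark the link in the plain text by its 1-based number.
            self._buf.append(" [%d]" % len(self.links))
            self._in_link = False

    def handle_data(self, data):
        self._buf.append(data)

    def flush_paragraph(self):
        # Collapse runs of whitespace, since the HTML source is unfilled.
        text = " ".join("".join(self._buf).split())
        if text:
            self.paragraphs.append(text)
        self._buf = []


def render(html, width):
    """Return (plain text wrapped to width, list of link targets)."""
    parser = NodeText()
    parser.feed(html)
    parser.close()
    parser.flush_paragraph()
    body = "\n\n".join(textwrap.fill(p, width) for p in parser.paragraphs)
    return body, parser.links


text, links = render(
    '<p>See <a href="Invoking.html">Invoking info</a> for\n details.</p>', 72)
print(text)
print(links)
```

Even this toy version has to track state across tags. Menus, tables,
preformatted examples and index entries would each need their own
handling.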
Adding HTML-reading capabilities to the Info program would not be a
good idea, from my experience of working on its source. In fact, that
could be a good way to kill it, by making it buggy and complicated.