
RE: HURD source code at hurd.gnu.org


From: Jim Franklin
Subject: RE: HURD source code at hurd.gnu.org
Date: Thu, 10 May 2001 05:53:04 -0700

Beauty! This is far better than I anticipated. If you can get any of the cool
lxr features running, that would be icing on the cake =)

Jim

-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of
Ian Duggan
Sent: Thursday, May 10, 2001 3:35 AM
To: web-hurd
Subject: Re: HURD source code at hurd.gnu.org



> Great! Drop him a line directly (email address above) since he is not a
> member of the web-hurd mailing list and cc copies of the correspondence to
> address@hidden

I already spoke with him today. The gist of the conversations was that
we are limited to static HTML and we shouldn't depend on playing with
the mime types. It should still be workable, however.

We will have to set up a script that runs LXR and produces a static HTML
tree from it. This might involve mimicking calls to the script as a CGI
and spidering the resulting pages to produce a tree of static HTML
representing all the possible LXR outputs. There will be some
translation of links and filenames as well, since we'll need .html
extensions on all the pages.

I'm working on a prototype of it right now.
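Roughly, the pipeline would look something like the sketch below. (The local
LXR URL, directories, and rsync destination are just placeholders, and it
assumes a wget new enough to have -k/--convert-links and -E/--html-extension
to do the link and extension translation.)

  #!/bin/sh
  # Rough sketch only -- the URL, directories, and destination are placeholders.

  LXR_URL="http://localhost/cgi-bin/lxr"       # local LXR CGI (assumed location)
  OUTDIR="$HOME/lxr-static"                    # static tree gets built here
  DEST="hurd.gnu.org:/some/docroot/lxr/"       # illustrative rsync destination

  # Spider every page the CGI can produce into a static tree.
  #   -k rewrites links to point at the local copies,
  #   -E appends .html so the pages are served with the right mime type.
  wget -r -np -nH -k -E -P "$OUTDIR" "$LXR_URL"

  # Push the finished tree to the web host.
  rsync -az --delete "$OUTDIR/" "$DEST"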

I'm attaching the messages we exchanged.

--Ian








> Subject:
>          Hurd, LXR, and gnu.org
>    Date:
>          Tue, 08 May 2001 23:42:29 -0700
>    From:
>          Ian Duggan <address@hidden>
>      To:
>          address@hidden
>
>
>
>
> Hi,
>
> I think you were party to the recent discussions on address@hidden
> relating to adding LXR or GNU Global to the Hurd website for use in
> browsing the cross-indexed code.
>
> Jim Franklin asked me to get in touch with you and go over the issues
> that might arise in trying to use LXR in a batch mode. The general
> strategy I had was:
>
> 1) Set up LXR on a workstation and point it to a code checkout.
> 2) "wget -r" the local LXR to produce a tree of possible LXR outputs.
> 3) Translate filenames and links within HTML files so as not to invoke
> CGI actions when called (URL decode them, basically)
> 4) rsync it all to the website.
>
> Things that might prevent this strategy:
>
> 1) Does gnu.org allow overriding mime types in .htaccess files via
> DefaultType or ForceType?
> 2) Is it possible to have a regular rsync _to_ the hurd.gnu.org document
> tree?
> 3) Need to test and see how big the tree would be. It's all text, so I
> imagine it would be manageable. What are the file limits there?
>
> What are your thoughts/comments on this?
>
> --Ian








> Subject:
>              Re: Hurd, LXR, and gnu.org
>        Date:
>              Wed, 9 May 2001 18:41:30 -0400
>        From:
>              Paul Visscher <address@hidden>
>          To:
>              Ian Duggan <address@hidden>
>  References:
>              1
>
>
>
>
>
> Ian Duggan address@hidden said:
> > 1) Set up LXR on a workstation and point it to a code checkout.
> > 2) "wget -r" the local LXR to produce a tree of possible LXR outputs.
> > 3) Translate filenames and links within HTML files so as not to invoke
> > CGI actions when called (URL decode them, basically)
> > 4) rsync it all to the website.
>
> Were you thinking we would do this once, or every night, or what? Do you
> think it's possible to script this so we can say on gnudist "go get the
> hurd stuff and DTRT" and have it all magically work?
>
> > Things that might prevent this strategy:
> >
> > 1) Does gnu.org allow overriding mime types in .htaccess files via
> > DefaultType or ForceType?
>
> I honestly don't know. But, I would like to avoid this if possible. It
> tends to be un-portable to the mirrors, especially if they aren't apache
> or are apache and don't support .htaccess for whatever reason.
>
> > 2) Is it possible to have a regular rsync _to_ the hurd.gnu.org document
> > tree?
>
> Do you mean from gnudist to hurd.gnu.org? Sure, I don't see why that
> would be a problem. We are doing similar things (except with CVS
> checkouts instead of rsync) currently.
>
> > 3) Need to test and see how big the tree would be. It's all text, so I
> > imagine it would be manageable. What are the file limits there?
>
> address@hidden [03:38pm] ~$ df -h
> Filesystem            Size  Used  Avail  Capacity Mounted on
> /dev/sda2             8.1G  6.2G   1.4G     81%   /
>
> so we've got a bit to play with. If you don't think this will be enough,
> we can talk to the system hackers and have them try to clean some stuff
> up.
>
> Actually, we're looking to mirror ftp on gnudist, too, so things may get
> tight, but we can worry about that later.
>
> [ I just realized you may not know this -- gnudist.gnu.org is www.gnu.org ]
>
> --paulv






> Subject:
>              Re: Hurd, LXR, and gnu.org
>        Date:
>              Wed, 09 May 2001 16:01:54 -0700
>        From:
>              Ian Duggan <address@hidden>
>          To:
>              Paul Visscher <address@hidden>
>  References:
>              1 , 2
>
>
>
>
> > > 1) Set up LXR on a workstation and point it to a code checkout.
> > > 2) "wget -r" the local LXR to produce a tree of possible LXR outputs.
> > > 3) Translate filenames and links within HTML files so as not to invoke
> > > CGI actions when called (URL decode them, basically)
> > > 4) rsync it all to the website.
> >
> > Were you thinking we would do this once, or every night, or what? Do you
> > think it's possible to script this so we can say on gnudist "go get the
> > hurd stuff and DTRT" and have it all magically work?
>
> It was mentioned that once a week would be plenty. I was imagining that
> the heavy lifting would occur on someone's workstation or a machine with
> less restrictions than the gnu.org machines seem to have. I don't know
> anything about the gnu.org setup, so I'm learning from you as we speak.
>
> The final product would be transferred to the relevant gnu.org machine
> once a week, either by push or by pull. I don't think it matters which,
> though push would put the onus on the person managing the process. I
> think that might be better.
>
> The process could be easily scripted.
>
> > > 1) Does gnu.org allow overriding mime types in .htaccess files via
> > > DefaultType or ForceType?
> >
> > I honestly don't know. But, I would like to avoid this if possible. It
> > tends to be un-portable to the mirrors, especially if they aren't apache
> > or are apache and don't support .htaccess for whatever reason.
>
> Hmm. The mirrors seem to be the problem. So our only guarantee for the
> lowest common denominator across all the mirrors is plain old static
> html? That leaves out a lot of neat things that could be done to make
> development easier. It would be nice if there was a minimum level of
> capability required to be a mirror, but that is a different discussion.
>
> Does everything have to work on the mirrors, or even be mirrored? I
> don't think a source code reference would receive a lot of traffic, but
> then again, I've never set one up for a busy project before.
>
> > > 2) Is it possible to have a regular rsync _to_ the hurd.gnu.org document
> > > tree?
> >
> > Do you mean from gnudist to hurd.gnu.org? Sure, I don't see why that
> > would be a problem. We are doing similar things (except with CVS
> > checkouts instead of rsync) currently.
>
> I was meaning from the heavy lifting machine to the final resting place
> for the web pages. I don't think the actual indexing would occur on the
> gnudist machine. Is hurd.gnu.org a completely different machine, or just
> a virtual domain or something?
>
> > > 3) Need to test and see how big the tree would be. It's all text, so I
> > > imagine it would be manageable. What are the file limits there?
> >
> > address@hidden [03:38pm] ~$ df -h
> > Filesystem            Size  Used  Avail  Capacity Mounted on
> > /dev/sda2             8.1G  6.2G   1.4G     81%   /
> >
> > so we've got a bit to play with. If you don't think this will be enough,
> > we can talk to the system hackers and have them try to clean some stuff
> > up.
> >
> > Actually, we're looking to mirror ftp on gnudist, too, so things may get
> > tight, but we can worry about that later.
>
> I think that would be plenty of space. I believe we are talking about
> tens of megs, rather than hundreds here.
>
> --Ian








> Subject:
>              Re: Hurd, LXR, and gnu.org
>        Date:
>              Wed, 9 May 2001 19:20:57 -0400
>        From:
>              Paul Visscher <address@hidden>
>          To:
>              Ian Duggan <address@hidden>
>  References:
>              1 , 2 , 3
>
>
>
>
>
> Ian Duggan address@hidden said:
> > It was mentioned that once a week would be plenty. I was imagining
> > that the heavy lifting would occur on someone's workstation or a
> > machine with less restrictions than the gnu.org machines seem to
> > have. I don't know anything about the gnu.org setup, so I'm learning
> > from you as we speak.
> >
> > The final product would be transferred to the relevant gnu.org machine
> > once a week, either by push or by pull. I don't think it matters
> > which, though push would put the onus on the person managing the
> > process. I think that might be better.
> >
> > The process could be easily scripted.
>
> I'm not sure that running it on a machine outside of GNU's control is a
> good idea. But I'm not convinced it's a bad one, either. I need to think
> about it more.
>
> Other than that, this sounds fine.
>
> > Hmm. The mirrors seem to be the problem. So our only guarantee for the
> > lowest common denominator across all the mirrors is plain old static
> > html? That leaves out a lot of neat things that could be done to make
> > development easier. It would be nice if there was a minimum level of
> > capability required to be a mirror, but that is a different
> > discussion.
>
> Yeah. That's why the site uses very few graphics and only basic formatting --
> so all the different web browsers out there can render it -- and nothing
> dynamic, because we can't make the mirrors run certain things.
>
> It really does limit our options, which is a bit annoying, but once you
> convince yourself that it really is for the best, solutions become more
> clear. For example, lots of the internal stuff is generated from scripts
> that run every night.
>
> > Does everything have to work on the mirrors, or even be mirrored? I
> > don't think a source code reference would receive a lot of traffic,
> > but then again, I've never set one up for a busy project before.
>
> The idea behind the mirrors is severalfold: (a) take some load off our
> servers (in the past, we have succumbed to the Slashdot effect), (b)
> provide fast access to people who aren't in the US, (c) redundancy if
> our site is offline for whatever reason, and (d) try to get the message
> out to more people.
>
> It is nice to have a complete copy of the website on the mirrors. I'd
> like to have this stuff on all the mirrors, and I think we can pull it
> off, somehow.
>
> > I was meaning from the heavy lifting machine to the final resting place
> > for the web pages. I don't think the actual indexing would occur on the
> > gnudist machine. Is hurd.gnu.org a completely different machine, or just
> > a virtual domain or something?
>
> hurd.gnu.org is a virtual host on gnudist.gnu.org.
>
> > I think that would be plenty of space. I believe we are talking about
> > tens of megs, rather than hundreds here.
>
> Oh, we'll be fine, then.
>
> --paulv




> Subject:
>              Re: Hurd, LXR, and gnu.org
>        Date:
>              Wed, 09 May 2001 17:25:49 -0700
>        From:
>              Ian Duggan <address@hidden>
>          To:
>              Paul Visscher <address@hidden>
>  References:
>              1 , 2 , 3 , 4
>
>
>
>
> > > It was mentioned that once a week would be plenty. I was imagining
> > > that the heavy lifting would occur on someone's workstation or a
> > > machine with less restrictions than the gnu.org machines seem to
> > > have. I don't know anything about the gnu.org setup, so I'm learning
> > > from you as we speak.
> > >
> > > The final product would be transferred to the relevant gnu.org machine
> > > once a week, either by push or by pull. I don't think it matters
> > > which, though push would put the onus on the person managing the
> > > process. I think that might be better.
> > >
> > > The process could be easily scripted.
> >
> > I'm not sure that running it on a machine outside of GNU's control is a
> > good idea. But I'm not convinced it's a bad one, either. I need to think
> > about it more.
> >
> > Other than that, this sounds fine.
>
> Ok. The reason I was thinking that was because part of the process
> involves sucking down all the possible outputs of a CGI script. I was
> hoping for a quicker solution, but there is definitely a way to do it
> all statically, and have it run on gnudist as well, if needed.
>
> It should be possible to wrap the LXR CGI as a regular script and have
> the whole thing run locally, without the need for sockets and such. At
> most, we could probably get away with a FIFO or pipe or something. I'll
> pull down the pieces and look at it.
>
> What would be permissible on gnudist in terms of sockets and such? TCP?
> Unix? etc...
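(A rough sketch of what driving the CGI from the command line could look like.
The script name, query string, and output filename are made-up examples, and
it assumes the LXR scripts honor the standard CGI environment variables and
end their header block with a plain blank line.)

  #!/bin/sh
  # Rough sketch: run the LXR CGI directly by faking the CGI environment.
  # Script name, query string, and output file are made-up examples.

  export GATEWAY_INTERFACE="CGI/1.1"
  export REQUEST_METHOD="GET"
  export SERVER_NAME="localhost"
  export SCRIPT_NAME="/cgi-bin/source"
  export QUERY_STRING="v=head&file=Makefile"    # hypothetical query

  # Run the CGI and strip the header block it prints before the HTML
  # (everything up to the first blank line).
  ./source | sed '1,/^$/d' > source_v_head_file_Makefile.html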
>
> > It really does limit our options, which is a bit annoying, but once you
> > convince yourself that it really is for the best, solutions become more
> > clear. For example, lots of the internal stuff is generated from scripts
> > that run every night.
>
> Ok. I think there's a way to do that. We'll just have to rename the
> files to .html and such, so that the mime types work. It's definitely
> possible, it'll just take a little bit of extra work.
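(A minimal sketch of that renaming pass -- the directory is a placeholder, and
wget's -k/-E options from the earlier sketch would make most of it unnecessary.)

  #!/bin/sh
  # Minimal sketch: give every extension-less page an .html suffix.
  # The directory is an assumed example; links inside the pages need the
  # same translation (wget's -k/-E options can handle both in one step).

  cd "$HOME/lxr-static" || exit 1

  find . -type f ! -name '*.html' | while IFS= read -r f; do
      mv "$f" "$f.html"
  done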
>
> > It is nice to have a complete copy of the website on the mirrors. I'd
> > like to have this stuff on all the mirrors, and I think we can pull it
> > off, somehow.
>
> Agreed. I'll look at the stuff and find a way to do it scripted and
> without messing with the apache mime types. That should definitely be
> possible.
>
> --Ian

_______________________________________________
Web-hurd mailing list
address@hidden
http://mail.gnu.org/mailman/listinfo/web-hurd



