Re: robots.txt in git and online are not the same
From: Mark Polesky
Subject: Re: robots.txt in git and online are not the same
Date: Sun, 30 Jun 2013 14:18:08 -0700 (PDT)
Phil Holmes wrote:
>> If robots.txt was getting updated properly, all of our
>> Google search bar problems would be solved. We could
>> then stop telling Google to restrict the search results
>> to a particular version from the search box itself. The
>> robots.txt file only allows the current stable docs to
>> be indexed.
>
> No - it would (AFAICS) prevent indexing docs prior to
> current stable. It would still index current development,
> which I believe remains correct.

I know I've been out of the loop, but when was it decided
that we should allow Google to index the development docs?
The CG indicates that the robots.txt file should disallow
the current devel docs with the line
"Disallow: /doc/v2.CURRENT-DEVELOPMENT/":
http://lilypond.org/doc/v2.17/Documentation/contributor/major-release-checklist#Housekeeping-requirements
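(For anyone following along: per the CG, the file in git would then look
something like the sketch below. The exact version numbers and the full
list of disallowed trees are my guesses, not the contents of the real
file.)

```
# Sketch only -- not the actual git/Documentation/web/server/robots.txt.
User-agent: *
# superseded stable docs: keep them out of the index
Disallow: /doc/v2.12/
Disallow: /doc/v2.14/
# current development docs (2.17 at the moment):
Disallow: /doc/v2.17/
# note: current stable (/doc/v2.16/) is NOT disallowed, so it gets indexed
```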
>> By the way, fixing that would kill 3 items in the
>> tracker with one blow:
>>
>> Issue 2909: Manual search returns results from wrong version
>> http://code.google.com/p/lilypond/issues/detail?id=2909
>> Issue 3209: Searching stable release documentation should only
>> return results from stable release
>> http://code.google.com/p/lilypond/issues/detail?id=3209
>> Issue 3367: Web/Docs: LilyPond version is not clear on docs web pages
>> http://code.google.com/p/lilypond/issues/detail?id=3367
>
> Again - I don't think it would fix this, because users
> would still confuse current stable and current
> development. We had a lot of discussion about this
> problem on -user, and I think this is still a positive
> fix.

But current development docs should not appear on Google.
I thought that was decided years ago:
http://lists.gnu.org/archive/html/lilypond-devel/2009-11/msg00221.html

> OK - I've checked the server, and you're quite right -
> there appears no mechanism for
> git/Documentation/web/server/robots.txt to update the root
> of the web server.

That is a bug, and if no one has a solution ready, it needs
to be added to the tracker, either as a new issue or as an
addendum to #2909, #3209, or #3367. I think all 3 could
profitably be merged into one.

> I believe that make website copies it to
> /website/robots.txt, which is essentially useless. As I
> see it, there are 3 options:
>
> 1) I could manually copy robots.txt. This is not a
> long-term solution, but would be a step forward right
> now. If Mark wants me to do this and no-one shouts,
> I will.
>
> 2) We could have a Cron job on the server to do this.
> This strikes me as less good than
>
> 3) we could update make website to do this.

Option no. 3! I'm not opposed to option 1 right now, as
long as option 3 is recorded in the tracker. Or if anyone
knows how to fix it, feel free to chime in!
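If it helps: option 3 (or the cron job in option 2) really only needs a
one-line copy step. Here's a runnable sketch; the checkout and docroot
paths are placeholders I made up, since I don't know the real locations
on the server:

```shell
#!/bin/sh
# Sketch of option 3: a post-build step that copies the robots.txt
# tracked in git into the web server's document root.
set -e

# Placeholder paths -- the real checkout and docroot on lilypond.org
# are assumptions on my part. Override via environment if needed.
GIT_TREE=${GIT_TREE:-/tmp/demo-git-tree}
WEB_ROOT=${WEB_ROOT:-/tmp/demo-web-root}

# Demo setup so this sketch runs anywhere; the real server would
# already have both directories and the file from git.
mkdir -p "$GIT_TREE/Documentation/web/server" "$WEB_ROOT"
printf 'Disallow: /doc/v2.17/\n' > "$GIT_TREE/Documentation/web/server/robots.txt"

# The actual fix: one copy, run from "make website" (option 3)
# or from a cron job (option 2).
cp "$GIT_TREE/Documentation/web/server/robots.txt" "$WEB_ROOT/robots.txt"

# Sanity check: the live file now matches git.
cmp -s "$GIT_TREE/Documentation/web/server/robots.txt" "$WEB_ROOT/robots.txt" \
  && echo "robots.txt is in sync"
```

For option 2, pointing a cron entry at a script like this (e.g.
`0 4 * * * /path/to/update-robots.sh`) would do the same thing nightly,
though baking the copy into make website keeps everything in one place.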
Thanks.
- Mark