Re: [shell-script] Future of this mailing list
From: Jorge Barros de Abreu
Subject: Re: [shell-script] Future of this mailing list
Date: Fri, 1 Nov 2019 06:23:22 -0300
User-agent: Mutt/1.6.1 (2016-04-27)
Has anyone made the backup yet?
If everyone starts backing up at the same time,
Yahoo will misinterpret these backup attempts.
Below are the READMEs of two suggested backup programs
posted here on the list (I don't remember by whom, sorry).
# YahooGroups-Archiver
#### A simple python script that archives all messages from a public Yahoo Group
YahooGroups-Archiver allows you to make a backup copy of all the messages in a
public group. Not only is all the message content downloaded, but also all
other raw data that Yahoo uses to display the messages.
Messages are downloaded in a JSON format, with one .json file per message.
There is support for private groups, but this requires that you have a Yahoo
groups account that has access to the private groups you want to archive. See
the 'Private Groups' section for more info.
Works with both Python 2 and Python 3.
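Because each message ends up in its own .json file, the archive is easy to inspect afterwards. The following is a minimal sketch, not part of the tool: it assumes Python 3, that the files sit together in one directory, and that each file holds a single JSON document. The helper name `load_archive` is hypothetical, and since the layout of Yahoo's raw data is not described here, the code makes no assumption about field names.

```python
import json
from pathlib import Path


def load_archive(archive_dir):
    """Return {file stem: parsed JSON} for every .json file in archive_dir.

    Assumption: one JSON document per file, as the README describes.
    Unreadable or malformed files are skipped rather than aborting.
    """
    messages = {}
    for path in sorted(Path(archive_dir).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as handle:
                messages[path.stem] = json.load(handle)
        except (OSError, ValueError):
            # json.JSONDecodeError and UnicodeDecodeError are ValueErrors.
            continue
    return messages
```

Calling `load_archive("hypercard")` would then give a dictionary keyed by file name (without the extension), assuming the archive directory is named after the group.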
## Usage
**`python archive_group.py <groupName> [options] [nologs]`**
where *`<groupName>`* is the name of the group you wish to archive (e.g:
hypercard)
**Options**
* *`update`* - the default. Archive all new messages since the last time the
script was run
* *`retry`* - Archive any new messages, and attempt to archive any messages
that could not be downloaded last time
* *`restart`* - Delete all previously archived messages and archive again from
scratch
Please note that you can specify only one *Option*. If you specify more than
one, only the first is used and the others are ignored.
By default a log file called <groupname>.txt is created and stores information
such as what messages could not be received. This is entirely for the benefit
of the user: it's not needed at all by the script during any re-runs (although
re-runs will append new information to the log file). If you don't want a log
file to be created or added to, add the `nologs` keyword when you call the
script.
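The command-line rules above (one group name, at most one effective option, an optional `nologs` keyword) can be sketched as follows. This is an illustration of the described behaviour only, not the script's actual parsing code; the function name `parse_args` and the returned tuple are assumptions.

```python
MODES = ("update", "retry", "restart")


def parse_args(argv):
    """Split argv (without the script name) into (group, mode, write_log)."""
    if not argv:
        raise SystemExit(
            "usage: python archive_group.py <groupName> [options] [nologs]"
        )
    group, rest = argv[0], argv[1:]
    # Only the first recognised option counts; any later ones are ignored.
    mode = next((arg for arg in rest if arg in MODES), "update")
    # The log file is written unless the nologs keyword is present.
    write_log = "nologs" not in rest
    return group, mode, write_log
```

With this sketch, `parse_args(["hypercard", "retry", "restart"])` selects `retry`, because `restart` comes second and is dropped.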
## Private Groups
It is possible to archive private groups using this tool, but the way to go
about doing this is slightly fiddly at the moment. Rather than simply providing
your login information for the Yahoo account that has access to the private
groups, you need to provide two pieces of information from Yahoo's login
cookies (small files created by web browsers to store data for various uses,
such as allowing you to login to websites and then stay logged in for a certain
period of time).
Cookie information can be found through the use of a plug-in for your web
browser. (I use 'Cookie Manager' on Firefox, although there are many other
options for Firefox and other browsers). The two cookies you are looking for
are called *Y* and *T*, and they are linked to the domain *yahoo.com*. Extract
the data from these cookies, and paste it into the appropriate variables in the
*archive_group.py* script. You should now be able to archive a private group.
Please note that this support is still experimental. One important issue to
consider is that a cookie will expire after a certain amount of time, which
varies between computers. This means that you may have to re-fetch the *Y* and
*T* cookie data every few days, or you will not be able to archive private
groups.
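For readers unfamiliar with how cookie values are passed along with a request, here is a minimal sketch using only the Python 3 standard library. How the script itself sends the cookies is not stated here, so treat this as a general illustration: the function names are hypothetical, and the request is only built, never sent.

```python
import urllib.request


def build_cookie_header(cookie_y, cookie_t):
    """Join the Y and T cookie values into a single Cookie header value."""
    return "Y={}; T={}".format(cookie_y, cookie_t)


def make_request(url, cookie_y, cookie_t):
    """Create (but do not send) a request carrying both cookies."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", build_cookie_header(cookie_y, cookie_t))
    return request
```

When the cookies expire, only the two values passed in need to be replaced; the rest stays the same.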
## Note
Yahoo attempts to block connections that it deems to be "spamming", and so
after around 15,000 messages have been downloaded it is highly likely that
Yahoo will block you. This is OK: the script will automatically stop, and Yahoo
should unblock you after around two hours. Running the script again once you
have been unblocked will just continue where it left off. (Unless you run with
the *`restart`* option, of course!)
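The stop-and-resume behaviour described above can be pictured with the sketch below. It is not the script's real bookkeeping, which is not described here; it assumes that files are named `<message id>.json` and that a blocked request is signalled by the caller-supplied `fetch` function returning `None`. All names are hypothetical.

```python
import os


def next_message_id(archive_dir):
    """Return the ID after the highest-numbered <id>.json file, or 1 if none."""
    ids = []
    for name in os.listdir(archive_dir):
        stem, ext = os.path.splitext(name)
        if ext == ".json" and stem.isdigit():
            ids.append(int(stem))
    return max(ids) + 1 if ids else 1


def archive_from(start_id, last_id, fetch, save):
    """Fetch messages start_id..last_id, stopping at the first blocked request.

    fetch(msg_id) returns the message data, or None when the server refuses
    the request. save(msg_id, data) stores one message.
    Returns the ID to resume from on the next run.
    """
    for msg_id in range(start_id, last_id + 1):
        data = fetch(msg_id)
        if data is None:
            return msg_id  # blocked: resume here once unblocked
        save(msg_id, data)
    return last_id + 1
```

A later run would call `next_message_id` on the archive directory and pass the result as `start_id`, so nothing already saved is downloaded again.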
## Credits
Thanks to the [Archive Team](http://archiveteam.org/) for making [information
about the Yahoo Groups
API](http://www.archiveteam.org/index.php?title=Yahoo!_Groups) available.
**********************
# yahoo-groups-backup
A python script to backup the contents of Yahoo! groups, be they private or
public.
## Setup/Requirements
The project requires Python 3.5+, Mongo, and a computer with a GUI as Selenium
is used for the scraping (to be able to handle private groups).
[virtualenv](https://virtualenv.pypa.io/en/stable/) is recommended.
    git clone https://github.com/csaftoiu/yahoo-groups-backup.git
    cd yahoo-groups-backup
    pip install -r requirements.txt
## Example
To scrape an entire site, say the `concatenative` group:
    ./yahoo-groups-backup.py scrape_messages concatenative
This will shove all the messages into a Mongo database (default
`localhost:27017`), into the database of the same name as the group.
To scrape the files as well (though this group has no files):
    ./yahoo-groups-backup.py scrape_files concatenative
To dump the site as a human-friendly, fully static (i.e. viewable from the file
system) website:
    ./yahoo-groups-backup.py dump_site concatenative concatenative_static_site
Then simply open `concatenative_static_site/index.html` and browse!
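If you want to run the three steps above in one go, a small wrapper along these lines may help. It is only a sketch: the subcommand names come from the example above, while the helper names `backup_commands` and `run_backup` are hypothetical, and it assumes the script is executable in the current directory.

```python
import subprocess

SCRIPT = "./yahoo-groups-backup.py"


def backup_commands(group, site_dir):
    """Return the three command lines from the example, in order."""
    return [
        [SCRIPT, "scrape_messages", group],
        [SCRIPT, "scrape_files", group],
        [SCRIPT, "dump_site", group, site_dir],
    ]


def run_backup(group, site_dir):
    """Run each step in turn; check=True stops at the first failing step."""
    for command in backup_commands(group, site_dir):
        subprocess.run(command, check=True)
```

For the example group, that would be `run_backup("concatenative", "concatenative_static_site")`.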
## Full Usage
To see the full usage:
    ./yahoo-groups-backup.py -h
--
Stardate 2458788.887708
http://sites.google.com/site/ficmatinf
I wish you Peace, Long Life, and Prosperity.
Messages in UTF-8 plain-text format with accents are welcome.