Re: [shell-script] Future of this mailing list


From: Jorge Barros de Abreu
Subject: Re: [shell-script] Future of this mailing list
Date: Fri, 1 Nov 2019 06:23:22 -0300
User-agent: Mutt/1.6.1 (2016-04-27)

Has anyone already made the backup?

If everyone starts backing up at the same time, Yahoo
will misinterpret these backup attempts.

Below are the READMEs of two suggested backup programs
posted here on the list (I don't remember by whom, sorry).

# YahooGroups-Archiver

#### A simple python script that archives all messages from a public Yahoo Group

YahooGroups-Archiver allows you to make a backup copy of all the messages in a 
public group. Not only is all the message content downloaded, but also all 
other raw data that Yahoo uses to display the messages.

Messages are downloaded in a JSON format, with one .json file per message.
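
Because each message ends up in its own .json file, the archive can be inspected with a few lines of Python. The sketch below is not part of YahooGroups-Archiver. It assumes the files sit in a folder named after the group (`hypercard` is used purely as an illustration) and makes no assumption about the fields inside each file.

```python
import json
from pathlib import Path


def load_messages(archive_dir):
    """Load every .json file in archive_dir into a dict keyed by file stem."""
    messages = {}
    for path in sorted(Path(archive_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            messages[path.stem] = json.load(handle)
    return messages


if __name__ == "__main__":
    # "hypercard" is an assumed folder name, used only for illustration.
    archive = load_messages("hypercard")
    print(f"Loaded {len(archive)} archived messages")
```

Keying the result by file stem keeps the link between each parsed message and the file it came from.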

There is support for private groups, but this requires that you have a Yahoo 
groups account that has access to the private groups you want to archive. See 
the 'Private Groups' section for more info.

Works with both Python 2 and Python 3.

## Usage
**`python archive_group.py <groupName> [options] [nologs]`**
where *`<groupName>`* is the name of the group you wish to archive (e.g: 
hypercard)

**Options**
* *`update`* - the default. Archive all new messages since the last time the 
script was run
* *`retry`* - Archive any new messages, and attempt to archive any messages 
that could not be downloaded last time
* *`restart`* - Delete all previously archived messages and archive again from 
scratch

Please note that you can only have one *Option*; if you specify more than one, 
only the first will be used and the others will be ignored.
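
The "first option wins" rule can be pictured with a small sketch. This is not the code from archive_group.py, only one way such parsing might behave. The function name `parse_args` and the `VALID_OPTIONS` tuple are hypothetical.

```python
VALID_OPTIONS = ("update", "retry", "restart")


def parse_args(argv):
    """Return (group_name, option, write_log) from a list of arguments.

    argv excludes the script name, e.g. ["hypercard", "retry", "nologs"].
    """
    if not argv:
        raise ValueError("a group name is required")
    group_name = argv[0]
    rest = argv[1:]
    # Only the first recognised option counts; later ones are ignored.
    option = next((arg for arg in rest if arg in VALID_OPTIONS), "update")
    write_log = "nologs" not in rest
    return group_name, option, write_log


print(parse_args(["hypercard", "retry", "restart", "nologs"]))
# ('hypercard', 'retry', False)
```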

By default a log file called <groupname>.txt is created and stores information 
such as what messages could not be received. This is entirely for the benefit 
of the user: it's not needed at all by the script during any re-runs (although 
re-runs will append new information to the log file). If you don't want a log 
file to be created or added to, add the `nologs` keyword when you call the 
script.

## Private Groups
It is possible to archive private groups using this tool, but the way to go 
about doing this is slightly fiddly at the moment. Rather than simply providing 
your login information for the Yahoo account that has access to the private 
groups, you need to provide two pieces of information from Yahoo's login 
cookies (small files created by web browsers to store data for various uses, 
such as allowing you to login to websites and then stay logged in for a certain 
period of time).

Cookie information can be found through the use of a plug-in for your web 
browser. (I use 'Cookie Manager' on Firefox, although there are many other 
options for Firefox and other browsers.) The two cookies you are looking for 
are called *Y* and *T*, and they are linked to the domain *yahoo.com*. Extract 
the data from these cookies, and paste it into the appropriate variables in the 
*archive_group.py* script. You should now be able to archive a private group.

Please note that this support is still experimental. One important issue to 
consider is that a cookie will expire after a certain amount of time, which 
varies between computers. This means that you may have to re-fetch the *Y* and 
*T* cookie data every few days, or you will not be able to archive private 
groups.
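
To show what the two cookie values are for, here is a minimal sketch of how they could be combined into an HTTP `Cookie` header using only the standard library. The function name `build_request`, the parameter names `cookie_y` and `cookie_t`, and the placeholder values are assumptions. The real script may store and send the cookies differently. Nothing is sent over the network here.

```python
import urllib.request


def build_request(url, cookie_y, cookie_t):
    """Create a request object that carries the Y and T cookies."""
    request = urllib.request.Request(url)
    # Cookies travel in one header as name=value pairs joined by "; ".
    request.add_header("Cookie", f"Y={cookie_y}; T={cookie_t}")
    return request


# Placeholder values; paste the data copied from your browser instead.
req = build_request("https://groups.yahoo.com/", "PASTE_Y_HERE", "PASTE_T_HERE")
print(req.get_header("Cookie"))
```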

## Note
Yahoo attempts to block connections that it deems to be "spamming", so after 
around 15,000 messages have been downloaded it is highly likely that Yahoo will 
block you. This is OK: the script will stop automatically, and Yahoo should 
unblock you after around two hours. Running the script again once you have been 
unblocked will continue where it left off (unless you run with the *`restart`* 
option, of course!).
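
Since a blocked run simply stops and a later run picks up where it left off, the wait-and-retry cycle can be automated. The wrapper below is a sketch under stated assumptions, not part of the tool. It assumes archive_group.py is in the current directory, and it repeats a fixed number of rounds because the README does not say how the script signals that it has finished. The two-hour pause mirrors the estimate given in this note, and `archive_in_rounds` is a hypothetical name.

```python
import subprocess
import sys
import time

PAUSE_SECONDS = 2 * 60 * 60  # roughly the unblock time quoted in the README


def archive_in_rounds(group_name, rounds, run=subprocess.call, sleep=time.sleep):
    """Run the archiver `rounds` times, pausing between consecutive runs."""
    command = [sys.executable, "archive_group.py", group_name, "update"]
    for round_number in range(rounds):
        if round_number > 0:
            sleep(PAUSE_SECONDS)  # give Yahoo time to lift the block
        run(command)
```

Calling `archive_in_rounds("hypercard", rounds=3)` would start the archiver three times with two pauses in between. The `run` and `sleep` parameters exist only so the loop can be tried out without launching anything or actually waiting.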

## Credits
Thanks to the [Archive Team](http://archiveteam.org/) for making [information 
about the Yahoo Groups 
API](http://www.archiveteam.org/index.php?title=Yahoo!_Groups) available.

**********************

# yahoo-groups-backup
A python script to backup the contents of Yahoo! groups, be they private or 
public.

## Setup/Requirements

The project requires Python 3.5+, Mongo, and a computer with a GUI as Selenium 
is used for the scraping (to be able to handle private groups).

[virtualenv](https://virtualenv.pypa.io/en/stable/) is recommended.

    git clone https://github.com/csaftoiu/yahoo-groups-backup.git
    cd yahoo-groups-backup
    pip install -r requirements.txt

## Example

To scrape an entire group, say `concatenative`:

    ./yahoo-groups-backup.py scrape_messages concatenative

This will shove all the messages into a Mongo database (default 
`localhost:27017`), into the database of the same name as the group.
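
Once the scrape has finished, the stored data can be inspected from Python. This sketch assumes the `pymongo` package is installed and that Mongo runs at the default address mentioned here. It only lists whatever collections exist, since the README does not document their names. Both function names are hypothetical.

```python
def mongo_uri(host="localhost", port=27017):
    """Build the connection URI for the default Mongo instance."""
    return f"mongodb://{host}:{port}"


def list_group_collections(group_name, uri=None):
    """Return the collection names in the database named after the group."""
    # Imported here so the helper above works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri or mongo_uri())
    try:
        return client[group_name].list_collection_names()
    finally:
        client.close()
```

For the example group, `list_group_collections("concatenative")` would return the names of the collections the scraper created.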

To scrape the files as well (though this group has no files):

    ./yahoo-groups-backup.py scrape_files concatenative

To dump the site as a human-friendly, fully static (i.e. viewable from the file 
system) website:

    ./yahoo-groups-backup.py dump_site concatenative concatenative_static_site

Then simply open `concatenative_static_site/index.html` and browse!

## Full Usage

To see the full usage:

    ./yahoo-groups-backup.py -h

-- 
Stardate 2458788,887708
http://sites.google.com/site/ficmatinf
I wish you Peace, Long Life, and Prosperity.
Messages in UTF-8 plain-text format with accents are welcome.

