Wow, what a coincidence, I literally just installed this yesterday because I wanted to write my own command-line S3 uploader... yes, I know there's an official CLI [1], but I wanted some customized behavior and options (such as simultaneously backing up to S3 and Glacier) and thought it'd be a good chance to better understand AWS in general. I don't know what the previous boto was like, but boto3 has been pretty pain-free. Many of the error messages I've run into have been very actively discussed/patched/hacked around in the repo issues.
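For what it's worth, a minimal sketch of the dual-upload idea, assuming made-up bucket and vault names; put_object and upload_archive do the actual work:

    import boto3

    s3 = boto3.client("s3")
    glacier = boto3.client("glacier")

    def backup(path, key):
        """Push one file to both an S3 bucket and a Glacier vault."""
        with open(path, "rb") as f:
            data = f.read()
        # Regular S3 copy for quick access.
        s3.put_object(Bucket="my-backups", Key=key, Body=data)  # hypothetical bucket
        # Independent Glacier copy for cold storage.
        glacier.upload_archive(
            vaultName="my-backup-vault",  # hypothetical vault
            archiveDescription=key,
            body=data,
        )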
> such as simultaneously backing up to s3 and glacier
If you want that for every object in the bucket (or for specific files under a prefix), you can achieve it by adding a lifecycle rule:
1. Select the bucket
2. Add a new lifecycle rule
3. Action = Archive only
4. Archive to the Glacier Storage Class 0 days after the object's creation date (enter '0' for same-day archival)
5. Save the rule
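If you'd rather script it than click through the console, a rough boto3 equivalent might look like this (bucket name is made up):

    import boto3

    s3 = boto3.client("s3")

    # Transition every object to Glacier 0 days after creation (same-day archival).
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-backup-bucket",  # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "archive-to-glacier",
                    "Prefix": "",  # empty prefix matches the whole bucket
                    "Status": "Enabled",
                    "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
                }
            ]
        },
    )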
Sorry to go off topic, but why would you want to simultaneously back up to S3 and Glacier? Unless you're using S3 with Reduced Redundancy, you're not going to lose any files, so Glacier seems pointless. Also, working with S3 is way easier than working with Glacier, so if you're using S3 already I would just stick with that.
Redundancy is not the same as backup. You may not lose files to machine failure, but a code error, a data scientist deleting the wrong folder, etc., could easily wipe them out. Having a backup that isn't automatically tied to the S3 bucket can help prevent catastrophe in these cases.
That's solved by using Object Versioning in S3. That makes it easy to recover from accidental deletion. You can still permanently remove an object by deleting its versions, but that's a separate operation and not one you'd likely do by accident. Proper permissioning can stop it even being possible.
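Enabling it is a one-liner with the boto3 resource API (bucket name is hypothetical):

    import boto3

    s3 = boto3.resource("s3")

    # Turn on versioning: overwrites and deletes now keep the old copies around.
    s3.BucketVersioning("my-backup-bucket").enable()  # hypothetical bucket

With versioning on, a plain delete just adds a delete marker; recovering means removing the marker or restoring an older version, both of which are explicit operations.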
Beyond that, you're probably thinking about black-swan scenarios like AWS going out of business or a hacker breaching your system and deleting everything [1]. In that case you need to be completely separated from AWS (and careful about maintaining the separation), so using Glacier as well won't help anyway.
> Proper permissioning can stop it even being possible.
100% this. A little OT, but my standard Postgres backup procedure is streaming via WAL-E to S3. The user WAL-E runs as cannot delete versions. Any attempt to overwrite base backups or WAL segments just creates new versions of them. I have the versions expire after a couple of months (to reduce storage costs). I do the same for selected logs in /var/log/ via duply.
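One way to sketch the permissioning side (all names and ARNs here are made up) is a bucket policy that denies permanent version deletion to everyone except an admin principal:

    import json

    import boto3

    s3 = boto3.client("s3")

    # Deny s3:DeleteObjectVersion to every principal except a designated admin
    # role, so the backup user can write (and version) but never purge history.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "NoPermanentDeletes",
                "Effect": "Deny",
                "NotPrincipal": {"AWS": "arn:aws:iam::123456789012:role/backup-admin"},
                "Action": "s3:DeleteObjectVersion",
                "Resource": "arn:aws:s3:::my-backup-bucket/*",
            }
        ],
    }
    s3.put_bucket_policy(Bucket="my-backup-bucket", Policy=json.dumps(policy))

The couple-of-months expiry can then be a NoncurrentVersionExpiration lifecycle rule, so no user needs delete rights at all.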
Yeah, that's true. I think I was aiming more at understanding the granularity of operations common to AWS, on the assumption that the logistics of moving data from S3 to Glacier would be the same. Since you can control expiry times on S3 files, there might be situations in which you want files on S3 for a short period, maybe 30 days, to keep a window for quick access in case you were archiving prematurely, while making sure they're also on Glacier... I frequently find myself hoarding files, keeping them on S3 (or worse, USB keys) for longer than I had intended...
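That 30-day quick-access window would just be a lifecycle Expiration rule on the S3 side, something like this (bucket name made up; this assumes the Glacier copies were uploaded separately, so the rule doesn't touch them):

    import boto3

    s3 = boto3.client("s3")

    # Delete the S3 copies 30 days after creation; the separately uploaded
    # Glacier copies are unaffected by this rule.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-backups",  # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-after-30-days",
                    "Prefix": "",
                    "Status": "Enabled",
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )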
I find it really hard to be excited about anything Boto-related when whoever's in charge of the main Boto project has allowed it to become an unmaintained disaster. Would much rather see them support their existing project. https://github.com/boto/boto/pulls
I am not intimately familiar with the project, but the boto3 rewrite has been in the works for a long time, all while boto2 was still getting new services and features added to it. Even the single library is a huge project in itself; it must have been a long grind doing both at the same time.
This is correct. Boto3 (and its child dependency botocore) have been in development for quite some time now. I'm not on the team that works on botocore, but I do interface with them (I work on the EB CLI, which uses botocore).
I can see how that is confusing. I was only confirming that boto3 has been in the making for a while. Boto3 is a replacement for Boto, so you will probably start to notice support for Boto slowing down. Boto3 will be maintained and supported.
boto currently has three APIs, depending on the module's version. Having both Table.query and Table.query_2 is confusing for newcomers, especially given the poor documentation available.
Rant over, this release looks cool, love the .all() chaining.
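For anyone who hasn't played with it yet, the collection chaining looks roughly like this:

    import boto3

    s3 = boto3.resource("s3")

    # Lazily iterate over every object in every bucket via chained collections.
    for bucket in s3.buckets.all():
        for obj in bucket.objects.all():
            print(bucket.name, obj.key, obj.size)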
Breaking backwards compatibility is dangerous in any major project. Hopefully the community will migrate reasonably fast and this won't turn out the same as Python 3.
This is even more ridiculous than the increasingly desperate FUD we hear about py3. If you want to keep using boto, you can. If you want to use boto3 on py2, you can. If you want to use boto3 on py3, you can. As noted in TFA, if you want to use the old boto on py3, you can, and some of us have since last year.
I've been using boto's S3 client in Python 3 for quite some time and have had no issues. Python 3 isn't Perl 6; it takes time to supplant something as ubiquitous as Python 2. Look how many banks are still running COBOL...
It includes more functionality as well. I've been using the Import/Export API [1] for example. From what I've seen it's a major improvement over boto2.
[1] http://aws.amazon.com/cli/