Wow, what a coincidence, I literally just installed this yesterday because I wanted to write my own command-line S3 uploader... yes, I know there's an official CLI [1], but I wanted some customized behavior and options (such as simultaneously backing up to S3 and Glacier) and thought it'd be a good chance to better understand AWS in general. I don't know what the previous boto was like, but boto3 has been pretty pain-free. Many of the error messages I've run into have been very actively discussed/patched/hacked around in the repo issues.
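For what it's worth, a minimal sketch of the dual-upload idea, assuming made-up bucket and vault names; put_object and upload_archive do the actual work:

    import boto3

    s3 = boto3.client("s3")
    glacier = boto3.client("glacier")

    def backup(path, key):
        """Push one file to both an S3 bucket and a Glacier vault."""
        with open(path, "rb") as f:
            data = f.read()
        # Regular S3 copy for quick access.
        s3.put_object(Bucket="my-backups", Key=key, Body=data)  # hypothetical bucket
        # Independent Glacier copy for cold storage.
        glacier.upload_archive(
            vaultName="my-backup-vault",  # hypothetical vault
            archiveDescription=key,
            body=data,
        )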
> such as simultaneously backing up to s3 and glacier
If you want that for every object in the bucket (or for specific files under a prefix), you can achieve it by adding a lifecycle rule:
1. Select the bucket
2. Add a new lifecycle rule
3. Action = Archive only
4. Archive to the Glacier Storage Class 0 days after the object's creation date (enter '0' for same-day archival)
5. Save the rule
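If you'd rather script it than click through the console, a rough boto3 equivalent might look like this (bucket name is made up):

    import boto3

    s3 = boto3.client("s3")

    # Transition every object to Glacier 0 days after creation (same-day archival).
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-backup-bucket",  # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "archive-to-glacier",
                    "Prefix": "",  # empty prefix matches the whole bucket
                    "Status": "Enabled",
                    "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
                }
            ]
        },
    )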
Sorry to go off topic, but why would you want to simultaneously back up to S3 and Glacier? Unless you're using S3 with Reduced Redundancy, you're not going to lose any files, so Glacier seems pointless. Also, working with S3 is way easier than working with Glacier, so if you're using S3 already I would just stick with that.
Redundancy is not the same as backup. You may not lose files to machine failure, but a code error, a data scientist deleting the wrong folder, etc., could easily wipe them out. Having a backup that isn't automatically tied to the S3 bucket can help prevent catastrophe in these cases.
That's solved by using Object Versioning in S3. That makes it easy to recover from accidental deletion. You can still permanently remove an object by deleting its versions, but that's a separate operation and not one you'd likely do by accident. Proper permissioning can stop it even being possible.
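Enabling it is a one-liner with the boto3 resource API (bucket name is hypothetical):

    import boto3

    s3 = boto3.resource("s3")

    # Turn on versioning: overwrites and deletes now keep the old copies around.
    s3.BucketVersioning("my-backup-bucket").enable()  # hypothetical bucket

With versioning on, a plain delete just adds a delete marker; recovering means removing the marker or restoring an older version, both of which are explicit operations.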
Beyond that, you're probably thinking about black-swan scenarios like AWS going out of business or a hacker breaching your system and deleting everything [1]. In that case you need to be completely separated from AWS (and careful about maintaining the separation), so using Glacier as well won't help anyway.
> Proper permissioning can stop it even being possible.
100% this. A little OT, but my standard Postgres backup procedure is streaming via WAL-E to S3. The user WAL-E runs as cannot delete versions. Any attempt to overwrite base backups or WAL segments just creates new versions of them. I have the versions expire after a couple of months (to reduce storage costs). I do the same for selected logs in /var/log/ via duply.
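One way to sketch the permissioning side (all names and ARNs here are made up) is a bucket policy that denies permanent version deletion to everyone except an admin principal:

    import json

    import boto3

    s3 = boto3.client("s3")

    # Deny s3:DeleteObjectVersion to every principal except a designated admin
    # role, so the backup user can write (and version) but never purge history.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "NoPermanentDeletes",
                "Effect": "Deny",
                "NotPrincipal": {"AWS": "arn:aws:iam::123456789012:role/backup-admin"},
                "Action": "s3:DeleteObjectVersion",
                "Resource": "arn:aws:s3:::my-backup-bucket/*",
            }
        ],
    }
    s3.put_bucket_policy(Bucket="my-backup-bucket", Policy=json.dumps(policy))

The couple-of-months expiry can then be a NoncurrentVersionExpiration lifecycle rule, so no user needs delete rights at all.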
Yeah, that's true. I think I was aiming more at understanding the granularity of operations common to AWS, on the assumption that the logistics of moving data from S3 to Glacier would be the same. Since you can control expiry times on S3 files, there might be situations in which you want files on S3 for a short period, maybe 30 days, to keep a window for quick access in case you were archiving prematurely, while making sure they're also on Glacier... I frequently find myself hoarding files, keeping them on S3 (or worse, USB keys) for longer than I had intended...
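That 30-day quick-access window would just be a lifecycle Expiration rule on the S3 side, something like this (bucket name made up; this assumes the Glacier copies were uploaded separately, so the rule doesn't touch them):

    import boto3

    s3 = boto3.client("s3")

    # Delete the S3 copies 30 days after creation; the separately uploaded
    # Glacier copies are unaffected by this rule.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-backups",  # hypothetical bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-after-30-days",
                    "Prefix": "",
                    "Status": "Enabled",
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )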
I find it really hard to be excited about anything Boto-related when whoever's in charge of the main Boto project has allowed it to become an unmaintained disaster. Would much rather see them support their existing project. https://github.com/boto/boto/pulls
I am not intimately familiar with the project, but the boto3 rewrite has been in the works for a long time, all while boto2 was still getting new services and features added to it. Even the single library is a huge project in itself; it must have been a long grind doing both at the same time.
This is correct. Boto3 (and its child dependency botocore) have been in development for quite some time now. I'm not on the team that works on botocore, but I do interface with them (I work on the EB CLI, which uses botocore).
I can see how that is confusing. I was only confirming that boto3 has been in the making for a while. Boto3 is a replacement for Boto, so you will probably start to notice support for Boto slowing down. Boto3 will be maintained and supported.
boto currently has three APIs, depending on the module's version. Having both Table.query and Table.query_2 is confusing for newcomers, especially given the poor documentation available.
Rant over, this release looks cool, love the .all() chaining.
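For anyone who hasn't played with it yet, the collection chaining looks roughly like this:

    import boto3

    s3 = boto3.resource("s3")

    # Lazily iterate over every object in every bucket via chained collections.
    for bucket in s3.buckets.all():
        for obj in bucket.objects.all():
            print(bucket.name, obj.key, obj.size)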
Breaking backwards compatibility is dangerous in any major project. Hopefully the community will migrate reasonably fast and this won't turn out the same as Python 3.
This is even more ridiculous than the increasingly desperate FUD we hear about py3. If you want to keep using boto, you can. If you want to use boto3 on py2, you can. If you want to use boto3 on py3, you can. As noted in TFA, if you want to use the old boto on py3, you can, and some of us have since last year.
I've been using boto's S3 client in Python 3 for quite some time and have had no issues. Python 3 isn't Perl 6; it takes time to supplant something as ubiquitous as Python 2. Look how many banks are still running COBOL...
It includes more functionality as well. I've been using the Import/Export API [1] for example. From what I've seen it's a major improvement over boto2.
[1] http://aws.amazon.com/cli/