Jul 22

This is a brief guide to getting up and running against S3 in Python.

You will need the following things:

  1. AWS Developer account (you can sign up for it on http://aws.amazon.com)
  2. Make sure you sign up for S3 (select Amazon Simple Storage Service in the sidebar on the left of the AWS page and then click ‘Sign Up For This Web Service’ over on the right)
  3. Python (I’m assuming you have it set up and working already)

A great library to use for access to AWS is called boto and its homepage is here: http://code.google.com/p/boto. You can download the code directly and then run its setup.py to install it, or if you already have easy_install you can make it do all of the work:

C:\boto-1.3a>python setup.py install

or

C:\>python easy_install.py boto

Either way, you can verify that boto is properly installed by trying to import it in python:

C:\>python
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto
>>>
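
If you would rather run the same check from a script than from the interactive prompt, something like the following should work. It is only a sketch and assumes Python 3 (the session above is Python 2.5); is_installed is a hypothetical helper name, not part of boto.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Reports whether boto is visible to this Python interpreter.
    print("boto installed:", is_installed("boto"))
```

Looking the module up instead of importing it means the script never raises an ImportError and only reports whether this interpreter can see boto.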

Once boto is set up correctly, get your AWS identifiers from the “Your Web Services Account” dropdown in the upper right of the AWS pages. You’ll need both the Access key and the Secret key to make the connection.
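
Since the Secret key is sensitive, you may prefer not to paste it into your scripts. Here is a minimal sketch that reads both keys from environment variables instead. The helper name load_aws_credentials is made up for this example, and the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the conventional ones; I believe boto looks for them too, but treat that as an assumption and check it against your boto version.

```python
import os


def load_aws_credentials(env=None):
    """Return (access_key, secret_key) read from environment variables."""
    # Allow a plain dict to be passed in, which makes the helper easy to test.
    env = os.environ if env is None else env
    try:
        return env["AWS_ACCESS_KEY_ID"], env["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: %s" % missing.args[0])
```

The returned pair can be passed straight to the connection, for example S3Connection(*load_aws_credentials()).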

At the highest level S3 stores data in ‘buckets’. Bucket names are unique across the entire service, and each account is limited to a maximum of 100 buckets. Within a bucket you create key/data pairs, where the keys look like filenames.
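
Because bucket names are shared by every S3 user, a short, obvious name is likely to be taken already. The sketch below shows one way to build a name that is unlikely to collide, by lowercasing a prefix and appending a random suffix. Both function names are hypothetical, and the pattern is a deliberately conservative subset of S3's naming rules (3 to 63 characters, lowercase letters, digits and hyphens), not the full specification.

```python
import re
import uuid

# Conservative check: 3-63 characters, lowercase letters, digits and hyphens,
# starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def is_safe_bucket_name(name):
    """Return True if the name passes the conservative check above."""
    return _BUCKET_NAME.match(name) is not None


def unique_bucket_name(prefix):
    """Lowercase the prefix and append a random suffix to avoid collisions."""
    suffix = uuid.uuid4().hex[:8]
    name = "%s-%s" % (prefix.lower(), suffix)
    if not is_safe_bucket_name(name):
        raise ValueError("not a safe bucket name: %r" % name)
    return name
```

Sticking to lowercase also avoids the confusing 403 error that one commenter below ran into with an uppercase bucket name.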

The code to create a bucket and store a file in it is super simple. Since bucket names are shared by everyone, you’ll probably need to replace ‘examplebucket’ with something else:

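from boto.s3.connection import S3Connection
from boto.s3.key import Key

# Replace the placeholders with your own Access key and Secret key.
conn = S3Connection('<your access key>', '<your secret key>')
# Bucket names are shared by all S3 users, so pick a unique one.
bucket = conn.create_bucket('examplebucket')
# A Key is a single object in the bucket; k.key is its name.
k = Key(bucket)
k.key = 'foo'
# Upload the local file foo.png as the contents of the key 'foo'.
k.set_contents_from_filename('foo.png')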
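
To convince yourself the upload worked, you can read the object back. The following is a rough sketch of a round trip; upload_and_fetch is a hypothetical helper that expects a boto bucket object such as the one returned by create_bucket, and it relies on the boto methods new_key, get_key and get_contents_to_filename.

```python
def upload_and_fetch(bucket, key_name, source_path, dest_path):
    """Upload source_path under key_name, then download it to dest_path."""
    # new_key() creates a Key in this bucket with the given name.
    key = bucket.new_key(key_name)
    key.set_contents_from_filename(source_path)

    # get_key() returns None if no such key exists in the bucket.
    fetched = bucket.get_key(key_name)
    if fetched is None:
        raise LookupError("key not found after upload: %r" % key_name)
    fetched.get_contents_to_filename(dest_path)
    return fetched
```

For example, upload_and_fetch(bucket, 'foo', 'foo.png', 'foo-copy.png') should leave you with a local copy that matches the original file.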

Doing this uploads the file foo.png to the bucket ‘examplebucket’ under the key ‘foo’, so the object’s URL is http://s3.amazonaws.com/examplebucket/foo. If you open that URL now you’ll get an access denied error back, because new objects in a bucket aren’t publicly readable by default.

boto has a shortcut for making the object readable. The boto.s3.acl module defines a list of ‘ready-to-use’ modes:

CannedACLStrings = ['private', 'public-read', 'public-read-write', 'authenticated-read']

Now we just set the ‘public-read’ ACL on our key and it will become accessible through the URL:

k.set_acl('public-read')
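
If your key names contain spaces or other special characters, the URL needs escaping. Here is a small sketch that builds the same style of URL as the one above; public_object_url is a hypothetical helper, it assumes Python 3 for urllib.parse, and the default host is simply the one used in this post.

```python
from urllib.parse import quote


def public_object_url(bucket_name, key_name, host="s3.amazonaws.com"):
    """Build the path-style URL for an object, escaping the key name."""
    # Keep "/" unescaped so keys that look like paths stay readable.
    return "http://%s/%s/%s" % (host, bucket_name, quote(key_name, safe="/"))


print(public_object_url("examplebucket", "foo"))
# prints http://s3.amazonaws.com/examplebucket/foo
```

The URL only works in a browser once the object has the ‘public-read’ ACL; before that it returns the access denied error.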

From here, check out the official documentation and tools such as S3Fox that will make your S3 experience much easier.

You should also grab the boto documentation archive and save it locally to use as a reference.


3 comments so far...

  • Sam Said on February 18th, 2009 at 6:02 am:

    Hi,

    This is great to see. I was wondering if boto works on Windows. I hope that is Windows you are running. Please confirm. I’m writing a cross-platform app and I would be delighted to know. Please let me know if there are any dependencies that I need to take care of.

    Thanks a ton

    Sam

  • admin Said on February 20th, 2009 at 12:03 am:

    As you can see from the command prompts, the example I had in my blog was done on Windows. It shouldn’t really matter since the code doesn’t have any platform specific dependencies.

  • Jim Said on May 14th, 2009 at 8:32 am:

    Thanks for publishing this getting started guide; it saved me some time trying out S3.

    The first issue I had was that I created a bucket name with uppercase letters. That blows up with a weird 403 Forbidden error, leading you to think it’s a problem with credentials. But change everything to lowercase and it works.

    The problem I have now is that your small sample program takes 20 seconds to run! I changed it to get_bucket instead of create_bucket (after creating the bucket), and it still took 20 seconds. I looked around the source code and saw that get_bucket has an extra argument, validate=False, and when I used that, the get_bucket completed quickly, but now set_contents_from_filename takes a long time (20 seconds). I have tried other things like s3put in the boto distribution, and it also takes 20 seconds.

    Any ideas? It sounds like some kind of DNS problem on my end, but I can ping s3.amazonaws.com instantly.

    Thanks!
    Jim
