Simon Palmer’s blog

March 21, 2010

*Really* simple S3 persistence from Grails

Filed under: code, Grails — simonpalmer @ 10:16 pm

I recently posted about really simple file upload in my app, and once I had that working I moved on to my next problem, which came in two parts: 1) making the persistence location flexible – and scalable – and 2) providing a public URL to the images.

Taking the second problem first, I thought I was going to need to find some clever way of mapping a URI to a folder on the server, and I investigated about 25 ways of doing that in Tomcat, Jetty, Apache, and anywhere else I thought might work. There are solutions, but they normally mean an additional web server and subdomain, or some such configuration, to serve static content. There is also a Grails plugin for serving static images. All of this felt like a lot of infrastructure and complexity, not to mention server load, which runs against my basic “do it elsewhere, not on the server” philosophy.

For the first issue (scalable persistence) I looked at where, in the standard AWS AMI I was using, I could place content. The choices are limited: I wanted a sub-folder for each of my users, and granting processes on the server instance the rights to create folders was also looking like a lot of non-standard config.

In steps Amazon S3. The solution is really very simple: sign up for an AWS S3 account, create a bucket with a unique name for the app, add buckets for each user (they already have a UID to identify them), and back the asynchronous upload with a simple write to S3 from my controller. If you make the assets publicly readable, then you can expose them through the regular s3.amazonaws.com URL and you don’t need any other server infrastructure for your static file serving (*jumps up and down with glee*). Of course, there’s a Grails plugin for S3, so that’s where I started, and before I go much further I should say that I think it does the job very thoroughly and you should try it before you take my hack – at the very least download the code and study it closely; it is well written.

But… I didn’t want to involve my database or my file system in the upload process, and I don’t anticipate ever needing a background process for doing the upload to the buckets, so it was breaking my self-imposed KISS rules.

So, “how hard could it be”? Here’s how.

First, grab the JetS3t library, which is brilliant. I found the install a bit weird, but there is a jar file in the jets3t-XXX/applets folder – at least there was in the download I grabbed – and you basically need to put that in the lib folder under your Grails app. If you are using the STS Eclipse plug-in you’ll also need to add the jar as an external library in your project properties to have it compile, run, work, etc.
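
If you’d rather not manage the jar by hand, and you’re on a Grails version with dependency management, you can pull JetS3t from Maven Central instead. A minimal sketch for grails-app/conf/BuildConfig.groovy – the version number is just an example, use whichever release you actually tested against:

grails.project.dependency.resolution = {
	repositories {
		mavenCentral()
	}
	dependencies {
		// JetS3t as published on Maven Central; pick your own version
		runtime 'net.java.dev.jets3t:jets3t:0.7.4'
	}
}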

It’s a good idea to write a Grails service for this, so start by creating one:

grails create-service AmazonS3

And here’s the code for the service… this is admittedly very simplistic and it needs some error handling (there’s a sketch of that after the listing), but you’ll figure the rest out…

import org.jets3t.service.S3Service
import org.jets3t.service.security.AWSCredentials
import org.jets3t.service.impl.rest.httpclient.RestS3Service
import org.jets3t.service.model.S3Bucket
import org.jets3t.service.model.S3Object
import org.jets3t.service.acl.AccessControlList

class AmazonS3Service {

	boolean transactional = false

	// Don't commit real keys -- load these from config in anything beyond a prototype
	static String accessKey = "PUTYOUROWNONEINHERE"
	static String secretKey = "PUTYOUROWNONEINHERE"
	static RestS3Service s3 = new RestS3Service(new AWSCredentials(accessKey, secretKey))

	// Note: only the first path segment here is a real S3 bucket name; the
	// remainder of the path effectively becomes a key prefix on everything you store
	String rootBucketPath = "yourbucket/subbucket/whatever/"
	String defaultBucketLocation = S3Bucket.LOCATION_EUROPE

	// File extensions we accept, mapped to the Content-Type they are served with
	Map mimeExtensionMap = [
			"png" : "image/png",
			"jpg" : "image/jpeg",
			"gif" : "image/gif",
			"tiff": "image/tiff",
			"pdf" : "application/pdf",
			"mpeg": "video/mpeg",
			"mp4" : "video/mp4",
			"mov" : "video/quicktime",
			"wmv" : "video/x-ms-wmv",
			"html": "text/html",
			"xml" : "text/xml",
			"mp3" : "audio/mpeg",
			"flv" : "application/octet-stream"
	]

	S3Bucket makeBucket(uid)
	{
		// Get (or create on first use) the per-user location under the root path
		S3Bucket bucket = s3.getOrCreateBucket((rootBucketPath + uid), defaultBucketLocation)
		bucket.setAcl AccessControlList.REST_CANNED_PUBLIC_READ
		return bucket
	}

	void put(inputstream, name, uid, ext, length)
	{
		// Silently ignore extensions we don't have a mime type for
		if (mimeExtensionMap.containsKey(ext.toLowerCase()))
		{
			String mime = mimeExtensionMap[ext.toLowerCase()]
			S3Bucket bucket = makeBucket(uid)
			// Stream straight from the upload; no temp file on the local disk
			S3Object up = new S3Object()
			up.setAcl AccessControlList.REST_CANNED_PUBLIC_READ
			up.setContentLength length
			up.setContentType mime
			up.setDataInputStream inputstream
			up.setKey name
			up.setBucketName bucket.getName()
			s3.putObject bucket, up
		}
	}

	void putXML(text, name, uid)
	{
		String mime = mimeExtensionMap["xml"]
		S3Bucket bucket = makeBucket(uid)
		S3Object up = new S3Object(bucket, name, text)
		up.setAcl AccessControlList.REST_CANNED_PUBLIC_READ
		// Content length is in bytes, not characters; matters for non-ASCII XML
		up.setContentLength text.getBytes("UTF-8").length
		up.setContentType mime
		s3.putObject bucket, up
	}
}
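
One obvious piece of that error handling: almost every JetS3t call throws S3ServiceException (bad credentials, missing bucket, network trouble). A minimal sketch of a wrapper you could add to the service – safePut is just a name I’ve made up:

import org.jets3t.service.S3ServiceException

// Wraps put() so the controller gets a simple success/failure answer
// instead of an exception bubbling up through the upload request
boolean safePut(inputstream, name, uid, ext, length)
{
	try {
		put(inputstream, name, uid, ext, length)
		return true
	}
	catch (S3ServiceException e) {
		log.error "S3 upload failed for ${uid}/${name}: ${e.message}"
		return false
	}
}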

This gives you a single API entry point (well, two: put and putXML) and leaves you with publicly readable assets on your S3 bucket under this sort of path…

yourbucket/subbucket/whatever/an-identifier-of-your-choice/image1.jpg
yourbucket/subbucket/whatever/an-identifier-of-your-choice/image2.jpg
yourbucket/subbucket/whatever/an-identifier-of-your-choice/image3.jpg
yourbucket/subbucket/whatever/82757c2a-cfa9-49bf-89fa-9efdaf9bf418/image1.jpg
yourbucket/subbucket/whatever/82757c2a-cfa9-49bf-89fa-9efdaf9bf418/image2.jpg
yourbucket/subbucket/whatever/82757c2a-cfa9-49bf-89fa-9efdaf9bf418/image3.jpg

These can then be fetched via URLs which reference these buckets through the normal s3.amazonaws.com address, e.g.

http://s3.amazonaws.com/yourbucket/subbucket/whatever/82757c2a-cfa9-49bf-89fa-9efdaf9bf418/image1.jpg
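
If you want that address built in one place, a one-line helper on the service will do. A sketch – publicUrl is just an illustrative name, and it assumes the default path-style addressing:

// Build the public URL for an asset stored by put():
// http://s3.amazonaws.com/<root bucket path><uid>/<file name>
String publicUrl(uid, name)
{
	return "http://s3.amazonaws.com/" + rootBucketPath + uid + "/" + name
}
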
To call this from the controller that manages the asynchronous file upload, you just need to know one thing: how to get from a Spring MultipartFile to something which the JetS3t API can consume.

import org.springframework.web.multipart.MultipartFile

def amazonS3Service

def uploadfile = {

	if (request.method == 'POST')
	{
		// on a multipart POST, Grails gives us a Spring MultipartHttpServletRequest
		Iterator itr = request.getFileNames()

		String vloc = ""
		while (itr.hasNext())
		{
			MultipartFile mpFile = request.getFile(itr.next())
			if (!mpFile.isEmpty())
			{
				// replace spaces so the original filename is safe to use as an S3 key
				String _file = mpFile.getOriginalFilename().replace(" ", "_")
				// h.uid is the current user's unique id (see the note below)
				vloc += h.uid + "/" + _file
				String ext = _file.substring(_file.lastIndexOf(".") + 1)
				// stream the upload straight through to S3, no temp file on disk
				amazonS3Service.put(mpFile.getInputStream(), _file, h.uid, ext, mpFile.getSize())
			}
		}

		// hand the virtual location back to the page
		render vloc
	}
}

Basically, mpFile.getInputStream() gives you something which can be sent directly to S3 without being persisted to the local file system – which is what pretty much every other example I came across does. A couple of things to point out at this point. First, h.uid is not visible in this code; it is just a unique value I ascribe to each user in my system, and you should choose your own. It forms the rightmost bucket name in my S3 key structure. Second, I use the file name as the key, and I replace spaces with underscores to avoid any nasty problems with filenames.
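
If your users’ filenames are stranger than mine, you may want to go further than swapping spaces for underscores. A sketch of a slightly more defensive key builder – safeKey and its character whitelist are my own invention, not part of the upload code above:

// Whitelist the characters we keep, and cope with names that have no extension
// (lastIndexOf(".") returning -1 would otherwise produce a bogus extension)
String safeKey(String originalFilename)
{
	String name = originalFilename.replaceAll(/[^A-Za-z0-9._-]/, "_")
	if (!name.contains(".")) {
		throw new IllegalArgumentException("no file extension: " + originalFilename)
	}
	return name
}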

Et voilà: you have static image persistence on an almost infinitely scalable storage infrastructure (at reasonable cost), served from the cloud. Normal caveats apply, and this is far from perfect code, but it works and I hope it helps you get started.

10 Comments »

  1. Wow, that’s great, have to try this out soon!

    Comment by Daxon — March 21, 2010 @ 10:54 pm

  2. I am assuming that your server is located in Amazon EC2. Otherwise, the upload would take twice as long (once to upload to your server, then to upload to S3 server).

    Have you done any timing on how long it takes to upload the file from EC2 to S3, i.e., the time it takes for the call amazonS3Service.put()? I think if the EC2 server and S3 bucket are in the same availability zone, then this overhead will not be a concern.

    Comment by Roshan Shrestha — March 22, 2010 @ 1:35 pm

  3. Roshan, yes, it’s all running on Amazon EC2 and my EC2 and S3 buckets are in the same zone. I have done some unscientific timings based on what I think are typical usage patterns for my users (~1Mb image files) and it takes about 15 seconds from clicking the upload button in my app to the image being available via S3. I think that’s tolerable for my use case. I admit I could do some better measurement, but I want to get a load of users complaining about that before I optimise it.

    Comment by simonpalmer — March 22, 2010 @ 2:21 pm

  4. Another question: are the images uploaded by the user private or public? If they are meant to be private, then the URL would still be publicly available to anyone who has it.

    I can think of two approaches to this problem:

    1. Make the bucket private and have your server pull the content from S3 and push it to the user. This would not be a problem if the server is in EC2 and in the same availability zone as S3.

    2. Have the browser contact your server, generate “time-bound” URLs (e.g., valid for 10-20 seconds), and redirect the browser to this URL. The S3 API, unfortunately, does not support “single-use” URLs.

    Comment by Roshan Shrestha — March 22, 2010 @ 2:42 pm

  5. Sorry, I did not look at your code. I do see your bucket is public (AccessControlList.REST_CANNED_PUBLIC_READ).

    Comment by Roshan Shrestha — March 22, 2010 @ 2:44 pm

  6. […] 05.50 *Really* simple S3 persistence from Grails Contributing to Open Source projects Grails First Impressions – GORM vs. […]

    Pingback by Link droppings [week 4 of 03.2010] | Tru North Ware — March 23, 2010 @ 11:50 am

  7. […] server side code processes the multipart file upload and persists the file into Amazon S3 (see my other post about that). I then render the file as a virtual location relative to the current page back into my page so I […]

    Pingback by *Really* simple asynchronous file upload in Grails | Simon Palmer’s blog — April 13, 2010 @ 8:47 am

  8. Great article, better than using the plugin.
    Thanks

    Comment by Sapan Parikh — June 5, 2011 @ 6:18 am

  9. Awesome. It helped me a lot. If you just install the Grails amazon-s3 plugin and then use the code given by you, it works like a charm. I was adding the JetS3t libraries explicitly in my lib folder and I was getting exceptions and compilation errors because of some missing dependencies, so installing the amazon-s3 plugin also cleans up the lib folder. 🙂

    Thanks much

    Comment by Sikander — December 6, 2012 @ 6:19 pm

  10. Incredibly helpful, thanks (2 years later).
    Curious if anyone can help configure logging. The JetS3t API seems to spew tons of logs in my Grails instance and I can’t figure out how to get it to respond to log4j settings.

    Comment by viv — December 25, 2012 @ 10:22 pm

