Originally published at https://pjambet.github.io/
EDIT
Everything detailed in this article has been wrapped up in this gem, you should give it a look!
That said, I still advise you to read this article, as it will probably help you understand how everything works!
Preface
At the beginning of September, Amazon added CORS support to S3. As this is quite recent, there isn't much documentation yet, nor many tutorials, about how to get everything up and running for your app.
Furthermore, this jQuery plugin is awesome, mainly for its progress bar handling, but sadly the example in the wiki is obsolete.
If you're working with Heroku, you might have already faced the 30s limit on each request. There are some alternatives, such as carrierwave_direct, an extension of the great CarrierWave gem. I gave it a quick look, but I found it quite crappy, as it forces you to change your CarrierWave settings (removing the store_dir method, really?) and it only works for a single file. So I thought it would be better to handle the upload of big files manually, and rely on vanilla CarrierWave for my other, smaller uploads.
I found other interesting examples, but they all lacked important things and none of them worked out of the box, hence this short guide. This tutorial is inspired by that post and that one.
Set up your bucket
First, you'll need to configure your bucket to enable CORS under certain conditions.
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
Of course, those settings are only for development purposes; in production you'll probably want to restrict the AllowedOrigin rule to your own domain, as shown below. The documentation about those settings is quite good.
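For example, a production configuration could look something like this (https://www.example.com is a placeholder, replace it with your own domain):

<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>https://www.example.com</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>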
Set up your server
In order to send your files to S3, you have to include a set of options in the form, as described in the official doc here.
One solution would be to write the content of all those variables directly in the form so that it's ready to be submitted, but I believe most of those values should not be written in the DOM. So we'll create a new route that we'll use to fetch that data.
This example is written with Rails, but writing the equivalent for another framework should be really simple.
MyApp::Application.routes.draw do
  resources :signed_urls, only: :index
end
Now that we have our new route, let's create the controller which will send the data back to the S3 form.
class SignedUrlsController < ApplicationController
  def index
    render json: {
      policy: s3_upload_policy_document,
      signature: s3_upload_signature,
      key: "uploads/#{SecureRandom.uuid}/#{params[:doc][:title]}",
      success_action_redirect: "/"
    }
  end

  private

  # Generate the Base64-encoded policy document that Amazon is expecting.
  def s3_upload_policy_document
    Base64.encode64(
      {
        expiration: 30.minutes.from_now.utc.strftime('%Y-%m-%dT%H:%M:%S.000Z'),
        conditions: [
          { bucket: ENV['S3_BUCKET'] },
          { acl: 'public-read' },
          ["starts-with", "$key", "uploads/"],
          { success_action_status: '201' }
        ]
      }.to_json
    ).gsub(/\n|\r/, '')
  end

  # Sign our request with an HMAC-SHA1 digest of the policy document,
  # using our secret key.
  def s3_upload_signature
    Base64.encode64(
      OpenSSL::HMAC.digest(
        OpenSSL::Digest.new('sha1'),
        ENV['AWS_SECRET_KEY_ID'],
        s3_upload_policy_document
      )
    ).gsub(/\n/, '')
  end
end
The policy and signature methods are stolen from the linked blog posts above, with one exception: I had to include the "starts-with" constraint, otherwise S3 was yelling 403 at me. Everything else is quite straightforward; there's just a small detail to consider if you set the acl to 'private', but more on that later.
One last detail: the key value is actually the path of your file in your bucket, so set it to whatever you want, but be sure it matches the constraint you set in the policy. Here we're using params[:doc][:title] to read the name of the file we're about to upload. We'll see more about that when setting up the JavaScript.
That’s basically everything we have to do on the server side
Add the jQueryFileUpload files
Next you’ll have to add the jQueryFileUpload files. The plugins ships with a lof of files, but I found most of them useless, so here is the list
vendor/jquery.ui.widget
jquery.fileupload
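If you're using the Rails asset pipeline, the require directives could look like this (a sketch, assuming you've copied both files under app/assets/javascripts, keeping their names):

//= require vendor/jquery.ui.widget
//= require jquery.fileupload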
Set up the JavaScript client side
Now let's set up jQueryFileUpload to send the correct data to S3.
Based on what we did on the server, the workflow is composed of two requests: the client first fetches the needed data from our server, then sends everything to S3.
Here is the form I'm using; the order of the parameters matters, and the file field must come last.
%form(action="https://#{ENV['S3_BUCKET']}.s3.amazonaws.com" method="post" enctype="multipart/form-data" class='direct-upload')
  %input{type: :hidden, name: :key}
  %input{type: :hidden, name: "AWSAccessKeyId", value: ENV['AWS_ACCESS_KEY_ID']}
  %input{type: :hidden, name: :acl, value: 'public-read'}
  %input{type: :hidden, name: :policy}
  %input{type: :hidden, name: :signature}
  %input{type: :hidden, name: :success_action_status, value: "201"}
  %input{type: :file, name: :file}
  -# You can recognize some Bootstrap markup here :)
  .progress.progress-striped.active
    .bar
$(function() {
  $('.direct-upload').each(function() {
    var form = $(this)

    $(this).fileupload({
      url: form.attr('action'),
      type: 'POST',
      autoUpload: true,
      dataType: 'xml', // This is really important, as S3 gives us back the URL of the file in an XML document
      add: function (event, data) {
        $.ajax({
          url: "/signed_urls",
          type: 'GET',
          dataType: 'json',
          data: {doc: {title: data.files[0].name}}, // send the file name to the server so it can generate the key param
          async: false, // the form must be filled in before we submit it below, hence the synchronous request
          success: function(data) {
            // Now that we have our data, we update the form so it contains all
            // the needed data to sign the request
            form.find('input[name=key]').val(data.key)
            form.find('input[name=policy]').val(data.policy)
            form.find('input[name=signature]').val(data.signature)
          }
        })
        data.submit();
      },
      send: function(e, data) {
        $('.progress').fadeIn();
      },
      progress: function(e, data) {
        // This is what makes everything really cool: thanks to this callback
        // you can update the progress bar based on the upload progress
        var percent = Math.round((e.loaded / e.total) * 100)
        $('.bar').css('width', percent + '%')
      },
      fail: function(e, data) {
        console.log('fail')
      },
      success: function(data) {
        // Here we get the URL of the file on S3, in an XML document
        var url = $(data).find('Location').text()
        $('#real_file_url').val(url) // Update the real input in the other form
      },
      done: function (event, data) {
        $('.progress').fadeOut(300, function() {
          $('.bar').css('width', 0)
        })
      }
    })
  })
})
So, a quick explanation of what's going on here:
The add callback allows us to fetch the missing data before the upload starts; once we have it, we simply insert it into the form.
The send and done callbacks are only used for UX purposes: they show and hide the progress bar when needed.
The real magic is the progress callback, as it gives you the current progress of the upload in the event argument.
In my example, this form sits next to a 'real' Rails form, which is used to save an object that has, among its attributes, a file_url linked to the "big file" we just uploaded (a sketch of that field is shown below). Once the upload is done, I fill in the 'real' field so my object is created with the right URL without any extra handling. After submitting the real form, my object is saved with the URL of the file uploaded to S3.
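For the record, that hidden field could look something like this (a hypothetical sketch: @doc and file_url are just example names; the important part is the real_file_url id matching the selector used in the success callback above):

= form_for @doc do |f|
  = f.hidden_field :file_url, id: 'real_file_url'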
If you’re uploading public files, you’re good to go, everything’s perfect. But if you’re uploading private file (this is set with the acl params), you still have a last thing to handle.
Indeed the url itself is not enough, if you try accessing it, you’ll face some ugly xml like that. The solution I used was to use the aws gem which provides a great method : AWS::S3Object#url_for. With that method, you can get an authorized url for the desired duration with your bucket name and the key (the path of your file in the bucket) of your file
So my custom url accessor looked something like this :
def url
  parent_url = super
  # If the url is nil, there's no need to look in the bucket for it
  return nil if parent_url.nil?

  # The key is everything after the host, e.g. 'uploads/<uuid>/<filename>',
  # but it's URL encoded in the stored URL, so we need to decode it
  object_key = URI.parse(parent_url).path[1..-1]
  AWS::S3::S3Object.url_for(
    CGI::unescape(object_key),
    ENV['S3_BUCKET'],
    use_ssl: true
  )
end
This involves some slightly awkward handling with CGI::unescape, and there's probably a better way to achieve this, but it's one way to do it, and it works fine.
Live example
I’ll set up a live example running on heroku, on which you’ll be able to upload files in more than 30s coming soon
Finally !
The demo if finally here : http://direct-upload.herokuapp.com and code source can be found here : https://github.com/pjambet/direct-upload
EDIT
I changed every access to the AWS values (bucket, secret key and access key) to use environment variables. This way, you don't have to put those values directly in your files; you just have to set the variables correctly:
export S3_BUCKET=<YOUR BUCKET>
export AWS_ACCESS_KEY_ID=<YOUR KEY>
export AWS_SECRET_KEY_ID=<YOUR SECRET KEY>
When deploying on Heroku, you just have to set the variables with:
heroku config:add AWS_ACCESS_KEY_ID=<YOUR KEY> --app <YOUR APP>
...