Quite often I want to share simple (static) web pages with colleagues or clients. For example, I may have written a report using R Markdown and rendered it to HTML. AWS S3 can easily host such a simple web page (e.g. see here), but it cannot offer any authentication to prevent anyone from accessing potentially sensitive information.
Yegor Bugayenko has created an external service, S3Auth.com, that stands in the way of any S3-hosted web site, but this is a little too much for my needs. All I want to achieve is to limit access to specific S3 resources that will be largely transient in nature. A viable and simple solution is to use ‘query string request authentication’, which is described in detail here. I must confess that I did not really understand what was going on there until I had dug around on the web to see what others had been up to.
This blog post describes a simple R function for generating authenticated and ephemeral URLs to private S3 resources (including web pages) that only the holders of the URL can access.
Creating User Credentials for Read-Only Access to S3
Before we can authenticate anyone, we need someone to authenticate. From the AWS Management Console, create a new user, download their security credentials, and then attach the AmazonS3ReadOnlyAccess policy to them. For more details on how to do this, refer to a previous post. Note that you should not create a password for this user to access the AWS console.
Loading a Static Web Page to AWS S3
Do not be tempted to follow the S3 ‘Getting Started’ page on how to host a static web page and, in doing so, enable ‘Static Website Hosting’. We need our resources to remain private and we would also like to use HTTPS, which this option does not support. Instead, create a new bucket and upload a simple HTML file as usual. An example HTML file, e.g. index.html, could be:
<!DOCTYPE html>
<html>
  <body>
    <p>Hello, World!</p>
  </body>
</html>
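If you would rather do the upload from R than from the console, the following is a rough sketch of one way it might be done. It assumes the aws.s3 package is installed, that the bucket (here the hypothetical my.s3.bucket) already exists, and that the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables hold credentials for a user with write access (not the read-only user created above). None of this comes from the original workflow, which simply uses the console.

```r
# Write the example page to a temporary directory.
html <- c(
  "<!DOCTYPE html>",
  "<html>",
  "  <body>",
  "    <p>Hello, World!</p>",
  "  </body>",
  "</html>"
)
local_file <- file.path(tempdir(), "index.html")
writeLines(html, local_file)

bucket <- "my.s3.bucket"  # hypothetical bucket name, assumed to exist already
region <- "eu-west-1"     # assumed region

# Only attempt the upload when aws.s3 is installed and credentials are set.
can_upload <- requireNamespace("aws.s3", quietly = TRUE) &&
  nzchar(Sys.getenv("AWS_ACCESS_KEY_ID")) &&
  nzchar(Sys.getenv("AWS_SECRET_ACCESS_KEY"))

if (can_upload) {
  # aws.s3 reads the region from this environment variable.
  Sys.setenv("AWS_DEFAULT_REGION" = region)
  # No ACL is requested, so the object stays private (the S3 default).
  aws.s3::put_object(file = local_file, object = "index.html", bucket = bucket)
}
```

Because no public ACL is requested, the uploaded object should remain private, which is what we want here.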
An R Function for Generating Authenticated URLs
We can now use our new user’s Access Key ID and Secret Access Key to create a URL with a limited lifetime that enables access to index.html. Technically, we are making an HTTP GET request to the S3 REST API, with the authentication details sent as part of a query string. Creating this URL is a bit tricky. I have adapted the Python example (number 3) that is provided here as an R function, aws_query_string_auth_url(...), which can be found in the Gist below. Here’s an example showing this R function in action:
path_to_file <- "index.html"
bucket <- "my.s3.bucket"
region <- "eu-west-1"
aws_access_key_id <- "DWAAAAJL4KIEWJCV3R36"
aws_secret_access_key <- "jH1pEfnQtKj6VZJOFDy+t253OZJWZLEo9gaEoFAY"
lifetime_minutes <- 1

aws_query_string_auth_url(path_to_file, bucket, region, aws_access_key_id, aws_secret_access_key, lifetime_minutes)
# "https://s3-eu-west-1.amazonaws.com/my.s3.bucket/index.html?AWSAccessKeyId=DWAAAKIAJL4EWJCV3R36&Expires=1471994487&Signature=inZlnNHHswKmcPfTBiKhziRSwT4%3D"
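To make the moving parts of such a URL a little more concrete, here is a minimal sketch of how one might be assembled by hand. It is not the Gist's code: it assumes the older AWS ‘Signature Version 2’ query string scheme (an HMAC-SHA1 of a short ‘string to sign’, Base64-encoded and then URL-encoded), and it assumes the digest and base64enc packages are available. The function name sketch_presigned_url and its now argument are hypothetical, introduced only for illustration.

```r
library(digest)     # hmac()
library(base64enc)  # base64encode()

# Hypothetical helper: builds a Signature Version 2 style query string URL.
# 'now' is exposed as an argument only so the result can be reproduced.
sketch_presigned_url <- function(path_to_file, bucket, region,
                                 aws_access_key_id, aws_secret_access_key,
                                 lifetime_minutes, now = Sys.time()) {
  # Expiry time as whole seconds since the Unix epoch.
  expires <- sprintf("%.0f", floor(as.numeric(now)) + lifetime_minutes * 60)

  # Resource being requested. Assumes the key needs no extra URL-encoding
  # (e.g. it contains no spaces).
  resource <- paste0("/", bucket, "/", path_to_file)

  # String to sign: verb, empty Content-MD5, empty Content-Type,
  # expiry, then the resource, separated by newlines.
  string_to_sign <- paste0("GET\n\n\n", expires, "\n", resource)

  # HMAC-SHA1 keyed with the secret access key, returned as raw bytes.
  signature_raw <- digest::hmac(key = aws_secret_access_key,
                                object = string_to_sign,
                                algo = "sha1",
                                serialize = FALSE,
                                raw = TRUE)

  # Base64-encode, then percent-encode characters such as '+', '/' and '='.
  signature <- utils::URLencode(base64enc::base64encode(signature_raw),
                                reserved = TRUE)

  paste0("https://s3-", region, ".amazonaws.com", resource,
         "?AWSAccessKeyId=", aws_access_key_id,
         "&Expires=", expires,
         "&Signature=", signature)
}
```

The Expires parameter is simply a Unix timestamp: the value 1471994487 in the example URL above corresponds to 2016-08-23 23:21:27 UTC. Once that moment has passed, S3 should reject the request, which is what makes the URL ephemeral.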
And here’s the code for aws_query_string_auth_url(...), as inspired by the short code snippet here:
Note the dependencies on the