Build a serverless URL shortener with AWS Lambda and API Gateway services

Written by Dave Konopka | Apr 3, 2016 5:25:00 PM

I’ve had my eye on Amazon’s Lambda and API Gateway services for a few months. Running a web application without the costs or headaches of maintaining servers is attractive. Of course, no platform is without tradeoffs. So to get myself familiar with the finer points of serverless apps, I decided to launch a simple project: building a URL shortener.

With a few hours of work and a few dollars per month in cost, I got a decent prototype working that is capable of handling millions of requests per month. This post walks through the setup along with the hitches I hit. There’s also an accompanying code repo that you can use to try this out yourself.

Project Overview

Here were the basic requirements for the project:

  • One HTTP endpoint to accept a JSON POST containing a short token and destination URL. These values would need to be stored somewhere.
  • A second HTTP endpoint to take a short token via GET, look up a corresponding destination URL, and return a 301 redirect.
  • The redirection endpoint should be accessible without authentication while the POST should require authentication.
  • Define everything via code as much as possible to make it reproducible.
  • Put the service on a custom domain.

To accomplish this I needed a few components:

  • A front-end to take in HTTP requests.
  • A back-end to do something with the requests and generate responses.
  • A datastore to keep all the associated short tokens and destination URLs.

Front-end: Amazon API Gateway

At the outset I wasn’t sure Amazon API Gateway would handle everything needed for the front-end. It would need to support both application/json and text/html content, as well as setting custom response codes and headers. It turned out that different content types are supported, and support for mapping Lambda function results to response headers was added at the end of 2015. API Gateway looked like it would fit the bill.

Creating an API

An API Gateway service is composed of resources and methods. Resources are essentially URL endpoints. Methods correspond to HTTP methods like GET, POST, PUT, etc.

The URL shortener’s endpoints are limited. There’s one GET to turn a provided short URL into a redirect. And a POST to take in short token and destination URL associations for storage.

Method Components

Each API Gateway method has four components. This is an example for the /{token} endpoint GET method that accepts a token and returns a redirect:

  • Method Request: defines the incoming request, including any parameters that get pulled from the request path, query string, and headers. It also supports enabling authorization for the particular method.
  • Integration Request: maps the request to a backend. In this case I’m using a Lambda function, but it also supports proxying to another HTTP service or mocking responses for development purposes. Here you pick which Lambda function will handle a request and map request parameters to a backend data payload.
  • Method Response: defines the collection of response status codes and headers that your service supports.
  • Integration Response: maps return data from the Lambda function execution to appropriate response codes. This can be done using regex matchers. More detail to follow on that. It also allows you to set custom response headers and set up templates to transform Lambda results.

Mapping Lambda Results to Response Headers

Part of the magic for the URL shortener is mapping Lambda function results to a response header. I needed to set the Location header to a destination URL provided by a Lambda function based on a short token lookup.

The Integration Response settings for the /{token} endpoint GET method map the Location header to a value in the Lambda function’s return JSON.
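To make the header mapping concrete, here is a minimal sketch that imitates it in plain JavaScript. The mapping strings follow the method.response.header.* and integration.response.body.* style that API Gateway uses, but the destination_url property name and the applyHeaderMappings helper are assumptions made for illustration. They are not taken from the project's actual configuration.

```javascript
// Hypothetical Lambda result; the property name "destination_url" is an assumption.
const lambdaResult = { destination_url: 'https://example.com/some/long/path' };

// Header mapping written in the style API Gateway uses for integration responses.
const headerMappings = {
  'method.response.header.Location': 'integration.response.body.destination_url',
};

// Illustrative helper (not an AWS API): resolves each mapping against the
// Lambda result body and returns the response headers that would be set.
function applyHeaderMappings(mappings, body) {
  const headers = {};
  for (const [target, source] of Object.entries(mappings)) {
    const headerName = target.replace('method.response.header.', '');
    const path = source.replace('integration.response.body.', '').split('.');
    headers[headerName] = path.reduce(
      (value, key) => (value == null ? undefined : value[key]),
      body
    );
  }
  return headers;
}

console.log(applyHeaderMappings(headerMappings, lambdaResult));
```

Combined with a 301 status code, a Location header produced this way is what sends the browser on to the destination URL.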

Requiring Authentication

The URL shortener’s POST method requires authentication to control who can post entries. One of the authentication options API Gateway offers is an API key. API keys can be generated and associated with applications. When enabled on a method, all requests to that method must include a valid API key header to execute:

x-api-key: bkayZOMvuy8aZOhIgxq94K9Oe7Y70Hw55

This approach allowed me to require authentication on the URL POST and leave the GET endpoint open to the world.
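As a rough sketch, an authenticated POST could be assembled like this. The buildShortenRequest helper, the token and url field names in the JSON body, and the example domain are all assumptions for illustration. Only the x-api-key header name comes from the text above.

```javascript
// Illustrative helper: builds the options for an authenticated POST.
// The body field names ("token", "url") are assumptions, not the project's schema.
function buildShortenRequest(apiKey, token, destinationUrl) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': apiKey, // API Gateway rejects the request without a valid key
    },
    body: JSON.stringify({ token: token, url: destinationUrl }),
  };
}

const request = buildShortenRequest('example-key', 'abc', 'https://example.com/long/path');
console.log(request.headers);

// With a runtime that provides fetch (for example Node 18+), the request
// could be sent to a hypothetical endpoint like so:
// fetch('https://short.example.com/', request).then((res) => console.log(res.status));
```

A GET to the redirect endpoint needs none of this, since that method is left open.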

Managing API as Code

One of my goals was to manage this project in code instead of clicking through the Amazon control panel. API Gateway offers the ability to export and import application configuration in Swagger format.

Swagger is a specification aimed at describing and documenting RESTful APIs in JSON or YAML format. A variety of services can consume Swagger files to visualize, test, or implement an API service.

I have to concede that getting started with Amazon’s API Gateway service was much easier using the web-based control panel. I ultimately was able to export the service definition into a YAML file. I was also able to apply changes and clone the API using the aws-apigateway-importer tool provided by Amazon.

All that said, I created the service first by clicking through the web panel. It greatly helped with my understanding of how API Gateway applications are structured. I recommend using the web panel as a starting point.

Back-end: Amazon Lambda

Lambda supports Java, JavaScript, and Python languages. Lambda also supports uploading and executing binaries so other languages are possible. I’m most familiar with JavaScript so I started there.

Function Inputs and Outputs

Lambda functions are executed on demand. Unlike with EC2, AWS provisions the compute environment whenever the function is executed. There’s no server to maintain and capacity scales as needed.

Each function can take data in the form of parameters, execute functions, and return data. Functions can import external libraries for things like connecting to other AWS services.

Lambda handler functions receive two parameters when invoked: event and context. When executed by API Gateway, event includes parameters translated from the incoming request. In the case of a token lookup, the token value from the URL is mapped to a JSON property named token. The context argument provides runtime information about the function’s execution environment.
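The shape of those two arguments can be sketched with a small local harness. The handle function and the invokeLocally stand-in are hypothetical and exist only to show where event.token and the context object come from. The real Lambda runtime supplies a much richer context.

```javascript
// Hypothetical handler: it only echoes what it received.
function handle(event, context) {
  // API Gateway's integration request mapping supplies event.token.
  const token = event.token;
  // context carries runtime details such as the function name.
  context.succeed({ token: token, handledBy: context.functionName });
}

// Local stand-in for the Lambda runtime: builds a fake context, invokes the
// handler, and returns whatever the handler reported.
function invokeLocally(handler, event) {
  const outcome = {};
  const context = {
    functionName: 'redir_lookup_token',
    succeed: (result) => { outcome.result = result; },
    fail: (error) => { outcome.error = error; },
  };
  handler(event, context);
  return outcome;
}

console.log(invokeLocally(handle, { token: 'abc' }));
```

In a deployed function the handler would be exported from the module rather than called directly.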

URL Shortener Functions

redir_lookup_token: takes in a token value and returns a corresponding destination URL.

redir_post_token: takes in a short URL and destination URL association and stores it.

Handling Errors with API Gateway

The Lambda function’s context argument also includes methods for generating a response. It offers succeed, fail, and done methods. The last method is a helper wrapper for the first two. It takes two parameters and if the first is anything other than null it triggers a fail.
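The relationship between the three methods can be sketched as follows. This is a stand-in written for illustration, not the runtime's actual implementation. The makeContext name is hypothetical, and treating undefined the same as null is an assumption.

```javascript
// Illustrative stand-in for the context's response methods.
function makeContext(onSucceed, onFail) {
  const context = {
    succeed: onSucceed,
    fail: onFail,
    // done(error, result) wraps the other two: a non-null error triggers fail,
    // otherwise the result is passed to succeed.
    done: function (error, result) {
      if (error !== null && error !== undefined) {
        context.fail(error);
      } else {
        context.succeed(result);
      }
    },
  };
  return context;
}

const context = makeContext(
  (result) => console.log('succeeded with', result),
  (error) => console.log('failed with', error)
);
context.done(null, { ok: true });        // routed to succeed
context.done('Not Found: no such token'); // routed to fail
```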

API Gateway offers regex pattern matching for mapping fail error messages to appropriate response status codes. This allows for translating backend failures to meaningful client responses. Various response status codes can be defined and associated with a regex pattern. If the Lambda function returns an error message matching one of the patterns, the client response will return the matching status code.

Each response code has its own template mapping. This allows response content to be tailored individually to each response condition. For the purposes of the short URL lookup, a 301 is the default response code for a successful request. If someone attempts a lookup for a token that doesn’t exist then a 404 should be returned.
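The matching logic can be approximated in a few lines. The patterns, the 400 entry, and the statusFor helper below are assumptions made for illustration. In the real service the patterns are configured per response in API Gateway rather than written in code.

```javascript
// Hypothetical patterns; each is anchored so the whole message must match.
const responseMappings = [
  { pattern: /^Not Found.*$/, status: 404 },
  { pattern: /^Bad Request.*$/, status: 400 },
];

// For the token lookup, a successful request redirects with a 301.
const DEFAULT_STATUS = 301;

// Returns the status code for a Lambda outcome. A null error message means the
// function succeeded; in this sketch an unmatched message also falls back to
// the default response.
function statusFor(errorMessage) {
  if (errorMessage === null || errorMessage === undefined) {
    return DEFAULT_STATUS;
  }
  for (const mapping of responseMappings) {
    if (mapping.pattern.test(errorMessage)) {
      return mapping.status;
    }
  }
  return DEFAULT_STATUS;
}

console.log(statusFor(null));
console.log(statusFor('Not Found: no entry for token abc'));
```

One consequence worth noting in this model is that an error message matching no pattern is served with the default status, so error strings and patterns need to be kept in step.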

The following shows an error condition in the token lookup Lambda function (line 7) and the corresponding 404 message response from the API Gateway.

API Gateway response statuses

Lambda function error result handling
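For readers without the screenshots, a lookup function with this kind of error condition might look roughly like the sketch below. It is not the code from the accompanying repo. The makeLookupHandler factory, the table name, the destination_url attribute, and the error strings are all assumptions. The db argument is expected to offer a get(params, callback) method in the style of the AWS SDK's DynamoDB DocumentClient.

```javascript
// Builds a lookup handler around an injected datastore client so that the
// sketch can run without AWS credentials.
function makeLookupHandler(db, tableName) {
  return function lookupToken(event, context) {
    const params = { TableName: tableName, Key: { token: event.token } };
    db.get(params, function (err, data) {
      if (err) {
        context.fail('Internal Error: lookup failed');
        return;
      }
      if (!data || !data.Item) {
        // The "Not Found" prefix is the kind of string a 404 regex could match.
        context.fail('Not Found: no entry for token ' + event.token);
        return;
      }
      context.succeed({ destination_url: data.Item.destination_url });
    });
  };
}

// In-memory stand-in for the datastore, for local experimentation only.
const fakeDb = {
  items: { abc: { token: 'abc', destination_url: 'https://example.com/long/path' } },
  get: function (params, callback) {
    callback(null, { Item: this.items[params.Key.token] });
  },
};

const lookup = makeLookupHandler(fakeDb, 'redir_urls');
lookup({ token: 'missing' }, {
  succeed: (result) => console.log('succeed:', result),
  fail: (message) => console.log('fail:', message),
});
```

Passing the datastore in as an argument is a design choice made for this sketch. It keeps the handler testable with a fake client like the one above.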

Managing Lambda Functions with Apex

Lambda functions can be created directly in the AWS web panel or uploaded via zip file package. There are frameworks that help manage the coding and deployment of Lambda applications from a local development setup.

I considered Serverless, an extensive framework for building applications with Lambda and API Gateway. I also looked at Apex, a minimal Lambda function manager.

Apex provides a lightweight structure for organizing function code and metadata. It offers a CLI command for deploying and executing Lambda functions. I decided on the simpler approach of Apex for this project. With Apex each function is represented by a directory with a JSON metadata file and a JavaScript file containing the function code.

As you write code you can deploy changes using the apex deploy command.

> apex deploy
   • deploying                 function=post_token
   • deploying                 function=lookup_token
   • created build (1.1 kB)    function=lookup_token
   • created build (1.2 kB)    function=post_token
   • config unchanged          function=post_token
   • code unchanged            function=post_token
   • config unchanged          function=lookup_token
   • code unchanged            function=lookup_token

Once uploaded, you can execute functions from your laptop with sample input using the apex invoke command.

For small projects Apex’s structure is easy to get working quickly and it doesn’t get in the way. The project mentions plans to include API Gateway management in the future.

Datastore: DynamoDB

In keeping with the serverless theme, DynamoDB was a natural fit for storing the short and destination URLs. With DynamoDB, instead of standing up a server, you specify allotments of read and write capacity. Amazon ensures the allotted read and write capacities are available. You can scale up or down capacity units as needed.

DynamoDB tables are schemaless. Primary indexes can be a single value or a composite of two fields. An important limitation of composite keys is that they are hierarchical: the second field can only be selected efficiently in the context of the first field. Selecting on the second field by itself results in expensive table scans. Since the URL shortener’s primary index is a single short token string, this was not a concern.
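The hierarchical nature of composite keys can be illustrated with a toy in-memory model. This is only a teaching sketch and says nothing about how DynamoDB is implemented. The CompositeKeyTable class and its method names are hypothetical.

```javascript
// Toy model of a composite-key table: first key -> (second key -> item).
class CompositeKeyTable {
  constructor() {
    this.partitions = new Map();
  }

  put(firstKey, secondKey, item) {
    if (!this.partitions.has(firstKey)) {
      this.partitions.set(firstKey, new Map());
    }
    this.partitions.get(firstKey).set(secondKey, item);
  }

  // With both keys known, the item is reached directly.
  get(firstKey, secondKey) {
    const partition = this.partitions.get(firstKey);
    return partition ? partition.get(secondKey) : undefined;
  }

  // With only the second key known, every item has to be examined.
  scanBySecondKey(secondKey) {
    const matches = [];
    let examined = 0;
    for (const partition of this.partitions.values()) {
      for (const [key, item] of partition) {
        examined += 1;
        if (key === secondKey) {
          matches.push(item);
        }
      }
    }
    return { matches: matches, examined: examined };
  }
}

const table = new CompositeKeyTable();
table.put('a', '1', 'item-a1');
table.put('b', '1', 'item-b1');
table.put('b', '2', 'item-b2');
console.log(table.get('b', '2'));
console.log(table.scanBySecondKey('1'));
```

A table keyed on a single token string, as in the URL shortener, only ever needs the direct kind of lookup.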

How much does this cost?

So how much does this all cost? The following examples assume a working example of roughly 1 million hits over a month. Prices do not include data transfer costs which will vary.

Service | Explanation | Cost
API Gateway | per 1 million requests per month | $3.50
Lambda | (hits * typical execution seconds) * (memory/1024) * $0.00001667 | $1.04
DynamoDB | 1 read, 1 write per second | $0.58
Grand Total | | $5.12/month
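The arithmetic behind the table can be checked with a short script. The post does not state the execution time or memory size behind the Lambda figure, so the 0.5 seconds and 128 MB used here are assumptions. They are simply one combination that reproduces $1.04 with the formula above. The rates are the ones quoted in the table.

```javascript
// Rates taken from the table above.
const API_GATEWAY_PER_MILLION_REQUESTS = 3.5;
const LAMBDA_PER_GB_SECOND = 0.00001667;
const DYNAMODB_MONTHLY = 0.58; // 1 read and 1 write per second

// (hits * typical execution seconds) * (memory/1024) * rate
function lambdaCost(hits, executionSeconds, memoryMb) {
  return hits * executionSeconds * (memoryMb / 1024) * LAMBDA_PER_GB_SECOND;
}

function monthlyCost(hits, executionSeconds, memoryMb) {
  const apiGateway = (hits / 1e6) * API_GATEWAY_PER_MILLION_REQUESTS;
  return apiGateway + lambdaCost(hits, executionSeconds, memoryMb) + DYNAMODB_MONTHLY;
}

// Assumed inputs: 1 million hits, 0.5 s per execution, 128 MB of memory.
console.log(lambdaCost(1e6, 0.5, 128).toFixed(2));
console.log(monthlyCost(1e6, 0.5, 128).toFixed(2));
```

As with the table, data transfer is left out of this estimate.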

What are the gotchas?

  • API Gateway response times can vary greatly. There is support for response caching, though, which greatly improves latency.
  • Importing API Gateway from Swagger doesn’t create required Lambda permissions. I had to manually reselect each Lambda function through the web panel to apply the appropriate permissions for a service newly cloned from Swagger.
  • API Gateway Swagger exports don’t seem to include any model schemas you create. The web panel seems to show an error message if you have no models.
  • API Gateway custom domains require SSL. There’s no connection with AWS Certificate Manager or Route 53 at this point. Certificates must be uploaded manually.
  • Custom domains are pointed at API Gateway using a CNAME record. This means using a subdomain which isn’t ideal for short URLs.