Sharp edges in serverless

I recently gave a talk on how to approach cloud-native (and serverless) development specifically for new startups or those coming from traditional development backgrounds.

I am a huge advocate for using cloud services, but one of the key themes of the talk was “beware of sharp edges”; basically that there are some seriously non-obvious pitfalls to look out for when relying on managed services.

Even as a seasoned engineer (one who has designed cloud services at AWS), I am still surprised by some of the usability traps hiding in the cloud. They seem to jump out when you least expect them, and the solutions and workarounds are often not pretty.

The devil is in the details, and the details are in the limits pages

When you start working with managed services, you quickly learn how important, and nuanced, service limits can be. I was bitten by this more than once as a junior engineer at AWS, and I am still occasionally surprised when working with clients.

For a service designer, limits are a fundamental way to protect the service and its downstream dependencies, and to create well-understood, well-tested boundaries around the service.

As a service user, the key here is to study the limits pages like your livelihood depends on it (maybe it does), and to understand the limits at architecture time. You do not want to encounter an unmovable object once you’re in production. What are the fundamental scaling limits of the service? Which limits can be increased and which can’t? How do the limits vary by region?

While cloud services may advertise themselves as “infinitely scalable”, reality is more complicated. For example, a Kinesis stream can contain an “infinite” number of shards, but each shard is limited to 1 MB/s of write throughput. Similarly, a DynamoDB table can be provisioned for “infinitely” high throughput, but there is still a fundamental limit on the throughput a single partition/host can support. In both of these examples, you’d better architect your application in such a way that your data is distributed across partitions/shards, or you’re going to run into a fundamental scaling wall, probably once you’re already in production.
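
As a minimal sketch of what distributing data across partitions or shards can look like, the snippet below spreads writes for one busy logical key across a fixed number of sub-keys by appending a hash-derived suffix. The helper names `shardedKey` and `allShardKeys`, the `#` suffix format, and the default of 10 sub-keys are illustrative assumptions, not anything prescribed by Kinesis or DynamoDB.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a partition key that spreads items sharing the
// same logical key (e.g. a busy customer ID) across `shardCount` sub-keys.
function shardedKey(logicalKey, itemId, shardCount = 10) {
  // Hash the item ID so the suffix is stable for a given item.
  const digest = crypto.createHash('md5').update(String(itemId)).digest();
  const suffix = digest.readUInt32BE(0) % shardCount;
  return `${logicalKey}#${suffix}`;
}

// Readers have to query every sub-key and merge the results.
function allShardKeys(logicalKey, shardCount = 10) {
  return Array.from({ length: shardCount }, (_, i) => `${logicalKey}#${i}`);
}

console.log(shardedKey('customer-42', 'order-1001'));
console.log(allShardKeys('customer-42', 3));
```

The trade-off is that reads become a fan-out over all sub-keys, so this only pays off when a single key would otherwise exceed what one partition or shard can absorb.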

On CloudFormation and Usability

I am a huge advocate for CloudFormation as a fundamental building block in AWS. In the past I have referred to it as the “gold standard” for IaC on AWS, and “the assembly language of the cloud”.

I maintain that CloudFormation, and tools that output CloudFormation such as the Cloud Development Kit and Serverless Framework, are the best option for IaC due to CFN’s focus on safe, predictable deployments, rollbacks, and first-class integrations with AWS services.

However, some design decisions seem to have traded usability for these characteristics. It’s no secret that CloudFormation has usability challenges – its declarative format comes with a steep learning curve, the dev/test cycle is slow, errors are obscure, and IDE/tooling support is basically nonexistent (CDK shows a lot of promise here).

Other tools, such as Terraform, seem to prioritize usability over safety – providing a clean syntax and partial deployments (no rollbacks) by default. The S3-based state store in particular should instill fear into the hearts of oncalls everywhere.

Obstacles to the serverless future

I maintain that serverless architectures have so many advantages that they should be the default choice for the majority of new applications. However, there are some sharp edges that may surprise you when building serverless applications on AWS.

I was working with a client recently on a relatively simple greenfield serverless app with 23 functions plus supporting resources. We quickly built out an MVP and a CI/CD pipeline and we were well on our way to changing the world. Right in the middle of one of the busiest sprints of the year:

Error --------------------------------------------------

The CloudFormation template is invalid: Template format error: Number of resources, 201, is greater than maximum allowed, 200

Fire your architect, because he didn’t anticipate this limit (it can’t be increased, I checked).

Yes, CloudFormation limits the number of resources in a single stack to 200, and yes, their official advice is to split the resources across multiple stacks.

That’s all well and good, except CloudFormation doesn’t really support moving resources between stacks. If you’re in production, prepare yourself for a painful blue/green style migration (especially your data stores), probably updates to your CI/CD tools, and expect your development progress to screech to a halt.

Of course, it makes perfect sense that CloudFormation has a limit to the number of resources in a stack. The service has to resolve a complex dependency graph and execute a distributed workflow to create, update, and roll back this complex graph of resources. This doesn’t change the fact that the limit represents an obstacle to serverless development on AWS because 1) it is too low and 2) the workarounds are too painful.

Serverless applications specifically demand a staggering number of resources: API Gateway methods for all of your APIs, including OPTIONS methods for CORS, IAM roles for least-privilege access control, permissions resources, etc. This will only get worse with time as higher-level abstractions like the CDK gain in popularity.
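
One way to avoid being surprised by this limit is to count resources in the generated template as part of the build. The following is a rough sketch under stated assumptions: the 200 figure comes from the error message above, while `countResources`, `checkResourceBudget`, and the 80% warning threshold are hypothetical names and values chosen for illustration.

```javascript
// Limit taken from the error message above; treat it as an assumption and
// check the current CloudFormation limits page for your account.
const MAX_RESOURCES_PER_STACK = 200;

// Count the entries under the template's top-level "Resources" section.
function countResources(template) {
  return Object.keys(template.Resources || {}).length;
}

// Hypothetical CI guard: report how close a template is to the limit.
function checkResourceBudget(template, warnRatio = 0.8) {
  const count = countResources(template);
  if (count > MAX_RESOURCES_PER_STACK) return { count, status: 'over-limit' };
  if (count >= MAX_RESOURCES_PER_STACK * warnRatio) return { count, status: 'warning' };
  return { count, status: 'ok' };
}

// Example with a tiny in-memory template; a real check would load the
// generated JSON template from disk instead.
const template = {
  Resources: {
    MyFunction: { Type: 'AWS::Lambda::Function' },
    MyRole: { Type: 'AWS::IAM::Role' },
  },
};
console.log(checkResourceBudget(template));
```

Failing the build at the warning threshold leaves time to plan a stack split before production data stores are involved.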

In many cases, it makes sense to decompose your architecture into multiple stacks, but for teams getting off the ground, this may be an arbitrarily-imposed architectural decision.

If this is indeed a best practice, then AWS tools should make it easy to do. To date, tools such as SAM, CDK, and Serverless Framework do not use multiple stacks by default, with panicked users such as myself instead relying on community-driven solutions.

AWS is heavily invested in serverless but there are still plenty of usability obstacles to overcome. In addition to making it easy to get started on a serverless project, AWS should also enable users to evolve an application into a mature production-grade application with as little friction as possible.

API Gateway Regional vs. Edge-Optimized Endpoints

API Gateway recently launched regional endpoints, a deceptively simple feature that has important implications:

  • lower latency for clients located in the same AWS region (e.g. running in EC2 or Lambda)
  • ability to manage your own CloudFront distribution or WAF for your API
  • ability to manage DNS routing for your custom domain name

In my opinion, the biggest win here is the ability to integrate Route53 DNS routing with your REST APIs. If you replicate your APIs to multiple regions (using OpenAPI import, for example), you can take advantage of powerful Route53 features such as latency-based routing, regional failover, and blue/green deployments.

There are distinct advantages for both options. Here’s my personal take on when to use each:

When to use regional endpoints:

  • your clients are predominantly located in the same AWS region as the API (e.g. running in EC2 or Lambda)
  • you want to manage your own CloudFront distribution and use CloudFront features such as custom routing rules, edge caching, WAF, Lambda@Edge, etc
  • you want to take advantage of DNS routing for your custom domain name

When to use edge-optimized endpoints:

  • You have geographically distributed clients
  • You don’t want to pay for and manage your own CloudFront distribution

A note on latency benchmarking:

A common pattern I’ve seen is for developers to conduct performance tests against API Gateway with traffic originating from a single EC2 region, or worse, from a single development machine. These types of tests will likely produce better latency results using regional endpoints. However, keep in mind that if you have geographically distributed clients, synthetic tests will not represent the client experience. The best way to truly measure this is to track client-side latency metrics from your API clients.
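
As a minimal sketch of client-side latency tracking, assuming a Node.js API client, the snippet below times each call and summarizes the samples with a nearest-rank percentile. The names `percentile` and `timed` are hypothetical helpers, and how the samples are reported back (logs, metrics service, etc.) is left open.

```javascript
// Nearest-rank percentile over an array of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) return undefined;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical wrapper: time any async API call and record its duration,
// whether the call succeeds or throws.
async function timed(samples, apiCall) {
  const start = process.hrtime.bigint();
  try {
    return await apiCall();
  } finally {
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
}
```

Comparing percentiles per client location, rather than a single average from one test machine, is what makes the regional versus edge-optimized difference visible.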

Congrats to the API Gateway team on a very important release.

Cheers,

Ryan

Generic Amazon API Gateway Java Client (SDK)

I’ve recently released apigateway-generic-java-sdk, a simple generic Java client for Amazon API Gateway endpoints for those that don’t necessarily want to generate a strongly-typed SDK. This is particularly useful when the API definition is changing rapidly or when you don’t want to go through the effort of generating and bundling an SDK, such as when prototyping or scripting.

It is optimized to run from a Lambda function and does not require any extra dependencies beyond the AWS SDK, which is already bundled in the Lambda runtime.

Features

  • AWS SigV4 request signing. Supports APIs authenticated with IAM auth using the standard AWSCredentialsProvider interface
  • API Keys
  • Custom headers
  • Throws exceptions for non-2xx response codes
  • Compatibility with existing AWS SDK client configuration (connections, retry policies, etc)
  • Runs in AWS Lambda functions with no additional dependencies

Example

GenericApiGatewayClient client = new GenericApiGatewayClientBuilder()
        .withClientConfiguration(new ClientConfiguration())
        .withCredentials(new EnvironmentVariableCredentialsProvider())
        .withEndpoint("https://XXXXXX.execute-api.us-east-1.amazonaws.com")
        .withRegion(Region.getRegion(Regions.fromName("us-east-1")))
        .withApiKey("XXXXXXXXXXXXXXX")
        .build();

Map<String, String> headers = new HashMap<>();
headers.put("Content-Type", "application/json");

try {
    GenericApiGatewayResponse response = client.execute(
            new GenericApiGatewayRequestBuilder()
                    .withBody(new ByteArrayInputStream("foo".getBytes()))
                    .withHttpMethod(HttpMethodName.POST)
                    .withHeaders(headers)
                    .withResourcePath("/stage/path").build());
    
    System.out.println("Response: " + response.getBody());
    System.out.println("Status: " + response.getHttpResponse().getStatusCode());
    
} catch (GenericApiGatewayException e) {   // exception thrown for any non-2xx response
    System.out.println(String.format("Client threw exception with message %s and status code %s", 
            e.getMessage(), e.getStatusCode()));
}

To get the code and for more examples, see the GitHub repo.

How to send response headers for AWS Lambda function exceptions in API Gateway

In a previous post on error handling in API Gateway I discussed various ways to map errors from your Lambda function to appropriate API status codes and how to build the response body appropriately for different types of errors.

One common pattern we’ve seen come up is the requirement to return a specific response header value depending on the type of error from the Lambda function. While this is most easily achieved with proxy integrations, some prefer to use the explicitly-mapped “AWS” integration type. This allows their Lambda function implementation to use native error types and decouples it from their API Gateway configuration.

In this example I will show how to manipulate the HTTP status code, the response body, as well as a response header value based on the Lambda error outcome.

Note that this example is specific to the NodeJS runtime and only allows setting a single header. Please post in the comments if you have similar solutions for other runtimes.

This technique makes use of the fact that Lambda serializes the exception type in the errorType field, which can be mapped to a header value in API Gateway. This is a workaround solution until API Gateway supports JSON-parsing in parameter mapping expressions.

Lambda function

exports.handler = (event, context, callback) => {
    function MyError(message) {
        Error.captureStackTrace(this, this.constructor); // mapped to $.stackTrace
        this.customMessage = message; // custom field
        this.name = 'test header value'; // mapped to $.errorType
        this.message = JSON.stringify(this); // mapped to $.errorMessage
    }
    require('util').inherits(MyError, Error);
    callback(new MyError("BadRequest - my error message"));
};

Observe that this Lambda function sets the error.name field to the value desired in the response header. When the error is serialized by Lambda, this becomes the “errorType” field in the Lambda response. You can also set custom properties in the Lambda error which can be used when rendering the API response body.

This Lambda function outputs the following response:

{
"errorMessage": "{\"customMessage\":\"BadRequest - my error message\",\"name\":\"test header value\"}",
"errorType": "test header value",
"stackTrace": [
"exports.handler.message (/var/task/index.js:12:14)"]
}
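
To see why the response has this shape, here is a small local sketch that rebuilds the same error with an ES2015 class and passes it through an approximation of the serialization step. `toLambdaErrorPayload` is a hypothetical stand-in written for illustration; it is an assumption about the resulting shape, not the Lambda runtime's actual code.

```javascript
// Same idea as the handler's MyError, written as a class for brevity.
class MyError extends Error {
  constructor(message) {
    super(message);
    this.customMessage = message;        // custom field
    this.name = 'test header value';     // surfaces as errorType
    this.message = JSON.stringify(this); // surfaces as errorMessage
  }
}

// Approximation (an assumption) of how a failed invocation is shaped into
// the errorMessage / errorType / stackTrace payload.
function toLambdaErrorPayload(err) {
  return {
    errorMessage: err.message,
    errorType: err.name,
    stackTrace: String(err.stack).split('\n').slice(1).map((line) => line.trim()),
  };
}

const payload = toLambdaErrorPayload(new MyError('BadRequest - my error message'));
console.log(JSON.stringify(payload, null, 2));
```

Only enumerable own properties end up in the stringified message, which is why `customMessage` and `name` appear there while the stack does not.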

API definition

---
swagger: "2.0"
info:
  version: "2017-01-25T19:26:48Z"
  title: "API Gateway Test API"
host: "79rptjwqbk.execute-api.us-east-1.amazonaws.com"
basePath: "/test"
schemes:
- "https"
paths:
  /{proxy+}:
    x-amazon-apigateway-any-method:
      produces:
      - "application/json"
      parameters:
      - name: "proxy"
        in: "path"
        required: true
        type: "string"
      responses:
        200:
          description: "200 response"
        400:
          description: "400 response"
          headers:
            test-header:
              type: "string"
        500:
          description: "500 response"
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: "200"
          .*BadRequest.*:
            statusCode: "400"
            responseParameters:
              method.response.header.test-header: "integration.response.body.errorType"
            responseTemplates:
              application/json: "#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))\n\
                \n{\n\"my-message\" : \"$errorMessageObj.customMessage\"\n}"
        uri: "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:XXXXXXXX:function:errors/invocations"
        passthroughBehavior: "when_no_match"
        httpMethod: "POST"
        contentHandling: "CONVERT_TO_TEXT"
        type: "aws"

Note that the method response header is set to the value of the “errorType” field in the Lambda error response for 400 responses.

Zooming in on the mapping template for the 400 response:

#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))
{
   "my-message" : "$errorMessageObj.customMessage"
}

The “stringified” errorMessage is parsed by the mapping template so that all properties (including custom properties) of the error object can be accessed in the mapping template to build the response body.
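
The whole mapping can be simulated locally to check the logic before deploying. This is a simplified sketch, assuming that the selection pattern has to match the entire errorMessage string; `mapIntegrationResponse` is a hypothetical function, and the real evaluation happens inside API Gateway.

```javascript
// Simplified local simulation of the integration responses defined above.
function mapIntegrationResponse(lambdaError) {
  // Assumption: the ".*BadRequest.*" pattern is applied to the whole errorMessage.
  const selectionPattern = /^.*BadRequest.*$/;
  if (!selectionPattern.test(lambdaError.errorMessage)) {
    // Falls through to the "default" integration response.
    return { statusCode: 200, headers: {}, body: lambdaError };
  }
  // Equivalent of $util.parseJson($input.path('$.errorMessage'))
  const errorMessageObj = JSON.parse(lambdaError.errorMessage);
  return {
    statusCode: 400,
    // Equivalent of integration.response.body.errorType
    headers: { 'test-header': lambdaError.errorType },
    body: { 'my-message': errorMessageObj.customMessage },
  };
}

const simulated = mapIntegrationResponse({
  errorMessage: '{"customMessage":"BadRequest - my error message","name":"test header value"}',
  errorType: 'test header value',
  stackTrace: [],
});
console.log(simulated);
```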

Invoking this method produces the following results, appropriately setting the trifecta of status code, response body, and response header.

Request: GET /test
Status: 400
Response Body
{
  "my-message": "BadRequest - my error message"
}
Response Headers
{"test-header":"test header value","X-Amzn-Trace-Id":"Root=1-5888fee9-79b51571774c74ebeeb3eb62","Content-Type":"application/json"}

Comments, questions, and improvements are welcome!

-Ryan

Bare-bones Swagger Example for API Gateway Simplified Proxy Features

Amazon API Gateway just made it a LOT easier to build an API to front an existing HTTP backend or Lambda functions.

Recent additions of a few simple but powerful new features reduce the amount of configuration needed to build an API Gateway proxy several-fold. No more mapping templates, parameter mapping, response mappings, etc. (unless you need them).

Here’s a super simple example demonstrating the 3 new features (greedy path parameter, “ANY” method, and proxy integration types).

This API will accept requests using any HTTP method to any subpath of any depth under either /http or /lambda. Any request under /http will proxy all headers, path parameters, and query string parameters to the HTTP integration (httpbin.org). Any request under /lambda will invoke a Lambda function with complete API request data in accordance with the proxy convention defined here.

---
swagger: "2.0"
info:
  version: "2016-09-23T22:23:23Z"
  title: "Simple Proxy Example - Ryan Green"
host: "zte3bswjjb.execute-api.us-east-1.amazonaws.com"
basePath: "/demo"
schemes:
- "https"
paths:
  /http/{proxy+}:
    x-amazon-apigateway-any-method:
      parameters:
      - name: "proxy"
        in: "path"
      x-amazon-apigateway-integration:
        type: "http_proxy"
        uri: "http://httpbin.org/{proxy}"
        httpMethod: "ANY"
        passthroughBehavior: "when_no_match"
        requestParameters:
          integration.request.path.proxy: "method.request.path.proxy"
  /lambda/{proxy+}:
    x-amazon-apigateway-any-method:
      parameters:
      - name: "proxy"
        in: "path"
      responses: {}
      x-amazon-apigateway-integration:
        type: "aws_proxy"
        uri: "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:[MY_ACCOUNT_ID]:function:[MY_FUNCTION_NAME]/invocations"
        passthroughBehavior: "when_no_match"
        httpMethod: "POST"
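
On the Lambda side, a handler behind the /lambda/{proxy+} resource could look roughly like the sketch below. It assumes the proxy event carries `httpMethod`, `path`, `pathParameters`, and `queryStringParameters`, and that the function answers with `statusCode`, `headers`, and a string `body`; the echo behavior itself is only for illustration.

```javascript
// Minimal sketch of a handler for the aws_proxy integration: it echoes back
// a summary of the request it received.
const handler = async (event) => {
  const summary = {
    method: event.httpMethod,
    path: event.path,
    // The greedy {proxy+} path parameter holds the matched sub-path.
    proxy: event.pathParameters ? event.pathParameters.proxy : null,
    query: event.queryStringParameters || {},
  };
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(summary), // the body must be a string
  };
};

exports.handler = handler;
```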

Easy API Gateway/Lambda “Serverless” API Logging/Debugging

Developing and testing “serverless” APIs using Amazon API Gateway and AWS Lambda can be made much easier with built-in support for CloudWatch Logs.

In Lambda functions you can use log statements to send log events to CloudWatch Log streams, and API Gateway automatically submits log events for requests to APIs with logging enabled.

However, it can be difficult to reconcile log events for a serverless API sent across multiple CloudWatch log groups and log streams. Tracking down logs for a specific request or tailing request logs for a serverless API can be a cumbersome experience.

To help improve the serverless dev/debug/test experience, I’ve released a fork of the excellent awslogs project to include native support for API Gateway/Lambda serverless APIs. Given an API Gateway REST API ID and Stage name, this tool produces an aggregated stream of time-ordered*, color-coded log events emitted by API Gateway and all Lambda functions attached to your API. The log events can then be further filtered and processed by standard command-line tools.

For example, to stream all log events emitted from API Gateway as well as from all Lambda functions attached to the API:

apilogs get --api-id xyz123 --stage prod --watch

or search APIG/Lambda logs for events from a specific request ID in the past hour:

apilogs get --api-id xyz123 --stage prod --start='1h ago' | grep "6605b081-6f04-11e6-97ac-c34deb0b3dd9"
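
Conceptually, the aggregation described above amounts to merging events from several log streams by timestamp and then filtering them. The sketch below illustrates that idea only; it is not how apilogs is implemented, and the event shape (`timestamp`, `message`, `source`) and the function names are assumptions.

```javascript
// Merge several per-stream arrays of log events into one time-ordered list.
// Each event is assumed to look like { timestamp: <ms since epoch>, message, source }.
function mergeLogEvents(streams) {
  return streams.flat().sort((a, b) => a.timestamp - b.timestamp);
}

// Rough equivalent of piping the merged output through grep for a request ID.
function filterByRequestId(events, requestId) {
  return events.filter((event) => event.message.includes(requestId));
}
```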

Tip: To correlate API Gateway request IDs with Lambda invocations, send $context.requestId to your Lambda function via a mapping template and include it in your Lambda log messages (e.g. console.log(event.apiRequestId + " - log message");)
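
A minimal sketch of that tip on the Lambda side is shown below. It assumes the mapping template adds a field such as "apiRequestId" : "$context.requestId" to the event; `logWithRequestId` is a hypothetical helper name.

```javascript
// Prefix every log line with the API Gateway request ID so that Lambda log
// events can be matched to the corresponding API Gateway log events.
function logWithRequestId(event, message) {
  const line = `${event.apiRequestId} - ${message}`;
  console.log(line);
  return line;
}

const handler = (event, context, callback) => {
  logWithRequestId(event, 'handler started');
  callback(null, { ok: true });
};

exports.handler = handler;
```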

Check out ‘apilogs’ here. Fixes and contributions are greatly appreciated.

Happy debugging!
Ryan

An API Gateway mapping template to “send everything” to your Lambda function

If you’re trying to get up to speed with developing microservices on API Gateway and Lambda, one of the first things you will want to try is to send basic API request data to your Lambda function.

Since all Lambda function input data must go in the request body, you must use an API Gateway mapping template to build a JSON representation of your data.

Here’s a master template to “send everything” API Gateway provides (as of 02/22/2016) to your Lambda function. This should serve as a good starting point and can be modified to suit your use-case. This will include all HTTP parameters, context data, stage variables, and the full method request body.

## API Gateway "Send Everything" Mapping Template - Ryan Green - ryang@ryang.ca
##  See http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
#set($allParams = $input.params())
{
"body-json" : $input.json('$'),
"params" : {
#foreach($type in $allParams.keySet())
    #set($params = $allParams.get($type))
"$type" : {
    #foreach($paramName in $params.keySet())
    "$paramName" : "$util.escapeJavaScript($params.get($paramName))"
        #if($foreach.hasNext),#end
    #end
}
    #if($foreach.hasNext),#end
#end
},
"stage-variables" : {
#foreach($key in $stageVariables.keySet())
"$key" : "$util.escapeJavaScript($stageVariables.get($key))"
    #if($foreach.hasNext),#end
#end
},
"context" : {
    "account-id" : "$context.identity.accountId",
    "api-id" : "$context.apiId",
    "api-key" : "$context.identity.apiKey",
    "authorizer-principal-id" : "$context.authorizer.principalId",
    "caller" : "$context.identity.caller",
    "cognito-authentication-provider" : "$context.identity.cognitoAuthenticationProvider",
    "cognito-authentication-type" : "$context.identity.cognitoAuthenticationType",
    "cognito-identity-id" : "$context.identity.cognitoIdentityId",
    "cognito-identity-pool-id" : "$context.identity.cognitoIdentityPoolId",
    "http-method" : "$context.httpMethod",
    "stage" : "$context.stage",
    "source-ip" : "$context.identity.sourceIp",
    "user" : "$context.identity.user",
    "user-agent" : "$context.identity.userAgent",
    "user-arn" : "$context.identity.userArn",
    "request-id" : "$context.requestId",
    "resource-id" : "$context.resourceId",
    "resource-path" : "$context.resourcePath"
    }
}

Here’s a Gist with the code.

For more information, check out the API Gateway mapping template reference.
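
As a rough sketch of how a function might consume this payload, the handler below reads a few of the fields the template produces. The key names (`params`, `stage-variables`, `context`, `source-ip`, `http-method`) come from the template above; what the handler returns is purely illustrative.

```javascript
// Reads selected fields from the event built by the "send everything" template.
const handler = (event, context, callback) => {
  const query = (event.params && event.params.querystring) || {};
  const stageVariables = event['stage-variables'] || {};
  const apiContext = event.context || {};
  callback(null, {
    sourceIp: apiContext['source-ip'],
    httpMethod: apiContext['http-method'],
    queryKeys: Object.keys(query),
    stageVariableKeys: Object.keys(stageVariables),
  });
};

exports.handler = handler;
```

Hyphenated keys such as `source-ip` need bracket notation in JavaScript, which is worth keeping in mind if you rename fields in the template.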

Cheers,

Ryan

How To: HTTP redirects with API Gateway and Lambda

Update (2017-08-15): Recent service updates have removed the need for the workarounds described below, though they may still be useful in some cases, or for historical context.

Achieving HTTP redirects with API Gateway and Lambda is now trivial with the addition of Proxy integrations.

Simply define an API with an ‘aws_proxy’ integration type, and implement your Lambda function to explicitly return the redirect status code and Location header.

Lambda Function

For example:

'use strict';
 
exports.handler = function(event, context, callback) {
    var response = {
        statusCode: 301,
        headers: {
            "Location" : "http://ryangreen.ca"
        },
        body: null
    };
    callback(null, response);
};

Swagger

---
swagger: "2.0"
info:
  version: "2017-08-15T19:29:52Z"
  title: "redirect test"
basePath: "/prod"
schemes:
- "https"
paths:
  /redirect2:
    x-amazon-apigateway-any-method:
      x-amazon-apigateway-integration:
        uri: "arn:aws:apigateway:[REGION]:lambda:path/2015-03-31/functions/arn:aws:lambda:[REGION]:[ACCOUNT_ID]:function:[FUNCTION_NAME]/invocations"
        passthroughBehavior: "when_no_match"
        httpMethod: "POST"
        type: "aws_proxy"

——

API Gateway recently released support for mapping response bodies to response headers. One application of this feature is to enable conditional HTTP redirects in your API Gateway/Lambda API.

There are a couple of ways to achieve 30x redirects in your API Gateway/Lambda API.

Option 1: 30X as “default” response

This option is preferred if your API method always redirects (i.e. never returns a normal 2XX response).

1) Define a method response with status 302, and a “Location” header defined
2) Define a “default” integration response mapping with blank regex, mapping to 302.
3) For this response, define a “Location” header mapping from the redirect URL returned in your Lambda function. i.e. “integration.response.body.location”
4) Configure your Lambda function to return the redirect location in the body, e.g.

context.succeed({location : "http://example.com"})

Swagger example

/lambdaredirect-default:
    get:
      produces:
      - "application/json"
      parameters: []
      responses:
        200:
          description: "200 response"
          schema:
            $ref: "#/definitions/Empty"
          headers: {}
        302:
          description: "302 response"
          headers:
            Location:
              type: "string"
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: "302"
            responseParameters:
              method.response.header.Location: "integration.response.body.location"
        uri: "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:[ACCOUNT_ID]:function:redirect-default/invocations"
        httpMethod: "POST"
        type: "aws"

Lambda function

exports.handler = function(event, context) {
    context.succeed({
        location : "https://example.com"
    });
};

Option 2: 30X as an error response

This option allows your method to return both “successful” (2XX) and redirect outcomes, but requires you to model redirects in your lambda function as errors.

1) Define a method response with status 302, and a “Location” header defined. Leave the “default” integration 2XX response with blank regex, mapped to your 2XX method response.
2) Define the redirect integration response mapping with a regex that matches your redirect URLs, such as “https://.*”, mapped to your 30X response.
3) For this response, map the redirect URL returned in the error message of your Lambda function to your “Location” header: “integration.response.body.errorMessage”
4) Configure your Lambda function to return the redirect location as the error message, e.g.

context.fail("https://example.com")

or, in Java,

throw new RuntimeException("https://example.com")

5) Optional: If you don’t want to expose the Lambda error response body to the client, define a mapping template on the redirect response to nullify the response body. You can use a template with a comment to render an empty response.

Swagger example

 /lambdaredirect-error:
    get:
      produces:
      - "application/json"
      parameters: []
      responses:
        200:
          description: "200 response"
          schema:
            $ref: "#/definitions/Empty"
          headers: {}
        302:
          description: "302 response"
          headers:
            Location:
              type: "string"
      x-amazon-apigateway-integration:
        responses:
          default:
            statusCode: "200"
          https://.*:
            statusCode: "302"
            responseParameters:
              method.response.header.Location: "integration.response.body.errorMessage"
            responseTemplates:
              application/json: "## intentionally blank"
        uri: "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:[ACCOUNT_ID]:function:redirect-error/invocations"
        httpMethod: "POST"
        type: "aws"

Lambda function

exports.handler = function(event, context) {
    context.fail("https://example.com");
};
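
Since the point of Option 2 is to support both outcomes, a slightly fuller sketch might redirect only for some requests and succeed otherwise. The `MOVED` lookup table and the `event.path` field are assumptions made for illustration; with the “aws” integration type the event contains whatever your mapping template sends.

```javascript
// Hypothetical lookup table of paths that have moved.
const MOVED = new Map([['/old-page', 'https://example.com/new-page']]);

const handler = (event, context, callback) => {
  const target = MOVED.get(event.path);
  if (target) {
    // The URL becomes the errorMessage, which the "https://.*" pattern
    // maps to the 302 response and its Location header.
    callback(new Error(target));
    return;
  }
  // No redirect needed: falls through to the default 200 response.
  callback(null, { message: 'no redirect needed' });
};

exports.handler = handler;
```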

Here’s a full Gist with the example.

Cheers,
Ryan

Sending all HTTP parameters in API Gateway

Here’s an API Gateway mapping template to send all HTTP path, query string, and header parameters to your backend integration (i.e. Lambda function).

#set($allParams = $input.params())
{
    "body-json" : $input.json('$'),
    "params" : {
    #foreach($type in $allParams.keySet())
        #set($params = $allParams.get($type))
        "$type" : {
            #foreach($paramName in $params.keySet())
                "$paramName" : "$util.escapeJavaScript($params.get($paramName))"
                #if($foreach.hasNext),#end
            #end
        }
        #if($foreach.hasNext),#end
    #end
    }
}

will produce something like:

{
   "body-json":{},
   "params":{
      "path":{
         "pathParamName1":"pathParamValue1",
         "pathParamName2":"pathParamValue2",
         "pathParamName3":"pathParamValue3"
      },
      "querystring":{
         "queryParamName1":"queryParamValue1",
         "queryParamName2":"queryParamValue2",
         "queryParamName3":"queryParamValue3"
      },
      "header":{
         "headerParamName1":"headerParamValue1",
         "headerParamName2":"headerParamValue2",
         "headerParamName3":"headerParamValue3"
      }
   }
}

The parameters can then be accessed in your Lambda function, i.e.

exports.myHandler = function(event, context) {
   console.log("pathParamName1 = " + event.params.path.pathParamName1);
   context.succeed("");
}

Here’s a Gist with the example.

Cheers,
Ryan