Custom rust runtime for AWS Lambda

2020-09-19

I recently decided to put my newfound love of Rust to work in the form of an AWS Lambda. Great idea, right? Unfortunately, Rust is not natively supported by Lambda as of yet. In this article, I will describe exactly how to overcome that obstacle by way of a custom runtime.

The first thing we need to grok is how a lambda works. You can read all about custom runtimes in the documentation but, well, it's surprisingly straightforward. You see, lambdas are just programs running in a container, and the only requirement is that you invoke a specific REST API when the main application is done with its work. There are a few dedicated endpoints you will call, for example:

  • One dedicated to retrieving work details
  • Another for reporting that your lambda completed successfully
  • A final one for reporting a failure
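Concretely, these map to HTTP endpoints on the runtime API (the host:port arrives via the AWS_LAMBDA_RUNTIME_API environment variable). Here is a sketch of the path shapes, with helper names of my own invention:

```rust
// Hypothetical helpers mirroring the runtime API paths (API version 2018-06-01).
fn next_uri(api: &str) -> String {
    // Poll for the next unit of work.
    format!("http://{}/2018-06-01/runtime/invocation/next", api)
}

fn response_uri(api: &str, request_id: &str) -> String {
    // Report a successful result for a specific request.
    format!("http://{}/2018-06-01/runtime/invocation/{}/response", api, request_id)
}

fn error_uri(api: &str, request_id: &str) -> String {
    // Report a failure for a specific request.
    format!("http://{}/2018-06-01/runtime/invocation/{}/error", api, request_id)
}

fn main() {
    // The host:port is handed to us in AWS_LAMBDA_RUNTIME_API at run time;
    // hardcoded here purely for illustration.
    let api = "127.0.0.1:9001";
    println!("{}", next_uri(api));
    println!("{}", response_uri(api, "some-request-id"));
    println!("{}", error_uri(api, "some-request-id"));
}
```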

Which brings us to the first decision of our runtime: how do we make HTTP requests?

Runtime Dependencies

I rejected some of the biggest-name HTTP client libraries because they bloated my final binary considerably. I wanted to keep the footprint lightweight, so I settled on http_req. It's a very easy-to-use library and really small. Win-win!

My runtime doesn't actually require very many dependencies. In addition to the HTTP engine, we need serde and serde_json to serialize/deserialize some JSON, and that's it. To provide a kind and benevolent interface, I also included async-trait. And my demo uses tokio.

With this set of libraries, the final binary comes in at around 10 MB.
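For reference, the dependency list translates to a Cargo.toml along these lines. The package name and version numbers are my own placeholders; pin to whatever is current for you (this post is from 2020, hence the tokio 0.2-era numbers):

```toml
[package]
name = "lambda-runtime-demo"   # placeholder name
version = "0.1.0"
edition = "2018"

[dependencies]
http_req = "0.7"                                  # tiny HTTP client
serde = { version = "1", features = ["derive"] }  # (de)serialization traits
serde_json = "1"                                  # JSON support
async-trait = "0.1"                               # async methods in traits
tokio = { version = "0.2", features = ["macros", "rt-core"] }  # demo executor
```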

The Code

The first thing to note is our dependencies and constants. Nothing too outrageous here. We'll be using async_trait for defining the handler entrypoint. Since the handler function must be asynchronous, this helps create a lovely interface that is easy to use.

Additionally, I've defined a few convenience constants and methods. These are consistent with the runtime documentation.

#![allow(dead_code)]
use async_trait::async_trait;
use std::collections::BTreeMap;
use std::env;
use http_req::request;
use http_req::response::Headers;

pub type LambdaStatus = Result<String, String>;

const API_VERSION: &str = "2018-06-01";
const REQUEST_ID_HEADER: &str = "Lambda-Runtime-Aws-Request-Id";
const TRACE_ID_HEADER: &str = "Lambda-Runtime-Trace-Id";
const FUNCTION_ARN_HEADER: &str = "Lambda-Runtime-Invoked-Function-Arn";

/* Most meta information is provided to us through
environment variables. Use this to access it. */
fn get_env() -> BTreeMap<String, String> {
    env::vars().collect()
}

/* Convenience method to accept http_req headers
and convert them to a map structure */
fn headers_to_map(headers: &Headers) -> BTreeMap<String, String> {
    headers.iter()
        .map(|(key, val)| (key.to_string(), val.to_string()))
        .collect()
}

If you have used lambda before, you may be familiar with the handler function. As the entrypoint into your program, it accepts a payload from the lambda runtime which is effectively the arguments to your function. Rust does not allow traits to have asynchronous methods, at least not without a little help. This is why the async_trait dependency was added.

The use of generics here helps ensure the payload can be as dynamic as possible.

#[async_trait]
pub trait LambdaHandler<T> {
    async fn handle(&self, payload: T) -> LambdaStatus;
}

By the way, all subsequent code will be living inside a module named runtime.

/// This is the lambda runtime helper. It has the ability to
/// execute specific functions within the lambda ecosystem.
pub mod runtime {
    use super::*;

}

Invocation Request

InvocationRequest as I call it may be more familiar to some as "context". This structure defines the necessary information to work within the confines of the lambda runtime. It will be the main structure we use to pass around details of the request.

  • request_id - a GUID identifying this particular request. If the invocation failed and is being retried, the retry may carry an identical request_id.
  • trace_id - guaranteed to be unique for every invocation.
  • function_arn - not strictly necessary, but nice to have for logging purposes.

struct InvocationRequest<T> {
    pub payload: T,
    request_id: String,
    trace_id: String,
    function_arn: String,
}

/// This method will take a relative path, and build a fully-qualified
/// path compatible with the internal lambda routes.
fn build_uri(path: &str) -> String {
    let api = env::var("AWS_LAMBDA_RUNTIME_API")
        .unwrap_or_else(|_| String::from("127.0.0.1"));
    format!("http://{}/{}/runtime{}", api, API_VERSION, path)
}
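As a quick sanity check, here is the shape of URI this produces. The snippet repeats the same logic so it stands on its own; the host:port value is a placeholder for what AWS actually injects:

```rust
use std::env;

const API_VERSION: &str = "2018-06-01";

// Same logic as build_uri above, repeated so the sketch is self-contained.
fn build_uri(path: &str) -> String {
    let api = env::var("AWS_LAMBDA_RUNTIME_API")
        .unwrap_or_else(|_| String::from("127.0.0.1"));
    format!("http://{}/{}/runtime{}", api, API_VERSION, path)
}

fn main() {
    // Inside a real lambda container, AWS sets this to a host:port pair.
    env::set_var("AWS_LAMBDA_RUNTIME_API", "127.0.0.1:9001");
    println!("{}", build_uri("/invocation/next"));
    // → http://127.0.0.1:9001/2018-06-01/runtime/invocation/next
}
```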

/// This method will ask the lambda runtime for a new job. If one is available,
/// it will deserialize the request payload and return the full context.
async fn next_invocation<R> () -> Option<InvocationRequest<R>>
    where R : serde::de::DeserializeOwned {

    let mut stream: Vec<u8> = Vec::new();

    // Lambda provides this /invocation/next endpoint as an internal
    // mechanism for polling new jobs.
    let res = request::get(build_uri("/invocation/next"), &mut stream);

    if let Ok(response) = res {
        let content = String::from_utf8_lossy(&stream);
        // A malformed payload simply yields no work rather than a panic.
        let payload: R = serde_json::from_str(&content).ok()?;
        let headers = headers_to_map(response.headers());

        if headers.contains_key(REQUEST_ID_HEADER) &&
           headers.contains_key(TRACE_ID_HEADER) &&
           headers.contains_key(FUNCTION_ARN_HEADER)
        {
            // If the request is valid, build up the context object and return it.
            return Some(InvocationRequest {
                request_id: headers.get(REQUEST_ID_HEADER).unwrap().to_string(),
                trace_id: headers.get(TRACE_ID_HEADER).unwrap().to_string(),
                function_arn: headers.get(FUNCTION_ARN_HEADER).unwrap().to_string(),
                payload,
            });
        }
    }

    None
}

/// This method is responsible for reporting back to lambda the final status of our
/// job. It will pass along whatever output that is generated from the main
/// worker function.
async fn send_response<R>(request: &InvocationRequest<R>, result: LambdaStatus) {
    // Success and failure report to different endpoints, but the
    // mechanics are otherwise identical.
    let (uri, body) = match result {
        Ok(code) => (
            build_uri(format!("/invocation/{}/response", request.request_id).as_str()),
            code,
        ),
        Err(err) => (
            build_uri(format!("/invocation/{}/error", request.request_id).as_str()),
            err,
        ),
    };

    request::post(uri.as_str(), body.as_bytes(), &mut Vec::new()).unwrap();
}

/// This is the main method which is invoked.
pub async fn process<T, R>(app: T)
    where T : LambdaHandler<R>, R : serde::de::DeserializeOwned + Clone
{
    println!("fetching initial invocation request");
    let mut current_invocation = runtime::next_invocation().await;

    // While there is still work to do, collect the details and pass it along
    // to the application handler. Take the response and report back to lambda.
    while let Some(context) = current_invocation {
        let outcome = app.handle(context.payload.clone()).await;
        runtime::send_response(&context, outcome).await;
        current_invocation = runtime::next_invocation().await;
    }
}

A Few Words on Compiling

Our program ultimately must be cross-compiled so that it can run on the Amazon Linux containers that lambdas use under the hood. To achieve this, we'll need to first install the correct system target.

rustup target add x86_64-unknown-linux-musl

This references a thing called linux-musl, a lightweight libc implementation that lets us build static binaries compatible with amazonlinux instances. And speaking of musl... you probably need to download it! You can download a fresh version if you don't want to use my old-school cool src.

cd ~/
wget https://git.musl-libc.org/cgit/musl/snapshot/musl-1.2.1.tar.gz
tar -xzvf musl-1.2.1.tar.gz
cd musl-1.2.1
./configure
make
sudo make install

After running that last command, you will need to update your PATH variable. Assuming you have bash... add this to the end of your ~/.profile file.

export PATH="/usr/local/musl/bin:$PATH"

You are now ready to cross-compile!
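Depending on your setup, cargo may also need to be told which linker to use for the musl target. If the build complains about a missing linker, a .cargo/config entry like this (assuming the musl-gcc wrapper from the install above is on your PATH) usually sorts it out:

```toml
# .cargo/config
[target.x86_64-unknown-linux-musl]
linker = "musl-gcc"
```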

The first thing that Rust and AWS think of when they hear "HTTP request" is apparently OpenSSL. It wouldn't be my first choice, but here we are. To make matters worse, the specific RHEL-flavored container that lambdas run on does not play nicely with the newest OpenSSL version. This means we need to build and link our lambda against the older 1.0.1k branch.

cd ~/
wget https://www.openssl.org/source/old/1.0.1/openssl-1.0.1k.tar.gz
tar -xvf openssl-1.0.1k.tar.gz
cd openssl-1.0.1k
./config
make
sudo make install

After running that last command, you will need to update your PATH variable. Assuming you have bash... add this to the end of your ~/.profile file.

export PATH="/usr/local/ssl/bin:$PATH"

Compilation

An example Makefile. The backslashes join everything into one logical shell line; the environment variables must be set on the same line as the cargo command to take effect.

build:
    OPENSSL_DIR=/usr/local/ssl \
    PKG_CONFIG_ALLOW_CROSS=1 \
    cargo build --release --target x86_64-unknown-linux-musl

Final Deliverable

The finished directory that we create will ultimately look like so:

| function
| - bootstrap
| - binary

Before our lambda can be executed it needs an entrypoint to orchestrate everything. Lambda expects this entrypoint to be an executable file named bootstrap.

bootstrap

#!/bin/sh
cd "$LAMBDA_TASK_ROOT"
./binary

Note: binary is simply the final rust executable, and bootstrap must have its executable bit set (chmod +x bootstrap).

You can go ahead and zip all this up now:

    zip -j function.zip ./function/*

And now you've got a rust lambda!