--- title: "Placing an R Lambda Runtime in a Container" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Placing an R Lambda Runtime in a Container} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE ) ``` ## Introduction This article shows how to bundle up your R code into a Docker container and deploy it as an _AWS Lambda_ function using **lambdr**. We provide a self-contained working example that can be used as a test/template for your own project. It is not an exhaustive or authoritative resource on all things _AWS Lambda_ and Docker. Rather, it is a guide intended to get you up and running. ### Pre-requisites You will need an AWS account. Remember, **you are responsible for the account and any billing costs that are incurred.** You need to be comfortable with R and have at least a vague understanding of Dockerfiles, Docker images, and containers. If you are not familiar with Docker, see the article [A Primer on Docker for lambdr](docker-primer.html). For AWS, you'll need a vague understanding of one or more of the AWS Console, AWS CLI, or AWS CDK, depending on which deployment examples you intend to follow later in this article. ## Minimal Deployment Ready Example In this section you will see everything that goes into making an absolutely minimal deployment-ready R project with **lambdr**. Ideally, _Lambda_ functions should be pretty simple and do their business briskly. In reality, they may take several minutes to run and interact with various other AWS resources such as databases and S3 buckets. The example given here is basic enough to understand, and realistic enough to demonstrate some good project structure habits when making a _Lambda_. It purposefully **doesn't** require access to other AWS resources, like S3 buckets or databases: Managing and granting permissions are far out of the scope of **lambdr**. ### Example project structure The example project is called **flags**. When given a full or partial country name it queries the [REST Countries](https://restcountries.com/) API and returns information about the country's flag. ``` . ├── Dockerfile └── R ├── functions.R └── runtime.R ``` The minimal structure for a project only needs to contain two elements 1) A directory `R/` containing the R scripts 2) Dockerfile The Dockerfile packages up the R scripts in a Linux distribution with R, any required R packages, and system dependencies for both R and the R packages. One of the R scripts should always be called `runtime.R`. ### Project file contents First let's look at the R code. #### functions.R The functions in this file get sourced and used in `runtime.R`. Don't worry if you aren't familiar with calling RESTful APIs. In summary, `create_request()` makes a request object to ask https://restcountries.com about a country's flag. Then `perform_request()` sends the request to the website, checking the response with the helper function `unsuccessful()`. ```r library(httr2) create_request <- function(country) { stopifnot("'country' must be a string containing a full or partial country name, e.g. 
'ghana'" = nzchar(country)) country <- utils::URLencode(country) base_url <- "https://restcountries.com/v3.1/name/" httr2::request(base_url) |> httr2::req_user_agent( "lambdr example (https://github.com/mdneuzerling/lambdr/)" ) |> httr2::req_url_path_append(country) |> httr2::req_url_query(fields = "flags") } unsuccessful <- function(resp) { body <- httr2::resp_body_json(resp) msg <- sprintf("\nHTTP %s: %s.\n", body$status, body$message) stop( msg, "Check supplied country name is valid and/or the server status." ) } perform_request <- function(req) { resp <- req |> httr2::req_error(is_error = \(x) FALSE) |> httr2::req_perform() if (resp[["status_code"]] != 200L) { unsuccessful(resp) } return(httr2::resp_body_json(resp)) } ``` #### runtime.R This file orchestrates the rest of the R code and starts up the **lambdr** runtime interface client. `runtime.R` is short and simple. There are three main sections 1) Set up: loading packages, sourcing functions, and indicating a logging threshold 2) The `handler()` function definition 3) Start **lambdr** Your _Lambda_ function will likely^[Though typically _Lambdas_ will take an input, you can make them to simply be invoked with no arguments, e.g. a _Lambda_ that scrapes the same website every day and runs on a schedule.] take a JSON payload with some inputs. **lambdr** will convert that JSON into an R list and pass the items to `handler()`. For example, a payload for this `flags` _Lambda_ could be `{"country": "ghana"}`. **lambdr** would convert it to `list(country = "ghana")`, then pass it to `handler()` for us. Note that if `lambdr::start_lambda()` is called interactively it will throw an error. This is intentional. **lambdr** relies on the presence of environment variables that are only available when in the deployed _Lambda_ execution environment. However, you may want to test your code in interactive sessions during development, so we simply wrap the function in `if (!interactive())`. ```r library(lambdr) library(logger) source(file.path("R", "functions.R")) logger::log_threshold(logger::DEBUG) handler <- function(country) { logger::log_info("Event received: ", country) req <- create_request(country) resp <- perform_request(req) return(resp) } if (!interactive()) { lambdr::start_lambda() } ``` #### Dockerfile The Dockerfile builds on the one in [A Primer on Docker for lambdr](docker-primer.html). That article explains each item in the Dockerfile step-by-step. Here, we emphasise what makes the Dockerfile below production-ready. Essentially, it's all about version control. Once the project is ready for deployment you should pin the base image to a specific version. To do this you provide the image's **digest**, which is a hash value. Here we use `@sha256:429...` which is for the amd64 version of `rocker/r-ver:4.4` at time of writing. Using the tag is not sufficient because the image it refers to can be subject to change - but the image specified by the digest will not. To find a digest for an image you've already pulled you can use `docker images --digests`. Or you can get it from the repository where you found the image. The other version control aspect here is using the [Posit Public Package Manager](https://packagemanager.posit.co/client/#/repos/cran/setup) to get amd64 Ubuntu binaries from a snapshot of CRAN. This is simple, but not necessary. For example, you could use [renv](https://rstudio.github.io/renv/index.html) instead. But the point is, version the R packages - somehow! 
#### Dockerfile

The Dockerfile builds on the one in [A Primer on Docker for lambdr](docker-primer.html). That article explains each item in the Dockerfile step-by-step. Here, we emphasise what makes the Dockerfile below production-ready. Essentially, it's all about version control.

Once the project is ready for deployment you should pin the base image to a specific version. To do this you provide the image's **digest**, which is a hash value. Here we use `@sha256:429...`, which is for the amd64 version of `rocker/r-ver:4.4` at the time of writing. Using the tag alone is not sufficient because the image it refers to can change over time, whereas the image specified by the digest will not. To find a digest for an image you've already pulled you can use `docker images --digests` (see the short example after the Dockerfile below). Or you can get it from the repository where you found the image.

The other version control aspect here is using the [Posit Public Package Manager](https://packagemanager.posit.co/client/#/repos/cran/setup) to get amd64 Ubuntu binaries from a snapshot of CRAN. This is simple, but not necessary. For example, you could use [renv](https://rstudio.github.io/renv/index.html) instead. But the point is, version the R packages - somehow!

```{.dockerfile}
FROM docker.io/rocker/r-ver:4.4@sha256:429c1a585ab3cd6b120fe870fc9ce4dc83f21793bf04d7fa2657346fffde28d1

# options(warn = 2) will make the build error out if a package doesn't install
RUN Rscript -e "options(warn = 2); install.packages('pak')"

# Using {pak} to install R packages: it resolves Ubuntu system dependencies AND
# the R dependency tree
RUN Rscript -e "options( \
    warn = 2, \
    repos = c(CRAN = 'https://p3m.dev/cran/__linux__/jammy/2024-07-06') \
  ); \
  pak::pak( \
    c( \
      'httr2', \
      'lambdr' \
    ) \
  )"

# Lambda setup
RUN mkdir /R
COPY R/ /R
RUN chmod 755 -R /R

ENTRYPOINT Rscript R/runtime.R
CMD ["handler"]
```
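For instance, to double-check the digest pinned in the `FROM` instruction above, you could run something like the following. The digest you see may differ from the one above, since the tag moves over time:

```{.bash}
docker pull rocker/r-ver:4.4
docker images --digests rocker/r-ver
```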
#### Completed project

That's it! These three files are all you need

```
.
├── Dockerfile
└── R
    ├── functions.R
    └── runtime.R
```

The following sections expand on considerations such as local testing and how to deploy the project into AWS.

## Choosing base images

For R you can use any base image you want. So long as you have all the system dependencies required for R and the R packages, you only need to have **lambdr** installed and used as in the minimal example given above.

### Do use

We recommend [Rocker's r-ver](https://rocker-project.org/images/versioned/r-ver) images. They offer a tested, versioned R stack. Newer versions can be used with the [Posit Public Package Manager](https://packagemanager.posit.co/client/#/repos/cran/setup) to install Ubuntu binaries of packages as at a certain date on CRAN. They also allow use of [pak](https://pak.r-lib.org/), which makes life very convenient in terms of finding and installing R package system dependencies.

### Do not use

The [AWS Lambda 'provided'](https://gallery.ecr.aws/lambda/provided) images. They are minimal OS images. That means the base image is fairly small, which is positive. However, you have to install R and any dependencies yourself. More importantly, at the time of writing the OS in the newer images is Amazon Linux 2023, which is not based on a singular Linux distro, but rather a blend of multiple. That makes installing R package binaries and system dependencies more challenging, and slow. The previous version of the OS (Amazon Linux 2) is drastically different to Amazon Linux 2023 and will no longer receive support from AWS by the end of summer 2025.

## Dev vs deployment Dockerfiles

Up until this point we have only shown and discussed a Dockerfile for deployment. While you are doing development work, it is a good idea to have a dev Dockerfile. The dev Dockerfile should mimic the deployment one as much as possible, but will also have any extra features you need to make your life easier as a developer.

### Dockerfile.dev

The example given below can be added to the `flags` project

```
.
├── Dockerfile
├── Dockerfile.dev
└── R
    ├── functions.R
    └── runtime.R
```

```{.dockerfile}
FROM ghcr.io/rocker-org/devcontainer/r-ver:4.4@sha256:e99cfe63efd5d79f44146d8be8206019fd7a7230116aa6488097ee660d6aa5dc

# Install the Lambda Runtime Interface Emulator, which can be used for locally
# invoking the function.
# See https://github.com/aws/aws-lambda-runtime-interface-emulator for details
RUN apt-get update && apt-get -y install --no-install-recommends curl
RUN curl -Lo /usr/local/bin/aws-lambda-rie https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie && \
  chmod +x /usr/local/bin/aws-lambda-rie

# options(warn = 2) will make the build error out if a package doesn't install
RUN Rscript -e "options(warn = 2); install.packages('pak')"

# Using {pak} to install R packages: it resolves Ubuntu system dependencies AND
# the R dependency tree
RUN Rscript -e "options( \
    warn = 2, \
    repos = c(CRAN = 'https://p3m.dev/cran/__linux__/jammy/2024-07-06') \
  ); \
  pak::pak( \
    c( \
      'httr2', \
      'lambdr' \
    ) \
  )"

# Lambda setup
RUN mkdir /R

# Needs to be set to use aws-lambda-rie. It is the path up to runtime.R
ENV LAMBDA_TASK_ROOT="/R"

# Optional for local testing
# 900s, i.e. 15 min, the max time a Lambda can run for
ENV AWS_LAMBDA_FUNCTION_TIMEOUT=900
```

The differences compared to the deployment Dockerfile are:

1) Uses a Dev Container base image
    - Adds some tooling
2) Installs the Lambda Runtime Interface Emulator
    - The RIE opens up the possibility of locally testing the _Lambda_ function
3) Doesn't copy R files into the image
    - Instead, a live link is created between the host and container file systems at run time

#### The Dev Container base image

```{.dockerfile}
FROM ghcr.io/rocker-org/devcontainer/r-ver:4.4@sha256:e99cfe63efd5d79f44146d8be8206019fd7a7230116aa6488097ee660d6aa5dc
```

This time we are using a [devcontainer](https://containers.dev/) base image, which "allows you to use a container as a full-featured development environment". This one is made by Rocker and is the same as the r-ver:4.4 image but with some extra tooling. The most valuable of these is having [radian](https://github.com/randy3k/radian) and its dependencies installed, plus some of the other usual setup required to use VS Code as an IDE for R.

Using the actual devcontainer VS Code extension is an exercise left for the reader, and with a word of warning: one of the authors of this vignette has seen some M2 MacBook Pros with 16GB of RAM struggle to run the devcontainer extension.

If you prefer RStudio as your IDE, you could instead use a [Rocker RStudio Server](https://rocker-project.org/images/versioned/rstudio.html) image.

## Build and run dev container

While you're developing it's a good idea to work out of the dev container. One option for doing so is to have a "build script" that builds the image and runs a container.

```
.
├── build.sh
├── Dockerfile
├── Dockerfile.dev
└── R
    ├── functions.R
    └── runtime.R
```

```{.bash}
#!/bin/sh

docker stop flags && docker container rm flags

docker build -t flags:latest \
  -f Dockerfile.dev .

docker run \
  -p 9000:8080 \
  -it \
  --rm \
  -v ~/.aws/:/root/.aws \
  -v ./R:/R \
  --name flags \
  flags:latest \
  bash
```

You might run this script by doing e.g. `bash build.sh`.

If a container already exists it will be stopped and removed. Note that if there **isn't** an existing container you'll see an error `Error response from daemon: No such container: flags` -- this is expected. The image will only build if `Dockerfile.dev` has been altered or is being built for the first time.

The options being given to `docker run` are:

- `-p` Publishes the container's port 8080 on the host's port 9000. For local testing using the Lambda RIE (see below)
- `-it` Start the container with an interactive terminal
- `--rm` Remove the container when it is exited
- `-v` Mount. Creates a live link between the host and container file systems
  - `~/.aws` makes AWS credentials available in the container. Can be useful if you need to use {[paws](https://www.paws-r-sdk.com/)}. If you don't need AWS creds in the container then you shouldn't mount this volume
  - `./R` contains the lambda code
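One extra convenience: because the container was started with `--rm`, it disappears when you exit its shell, but while it is running you can open additional shells in it from another host terminal. This is plain Docker rather than anything lambdr-specific (the container name `flags` comes from the build script above):

```{.bash}
docker exec -it flags bash
```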
## Local testing with AWS RIE

In the dev Dockerfile we installed the AWS Runtime Interface Emulator (RIE). The emulator gets as close to the environment of a _Lambda_ as is possible without actually pushing up to AWS and invoking. Two small shell scripts in `local-testing` below, plus the build script from the previous section (required for port forwarding), are all we need to add:

```
.
├── local-testing
│   ├── event.sh
│   └── start-rie.sh
├── build.sh
├── Dockerfile
├── Dockerfile.dev
└── R
    ├── functions.R
    └── runtime.R
```

`start-rie.sh` starts the emulator with our handler:

```{.bash}
#!/bin/bash
exec /usr/local/bin/aws-lambda-rie Rscript /R/runtime.R handler
```

Run `bash local-testing/start-rie.sh` from **inside the running container**. This will start the emulator, which runs an HTTP endpoint, similar to what happens in the deployed _Lambda_. The endpoint will be waiting for a payload to pass to the R code, and this makes the container's terminal busy. If you need to stop the process just press `Ctrl` + `C`.

Sending a payload to the emulator is the equivalent of invoking the _Lambda_. This is done with `event.sh`:

```{.bash}
#!/bin/bash
port="9000"
endpoint="http://localhost:$port/2015-03-31/functions/function/invocations"
curl -XPOST $endpoint -d '{"country":"'"$1"'"}'
```

Simply run `bash local-testing/event.sh someCountry`, replacing `someCountry` with a partial or full country name. The result will appear in your terminal. If you go and look at the container terminal, all the logs from the execution (equivalent to what would appear in _CloudWatch_) will be present. The RIE will run until you stop it with `Ctrl` + `C`.

If your own _Lambda_ doesn't actually take any parameters then your payload should be an empty JSON object:

```{.bash}
#!/bin/bash
port="9000"
endpoint="http://localhost:$port/2015-03-31/functions/function/invocations"
curl -XPOST $endpoint -d '{}'
```

And finally, one more option if using the devcontainer VS Code extension. Because you can spawn multiple terminals from within the container, you can start the RIE **and** send it a payload from another terminal inside the container. In this case the port will be `8080`, which is easy to add logic for in `event.sh` because of the devcontainer envvar `$DEVCONTAINER`:

```{.bash}
if [ "$DEVCONTAINER" = "TRUE" ]
then
  port="8080"
else
  port="9000"
fi
```
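As an aside, if you'd rather send a test event from R than from curl, the same POST can be made with httr2. This is a sketch with a hypothetical helper `invoke_local()`, assuming the RIE is listening on port 9000 (use 8080 from inside a devcontainer, as above):

```r
library(httr2)

# Hypothetical helper: POST a payload to the local RIE endpoint
invoke_local <- function(country, port = 9000) {
  endpoint <- sprintf(
    "http://localhost:%s/2015-03-31/functions/function/invocations",
    port
  )
  httr2::request(endpoint) |>
    httr2::req_body_raw(sprintf('{"country":"%s"}', country)) |>
    httr2::req_perform() |>
    httr2::resp_body_json()
}

invoke_local("ghana")
```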
## Deployment

At this point, you will have **at minimum** a project with a Dockerfile, some R code, and a `runtime.R` that starts **lambdr**:

```
.
├── Dockerfile
└── R
    ├── functions.R
    └── runtime.R
```

Deployment is the act of turning these files into an _AWS Lambda_ function that can be invoked. Here we provide some rough instructions for two common ways of deploying: via the AWS Console (the website), and via the AWS Cloud Development Kit (CDK).

**For both options you will need to have the AWS CLI installed ([instructions](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)) as a prerequisite**. The CLI is how you interact with AWS from the terminal.

### AWS Console

The following instructions use the example project given earlier, `flags`. The steps are:

- Build the image
- Create a repository in the _AWS Elastic Container Registry_ (ECR)
- Push the image to _ECR_
- Make the _Lambda_ function from the image in _ECR_

First, in a terminal `cd` to the project and build the image:

```{.bash}
docker build -t flags:latest .
```

Then, create a repository in ECR either by using the CLI or the Console. If you do it in the Console you pretty much just go to the _ECR_ service, click `Create`, and call it `flags`. Or you can use the CLI:

```{.bash}
aws ecr create-repository --repository-name flags --image-scanning-configuration scanOnPush=true
```

Make a note of the URI, which is the resource identifier of the created repository.

The image can now be pushed to the repository. This part has to be done via the CLI. You can get all the commands ready-made for you via the Console by clicking on the repo name in _ECR_, then `View push commands`. Or, you can replace the account ID `123456789123` and the region `region-name-1` in the commands below to do the same thing. Note: you don't need a Docker account for the `docker login` command.

```{.bash}
docker tag flags:latest 123456789123.dkr.ecr.region-name-1.amazonaws.com/flags

aws ecr get-login-password | docker login --username AWS --password-stdin 123456789123.dkr.ecr.region-name-1.amazonaws.com/flags

docker push 123456789123.dkr.ecr.region-name-1.amazonaws.com/flags:latest
```

Now that the image is in _ECR_ it can be used to make the _Lambda_ function. You could make it from the command line, but this requires an IAM Role to be configured and ready for the function. That is beyond the scope of **lambdr**. If you have a Role and prefer to use the CLI, see the examples in `aws lambda create-function help`.

In the Console go to the _Lambda_ service. Click `Create a function`. Choose the container image option, give it a name (`flags` is fine) and choose the image from the _ECR_ repository. Everything else can be left as the default. It will take a minute for the function to be created.

Once it is, you can test it by scrolling down and clicking on the `Test` tab. Edit the `Event JSON` to be `{"country": "namibia"}`, then click the orange `Test` button. It should be successful. To see the response click `Details`, beneath the big green tick. However, there's a good chance of a timeout, because the default timeout for a _Lambda_ is only 3 seconds. To increase it, click on the `Configuration` tab, then `Edit`, and bump it up. 10 seconds is fine in this case.

Alternatively, the _Lambda_ can be invoked from the CLI:

```{.bash}
aws lambda invoke --function-name flags \
  --invocation-type RequestResponse --payload '{"country": "namibia"}' \
  /tmp/response.json --cli-binary-format raw-in-base64-out
```

See the response with:

```{.bash}
cat /tmp/response.json
```

```{.json}
[{"flags":{"png":"https://flagcdn.com/w320/na.png","svg":"https://flagcdn.com/na.svg","alt":"The flag of Namibia features a white-edged red diagonal band that extends from the lower hoist-side corner to the upper fly-side corner of the field. Above and beneath this band are a blue and green triangle respectively. A gold sun with twelve triangular rays is situated on the hoist side of the upper triangle."}}]
```
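If you'd rather invoke the function from R than from the shell, the {paws} SDK mentioned earlier can do it too. This is a sketch, assuming paws and jsonlite are installed and your AWS credentials and default region are configured:

```r
library(paws)

svc <- paws::lambda()

resp <- svc$invoke(
  FunctionName = "flags",
  InvocationType = "RequestResponse",
  Payload = '{"country": "namibia"}'
)

# The response payload comes back as raw bytes
jsonlite::fromJSON(rawToChar(resp$Payload), simplifyVector = FALSE)
```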
### Cloud Development Kit (CDK)

The CDK allows you to programmatically create application stacks and all of the associated resources they need, like Lambdas, Step Functions, and so on. It's an alternative to clicking around in the AWS Console and means that your stack can (theoretically) be rebuilt at any time with just a few commands.

**First, [install the CDK CLI](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html).**

Once the CDK CLI is installed: make a directory somewhere called `LambdrExample`, then `cd` into the directory and run the following in the terminal:

```{.bash}
cdk init app --language typescript
```

You will now have a bunch of files and folders containing boilerplate code and libraries. We're only interested in the `bin` and `lib` directories. You also need to add a new directory called `lambda`, and to that, the `flags` project, like below:

```
.
├── bin
│   └── lambdr_example.ts
├── lambda
│   └── flags
│       ├── Dockerfile
│       └── R
│           ├── functions.R
│           └── runtime.R
└── lib
    └── lambdr_example-stack.ts
```

Replace the contents of `bin/lambdr_example.ts` with this:

```{.typescript}
#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "aws-cdk-lib";
import { LambdrExampleStack } from "../lib/lambdr_example-stack";

const app = new cdk.App();
new LambdrExampleStack(app, "LambdrExampleStack", {
  // Your account number and region go in the env below
  env: { account: "111122223333", region: "my-region-1" },
});
```

Replace the contents of `lib/lambdr_example-stack.ts` with this:

```{.typescript}
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";

export class LambdrExampleStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    this.createFlagsLambda();
  }

  createFlagsLambda(): lambda.IFunction {
    const flagsLambda = new lambda.DockerImageFunction(this, "flags", {
      functionName: "flags",
      code: lambda.DockerImageCode.fromImageAsset("lambda/flags"),
      timeout: cdk.Duration.seconds(15),
    });
    return flagsLambda;
  }
}
```

You then need to run

```{.bash}
cdk bootstrap
```

Followed by

```{.bash}
cdk deploy
```

Some building will happen and then you will be prompted as to whether you wish to deploy the changes. If you feel comfortable, enter `y` and hit return.

_Note: If you get a failure because no bucket and/or ECR repository exists, this is probably because you have previously bootstrapped your account and now have "stack drift". To resolve this, delete the `CDKToolkit` stack from CloudFormation and re-bootstrap. For more information, see [this Stack Overflow answer](https://stackoverflow.com/questions/71280758/aws-cdk-bootstrap-itself-broken/71283964#71283964)._

If all has gone well you should see a green tick and sparkles with the deployment time. Test by invoking from the CLI:

```{.bash}
aws lambda invoke --function-name flags \
  --invocation-type RequestResponse --payload '{"country": "namibia"}' \
  /tmp/response.json --cli-binary-format raw-in-base64-out
```

See the response with:

```{.bash}
cat /tmp/response.json
```

```{.json}
[{"flags":{"png":"https://flagcdn.com/w320/na.png","svg":"https://flagcdn.com/na.svg","alt":"The flag of Namibia features a white-edged red diagonal band that extends from the lower hoist-side corner to the upper fly-side corner of the field. Above and beneath this band are a blue and green triangle respectively. A gold sun with twelve triangular rays is situated on the hoist side of the upper triangle."}}]
```

If you make changes to the Dockerfile or R code of the Lambda, simply re-deploy with another `cdk deploy`.
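Before re-deploying, it can be worth previewing what will change. The standard CDK CLI has a command for this (nothing lambdr-specific):

```{.bash}
cdk diff
```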
Delete the CDK stack using CloudFormation via the AWS Console, or via the terminal with `cdk destroy`. You should also delete the ECR repository, otherwise it will sit in your account and you will be charged for storage (a very small amount, but still).

### Tidying up resources

To clean up the resources made while following this guide, use the AWS Console to find and delete items in:

- CloudFormation (the stack)
- ECR
- Lambda
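If you'd rather tidy up from the terminal, the equivalent CLI calls are roughly as follows. These assume the resource names used in this article; double-check before deleting anything in a shared account:

```{.bash}
aws lambda delete-function --function-name flags
aws ecr delete-repository --repository-name flags --force
aws cloudformation delete-stack --stack-name LambdrExampleStack
```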