Part 4 - ECS container infrastructure

aws
infrastructure-as-code
awscdk
Author

Erik Lundevall-Zara

Published

December 6, 2021

Modified

February 13, 2024

It is now time to create an infrastructure in AWS to run container-based solutions!

We will use the Amazon Web Services (AWS) service Elastic Container Service (ECS) to run an application packaged as a container. We will also put a load balancer in front of it, so that we can run multiple copies of that containerised application and distribute the load among those copies.

This is part 4 in a series of articles about building cloud infrastructure solutions with AWS Cloud Development Kit (AWS CDK). If you have not already read them, I recommend checking out part 1, 2, and 3 as well, which are all on the Tidy Cloud AWS website.

Update:
This article series uses Typescript as an example language. However, there are repositories with example code for multiple languages.
* https://github.com/cloudgnosis/iac-ninja-aws-cdk-ts * https://github.com/cloudgnosis/iac-ninja-aws-cdk-python * https://github.com/cloudgnosis/iac-ninja-aws-cdk-go
The repositories will contain all the code examples from the articles series, implemented in different languages with the AWS CDK. You can view the code from specific parts and stages in the series, by checking out the code tagged with a specific part and step in that part, see the README file in each repository for more details.

What are containerised applications?

Containers in computer infrastructure and software terms are a way to package applications or software solutions with its dependencies in a way that is portable and (relatively) easy to manage. It has become a popular way to deploy and manage various types of applications.

We will not dive into details about creating, building, and using containers. There are many other resources that cover that extensively. Here are a few links to check out for information about containers (and Docker, which popularised the concept):

  • https://techgenix.com/getting-started-with-containers/
  • https://www.docker.com/get-started
  • https://www.youtube.com/watch?v=fqMOX6JJhGo
  • https://www.youtube.com/watch?v=pTFZFxd4hOI

Containers in AWS

There are actually many ways to run containers in AWS. Some may argue that there are way too many, which adds to confusion about what to choose.

The Elastic Container Service (ECS) is one of the core services in AWS to run containers, and it is not too hard to get started with, so that is what we will pick here.

It also helps that AWS CDK has substantial support for ECS.

We will also use the flavour of ECS, that is called ECS Fargate. When we use Fargate, we will not need to concern ourselves with the underlying servers that the containers run on. These details are something that Fargate handles for us.

Note: ECS is not a free tier service in AWS. If you spin up containers in ECS, you will pay for the privilege. It will not cost much (probably < $1) unless you forget to shut it down afterwards you are done with this article.

A few pieces of terminology that may be of use here:

  • A cluster. This is a collection of computational resources (also known as servers) that the containers run on, and the resources needed to manage these computational resources. With ECS Fargate, we do not need to care about the servers in the cluster, but we will still need a cluster to let AWS manage it for us.
  • A container image. A reference to a specific solution/application packaging that we will use. This is where we have packaged the software components we want to run.
  • An ECS Task. A collection of one or more container images with associated configuration we want to run as a unit in ECS. The specification of what those images are and their configuration is a TaskDefinition.
  • An ECS Service. When we want to run an ECS task continuously in one or more copies and make sure it stays up, then we configure an ECS Service. ECS will do the heavy lifting for us to make sure it keeps running. If the task(s) itself dies or fails, ECS will auto-start up new ones.

Unfortunately, the terminology that different platforms to run containers use can differ, so this can become a bit confusing if you look at Kubernetes, or Docker Swarm. They are like ECS, but use partially different terminology.

Goals

Before we begin, let us define what we should accomplish. We will keep it relatively simple, but still not too simple. After that, we will refine our solution to make it better in various ways. Along the way, we should pick up some good practices and feature of the AWS CDK!

  • Expose an endpoint for a web server for HTTP traffic from internet.
  • Web server shall run in a container.
  • The container itself shall not be directly reachable from internet.
  • We should be able to have a service set up so that containers will automatically be started if needed.
  • We should be able to build our custom solution for this web server.
  • We should be able to get container images from DockerHub.
  • We do not care about managing the underlying server infrastructure that runs the containers. I.e., we will use Fargate.

We will not fix all these points in one go, but iterate on these points. By listing a few points for our goals, we have something to focus on. Once we are done with these, we can refine even further.

The application solution is initially going to be an Apache web server. This is a straightforward solution for us to test that we have something working, as there are pre-built container images to use.

Initialize our project

Similar to what we did in part 2 of this series of articles, we create a new project and use the AWS CDK command-line tool to initialise it.

mkdir my-container-infrastructure
cd my-container-infrastructure 
cdk init --language=typescript

Contrary to the previous project we did, we will not keep everything in a single source code file. We will though remove all sample content that cdk init generated and write our own code from scratch.

So let us start from the top, in our main program file, bin/my-container-infrastructure.ts. Our starting point will be quite similar to our previous project, with an App and a Stack. We will also fetch the default VPC.


import { App, Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';

const app = new App();
const stack = new Stack(app, 'my-container-infrastructure', {
    env: {
        account: process.env.CDK_DEFAULT_ACCOUNT,
        region: process.env.CDK_DEFAULT_REGION,
    },
});

const vpc = Vpc.fromLookup(stack, 'vpc', {
    isDefault: true,
});

If you run this code via cdk synth, get an empty CloudFormation template, which is fine, since we have added no actual resources yet. You will get an error if you have specified no AWS credentials, because it does not know which account and region to look up the VPC from. So make sure you use valid and relevant credentials.

Now we will start with the actual container infrastructure!

Setting up an ECS cluster

The first piece of infrastructure we need to set up to run is an ECS cluster. We could just add a few lines of code in our main files to create the cluster, but we will prepare a bit for future more complex set-up. So instead we will package this in its own module.

In the lib directory, let us create a subdirectory with name containers and in that directory create a file with name container-management.ts. This means the path to the file from the root of the project is lib/containers/container-management.ts.

In this file, we make a function to add a cluster, since we want to add this to our stack to deploy it later. In AWS CDK there is an aws-cdk-lib/aws-ecs sub-module for ECS, in which there is a construct with name Cluster. This is the one we are going to use when we create the cluster. In the documentation for Cluster, we can see that all properties are optional. However, if we want to use our existing VPC, we need to provide that. Otherwise, AWS CDK will create a new VPC for us, which may not be what we want. So with that in mind, our first naïve function signature may look like this:


const addCluster = function(scope: Stack, vpc: IVpc): Cluster {
}

The function will add the cluster we create to the stack we provide and then return a reference to the cluster itself that was created. We pass in the VPC as a parameter, where we use the type IVpc. This is returned by Vpc.fromLookup(). Any type that begins with an “I” is an interface type, which provides a subset of the functionality of a full-blown construct - which is typically the case when the resource is a reference to something that already exists and you not have created yourself.

The implementation is short, like this:


const addCluster = function(scope: Stack, vpc: IVpc): Cluster {
    return new Cluster(scope, 'cluster', {
        vpc,
    });
}

We create a Cluster object, passing in the VPC as a property. The Cluster object we create also gets a name (not the same as a cluster name!), which we have set to cluster.

However, in case we want to reuse the function, this might be troublesome with a hard-coded name for our Cluster object. In AWS CDK, every resource or construct gets an identifying name. It does not have to be globally unique, but at the same level in the resource hierarchy, it has to be unique. So we will need to change that, if we want to be sure to reuse the function without a problem. Thus, we can add a parameter to give the Cluster construct a name as well.


const addCluster = function(scope: Stack, id: string, vpc: IVpc): Cluster {
    return new Cluster(scope, id, {
        vpc,
    });
}

If we look closer to the parameters that Cluster expects in the AWS CDK documentation when we create an object, we can see in its signature that the first parameter is a Construct, not a Stack. A Stack is a kind of Construct, so it is perfectly fine to pass in a Stack. The reason for the first parameter to be a Construct is that we can compose multiple resources to build our own constructs, which is exactly how the AWS CDK itself has built the higher-level building blocks from the CloudFormation-based primitives.

So allowing ourselves to reuse this not only directly in a Stack, but in constructs we create ourself, let us change the type of the first parameter to Construct instead:


const addCluster = function(scope: Construct, id: string, vpc: IVpc): Cluster {
    return new Cluster(scope, id, {
        vpc,
    });
}

Except for the type change, the code is identical. We also need to export the code from our module, so we can use it in our main program. The complete code, including export and import statements, looks like this:


import { IVpc } from 'aws-cdk-lib/aws-ec2';
import { Cluster } from 'aws-cdk-lib/aws-ecs';
import { Construct } from 'constructs';

export const addCluster = function(scope: Construct, id: string, vpc: IVpc): Cluster {
    return new Cluster(scope, id, {
        vpc,
    });
}

In our main program in bin/my-container-infrastructure.ts, we import the function and add a cluster to the stack we created previously. The code would look like this:


import { App, Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { addCluster } from '../lib/containers/container-management';

const app = new App();
const stack = new Stack(app, 'my-container-infrastructure', {
    env: {
        account: process.env.CDK_DEFAULT_ACCOUNT,
        region: process.env.CDK_DEFAULT_REGION,
    },
});

const vpc = Vpc.fromLookup(stack, 'vpc', {
    isDefault: true,
});

const id = 'my-test-cluster';
addCluster(stack, id, vpc);

You can run cdk synth (or cdk synth ---quiet) to check that AWS CDK thinks your code is ok. You can also run a cdk deploy to deploy the ECS cluster. AWS does not charge you for empty ECS clusters, it is only when you actually add containers to run in it you get charged. If you deploy it and look in the view of ECS in AWS Console, it should look like this:

ECS Cluster

Great work! Now it is time to add an actual task to run into the cluster.

Create an initial task definition

As we mentioned earlier in this article, we set up something to run by using the Apache httpd web server, for which there are container images already which we can download. To create a Task, we need a TaskDefinition. It needs to contain at least one container image definition. Let us assume we need to configure things at the task level and at the container level.

Let us just add some placeholders for task and container configuration, and for now only handle the case with a single container in a task. So a first stab at a function for this could be:


export interface TaskConfig {

}

export interface ContainerConfig {

}

export const addTaskDefinitionWithContainer = 
function(scope: Construct, id: string, taskConfig: TaskConfig, containerConfig: ContainerConfig): TaskDefinition {

};

In our goals, we stated we do not care about managing the underlying server infrastructure that the containers run in; we let AWS handle that for us. This means that we will use a FargateTaskDefinition to define our task. Such a definition requires at least to specify the amount of CPU to allocate and the memory limit for the container. So we should add that to our TaskConfig. Another nice-to-have parameter is the family, which is a name to group multiple versions of a task definition. Our updated TaskConfig then looks like this:


export interface TaskConfig {
    readonly cpu: 256 | 512 | 1024 | 2048 | 4096;
    readonly memoryLimitMB: number;
    readonly family: string;
}

There are only a few fixed values that we can set for CPU, where 1024 represents 1 vCPU (virtual CPU). So we define the cpu property to only include these values. There are also some restrictions on the memory based on the CPU setting, but it is a bit too complex for a simple type definition. So we just expect a number for the memory limit.

Now we can add a task definition in our function:


export const addTaskDefinitionWithContainer = 
function(scope: Construct, id: string, taskConfig: TaskConfig, containerConfig: ContainerConfig): TaskDefinition {
    const taskdef = new FargateTaskDefinition(scope, id, {
        cpu: taskConfig.cpu,
        memoryLimitMiB: taskConfig.memoryLimitMB,
        family: taskConfig.family,
    });

    return taskdef;
};

Since a task definition can have one or more containers, adding container(s) is a separate operation. Thus we will add a method call for this on the task definition, addContainer(). The minimum we have to add is a reference to a container image. If we for now assume we will just use a container image from DockerHub, then we can define ContainerConfig as this for now:


export interface ContainerConfig {
    readonly dockerHubImage: string;
}

Using this input to our addTaskDefinitionWithContainer() function, we can now add a minimal container configuration. The ContainerImage.fromRegistry() function retrieves image information from DockerHub or other public image registries. Will this be enough for us to get it running? Let us find out!


export const addTaskDefinitionWithContainer = 
function(scope: Construct, id: string, taskConfig: TaskConfig, containerConfig: ContainerConfig): TaskDefinition {
const taskdef = new FargateTaskDefinition(scope, id, {
        cpu: taskConfig.cpu,
        memoryLimitMiB: taskConfig.memoryLimitMB,
        family: taskConfig.family,
    });

    const image = ContainerImage.fromRegistry(containerConfig.dockerHubImage);
    taskdef.addContainer(`container-${containerConfig.dockerHubImage}`, { image, });

    return taskdef;
};

We need to add a call to our new function in our main program, so that we can add the task definition to our stack.

const taskConfig: TaskConfig = { cpu: 512, memoryLimitMB: 1024, family: 'webserver' };
const containerConfig: ContainerConfig = { dockerHubImage: 'httpd' };

addTaskDefinitionWithContainer(stack, `taskdef-${taskConfig.family}`, taskConfig, containerConfig);

We add this to the end of our main program. Now we add an ECS cluster to our stack, as well as an ECS Task Definition - at least from the point of view of our AWS CDK code. What about IAM permissions, security groups? AWS CDK will create a few things under the hood for us. We can get an idea of what will actually be created by running the command cdk diff with our updated stack. But first, this is how our main program file looks like now:

import { App, Stack } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { addCluster, addTaskDefinitionWithContainer, ContainerConfig, TaskConfig } from '../lib/containers/container-management';

const app = new App();
const stack = new Stack(app, 'my-container-infrastructure', {
    env: {
        account: process.env.CDK_DEFAULT_ACCOUNT,
        region: process.env.CDK_DEFAULT_REGION,
    },
});

const vpc = Vpc.fromLookup(stack, 'vpc', {
    isDefault: true,
});

const id = 'my-test-cluster';
addCluster(stack, id, vpc);

const taskConfig: TaskConfig = { cpu: 512, memoryLimitMB: 1024, family: 'webserver' };
const containerConfig: ContainerConfig = { dockerHubImage: 'httpd' };

addTaskDefinitionWithContainer(stack, `taskdef-${taskConfig.family}`, taskConfig, containerConfig);

With this program, if we run cdk diff, and we already deployed the empty ECS cluster before, the changes may look like this:


 cdk diff
Stack my-container-infrastructure
IAM Statement Changes

┌───┬───────────────────────────────────┬────────┬────────────────┬─────────────────────────────────┬───────────┐
   │ Resource                          │ Effect │ Action         │ Principal                       │ Condition │
├───┼───────────────────────────────────┼────────┼────────────────┼─────────────────────────────────┼───────────┤
 + │ ${taskdef-webserver/TaskRole.Arn} │ Allow  │ sts:AssumeRole │ Service:ecs-tasks.amazonaws.com │           │
└───┴───────────────────────────────────┴────────┴────────────────┴─────────────────────────────────┴───────────┘
(NOTE: There may be security-related changes not in this list. See https://github.com/aws/aws-cdk/issues/1299)

Resources

[+] AWS::IAM::Role taskdef-webserver/TaskRole taskdefwebserverTaskRole18D47E42 
[+] AWS::ECS::TaskDefinition taskdef-webserver taskdefwebserver2E748BDD 

We can see there is an IAM Role with permissions that seems to be related to ECS Task execution. We also see that there is the ECS TaskDefinition. When we use the cdk diff command, we can see what changes a deployment of our AWS CDK code would make to our stack. However, this information is presented in terms of CloudFormation resources. While you do not need to be an expert in CloudFormation, it is certainly beneficial to read and understand CloudFormation resource descriptions. This is also true when we write unit tests for our infrastructure, which we will see in a later article.

If you run cdk deploy to deploy this added infrastructure, we can log in to the AWS Console and see the result. In the ECS Service view in AWS Console, see an ECS task appearing now:

ECS Task Definition

You can drill further into the task definition to see the settings in place there. What we should focus on now, though, is to get it running. At this stage, we will run the task manually from AWS Console. Later, we automate that as well. Select the Cluster view under ECS and click on the cluster name there. Select the Task tab in the new view. Click on the Run new Task button.

Run new task in cluster

In the Run Task view, select launch type Fargate, and operating system family Linux. There is only one task definition in place, and only one version.

Run task config

Next under VPC and security groups you select the VPC id of the default VPC, and pick one or more of the subnets presented in the list. In the security group setting, you can notice a security group name. This is a security group that the AWS Console has generated for you - not something we have created. It is set up by default to allow incoming traffic on port 80 from anywhere.

Run task config

Run task config

Run task config

Scroll down to the end of the page and click on Run Task. The task should not be set up, first in stage PROVISIONING, then in state RUNNING. It should only take a couple of seconds, but you may need to refresh the view to see the status changes.

![Provisioning task]part4/Run_task_5.png)

Running task

If you click on the link under the headline Task in the task status view above, see a detailed view about the task itself. This view includes the private and public IP addresses of the task. Since we used the default VPC and it has only public subnets, the default here is to allocate a public IP address to the task. This is not what we want in the long run, but it is useful for us right now to see that the task with Apache web server is running properly.

Running task details

Note the public IP address you got in your view, and enter that in a web browser. It should try to access that IP address via HTTP by default and we should see the default response from Apache web server if we try that (the text It works!).

Web server response - it works!

Great work! We have now been able to create a cluster and run a container-based task in that cluster!

Rounding up and final notes

You should probably destroy the provisioned infrastructure before you forget it, so remember to run cdk destroy before you forget what you did here!

Looking at our goal list, we got some points we covered here, and a few that remain:

  • Expose an endpoint for a web server for HTTP traffic from internet.
  • Web server shall run in a container.
  • The container itself shall not be directly reachable from internet.
  • We should be able to have a service set up so that containers will automatically be started if needed.
  • We should be able to build our custom solution for this web server.
  • We should be able to get container images from DockerHub.
  • We do not care about managing the underlying server infrastructure that runs the containers. I.e., we will use Fargate.

So out of the 7 goals, we have covered 4 of them. The next three will be in the next article, so stay tuned for that one! If you are interested in similar materials, you can also visit the cloudgnosis.org website to look for more content.

Back to top