Devcloud and Devkit. How do they improve the developer experience at Jobandtalent? | by Job&Talent Engineering


12 min read

Jan 31, 2023

Jobandtalent utilizes the power of AWS, Terraform, Docker and Python to create a rich development environment. Let me show you how.


Written by Jakub Polak

Table of contents

  1. Introduction
  2. Devkit
    Devkit’s architecture
    Docker Compose
    Dockerfiles
    Services
    Gateway
    Shared services
    Commands
    Devkit clone
    Devkit setup
    Devkit multi
    Devkit secrets
  3. Devcloud
    Devcloud’s architecture
    Instance lifecycle
    Launching the ‘Nightly build’ machine
    Launching a developer’s machine
    Setting an instance
    Building images
    Launching an instance
    Starting, stopping, rebooting and terminating instances
    Accessing an instance
    AWS Lambda functions
    Out of the office instance stopper
    Instance terminator
    Build monitor
    Image deleter
  4. Wrapping up

Introduction

Jobandtalent consists of many engineering teams. Teams create many different projects, and these projects are often dependent on each other.

If you want to know more about Jobandtalent’s three main verticals (and teams responsible for these verticals), please read this article.

How do we maintain an environment for so many teams, so many people and so many projects? There has to be some standardization of tools and regular support for them. Otherwise it would be too chaotic to solve any potential environment problems.

Thankfully, there are open source instruments and cloud providers that have helped Jobandtalent to create a solution that is used by more than 150 developers every day.

Using these instruments and providers, we created two tools:

  • Devkit — Docker Compose based development environment,
  • and Devcloud — Devkit on AWS.

Backend DevEx, a technology team that implements internal libraries for better development experience at Jobandtalent, is also responsible for maintaining Devcloud and Devkit, as well as for new features.

The goal for this blog post is to share Jobandtalent’s internal solution for developer environment and inspire you to possibly implement a similar solution in your company.


Devkit

Devkit is the heart of Devcloud. It can be utilized locally or in the cloud.

Devkit’s architecture

A diagram of Devkit’s architecture.

We can use it on our local environment or in Devcloud.

It consists of three main parts:

  • The extensive docker-compose.yml file that interacts with the services’ Dockerfiles; regularly updated by all Product teams at Jobandtalent,
  • Dockerfiles of shared services — RabbitMQ, mailcatcher, gateway, etc.,
  • Custom shell commands that ease developer’s workflow.

Docker Compose

Docker Compose is a really great tool. It allows us to easily add new projects to our environment. When the team creates a new service, they should also add it to the huge docker-compose.yml file.

For example:

admin-front:
  <<: *logging-conf
  build:
    context: ../admin-front
    dockerfile: Dockerfile
  container_name: admin-front
  command: yarn dev -- -p 8080
  depends_on:
    - gateway
    - companies
    - farming
  environment:
    APP_DOMAIN: …
    APP_ENV: development
    APP_NAME: admin-front
    API_CLIENT_COMPANIES_BASE_URL: http://companies:8080/api
    API_CLIENT_FARMING_BASE_URL: http://farming:8080/api
  expose:
    - 8080
  volumes:
    - ../admin-front/src:/usr/local/jt/src

As you can see, the admin-front service depends on other services: Gateway, Companies and Farming. Companies and Farming are just regular services maintained by Product Teams. The Gateway service is explained in more depth below.

The file also sets some environment variables and, following the Gateway convention, exposes port 8080.

Dockerfiles

And this is what the Dockerfile file looks like in one of the projects:

FROM node:16.18.1-alpine3.16

WORKDIR /usr/local/jtservice

COPY package.json yarn.lock .npmrc ./
RUN yarn install --frozen-lockfile

COPY src src

EXPOSE 8080

CMD ["yarn", "start", "--", "-p", "8080"]

There is no magic here. Docker pulls the node:16.18.1-alpine3.16 image from the registry, sets the working directory, copies files, installs dependencies and runs the server.

Services

Gateway

Gateway is a special service shipped with DevKit, based on NGINX. It’s configured as a reverse-proxy server to redirect all the requests to the proper service.

Gateway uses naming conventions to map subdomains and services. For example, any request to mailcatcher.jt.dev would be forwarded to the mailcatcher service. The Devkit-Gateway-Proxy-Pass HTTP header included in the response can be used to know which service received the request from the Gateway. We use ngx_http_perl_module for routing and authentication.

Every service exposed by Docker (EXPOSE) listens on port 8080, which allows Gateway to forward requests to that port.

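The naming convention can be sketched in a few lines of Python; the `upstream_for` helper and the `jt.dev` default are illustrative assumptions, not Devkit's real code:

```python
# Hypothetical sketch of the Gateway convention: the first label of the
# request's Host header names the Docker Compose service, and every
# service listens on port 8080.
def upstream_for(host, domain="jt.dev"):
    """Map e.g. 'mailcatcher.jt.dev' to 'http://mailcatcher:8080'."""
    if not host.endswith("." + domain):
        raise ValueError("%r is outside %r" % (host, domain))
    service = host[: -(len(domain) + 1)]  # strip '.<domain>' suffix
    return "http://%s:8080" % service
```

In the real Gateway, this mapping happens inside NGINX (via ngx_http_perl_module) rather than in Python.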

Shared services

Some services, like mailcatcher, Elasticsearch and RabbitMQ, are shared across all teams. Each ships with its own Dockerfile in Devkit; here is mailcatcher’s:

FROM ruby:3.1-alpine

RUN apk add --no-cache g++ make

RUN gem install mailcatcher

EXPOSE 1025 8080

CMD ["mailcatcher", "-f", "--ip=0.0.0.0", "--http-port=8080"]

Commands

We wrote some custom shell commands in Devkit to ease development and the onboarding of new people.

Devkit clone

What if someone added a new service to Docker Compose?

Thanks to a few lines of code, we stay in sync with our projects folder: Devkit warns us, through its warn_if_outdated function, when our Docker Compose file is out of date.

We need to pull the changes:

git pull devkit

And then run:

devkit clone

This is what the sample output of the command looks like:

==> git clone        git@github.com:jobandtalent/some-service-1.git (cloning into '/home/user/jobandtalent/some-service-1')
The repository is already cloned
==> git clone git@github.com:jobandtalent/new-service.git (cloning into '/home/user/jobandtalent/new-service')
Cloning into 'new-service'...
remote: Enumerating objects: 536, done.
remote: Counting objects: 100% (128/128), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 536 (delta 64), reused 91 (delta 50), pack-reused 408
Receiving objects: 100% (536/536), 431.30 KiB | 1.10 MiB/s, done.
Resolving deltas: 100% (200/200), done.
==> git clone git@github.com:jobandtalent/some-service-2.git (cloning into '/home/user/jobandtalent/some-service-2')
The repository is already cloned
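The logic behind the command can be roughly sketched as follows; `missing_clones` and its return value are illustrative assumptions (the real command shells out to `git clone` directly):

```python
import os

# Hypothetical sketch of `devkit clone`: walk the service names from
# docker-compose.yml and work out which repositories still need cloning.
# Returns the planned (url, target) pairs instead of running git, so the
# real command would feed these into `git clone <url> <target>`.
def missing_clones(service_names, projects_dir):
    pairs = []
    for name in service_names:
        target = os.path.join(projects_dir, name)
        if os.path.isdir(target):
            continue  # "The repository is already cloned"
        pairs.append(("git@github.com:jobandtalent/%s.git" % name, target))
    return pairs
```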

Devkit setup

Some of the services have an additional file .devkit/setup

Usually, it looks very simple:

#!/bin/sh

set -eu

cd "$(dirname "$0")/.." || exit 1

script/setup

The script/setup file executes commands responsible for database creation, migrations, seeds etc.:

bundle exec rake db:create
bundle exec rake db:schema:load
bundle exec rake db:seed
bundle exec rake elasticsearch:setup

The rake commands you see above are used most widely by the Ruby on Rails framework.

The devkit setup command is launched whenever a new Devcloud instance is created.

This convention is derived from GitHub’s Scripts to Rule Them All repository, where you can read more about it.

Devkit multi

It is a dedicated command to execute commands in multiple services. There are two modes:

  • exec – execute a command in multiple running containers
  • run – run a one-off command in multiple services

Let’s say, we want to print the environment of all running containers. We can use the command:

devkit multi exec env

We can specify the services:

devkit multi exec service_1 service_2 env
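A rough sketch of what such a command does under the hood; the real implementation is shell, and `multi_commands` is a hypothetical name used here only to illustrate the two modes:

```python
# Hypothetical sketch of `devkit multi`: fan one command out over several
# services by building the equivalent `docker compose` invocations.
def multi_commands(mode, services, command):
    """mode is 'exec' (running containers) or 'run' (one-off containers)."""
    if mode not in ("exec", "run"):
        raise ValueError("mode must be 'exec' or 'run'")
    base = ["docker", "compose", mode]
    if mode == "run":
        base = base + ["--rm"]  # clean up one-off containers afterwards
    return [base + [service] + command for service in services]
```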

Devkit secrets

We have to deal with some secrets, right? We store them encrypted in a single repository and inject them into the Docker Compose commands.

It is also possible to list secrets and to safely encrypt them with AES.

Devcloud

Devcloud is Devkit moved to the cloud. Compared to running every service locally, it saves gigabytes of RAM, plenty of CPU cycles and a lot of developer frustration. To keep development fast, this solution simply had to be implemented.

Devcloud’s architecture

Press enter or click to view image in full size

A diagram of Devcloud architecture.

We utilize the power of Python programming language, capabilities and flexibility of AWS and the ease of deployment using Terraform.

Instance lifecycle

There are two independent processes in an instance’s lifecycle:

Launching the ‘Nightly build’ machine

  • A new machine is launched.
  • The machine runs the build.
  • If successful, it creates a new image, e.g. ami-abcd.
  • The process is repeated every 24 hours.

This process is explained in more depth in the Building images section.

Launching a developer’s machine

  • The developer launches their own machine XXXX based on the latest build.
  • They can then do their work, and stop, reboot or terminate the machine.
  • If they terminate machine XXXX and want to create a new machine YYYY, it will use the same, untouched ami-abcd build.

The diagram below illustrates this process:


Setting an instance

Before we launch and create an instance, we need to configure some basic settings.

class Instance(object):

    def __init__(self, aws_instance):
        self.set_aws_instance(aws_instance)

    def set_aws_instance(self, aws_instance):
        self.id = aws_instance.id
        self.type = aws_instance.instance_type
        self.image_id = aws_instance.image_id
        ....................

Attributes like the ones above come from the boto3 library, the AWS SDK for Python used for creating, configuring and managing AWS services.

For Devcloud we decided to use Amazon EC2. This Amazon service was perfect for our needs.

We coded our own layer on top of EC2. To build such a layer, you need to set certain variables in your code.

And so on… There are plenty more functions available that you can use depending on your own needs.
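As a runnable illustration of the wrapper pattern above; a SimpleNamespace stands in for the real boto3 EC2.Instance object (which exposes the same attributes), so the example needs no AWS credentials:

```python
from types import SimpleNamespace

# Sketch of the wrapper: copy the boto3 EC2.Instance attributes the tool
# needs onto our own object. The fake instance below mimics boto3's
# attribute names (id, instance_type, image_id) for demonstration only.
class Instance(object):
    def __init__(self, aws_instance):
        self.set_aws_instance(aws_instance)

    def set_aws_instance(self, aws_instance):
        self.id = aws_instance.id
        self.type = aws_instance.instance_type
        self.image_id = aws_instance.image_id
        return self

# Stand-in for a boto3 EC2.Instance; values are illustrative.
fake = SimpleNamespace(id="i-0123", instance_type="m5.large", image_id="ami-abcd")
instance = Instance(fake)
```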

Building images

We need to create new builds on a regular basis. There is one simple reason for that: changes merged into the devkit’s master branch can have bugs. The build might crash and thus the developer will be forced to use an old image that was built more than 24 hours ago.

With the help of the:

devcloud build

command, we can build new images.

The command creates the Builder instance and logs information for the user:

def build(self, options: Any) -> None:
    instance = Builder.launch(self.owner)
    ip = instance.public_ip_address
    msg = ("Build in progress. The build log can be accessed by running "
           "`ssh ubuntu@%s tail -f build.log`." % ip)
    self.io.write(text=msg)

Launching an instance

To launch a new Devcloud instance from the builder’s image, we execute:

devcloud launch

Let’s dive into the code to understand what is going on behind the scenes.

And there is a lot going on.

EC2 = boto3.resource("ec2")
DEFAULT_AWS_EBS_VOLUME_SIZE = 100
.................

class Instance(object):
    .................

    @classmethod
    def launch(
        cls,
        owner,
        name,
        aws_image_id,
        aws_instance_type=None,
        aws_ebs_volume_size=None,
        .................
    ):
        public_name = "devcloud %s, owned by %s" % (name, owner)
        domain = os.environ["DEVCLOUD_DOMAIN"]
        host = ("%s.%s" % (name, domain)).lower()
        hostname = host.replace(".", "-")
        aws_ebs_volume_size = aws_ebs_volume_size or DEFAULT_AWS_EBS_VOLUME_SIZE
        .................
        user_data = """
#cloud-config

hostname: %(hostname)s
bootcmd:
  - echo 127.0.0.1 %(hostname)s >> /etc/hosts
write_files:
  - path: /etc/environment
    content: |
      PATH="/home/ubuntu/devkit/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
      AWS_DEFAULT_REGION=%(aws_region)s
      AWS_REGION=%(aws_region)s
.................
""" % dict(
            aws_region=os.environ["AWS_REGION"],
            name=name,
            owner=owner,
            host=host,
            domain=domain,
            .................
        )
        aws_instance = EC2.create_instances(
            BlockDeviceMappings=[
                .................
            ],
            ImageId=aws_image_id,
            UserData=user_data,
            .................
        )[0]  # create_instances returns a list; we launch a single instance
        aws_instance.wait_until_running()
        aws_instance.load()

        return cls(aws_instance)

The original code has more implementation and variable details but the goal here is to get a general grasp of the idea.

Devcloud’s launch command sets new variables and parses some of them to create a new EC2 instance using:

EC2.create_instances(args)

Among the method’s arguments, we can distinguish BlockDeviceMappings, ImageId, UserData, and the previously unmentioned InstanceType and TagSpecifications.
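For illustration, those arguments could be assembled like this; the helper name, device name and values are assumptions for the sketch, not Jobandtalent's real configuration:

```python
# Hypothetical helper that builds the keyword arguments for
# EC2.create_instances; all concrete values here are illustrative.
def launch_kwargs(image_id, instance_type, user_data, volume_size=100):
    return dict(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        BlockDeviceMappings=[
            # Root volume size in GiB.
            {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": volume_size}},
        ],
        TagSpecifications=[
            {"ResourceType": "instance",
             "Tags": [{"Key": "Name", "Value": "devcloud example"}]},
        ],
    )

# In the real tool, something like: EC2.create_instances(**launch_kwargs(...))
```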

UserData is injected into the developer’s machine and, if everything runs as expected, the code executes two boto3 methods: EC2.Instance.wait_until_running and EC2.Instance.load. At the end, the method returns the just-launched instance.

Starting, stopping, rebooting and terminating instances

We need a few commands for efficient daily work with AWS machines. I grouped them together because they are quite similar in their structure.

As described below in the AWS Lambda functions section, all machines are stopped after 7 p.m.

What is the first thing, then, that a developer must do as they start their work in the morning? They must start the stopped machine.

This can be done by the command:

devcloud start

The behind the scenes code is not that complicated:

def start(self):
    self.aws_instance.start()
    self.aws_instance.wait_until_running()
    self.aws_instance.load()

    return self.set_aws_instance(self.aws_instance)

The method uses the previously mentioned EC2.Instance.wait_until_running and EC2.Instance.load methods from the boto3 library.

The remaining commands can be distinguished from each other by just a few small details. Instead of:

devcloud start

we will have the following:

devcloud stop

devcloud reboot

devcloud terminate
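Assuming they mirror `start()`, the commands map onto boto3 EC2.Instance methods and waiters roughly like this; the table is an assumption (boto3 has no dedicated reboot waiter, so reboot reuses `wait_until_running`):

```python
# Assumed mapping of devcloud commands to boto3 EC2.Instance methods and
# waiters; waiter names follow the boto3 resource API.
ACTIONS = {
    "start":     ("start", "wait_until_running"),
    "stop":      ("stop", "wait_until_stopped"),
    "reboot":    ("reboot", "wait_until_running"),
    "terminate": ("terminate", "wait_until_terminated"),
}

def run_action(aws_instance, command):
    method_name, waiter_name = ACTIONS[command]
    getattr(aws_instance, method_name)()   # e.g. aws_instance.stop()
    getattr(aws_instance, waiter_name)()   # block until the state is reached
    aws_instance.load()                    # refresh the cached attributes
```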

Accessing an instance

Well, we launched an instance and now we would like to get inside! To do this, we will use the command:

devcloud ssh

What’s inside? Let’s find out!

def ssh(self, options):
    name = options.get("NAME_OR_INSTANCE_ID") or self.owner
    instance = Instance.find(name)

    if instance:
        if instance.state != "running":
            self.io.write(text="The instance %s is not running." % name)
        else:
            os.execv("/usr/bin/ssh", ("ssh", "ubuntu@%s" % instance.ip, "-A"))
    else:
        self.io.write(text="The instance %s was not found." % name)

First, it tries to find the instance by the given name. The conditions then verify that the instance exists and is running. If both checks pass, the Python code:

os.execv("/usr/bin/ssh", ("ssh", "ubuntu@%s" % instance.ip, "-A"))

will execute the SSH client and connect to the machine that was configured during the instance launch.

AWS Lambda functions

Terraform sets up and maintains several AWS Lambda functions, which provide serverless execution infrastructure. Let’s take a look at some of the ones I find most interesting.

Out of the office instance stopper

At Jobandtalent, we have flexible working hours. People usually work between 6 a.m. and 7 p.m. After 7 p.m., all instances are automatically stopped to save money. The developer can resume the machine at any time, of course.

We wrote some simple lines of Python code to execute this action. The method finds all instances that are either running or pending and stops them.

def lambda_handler(params, context):
    instances = Instance.all(states=("running", "pending"))

    for instance in instances:
        instance.aws_instance.stop()

    return [instance.id for instance in instances]

We have an AWS Lambda function that defines this action:

resource "aws_lambda_function" "lambda_out_of_office_hours_instance_stopper" { ... }

And the AWS Cloudwatch Event Rule — simply a cron job responsible for stopping the instance after 7 p.m.:

resource "aws_cloudwatch_event_rule" "cron_out_of_office_hours_instance_stopper" { ... }


Instance terminator

We also have a lambda function responsible for terminating machines older than 2 days. Similar to the previous lambda function, it helps us to save money.

BUILDER_LIFETIME_DAYS = 2

def lambda_handler(params, context):
    builder_instances = Builder.all(states=("running",))
    old_builder_instances = list(filter(__is_old_builder, builder_instances))

    for builder_instance in old_builder_instances:
        builder_instance.stop()

    return [instance.id for instance in old_builder_instances]

def __is_old_builder(instance):
    tz_info = instance.launch_time.tzinfo
    now = datetime.now(tz_info)
    return (now - instance.launch_time).days >= BUILDER_LIFETIME_DAYS

The lambda function queries all running builder instances and then filters them by launch_time.

Then, similarly to the previous example, a cron job is being executed:

resource "aws_cloudwatch_event_rule" "cron_builder_instance_terminator" { ... }

Build monitor

We frequently see this AWS lambda function action in use in our Slack channel:


IMAGE_BUILD_FREQUENCY = 24

def lambda_handler(params, context):
    image_age = __hours_since_last_image_creation()

    if image_age > IMAGE_BUILD_FREQUENCY:
        ...
        admin = "..."
        msg = "@%s the latest image is %d hours old. " % (admin, image_age)

        if instance:
            ip = instance.public_ip_address
            msg += "Please, inspect the build log using the command `ssh ubuntu@%s cat build.log`." % ip
        else:
            msg += "Please, run a new build using the command `/devcloud build`."

        slack.write(text=msg)

The function checks whether a build has succeeded in the last 24 hours. If not, the most likely reason is that the docker-compose.yml file in Devkit has errors.
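The `__hours_since_last_image_creation` helper is not shown in the post; a plausible sketch (name and signature are assumptions) is plain timestamp arithmetic on the newest image's creation time:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sketch of the elided helper: how many hours have passed
# since the given (timezone-aware) image creation timestamp.
def hours_since(creation_time):
    delta = datetime.now(timezone.utc) - creation_time
    return delta.total_seconds() / 3600.0
```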

The Backend DevEx Team pings the code owners of the failing service, and they fix the bug. This, by the way, is why code owners are important. In big, growing companies like Jobandtalent, ownership needs to be defined and regularly maintained so that we can react quickly to problems instead of wondering who is responsible for a given piece of code.

Image deleter

We want to prevent storing lots of AWS images, for both performance and financial reasons: the more images we keep, the more Jobandtalent spends. So we wrote a rule to keep at most 5 images:

MAX_IMAGES_COUNT = 5

def lambda_handler(params, context):
    images = Image.all()

    # Drop the first MAX_IMAGES_COUNT images (the ones to keep) from the
    # deletion list; everything that remains gets deleted.
    del images[:MAX_IMAGES_COUNT]

    for image in images:
        image.delete()

    return [image.id for image in images]

Similar to the out-of-office instance stopper, we also use an AWS Cloudwatch Event Rule:

resource "aws_cloudwatch_event_rule" "cron_image_deleter" { ... }

Wrapping up

I have not described everything here, but I hope I have covered the general idea well.

There would be a few more things to cover here: integration with Slack, more security aspects, provisioning of the Devcloud machines, integration with the docopt library, authentication, DNS, IDE and metrics (Datadog) integration. If you liked this post, we will consider discussing one of these topics in the future.

As we publish this post, the Backend DevEx team is investigating some alternatives to these tools. The company is growing bigger and bigger; we can’t stop it, and we are proud of it.

If you are reading this, thank you for your time, I hope you have learned something valuable!

Authored by Read more at: Source Feed:



From: #Devcloud #Devkit #improve #developer #experience #Jobandtalent #JobTalent #Engineering

✨ Job&Talent Engineering on 2023-01-31 17:51:00

uncovered Job&amp;Talent Engineering – Medium
👉 Devcloud and Devkit. How do they improve the developer experience at Jobandtalent? | by Job&Talent Engineering Read Now

12 min read

Jan 31, 2023

Jobandtalent utilizes the power of AWS, Terraform, Docker and Python to create a rich development environment. Let me show you how.

Press enter or click to view image in full size

Written by Jakub Polak

Table of content

  1. Introduction
  2. Devkit
    Devkit’s architecture
    Docker Compose
    Dockerfiles
    Services
    Gateway
    Shared services
    Commands
    Devkit clone
    Devkit setup
    Devkit multi
    Devkit secrets
  3. Devcloud
    Devcloud’s architecture
    Instance lifecycle
    Launching the ‘Nightly build’ machine
    Launching a developer’s machine
    Setting an instance
    Building images
    Launching an instance
    Starting, stopping rebooting and terminating instances
    Accessing an instance
    AWS Lambda functions
    Out of the office instance stopper
    Instance terminator
    Build monitor
    Image deleter
  4. Wrapping up

Introduction

Jobandtalent consists of many engineering teams. Teams create many different projects, and these projects are often dependent on each other.

If you want to know more about Jobandtalent’s three main verticals (and teams responsible for these verticals), please read this article.

How do we maintain an environment for so many teams, so many people and so many projects? There has to be some standardization of tools and regular support for them. Otherwise it would be too chaotic to solve any potential environment problems.

Thankfully, there are open source instruments and cloud providers that have helped Jobandtalent to create a solution that is used by more than 150 developers every day.

Using these instruments and providers, we created two tools:

  • Devkit — Docker Compose based development environment,
  • and Devcloud — Devkit on AWS.

Backend DevEx, a technology team that implements internal libraries for better development experience at Jobandtalent, is also responsible for maintaining Devcloud and Devkit, as well as for new features.

The goal for this blog post is to share Jobandtalent’s internal solution for developer environment and inspire you to possibly implement a similar solution in your company.

Press enter or click to view image in full size

Devkit

Devkit is the heart of Devcloud. It can be utilized locally or in the cloud.

Devkit’s architecture

Devkit’s architecture

We can use it on our local environment or in Devcloud.

It consists of three main parts:

  • The extensive docker-compose.yml file that interacts with service’s Dockerfiles; regularly updated by all Product teams at Jobandtalent,
  • Dockerfiles of shared services — RabbitMQ, mailcatcher, gateway, etc.,
  • Custom shell commands that ease developer’s workflow.

Docker Compose

Docker Compose is a really great tool. It allows us to easily add new projects to our environment. When the team creates a new service, they should also add it to the huge docker-compose.yml file.

For example:

admin-front:
<<: *logging-conf
build:
context: ../admin-front
dockerfile: Dockerfile
container_name: admin-front
command: yarn dev -- -p 8080
depends_on:
- gateway
- companies
- farming
environment:
APP_DOMAIN: $Tags:
APP_ENV: development
APP_NAME: admin-front
API_CLIENT_COMPANIES_BASE_URL: http://companies:8080/api
API_CLIENT_FARMING_BASE_URL: http://farming:8080/api
expose:
- 8080
volumes:
- ../admin-front/src:/usr/local/jt/srcdd

As you can see, the admin-front service depends on other services: Gateway, Companies and Farming. Companies and Farming are just regular services that are maintained by Product Teams. The Gateway service is deeper explained below.

The file also stores some ENV variables and thanks to Gateway always expose port 8080.

Dockerfiles

And this is what the Dockerfile file looks like in one of the projects:

FROM node:16.18.1-alpine3.16

WORKDIR /usr/local/jtservice

COPY package.json yarn.lock .npmrc ./
RUN yarn install --frozen-lockfile

COPY src src

EXPOSE 8080

CMD ( "yarn", "start", "--", "-p", "8080")

There is no magic here. Docker pulls the node:16.18.1-alpine3.16 image from registry. Sets some variables, copies files, installs dependencies and runs the server.

Services

Gateway

Gateway is a special service shipped with DevKit, based on NGINX. It’s configured as a reverse-proxy server to redirect all the requests to the proper service.

Gateway uses naming conventions to map subdomains and services. For example, any request to mailcatcher.jt.dev would be forwarded to the mailcatcher service. The Devkit-Gateway-Proxy-Pass HTTP header included in the response can be used to know which service received the request from the Gateway. We use ngx_http_perl_module for routing and authentication.

Every service exposed by Docker (EXPOSE) listens on the port 8080. That allows Gateway to forward the requests to that port.

........
location / Written by

Shared services

Some services like mailcatcher, elasticsearch, RabbitMQ, etc., should be shared in one Dockerfile:

FROM ruby:3.1-alpine

RUN apk add --no-cache g++ make

RUN gem install mailcatcher

EXPOSE 1025 8080

CMD ("mailcatcher", "-f", "--ip=0.0.0.0", "--http-port=8080")

Commands

We wrote some custom commands in Devkit using shell to ease the development and onboarding of new people.

Devkit clone

What if someone added a new service to Docker Compose?

Thanks to a few lines of code we can have synchronization with our project folder. Devkit will warn us when our Docker Compose file is out of the date:

warn_if_outdated() 👉

We need to pull the changes:

git pull devkit

And then run:

devkit clone

This is what the sample output of the command looks like:

==> git clone        git@github.com:jobandtalent/some-service-1.git (cloning into '/home/user/jobandtalent/some-service-1')
The repository is already cloned
==> git clone git@github.com:jobandtalent/new-service.git (cloning into '/home/user/jobandtalent/new-service')
Cloning into 'new-service'...
remote: Enumerating objects: 536, done.
remote: Counting objects: 100% (128/128), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 536 (delta 64), reused 91 (delta 50), pack-reused 408
Receiving objects: 100% (536/536), 431.30 KiB | 1.10 MiB/s, done.
Resolving deltas: 100% (200/200), done.
==> git clone git@github.com:jobandtalent/some-service-2.git (cloning into '/home/user/jobandtalent/some-service-2')
The repository is already cloned

Devkit setup

Some of the services have an additional file .devkit/setup

Usually, it looks very simple:

#!/bin/sh

set -eu

cd "$(dirname "$0")/.." || exit 1

script/setup

The script/setup file executes commands responsible for database creation, migrations, seeds etc.:

bundle exec rake db:create
bundle exec rake db:schema:load
bundle exec rake db:seed
bundle exec rake elasticsearch:setup

he rake commands you see above are used most widely by the Ruby on Rails framework.

The devkit setup command is launched whenever a new Devcloud instance is created.

This convention is derived from Scripts To Run Them All repository, here you can read more about it.

Devkit multi

It is a dedicated command to execute commands in multiple services. There are 2 when executing commands:

  • exec – execute a command in multiple running containers
  • run – run a one-off command in multiple services

Let’s say, we want to print the environment of all running containers. We can use the command:

devkit multi exec env

We can specify the services:

devkit multi exec service_1 service_2 env

Devkit secrets

We have to deal with some secrets, right? We store secrets encrypted in the one repository and we inject those secrets into the Docker Compose commands.

There is a possibility to list and safely encode secrets in AES.

Devcloud

Devcloud is Devkit moved to the cloud. It saves whole gigabytes of RAM and CPU, let alone developer frustration compared to services running locally. To ensure the speed of development, this solution simply had to be implemented.

Devcloud’s architecture

Press enter or click to view image in full size

A diagram of Devcloud architecture.

We utilize the power of Python programming language, capabilities and flexibility of AWS and the ease of deployment using Terraform.

Instance lifecycle

There are two independent processes that happen during instance lifecycle:

Launching the ‘Nightly build’ machine

  • The new machine is launched.
  • The machine runs the build
  • If successful, it creates a new image, i.e. ami-abcd
  • The process is repeated every 24 hours.

This process is explained in more depth in the Building images section

Launching a developer’s machine

  • The developer launches his own machine XXXX based on the latest build
  • Now he can do the work, stop the instance, reboot it or terminate the machine
  • If he terminates his machine and want to create a new machine YYYY it will use the same, untouched ami-abcd build.

The diagram below illustrates this process:

Press enter or click to view image in full size

Setting an instance

Before we launch and create an instance, we need to configure some basic settings.

class Instance(object):

def __init__(self, aws_instance):
self.set_aws_instance(aws_instance)

def set_aws_instance(self, aws_instance):
self.id = aws_instance.id
self.type = aws_instance.instance_type
self.image_id = aws_instance.image_id
....................

Variables like the ones posted above are derived from the boto3 library — a library created for creating, configuring, and managing AWS services.

For Devcloud we decided to use Amazon EC2. This Amazon service was perfect for our needs.

We coded our own on top of EC2 cloud. To build such a layer, you need to set certain variables in your code.

And so on… There are plenty more functions available that you can use depending on your own needs.

Building images

We need to create new builds on a regular basis. There is one simple reason for that: changes merged into the devkit’s master branch can have bugs. The build might crash and thus the developer will be forced to use an old image that was built more than 24 hours ago.

With the help of the:

devcloud build

command, we can build new images.

The command creates the Builder instance and logs an information for the User:

def build(self, options: Any) -> None:
instance = Builder.launch(self.owner)
ip = instance.public_ip_address
msg = "Build in progress. The build log can be accessed by running " "`ssh ubuntu@%s tail -f build.log`." % ip
self.io.write(text=msg)

Launching an instance

The command to launch new Devcloud instances uses an image from builder and executes

devcloud launch

Let’s dive into the code to understand what is going on behind the scenes.

And there is a lot going on.

EC2 = boto3.resource("ec2")
DEFAULT_AWS_EBS_VOLUME_SIZE = 100
.................

class Instance(object):
.................

@classmethod
def launch(
cls,
owner,
name,
aws_image_id,
aws_instance_type=None,
aws_ebs_volume_size=None,
.................
):
public_name = "devcloud %s, owned by %s" % (name, owner)
domain x= os.environ("DEVCLOUD_DOMAIN")
host = ("%s.%s" % (name, domain)).lower()
hostname = host.replace(".", "-")
aws_ebs_volume_size = aws_ebs_volume_size or DEFAULT_AWS_EBS_VOLUME_SIZE
.................
user_data = """
#cloud-config

hostname: %(hostname)s
bootcmd:
- echo 127.0.0.1 %(hostname)s >> /etc/hosts
write_files:
- path: /etc/environment
content: |
PATH="/home/ubuntu/devkit/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
AWS_DEFAULT_REGION=%(aws_region)s
AWS_REGION=%(aws_region)s
.................
""" % dict(
aws_region=os.environ("AWS_REGION"),
name=name,
owner=owner,
host=host,
domain=domain,
.................
aws_instance = EC2.create_instances(
BlockDeviceMappings=(
Source
),
ImageId=aws_image_id,
UserData=user_data,
.................
aws_instance.wait_until_running()
aws_instance.load()

return cls(aws_instance)

The original code has more implementation and variable details but the goal here is to get a general grasp of the idea.

Devcloud’s launch command sets new variables and parses some of them to create a new EC2 instance using:

EC2.create_instances(args)

Among the method’s arguments, we can distinguish BlockDeviceMappings, ImageId, UserData, and the previously unmentioned InstanceType or TagSpecifications.
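The naming logic at the top of launch is easy to illustrate in isolation. In this sketch the domain value is made up; the real one comes from the DEVCLOUD_DOMAIN environment variable:

```python
# Hypothetical domain value; Devcloud reads it from the DEVCLOUD_DOMAIN env var
domain = "dev.example.com"

def derive_names(name, domain):
    # Same derivation as in Instance.launch: a lowercase FQDN, then a
    # dash-separated hostname (dots are not valid inside a hostname label)
    host = ("%s.%s" % (name, domain)).lower()
    hostname = host.replace(".", "-")
    return host, hostname

print(derive_names("Jakub", domain))
# ('jakub.dev.example.com', 'jakub-dev-example-com')
```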

UserData is injected into the developer’s machine, and if everything runs as expected, Python executes two boto3 methods: EC2.Instance.wait_until_running, which blocks until the instance is up, and EC2.Instance.load, which refreshes its attributes.

At the end, the method returns the just launched instance.

Starting, stopping, rebooting and terminating instances

We need a few commands for efficient daily work with AWS machines. I grouped them together because they are quite similar in structure.

As described in the AWS Lambda functions section below, all machines are stopped after 7 p.m.

What, then, is the first thing a developer must do when starting work in the morning? Start the stopped machine.

This can be done by the command:

devcloud start

The behind the scenes code is not that complicated:

def start(self):
    self.aws_instance.start()
    self.aws_instance.wait_until_running()
    self.aws_instance.load()

    return self.set_aws_instance(self.aws_instance)

The method uses the previously mentioned boto3 methods EC2.Instance.wait_until_running and EC2.Instance.load.

I group the following commands together because they differ from each other only in small details. Instead of:

devcloud start

we will have the following:

devcloud stop

devcloud reboot

devcloud terminate
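One way to see why these four methods are near-duplicates is to sketch them as a single dispatch table. This is a hypothetical refactoring, not the actual Devcloud implementation; the boto3 method and waiter names are real (reboot is the one action without a dedicated waiter):

```python
# Hypothetical sketch: each command maps to a boto3 EC2.Instance method
# and an optional waiter to block on until the state change completes.
ACTIONS = {
    "start": ("start", "wait_until_running"),
    "stop": ("stop", "wait_until_stopped"),
    "reboot": ("reboot", None),  # boto3 has no wait_until_rebooted
    "terminate": ("terminate", "wait_until_terminated"),
}

def run_action(aws_instance, command):
    method, waiter = ACTIONS[command]
    getattr(aws_instance, method)()  # e.g. aws_instance.stop()
    if waiter:
        getattr(aws_instance, waiter)()  # e.g. aws_instance.wait_until_stopped()
    aws_instance.load()  # refresh the cached instance attributes
```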

Accessing an instance

Well, we launched an instance and now we would like to get inside! To do this, we will use the command:

devcloud ssh

What’s inside? Let’s find out!

def ssh(self, options):
    name = options.get("NAME_OR_INSTANCE_ID") or self.owner
    instance = Instance.find(name)

    if instance:
        if instance.state != "running":
            self.io.write(text="The instance %s is not running." % name)
        else:
            os.execv("/usr/bin/ssh", ("ssh", "ubuntu@%s" % instance.ip, "-A"))
    else:
        self.io.write(text="The instance %s was not found." % name)

First, the command tries to find the instance by the given name. It then checks that the instance exists and is actually running. If both checks pass, the Python call:

os.execv("/usr/bin/ssh", ("ssh", "ubuntu@%s" % instance.ip, "-A"))

replaces the current process with the SSH client, connecting as the ubuntu user configured during the instance launch. Because os.execv never returns, the CLI hands the terminal straight over to the SSH session.

AWS Lambda functions

Terraform sets up and maintains several AWS Lambda functions that provide serverless execution infrastructure. Let’s take a look at some of the ones I find most interesting.

Out of the office instance stopper

At Jobandtalent, we have flexible working hours. People usually work between 6 a.m. and 7 p.m. After 7 p.m., all instances are automatically stopped to save money. The developer can resume the machine at any time, of course.

We wrote a few simple lines of Python to execute this action. The handler finds all instances that are either running or pending and stops them.

def lambda_handler(params, context):
    instances = Instance.all(states=("running", "pending"))

    for instance in instances:
        instance.aws_instance.stop()

    return [instance.id for instance in instances]

We have an AWS Lambda function that defines this action:

resource "aws_lambda_function" "lambda_out_of_office_hours_instance_stopper"

And the AWS Cloudwatch Event Rule — simply a cron job responsible for stopping the instance after 7 p.m.:

resource "aws_cloudwatch_event_rule" "cron_out_of_office_hours_instance_stopper"

Instance terminator

We also have a Lambda function responsible for terminating builder machines older than 2 days. Like the previous function, it helps us save money.

BUILDER_LIFETIME_DAYS = 2

def lambda_handler(params, context):
    builder_instances = Builder.all(states=("running",))
    old_builder_instances = list(filter(__is_old_builder, builder_instances))

    for builder_instance in old_builder_instances:
        builder_instance.terminate()

    return [instance.id for instance in old_builder_instances]

def __is_old_builder(instance):
    tz_info = instance.launch_time.tzinfo
    now = datetime.now(tz_info)
    return (now - instance.launch_time).days >= BUILDER_LIFETIME_DAYS

The Lambda function queries all running builder instances and then filters them by launch_time.
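The age filter is easy to verify in isolation. Here is a self-contained sketch of the same timezone-aware comparison; the dates are made up for the example:

```python
from datetime import datetime, timedelta, timezone

BUILDER_LIFETIME_DAYS = 2

def is_old_builder(launch_time, now=None):
    # Compare in the launch_time's own timezone, as the lambda handler does
    now = now or datetime.now(launch_time.tzinfo)
    return (now - launch_time).days >= BUILDER_LIFETIME_DAYS

now = datetime(2023, 1, 31, 12, 0, tzinfo=timezone.utc)
print(is_old_builder(now - timedelta(days=3), now))    # True: 3 full days old
print(is_old_builder(now - timedelta(hours=30), now))  # False: only 1 full day
```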

Then, similarly to the previous example, a cron job executes it:

resource "aws_cloudwatch_event_rule" "cron_builder_instance_terminator"

Build monitor

We frequently see this AWS Lambda function in action in our Slack channel, where it posts build status notifications.

IMAGE_BUILD_FREQUENCY = 24

def lambda_handler(params, context):
    image_age = __hours_since_last_image_creation()

    if image_age > IMAGE_BUILD_FREQUENCY:
        ...
        admin = "..."
        msg = "@%s the latest image is %d hours old. " % (admin, image_age)

        if instance:
            ip = instance.public_ip_address
            msg += "Please, inspect the build log using the command `ssh ubuntu@%s cat build.log`." % ip
        else:
            msg += "Please, run a new build using the command `/devcloud build`."

        slack.write(text=msg)

The function checks whether a build succeeded in the last 24 hours. If not, the most likely reason is an error in a docker-compose.yml file in Devkit.
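The age computation behind __hours_since_last_image_creation can be sketched with plain datetime arithmetic. The function below and its dates are illustrative; the real implementation reads the newest AMI’s creation date via boto3:

```python
from datetime import datetime, timezone

IMAGE_BUILD_FREQUENCY = 24  # hours

def hours_since(creation_date, now=None):
    # Age of the newest image, in fractional hours
    now = now or datetime.now(timezone.utc)
    return (now - creation_date).total_seconds() / 3600

now = datetime(2023, 1, 31, 12, 0, tzinfo=timezone.utc)
last_image = datetime(2023, 1, 30, 6, 0, tzinfo=timezone.utc)
print(hours_since(last_image, now) > IMAGE_BUILD_FREQUENCY)  # True: 30 hours old
```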

The Backend DevEx team pings the code owners of the failing service and they fix the bug. This, by the way, is why code owners are important: in a big, growing company like Jobandtalent, ownership needs to be defined and regularly maintained so that we can react quickly to problems instead of wondering who is responsible for a given piece of code.

Image deleter

We want to avoid storing lots of AWS images, for both performance and financial reasons: the more images we keep, the more money Jobandtalent spends. So we wrote a rule that stores a maximum of 5 images:

MAX_IMAGES_COUNT = 5

def lambda_handler(params, context):
    images = Image.all()

    del images[:MAX_IMAGES_COUNT]

    for image in images:
        image.delete()

    return [image.id for image in images]
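The deletion slice (correctly written del images[:MAX_IMAGES_COUNT]) assumes Image.all() returns images sorted newest first: removing the first five entries from the list keeps the five newest AMIs safe and leaves only the older ones as deletion candidates. A quick illustration with placeholder image IDs:

```python
MAX_IMAGES_COUNT = 5

# Placeholder image IDs, newest first (index 0 is the most recent build)
images = ["ami-%d" % i for i in range(8)]

del images[:MAX_IMAGES_COUNT]  # drop the 5 newest from the deletion candidates

print(images)  # ['ami-5', 'ami-6', 'ami-7'] -- only these would be deleted
```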

Similar to out of the office instance stopper, we also use an AWS Cloudwatch Event Rule:

resource "aws_cloudwatch_event_rule" "cron_image_deleter"

Wrapping up

I have not described everything here, but I hope I have covered the general idea well.

There are a few more things we could cover: integration with Slack, more security aspects, provisioning of the Devcloud machines, integration with the docopt library, authentication, DNS, IDE and metrics (Datadog) integration. If you liked this article, we may discuss one of these topics in the future.

As we publish this post, the Backend DevEx team is investigating some alternatives to these tools. The company keeps growing bigger and bigger; we can’t stop it, and we are proud of it.

If you are reading this, thank you for your time, I hope you have learned something valuable!
