Introduction
AWS Beanstalk is an application deployment option provided by AWS. The main goal of this service is to simplify and speed up environment setup. It provides an infrastructure and deployment framework to deliver a web application fast, without any need to manage and provision multiple AWS resources.
Even though this service seems to be a great option for people without any knowledge of AWS, or for somebody who would like to deliver a product fast, it has some hidden obstacles that may slow down your work significantly if you don't know how to solve them. Also, provisioning the resources that Beanstalk manages is not that difficult and can easily be done with Terraform or CloudFormation. The deployment process can be automated with the highly customizable CodeDeploy or ECS, so maybe it's worthwhile to spend more time on creating infrastructure that is fully managed by Terraform or CloudFormation rather than using Beanstalk?
In this post I would like to present some obstacles that you may encounter when you choose Beanstalk as your deployment solution. In the next part I will present a similar deployment solution using CodeDeploy. I hope that after reading this article it will be easier for you to choose the best solution for your use case.
Provisioning
An AWS Beanstalk environment can be created and managed with a dedicated CLI tool called EB CLI. It's an interactive tool that can be used to configure, monitor, update and clone Beanstalk environments. The downside of this tool is that it's not designed to work in non-interactive mode (for example as part of a Jenkins pipeline).
The better option is to use CloudFormation or Terraform. With these Infrastructure as Code tools you can automate the provisioning of a Beanstalk environment.
Source of main.tf describing a Python 3.8 environment: HERE
Source of beanstalk.yaml: HERE
In both cases version_label (or VersionLabel in CloudFormation) is not provided, so during the first deployment Beanstalk will run a sample application. There is no need to provide any application code in this phase.
As the provided platforms may become deprecated and finally removed, it's worthwhile to fetch the newest one during the infrastructure update. In Terraform it can be done with:
data "aws_elastic_beanstalk_solution_stack" "python-stack" {
  name_regex  = "^64bit Amazon Linux 2 (.*) running Python 3.8$"
  most_recent = true
}
CloudFormation doesn't have such a feature, but the solution stack name can be provided as a stack parameter:
# Returns a list of the available solution stack names, with the public version first and then in reverse chronological order.
SOLUTION_STACK=$(aws elasticbeanstalk list-available-solution-stacks \
| jq -r '.SolutionStacks | map(select(test("64bit Amazon Linux 2 (.*) running Python 3.8"))) | .[0]')
aws cloudformation deploy --template-file cloudformation/beanstalk.yaml \
--stack-name "demo-app-stack" \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
--parameter-overrides SolutionStackName="${SOLUTION_STACK}"
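For pipelines that are already written in Python, the same lookup can be sketched without jq. This is only an illustration: pick_solution_stack is a hypothetical helper name, and it relies on the assumption stated in the comment above, namely that the list returned by the API is ordered newest first. With Boto3 the input list would come from boto3.client("elasticbeanstalk").list_available_solution_stacks()["SolutionStacks"].

```python
import re

# Same pattern as in the Terraform data source above.
PYTHON_38_PATTERN = r"^64bit Amazon Linux 2 (.*) running Python 3.8$"


def pick_solution_stack(stack_names, pattern=PYTHON_38_PATTERN):
    """Return the first stack name matching the pattern, or None.

    Assumes stack_names is ordered newest first, so the first match
    is also the most recent platform version.
    """
    regex = re.compile(pattern)
    for name in stack_names:
        if regex.search(name):
            return name
    return None
```

Returning None instead of raising lets the caller decide whether a missing platform should fail the pipeline.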
If your application uses a different health check path than the default one, it can easily be provided as part of the OptionSettings block (or setting blocks in Terraform):
OptionSettings:
  - Namespace: "aws:elasticbeanstalk:application"
    OptionName: "Application Healthcheck URL"
    Value: "/status/health"
  - Namespace: "aws:elasticbeanstalk:environment:process:default"
    OptionName: "HealthCheckPath"
    Value: "/status/health"
Beanstalk will not fail during the first deployment (although it used to) if the health check path is different from the one used by the sample application.
Another option is to set it using the .ebextensions folder by creating a file with the .config extension.
Source of .ebextensions/app-health-check.config:
option_settings:
  "aws:elasticbeanstalk:application":
    "Application Healthcheck URL": "HTTP:80/status/health"
  "aws:elasticbeanstalk:environment:process:default":
    "HealthCheckPath": "/status/health"
This file will set the listed environment properties during deployment. It doesn't collide with either Terraform or CloudFormation: neither will detect any changes, even if we update environment properties using .config files.
When the infrastructure is provisioned, a new version can be deployed using the AWS CLI:
ACCOUNT_ID=$(aws sts get-caller-identity | jq -r '.Account')
REGION=$(aws configure get region)
BUCKET_NAME="demo-app-bucket-${REGION}-${ACCOUNT_ID}"
TIMESTAMP=$(date +%s)
LABEL="app-${TIMESTAMP}"
FILE_NAME="${LABEL}.zip"
aws s3 cp latest.zip "s3://${BUCKET_NAME}/${FILE_NAME}"
aws elasticbeanstalk create-application-version --application-name "demo-app" \
--version-label "${LABEL}" \
--source-bundle S3Bucket="${BUCKET_NAME}",S3Key="${FILE_NAME}"
aws elasticbeanstalk update-environment --application-name "demo-app" \
--environment-name "demo-app-env" \
--version-label "${LABEL}"
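The same three steps can also be sketched in Python with Boto3. This is a minimal sketch, not the only way to do it: deploy_version is a hypothetical name, and the two clients are passed in as arguments (in real use they would be boto3.client("s3") and boto3.client("elasticbeanstalk")) so that the function can be exercised without AWS access.

```python
import time


def deploy_version(s3_client, eb_client, bundle_path, bucket_name,
                   application_name, environment_name, label=None):
    """Upload a bundle, register it as an application version and deploy it.

    Mirrors the CLI steps above: s3 cp, create-application-version,
    update-environment. Returns the version label that was used.
    """
    # Same labelling scheme as the shell script: app-<unix timestamp>.
    label = label or f"app-{int(time.time())}"
    key = f"{label}.zip"

    s3_client.upload_file(bundle_path, bucket_name, key)
    eb_client.create_application_version(
        ApplicationName=application_name,
        VersionLabel=label,
        SourceBundle={"S3Bucket": bucket_name, "S3Key": key},
    )
    # Like the CLI command, this only triggers the update.
    eb_client.update_environment(
        ApplicationName=application_name,
        EnvironmentName=environment_name,
        VersionLabel=label,
    )
    return label
```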
update-environment only triggers the update. To wait until the deployment is finished you can use:
aws elasticbeanstalk wait environment-updated --application-name "demo-app" --environment-names "demo-app-env"
But the problem with this command is that it has a fixed timeout:
It will poll every 20 seconds until a successful state has been reached.
This will exit with a return code of 255 after 20 failed checks.
Twenty checks at 20-second intervals give roughly 7 minutes, which is sometimes not enough, so I would recommend writing your own script using the Python Boto3 Beanstalk client or another scripting language that offers an AWS SDK.
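A custom waiter along those lines could look like the sketch below. It is an illustration under a few assumptions: wait_for_environment and its parameters are hypothetical names, the client is expected to be boto3.client("elasticbeanstalk"), and only the environment Status is checked, like the built-in waiter does. A stricter script would probably also compare VersionLabel and Health to detect a rollback.

```python
import time


def wait_for_environment(eb_client, application_name, environment_name,
                         timeout_seconds=1800, poll_seconds=20,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll describe_environments until the environment is Ready.

    Returns the environment description, or raises TimeoutError when
    timeout_seconds pass without reaching the Ready status.
    """
    deadline = clock() + timeout_seconds
    while True:
        response = eb_client.describe_environments(
            ApplicationName=application_name,
            EnvironmentNames=[environment_name],
        )
        environments = response["Environments"]
        if environments and environments[0]["Status"] == "Ready":
            return environments[0]
        if clock() >= deadline:
            raise TimeoutError(
                f"{environment_name} not Ready after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

The sleep and clock arguments exist only to make the function testable without real waiting. In a pipeline you would call it right after update-environment with a timeout that matches your slowest deployment.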
YAML parser
You can provision additional resources by placing .config files with CloudFormation code inside the .ebextensions directory. If you are familiar with CloudFormation, you know that such code is totally fine:
GroupName: !Ref AWSEBSecurityGroup
But for some reason it will fail during version deployment with the following error:
The configuration file .ebextensions/enable-ssh.config in application version app-1662191692 contains invalid YAML or JSON.
YAML exception: Invalid Yaml: could not determine a constructor for the tag !Ref in 'reader',
line 9, column 18: GroupName: !Ref AWSEBSecurityGroup ^ ,
JSON exception: Invalid JSON: Unexpected character (R) at position 0.. Update the configuration file.
To solve this problem the line should be changed to:
GroupName:
  Ref: AWSEBSecurityGroup
It may also fail for other functions like !Sub or !GetAtt, so it's better to use the full form (Fn::Sub, Fn::GetAtt).
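To catch this before a deployment fails, a pipeline could scan the .config files for short-form tags. The check below is a naive, regex-based sketch (find_short_form_tags is a hypothetical name, and it does not parse YAML, so tags inside comments or block scalars may be reported as well):

```python
import re

# A "!Tag" that starts a value: at line start or after ":", "-", "[" or ",".
SHORT_FORM_TAG = re.compile(r"(?:^|[:\-\[,])\s*!(\w+)")


def find_short_form_tags(config_text):
    """Return (line_number, tag) pairs for every short-form tag found."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        for match in SHORT_FORM_TAG.finditer(line):
            findings.append((number, "!" + match.group(1)))
    return findings
```

An empty result means no short-form tags were spotted. Any finding points at a line that should be rewritten to the full form before the bundle is uploaded.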
AutoScaling Group Metrics
An AutoScaling group provides free scaling metrics that are disabled by default and can be enabled using the MetricsCollection property of AWS::AutoScaling::AutoScalingGroup. In Beanstalk the AutoScaling group is created and managed by the environment, so modification is possible only by using .ebextensions. You can modify any existing resource by creating a .config file with a resource that has the same name as one listed HERE.
These metrics can be very useful especially if you plan to run some performance tests.
Metrics can be configured for the AutoScaling group with the following .config file.
Source of .ebextensions/asg-metrics.config:
Resources:
  AWSEBAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MetricsCollection:
        - Granularity: "1Minute"
          Metrics:
            - "GroupMinSize"
            - "GroupMaxSize"
            - "GroupDesiredCapacity"
            - "GroupInServiceInstances"
The resource name must exactly match the one listed in the documentation. For this example it's AWSEBAutoScalingGroup.
Beanstalk will merge this configuration part with the original template managed by the environment.
Security Group update
For testing purposes (and only for testing! Don't allow any SSH traffic on a production environment!) you may want to have SSH access to your instances. AWS Beanstalk creates a Security Group that allows incoming traffic only from the Load Balancer.
It's possible to override the Security Group for the environment using the SecurityGroups option from the aws:autoscaling:launchconfiguration namespace, but there is a simpler solution: by placing a file with the .config extension in the .ebextensions folder of your application zip archive, you can create an ingress rule that allows SSH traffic from anywhere.
Source of beanstalk_ebextensions/.ebextensions/enable-ssh.config:
Resources:
  SshIngressRule:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      CidrIp: "0.0.0.0/0"
      FromPort: 22
      ToPort: 22
      IpProtocol: "tcp"
      GroupName:
        Ref: AWSEBSecurityGroup
It will create a new resource that will be connected to the existing Security Group named AWSEBSecurityGroup, created by Beanstalk.
Logs rate limit
Beanstalk comes with the default configuration files for journald and rsyslog. Because of that, both programs use default log rate limits that are quite low: 1000 logs within 30 seconds for journald and 20000 logs within 10 minutes for rsyslog. Even a small application can easily go over these limits. When the limit is reached, new logs are dropped. You can verify it by checking the journald and rsyslogd logs.
Result of journalctl -u systemd-journald:
Sep 03 14:15:56 ip-172-31-42-178.eu-west-1.compute.internal systemd-journal[1142]: Suppressed 8749 messages from /system.slice/web.service
Result of journalctl -u rsyslog:
Sep 03 16:10:47 ip-172-31-38-241.eu-west-1.compute.internal rsyslogd[4712]: imjournal: begin to drop messages due to rate-limiting
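To get a feeling for how many logs are actually lost, the "Suppressed N messages" lines from journald can be summed per unit. The snippet below is a small sketch that assumes the message format shown in the journald output above. suppressed_by_unit is a hypothetical helper, and the lines would come from the journalctl output saved to a file or piped to the script.

```python
import re

# Format taken from the systemd-journald line shown above.
SUPPRESSED = re.compile(r"Suppressed (\d+) messages from (\S+)")


def suppressed_by_unit(journal_lines):
    """Sum the number of suppressed messages per systemd unit path."""
    totals = {}
    for line in journal_lines:
        match = SUPPRESSED.search(line)
        if match:
            count, unit = int(match.group(1)), match.group(2)
            totals[unit] = totals.get(unit, 0) + count
    return totals
```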
Beanstalk doesn't offer any environment option to change the limits, but it can be done with an .ebextensions config file, as described HERE.
Source of .ebextensions/log-rate-limit.config:
files:
  "/etc/systemd/journald.conf":
    owner: root
    group: root
    mode: "000644"
    content: |
      [Journal]
      RateLimitInterval=30s
      RateLimitBurst=20000
  "/etc/rsyslog.d/rate-limit.conf":
    owner: root
    group: root
    mode: "000644"
    content: |
      $imjournalRatelimitInterval 30
      $imjournalRatelimitBurst 20000
commands:
  restart_journald:
    command: systemctl restart systemd-journald
  restart_rsyslog:
    command: systemctl restart rsyslog
Using this config, Beanstalk will create /etc/systemd/journald.conf and /etc/rsyslog.d/rate-limit.conf with the specified content. In this example the limit is set to 20000 logs within 30 seconds for both services.
Config file sections are executed in this order:
- packages
- groups
- users
- sources
- files
- commands
- services
- container_commands
Because commands run after files, the same config file can be used to restart both journald and rsyslog with systemctl restart in the commands section, so that they pick up the new settings.
Required VPC Endpoints
To use Beanstalk in private subnets without a NAT Gateway, the following VPC endpoints are required:
Gateway:
com.amazonaws.${AWS::Region}.s3
Interface:
com.amazonaws.${AWS::Region}.cloudformation
com.amazonaws.${AWS::Region}.elasticbeanstalk-health
com.amazonaws.${AWS::Region}.elasticbeanstalk
com.amazonaws.${AWS::Region}.sqs
Interface endpoints should be provisioned in every subnet that is used to run EC2 instances. These subnets are configured using the Subnets environment property from the aws:ec2:vpc namespace.
The SQS endpoint may look odd on this list, but it's required by the cfn-hup script. Without it, the environment will not start.
It's also worth mentioning that Beanstalk uses the AWS::CloudFormation::WaitCondition resource to wait for a signal from the EC2 instance. As described HERE, WaitCondition uses a presigned S3 URL to signal success, so the S3 endpoint policy should allow the s3:PutObject action on arn:aws:s3:::cloudformation-waitcondition-${AWS::Region}/*
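As an illustration, the statement for the wait condition bucket could be generated like this. It is a sketch under assumptions: the function names and the Sid are made up for this example, and the resulting policy covers only the WaitCondition signal. Any other S3 access your environment needs through the endpoint (for example downloading the application bundle) has to be added as extra statements.

```python
import json


def waitcondition_statement(region):
    """Build the statement that lets instances signal a WaitCondition."""
    return {
        "Sid": "AllowCloudFormationWaitConditionSignal",  # hypothetical Sid
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": f"arn:aws:s3:::cloudformation-waitcondition-{region}/*",
    }


def endpoint_policy(region, extra_statements=()):
    """Wrap the statement (plus any extras) in a policy document as JSON."""
    document = {
        "Version": "2012-10-17",
        "Statement": [waitcondition_statement(region), *extra_statements],
    }
    return json.dumps(document, indent=2)
```

The returned JSON string can be passed as the policy document of the S3 gateway endpoint in Terraform or CloudFormation.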
Sample application
Beanstalk sample applications, which are deployed when version_label is not specified, require an internet connection. The Python application uses pip install -r requirements.txt to download dependencies during deployment, and the Java application uses a build phase to run Maven, which tries to download some plugins (and fails). I didn't check all available platforms, but it may be a common problem. Because of that, you have to deploy the initial version yourself using the version_label property, example HERE.
I would suggest decoupling infrastructure updates from new version deployments, so version_label changes should be ignored:
lifecycle {
  ignore_changes = [version_label]
}
As CloudFormation doesn't have such an ignore feature, it needs to be used as both the infrastructure and the version deployment tool.
To do so, you can have a separate CloudFormation stack with the AWS::ElasticBeanstalk::Environment resource and pass the application version, previously created with the aws elasticbeanstalk create-application-version CLI command, as a stack parameter.
Custom Image
If your customization cannot be done using .ebextensions (as it requires an internet connection to download the application bundle), or you want to install additional packages, for example the New Relic Agent, without slowing down deployments, you may use the ImageId environment property from the aws:autoscaling:launchconfiguration namespace and create a custom image using Packer. The image can be based on an existing Beanstalk platform.
Keep in mind that images don't have any lifecycle policy, so you have to remove old images yourself (for example with a Lambda function triggered by a cron schedule).
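The core of such a cleanup Lambda could look like the sketch below. It assumes that your Packer images share a name pattern and that you want to keep the few newest ones. The function names, the name pattern and the keep count are all hypothetical, and the client is expected to be boto3.client("ec2").

```python
def images_to_remove(images, keep=3):
    """Return the IDs of all images except the `keep` newest ones.

    CreationDate is an ISO 8601 string in the EC2 API, so sorting the
    strings also sorts the images chronologically.
    """
    newest_first = sorted(
        images, key=lambda image: image["CreationDate"], reverse=True
    )
    return [image["ImageId"] for image in newest_first[keep:]]


def remove_old_images(ec2_client, name_pattern, keep=3):
    """Deregister old self-owned images whose name matches name_pattern."""
    response = ec2_client.describe_images(
        Owners=["self"],
        Filters=[{"Name": "name", "Values": [name_pattern]}],
    )
    removed = images_to_remove(response["Images"], keep)
    for image_id in removed:
        ec2_client.deregister_image(ImageId=image_id)
    return removed
```

Note that deregistering an image does not delete its EBS snapshots, so a complete cleanup would need to remove those as well. Before deregistering, it is also worth checking that no environment still references the image.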
NOTE: even if you provide a custom image, the SolutionStackName property is still required. It's worth fetching the newest one using the Terraform aws_elastic_beanstalk_solution_stack data source or providing it as a CloudFormation stack parameter, as described earlier. Solution stacks may become deprecated and finally removed, so your build pipeline may fail if you hardcode a specific version.
Conclusion
The title of this blog post is kind of clickbait. AWS Beanstalk is not THAT bad, and it offers multiple customization options, including a custom image. You can customize it with config files, update existing infrastructure or even override some /etc config files.
The downside is that everything mentioned (except the custom image) is a part of the deployment process. All config files have to be downloaded and processed, and every scale-out event requires performing these operations on the newly provisioned instance. Also, it mixes infrastructure management with the deployment process: you need to check both the Infrastructure as Code configuration AND .ebextensions to have a consistent view of your environment configuration.
If Beanstalk works fine for you after applying the presented tweaks, use it and enjoy. If you are looking for a more advanced and customizable solution, check the next part of this series, where I will present how to create an environment using CodeDeploy that may look similar to the one that Beanstalk provides.