Java app in the AWS Cloud – deployment and infrastructure management

Date: October 8, 2014

Reading time: 6 min

Author: Marcin Kania

Several months ago we started implementing a solution for a mobile startup. The solution consists of native mobile apps (Android, iOS) and a backend for them. We decided to go to the cloud because in social apps the number of users may grow rapidly. As I mentioned in one of the recent blog posts, the decision to go to the cloud is only the beginning of the tough choices. First, you need to decide on the technology stack, the cloud service model (PaaS, IaaS…) and, of course, the cloud provider. We chose Java (Spring Boot) and the AWS Cloud. In this article I’ll describe our solution as far as deployment and cloud infrastructure management are concerned.

AWS offers several ways of managing infrastructure and the application lifecycle. You can use high-level services such as Elastic Beanstalk (basically an application container, i.e. PaaS) or OpsWorks (hosted Chef under the hood). You can also go deeper and use low-level services such as EC2 and CloudFormation. This gives you more control but also requires more effort. Fortunately, AWS lets you combine both approaches, which is very useful. I’ll go into detail in the following sections.

Elastic Beanstalk

Elastic Beanstalk lets you start quickly. For a Java application you only need to upload a .war file and configure some option settings (machine image, autoscaling, environment variables…), and that’s it. If you need to customize the application container, that’s also possible. In our case, two customizations were necessary. First, we needed to turn on gzip compression, because it’s disabled by default. It wasn’t complicated, because the Java application container actually consists of a Tomcat application server with Apache in front of it. It was only a matter of adding an Apache configuration file that enables mod_deflate.

The second customization was required because in the cloud you don’t get attached to servers: you create, use and terminate them when they are no longer needed, or terminate them and provision new ones when something needs to change. Servers are immutable. Because of that, storing logs on the app servers is not an option; you must store them elsewhere. There are many possible solutions to this problem. The simplest one is to push logs to one of the cloud log aggregators; our choice was Loggly. We needed to customize the EB container, the rsyslog daemon in particular, to push logs to the Loggly endpoint. That was it. Our middleware layer was ready. Deployment was simple, because Elastic Beanstalk handled most of the complexity behind the scenes.
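To make the gzip customization more concrete, here is a minimal sketch of how such a container customization could be expressed as an .ebextensions configuration file, generated with a few lines of Python. This is not our exact configuration: the file name, the Apache config path under /etc/httpd/conf.d/ and the list of MIME types are assumptions for illustration, and they may differ between Elastic Beanstalk platform versions.

```python
import json

# Assumed Apache snippet: compress JSON and text responses with mod_deflate.
# The MIME type list is illustrative; extend it to match what your API returns.
DEFLATE_CONF = "\n".join([
    "<IfModule mod_deflate.c>",
    "  AddOutputFilterByType DEFLATE application/json text/html text/plain",
    "</IfModule>",
    "",
])


def build_ebextension(conf_path="/etc/httpd/conf.d/enable_mod_deflate.conf"):
    """Return an .ebextensions document that writes one Apache config file.

    conf_path is a hypothetical location; check where Apache reads its
    configuration on the platform version you actually run.
    """
    return {
        "files": {
            conf_path: {
                "mode": "000644",
                "owner": "root",
                "group": "root",
                "content": DEFLATE_CONF,
            }
        }
    }


if __name__ == "__main__":
    # .ebextensions files may be written in JSON as well as YAML.
    print(json.dumps(build_ebextension(), indent=2))
```

The printed document would be saved as something like .ebextensions/gzip.config at the top level of the .war file, so that every new instance gets the same configuration when it is created.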

EC2

Sometimes your architecture is simple and typical. In such a case, most likely someone has already solved your problems, and it’s only a matter of finding the right cloud mechanisms and using them. In our case, the application server layer was responsible for handling the REST API’s HTTP requests. It was stateless, so only small modifications of the container were necessary. Because of this we could use Elastic Beanstalk, which solved a lot of problems for us. On the other hand, sometimes it’s beneficial to have more control and set up your own servers manually. One reason could be money: high-level cloud services are often expensive. We decided to deploy our own MongoDB replica set on EC2, because using DynamoDB seemed risky due to its limited query capabilities. We didn’t have to start from scratch, because there are preconfigured MongoDB images available on AWS Marketplace. We modified one such image a bit (to start as a replica set) and prepared a proper backup strategy using EBS snapshots, based on the MongoDB documentation.

VPC

VPC lets you create your own virtual private cloud. The question of whether to use it has become moot, because every AWS account now has a ‘default VPC’. It is worth spending a little more time getting familiar with VPC, because when used properly (private and public subnets in different availability zones) it can significantly improve security and availability. Nice read on this here.
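As an illustration of the "private and public subnets in different availability zones" layout, the sketch below carves a VPC address range into one public and one private subnet per zone. The CIDR block, the /24 subnet size and the zone names are assumptions chosen for the example, not the values we used.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, new_prefix=24):
    """Assign each availability zone one public and one private subnet.

    Subnets are taken in order from the VPC range, so they never overlap.
    """
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = []
    for zone in zones:
        for tier in ("public", "private"):
            plan.append({"zone": zone, "tier": tier, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    # Example values only: a 10.0.0.0/16 VPC spread over two zones.
    for subnet in plan_subnets("10.0.0.0/16", ["eu-west-1a", "eu-west-1b"]):
        print(subnet["zone"], subnet["tier"], subnet["cidr"])
```

In such a layout, load balancers would typically live in the public subnets, while application servers and database nodes stay in the private ones.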

Making it work as a whole

All the bits and pieces I’ve mentioned in the previous sections must be provisioned and configured to work together as one system. What’s more, this is not an exhaustive list. You will probably also need queues (SQS), buckets (S3), alarms (CloudWatch) and so on. Configuration also includes permissions: components must be allowed to talk to one another. There are two ways of granting permissions:

create IAM users for applications and put credentials into the code
assign IAM roles to EC2 machines and assign permissions to roles

I think the second method is tempting because it lets you avoid putting credentials into the code, something I always feel reluctant to do.
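To show what the second method involves, here is a rough sketch of the two JSON documents behind an IAM role for EC2 machines: a trust policy that lets EC2 assume the role, and a permissions policy attached to it. The bucket name, the queue ARN and the selection of actions are hypothetical; a real policy should list only what the application needs.

```python
import json


def trust_policy():
    """Trust relationship: allow EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def permissions_policy(bucket_name, queue_arn):
    """Permissions granted to whatever runs on instances holding the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": "arn:aws:s3:::%s/*" % bucket_name,
            },
            {
                "Effect": "Allow",
                "Action": ["sqs:SendMessage", "sqs:ReceiveMessage",
                           "sqs:DeleteMessage"],
                "Resource": queue_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder names; replace them with your own bucket and queue.
    print(json.dumps(trust_policy(), indent=2))
    print(json.dumps(
        permissions_policy("example-app-uploads",
                           "arn:aws:sqs:eu-west-1:123456789012:example-queue"),
        indent=2))
```

With a role like this attached to the instances, the AWS SDK can pick up temporary credentials from the instance itself, so nothing secret has to be stored in the code or in configuration files.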

Provisioning, configuring, deploying, setting permissions: it looks like a lot of work, especially if you want to leverage the power of the cloud during development and testing, that is, the ability to create and destroy development and testing environments on demand. You could write bash scripts that run aws-cli commands, but what if something goes wrong in the middle of the script? You end up with a lot of garbage to clean up. What’s more, this garbage means unused resources left in the cloud, and unused resources cost money.

CloudFormation

Fortunately, there’s a solution to this problem. Using CloudFormation you can create templates for (almost) all of those things. Interestingly, Elastic Beanstalk is also a CF template under the hood, so you can include an EB resource in your template and CF will run the other template for you. That looks like a decent level of automation. Templates are more or less transactional, so if stack creation fails midway, everything created so far is cleaned up automatically. I’ve prepared a small project on GitHub that demonstrates how all the AWS resources mentioned here (both high-level ones, such as Elastic Beanstalk, and low-level ones, such as security groups) can be created and configured to work together using a CloudFormation template. A good starting point for writing your own templates could be (as it was in our case) to create some resources manually via the AWS console and then generate a template using the CloudFormer tool. The generated template will probably look ugly and require some polishing (CloudFormer is still in beta at the time of this writing), but I think this way is faster than starting from scratch. It’s also a good idea to look at some sample templates if you want to learn some advanced tricks.
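For readers who have not seen a template before, the sketch below assembles a very small one that mixes a high-level resource (an Elastic Beanstalk environment) with low-level ones (an IAM role, an instance profile and a security group). It is not the template from the GitHub project; the logical names, parameters and the MongoDB port rule are assumptions made for this illustration. The solution stack name is left as a parameter because valid values change over time.

```python
import json


def build_template():
    """Return a minimal CloudFormation template as a Python dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative stack: EB environment plus supporting resources",
        "Parameters": {
            "SolutionStackName": {"Type": "String"},
            "VpcId": {"Type": "String"},
            "AppSubnetCidr": {"Type": "String"},
        },
        "Resources": {
            # Role assumed by the application servers (no credentials in code).
            "AppRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "Path": "/",
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [{
                            "Effect": "Allow",
                            "Principal": {"Service": "ec2.amazonaws.com"},
                            "Action": "sts:AssumeRole",
                        }],
                    },
                },
            },
            "AppInstanceProfile": {
                "Type": "AWS::IAM::InstanceProfile",
                "Properties": {"Path": "/", "Roles": [{"Ref": "AppRole"}]},
            },
            # Hypothetical rule: only the app subnet may reach MongoDB.
            "MongoSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "MongoDB access from the app subnet",
                    "VpcId": {"Ref": "VpcId"},
                    "SecurityGroupIngress": [{
                        "IpProtocol": "tcp",
                        "FromPort": 27017,
                        "ToPort": 27017,
                        "CidrIp": {"Ref": "AppSubnetCidr"},
                    }],
                },
            },
            "Application": {
                "Type": "AWS::ElasticBeanstalk::Application",
                "Properties": {"Description": "Backend for the mobile apps"},
            },
            "Environment": {
                "Type": "AWS::ElasticBeanstalk::Environment",
                "Properties": {
                    "ApplicationName": {"Ref": "Application"},
                    "SolutionStackName": {"Ref": "SolutionStackName"},
                    "OptionSettings": [{
                        "Namespace": "aws:autoscaling:launchconfiguration",
                        "OptionName": "IamInstanceProfile",
                        "Value": {"Ref": "AppInstanceProfile"},
                    }],
                },
            },
        },
    }


def find_refs(node):
    """Collect the target of every {"Ref": name} found anywhere in the template."""
    if isinstance(node, dict):
        refs = {node["Ref"]} if set(node) == {"Ref"} else set()
        for value in node.values():
            refs |= find_refs(value)
        return refs
    if isinstance(node, list):
        return set().union(*(find_refs(item) for item in node))
    return set()


if __name__ == "__main__":
    template = build_template()
    declared = set(template["Parameters"]) | set(template["Resources"])
    missing = find_refs(template) - declared
    if missing:
        raise SystemExit("Undeclared Ref targets: %s" % sorted(missing))
    print(json.dumps(template, indent=2))
```

The small `find_refs` check catches a common mistake (a `Ref` pointing at a name that is not declared) before the template is ever sent to AWS. Generating the JSON from code is just one option; the same template can of course be written by hand.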

I’ve described our approach to creating and managing infrastructure in the AWS cloud. If you have interesting experiences with CloudFormation, or a different solution to this problem, please share them in the comments.