Are you a developer looking to get into the cloud? Are you a sysadmin already managing multiple servers? Are you a client or a business owner who feels held back by infrastructure changes and so unable to get to market quickly? Are you still not running your product/application/software on the cloud? Do you have problems with different versions of tools being used in different environments? Then this article is for you. Read on.
Gone are the days.
Gone are the days when you log in to AWS/Azure/GCP or any other cloud platform and perform infrastructure changes manually (a.k.a. ClickOps). Keep notes of the changes made so you won't forget (surely a couple of steps will be missed). Call someone in the middle of the night to get that secret/key/password so that you can perform some action. Manage an Excel sheet where all the details are stored. All of this just takes a lot of time, going back and forth multiple times. Simply put, it's increased frustration and decreased team efficiency. If you are still doing this, you and your team are missing out on a lot.
The Iron Age of IT.
There are still many businesses out there that are not even on the cloud. This could be due to many reasons, like legacy architecture, financial reasons or just this thought: "If it works, don't touch it". Such teams will still have physical hardware plugged into a socket and one or more teams configuring everything manually.
The Flow.
Here are a few things that would happen:
Ordering the server
Praying for on-time delivery
Installing the OS, patches, software and other tools
Creating firewalls, providing access, opening ports
Creating networks and routing
And many more. All this for one server.
Probably It's OK.
Not every application needs to be moved to the cloud or needs IaC. If the application has a few servers and simple infrastructure that can be managed by a small team of admins, I guess it's fine, provided they know how to pull it off while keeping costs low and keeping the end users happy.
But what could go wrong.
Here are a few things:
Human effort
Larger teams
More testing and quality checks
Human mistakes and associated rework
Dependency on key individuals
All these things are just for one server. Imagine doing it for 5 servers, or 20. And oh, you need infrastructure maintenance as well.
Much bigger problems.
The ones I mentioned are the smaller problems. For medium to large companies, there are much bigger problems to worry about, like:
On-Demand Scaling
Cost-Effective Infrastructure
Security
Compliance
Consistency, etc.
IaC.
The solution to many of these problems is IaC: Infrastructure as Code. IaC is a concept, a fundamental shift in how we provision and maintain infrastructure compared to how we have done things historically, right from the Iron Age of IT. It's probably not the silver bullet we need, but it makes life much easier. Let's look at it in detail next.
Infrastructure As Code.
Infrastructure - Hardware, Software, Tools, Security, etc.
As - Represented as.
Code - Human-Readable and Maintainable Code.
Simply put, it's the process of managing and provisioning your IT infrastructure using code and automation.
Why Code?
When we do something as code, we automatically get many benefits. To name a few:
Version Control.
Code Review / PRs.
Quality Control.
Automation.
Reduced Manual Work, saving time.
Stability, High Availability and therefore higher profits.
Consistency.
The Magic of Automation.
Imagine creating an EC2 instance on AWS from scratch. You need to create multiple resources (subnets, gateways, security groups, etc.), configure them and make sure everything works. All these activities take a lot of time. What if you want to create 100 such instances?
Repeated tasks done by humans invite mistakes and so need more quality control and supervision. After all, we are human. This is precisely why we have machines.
Some of you may be thinking: wait, I can create a template once and use that template to spin up more instances. Yes, you can, but it's still a manual process. Unless your answer involves CloudFormation (an IaC tool specific to AWS) or Azure Resource Manager (an IaC tool specific to Azure), you are still doing it wrong. But what if your company or product needs to support multi-cloud, hybrid environments? CloudFormation alone won't cut it. So read on.
Open Source.
What if I told you there is a better open-source alternative for achieving IaC? 100 brownie points if you guessed it right. Yes, I'm talking about the HashiCorp suite of products, especially Terraform. Shout out to the folks at HashiCorp for the amazing work and for making it all open source.
HashiCorp Products.
Here are some of the products from HashiCorp that I have been learning, loving and implementing over the last few months:
Terraform
Packer
Vault
Consul
Vagrant (and there are some more).
Two other things I love about HashiCorp are:
HCL - HashiCorp Configuration Language
Documentation
HCL.
Whether you want to write code for Packer, Terraform, Vault, etc., you only need to learn one configuration language: HCL. It's a very simple declarative language and can easily be learnt in under two days (provided you know any other programming language).
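To give you a taste of the syntax, here is a tiny HCL snippet. The block and attribute names below are made up purely for illustration, not taken from any real tool:

```hcl
# A block has a type, optional labels and a body of attributes.
# All names here are hypothetical, just to show the shape of HCL.
resource "example_server" "web" {
  name  = "web-01"          # string attribute
  count = 2                 # number attribute
  tags  = ["demo", "blog"]  # list attribute

  network {                 # nested block
    port = 8080
  }
}
```

Blocks, labels and key = value attributes: that is most of what you need to know.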
Documentation.
I love how neat the HashiCorp documentation is and how consistent it is across so many of their tools. It made my learning enjoyable.
Using these tools.
As a newbie, I learnt Terraform first, then Packer, then Ansible (not a HashiCorp product, but yet another amazing open-source tool) and then Vault. Initially, I was not aware of what was used where. This is how I learn, and I'm OK with it. If I were to teach someone IaC today, here is the order in which I would tell them to start:
Packer.
Terraform.
Vault.
These three are good to start with, in this order. Let's look at the tools one by one now.
Packer.
As the name suggests, Packer is a tool to create an image that can be used in a multi-cloud, hybrid environment, all using code.
Packer creates identical machine images for multiple platforms from a single source configuration.
With Packer, an engineer can "pack" everything that is needed on the platform: which operating system to install on the cloud, the version of the operating system, the tools (runtimes, SDKs, frameworks, databases) to be provisioned or installed, compliance tools and necessary security patches, and even post-boot processing/setup. All of this can be configured via code.
Using Packer.
Packer code is written in HCL inside a .pkr.hcl file. It's usually good practice to start with a base image of Linux, Windows or macOS and then customize it to your needs. To install tools/patches we use provisioners, and boy, there is a variety of them for every need. This is not a Packer tutorial, so I won't get into more detail here.
Three Main Blocks.
Packer
Source
Build
Let's look at each one of them.
Packer Block
This is the block where we specify which version of the Packer tool we want to use and mention the required plugins.
Source Block
This is the block Packer uses as a starting point. It can be a Linux AMI or any other image that can act as a base.
Build Block
This is the block where Packer does most of its magic, provisioning software/tools in a variety of ways, like executing shell scripts, copying files or running Ansible playbooks.
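Putting the three blocks together, here is a minimal sketch of a .pkr.hcl file that bakes an AWS image, assuming the Amazon plugin and an Ubuntu base AMI. The AMI ID, image name and installed packages are all illustrative:

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.0.0"
    }
  }
}

locals {
  # A sortable timestamp to keep image names unique.
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

# The base image Packer starts from.
source "amazon-ebs" "ubuntu" {
  ami_name      = "my-app-image-${local.timestamp}"  # illustrative name
  instance_type = "t2.micro"
  region        = "us-east-1"
  source_ami    = "ami-0123456789abcdef0"            # illustrative Ubuntu base AMI
  ssh_username  = "ubuntu"
}

# Where the magic happens: provision tools on top of the base.
build {
  sources = ["source.amazon-ebs.ubuntu"]

  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",  # illustrative tool to bake in
    ]
  }
}
```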
Packer Output.
The output of Packer is always an image. This image can now be used by Terraform to create the actual "multi-cloud hybrid" infrastructure. How cool is that?
Terraform.
Now that Packer has given us an image, we use that image to create the actual infrastructure. This is where Terraform comes in. Of course, all of this with code and automation.
Terraform is an infrastructure as code tool that lets you build, change, and version infrastructure safely and efficiently.
Terraform has "providers" which call the APIs exposed by cloud platforms (AWS, Azure, GCP, etc.) to create the actual infrastructure. So, as long as the APIs exposed by the cloud provider work (which they always should), we are safe to use Terraform.
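As a minimal sketch, wiring up a provider looks like this (the region is illustrative):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

# The provider translates our resource declarations into AWS API calls.
provider "aws" {
  region = "us-east-1"  # illustrative region
}
```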
Using Terraform.
Terraform code is again written in (no prizes for guessing) HCL. We create .tf files and declare what resources should be created as part of our infrastructure. When I say resources, that could be compute instances, networks or even a complete Software-as-a-Service suite. Everything is a resource in Terraform.
Please note that the Packer output image is only one part of the infrastructure. There are many other moving parts that are also needed. Security groups, networking, load balancers and Elastic IPs are some examples (if these terms are not making sense, that is fine; you just need to read more about AWS or similar cloud providers and you will understand them better).
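Here is a minimal sketch of a .tf file that launches an instance from a Packer-built AMI, with a security group as one of those moving parts. The AMI ID and names are illustrative:

```hcl
# One of the "other moving parts": a security group allowing inbound HTTP.
resource "aws_security_group" "web_sg" {
  name        = "web-sg"
  description = "Allow inbound HTTP"

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# The compute instance, created from the image Packer produced.
resource "aws_instance" "web" {
  ami                    = "ami-0abc1234def567890"  # illustrative: the Packer output AMI
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web_sg.id]

  tags = {
    Name = "web-01"
  }
}
```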
Terraform Stages.
Terraform works in three stages:
Define - The expected infrastructure to be created. Basically, the code.
Plan - Terraform plans how to create it. Consider this like a blueprint.
Apply - This is where the actual infrastructure is created.
Pro tip for beginners.
The thing about Terraform is that you just need to declare which resources you need via code, and that code can be in one file or a bunch of files. Terraform will review what is needed and create an ordered plan. This plan determines which resources should be created first and which after. For a newbie, this can be a little scary; it takes a while to figure out. Remember to go through the terraform plan output, and things will get better with time.
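The ordering comes from references between resources. In the sketch below (names and CIDR ranges are illustrative), the subnet refers to the VPC's ID, so Terraform knows it must create the VPC first:

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Referencing aws_vpc.main.id creates an implicit dependency:
# Terraform will always create the VPC before this subnet.
resource "aws_subnet" "app" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}
```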
Terraform Output.
The output of Terraform is always infrastructure on the cloud. When you reach the point where you can spin up infrastructure as code, you will also need multiple secrets, like private keys, cloud credentials, etc. Since Terraform code lives in Git or similar version control software, we cannot commit secrets with our code. So how do we deal with this? Here is where Vault comes in (of course, there are other options).
HashiCorp Vault.
Vault is an open-source tool for managing secrets and their lifecycle. Lifecycle as in: not just storing (static) secrets, but creating dynamic secrets, sharing/leasing them and revoking/destroying them in real time. Vault also provides encryption services. Isn't that amazing?
Secure, store and tightly control access to tokens, passwords, certificates, encryption keys for protecting secrets and other sensitive data using a UI, CLI, or HTTP API.
Consider Vault a central location to store secrets that can be accessed by Packer and Terraform to create infrastructure, by your Node or Java server for DB passwords, API keys, etc., or even by your CI/CD pipeline for credentials, environment secrets and so on. Now that is one powerful tool.
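As a minimal sketch, here is how Terraform can read a secret from Vault using the Vault provider's generic secret data source. The mount path and key name are illustrative:

```hcl
provider "vault" {
  # Address and token typically come from the VAULT_ADDR
  # and VAULT_TOKEN environment variables.
}

# Read a secret stored at an illustrative path.
data "vault_generic_secret" "db" {
  path = "secret/myapp/database"
}

# Use the secret elsewhere in the configuration, e.g. as a DB username.
output "db_user" {
  value     = data.vault_generic_secret.db.data["username"]
  sensitive = true
}
```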
Client Server Model.
Vault follows a client-server model. There is an administration process, usually performed by admins, who securely store secrets and apply policies to control who has access to these secrets. The place where the secrets reside is called the Vault server. Clients (users or code) can access these secrets, after authentication, by connecting to the Vault server.
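Fittingly, those policies are also written in HCL. Here is a minimal sketch of a policy granting read-only access to secrets under an illustrative path:

```hcl
# Allow clients with this policy to read and list secrets
# under this (illustrative) KV v2 path.
path "secret/data/myapp/*" {
  capabilities = ["read", "list"]
}
```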
APIs or CLIs or UI.
Everything we do with Vault is done via RESTful API calls and always involves a path. Another way to access Vault (as admin or client) is via the CLI. A third way is via the UI. I love using the CLI for the majority of my work.
Stitching them all.
To summarize, we use Packer to create an image, which is then given to Terraform so that it can create the actual infrastructure. Both tools reach out to Vault if they need any secrets to complete the task.
Closing thoughts.
I hope this article inspires your Infrastructure As Code journey. If you ever need help with any of the topics above, please feel free to reach out to me here or on Twitter.
Good luck, and oh, happy new year 2023 :)