Developing an Infrastructure-as-Code (IaC) cybersecurity lab (Part 1)

Build a cybersecurity lab using the latest and greatest from Terraform, Ansible and Packer.

With the cloud, it's easier than ever to build a real production-grade lab. This comes in handy when you need to do a slew of things, from malware detonation to testing new products. In addition, it can lead to the development of amazing capabilities with so many other use-cases, including:

  • Red-team testing
  • Blue-team testing
  • Product-design and testing
  • a lot, lot more…

Often, we do the above in silos, or with non-production data. For example, when you normally do a staged red-team assessment, is there a real domain controller? Is there real authentication traffic from real users who map to specific security groups? Are there Linux machines that may also be domain joined, or at least using Kerberos authentication against Active Directory?

This foundational capability of building a lab that you can quickly blow up, rebuild in minutes and consistently iterate on is a gold mine.

Things you’ll learn

There are numerous technologies that you’ll hit on. If you’re like me, you’ll want to stick to strong idempotent practices, so you can easily reset environments and guarantee the outcomes.
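
To make "idempotent" concrete, here is a minimal Python sketch (not Ansible itself). The helper name `ensure_line` and the sample config lines are hypothetical, chosen only for illustration. The idea is that the function describes a desired end state, so running it a second time changes nothing.

```python
from typing import List, Tuple


def ensure_line(lines: List[str], wanted: str) -> Tuple[List[str], bool]:
    """Return (new_lines, changed) so that `wanted` is present in the result.

    Hypothetical helper: it checks the current state before acting instead
    of blindly appending, which is what makes repeated runs safe.
    """
    if wanted in lines:
        # Desired state already holds: do nothing and report "no change".
        return list(lines), False
    return list(lines) + [wanted], True


config = ["hostname lab-dc01"]  # made-up starting state
config, changed_first = ensure_line(config, "domain lab.local")
config, changed_second = ensure_line(config, "domain lab.local")
print(config, changed_first, changed_second)
```

The first call reports a change and the second does not, which is the same "changed" versus "ok" behavior you want from every step when you reset an environment.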

There are other ways to build this technology, but what will be discussed here is probably the most proven, due to the prevalence of the technologies we will hit on: Terraform, Ansible and Packer.

You can use any Cloud Service Provider, or even use HashiCorp's Packer. For this article, we will be using AWS.

It will be impossible to talk through all these technologies, simply because what you can do with them is somewhat absurd. But what this article will hit on is how we will integrate these technologies. It will hit the parts that I had to spend a lot of time on so hopefully you, the reader, can save yourself time and lower the learning curve (at my expense)!

Stealing/“Re-use” is the best form of flattery

In my past life at Microsoft, I led a development effort called Defend the Flag. This is still used today by groups within Microsoft Defender for Endpoint (MDE), Microsoft Defender for Identity (MDI), Azure Information Protection (AIP) and Microsoft Cloud App Security (MCAS). I'm not bragging; it just shows how powerful this is. It's used to showcase the technology, to drive partner training and even for development testing.

So, if you want to get a sense of what you can do with these designs, head over to DefendTheFlag.

That resource may confuse you, but it's the use-case that should help get the creative juices flowing! I say it may confuse you because, since that was MSFT, it was built purely via Azure Resource Manager (ARM), plus PowerShell and Desired State Configuration (DSC).

Essentially, the better way to do this, one that is cloud-agnostic (to a degree, with some massaging), is Terraform, which replaces ARM, plus Ansible, which replaces DSC.

The crazy thing is that TF abstracts much more than ARM, and adds so much flexibility with providers. In addition, DSC is very confusing: behavior on client vs server isn't guaranteed, and with the split of .NET into .NET Framework and .NET Core, some DSC modules are for one and others for the other. That said, if you're reading this and exploring the best option, know that TF and Ansible are a much more elegant solution with the same results, more flexibility, and less time to realize your outcomes. Even crazier, Ansible has better development and testing workflows, not requiring compilation (into MOF files, as DSC calls them). In addition, it has none of the complexity of .NET Framework vs .NET Core.

Architecture

We need to be able to create a lab purely via code, based on base-images. All modifications should be performed via code. And, when we want to launch a lab, we want to be able to do it quickly, not having to wait 6 hours.

All the above steps aren’t possible via one quick workflow. In fact, we need a few workflows, which build on top of each other. By using Ansible and Terraform, we can do just that and very elegantly.

At a high level, there are 3 phases. This of course can be optimized and customized to your requirements and preferred workflow.

Phase 0

This phase is all about taking provided images and building the most fundamental parts of the lab. For example, if our lab requires Windows Server 2016, take an ISO or a cloud-provided image, and make the minor tweaks that all your downstream phases will require.

At the end, take all your images and build snapshots or AMIs from them. We can then use them for further customization downstream.
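
As a rough sketch of what a Phase 0 image build can look like, the Python below generates a Packer template in Packer's JSON template format (newer Packer versions prefer HCL2) using the `amazon-ebs` builder and the `ansible` provisioner. Everything specific here is a placeholder assumption rather than a value from this lab: the region, the source AMI ID, the instance type, the SSH username, the AMI name and the playbook path. It also assumes a Linux base image; a Windows image would need WinRM communicator settings instead of SSH.

```python
import json


def phase0_template(source_ami: str, ami_name: str) -> dict:
    """Build a minimal Packer template (as a dict) for a phase 0 base image."""
    return {
        "builders": [
            {
                "type": "amazon-ebs",
                "region": "us-east-1",         # assumed region
                "source_ami": source_ami,      # the provided image we start from
                "instance_type": "t3.medium",  # assumed build instance size
                "ssh_username": "ubuntu",      # assumed login user of the base image
                "ami_name": ami_name,          # name of the AMI Packer will create
            }
        ],
        "provisioners": [
            # Minor tweaks that every downstream phase depends on.
            {"type": "ansible", "playbook_file": "phase0.yml"}  # hypothetical playbook
        ],
    }


# Placeholder AMI ID and name, for illustration only.
template = phase0_template("ami-00000000000000000", "lab-phase0-base")
print(json.dumps(template, indent=2))
```

Saved to a file such as `phase0.json`, a template like this would be built with `packer build phase0.json`, and the resulting AMI becomes the starting point for Phase 1.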

Phase 1

This phase takes our phase 0 work and adds additional customization. Perhaps it's where you take the Server 2016 image and add a few roles to it or install an application. This is where the majority of your work will be.

Phase 2

By the time you get to phase 2, it should be all about speed to deploy. Need an application server that was not already provisioned or built earlier? This is not the phase to do that!

Again, phase 2 is about speed to deploy. The last thing you want is for your lab users (or you!) to wait 15 minutes for the lab to be built. If you find tasks that require much more time, there is almost always a way to take that work and bring it into phase 1.
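
One way to reason about what belongs where is to give Phase 2 a time budget and push anything slow back into Phase 1. The sketch below is only an illustration of that idea: the task names, the durations and the 5-minute budget are made-up assumptions, not measurements from this lab.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    name: str
    phase: int      # 0, 1 or 2
    minutes: float  # estimated run time (assumed numbers)


def rebalance(tasks: List[Task], phase2_budget: float) -> List[Task]:
    """Move the slowest phase 2 tasks into phase 1 until phase 2 fits the budget."""
    result = [Task(t.name, t.phase, t.minutes) for t in tasks]  # leave input untouched
    phase2 = sorted((t for t in result if t.phase == 2),
                    key=lambda t: t.minutes, reverse=True)
    total = sum(t.minutes for t in phase2)
    for task in phase2:
        if total <= phase2_budget:
            break
        task.phase = 1  # bake this work into the phase 1 image instead
        total -= task.minutes
    return result


tasks = [
    Task("prepare base image", 0, 40),
    Task("promote domain controller", 1, 25),
    Task("install application server", 2, 15),
    Task("join machines to domain", 2, 3),
    Task("create lab users", 2, 1),
]
plan = rebalance(tasks, phase2_budget=5)

by_phase: Dict[int, List[str]] = {}
for t in plan:
    by_phase.setdefault(t.phase, []).append(t.name)
print(by_phase)
```

In this made-up plan the 15-minute application server install is moved into phase 1, leaving only a few minutes of work at deploy time. Real tasks are not always this movable, so treat the budget as a prompt to look for work that can be baked into an image earlier.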

Going forward

Explore what’s on the web and what you can do. This is just Part 1 of a multi-part blog series. It’s aimed at cybersecurity professionals but could be useful for anyone…

You now have the high-level ingredients to make your very own cyber-lab using many of the same technologies as Cloud Service Providers and Software-as-a-Service vendors.

We will have more details in another blog post.

Feel free to leave me comments. I'm no expert on the matter so please, don't hesitate to reach out and correct me.

Andrew