An Introduction to Cloud Security for Infosec Professionals
As someone who has spent a long time in network and endpoint security and then moved to cloud security, I can sympathize with people with security backgrounds who want to learn more about the cloud and cloud security concepts. AWS, EC2, CMK, KMS, IAM, SQS, etc.? It can seem like a big alphabet soup of unfamiliar acronyms. And lots of questions come up. How can I know whether a cloud provider encrypts a service by default or if I must specify it? What is the difference between a queue and a topic? Does CMK stand for customer-managed key or customer master key?
When it comes to understanding cloud security, three key points can be helpful:
- Everything is programmable
- IAM is the new network
- Cloud vulnerabilities seem simple but are actually complex
After talking about these three points, I provide some tips on how to get started in learning about cloud security.
One phrase that describes the cloud is, “The cloud is software-defined everything.” Cloud services ultimately run in physical data centers somewhere, but for practically all cloud users the hardware is abstracted away behind APIs. You won’t need to enter the data center anymore to rack and stack security appliances or get console access.
Access to cloud services happens in a web browser or via APIs. On-premises security tools have long supported browser access, but you have to deal with scores of different vendor interfaces, most of which are inconsistent with each other. By contrast, all of a cloud provider’s services are set up and configured in that provider’s single console, whether it be for AWS or other providers like Microsoft Azure or Google Cloud.
Everything in the cloud is software-defined, so you can create any kind of resource within minutes, whereas it may take weeks or months to rack and stack the equivalent physical hardware. Do you want to stand up a monster server on Azure with 96 CPUs, 384 GB of memory, 3.6 TB of SSD storage, and 8 NICs with 35 Gbps of network throughput? No problem, just choose the Standard_D96d_v5 virtual machine. Do you need to spin up a cluster of thousands of server nodes? Google’s GKE service can create a Kubernetes cluster of up to 15,000 nodes. Of course, just be sure to watch that monthly cloud provider bill! You could rack up thousands of dollars a month without even realizing it.
What’s more, the cloud provider console in a web browser is only the tip of the iceberg. If you REALLY want to access the power of the cloud, get familiar with a dedicated language for defining and configuring resources. This type of language is known as “Infrastructure as Code” (IaC). IaC uses cloud provider APIs to create and change resources. Numerous IaC tools exist but one of the most well known is Terraform. It is an open source tool that supports many different cloud platforms. A single Terraform template could easily create 100 or 1000 virtual machines.
Just like developers have a software development lifecycle (SDLC) for application software, they have one for IaC. The best way to deploy IaC securely is to apply “policy as code” prior to deployment. Policy as code expresses your security policies as code and ensures your IaC configurations are secure. If you’re interested in learning more, check out Open Policy Agent, an open source standard for policy as code and a project of the Cloud Native Computing Foundation.
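To make the idea of policy as code more concrete, here is a minimal sketch in plain Python rather than in Open Policy Agent’s own policy language. The resource dictionaries, their field names (type, name, encrypted, public), and the two rules are all hypothetical and chosen only for illustration; a real tool would evaluate the actual IaC template or plan.

```python
from typing import Callable, Optional

# Hypothetical, simplified view of resources parsed from an IaC template.
Resource = dict
Rule = Callable[[Resource], Optional[str]]


def require_encryption(resource: Resource) -> Optional[str]:
    """Flag storage buckets that are not encrypted."""
    if resource["type"] == "storage_bucket" and not resource.get("encrypted", False):
        return f"{resource['name']}: bucket is not encrypted"
    return None


def forbid_public_access(resource: Resource) -> Optional[str]:
    """Flag any resource marked as publicly accessible."""
    if resource.get("public", False):
        return f"{resource['name']}: resource is publicly accessible"
    return None


RULES = [require_encryption, forbid_public_access]


def evaluate(resources, rules=RULES):
    """Run every rule against every resource and collect the violations."""
    violations = []
    for resource in resources:
        for rule in rules:
            message = rule(resource)
            if message is not None:
                violations.append(message)
    return violations


# Illustrative input: what a template intends to deploy.
planned = [
    {"type": "storage_bucket", "name": "logs", "encrypted": True, "public": False},
    {"type": "storage_bucket", "name": "reports", "encrypted": False, "public": True},
]

violations = evaluate(planned)
for violation in violations:
    print(violation)
```

A deployment pipeline could refuse to deploy whenever the list of violations is non-empty, which is the essence of checking IaC before it reaches the cloud.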
IAM (Identity and Access Management) is the cloud service that defines access control for all users and applications to all resources. Basically, “who” can access something, “what” is being accessed, and the nature of the access (read-only, read/write, list objects but not read them, etc.).
This is why “IAM is the new network.” In the programmable world of the cloud, everything is accessible if you have the right permissions. Many cloud resources are directly accessed via IAM permissions, with no visible intermediary network devices. For example, S3 storage buckets don’t have a configurable firewall sitting in front of them. This is true of many other cloud resources.
This does not mean that traditional network security or defense-in-depth concepts don’t matter anymore, but you’ll need to think differently about them for cloud use cases. Most cloud environments contain resources that have physical counterparts (EC2 instances, VPCs, virtual machines, virtual networks, etc.), and even newer technologies like Kubernetes still rely on networking within a cluster or between a cluster and other resources. But practically all environments have storage buckets or other resources whose access is determined by IAM, and most cloud-based data breaches don’t traverse traditional TCP/IP networks, so traditional security approaches aren’t sufficient.
IAM is an incredibly powerful service because it controls access to objects in the cloud. But this power brings complexity. For example, AWS supports over 120 actions on the S3 storage service! There are 10 List actions, 52 Read actions, 41 Write actions, and so on. Every cloud resource has its own distinct set of actions that can be permitted on it.
You can also specify additional permission conditions. For example, only permit a user to access a resource if that user logged in with multi-factor authentication. Or only permit access if a user or application belongs to a specific cloud account or organization, or during a certain time of day, or using a certain IP address (where relevant). There are dozens of different conditions available and they may vary depending on what resource is being referenced.
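As an illustration, the sketch below builds an AWS-style policy document in Python and prints it as JSON. It combines two of the conditions mentioned above, multi-factor authentication and source IP address, using the AWS global condition keys aws:MultiFactorAuthPresent and aws:SourceIp. The bucket name and the IP range are placeholders, not values from a real environment.

```python
import json

# Placeholder values for illustration only.
BUCKET_ARN = "arn:aws:s3:::example-bucket/*"
OFFICE_CIDR = "203.0.113.0/24"  # documentation-only address range

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": BUCKET_ARN,
            # Both conditions must hold for the statement to apply.
            "Condition": {
                "Bool": {"aws:MultiFactorAuthPresent": "true"},
                "IpAddress": {"aws:SourceIp": OFFICE_CIDR},
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

With this statement, a read succeeds only when the caller signed in with MFA and the request comes from the listed address range.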
One final consideration: cloud providers like AWS support multiple types of IAM policies. Identity-based policies control what a given user, group or role can access. Resource-based policies control who can perform what actions on a given resource. AWS evaluates all of the policies that apply to a request together. If an action isn’t explicitly permitted then it is denied, like a “default deny” firewall rule, and an explicit deny in any applicable policy overrides any allow. So even if a user has broad permissions, configuring a restrictive policy with an explicit deny on a resource will ensure that the user is restricted when accessing that resource.
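The core of this evaluation can be sketched in a few lines of Python. This is a deliberately simplified model, not AWS’s actual implementation: it assumes a single account, assumes every statement already applies to the same user, and ignores conditions, permission boundaries, and organization-level policies. The bucket names and statements are hypothetical.

```python
from fnmatch import fnmatchcase


def statement_applies(statement, action, resource):
    """True if the statement's action and resource patterns match the request."""
    action_match = any(
        fnmatchcase(action.lower(), pattern.lower()) for pattern in statement["Action"]
    )
    resource_match = any(fnmatchcase(resource, pattern) for pattern in statement["Resource"])
    return action_match and resource_match


def is_allowed(statements, action, resource):
    """Explicit deny wins, then any allow, otherwise implicit (default) deny."""
    applicable = [s for s in statements if statement_applies(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in applicable):
        return False
    return any(s["Effect"] == "Allow" for s in applicable)


# Identity-based policy: the user has broad S3 permissions.
identity_policy = [{"Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]}]

# Resource-based policy on a hypothetical "payroll" bucket: reads are denied.
payroll_bucket_policy = [
    {"Effect": "Deny", "Action": ["s3:GetObject"], "Resource": ["arn:aws:s3:::payroll/*"]}
]

all_statements = identity_policy + payroll_bucket_policy

print(is_allowed(all_statements, "s3:GetObject", "arn:aws:s3:::payroll/2024.csv"))
print(is_allowed(all_statements, "s3:GetObject", "arn:aws:s3:::public-docs/readme.txt"))
print(is_allowed(all_statements, "ec2:RunInstances", "arn:aws:ec2:::instance/i-123"))
```

The first request is blocked by the explicit deny on the bucket, the second is allowed by the broad identity policy, and the third is denied because nothing permits it.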
Because IAM is both powerful AND complex, it is easy to make a mistake in configuring IAM permissions that have wide reverberations. Take the following examples:
- If a permission includes “s3:GetObject” instead of “s3:GetObjectAcl”, a user can access an object in an S3 bucket instead of only the object’s access control list. This has huge security implications if the object is sensitive data such as financial projections or PII.
- IAM allows users to assume other roles temporarily. Think “sudo” for the cloud. A privileged role may be misconfigured to allow all principals (i.e. users and applications) to assume it. If someone creates a small EC2 (i.e. virtual machine) instance for learning or testing and forgets about it, a hacker can use this orphaned instance to assume the privileged role and elevate permissions.
- In general, wildcards in permissions are powerful and convenient but also potentially dangerous. A permission may include “s3:*” or “s3:List*” for testing purposes. This allows a user to perform all 120+ actions on S3 resources or, at least, all the list actions. Wildcards in production accounts can allow access that was never anticipated initially.
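To see how much a wildcard can widen a permission, the following sketch expands a few patterns against a small sample of S3 action names. The sample is far from the full list of 120+ actions, and Python’s fnmatch is only an approximation of IAM’s matching (it also treats square brackets as special, which IAM does not).

```python
from fnmatch import fnmatchcase

# A small sample of S3 actions; the real service defines many more.
SAMPLE_S3_ACTIONS = [
    "s3:ListAllMyBuckets",
    "s3:ListBucket",
    "s3:ListBucketVersions",
    "s3:GetObject",
    "s3:GetObjectAcl",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:DeleteBucket",
    "s3:PutBucketPolicy",
]


def expand(pattern):
    """Return the sample actions that a permission pattern would cover."""
    return [a for a in SAMPLE_S3_ACTIONS if fnmatchcase(a.lower(), pattern.lower())]


for pattern in ["s3:GetObjectAcl", "s3:GetObject", "s3:List*", "s3:*"]:
    print(pattern, "->", expand(pattern))
```

The exact names match only themselves, so granting “s3:GetObject” when “s3:GetObjectAcl” was intended silently changes what can be read, while “s3:*” covers destructive actions such as deleting buckets or rewriting bucket policies.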
IAM is more complicated than traditional RBAC because the cloud provides power and flexibility that is not possible in the data center. Properly configured IAM can result in more security in the cloud than in the data center, because permissions are so granular in how they apply to every resource and possible action. But because IAM’s complexity can lead to configuration mistakes, it is important to focus on secure IAM design and help developers use IAM securely.
See this page for more tips on securely configuring IAM.
Traditional operating system or application vulnerabilities range from simple to complex in how to exploit them. Some vulnerabilities are so complex that many users need researchers or hackers to provide exploit code so they can trigger them.
By comparison, cloud vulnerabilities may seem simple to understand and to “exploit.” For example, one of the most common vulnerabilities is public access on an S3 bucket. This is caused by the “Block all public access” setting not being enabled on a bucket. To exploit this, an unauthorized user just needs to access the bucket’s contents. Or an IAM policy may be too permissive because it uses a ‘*’ wildcard to permit too many users or too many resources or too many actions. This vulnerability is implicitly exploited when users perform actions or access resources that are not permitted to them.
But this is just scratching the surface. You can’t just check the “Block all public access” setting on an S3 bucket and assume you’re safe. AWS considers a bucket policy to be non-public if it includes CIDR blocks of any size. So a non-public S3 bucket policy could permit millions of IP addresses to access the bucket. The real solution for addressing this vulnerability is to use IAM to restrict access to the bucket, which I talk about above. This blog post explores the task of building a secure S3 bucket in rich detail.
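A quick way to appreciate the CIDR problem is to count the addresses a policy actually admits. The sketch below uses Python’s standard ipaddress module to flag overly broad ranges in a bucket policy’s aws:SourceIp condition. The policy, the bucket name, and the threshold of 1,024 addresses are hypothetical choices for illustration.

```python
import ipaddress


def broad_cidrs(policy, max_addresses=1024):
    """Return (cidr, address_count) pairs for Allow statements with large source ranges."""
    findings = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        cidrs = statement.get("Condition", {}).get("IpAddress", {}).get("aws:SourceIp", [])
        if isinstance(cidrs, str):
            cidrs = [cidrs]  # a single value may be given as a plain string
        for cidr in cidrs:
            size = ipaddress.ip_network(cidr, strict=False).num_addresses
            if size > max_addresses:
                findings.append((cidr, size))
    return findings


# Hypothetical bucket policy restricted "only" by source IP.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {"IpAddress": {"aws:SourceIp": ["203.0.113.0/24", "8.0.0.0/7"]}},
        }
    ],
}

findings = broad_cidrs(bucket_policy)
for cidr, size in findings:
    print(f"{cidr} admits {size:,} addresses")
```

Here the /24 range passes the check, while the /7 range covers more than 33 million addresses even though the policy is restricted by source IP.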
Cloud vulnerabilities are also complex in how they can be combined with more traditional vulnerabilities in cloud breaches. In the previous section, I talk about a scenario in which a hacker gains access to an orphaned EC2 instance that assumes a privileged role. This is very similar to what may have actually happened to Capital One.
This blog post analyzes how Capital One’s cloud environment was likely breached. The hacker first took advantage of a misconfigured firewall to gain network access to an EC2 instance (i.e. virtual machine), then used a traditional OS or application vulnerability to compromise the instance. The instance either already had overly broad IAM permissions available to it, or the hacker had permissions to assume a more privileged role to download sensitive data from an S3 bucket.
Hopefully this pattern is clear. Use a traditional vulnerability to gain a foothold in an organization, then use one or more IAM vulnerabilities to compromise the API control plane, enabling discovery, lateral movement, and privilege escalation.
Another pattern is more network-centric. Both Security Groups and access control lists are cloud resources that control network access. A misconfigured Security Group or ACL allowing port 22 to be accessed by the world would enable a hacker to leverage vulnerabilities in sshd. Or you may inadvertently use the same routing table for both public and private subnets, which then exposes the private subnets directly to the Internet. Hackers can then exploit any traditional vulnerabilities on any resource in these “private” subnets. So the pattern here is to use a cloud vulnerability to gain network access and then use a traditional vulnerability to compromise a host. You could then use an IAM vulnerability as above to enable lateral movement and escalate permissions.
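The first step of this pattern, a Security Group that exposes SSH to the world, is easy to check for in code. The sketch below works on plain dictionaries loosely modeled on the shape of rule data returned by the EC2 API (IpProtocol, FromPort, ToPort, IpRanges, Ipv6Ranges); the group names and rules themselves are made up for illustration.

```python
SSH_PORT = 22


def exposes_ssh_to_world(rule):
    """True if an inbound rule allows port 22 from any IPv4 or IPv6 address."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        covers_ssh = True
    elif protocol == "tcp":
        covers_ssh = rule.get("FromPort", 0) <= SSH_PORT <= rule.get("ToPort", 65535)
    else:
        covers_ssh = False
    open_v4 = any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", []))
    open_v6 = any(r.get("CidrIpv6") == "::/0" for r in rule.get("Ipv6Ranges", []))
    return covers_ssh and (open_v4 or open_v6)


def find_exposed_groups(groups):
    """Return the names of groups with at least one world-open SSH rule."""
    return [g["GroupName"] for g in groups if any(map(exposes_ssh_to_world, g["IpPermissions"]))]


# Hypothetical security groups.
groups = [
    {"GroupName": "web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupName": "bastion", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupName": "internal", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "10.0.0.0/8"}]}]},
    {"GroupName": "legacy", "IpPermissions": [
        {"IpProtocol": "-1", "Ipv6Ranges": [{"CidrIpv6": "::/0"}]}]},
]

exposed = find_exposed_groups(groups)
print(exposed)
```

Only the groups that pair port 22 with an unrestricted source are reported; the HTTPS rule and the internal-only SSH rule are left alone.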
One final point to make about the complexity of cloud vulnerabilities is that the scalability of the cloud can make everything more complicated. It is trivial to check a single S3 bucket for public access but how about checking 10,000? Which accounts do these S3 buckets live in? Do you have permissions to check them? Which team(s) will actually fix (remediate) the configurations?
The best place to learn about cloud vulnerabilities is the CIS Benchmarks for cloud providers. The Center for Internet Security has published a set of best practices for a variety of platforms including operating systems, mobile devices, and clouds. The CIS AWS Benchmark is freely available and provides concrete steps for how to securely configure cloud resources.
I’ve covered a lot of material here. Probably the biggest takeaway is that the cloud brings considerable power and scalability, but with this power comes comparable complexity. It’s relatively easy to spin up 10,000 virtual machines with a single code template, but a misconfiguration in that template means that you now have 10,000 vulnerable virtual machines. Or it’s easy to release an application to the world, but if you’re not careful then all of that application’s sensitive data will be available to the world at the same time.
The cloud isn’t going to go away. Adoption is only accelerating. As security professionals, the best way to address the cloud is to start becoming familiar with its basic features and gradually become more knowledgeable over time on specific services such as IAM.
I have a few suggestions for learning more:
- Take the free AWS training course. AWS provides a free course titled AWS Cloud Practitioner Essentials. It is a training course for their Certified Cloud Practitioner certification. The 6-hour class is intended for people from a variety of professions: sales, project management, IT, etc. and introduces people to cloud concepts.
If this class whets your appetite for more technical material, consider taking a training course for the AWS Solutions Architect – Associate certification. You don’t necessarily need to take the certification exam but a course to prepare for the exam will give you a solid education on AWS fundamentals. One of my favorite training courses for this exam is available here.
Although AWS is only one of multiple cloud providers out there, it still has more market share than any other provider today so you can’t go wrong in starting with AWS. Also, the concepts you learn are readily applicable to other clouds such as Microsoft Azure or Google Cloud.
- Create a free AWS account. AWS provides a free tier for getting started. For instance, you get 750 hours of EC2 instance usage, 5 GB of S3 storage, 750 hours of RDS database usage, and more. This account will provide you a sandbox for working with AWS resources. Alternatively, you can request an AWS account through your organization so you will have more flexibility when it comes to provisioning resources.
Whether the account is free or through your organization, be aware of monthly costs. If you start getting active on AWS then you may quickly max out the free tier. To give an extreme example, an EC2 instance of type p4d.24xlarge is often used for machine learning and has 96 CPUs, 1152 GB of memory, 8 GPUs, and 8 TB of SSD storage. It has an hourly cost of about $33, which is about $24,000 per month. But even less maxed-out resources can have high monthly costs. Enabling DDoS protection on your Microsoft Azure subscription will cost you about $3000/month.
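The monthly figure follows from simple arithmetic, sketched below. The hourly rate is the approximate number quoted above rather than a current price, and 730 hours is the usual average length of a month (365 days times 24 hours, divided by 12).

```python
HOURS_PER_MONTH = 365 * 24 // 12  # 730, the average number of hours in a month


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Estimate the monthly cost of a resource that runs around the clock."""
    return hourly_rate * hours


always_on = monthly_cost(33)
print(f"Always on:        ${always_on:,} per month")

# Running the same instance only 8 hours a day for 22 working days.
working_hours = monthly_cost(33, hours=8 * 22)
print(f"Working hours:    ${working_hours:,} per month")
```

The second calculation shows why shutting down idle resources matters: the same instance costs a fraction as much when it runs only during working hours.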
- Scan your cloud account for cloud vulnerabilities. The best way to get familiar with the concept of cloud vulnerabilities is to see them firsthand. There are lots of free SaaS and open source tools that scan cloud accounts for these vulnerabilities and most of them can scan for adherence to the CIS Benchmark for AWS.
It will take about 15 minutes to get up and running with a SaaS tool. My company provides a free version of our cloud security SaaS platform but feel free to use anything else, including free services from the cloud providers.
- Get familiar with Infrastructure as Code (IaC). The most painless way to get started with coding in the cloud is using a tool like Terraform. Follow this simple tutorial to install the terraform binary on your computer and create a template that defines an EC2 instance. Then use Terraform to deploy this instance to your cloud account. You are now officially a cloud developer!
As you get more familiar with Terraform, you can do things that would take you much longer in the cloud web console. Figure out how to create a loop to deploy 100 virtual machines instead of one. Change a setting, such as the instance type, and redeploy. See how quickly Terraform makes the changes in your AWS account.
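Terraform templates are normally written in Terraform’s own configuration language, but Terraform also accepts an equivalent JSON syntax in files ending in .tf.json. The sketch below uses Python to generate such a file with the count setting, which is one way to deploy 100 instances instead of one. The resource name, region, and AMI ID are placeholders that you would need to replace with real values for your account.

```python
import json


def build_config(instance_count, instance_type="t3.micro"):
    """Build a Terraform JSON configuration for a group of identical EC2 instances."""
    return {
        "provider": {"aws": {"region": "us-east-1"}},  # placeholder region
        "resource": {
            "aws_instance": {
                "lab": {  # hypothetical resource name
                    "count": instance_count,
                    "ami": "ami-00000000000000000",  # placeholder, not a real AMI
                    "instance_type": instance_type,
                }
            }
        },
    }


config = build_config(100)
rendered = json.dumps(config, indent=2)
print(rendered)
```

Saving the output as main.tf.json and running terraform plan should show the instances Terraform intends to create. Calling build_config with a different instance type and re-running the plan shows the change Terraform would make, without touching the web console.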
If you want to get really sophisticated, use a tool to scan your Terraform template for vulnerabilities. Because the cloud is programmable, resources that are vulnerable in the cloud often have the same vulnerabilities in their underlying code. There are lots of open source tools out there but as my company makes a really great one, I’m biased toward it. It leverages Open Policy Agent, which is used by companies like Capital One, Netflix, and Pinterest.