The Capital One Breach: Did the technology or the process fail?
Posted by: Jonathan Villa
One of my favorite quotes is “It is easy to be wise after the event,” (Arthur Conan Doyle, The Complete Sherlock Holmes). After the shock and awe of each data breach, many of us take our turn at Monday morning quarterbacking and begin explaining what the compromised company did wrong.
Today’s media headlines have the security industry dissecting the massive Capital One breach announced on Monday and attempting to re-architect Capital One’s AWS environment.
My intent is not to bash Capital One or to speculate about the specifics of what happened (although there have been several good write-ups in the last 24 hours featuring solid research into the specifics that have been gleaned from publicly available information). Rather, my intent is to share my perspective on what likely went wrong and how these types of incidents can be avoided as the rush to the public cloud continues to gain momentum.
During the Palo Alto Ignite conference in June, I presented (along with my GuidePoint Security colleague Shawn Slater) a small segment on how the AWS WAF was blind to certain web-based attacks (despite being a Web Application Firewall). Don’t get me wrong, I’m a huge AWS supporter, having designed solutions dependent on the cloud service provider for over a decade. However, I also managed WAF environments for an even longer period of time. That said, the AWS WAF always raises one of my eyebrows when customers are considering it.
But…this isn’t about the AWS WAF (and as of writing this post I haven’t seen who the WAF vendor was in Capital One’s incident). This is about a growing trend: in the shadow of the Shared Responsibility Model, the information security industry is losing its long-held conviction to trust but verify. It’s about how a weakness in a single link can break the whole chain. The new standard is that the cloud service provider (CSP) is responsible for “security of the cloud” and we, as tenants, are responsible for “security in the cloud”. The CSPs have given us tools and standards to follow and, for the most part, they’re good guidelines. However, architecting our cloud environments solely under the guidance of the Shared Responsibility Model leads many to trust but forget to verify.
Based on what’s been written in the complaint, Capital One followed AWS best practices (e.g., they did not store access and secret keys on an EC2 instance and they did not have a public S3 bucket). After all, they are the authors of Cloud Custodian, a tool many have come to depend on. However, after reading numerous articles and Twitter posts, it could be said that following (yes, following as a guide) AWS security best practices led to this breach, or at least to the magnitude of it.
So, what went wrong? Let’s go back to my Palo Alto Ignite presentation. During that session, I demonstrated how the AWS WAF does not prevent me from uploading a malicious script to a vulnerable web app. Once my script is uploaded, I’m able to connect to the instance via a reverse shell. In my next step, and this is where the story may sound similar, I am able to take advantage of the EC2 instance role by executing the available AWS CLI command. In my example, I performed the following:
Note: in the complaint it states, “…permitted commands to reach and be executed by that server, which enabled access to folders or buckets of data…”
My example is similar to the Capital One breach in that, in both scenarios, the environments trust the “security of the cloud” by relying on instance roles. Using instance roles is a documented AWS security best practice followed by many. Even if the instance role has been assigned an IAM policy with minimal permissions, the issue is that any API call coming out of the compromised instance is inherently trusted, based upon the role policy attached to the instance. In other words, once I gain access to the instance, I have the entitlements that the instance was entrusted with.
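To make that concrete, here is a deliberately simplified Python model of the idea. It is not AWS's actual authorization logic, and the names `Role`, `Instance`, `call_api` and `web-app-role` are hypothetical, chosen only for illustration. The sketch assumes the decision is made purely from the action patterns attached to the role.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Role:
    """Hypothetical stand-in for an IAM role and its allowed action patterns."""
    name: str
    allowed_actions: tuple


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for an EC2 instance with an attached role."""
    role: Role

    def call_api(self, process: str, action: str) -> bool:
        # The decision depends only on the attached role's policy.
        # `process` (who is actually running on the box) is never consulted,
        # which is the assumed trust described above.
        wanted = action.lower()
        return any(
            fnmatchcase(wanted, pattern.lower())
            for pattern in self.role.allowed_actions
        )


web = Instance(Role("web-app-role", ("s3:ListAllMyBuckets", "s3:GetObject")))

# The web application and an attacker's shell get the same answer.
legit = web.call_api("web application", "s3:GetObject")
attacker = web.call_api("attacker reverse shell", "s3:GetObject")
# Least privilege still limits the blast radius to what the role holds.
denied = web.call_api("attacker reverse shell", "iam:CreateUser")
```

In this toy model a tight policy narrows what an intruder can do, but nothing distinguishes the intruder from the application itself. That distinction has to come from an additional layer of verification.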
According to erratic’s Twitter posts (screenshots on KrebsOnSecurity), “Then I launch an instance into their vpc with access to aurora, attach the correct security profile…assume-role their IAM instance profiles…”. Admittedly, I am making an assumption, but the statement “…attach the correct security profile…” makes me believe that Capital One was also relying on IAM policies for database authentication. That would be two cases of assumed trust based on AWS security that pose a potential risk (namely, assumed API access via an instance role and database authentication also via an instance role).
The moral of the story is not that Capital One did something “wrong” but rather that cloud architectures have softened the security principle of trust but verify as more people trust CSP platforms to provide “security in the cloud” under the Shared Responsibility Model. Additionally, many organizations have developed architectures relying solely on cloud native controls and technology (i.e., they’ve traded mature and proven security technologies for newer technologies that are still evolving).
To end the story of the presentation Shawn and I gave, we demonstrated the same exploit but behind a Palo Alto firewall, and my malicious script was immediately blocked from being uploaded. While some of the details are yet to emerge about Capital One’s firewall misconfiguration, one thing is apparent: relying on the “security of the cloud” without additional controls can lead to blindly trusting a layer that you do not control.
What would I recommend? While it is difficult to be the Monday morning quarterback without knowing all of the details, I can make some recommendations based upon my similar example and controls that GuidePoint has identified and recommended in our Cloud Security Architecture Framework.
- Lean on the conviction of trust but verify through multiple layers of defense. While the WAF vendor has not been identified, I am going to make the assumption that it was the AWS WAF. I would strongly recommend leveraging a mature WAF platform.
- Invest time in ensuring that all IAM policies follow a least privilege model (despite the complexity of managing numerous IAM policies). The complaint states that “According to Capital One, the *****-WAF-Role account does not, in the ordinary course of business, invoke the List Buckets Command.”
- Implement a platform that can ingest CloudTrail logs and detect anomalies in the normal course of business. More importantly, leverage a platform that can analyze IAM policies and report on whether, and how often, each role is using its assigned entitlements.
- Recall “…commands to reach and be executed by that server”. Follow hardening standards that remove or prevent the use of potentially exploitable commands such as curl, wget and even the AWS CLI itself. Web application code on an EC2 instance should be using the AWS SDK and not executing the AWS CLI.
The list of recommendations will vary based on use case, but one thing should remain constant: trust but verify.
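The least privilege and CloudTrail recommendations above can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for a real analysis platform. It assumes records shaped like CloudTrail events (using the `eventSource` and `eventName` fields), and, as a simplification, it assumes the role's granted entitlements are written in the same `service:EventName` form. Real IAM action names do not always match CloudTrail event names one-to-one. The function names `action_of` and `review_role` and the sample data are hypothetical.

```python
from collections import Counter


def action_of(record: dict) -> str:
    """Build a 'service:EventName' label from a CloudTrail-style record."""
    service = record["eventSource"].split(".")[0]  # "s3.amazonaws.com" -> "s3"
    return f"{service}:{record['eventName']}"


def review_role(granted, baseline, recent):
    """Compare what a role may do, what it normally does, and what it just did."""
    normal = Counter(action_of(r) for r in baseline)
    seen = Counter(action_of(r) for r in recent)
    return {
        # Calls that never appear in the role's ordinary course of business.
        "anomalous": sorted(a for a in seen if a not in normal),
        # Entitlements granted but not exercised in either window.
        "unused": sorted(a for a in granted if a not in normal and a not in seen),
        # How often each action was used across both windows.
        "call_counts": dict(normal + seen),
    }


# Hypothetical sample data for a single role.
get_object = {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"}
list_buckets = {"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets"}

granted = {"s3:GetObject", "s3:ListBuckets", "s3:PutObject"}
baseline = [get_object, get_object, get_object]
recent = [get_object, list_buckets]

report = review_role(granted, baseline, recent)
```

With this sample data, the role's first-ever `ListBuckets` call is reported as anomalous, and `s3:PutObject` is reported as an entitlement that could probably be removed from the policy.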
Jonathan Villa
Practice Director - Cloud Security,
GuidePoint Security
Jonathan Villa has worked as a technology consultant since 2000 and has worked in the information security field since 2003. For more than 10 years, Jonathan worked with a large municipality as a senior consultant in several competencies including PCI compliance and training, web application architecture and security, vulnerability assessments and developer training, and web application firewall administration. Jonathan also co-architected and managed an automated continuous integration environment that included static and dynamic code analysis for over 150 applications deployed to several distinct environments and platforms.
Jonathan has worked with virtualization and cloud technologies since 2005, and since 2010 has focused primarily on cloud security. Jonathan has worked with clients in various verticals across North America, South America and Asia to design and implement secured public and hybrid cloud environments, integrate security into continuous integration and delivery methodologies and develop custom application and security solutions using the AWS SDK. He has also provided guidance to customers in understanding how to manage their environments under the Shared Responsibility Model.
In addition to providing PCI training, Jonathan has also presented to law enforcement on cybersecurity and was a speaker at the Cloud Security Alliance New York City Summit. Jonathan holds the following certifications: CISSP, CCSP, C|EH, PCIP, AWS Certified Solutions Architect – Professional, AWS Certified SysOps Administrator, AWS Certified Developer, AWS Certified DevOps Professional and Security+, as well as the CSA Certificate of Cloud Security Knowledge.