Security has become a critical focus for many company executives in recent times. High-profile security incidents have exposed sensitive customer data and severely damaged the reputation of the affected organisations. In response to this escalating challenge, Kasna has been working with customers to conduct comprehensive assessments, implement remediation measures, and offer effective solutions to bolster their cloud security posture.
During a recent customer engagement, the Kasna team was tasked with identifying ways to detect specific conditions that might indicate security policy violations. One example was detecting VPC network peering with external organisations, with the aim of reducing the risk of data exfiltration through an unauthorised network pathway. While this might seem simple on the surface, detecting such occurrences manually can become complex and time-consuming, leading to oversights and delays in conducting reviews. It is also only one of an ever-growing list of policies that require ongoing review. An automated detection mechanism is therefore crucial to relieve the burden on manual processes and to ensure that violations are detected and mitigated close to the time they occur.
In this engagement it soon became clear that GCP’s Asset Inventory was where most of the information we needed resided in a queryable form and could serve as the foundation of automated mechanisms. We validated the feasibility of Asset Inventory by writing example SQL queries which could be run directly against Asset Inventory using the Cloud Console’s Asset Query feature.
The Cloud Console’s Asset Query is a powerful feature in itself, and Google provides a Query Library to help users get started. However, although both Asset Inventory and IAM Policies can be queried, there is no easy way to join the results of queries across both sets of data.
For example, an IAM Policy query can tell you what policy bindings exist, and an Asset Query can tell you what assets exist, but a question that combines the two, such as which subnets starting with “10.” are editable by non-service accounts, cannot be answered with Asset Query. Once both datasets are exported into BigQuery, however, such a query becomes possible.
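To make the subnet example concrete, here is a minimal sketch of the kind of joined query that becomes possible once both datasets are in BigQuery. The two table names are hypothetical, and the column paths (`resource.data.ipCidrRange`, `iam_policy.bindings`) are our assumption about the exported schema, so check them against your own export. The sketch only considers bindings set directly on the subnet, not those inherited from the project or folder, and it returns every binding for a non-service-account member rather than filtering to roles that grant edit rights.

```python
import re


def build_subnet_iam_query(dataset: str, cidr_prefix: str = "10.") -> str:
    """Return BigQuery SQL joining exported subnets with their IAM bindings.

    Table names and column paths below are assumptions for illustration only.
    """
    # Basic validation, since both values are interpolated into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", dataset):
        raise ValueError(f"unexpected dataset name: {dataset!r}")
    if not re.fullmatch(r"[0-9.]+", cidr_prefix):
        raise ValueError(f"unexpected CIDR prefix: {cidr_prefix!r}")

    return f"""
SELECT
  subnet.name AS subnet_name,
  subnet.resource.data.ipCidrRange AS cidr,
  binding.role AS role,
  member
FROM `{dataset}.resource_compute_googleapis_com_Subnetwork` AS subnet
JOIN `{dataset}.iam_policy_compute_googleapis_com_Subnetwork` AS policy
  ON policy.name = subnet.name
-- Bindings and members are repeated fields, so each needs an UNNEST.
CROSS JOIN UNNEST(policy.iam_policy.bindings) AS binding
CROSS JOIN UNNEST(binding.members) AS member
WHERE STARTS_WITH(subnet.resource.data.ipCidrRange, '{cidr_prefix}')
  AND NOT STARTS_WITH(member, 'serviceAccount:')
""".strip()


if __name__ == "__main__":
    print(build_subnet_iam_query("my_project.asset_inventory"))
```

The resulting string can be pasted into the BigQuery console or passed to whichever BigQuery client you already use.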
Asset Inventory provides the capability to export data at various levels, including project, folder, and organisation. Hence, our initial design was to export data at the organisation level.
At this point we seemed to have found a good mechanism that met our main criteria:
- Covered many of the security violations that we needed to detect
- Relied on GCP built-in functionality, with minimal customisation
- Could be automated
Initial Validation Highlights a Complication
We proceeded with a tech spike to test the concept and uncover any unexpected challenges. Unfortunately, we soon hit quite a big one. Our client is strongly committed to the use of VPC Service Controls to provide API-level security boundaries, and all of their projects sit within one of four VPC Service Perimeters. This is an approach which we endorse, but it introduces a fundamental constraint: Asset Inventory exports running within a VPC Service Perimeter are prohibited at the organisation or folder level. Only project-level exports are supported, as noted in the Cloud Asset API documentation.
At this point the team discussed the challenge with the client, with a view to abandoning the concept as not feasible and ceasing any further effort on this design. However, our client expressed great enthusiasm for the concept and urged us to explore further possibilities. As a result, we devised a solution that we believed would provide significant value while introducing a manageable level of additional complexity.
Our revised solution was to run a separate export against every project within the client’s organisation, merging the resulting data back into a single BigQuery dataset. This still introduced significant challenges in terms of crossing VPC Service Perimeters. However, we were aware that if we could formulate a design that minimised both the scope and depth of required security exceptions, then our client would still achieve great value from the implementation.
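As a rough illustration of the per-project approach, the sketch below builds one `exportAssets` REST request per project, all writing to a shared BigQuery dataset. The project IDs, the export project name, the dataset name, and the idea of using a sanitised project ID as the table name are all hypothetical choices for this example. Authentication and the actual HTTP call are left out.

```python
import re
from typing import Dict, List


def export_request(project_id: str, export_project: str, dataset_id: str,
                   content_type: str = "RESOURCE") -> Dict:
    """Build the URL and JSON body for one project-level exportAssets call."""
    # BigQuery table names cannot contain hyphens, so sanitise the project ID.
    table_name = re.sub(r"[^A-Za-z0-9_]", "_", project_id)
    return {
        "url": ("https://cloudasset.googleapis.com/v1/"
                f"projects/{project_id}:exportAssets"),
        "body": {
            "contentType": content_type,
            "outputConfig": {
                "bigqueryDestination": {
                    "dataset": f"projects/{export_project}/datasets/{dataset_id}",
                    "table": table_name,
                    "force": True,  # overwrite the table from a previous run
                }
            },
        },
    }


def plan_exports(project_ids: List[str], export_project: str,
                 dataset_id: str) -> List[Dict]:
    """Return one export request per project, all targeting one dataset."""
    return [export_request(p, export_project, dataset_id) for p in project_ids]


if __name__ == "__main__":
    for req in plan_exports(["team-a-prod", "team-b-dev"],
                            "shared-export", "asset_inventory"):
        print(req["url"], req["body"]["outputConfig"]["bigqueryDestination"]["table"])
```

Keeping the request construction separate from the call itself makes it easy to unit test the fan-out logic without any access to GCP.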
Our client already had a Common Services Perimeter (used for shared services such as DNS), which was a suitable location for our shared export project. They had predefined VPC Service Control ingress and egress rules to allow certain service functions which required data flow between perimeters. Our challenge was to adhere to the principles of least privilege and separation of duties while keeping things as simple as possible.
By designing an implementation that split the required functionality into three different event-driven Cloud Functions running under three different service accounts, we were able to keep both the IAM permissions and VPC Service Control ingress and egress rules specific and minimal.
Other designs were possible, such as exporting a BigQuery dataset within each Service Perimeter and then running a merge function across perimeters, but the above design appeared to be the most streamlined and straightforward. This design also had the advantage that it allowed us to build the solution in stages, with the first stages proving the concept without the complexity of crossing perimeters, dealing only with the ingress and egress rules once the solution was otherwise complete. Broadly speaking, the process entailed:
- Run the Cloud Functions in a single project within the Common Services perimeter, without the List Projects Cloud Function.
- Introduce the List Projects Cloud Function, but run it only against projects within the Common Services perimeter.
- Test the full solution, targeting only projects within the Test perimeter.
- Deploy the final solution, with the List Projects Cloud Function targeting the entire organisation.
Asset Inventory Export Table Limitation
Another lesson we learnt along the way was that while it is possible to run an Asset Inventory export which targets a single table, the results may be unexpected. In order to export all resource types with different metadata into a single table, the export process converts a lot of the data to a JSON string which cannot easily be queried using SQL. (See: https://cloud.google.com/asset-inventory/docs/reference/rest/v1/TopLevel/exportAssets.)
To avoid this problem it is necessary to export each different resource type to a separate table using the setting: separateTablesPerAssetType. This means that after running an export for every project, with every resource type targeting a different table, the end result creates a large number of tables. However, with some logic it is possible to merge all the tables for a given resource type into a single target, ending up with one merged table for each asset type.
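The merge logic can be sketched roughly as follows. This is not the code from the engagement: it assumes each project was exported with its own table name prefix, so that per-asset-type tables end up named `<prefix>_<asset type with non-alphanumerics replaced by _>`, and it assumes that tables for the same asset type share a schema so that `UNION ALL` is valid. The `merged_` target naming is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_asset_type(table_names: Iterable[str],
                        prefixes: Iterable[str]) -> Dict[str, List[str]]:
    """Group exported tables by asset-type suffix, given the project prefixes."""
    groups: Dict[str, List[str]] = defaultdict(list)
    # Try longer prefixes first so "team_a_prod" wins over "team_a".
    ordered = sorted(prefixes, key=len, reverse=True)
    for name in table_names:
        for prefix in ordered:
            if name.startswith(prefix + "_"):
                groups[name[len(prefix) + 1:]].append(name)
                break
    return dict(groups)


def merge_statement(dataset: str, asset_suffix: str, tables: List[str]) -> str:
    """Return SQL that rebuilds one merged table for a single asset type."""
    selects = "\nUNION ALL\n".join(
        f"SELECT * FROM `{dataset}.{table}`" for table in sorted(tables)
    )
    return (f"CREATE OR REPLACE TABLE `{dataset}.merged_{asset_suffix}` AS\n"
            f"{selects}")


if __name__ == "__main__":
    tables = [
        "proj_a_compute_googleapis_com_Subnetwork",
        "proj_b_compute_googleapis_com_Subnetwork",
        "proj_a_storage_googleapis_com_Bucket",
    ]
    for suffix, members in group_by_asset_type(tables, ["proj_a", "proj_b"]).items():
        print(merge_statement("asset_inventory", suffix, members), end="\n\n")
```

Each generated statement is independent of the others, so the merges can be run one at a time or in parallel.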
Keeping Historical Data
Key Lessons Learnt
- Exports can take some time. If many exports are running synchronously, the process may take longer than the 9-minute maximum runtime of a Gen 1 Google Cloud Function. We worked around this problem by running individual table merges asynchronously and in parallel, but this made checking the merge results and handling errors more difficult. If we were implementing this again, we would use an HTTP-driven Gen 2 Google Cloud Function, which has a maximum runtime of 60 minutes. Unfortunately, Gen 2 had not been approved for use in our client’s environment.
- GCP imposes a quota limit on Asset Inventory Exports: 60 per minute per consumer project and 6000 per day per consumer project. Our experience suggests that “consumer project” means the project which the export is run from, although this is unclear when reviewing the documentation. This quota might prevent some large environments from exporting all their Asset Inventory data every day.
- VPC Service Controls are complex and not very well documented. We highly recommend setting up a test environment with VPC Service Controls enabled and testing there to establish the minimum ingress and egress rules required. Ingress and egress rules should be locked down to a single Service Account and tied to a source project where possible.
- A Serverless VPC Access Connector must be configured in order to make VPC Service Controls work as advertised.
- BigQuery nested and repeated fields behave differently from columns in traditional SQL, so start by researching the UNNEST operator if you’re new to BigQuery.
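On the first two lessons, one way to keep parallel work manageable is to collect every result and every error in one place instead of firing tasks and forgetting them. The sketch below is a generic pattern using Python's standard thread pool, not the code we deployed. The `task` callable stands in for whatever starts an export or merge, and `min_interval_s` is a hypothetical knob: a value of 1.0 spaces submissions at least a second apart, which would keep the submission rate under a 60-per-minute quota.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def run_all(items: Iterable[str], task: Callable[[str], object],
            max_workers: int = 8,
            min_interval_s: float = 0.0) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run task(item) for each item in parallel; return (results, errors)."""
    results: Dict[str, object] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for item in items:
            futures[pool.submit(task, item)] = item
            if min_interval_s > 0:
                time.sleep(min_interval_s)  # pace submissions to respect a quota
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[item] = repr(exc)
    return results, errors


if __name__ == "__main__":
    def fake_merge(table: str) -> str:
        # Stand-in for a real merge; fails for one table to show error capture.
        if table == "bad_table":
            raise RuntimeError("merge failed")
        return f"merged {table}"

    ok, failed = run_all(["subnets", "bad_table", "buckets"], fake_merge)
    print(ok)
    print(failed)
```

Because failures are returned rather than raised, the caller can decide whether to retry individual items or fail the whole run.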
Due to time constraints, the focus of this implementation was on creating a queryable data set, and we delivered only a simple proof-of-concept solution for detecting violations from the data. It consisted of a YAML configuration file defining SQL statements and their expected results, expressed as the number of rows returned. On completion, it generates a report in JSON format to be reviewed and actioned by an engineer. This approach could be expanded with automatic scheduling, and it would be relatively easy to pipe the JSON output into any standard event processing system for alerting and further analysis.
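The shape of such a checker can be sketched as below. This is an illustration rather than the delivered code: the check definitions are shown as Python dictionaries, as they might look after the YAML file has been parsed, and the check name, SQL text, table name, and report fields are all hypothetical. The `run_query` callable is a placeholder for whatever executes SQL against BigQuery and returns the rows.

```python
import json
from typing import Callable, Dict, List, Sequence

# Hypothetical checks, as they might look once loaded from the YAML file.
CHECKS: List[Dict] = [
    {
        "name": "no_external_vpc_peering",
        "sql": "SELECT name FROM `asset_inventory.external_peerings`",
        "expected_rows": 0,
    },
]


def run_checks(checks: List[Dict],
               run_query: Callable[[str], Sequence]) -> str:
    """Run each check and return a JSON report comparing row counts."""
    report = []
    for check in checks:
        actual = len(run_query(check["sql"]))
        report.append({
            "name": check["name"],
            "expected_rows": check["expected_rows"],
            "actual_rows": actual,
            "passed": actual == check["expected_rows"],
        })
    failed = sum(1 for entry in report if not entry["passed"])
    return json.dumps({"checks": report, "violations": failed}, indent=2)


if __name__ == "__main__":
    # Stand-in query runner that pretends one offending row was found.
    print(run_checks(CHECKS, lambda sql: [("peering-to-unknown-org",)]))
```

Injecting the query runner keeps the comparison logic testable offline, and the JSON output is already in a form that an alerting pipeline could consume.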
GCP Asset Inventory offers an extensive and up-to-date dataset from your GCP environment, providing a programmable approach for querying and validating the data. Exporting this data to BigQuery proves invaluable for conducting cross-referencing tasks across disparate information sources, as well as for maintaining historical records and tracking changes over time. However, the utilisation of VPC service perimeters introduces complexities when exporting data beyond the perimeter or at the folder/organisation level. Nonetheless, by implementing multiple functions within different perimeters and merging the resulting data, these challenges can be effectively addressed.
Integrating this solution into an automated detection pipeline ensures the swift identification of any violations, enabling prompt action to prevent or mitigate potential breaches.
By combining the power of GCP Asset Inventory, BigQuery, and a robust detection mechanism, organisations can establish a comprehensive framework for maintaining the security and integrity of their GCP environment.