make GCP provider work concurrently #3152


Open: orouz wants to merge 15 commits into main from gcp_ch

Conversation


@orouz orouz commented Mar 30, 2025

Summary of your changes

the core of this PR is to make all work done in the gcplib provider run concurrently. this means there's a pipeline of fetching, merging and enriching; once an asset is done, it's sent to the fetcher, which in turn sends it to the resourceCh of the flavor pipeline (cspm/assets inventory).
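for illustration, here is a minimal Go sketch of that pipeline shape (all names are hypothetical stand-ins, not the provider's actual API):

```go
package main

import "context"

// Asset is a hypothetical stand-in for the provider's asset type.
type Asset struct{ Name string }

// fetch streams assets into out and closes it when done, so downstream
// stages can simply range over the channel.
func fetch(ctx context.Context, out chan<- *Asset) {
	defer close(out)
	for _, a := range []*Asset{{Name: "a"}, {Name: "b"}} {
		select {
		case out <- a:
		case <-ctx.Done():
			return
		}
	}
}

// enrich consumes fetched assets, decorates them, and forwards them.
func enrich(ctx context.Context, in <-chan *Asset, out chan<- *Asset) {
	defer close(out)
	for a := range in {
		// ... merge content types, attach cloud metadata, etc. ...
		select {
		case out <- a:
		case <-ctx.Done():
			return
		}
	}
}

func main() {
	ctx := context.Background()
	fetched := make(chan *Asset)
	enriched := make(chan *Asset)
	go fetch(ctx, fetched)
	go enrich(ctx, fetched, enriched)
	for range enriched {
		// in the real code, each asset is forwarded here to the
		// flavor pipeline's resourceCh
	}
}
```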

specific and notable changes are mentioned in PR comments.

Screenshot/Data

CSPM GCP: same total findings count (1149) and same findings count per rule (with the same passed/failed counts) as in the live env (elastic-security-test):

| rule.benchmark.rule_number | Count of records |
| --- | --- |
| 3.8 | 826 |
| 1.7 | 65 |
| 2.12 | 19 |
| 3.1 | 19 |
| 3.2 | 19 |
| 4.7 | 15 |
| 4.1 | 14 |
| 4.2 | 14 |
| 4.3 | 14 |
| 4.4 | 14 |
| 4.5 | 14 |
| 4.6 | 14 |
| 4.8 | 14 |
| 4.9 | 14 |
| 1.4 | 9 |
| 3.6 | 9 |
| 3.7 | 9 |
| 1.12 | 4 |
| 1.14 | 4 |
| 1.15 | 4 |
| 1.10 | 3 |
| 1.9 | 3 |
| 5.1 | 3 |
| 5.2 | 3 |
| 2.3 | 2 |
| 1.11 | 1 |
| 1.17 | 1 |
| 1.5 | 1 |
| 1.6 | 1 |
| 1.8 | 1 |
| 2.10 | 1 |
| 2.11 | 1 |
| 2.13 | 1 |
| 2.16 | 1 |
| 2.4 | 1 |
| 2.5 | 1 |
| 2.6 | 1 |
| 2.7 | 1 |
| 2.8 | 1 |
| 2.9 | 1 |
| 3.3 | 1 |
| 3.4 | 1 |
| 3.5 | 1 |
| 7.1 | 1 |
| 7.2 | 1 |
| 7.3 | 1 |

Assets Inventory GCP: 19 more assets than in the live env (elastic-security-test), due to the network assets addition, which is on `8.x`/`main` and not on `8.17`:

| cloud.service.name | Count of records |
| --- | --- |
| compute.googleapis.com/Subnetwork | 826 |
| iam.googleapis.com/ServiceAccountKey | 65 |
| iam.googleapis.com/ServiceAccount | 44 |
| compute.googleapis.com/Network | 19 |
| compute.googleapis.com/Instance | 14 |
| compute.googleapis.com/Firewall | 9 |
| iam.googleapis.com/Role | 3 |
| storage.googleapis.com/Bucket | 3 |
| cloudresourcemanager.googleapis.com/Project | 1 |
| run.googleapis.com/Service | 1 |

  • also ran these changes in CSPM GCP organization-account mode for 7 consecutive runs (5-minute period) with the same number of assets in each cycle

Related Issues


mergify bot commented Mar 30, 2025

This pull request is now in conflict. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp_ch upstream/gcp_ch
git merge upstream/main
git push upstream gcp_ch

@mergify mergify bot assigned orouz Mar 30, 2025

mergify bot commented Mar 30, 2025

This pull request does not have a backport label. Could you fix it @orouz? 🙏
To fix up this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 8.\d branch, where \d is a digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

NOTE: backport-v8.x has been added to help with the transition to the new branch 8.x.

@orouz orouz force-pushed the gcp_ch branch 3 times, most recently from 1d5beb8 to b0b857e, March 30, 2025 16:04

mergify bot commented Apr 1, 2025

This pull request is now in conflict. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp_ch upstream/gcp_ch
git merge upstream/main
git push upstream gcp_ch

orouz (Collaborator Author) commented:

this is a new fetcher for network assets. it was added to avoid an enrichment attempt that previously ran on all assets instead of just network assets:

p.enrichNetworkAssets(ctx, extendedAssets)

network assets are now fetched and enriched in their own fetcher, so this attempt is skipped when fetching all other assets in assets_fetcher.
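a minimal sketch of that separation, with hypothetical stand-in names (not the actual fetcher code):

```go
package fetchers

import "context"

// ExtendedGcpAsset and networkProvider are hypothetical stand-ins.
type ExtendedGcpAsset struct{ Type string }

type networkProvider interface {
	// streams only network asset types and closes out when done
	ListNetworkAssets(ctx context.Context, out chan<- *ExtendedGcpAsset)
}

type NetworksFetcher struct{ provider networkProvider }

// Fetch enriches network assets inside their own fetcher, so the generic
// assets fetcher no longer runs an enrichment pass over every asset.
func (f *NetworksFetcher) Fetch(ctx context.Context, out chan<- *ExtendedGcpAsset) {
	assets := make(chan *ExtendedGcpAsset)
	go f.provider.ListNetworkAssets(ctx, assets)
	for a := range assets {
		// attach the DNS policy here; only network assets need this
		out <- a
	}
}
```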

Comment on lines -108 to -111
ListLoggingAssets(ctx context.Context) ([]*LoggingAsset, error)

// ListServiceUsageAssets returns a list of service usage assets grouped by project id
ListServiceUsageAssets(ctx context.Context) ([]*ServiceUsageAsset, error)
orouz (Collaborator Author) commented:

these two were combined into ListProjectAssets

Comment on lines 134 to 143
+ resourceCh := make(chan *assetpb.Asset) // *assetpb.Asset with Resource
+ policyCh := make(chan *assetpb.Asset)   // *assetpb.Asset with IamPolicy
+ mergeCh := make(chan *assetpb.Asset)    // *assetpb.Asset with Resource and IamPolicy
+ enrichCh := make(chan *ExtendedGcpAsset)

- var assets []*assetpb.Asset
- assets = append(append(assets, resourceAssets...), policyAssets...)
- mergedAssets := mergeAssetContentType(assets)
- extendedAssets := p.extendWithCloudMetadata(ctx, mergedAssets)
- // Enrich network assets with dns policy
- p.enrichNetworkAssets(ctx, extendedAssets)
+ go p.getAllAssets(ctx, p.config.Parent, assetpb.ContentType_RESOURCE, assetTypes, resourceCh)
+ go p.getAllAssets(ctx, p.config.Parent, assetpb.ContentType_IAM_POLICY, assetTypes, policyCh)
+ go p.mergeAssets(ctx, mergeCh, resourceCh, policyCh)
+ go p.enrichAssets(ctx, mergeCh, enrichCh)

- return extendedAssets, nil
+ for asset := range enrichCh {
+ 	out <- asset
+ }
orouz (Collaborator Author) commented:

this method is used by the main fetcher (assets_fetcher). it starts fetching resources and policies concurrently and merges them by name. the merging process sends an asset to the enrichment channel if: 1) the asset has both a resource and a policy, or 2) the asset has either a resource or a policy and the other channel is closed, so there's no need to wait for a merge with the other content type.

after an asset is enriched, it's sent to the out channel, which is received in the fetcher and passed to the resourceCh to start the relevant flavor (cspm/assets inventory) pipeline.
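a minimal sketch of that merge rule (hypothetical types; mergeHalves stands in for the real name-based merge, and context handling is omitted for brevity):

```go
package inventory

// Asset is a hypothetical stand-in keyed by name.
type Asset struct{ Name string }

// mergeHalves stands in for combining a RESOURCE asset with its
// IAM_POLICY counterpart.
func mergeHalves(a, b *Asset) *Asset { return a }

// merge pairs resource and policy assets by name. An asset is emitted as
// soon as both halves have arrived, or immediately if the other channel
// is already closed (no pair can ever arrive).
func merge(resourceCh, policyCh <-chan *Asset, out chan<- *Asset) {
	defer close(out)
	pending := map[string]*Asset{}
	for resourceCh != nil || policyCh != nil {
		var a *Asset
		var ok bool
		select {
		case a, ok = <-resourceCh:
			if !ok {
				resourceCh = nil // stop selecting on a closed channel
				continue
			}
		case a, ok = <-policyCh:
			if !ok {
				policyCh = nil
				continue
			}
		}
		switch prev, found := pending[a.Name]; {
		case found: // both halves present: merge and emit
			delete(pending, a.Name)
			out <- mergeHalves(prev, a)
		case resourceCh == nil || policyCh == nil: // other side closed
			out <- a
		default:
			pending[a.Name] = a
		}
	}
	for _, a := range pending { // flush assets that never found a pair
		out <- a
	}
}
```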

- logMetrics, err := p.ListAllAssetTypesByName(ctx, monitoringAssetTypes["LogMetric"])
- if err != nil {
- 	return nil, err
+ func (p *Provider) ListMonitoringAssets(ctx context.Context, out chan<- *MonitoringAsset) {
orouz (Collaborator Author) commented:

this method used to fetch all resources and policies for 2 asset types, group them by project, and, once all assets were grouped, return a slice with all the groups.

it now does the same thing, but sends each group as soon as it's ready instead of waiting for all of them. also, only resources are fetched now (no policies, as the CIS rules don't require them; not all of these asset types necessarily have a policy, so trying to fetch them was redundant).
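roughly, the before/after shape (a sketch with hypothetical helpers, not the actual provider code):

```go
package inventory

import "context"

// MonitoringAsset is a hypothetical stand-in for the grouped type.
type MonitoringAsset struct {
	Project string
	Assets  []string
}

// fetchGroup stands in for fetching one project's log metrics and alert
// policies (resources only; IAM policies are no longer requested).
func fetchGroup(ctx context.Context, project string) []string { return nil }

// Before (shape only): build every group, then return them all at once.
func listMonitoringAssetsBatch(ctx context.Context, projects []string) []*MonitoringAsset {
	var all []*MonitoringAsset
	for _, p := range projects {
		all = append(all, &MonitoringAsset{Project: p, Assets: fetchGroup(ctx, p)})
	}
	return all
}

// After: stream each group out as soon as it is ready.
func listMonitoringAssets(ctx context.Context, projects []string, out chan<- *MonitoringAsset) {
	defer close(out)
	for _, p := range projects {
		out <- &MonitoringAsset{Project: p, Assets: fetchGroup(ctx, p)}
	}
}
```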

- if err != nil {
- 	return nil, err
- }
+ func (p *Provider) ListProjectsAncestorsPolicies(ctx context.Context, out chan<- *ProjectPoliciesAsset) {
orouz (Collaborator Author) commented:

this used to send a slice of items, each representing a group of assets of the same project. it now does the same, but sends each group separately so we don't wait for all of them (the same streaming pattern sketched above).

- var assets []*ExtendedGcpAsset
- assets = append(append(assets, logMetrics...), alertPolicies...)
- monitoringAssets := getAssetsByProject[MonitoringAsset](assets, p.log, typeGenerator)
+ func (p *Provider) ListProjectAssets(ctx context.Context, assetTypes []string, out chan<- *ProjectAssets) {
orouz (Collaborator Author) commented:

this function now handles what ListLoggingAssets and ListServiceUsageAssets used to do. both used to fetch the resource and policy for some asset types, group them all by project, and send a slice with every group.

it now does the same thing, but starts by fetching projects and then, for each project, fetches only the resources of the given asset types and sends them out (see the sketch below). the difference is that we don't wait for all assets from all projects, plus we no longer fetch policies, as they weren't used by the relevant CIS rules.
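a sketch of that flow (hypothetical helpers, not the actual implementation):

```go
package inventory

import "context"

// ProjectAssets is a hypothetical stand-in for the per-project group.
type ProjectAssets struct {
	Project string
	Assets  []string
}

// fetchProjects stands in for listing the projects under the parent.
func fetchProjects(ctx context.Context) []string { return nil }

// fetchResources stands in for fetching only the RESOURCE content type
// for the given asset types; IAM policies are skipped entirely.
func fetchResources(ctx context.Context, project string, assetTypes []string) []string { return nil }

// listProjectAssets fetches the project list first, then streams each
// project's resources out without waiting for the other projects.
func listProjectAssets(ctx context.Context, assetTypes []string, out chan<- *ProjectAssets) {
	defer close(out)
	for _, p := range fetchProjects(ctx) {
		out <- &ProjectAssets{Project: p, Assets: fetchResources(ctx, p, assetTypes)}
	}
}
```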

@oren-zohar oren-zohar requested review from Copilot and moukoublen April 3, 2025 11:37

@Copilot Copilot AI left a comment


Pull Request Overview

This PR enhances the GCP provider to run its workflows concurrently by refactoring asset fetchers to use channels and goroutines. Key changes include:

  • Refactoring of multiple fetchers (Service Usage, Policies, Monitoring, Networks, Log Sink, and Assets) to use concurrent channel-based asset delivery.
  • Renaming and updating mock functions to support the new asynchronous calls.
  • Updating tests to reflect the new concurrent behavior and added network asset support.

Reviewed Changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| internal/resources/providers/gcplib/inventory/mock_service_api.go | Added a Clear function and renamed mock helper functions to support the new ListAssetTypes method. |
| internal/resources/fetching/preset/gcp_preset.go | Added initialization for the network assets fetcher. |
| internal/resources/fetching/fetchers/gcp/* | Modified fetchers (Service Usage, Policies, Monitoring, Networks, Log Sink, Assets) to run concurrently via channels. |
| internal/inventory/gcpfetcher/* | Updated the asset fetcher and mock provider functions to replace ListAllAssetTypesByName with ListAssetTypes, and adjusted tests accordingly. |
Comments suppressed due to low confidence (2)

internal/resources/fetching/fetchers/gcp/service_usage_fetcher.go:38

  • [nitpick] Consider standardizing the naming of the asset subtype field across fetchers. In some files the field is named 'subType' (e.g. here) while in others it is 'SubType'; standardizing this will improve code consistency.
subType string

internal/resources/fetching/fetchers/gcp/assets_fetcher.go:86

  • Ensure that the implementation of ListAssetTypes in the provider always closes the results channel; otherwise, the for-select loop may hang if the channel is never closed.
go f.provider.ListAssetTypes(ctx, lo.Keys(reversedGcpAssetTypes), resultsCh)
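the usual way to guarantee that (a sketch, not the provider's actual implementation) is for the producer to own the channel and defer-close it on every return path:

```go
package inventory

import "context"

// Asset is a hypothetical stand-in type.
type Asset struct{ Type string }

// listAssetTypes owns out and closes it on every return path, so a
// consumer ranging over the channel can never hang.
func listAssetTypes(ctx context.Context, assetTypes []string, out chan<- *Asset) {
	defer close(out) // runs on normal return and on cancellation alike
	for _, t := range assetTypes {
		select {
		case out <- &Asset{Type: t}: // placeholder for the real fetch
		case <-ctx.Done():
			return
		}
	}
}
```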

Init(ctx context.Context, log *clog.Logger, gcpConfig auth.GcpFactoryConfig) (ServiceAPI, error)
}

func (p *ProviderInitializer) Init(ctx context.Context, log *clog.Logger, gcpConfig auth.GcpFactoryConfig) (ServiceAPI, error) {
func newAssetsInventoryWrapper(ctx context.Context, log *clog.Logger, gcpConfig auth.GcpFactoryConfig) (*AssetsInventoryWrapper, error) {
limiter := NewAssetsInventoryRateLimiter(log)
orouz (Collaborator Author) commented Apr 3, 2025:

we use rate limiting for the GCP Assets Inventory ListAssets method, and because we're now making more concurrent calls, cycle fetching takes longer (~30 seconds vs ~18 seconds for our gcp test account) as we wait between calls.
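the tradeoff looks roughly like this (a sketch using golang.org/x/time/rate; the repo's NewAssetsInventoryRateLimiter may work differently):

```go
package inventory

import (
	"context"

	"golang.org/x/time/rate"
)

// All concurrent ListAssets calls share one limiter, so added concurrency
// shows up as waiting time instead of exceeded API quotas.
var limiter = rate.NewLimiter(rate.Limit(1), 1) // 1 request/sec: a hypothetical quota

func listAssets(ctx context.Context) error {
	if err := limiter.Wait(ctx); err != nil { // blocks until a token is available
		return err
	}
	// ... perform the actual ListAssets call ...
	return nil
}
```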

@orouz orouz marked this pull request as ready for review April 6, 2025 11:58
@orouz orouz requested a review from a team as a code owner April 6, 2025 11:58
f.log.Infof("GcpAssetsFetcher.Fetch context err: %s", ctx.Err().Error())
f.log.Info("GcpAssetsFetcher.Fetch start")
defer f.log.Info("GcpAssetsFetcher.Fetch done")
defer f.provider.Clear()
orouz (Collaborator Author) commented:

Clear() clears the cache used for cloud account metadata, which is accessed for every fetched asset (mostly cache hits).

it used to never get cleared at all (see #2182). now it gets cleared when each fetcher exits, which is better but still not ideal, as it should be called once all fetchers are done. a better place would be for the registry to call a Clear method on the Fetcher interface, letting fetchers clear their caches when the fetching cycle is done.

@orouz orouz force-pushed the gcp_ch branch 2 times, most recently from b67b8de to 031d2c7, April 16, 2025 09:53
@orouz orouz linked an issue Apr 21, 2025 that may be closed by this pull request
}

func getOrganizationId(ancestors []string) string {
last := ancestors[len(ancestors)-1]
Member commented:

nit: perhaps add a check for ancestors having zero length (is it possible?)
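a defensive variant might look like this (sketch; the "organizations/" prefix handling is an assumption about the real function body):

```go
package inventory

import "strings"

// getOrganizationIdSafe guards against an empty ancestors slice before
// indexing the last element. The prefix trim below is hypothetical, for
// illustration only.
func getOrganizationIdSafe(ancestors []string) string {
	if len(ancestors) == 0 {
		return ""
	}
	last := ancestors[len(ancestors)-1]
	return strings.TrimPrefix(last, "organizations/")
}
```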

Development

Successfully merging this pull request may close these issues.

CSPM GCP RAM footprint