SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals


Security operations teams increasingly rely on cloud and Kubernetes telemetry, yet exporting and indexing all available logs can inflate SIEM licensing costs, cloud logging spend, and analyst workload through increased event volume and operational noise. This research compares two concurrent ingestion strategies for Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE) telemetry into Splunk: (1) a broad “dump-everything” pipeline and (2) a focused pipeline that allow-lists security-relevant sources.
Over a 9-day observation window, the focused strategy reduced exported Pub/Sub topic storage from 383.53 GiB to 18.7 GiB (20.5× reduction; 95.1% lower) and reduced indexed events from 134,765,353 to 14,640,637 (9.2× reduction; 89.1% lower). In a controlled lab, 48 GCP control-plane test procedures and 29 GKE control-plane test procedures were executed and evaluated for observable, attributable audit evidence in Splunk. Coverage of the executed GCP and GKE control-plane procedures was practically equal across the dump-everything and focused pipelines, indicating that a focused allowlist can reduce ingest volume without significantly degrading the evidence available for investigations.
These findings provide practitioner guidance for evidence-driven log selection and validation to balance audit coverage against ingest cost and operational noise in cloud SIEM pipelines.
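
The reduction figures reported above can be reproduced from the raw totals. The following is a minimal sketch of that arithmetic, not the authors' tooling: the helper name `reduction` is hypothetical, and the only inputs are the storage and event totals stated in the abstract.

```python
def reduction(broad: float, focused: float) -> tuple[float, float]:
    """Return (reduction factor, percent lower) for a broad vs. focused total.

    Hypothetical helper for illustration; not part of the original study.
    """
    if broad <= 0 or focused <= 0:
        raise ValueError("totals must be positive")
    factor = broad / focused                     # e.g. 20.5 means "20.5x smaller"
    percent_lower = (1 - focused / broad) * 100  # share of the broad total removed
    return factor, percent_lower


# Totals reported for the 9-day observation window.
storage_factor, storage_pct = reduction(383.53, 18.7)             # GiB in Pub/Sub topic storage
events_factor, events_pct = reduction(134_765_353, 14_640_637)    # events indexed in Splunk

print(f"Pub/Sub storage: {storage_factor:.1f}x reduction, {storage_pct:.1f}% lower")
print(f"Indexed events:  {events_factor:.1f}x reduction, {events_pct:.1f}% lower")
```

The two measures answer slightly different questions: the factor expresses how many times smaller the focused pipeline is, while the percentage expresses how much of the broad pipeline's volume was avoided. The same calculation could be applied to any other pair of totals when validating a candidate allowlist.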

















