August 28, 2025
Your Kubernetes cluster is running smoothly. Deployments are fast, scaling is automatic, and your development teams are shipping features faster than ever.
Then your CISO asks the question that makes everyone uncomfortable: "Are we compliant? Can you show me our security posture?"
Suddenly, that beautifully orchestrated container environment looks like a compliance nightmare. Pod security admission standards, network segmentation, secrets management, RBAC configurations—there are dozens of moving pieces, and each one is a potential audit failure waiting to happen.
Here's the reality: Kubernetes security isn't just about preventing breaches. It's about building a defensible, auditable, compliant platform that satisfies SOC 2, PCI DSS, HIPAA, or whatever regulatory framework keeps your legal team awake at night.
This guide provides the complete enterprise Kubernetes security checklist—organized by compliance domain, with specific configurations, automation scripts, and audit evidence you need to pass your next review.
Traditional security was about perimeters: firewalls, VPNs, and controlled access points. Kubernetes threw that model out the window.
In Kubernetes, everything is dynamic:
- Containers spin up and down constantly
- Network policies change with deployments
- Service-to-service communication happens across hundreds of microservices
- Secrets, configurations, and access permissions are managed through YAML files
- Multiple teams deploy independently to shared infrastructure
The compliance challenge: Auditors want to see consistent, documented, enforceable security controls. But Kubernetes makes everything programmable and ephemeral. How do you audit something that changes every minute?
The answer: Infrastructure as Code + Policy as Code + Continuous Monitoring.
Instead of manual security reviews, you need automated security enforcement that produces audit trails. Every security control becomes a coded policy that's version-controlled, tested, and automatically applied.
Enterprise Kubernetes security covers six critical domains. Each domain maps to specific compliance requirements and audit controls:
Identity & Access Management (IAM) - Compliance Frameworks: SOC 2 (CC6.1, CC6.2), PCI DSS (7.1, 8.1), HIPAA (164.312). What Auditors Want: Role-based access controls, least privilege access, regular access reviews
Network Security - Compliance Frameworks: PCI DSS (1.2, 1.3), SOC 2 (CC6.1), ISO 27001 (A.13.1). What Auditors Want: Network segmentation, traffic encryption, ingress/egress controls
Data Protection - Compliance Frameworks: GDPR (Article 32), HIPAA (164.312), SOC 2 (CC6.7). What Auditors Want: Encryption at rest/transit, secrets management, data classification
Security Monitoring - Compliance Frameworks: SOC 2 (CC7.1), PCI DSS (10.1), NIST (DE.CM). What Auditors Want: Comprehensive logging, real-time monitoring, incident response
Vulnerability Management - Compliance Frameworks: PCI DSS (6.1), SOC 2 (CC7.2), ISO 27001 (A.12.6). What Auditors Want: Container scanning, patch management, dependency tracking
Configuration Management - Compliance Frameworks: SOC 2 (CC8.1), NIST (PR.IP), CIS Controls (5.1). What Auditors Want: Baseline hardening, configuration drift detection, change tracking
Let's dive into each domain with specific configurations and compliance evidence.
Identity & Access Management (IAM)
The Problem: By default, Kubernetes automounts the default service account token into every pod. Combined with overly broad RBAC grants, a single compromised pod can reach far more of the cluster API than it should.
Essential RBAC Configuration
apiVersion: v1
kind: ServiceAccount
metadata:
name: default
namespace: production
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: production
name: pod-reader
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: read-pods
namespace: production
subjects:
- kind: ServiceAccount
name: app-service-account
namespace: production
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
# kube-apiserver OIDC flags
--oidc-issuer-url=https://your-identity-provider.com
--oidc-client-id=kubernetes
--oidc-username-claim=email
--oidc-groups-claim=groups
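With OIDC wired in, human access should flow through identity-provider groups rather than individual users. A minimal sketch, assuming your IdP emits a platform-engineers group via the groups claim:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: platform-engineers-read-pods
  namespace: production
subjects:
- kind: Group
  name: platform-engineers   # value comes from the --oidc-groups-claim above
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io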
Compliance Checklist
- Default service account tokens disabled: Default service accounts have automountServiceAccountToken: false
- Principle of least privilege: Each workload has specific, minimal RBAC permissions
- Regular access reviews: Monthly audit of ClusterRoles and RoleBindings
- MFA enforcement: All human access requires multi-factor authentication
- Service account rotation: Non-human accounts use short-lived tokens where possible
- Audit logging: All authentication and authorization events logged
Audit Evidence
- RBAC configuration files in version control
- Authentication logs showing successful/failed access attempts
- Access review reports with timestamps and approvers
- Service account usage reports
#!/bin/bash
# Monthly RBAC audit script
echo "=== RBAC Compliance Audit ==="
echo "Date: $(date)"
# Check for overprivileged service accounts
kubectl get clusterrolebindings -o json | jq -r '.items[] | select(.subjects[]?.kind == "ServiceAccount") | select(.roleRef.name == "cluster-admin") | .metadata.name'
# List all service accounts with default permissions
kubectl get serviceaccounts --all-namespaces -o json | jq -r '.items[] | select(.automountServiceAccountToken != false) | "\(.metadata.namespace)/\(.metadata.name)"'
# Generate access review report
kubectl auth can-i --list --as=system:serviceaccount:production:app-service-account
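To turn this into recurring audit evidence rather than an ad-hoc check, the script can run on a schedule and ship its output to your log store. A sketch, assuming the script is baked into an internal image (registry.company.com/compliance/rbac-audit is hypothetical) that bundles kubectl and jq, plus a read-only rbac-auditor service account:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: rbac-audit
  namespace: compliance
spec:
  schedule: "0 6 1 * *"                  # 06:00 UTC on the 1st of each month
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: rbac-auditor   # needs only get/list on RBAC resources and service accounts
          restartPolicy: OnFailure
          containers:
          - name: audit
            image: registry.company.com/compliance/rbac-audit:1.0.0   # hypothetical image containing the script above
            command: ["/bin/sh", "-c", "/scripts/rbac-audit.sh"]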
Network Security
The Problem: Kubernetes defaults to "allow all" networking. Any pod can communicate with any other pod, including across namespaces.
Network Policy Implementation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend-to-backend
namespace: production
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: database-egress-restriction
namespace: production
spec:
podSelector:
matchLabels:
app: database
policyTypes:
- Egress
egress:
- to: []
ports:
- protocol: TCP
port: 53 # DNS only
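One gotcha worth calling out: the default-deny policy above also blocks DNS lookups (which run over UDP as well as TCP), so most workloads need an explicit DNS egress allowance. A sketch, assuming CoreDNS runs in kube-system with the standard k8s-app: kube-dns label:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53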
Service Mesh Security (Istio mTLS)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: production
spec:
mtls:
mode: STRICT
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: frontend-policy
namespace: production
spec:
selector:
matchLabels:
app: frontend
rules:
- from:
- source:
principals: ["cluster.local/ns/production/sa/gateway-service-account"]
- to:
- operation:
methods: ["GET", "POST"]
Compliance Checklist
- Default deny policies: All namespaces have default deny network policies
- Micro-segmentation: Service-to-service communication is explicitly defined
- TLS encryption: All inter-service communication uses TLS
- Ingress controls: External traffic is restricted to specific entry points
- Egress controls: Outbound traffic is limited to required destinations
- Network monitoring: Traffic flows are logged and monitored
Audit Evidence
- Network policy configurations in version control
- Service mesh security configurations
- Network traffic flow logs
- TLS certificate management records
Data Protection
The Problem: Kubernetes stores secrets as base64-encoded strings by default. Container images often contain hardcoded credentials. Data encryption requires specific configuration.
External Secrets Management
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: vault-backend
namespace: production
spec:
provider:
vault:
server: "https://vault.company.com"
path: "secret"
version: "v2"
auth:
kubernetes:
mountPath: "kubernetes"
role: "production-role"
Sealed Secrets for GitOps Workflows
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
name: database-credentials
namespace: production
spec:
encryptedData:
username: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEQAx...
password: AhA5P9k1pI+3TLZNOmSNIK9kJt5...
# encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
- secrets
providers:
- aescbc:
keys:
- name: key1
secret: c2VjcmV0IGlzIHNlY3VyZQ==
- identity: {}
Pod Security Standards Enforcement
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
Compliance Checklist
- Secrets encryption: All secrets encrypted at rest and in transit
- External secrets management: Secrets stored in dedicated vault system
- No hardcoded credentials: Container images scanned for embedded secrets
- Data classification: Sensitive data identified and labeled
- Access logging: All secret access events logged with user attribution
- Key rotation: Encryption keys rotated according to policy
Audit Evidence
- Secret scanning reports from container registries
- Encryption configuration files
- Key rotation logs
- Data access audit trails
Security Monitoring
The Problem: Kubernetes generates massive amounts of log data, but most of it isn't security relevant. Without proper monitoring, security incidents go undetected.
Kubernetes API Audit Policy
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log security-relevant events at detailed level
- level: Metadata
namespaces: ["production", "staging"]
resources:
- group: ""
resources: ["secrets", "configmaps"]
- group: "rbac.authorization.k8s.io"
resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
# Log authentication failures
- level: Request
users: ["system:anonymous"]
verbs: ["get", "list", "watch"]
# Log exec and portforward (potential data exfiltration)
- level: Metadata
resources:
- group: ""
resources: ["pods/exec", "pods/portforward"]
# Custom Falco rules for enterprise security
- list: cryptomining_binaries
  items: [xmrig, minerd, xmr-stak]
- rule: Detect crypto mining
desc: Detect cryptocurrency mining
condition: >
spawned_process and
(proc.name in (cryptomining_binaries) or
proc.cmdline contains "stratum" or
proc.cmdline contains "cryptonight")
output: >
Cryptocurrency mining detected (user=%user.name command=%proc.cmdline
container=%container.name image=%container.image.repository)
priority: CRITICAL
- rule: Detect privilege escalation
desc: Detect attempts to escalate privileges
condition: >
spawned_process and
(proc.name in (su, sudo, doas) or
proc.cmdline contains "chmod +s" or
proc.cmdline contains "setuid")
output: >
Privilege escalation attempt (user=%user.name command=%proc.cmdline
container=%container.name)
priority: WARNING
Compliance Checklist
- Audit logging enabled: All security-relevant API calls logged
- Runtime monitoring: Container behavior monitored for anomalies
- Log retention: Security logs retained per compliance requirements
- Alert thresholds: Automated alerts for security policy violations
- Incident response: Documented procedures for security events
- Log integrity: Audit logs protected from tampering
Vulnerability Management
The Problem: Container images ship with outdated packages; base images drift; dependencies accumulate CVEs. If you don’t gate what enters the cluster, auditors will flag “uncontrolled risk.”
1) Scan images in CI
Use Trivy/Grype in CI and break the build on high/critical CVEs.
# Example (Trivy)
trivy image --scanners vuln --severity HIGH,CRITICAL \
--exit-code 1 --ignore-unfixed \
--format table $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
2) Generate SBOMs
Generate an SBOM (CycloneDX or SPDX) for every image and attach it to the build artifacts. Keep them in your registry or artifact repo.
# Example (Syft SBOM generation)
syft $IMAGE_REF -o cyclonedx-json > sbom.cdx.json
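The SBOM also doubles as a scan target, so you can re-check already-shipped images against new CVEs without pulling them again; for example, Grype can read it directly:
# Example (Grype, re-scanning from the SBOM)
grype sbom:./sbom.cdx.json --fail-on high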
3) Sign images
Sign build artifacts with Cosign and store the signatures in the registry.
cosign sign --key cosign.key $IMAGE_REF
cosign verify --key cosign.pub $IMAGE_REF
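If key management is a burden, Cosign v2 also supports keyless signing tied to your CI's OIDC identity and a transparency log; a sketch (the identity and issuer values are placeholders for your CI provider):
cosign sign --yes $IMAGE_REF
cosign verify \
  --certificate-identity-regexp 'https://git.company.com/platform/.*' \
  --certificate-oidc-issuer 'https://git.company.com' \
  $IMAGE_REF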
4) Enforce at admission (Kyverno)
Block unsigned images, disallow :latest, and restrict registries.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-signed-images
spec:
validationFailureAction: enforce
background: true
rules:
- name: verify-signatures
match:
any:
- resources:
kinds: ["Pod"]
verifyImages:
- imageReferences:
- "registry.company.com/*"
attestors:
- entries:
- keys:
publicKeys: |
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqh...
-----END PUBLIC KEY-----
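The signature policy covers one of the three controls mentioned above; the :latest ban and the registry allow-list can live in a companion policy. A sketch using Kyverno's pattern negation (the approved registry is an assumption):
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-sources
spec:
  validationFailureAction: enforce
  background: true
  rules:
  - name: disallow-latest-tag
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "Images must be pinned; the mutable :latest tag is not allowed."
      pattern:
        spec:
          containers:
          - image: "!*:latest"
  - name: approved-registries-only
    match:
      any:
      - resources:
          kinds: ["Pod"]
    validate:
      message: "Only images from registry.company.com are allowed."
      pattern:
        spec:
          containers:
          - image: "registry.company.com/*"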
5) Optional: Gate by CVE severity at admission
If your CI pipeline propagates scan results onto workload annotations (e.g., image.security.company.com/severity: high), you can deny admission based on that.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: deny-high-vuln-images
spec:
validationFailureAction: enforce
rules:
- name: block-high
match:
any:
- resources:
kinds: ["Pod"]
preconditions:
all:
- key: "{{ request.object.metadata.annotations.\"image.security.company.com/severity\" }}"
operator: AnyIn
value: ["HIGH","CRITICAL"]
validate:
message: "Images with HIGH/CRITICAL vulns are not allowed."
deny: {}
- Prefer distroless or minimal bases.
- Pin images by digest, not just a tag (see the sketch after this list).
- Patch cadence: weekly for critical workloads; monthly for the rest.
- Private, scanned registries only.
- Block public pulls at cluster egress except allowed registries.
- Dependency pinning with Renovate/Dependabot PRs and a mandatory review path.
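Digest pinning, mentioned above, is mechanical once the digest is resolved at release time; a sketch (crane is one option, and the digest value is a placeholder):
# Resolve the immutable digest for a released tag
crane digest registry.company.com/backend:1.4.2
# sha256:9f86d081884c7d659a2feaa0c55ad015a...

# Reference it in manifests by digest, not tag:
#   image: registry.company.com/backend@sha256:9f86d081884c7d659a2feaa0c55ad015a...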
Compliance Checklist
- CI pipeline blocks HIGH/CRITICAL CVEs; SBOMs generated for all images
- Images are signed; admission enforces signature verification
- No :latest; images pinned by digest; approved registries only
- Regular patching SLAs documented and met
- Runtime scans of node OS and container base layers
- Vulnerability exceptions documented with expiry and compensating controls
Audit Evidence
- CI scan reports, SBOM files, Cosign verification logs
- Kyverno/OPA policies in Git, admission controller logs
- Patch calendars, change tickets, exception register with approvals
Configuration Management
The Problem: Kubernetes makes everything configurable—sometimes dangerously so. Auditors look for baseline hardening, drift detection, and provable change control.
Baseline Hardening (Pod Security + Controls)
You already labeled namespaces with Pod Security Standards (restricted). Add specific workload controls:
1) Enforce pod security context (Kyverno)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: baseline-pod-security
spec:
validationFailureAction: enforce
background: true
rules:
- name: require-seccomp
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "seccompProfile RuntimeDefault required"
pattern:
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
- name: require-nonroot-and-readonly
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "runAsNonRoot and readOnlyRootFilesystem required; drop ALL caps"
pattern:
spec:
containers:
- securityContext:
runAsNonRoot: true
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
2) Deny privilege escalation, host access, and hostPath
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: deny-privileged-and-host
spec:
validationFailureAction: enforce
rules:
- name: no-privileged
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "Privileged containers are not allowed."
pattern:
spec:
containers:
- securityContext:
allowPrivilegeEscalation: false
privileged: false
- name: ban-host-namespaces
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "hostPID/hostIPC/hostNetwork not allowed."
deny:
conditions:
any:
- key: "{{ request.object.spec.hostPID || false }}"
operator: Equals
value: true
- key: "{{ request.object.spec.hostIPC || false }}"
operator: Equals
value: true
- key: "{{ request.object.spec.hostNetwork || false }}"
operator: Equals
value: true
- name: restrict-hostpath
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "hostPath volumes are forbidden."
deny:
conditions:
any:
- key: "{{ request.object.spec.volumes[].hostPath || [] | length(@) }}"
operator: GreaterThan
value: 0
3) Require resource limits and probes
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-limits-probes
spec:
validationFailureAction: enforce
background: true
rules:
- name: require-resources
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "CPU/memory requests & limits required for every container."
pattern:
spec:
containers:
- resources:
requests:
cpu: "?*"
memory: "?*"
limits:
cpu: "?*"
memory: "?*"
- name: require-probes
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "liveness and readiness probes required."
pattern:
spec:
containers:
- livenessProbe: {}
readinessProbe: {}
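Put together, a workload that passes all three policies looks roughly like this (the image digest, probe paths, and resource sizes are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault        # satisfies require-seccomp
      containers:
      - name: backend
        image: registry.company.com/backend@sha256:9f86d081884c7d659a2feaa0c55ad015a...   # pinned by digest
        securityContext:
          allowPrivilegeEscalation: false
          privileged: false
          runAsNonRoot: true
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080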
Change Control and Drift Detection (GitOps)
- Git as the source of truth (Helm/Kustomize).
- Argo CD enforces desired state; auto-sync off in prod; PR approvals required.
- Drift alerts: Argo CD notifications to Slack/Teams; deviations must be reconciled or explained.
Argo CD Application example (for the policy repo, where automated self-healing sync is usually acceptable even in prod):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: platform-policies
namespace: argocd
spec:
source:
repoURL: 'https://git.company.com/platform/policies.git'
targetRevision: main
path: kyverno
destination:
server: 'https://kubernetes.default.svc'
namespace: kyverno
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
Compliance Checklist
- CIS-aligned hardening (PSA restricted + additional controls)
- Mandatory securityContext, probes, resource limits on all workloads
- GitOps with peer-reviewed PRs; drift detection and alerting
- Change tickets link to Git SHAs; rollbacks documented
- Environment separation (dev/stage/prod) with different policy strictness
- Periodic config drift reports across namespaces
Audit Evidence
- Kyverno/Gatekeeper policy repos + Argo CD app manifests
- Drift reports, change tickets with approvals, rollback notes
- Periodic CIS scan reports (kube-bench, etc.) and remediation records
Putting It All Together: The Tooling Stack
- Kyverno for Kubernetes-native policies (security context, signatures, registries, probes)
- OPA/Gatekeeper (optional) for org-wide Rego policies and legacy controls
- CI scans (Trivy/Grype), SBOMs (Syft), signing (Cosign), provenance (SLSA where possible)
- Falco for syscall-level detections
- API audit logs with targeted high-value rules
- Centralized logs and alerts (Elastic/Splunk/Cloud) with retention per framework
- External Secrets Operator + Vault/KMS; etcd encryption at rest
- Red/Amber/Green (RAG) classification labels for data; access logs tied to identity
- Default-deny + micro-segmentation via NetworkPolicy
- mTLS via service mesh (Istio/Linkerd) with AuthZ policies
- Git as the evidence store (policies, manifests, CI reports, SBOMs)
- Scheduled exports of audit logs, Falco alerts, and Argo drift reports (see the sketch after this list)
- Quarterly control reviews with updated risk register and exceptions
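A lightweight way to make those scheduled exports concrete is a periodic job that snapshots the key evidence sources into a dated folder committed to the evidence repo. A sketch (resource names assume the Kyverno and Argo CD setups above; output paths are arbitrary):
#!/bin/bash
# Weekly evidence export (sketch) -- commit the output directory to the evidence repo
EVIDENCE_DIR="evidence/$(date +%Y-%m-%d)"
mkdir -p "$EVIDENCE_DIR"

# Policy and network controls as currently applied in the cluster
kubectl get clusterpolicies.kyverno.io -o yaml > "$EVIDENCE_DIR/kyverno-policies.yaml"
kubectl get networkpolicies --all-namespaces -o yaml > "$EVIDENCE_DIR/network-policies.yaml"

# Admission warnings (policy violations surface as Warning events)
kubectl get events --all-namespaces --field-selector type=Warning -o json > "$EVIDENCE_DIR/warning-events.json"

# GitOps sync/drift status from Argo CD
kubectl get applications.argoproj.io -n argocd -o json \
  | jq '[.items[] | {name: .metadata.name, sync: .status.sync.status, health: .status.health.status}]' \
  > "$EVIDENCE_DIR/argocd-sync-status.json"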
Kubernetes compliance isn’t a sprint to a fancy PDF; it’s a system: IaC + Policy as Code + Continuous Monitoring. When every control is codified, versioned, and enforced, your “Are we compliant?” conversation stops being awkward—and starts being a link to a Git repo, a dashboard, and a clear audit trail.
A practical next step: turn this into a one-page auditor checklist, or a repo scaffold (/policies, /pipelines, /evidence) you can drop into Git and start enforcing today.