Cloud Security Fundamentals

Cloud security starts with understanding the shared responsibility model — what the provider secures vs what you own. IAM is the most impactful control in any cloud: misconfigurations in AWS IAM, Azure RBAC, and GCP IAM cause more breaches than all network misconfigs combined. This covers IAM across all three clouds, common misconfigurations, CSPM tools (Prowler, ScoutSuite, Checkov), and the zero-trust cloud posture.

AWS · Azure · GCP · IAM · CSPM · Zero Trust · Shared Responsibility

Cloud is different, but the fundamentals apply

The names changed when everything moved to cloud, but the underlying security disciplines did not. IAM is still identity and access management; it just uses policies and roles instead of Active Directory groups. Security groups are still firewalls; they just live in software instead of a rack. S3 is still shared storage, except that a single misconfigured ACL or bucket policy can expose it to the whole internet. The concepts are the same. The attack surface is larger, the blast radius of a mistake is global, and the controls work at API level rather than network level.

The critical difference that reshapes everything: in cloud, every resource is managed through an API, and that API is accessible from anywhere in the world. A leaked AWS access key is not a local threat. It is a global authentication credential to your entire cloud estate. A developer who accidentally pushes an ~/.aws/credentials file to a public GitHub repo gives attackers immediate access — and automated tools scan GitHub for exactly these secrets within seconds of a commit. The response time from "key leaked" to "account compromised" in the wild is measured in minutes.
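The automated scanning described above relies on the fact that AWS access key IDs have a recognisable shape. The following is a minimal sketch of that idea, not a replacement for a real secret scanner: the function name `find_access_key_ids` and the sample text are illustrative, and the key shown is the placeholder used in AWS documentation.

```python
import re

# AWS access key IDs are 20 characters: a 4-letter prefix (AKIA for long-lived
# user keys, ASIA for temporary STS keys) followed by 16 uppercase letters/digits.
ACCESS_KEY_ID = re.compile(r"(?<![A-Z0-9])(AKIA|ASIA)[A-Z0-9]{16}(?![A-Z0-9])")


def find_access_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line_number, key_id) for every access-key-shaped string in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Placeholder key from AWS documentation (not a real credential).
sample = "[default]\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = eu-west-1\n"
print(find_access_key_ids(sample))
```

Running a check like this as a pre-commit hook catches the key before it reaches a remote repository, which is the only point at which the leak is still cheap to fix.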

CISA's 2024 Cloud Security Best Practices guide identifies identity misconfigurations as the leading cause of cloud security incidents — not network vulnerabilities, not software bugs. The attack path in almost every significant cloud breach starts with a compromised identity: a leaked access key, a phished MFA, or an over-permissioned role assumed via SSRF.

Shared responsibility: who owns what?

The shared responsibility model defines where the cloud provider's security obligation ends and yours begins. It is the most important mental model for cloud security because misunderstanding it is where the majority of cloud breaches originate. The provider is responsible for the security of the cloud — the physical infrastructure, the hardware, the hypervisor, the managed service availability. You are responsible for security in the cloud — how you configure everything you deploy on top of it.

What the provider always secures

| Provider responsibility | Scope |
| --- | --- |
| Physical data centres | Locks, guards, power, cooling, hardware disposal |
| Global network infrastructure | Inter-region backbone, DDoS protection at the edge |
| Hypervisor | VM isolation, escape prevention, patching |
| Managed service availability | RDS, Lambda, ECS, AKS (the platform SLA) |

What the customer always owns

| Customer responsibility | What it means in practice |
| --- | --- |
| IAM configuration | Who has access, what policies are attached, whether MFA is enforced |
| Data encryption choices | Encryption at rest (S3, EBS, RDS) and in transit; opt-in, not default everywhere |
| Network access controls | Security groups, NACLs, VPC configuration, firewall rules |
| Application security | Your code, its dependencies, OWASP vulnerabilities in your app |
| Patch management (IaaS) | Guest OS, application runtime, middleware; the provider does NOT patch EC2 instances |
| Logging configuration | CloudTrail/Activity Log/Cloud Audit Logs must be enabled; not always on by default in all regions |

"AWS said their infrastructure is secure": yes, and that is true. AWS will not misconfigure your S3 bucket ACL. AWS will not attach AdministratorAccess to your IAM users. AWS will not disable CloudTrail in your account. Every one of those is your configuration, your decision, your responsibility. The shared responsibility model has a sharp boundary and misconfigurations happen entirely on the customer side of it.

IAM across AWS, Azure, and GCP

IAM is the single highest-impact control in any cloud environment. Get it right and the blast radius of every other misconfiguration shrinks dramatically. Get it wrong and a single compromised credential gives an attacker the run of the entire cloud estate. The three major clouds have different naming conventions but the underlying model is identical: an identity (user, group, service account, managed identity) is granted permissions via a role/policy and can act on resources within that permission boundary.
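To make the shared model concrete, here is a deliberately simplified sketch of how a request is checked against a set of policy statements: an explicit deny wins, an allow is needed to proceed, and anything unmatched is denied. The statement format (`effect`, `actions`, `resources`) and the function name `is_allowed` are assumptions for illustration; real evaluation in each cloud also involves conditions, boundaries, and organisation-level policies.

```python
from fnmatch import fnmatchcase


def is_allowed(statements: list[dict], action: str, resource: str) -> bool:
    """Simplified evaluation: explicit Deny wins, then Allow, else implicit deny."""
    allowed = False
    for stmt in statements:
        action_match = any(fnmatchcase(action, pat) for pat in stmt["actions"])
        resource_match = any(fnmatchcase(resource, pat) for pat in stmt["resources"])
        if not (action_match and resource_match):
            continue  # statement does not apply to this request
        if stmt["effect"] == "Deny":
            return False  # an explicit deny overrides any allow
        allowed = True
    return allowed


# Hypothetical policy: read access to a log bucket, except one protected prefix.
policy = [
    {"effect": "Allow", "actions": ["s3:GetObject"],
     "resources": ["arn:aws:s3:::app-logs/*"]},
    {"effect": "Deny", "actions": ["s3:*"],
     "resources": ["arn:aws:s3:::app-logs/secret/*"]},
]

print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::app-logs/2024/a.log"))
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::app-logs/secret/key.pem"))
print(is_allowed(policy, "s3:PutObject", "arn:aws:s3:::app-logs/2024/a.log"))
```

The third call is the important one: nothing denies the write, but nothing allows it either, so the request fails. Least privilege depends on that default.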

AWS IAM

bash
# Find every user, group, and role with AdministratorAccess attached (the most common critical misconfiguration):
aws iam list-entities-for-policy --policy-arn arn:aws:iam::aws:policy/AdministratorAccess

# Confirm what is attached to a specific user:
aws iam list-attached-user-policies --user-name alice

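# List active access keys last rotated more than 90 days ago (should be rotated or disabled).
# Report columns: $9 = access_key_1_active, $10 = access_key_1_last_rotated (use $14/$15 for the second key).
aws iam generate-credential-report
CUTOFF=$(date -u -d '90 days ago' +%Y-%m-%d)   # GNU date; on macOS use: date -u -v-90d +%Y-%m-%d
aws iam get-credential-report --output text --query Content | base64 -d | \
  awk -F',' -v cutoff="$CUTOFF" 'NR>1 && $9=="true" && $10<cutoff {print $1, $10}'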

# Check for root account access keys (should never exist):
aws iam get-account-summary | grep "AccountAccessKeysPresent"
# Should return 0

# Check password policy:
aws iam get-account-password-policy

# List all roles and their trust policies (find over-trusted roles):
aws iam list-roles --query 'Roles[*].[RoleName,AssumeRolePolicyDocument]'
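
The credential report check above can also be done by column name instead of column position, which is less fragile. This is a sketch under assumptions: it expects the decoded CSV text of the report, uses the report's `access_key_N_active` and `access_key_N_last_rotated` columns, and the function name `stale_access_keys` and the sample rows are illustrative.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def stale_access_keys(report_csv: str, now: datetime,
                      max_age_days: int = 90) -> list[tuple[str, str]]:
    """Return (user, key_slot) for active keys last rotated before the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        for slot in ("1", "2"):
            if row.get(f"access_key_{slot}_active") != "true":
                continue
            rotated = row.get(f"access_key_{slot}_last_rotated") or "N/A"
            if rotated == "N/A":
                continue
            # Timestamps are ISO 8601; normalise a trailing Z defensively.
            rotated_at = datetime.fromisoformat(rotated.replace("Z", "+00:00"))
            if rotated_at < cutoff:
                stale.append((row["user"], f"access_key_{slot}"))
    return stale


# Trimmed sample with only the columns this check reads (made-up users).
SAMPLE_REPORT = (
    "user,access_key_1_active,access_key_1_last_rotated,"
    "access_key_2_active,access_key_2_last_rotated\n"
    "alice,true,2024-01-10T09:00:00+00:00,false,N/A\n"
    "bob,true,2025-05-20T09:00:00+00:00,true,2023-11-02T09:00:00+00:00\n"
)

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(stale_access_keys(SAMPLE_REPORT, now))
```

Passing `now` explicitly keeps the function deterministic, so the same report always produces the same findings in a scheduled job or a test.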

Azure RBAC

shell
# List all role assignments at subscription scope:
az role assignment list --scope /subscriptions/<sub-id> --output table

# Find all users/SPs with Owner role (highest risk):
az role assignment list --role Owner --all --output table

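# Check for service principals with password credentials (prefer managed identities).
# A bare filter treats null and empty lists as false; also review app registrations with `az ad app list`.
az ad sp list --all --query '[?passwordCredentials].[displayName,appId]'

# List user-assigned managed identities (then review each principalId's role assignments):
az identity list --output table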

# Check PIM (Privileged Identity Management) eligibility vs active assignments:
# az rest --method GET --uri 'https://management.azure.com/subscriptions/{id}/providers/Microsoft.Authorization/roleEligibilitySchedules?api-version=2020-10-01'

GCP IAM

bash
# List all IAM policy bindings at project level:
gcloud projects get-iam-policy <project-id> --format=json | \
  jq '.bindings[] | select(.role == "roles/owner" or .role == "roles/editor") | {role, members}'

# List service account keys (should be zero — use Workload Identity instead):
gcloud iam service-accounts keys list --iam-account=<sa>@<project>.iam.gserviceaccount.com

# Check for allUsers or allAuthenticatedUsers on GCS buckets (public access):
gsutil iam get gs://<bucket-name> | grep -E "allUsers|allAuthenticatedUsers"

# Check org policies (prevent public IPs, restrict domains, etc.):
gcloud org-policies list --project=<project-id>

💡 The most effective thing you can do immediately in any cloud environment: run a CSPM tool (Prowler for AWS, ScoutSuite for multi-cloud, Checkov for IaC). Sixty minutes of a Prowler run against a production AWS account usually surfaces the three or four findings that actually matter: root MFA missing, overly broad role, disabled CloudTrail in a region. The rest of the findings are noise. Fix the critical ones first.
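
The GCP policy checks above (basic roles and public members) can be combined into one pass over the JSON that `gcloud projects get-iam-policy --format=json` or a bucket IAM policy returns. A minimal sketch, assuming the standard `bindings` list of `role` and `members`; the function name `risky_bindings` and the sample policy are illustrative.

```python
BROAD_ROLES = {"roles/owner", "roles/editor"}
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def risky_bindings(policy: dict) -> list[dict]:
    """Flag bindings that grant access to the public or use a basic role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        members = binding.get("members", [])
        public = sorted(PUBLIC_MEMBERS.intersection(members))
        if public:
            findings.append({"role": role, "reason": "public member", "members": public})
        if role in BROAD_ROLES:
            findings.append({"role": role, "reason": "basic role", "members": sorted(members)})
    return findings


# Made-up policy for illustration.
sample_policy = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:alice@example.com"]},
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
        {"role": "roles/logging.viewer", "members": ["group:ops@example.com"]},
    ]
}

for finding in risky_bindings(sample_policy):
    print(finding)
```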

The most common IAM mistakes

The most common IAM mistakes are well-documented, show up in CISA advisories, and still appear in production environments year after year. That is not because security teams do not know about them, but because cloud estates grow faster than governance does. A developer creates a "temporary" IAM user with AdministratorAccess for a weekend project, and it is still there three years later. An EC2 instance gets an IAM role with S3 full access because it was easier than figuring out the exact bucket permissions. These are the findings that matter most because they are the ones attackers exploit first.

AWS critical misconfigurations

| Misconfiguration | Severity | Remediation |
| --- | --- | --- |
| Root account without MFA | Critical | Enable MFA; use federated roles (IAM Identity Center) instead of root for daily work |
| AdministratorAccess on IAM users | Critical | Use IAM Identity Center (SSO); never attach AdministratorAccess to human users |
| Long-lived access keys for humans | High | Use IAM Identity Center and temporary credentials; no human should have long-lived keys |
| EC2/Lambda role with s3:* | High | Scope down to exactly the buckets and actions needed; add condition keys |
| No permission boundaries on developer accounts | Medium | Apply permission boundaries to cap maximum permissions and block privilege escalation through IAM |
| VPC Flow Logs disabled | Medium | Enable for all VPCs; without them there is no network visibility for incident response |
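
The `s3:*` style of finding in the table above is easy to detect mechanically. The sketch below walks a standard IAM policy document and flags Allow statements that pair a wildcard action with a wildcard resource. The function names and the sample document are illustrative, and the check ignores `NotAction`, `NotResource`, and conditions, which a real analyser would need to handle.

```python
def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def broad_statements(policy_document: dict) -> list[dict]:
    """Flag Allow statements with a wildcard action on Resource '*'."""
    flagged = []
    for stmt in _as_list(policy_document.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        wildcard_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if wildcard_actions and "*" in resources:
            flagged.append({"sid": stmt.get("Sid", ""), "actions": wildcard_actions})
    return flagged


# Made-up policy document for illustration.
sample_document = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Sid": "Scoped", "Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::app-logs/*"},
        {"Sid": "DenyAll", "Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}

print(broad_statements(sample_document))
```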

Azure critical misconfigurations

| Misconfiguration | Severity | Remediation |
| --- | --- | --- |
| Owner role at subscription scope (too many users) | Critical | Review all Owner assignments; use PIM for just-in-time activation |
| Service principal with client secret (not managed identity) | High | Migrate to Managed Identity wherever possible; secrets require manual rotation |
| No Conditional Access policy | High | Require MFA for all users, restrict access to managed devices for admin roles |
| Diagnostic settings not configured | Medium | Enable Activity Log export to Storage/Log Analytics; default retention is 90 days |

GCP critical misconfigurations

| Misconfiguration | Severity | Remediation |
| --- | --- | --- |
| allUsers on GCS bucket | Critical | Remove allUsers binding; enable uniform bucket-level access; use Org Policy to prevent |
| Service account keys created and shared | Critical | Delete all SA keys; use Workload Identity Federation instead |
| Owner/Editor primitive roles at project | High | Migrate to predefined roles; primitive roles are far too broad |
| No org-level policy constraints | Medium | Enable: constraints/iam.disableServiceAccountKeyCreation, constraints/compute.disableGuestAttributesAccess |

One pattern appears in nearly every major cloud breach: the attacker used legitimate credentials. Not a zero-day. Not a sophisticated exploit. A real AWS access key from a GitHub commit, a real Azure service principal secret from a leaked configuration file, a real GCP service account key from an unsecured storage bucket. The most important investment in cloud security is not detection or response. It is keeping credentials from leaking in the first place, and minimizing what those credentials can do when they do leak.

CSPM: automated posture management

Cloud Security Posture Management (CSPM) tools continuously scan your cloud environment against security benchmarks and flag misconfigurations before (or after) they are exploited. They do not require deploying agents — they use the cloud provider's read-only APIs with a configured set of credentials. Three tools cover the landscape: Prowler for deep AWS coverage, ScoutSuite for multi-cloud assessments, and Checkov for Infrastructure-as-Code scanning before deployment.

shell
# ---- Prowler (AWS — 500+ checks against CIS, PCI-DSS, NIST, SOC2) ----
pip install prowler

# Full assessment with HTML + JSON + CSV output (Prowler v4 names the JSON format json-ocsf; v3 used `-M html json csv`):
prowler aws --output-formats html json-ocsf csv -o output/

# Assess only specific service (faster for CI/CD):
prowler aws --service iam s3 cloudtrail

# Run specific CIS checks:
prowler aws --compliance cis_1.5_aws

# Check only critical/high findings:
prowler aws --severity critical high

# ---- ScoutSuite (multi-cloud: AWS, Azure, GCP, OCI, Alibaba) ----
pip install scoutsuite
# The pip package installs the `scout` command (`python scout.py` only works from a git checkout).
scout aws --report-dir /tmp/scoutsuite-report/
scout azure --cli --tenant <tenant-id> --report-dir /tmp/
scout gcp --user-account --report-dir /tmp/

# Open the self-contained HTML report:
# open /tmp/scoutsuite-report/scoutsuite-results/scoutsuite_report.html

# ---- Checkov (IaC: Terraform, CloudFormation, K8s, Helm, ARM) ----
pip install checkov

# Scan a Terraform directory:
checkov -d terraform/ --compact

# Scan a specific CloudFormation template:
checkov -f cloudformation-template.yaml

# Scan Kubernetes manifests:
checkov -d k8s-manifests/ --framework kubernetes

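# Fail CI/CD pipeline if any HIGH findings exist.
# Severity filters need platform severity data (an API key via --bc-api-key); without it, pass check IDs to --check.
checkov -d terraform/ --check HIGH --compact --quiet
# Returns exit code 1 if findings exist

# Skip a check for the whole run:
checkov -d terraform/ --skip-check CKV_AWS_18
# To suppress a single resource with a justification, add an inline comment in the IaC file:
#   #checkov:skip=CKV_AWS_18:Access logging is handled by a central bucket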

Integrating CSPM into CI/CD

The highest-value integration: run Checkov as a required CI/CD check on every pull request that modifies infrastructure code. A developer who creates a public S3 bucket in Terraform sees the Checkov failure in their PR, not six months later in a breach notification. Prowler as a scheduled job (daily or weekly) catches what slips through configuration drift — a manual change through the console, a permission added "temporarily", a security group rule that was never removed.
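
One way to wire this into a pull-request check is a small gate script that reads Checkov's JSON output and fails unless every failed check has been explicitly accepted. This is a sketch under assumptions: it expects the report shape produced by `checkov -o json` (an object with `results.failed_checks`, or a list of such objects when several frameworks are scanned), and the names `failed_checks`, `gate`, and the sample report are illustrative.

```python
def failed_checks(report) -> list[tuple[str, str]]:
    """Collect (check_id, resource) for every failed check in the report."""
    reports = report if isinstance(report, list) else [report]
    failures = []
    for item in reports:
        for check in item.get("results", {}).get("failed_checks", []):
            failures.append((check.get("check_id", "?"), check.get("resource", "?")))
    return failures


def gate(report, accepted: set[str]) -> int:
    """Return a process exit code: 1 if any failure is not in the accepted set."""
    blocking = [f for f in failed_checks(report) if f[0] not in accepted]
    for check_id, resource in blocking:
        print(f"BLOCKING {check_id} on {resource}")
    return 1 if blocking else 0


# Made-up report fragment for illustration.
sample_report = {
    "check_type": "terraform",
    "results": {
        "failed_checks": [
            {"check_id": "CKV_AWS_20", "resource": "aws_s3_bucket.public"},
            {"check_id": "CKV_AWS_18", "resource": "aws_s3_bucket.logs"},
        ]
    },
}

print(gate(sample_report, accepted={"CKV_AWS_18"}))
```

In a pipeline, the report would come from `checkov -d terraform/ -o json` and the return value would be passed to `sys.exit`. Keeping the accepted set in version control makes every exception reviewable in the same pull request that introduces it.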

Network security in cloud: security groups and VPCs

Cloud networks use abstractions over physical networking but the concepts and the risks are the same: keep the attack surface small, apply least privilege at the network layer, know where your traffic goes, log it all. The three major controls are security groups / network security groups (instance-level firewall), VPCs (network segmentation), and VPC Flow Logs (traffic visibility).

Security groups: the cloud firewall

| Rule | Severity | Risk and remediation |
| --- | --- | --- |
| 0.0.0.0/0 inbound to port 22 (SSH) | Critical | Entire internet can attempt SSH to your EC2 instance; restrict to known IPs or VPN |
| 0.0.0.0/0 inbound to port 3389 (RDP) | Critical | Same as above for Windows; RDP is actively scanned and brute-forced |
| 0.0.0.0/0 inbound to database ports (3306/5432/27017/6379) | Critical | Databases should never be internet-accessible; restrict to application security group only |
| Overly broad egress (all traffic allowed) | Medium | Restrict outbound to known destinations; limits blast radius if a workload is compromised |
| Unused security groups | Low | Delete them to reduce sprawl; unused rules may be re-attached to new instances with unintended scope |
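
The critical rows in the table above can be checked against the JSON that `aws ec2 describe-security-groups` returns. A minimal sketch, assuming the `IpPermissions` structure from that output; the port list mirrors the table, and the function name `open_to_world` and the sample group are illustrative.

```python
SENSITIVE_PORTS = {
    22: "SSH", 3389: "RDP", 3306: "MySQL",
    5432: "PostgreSQL", 27017: "MongoDB", 6379: "Redis",
}


def open_to_world(security_group: dict) -> list[tuple[int, str]]:
    """Return sensitive ports reachable from 0.0.0.0/0 or ::/0."""
    findings = []
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        if protocol not in ("tcp", "udp", "-1"):
            continue  # ICMP and other protocols have no port range
        cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
        if "0.0.0.0/0" not in cidrs and "::/0" not in cidrs:
            continue
        if protocol == "-1":
            low, high = 0, 65535  # "all traffic" rules carry no port fields
        else:
            low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
        for port, name in sorted(SENSITIVE_PORTS.items()):
            if low <= port <= high:
                findings.append((port, name))
    return findings


# Made-up security group for illustration.
sample_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ],
}

print(open_to_world(sample_group))
```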

VPC design for security

shell
# AWS VPC best practice architecture:
#
# Internet Gateway
#       |
# Public subnets      → Load balancers, NAT Gateways, Bastion hosts
#       |
# Private subnets     → Application servers (EC2, ECS, Lambda in VPC)
#       |
# Data subnets        → RDS, ElastiCache, OpenSearch (no internet route)
#
# Key rules:
# - Data subnets: NO route to internet gateway
# - Application servers: access internet only via NAT Gateway (outbound only)
# - Separate VPCs (or accounts) for prod/staging/dev
# - VPC Peering / Transit Gateway for controlled cross-VPC access
# - PrivateLink for AWS service access (no internet, no NAT)

# Enable VPC Flow Logs (essential for incident response):
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-12345abc \
  --traffic-type ALL \
  --log-destination-type cloud-watch-logs \
  --log-group-name /aws/vpc/flowlogs \
  --deliver-logs-permission-arn arn:aws:iam::<account>:role/flowlogs-role

# Query VPC Flow Logs in Athena (needs flow logs delivered to S3 rather than CloudWatch Logs) to find unexpected outbound traffic from a compromised instance:
# SELECT srcaddr, dstaddr, dstport, packets, bytes
# FROM vpc_flow_logs
# WHERE srcaddr = '10.0.1.14'
# AND action = 'ACCEPT'
# ORDER BY bytes DESC
# LIMIT 50;

💡 VPC Flow Logs are the most underutilised security control in AWS. They cost little to store in S3 (compressed), they are queryable in Athena for a small charge based on the data each query scans, and they are the only record of which instances communicated with which IPs on which ports. Without them, incident response in a cloud breach is essentially blind. Enable them for every VPC, every region, on day one. The question is not "should I pay for flow logs" but "can I afford not to know what my instances are doing on the network?"
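
When Athena is not set up yet, the same question (where did a suspect instance send its traffic?) can be answered from raw log lines. The sketch below assumes the default version 2 flow log format with its 14 space-separated fields; the function name `top_destinations` and the sample records are illustrative.

```python
from collections import Counter

# Field order of the default (version 2) flow log record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def top_destinations(lines, srcaddr: str, limit: int = 5):
    """Sum accepted bytes per (dstaddr, dstport) for traffic sent by srcaddr."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # header line or custom format
        record = dict(zip(FIELDS, parts))
        if record["srcaddr"] != srcaddr or record["action"] != "ACCEPT":
            continue
        if not (record["bytes"].isdigit() and record["dstport"].isdigit()):
            continue  # NODATA/SKIPDATA records use "-" placeholders
        totals[(record["dstaddr"], int(record["dstport"]))] += int(record["bytes"])
    return totals.most_common(limit)


# Made-up records using documentation IP ranges.
sample_lines = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.14 203.0.113.9 49152 443 6 20 9000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.14 203.0.113.9 49153 443 6 10 1000 1700000060 1700000120 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.14 198.51.100.7 49154 22 6 5 400 1700000060 1700000120 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.2.20 10.0.1.14 443 49152 6 8 600 1700000000 1700000060 ACCEPT OK",
]

print(top_destinations(sample_lines, "10.0.1.14"))
```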

Zero trust posture and continuous compliance

Zero trust in cloud is not a product you buy. It is a set of principles that, when applied consistently, reduce the blast radius of any single failure — whether that failure is a leaked credential, a compromised instance, a supply chain attack on a dependency, or an insider threat. The principle is simple: never assume a request is legitimate because of where it comes from or what identity it claims. Verify it explicitly, every time, against the minimum necessary permissions, with full logging.

The six pillars of zero trust in cloud

| Pillar | Implementation in cloud | Tools |
| --- | --- | --- |
| Identity | Every API call authenticated; MFA everywhere; short-lived credentials (roles/tokens) not long-lived access keys; anomaly detection on API patterns | IAM Identity Center, PIM, Entra ID Conditional Access, GCP Workload Identity |
| Least privilege | Automated enforcement via SCPs/guardrails; regular access reviews (remove what is not used); permission boundaries cap max permissions | AWS IAM Access Analyzer, Microsoft Entra access reviews, GCP IAM Recommender |
| Device | Restrict console and sensitive API access to corporate managed devices; MDM enrollment required for admin roles | Conditional Access (Azure), AWS Verified Access, BeyondCorp (GCP) |
| Network | VPC segmentation; security groups by application tier; PrivateLink for managed service access; no management ports open to internet | AWS VPC, Azure VNet, GCP VPC; flow logs in all |
| Logging | CloudTrail/Activity Logs/Cloud Audit Logs in EVERY region, EVERY service; immutable log storage; SIEM ingestion | AWS CloudTrail + Security Lake, Microsoft Sentinel, Google Security Operations (Chronicle) |
| Detection & Response | GuardDuty/Defender for Cloud/Security Command Center for anomaly detection; automated remediation (Lambda auto-disabling public buckets, auto-revoking compromised keys) | EventBridge + Lambda, Security Hub, Azure Policy auto-remediation |

Automated remediation examples

python
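# AWS Lambda auto-remediating a newly public S3 bucket.
# Triggered by an EventBridge rule on CloudTrail events that can expose a bucket
# (PutBucketAcl, PutBucketPolicy, DeleteBucketPublicAccessBlock). If the rule also
# matches PutBucketPublicAccessBlock, add a guard: this function emits that event
# itself and would re-trigger in a loop.
import boto3

s3 = boto3.client('s3')  # created once and reused across invocations

def lambda_handler(event, context):
    # CloudTrail puts the bucket name under detail.requestParameters for these events
    bucket = event['detail']['requestParameters']['bucketName']
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            'BlockPublicAcls': True,
            'IgnorePublicAcls': True,
            'BlockPublicPolicy': True,
            'RestrictPublicBuckets': True
        }
    )
    print(f"Blocked public access on {bucket}")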

# AWS Security Hub custom action: auto-disable a compromised IAM user's access key.
# EventBridge rule: SecurityHub > Findings > Filter on IAM compromised credential
# Target: Lambda that calls iam.update_access_key(UserName=<user>, AccessKeyId=<key-id>, Status='Inactive')

shell
# Azure Policy auto-remediation: enforce storage account HTTPS-only.
# Verify that the definition ID below matches the intended built-in policy before assigning.
# --mi-system-assigned creates the identity that --identity-scope and --role apply to.
az policy assignment create \
  --name "enforce-https-storage" \
  --policy "34c877ad-507e-4c82-993e-3452a6e0ad3c" \
  --scope "/subscriptions/<sub-id>" \
  --location eastus \
  --mi-system-assigned \
  --identity-scope "/subscriptions/<sub-id>" \
  --role "Contributor"

The goal of continuous monitoring and automated remediation is to reduce the mean time to remediation (MTTR) for common misconfigs from weeks (manual review cycle) to seconds (automated fix triggered by the misconfiguration itself). Not everything can be auto-remediated — some fixes need human judgment. But the known, high-confidence findings (public S3 bucket, disabled MFA, compromised credential indicator) should trigger immediate automated action with a notification to the security team.
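
The split between automated action and human judgment can be expressed as a small routing rule. Everything in this sketch is hypothetical: the finding types, the `confidence` field, and the function name `plan_response` are illustrative stand-ins for whatever your detection tooling emits.

```python
# Hypothetical finding types that are safe to fix without a human in the loop.
AUTO_REMEDIATE = {"public_s3_bucket", "mfa_disabled", "compromised_credential"}


def plan_response(finding: dict) -> dict:
    """Route a finding: auto-remediate known high-confidence types, else open a ticket."""
    auto = finding["type"] in AUTO_REMEDIATE and finding.get("confidence") == "high"
    return {
        "action": "auto_remediate" if auto else "open_ticket",
        "notify_security_team": True,  # the team is told in both cases
        "resource": finding["resource"],
    }


print(plan_response({"type": "public_s3_bucket", "confidence": "high",
                     "resource": "arn:aws:s3:::example-bucket"}))
print(plan_response({"type": "unusual_api_pattern", "confidence": "medium",
                     "resource": "arn:aws:iam::123456789012:role/example"}))
```

Keeping the allow-list of auto-remediated types short and explicit is the design choice that matters: anything not on it falls through to a person.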
