PART IV — OPERATIONS, COMPLIANCE, AND COST
Chapter 9 — Monitoring & Incident Response
9.1 Comprehensive Logging Strategy
Log Categories and Collection
Required Log Sources
| Log Type | AWS Sources | Azure Sources | GCP Sources | Retention |
|---|---|---|---|---|
| Control Plane | CloudTrail, Config | Activity Log, Policy | Audit Logs, Cloud Logging | 365 days |
| Network | VPC Flow Logs | NSG Flow Logs | VPC Flow Logs | 90 days |
| Application | CloudWatch Logs | App Insights | Cloud Logging | 90 days |
| Security | GuardDuty, Inspector | Microsoft Defender for Cloud (formerly Security Center) | Security Command Center | 365 days |
| Database | RDS Logs, CloudTrail | SQL Audit | Cloud SQL Logs | 90 days |
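The retention column can be expressed as a lookup that a lifecycle or cleanup job consults. The following is a minimal sketch: the day counts come from the table above, while the dictionary keys and the names `RETENTION_DAYS` and `is_expired` are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Retention periods in days, copied from the table above.
# The key names are illustrative assumptions.
RETENTION_DAYS = {
    "control_plane": 365,
    "network": 90,
    "application": 90,
    "security": 365,
    "database": 90,
}


def is_expired(log_type: str, written_at: datetime, now: datetime) -> bool:
    """Return True once a log object is older than its retention period."""
    retention = timedelta(days=RETENTION_DAYS[log_type])
    return now - written_at > retention


# Example inputs: a flow log written 102 days before the evaluation date.
now = datetime(2025, 2, 11, tzinfo=timezone.utc)
old_flow_log = datetime(2024, 11, 1, tzinfo=timezone.utc)
```

In practice the same periods would more likely be enforced by provider-side lifecycle rules; the function is useful mainly for auditing that those rules match the policy.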
Centralized Log Architecture

    LogAggregation:
      Collection:
        - AWS: CloudWatch Logs → S3 → SIEM
        - Azure: Log Analytics → Event Hub → SIEM
        - GCP: Cloud Logging → Pub/Sub → BigQuery → SIEM

      Processing:
        - Parsing: Field extraction and normalization
        - Enrichment: Threat intelligence integration
        - Correlation: Multi-source event correlation
        - Storage: Hot/warm/cold tier optimization

      Analysis:
        - Real-time: Stream processing for alerts
        - Batch: Historical trend analysis
        - ML: Anomaly detection and pattern recognition
        - Forensics: Long-term query capabilities

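The parsing, normalization, and enrichment steps of the processing stage can be sketched as a small pipeline. This is an illustration under stated assumptions, not a reference implementation: the raw field names are simplified stand-ins modeled on CloudTrail and Azure Activity Log records, and `FIELD_MAPS`, `KNOWN_BAD_IPS`, `normalize`, `enrich`, and `process` are hypothetical names.

```python
from typing import Any

# Illustrative threat-intelligence set; a real deployment would load a feed.
KNOWN_BAD_IPS = {"203.0.113.45"}

# Per-provider mapping from raw field names to normalized field names.
# The raw names are simplified assumptions, not complete provider schemas.
FIELD_MAPS = {
    "aws": {
        "eventTime": "timestamp",
        "eventName": "operation",
        "sourceIPAddress": "source_ip",
    },
    "azure": {
        "time": "timestamp",
        "operationName": "operation",
        "callerIpAddress": "source_ip",
    },
}


def normalize(provider: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Parsing and normalization: keep mapped fields under common names."""
    mapping = FIELD_MAPS[provider]
    event = {common: raw[src] for src, common in mapping.items() if src in raw}
    event["provider"] = provider
    return event


def enrich(event: dict[str, Any]) -> dict[str, Any]:
    """Enrichment: flag events whose source IP is in the threat-intel set."""
    enriched = dict(event)
    enriched["threat_intel_match"] = event.get("source_ip") in KNOWN_BAD_IPS
    return enriched


def process(provider: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Run one raw record through normalization and enrichment."""
    return enrich(normalize(provider, raw))
```

Normalizing to common field names before enrichment means correlation rules can be written once rather than once per provider.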
Log Format Standards
Common Event Format (CEF)

    CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension

Custom Log Schema

    {
      "timestamp": "2025-02-11T10:30:45Z",
      "event_type": "iam_policy_change",
      "severity": "high",
      "source": {
        "service": "aws",
        "region": "us-east-1",
        "account": "123456789012"
      },
      "actor": {
        "user": "admin@company.com",
        "ip": "203.0.113.45",
        "mfa_status": "verified"
      },
      "action": {
        "operation": "AttachUserPolicy",
        "resource": "arn:aws:iam::123456789012:user/user1",
        "policy": "arn:aws:iam::aws:policy/AdministratorAccess"
      },
      "risk_score": 85
    }

9.2 Security Monitoring Architecture
Real-time Threat Detection
Detection Rules Framework

    DetectionRules:
      PrivilegeEscalation:
        - Rule: "IAM Policy Attachment to Sensitive Roles"
          Conditions:
            - action: "AttachUserPolicy OR AttachRolePolicy"
            - policy: "*Admin* OR *FullAccess*"
            - user_not_in: ["security_team", "devops_team"]
          Severity: "High"
          Response: "Alert + Block"

        - Rule: "Multiple Failed MFA Attempts"
          Conditions:
            - event: "ConsoleLoginFailure"
            - mfa_failure_count: "> 3 in 5 minutes"
            - same_user: true
          Severity: "Medium"
          Response: "Alert + Temporary Lockout"

      DataExfiltration:
        - Rule: "Large Data Transfer to External"
          Conditions:
            - data_volume: "> 1GB in 1 hour"
            - destination: "external IP ranges"
            - protocol: "HTTPS, FTP, SFTP"
          Severity: "High"
          Response: "Alert + Block + Investigate"

        - Rule: "Unusual S3 Access Patterns"
          Conditions:
            - unusual_time_access: "2AM-5AM"
            - download_volume: "10x normal baseline"
            - new_ip_address: true
          Severity: "Medium"
          Response: "Alert + MFA Challenge"

      NetworkAnomalies:
        - Rule: "Port Scanning Activity"
          Conditions:
            - connection_attempts: "> 100 ports"
            - time_window: "5 minutes"
            - source: "single IP"
          Severity: "Medium"
          Response: "Block Source IP + Alert"

        - Rule: "Lateral Movement Detection"
          Conditions:
            - internal_connections: "> normal baseline"
            - privileged_ports: [22, 3389, 1433, 3306]
            - time_window: "1 hour"
          Severity: "High"
          Response: "Alert + Investigate"

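Two of the rules above, the IAM policy attachment rule and the failed-MFA rule, can be sketched as plain functions to show how the declarative conditions translate into logic. This is an illustration only: all names (`is_privilege_escalation`, `FailedMfaDetector`, and the constants) are hypothetical, the wildcard patterns are interpreted as shell-style globs, and the sliding window assumes failures arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from fnmatch import fnmatchcase

SENSITIVE_ACTIONS = {"AttachUserPolicy", "AttachRolePolicy"}
SENSITIVE_POLICY_PATTERNS = ("*Admin*", "*FullAccess*")
EXEMPT_GROUPS = {"security_team", "devops_team"}


def is_privilege_escalation(action: str, policy: str, user_groups: set[str]) -> bool:
    """Rule: sensitive policy attached by a user outside the exempt groups."""
    if action not in SENSITIVE_ACTIONS:
        return False
    if user_groups & EXEMPT_GROUPS:
        return False
    return any(fnmatchcase(policy, p) for p in SENSITIVE_POLICY_PATTERNS)


class FailedMfaDetector:
    """Rule: more than `threshold` MFA failures per user within `window`."""

    def __init__(self, threshold: int = 3,
                 window: timedelta = timedelta(minutes=5)) -> None:
        self.threshold = threshold
        self.window = window
        self._failures: dict[str, deque] = defaultdict(deque)

    def record_failure(self, user: str, at: datetime) -> bool:
        """Record one failure and return True if the rule fires."""
        events = self._failures[user]
        events.append(at)
        # Drop failures that have fallen out of the sliding window.
        while events and at - events[0] > self.window:
            events.popleft()
        return len(events) > self.threshold
```

Keeping one deque per user implements the `same_user: true` condition: failures from different accounts never count toward the same threshold.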
SIEM Integration
Alert Escalation Matrix
| Severity | Response Time | Escalation | Actions |
|---|---|---|---|
| Critical | < 5 minutes | Immediate page | Incident response team |
| High | < 15 minutes | 30 min escalation | Security team notification |
| Medium | < 1 hour | 4 hour escalation | Email + ticket creation |
| Low | < 24 hours | Weekly review | Log entry only |
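The matrix can be encoded as data so that alerting code computes deadlines instead of hard-coding them. The sketch below is one possible reading of the table: it assumes that "30 min escalation" and "4 hour escalation" mean "escalate if still unacknowledged after that long", and it treats the Critical and Low rows as having no timer-based escalation. `ESCALATION_MATRIX`, `response_deadline`, and `needs_escalation` are hypothetical names.

```python
from datetime import datetime, timedelta

# Response and escalation targets from the matrix above.
# None marks rows the matrix describes as a process rather than a timer
# (immediate page for Critical, weekly review for Low).
ESCALATION_MATRIX = {
    "critical": {"respond_within": timedelta(minutes=5), "escalate_after": None},
    "high": {"respond_within": timedelta(minutes=15),
             "escalate_after": timedelta(minutes=30)},
    "medium": {"respond_within": timedelta(hours=1),
               "escalate_after": timedelta(hours=4)},
    "low": {"respond_within": timedelta(hours=24), "escalate_after": None},
}


def response_deadline(severity: str, raised_at: datetime) -> datetime:
    """Latest time by which a first response is expected."""
    return raised_at + ESCALATION_MATRIX[severity.lower()]["respond_within"]


def needs_escalation(severity: str, raised_at: datetime,
                     now: datetime, acknowledged: bool) -> bool:
    """True if an unacknowledged alert has passed its escalation timer."""
    escalate_after = ESCALATION_MATRIX[severity.lower()]["escalate_after"]
    if acknowledged or escalate_after is None:
        return False
    return now - raised_at >= escalate_after
```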
9.3 Incident Response Lifecycle
Phase 1: Detection
Detection Mechanisms

    DetectionSources:
      Automated:
        - SIEM correlation rules
        - Threat intelligence feeds
        - Anomaly detection algorithms
        - Vulnerability scan results
        - User behavior analytics

      Manual:
        - Security team monitoring
        - Employee reports
        - External notifications
        - Compliance audit findings

      Indicators:
        - Unauthorized access attempts
        - Data access anomalies
        - Configuration changes
        - Performance degradation
        - Alert floods

Phase 2: Containment
Containment Strategies

    ContainmentActions:
      Network:
        - Block malicious IP addresses
        - Isolate compromised subnets
        - Disable compromised accounts
        - Implement network segmentation

      System:
        - Isolate affected instances
        - Disable compromised credentials
        - Stop malicious processes
        - Snapshot evidence

      Data:
        - Prevent data exfiltration
        - Implement additional encryption
        - Restrict data access
        - Preserve evidence

Phase 3: Investigation
Forensic Investigation Process

    InvestigationSteps:
      EvidenceCollection:
        - System memory dumps
        - Disk images
        - Network captures
        - Log files
        - Configuration snapshots

      TimelineReconstruction:
        - Initial compromise point
        - Lateral movement paths
        - Data access patterns
        - Persistence mechanisms
        - Exfiltration methods

      ImpactAssessment:
        - Affected systems and data
        - Data breach scope
        - Business impact analysis
        - Regulatory notification requirements

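Timeline reconstruction usually starts by merging events from several log sources into one chronologically ordered list. The following is a minimal sketch of that merge step; the sample events, the `source` and `event` field names, and the functions `parse_ts` and `build_timeline` are hypothetical, and only the timestamp format is taken from the custom log schema.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_timeline(*sources: list[dict]) -> list[dict]:
    """Merge per-source event lists into one list ordered by timestamp."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: parse_ts(event["timestamp"]))


# Illustrative events from two sources.
cloudtrail = [
    {"timestamp": "2025-02-11T10:30:45Z", "source": "cloudtrail",
     "event": "AttachUserPolicy"},
    {"timestamp": "2025-02-11T10:42:10Z", "source": "cloudtrail",
     "event": "CreateAccessKey"},
]
flow_logs = [
    {"timestamp": "2025-02-11T10:35:02Z", "source": "vpc_flow",
     "event": "ssh 10.0.1.5 -> 10.0.2.9"},
]
timeline = build_timeline(cloudtrail, flow_logs)
```

Sorting on parsed timestamps rather than raw strings avoids ordering errors when sources use different offsets or precision.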
Phase 4: Eradication
Eradication Activities

    EradicationTasks:
      MalwareRemoval:
        - Scan and clean systems
        - Remove persistence mechanisms
        - Patch vulnerabilities
        - Update security controls

      AccessControl:
        - Reset all credentials
        - Review and update permissions
        - Implement additional MFA
        - Strengthen authentication

      SystemHardening:
        - Update security configurations
        - Implement additional monitoring
        - Deploy endpoint protection
        - Harden network controls

Phase 5: Recovery
Recovery Planning

    RecoverySteps:
      SystemRestoration:
        - Restore from clean backups
        - Validate system integrity
        - Reinstall critical applications
        - Test functionality

      Validation:
        - Security testing
        - Performance validation
        - Access control verification
        - Monitoring confirmation

      Communication:
        - Stakeholder notifications
        - Customer communications
        - Regulatory reports
        - Post-incident briefings

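One common way to approach the "Validate system integrity" step is to compare file hashes on the restored system against a known-good manifest. The sketch below illustrates the comparison only. To stay self-contained it takes file contents as in-memory bytes, whereas a real check would stream files from disk; `sha256_hex`, `integrity_violations`, and the sample file names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def integrity_violations(files: dict[str, bytes],
                         manifest: dict[str, str]) -> list[str]:
    """Return manifest paths that are missing or whose hash differs."""
    violations = []
    for path, expected in sorted(manifest.items()):
        content = files.get(path)
        if content is None or sha256_hex(content) != expected:
            violations.append(path)
    return violations


# Known-good manifest, assumed to be captured before the incident.
baseline = {"etc/app.conf": b"port=443\n", "bin/app": b"\x7fELF-placeholder"}
manifest = {path: sha256_hex(data) for path, data in baseline.items()}

# Restored system in which one file differs from the baseline.
restored = {"etc/app.conf": b"port=8080\n", "bin/app": b"\x7fELF-placeholder"}
changed = integrity_violations(restored, manifest)
```

The manifest itself must be stored somewhere the attacker could not have modified, otherwise the comparison proves nothing.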
Phase 6: Learning
Post-Incident Activities

    PostIncidentActivities:
      RootCauseAnalysis:
        - Identify security gaps
        - Analyze detection failures
        - Review response effectiveness
        - Document lessons learned

      ImprovementPlanning:
        - Update security controls
        - Enhance monitoring capabilities
        - Improve response procedures
        - Conduct additional training

      KnowledgeSharing:
        - Update incident response playbooks
        - Share threat intelligence
        - Update security awareness training
        - Document for compliance audits