65% of SOC analysts report burnout from alert fatigue (Exabeam 2024)
4,000+ daily alerts processed by the average enterprise SOC
19 minutes: average time SOC analysts spend on each alert, most of it manual
53% of organizations use or plan to use a managed SOC or MDR service

A Security Operations Center is an organizational function, not a room. The traditional image of analysts staring at dashboards in a dark room with screens covering the walls is a poor model for effective security operations. An effective SOC is defined by its processes (detection, triage, investigation, response), its people (structured tiers, clear escalation, manageable alert loads), and its technology (SIEM, SOAR, EDR, threat intelligence). This guide covers the design decisions that determine whether a SOC operates effectively or burns through analysts while attackers move freely.

SOC Model Decision: Build, Buy, or Hybrid

The first and most consequential SOC decision is whether to build an in-house capability, contract a managed service, or combine both. Each model has different costs, capabilities, and organizational requirements.

In-house SOC

Full internal ownership of detection, triage, investigation, and response. Advantages: deepest knowledge of the environment, fastest response (no hand-off to a third party), full control over processes and tooling, retention of institutional knowledge. Disadvantages: highest cost (24/7 coverage requires 8-10 FTEs for Tier 1 alone and 15-20 FTEs for a minimal full team), difficult to staff with senior talent, challenging to maintain expertise across all attack vectors. Viable for large enterprises with mature security programs and sufficient budget.

Managed Detection and Response (MDR)

A managed security provider operates the detection and response function. MDR providers supply analysts, tooling (typically their own SIEM/EDR stack), and 24/7 coverage. Advantages: lower upfront cost than building in-house, immediate access to experienced analysts, threat intelligence sharing across the provider's client base. Disadvantages: less contextual knowledge of your environment, dependency on provider processes, limited customization. Appropriate for organizations without the scale or budget for an internal SOC.

Hybrid model

The internal team handles Tier 2-3 investigation, threat hunting, and high-severity incidents; the MDR provider handles Tier 1 alert triage and overnight and weekend coverage. This is the most common model for mid-market enterprises: the provider covers the volume and off-hours, the internal team covers depth and institutional knowledge. The model requires clear escalation procedures and SLAs between internal and provider teams.

Co-managed SOC

The organization owns the tooling (SIEM, EDR) and an MDR provider accesses it to provide analyst coverage. The organization retains data ownership and tooling investment; the provider supplies the analyst capacity. Good for organizations that have invested in a SIEM but lack analysts to operate it effectively.

SOC Staffing: Tiers and Roles

SOC staffing follows a tiered model that separates high-volume routine triage from deep investigation and engineering work.

Tier 1 — Alert Triage Analysts

Handle initial alert triage: confirm or dismiss alerts based on playbooks, enrich alerts with threat intelligence, and escalate confirmed incidents to Tier 2. Tier 1 handles the highest volume of work and requires the most coverage (including overnight and weekends). Target: 15-30 minutes per alert that needs manual investigation, and a triage rate of 50-100 alerts per analyst per 8-hour shift once routine alerts arrive pre-enriched and can be closed in a few minutes each. Tier 1 analysts follow playbooks rather than improvising judgment calls.
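The Tier 1 throughput targets can be sanity-checked with simple arithmetic. The sketch below assumes an analyst spends the whole 8-hour shift on triage, which is optimistic because it ignores breaks and handovers; the function names are illustrative only.

```python
SHIFT_MINUTES = 8 * 60  # one 8-hour shift, assuming all of it goes to triage


def alerts_per_shift(minutes_per_alert: float) -> float:
    """Alerts one analyst can triage in a shift at a given handling time."""
    return SHIFT_MINUTES / minutes_per_alert


def minutes_per_alert(alerts_in_shift: int) -> float:
    """Average handling time implied by a per-shift alert count."""
    return SHIFT_MINUTES / alerts_in_shift


print(alerts_per_shift(30), alerts_per_shift(15))      # 16.0 32.0
print(minutes_per_alert(100), minutes_per_alert(50))   # 4.8 9.6
```

Under these assumptions, 15-30 minutes per alert gives 16-32 alerts per shift, while 50-100 alerts per shift implies roughly 5-10 minutes each. The higher volume is therefore realistic only when most alerts are enriched automatically and need little manual handling.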

Tier 2 — Incident Response Analysts

Investigate confirmed incidents escalated from Tier 1. Perform deeper analysis: timeline reconstruction, scope determination, containment decisions, and root cause identification. Tier 2 analysts require deeper technical knowledge: memory forensics, log analysis, malware triage, and lateral movement detection. Target: 5-10 active investigations per analyst.

Tier 3 — Senior Analysts and Threat Hunters

Handle the most complex incidents, perform threat hunting, and develop new detection content. Tier 3 analysts bridge security operations and detection engineering — their investigation findings become new SIEM rules and playbooks. Some organizations separate threat hunting into a dedicated function rather than treating it as Tier 3.

Detection Engineers

Write and maintain detection rules (Sigma rules, SIEM correlation rules, EDR custom detections), tune existing rules to reduce false positives, and convert threat hunting findings into automated detections. Not directly involved in incident response; focused on improving the detection capability that feeds Tier 1.

SOC Manager

Manages the SOC function: staffing, metrics, process improvement, escalation, and stakeholder communication. The SOC manager translates operational metrics into security program reporting for leadership and coordinates with other security teams (vulnerability management, identity, network security).

Staffing for 24/7 Coverage

24/7 SOC coverage requires more FTEs than most security leaders expect. A common mistake is planning for 24/7 coverage with 3 analysts, one per 8-hour shift, which leaves no cover for weekends or leave.

Shift coverage math

24/7 coverage requires 4-5 FTEs per position, not 3. A week has 168 hours against a 40-hour working week, which is 4.2 FTEs before any absence, and holidays, vacation, and sick time push the figure higher. A single Tier 1 analyst position operating 24/7 requires about 4.5 FTEs. A minimal 24/7 Tier 1 operation with two analysts per shift requires 9 FTEs in that tier alone. Adding Tier 2 and management, a minimal in-house 24/7 SOC requires 15-20 FTEs. This is why the hybrid model (MDR for overnight/weekend Tier 1) is so common.
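The coverage math can be reproduced in a few lines. This is a rough sketch: the 40-hour contract week and the 93% availability figure (the share of contracted hours actually worked after vacation, holidays, and sick time) are assumptions chosen to match the 4.5 FTE figure, not measured values.

```python
HOURS_PER_WEEK = 24 * 7   # hours that must be covered every week
CONTRACT_HOURS = 40       # assumed full-time working week


def ftes_per_seat(availability: float) -> float:
    """FTEs needed to keep one seat staffed around the clock.

    availability is the assumed fraction of contracted hours an analyst
    actually works after vacation, holidays, and sick time.
    """
    return HOURS_PER_WEEK / (CONTRACT_HOURS * availability)


print(round(ftes_per_seat(1.0), 1))        # no absence at all
print(round(ftes_per_seat(0.93), 1))       # with assumed absence
print(round(2 * ftes_per_seat(0.93), 1))   # two analysts per shift
```

With no absence the result is 4.2 FTEs per seat; at the assumed 93% availability it is about 4.5 per seat and about 9 for two seats, which is why a three-person rota cannot sustain 24/7 coverage.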

Analyst burnout and retention

Alert fatigue and overnight shift work drive high SOC analyst turnover — average SOC analyst tenure is 18-24 months. SOAR automation of routine alert enrichment, playbook automation for common alert types, and reasonable alert loads (fewer than 50 actionable alerts per shift) are the primary mitigations. Organizations that maintain 3-year+ analyst tenure have typically automated Tier 1 triage to the point where analysts make judgments rather than performing mechanical data lookups.

SOC Technology Stack

The SOC technology stack must be designed for analyst workflow, not vendor feature lists. Each tool must serve a specific function and integrate with adjacent tools.

SIEM (foundation)

Aggregates all security telemetry, runs detection rules, generates alerts, and provides the investigation interface. All other SOC tools feed into or are accessed from the SIEM. Selection: see the SIEM Buyer's Guide for evaluation criteria. Non-negotiable: the SIEM must process data fast enough for real-time triage.

SOAR (automation)

Automates alert enrichment and playbook execution. Every Tier 1 alert should be automatically enriched before an analyst sees it: threat intelligence lookup, asset ownership lookup, similar recent alerts, and user context. When SOAR handles this enrichment, analyst time per alert can fall by 50-70%.
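The four enrichment steps can be pictured as a single function that runs before an alert reaches the queue. The sketch below is illustrative and not tied to any SOAR product: the `Alert` fields, the in-memory lookup tables, and the sample values are all hypothetical stand-ins for a threat intelligence platform, an asset inventory, and an identity store.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    alert_id: str
    rule: str
    host: str
    user: str
    indicator: str                          # e.g. an IP address or file hash
    context: dict = field(default_factory=dict)


# Hypothetical in-memory stand-ins for real lookup services.
THREAT_INTEL = {"203.0.113.7": "known C2 infrastructure"}
ASSET_OWNERS = {"fin-laptop-12": "finance"}
USER_ROLES = {"jdoe": "accounts payable clerk"}


def enrich(alert: Alert, recent: list) -> Alert:
    """Attach the four context items before an analyst opens the alert."""
    alert.context = {
        "threat_intel": THREAT_INTEL.get(alert.indicator, "no match"),
        "asset_owner": ASSET_OWNERS.get(alert.host, "unknown"),
        "user_context": USER_ROLES.get(alert.user, "unknown"),
        # Earlier alerts fired by the same rule, excluding this one.
        "similar_recent": [
            a.alert_id for a in recent
            if a.rule == alert.rule and a.alert_id != alert.alert_id
        ],
    }
    return alert


earlier = [Alert("a1", "suspicious_login", "fin-laptop-12", "jdoe", "203.0.113.7")]
new = enrich(Alert("a2", "suspicious_login", "fin-laptop-12", "jdoe", "203.0.113.7"), earlier)
print(new.context)
```

In a real deployment each dictionary lookup would be an API call, but the shape is the same: the analyst opens an alert that already carries its context.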

EDR (endpoint visibility and response)

Provides process execution telemetry, network connection data, and remote response capability for analysts. EDR is the primary investigation tool for endpoint-based incidents. Integration with SIEM is essential: EDR alerts should be correlated with SIEM detections for full context.

Threat Intelligence Platform (TIP)

Aggregates threat intelligence from commercial and open source feeds and correlates IOCs against internal telemetry. A TIP that integrates with SIEM and SOAR automates IOC matching and alert enrichment. MISP (open source) and commercial platforms (Anomali, ThreatQ, Recorded Future) serve this function.

Ticketing and case management

Every incident needs a tracked case with timeline, evidence, actions taken, and outcome. ServiceNow, Jira Service Management, and SIEM-native case management (Microsoft Sentinel, Splunk Enterprise Security) are common choices. Key requirement: case data must be searchable for post-incident review and metrics calculation.

SOC Metrics That Matter

SOC metrics must measure security outcomes, not activity volume. Common metrics that mislead: total alerts processed, total incidents created. Metrics that reflect effectiveness:

Mean Time to Detect (MTTD)

Time from when an attack begins to when the SOC generates an alert. This measures detection coverage and rule quality. Target: under 24 hours for high-severity threats; under 1 hour for critical severity. Track by threat category — MTTD for phishing should differ from MTTD for lateral movement.

Mean Time to Respond (MTTR)

Time from alert generation to containment action. This measures triage and response efficiency. Broken into components: time to triage (analyst acknowledges and begins investigation), time to escalate, time to contain. Each component reveals different improvement opportunities.
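Both MTTD and MTTR, along with the triage component, fall out of four timestamps per case. The sketch below assumes case records expose those timestamps under the hypothetical field names shown; real case management exports will differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Case:
    attack_start: datetime   # earliest malicious activity found in investigation
    alerted: datetime        # first alert generated
    triaged: datetime        # analyst began investigating
    contained: datetime      # containment action completed


def mean_hours(deltas):
    """Average a list of timedeltas and express the result in hours."""
    return mean([d.total_seconds() for d in deltas]) / 3600


def soc_timings(cases):
    """Return MTTD, time to triage, and MTTR in hours."""
    return {
        "mttd": mean_hours([c.alerted - c.attack_start for c in cases]),
        "time_to_triage": mean_hours([c.triaged - c.alerted for c in cases]),
        "mttr": mean_hours([c.contained - c.alerted for c in cases]),
    }


t0 = datetime(2024, 1, 1)
cases = [
    Case(t0, t0 + timedelta(hours=2), t0 + timedelta(hours=2, minutes=30), t0 + timedelta(hours=5)),
    Case(t0, t0 + timedelta(hours=4), t0 + timedelta(hours=4, minutes=30), t0 + timedelta(hours=5)),
]
print(soc_timings(cases))
```

For these two sample cases the result is an MTTD of 3 hours, a time to triage of 0.5 hours, and an MTTR of 2 hours. Grouping cases by threat category before calling `soc_timings` gives the per-category breakdown.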

Alert-to-incident ratio

The percentage of alerts that escalate to confirmed incidents. A healthy ratio is 5-15% in a well-tuned environment. Below 5% suggests under-tuned rules generating excessive false positives; above 30% suggests detection rules so specific that they may be missing threats. Track the ratio over time: it should move into the healthy range as rules are tuned.
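Computing the ratio is a one-line division; the useful part is checking it against the 5-15% band on a regular schedule. A minimal sketch, with illustrative function names and made-up sample counts:

```python
def alert_to_incident_ratio(alerts: int, incidents: int) -> float:
    """Percentage of alerts that became confirmed incidents."""
    if alerts <= 0:
        raise ValueError("alerts must be a positive count")
    return 100 * incidents / alerts


def band(ratio_pct: float, low: float = 5.0, high: float = 15.0) -> str:
    """Place a ratio relative to the healthy range described in the text."""
    if ratio_pct < low:
        return "below range"
    if ratio_pct > high:
        return "above range"
    return "in range"


weekly = alert_to_incident_ratio(alerts=4000, incidents=320)
print(weekly, band(weekly))   # 8.0 in range
```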

False positive rate by rule

Track false positive rates at the individual detection rule level. Rules with false positive rates above 50% should be tuned or removed. Rules that never produce a true positive waste analyst time and train teams to ignore whole alert categories.
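Per-rule tracking only needs the rule name and the analyst's final disposition for each closed alert. The sketch below assumes that data is available as simple pairs; the rule names are made up for illustration.

```python
from collections import Counter


def false_positive_rates(dispositions):
    """dispositions: iterable of (rule_name, is_false_positive) pairs."""
    totals, false_positives = Counter(), Counter()
    for rule, is_fp in dispositions:
        totals[rule] += 1
        false_positives[rule] += int(is_fp)
    return {rule: false_positives[rule] / totals[rule] for rule in totals}


def rules_to_tune(rates, threshold=0.5):
    """Rules whose false positive rate exceeds the threshold."""
    return sorted(rule for rule, rate in rates.items() if rate > threshold)


closed_alerts = [
    ("impossible_travel", True), ("impossible_travel", True),
    ("impossible_travel", True), ("impossible_travel", False),
    ("new_admin_account", False), ("new_admin_account", False),
    ("new_admin_account", True), ("new_admin_account", False),
]
rates = false_positive_rates(closed_alerts)
print(rates)
print(rules_to_tune(rates))
```

Here the hypothetical `impossible_travel` rule comes out at 75% false positives and is flagged for tuning, while `new_admin_account` at 25% is not.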

SLA compliance

Define and track SLAs for each alert severity: Critical alerts acknowledged within 15 minutes, High within 1 hour, Medium within 4 hours. SLA compliance rate measures staffing adequacy and prioritization discipline.
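SLA compliance per severity can be computed from the time each alert waited before acknowledgement. This sketch encodes the three targets above; the input format, a list of severity and elapsed-time pairs, is an assumption.

```python
from collections import defaultdict
from datetime import timedelta

# Acknowledgement targets taken from the text.
SLA = {
    "critical": timedelta(minutes=15),
    "high": timedelta(hours=1),
    "medium": timedelta(hours=4),
}


def sla_compliance(acks):
    """acks: iterable of (severity, time_to_acknowledge) pairs.

    Returns the fraction of alerts acknowledged within SLA, per severity.
    """
    met, total = defaultdict(int), defaultdict(int)
    for severity, elapsed in acks:
        total[severity] += 1
        met[severity] += int(elapsed <= SLA[severity])
    return {severity: met[severity] / total[severity] for severity in total}


sample = [
    ("critical", timedelta(minutes=10)),
    ("critical", timedelta(minutes=20)),
    ("high", timedelta(minutes=45)),
    ("medium", timedelta(hours=5)),
]
print(sla_compliance(sample))
```

In this sample, critical alerts meet the SLA half the time, the one high alert meets it, and the one medium alert misses it.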

Playbooks and Process: The Operational Foundation

Technology alone does not make an effective SOC. Documented playbooks ensure consistent, quality response regardless of which analyst handles an incident.

Alert-level playbooks

Every common alert type should have a documented triage playbook: what to check, what questions to answer, what constitutes a true positive, when to escalate, and what immediate containment steps to take. Playbooks reduce analyst decision-making time and ensure junior analysts perform consistent triage.

Incident response playbooks

Higher-level playbooks for confirmed incident types: ransomware, BEC, credential compromise, data exfiltration. IR playbooks cover investigation steps, containment actions, evidence preservation, stakeholder notification, and recovery sequencing.

Escalation procedures

Define clear escalation criteria: which alert types go directly to Tier 2, when an incident triggers executive notification, when legal and compliance are involved, and when law enforcement or external IR firms are engaged. Unclear escalation procedures during an incident cause delays that expand impact.
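Escalation criteria are easier to follow under pressure when they are written as explicit rules rather than left to memory. The sketch below shows one way to encode them; every criterion in it (the incident kinds routed straight to Tier 2, the severity that triggers executive notification, the regulated-data flag) is a hypothetical example, and each organization would substitute its own.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    kind: str              # e.g. "ransomware", "phishing"
    severity: str          # "critical", "high", or "medium"
    regulated_data: bool   # personal or otherwise regulated data involved


# Hypothetical criteria for illustration only.
DIRECT_TO_TIER2 = {"ransomware", "data_exfiltration"}


def escalation_targets(incident: Incident) -> list:
    """Return who must be brought in, in the order they are notified."""
    targets = []
    if incident.kind in DIRECT_TO_TIER2:
        targets.append("tier2")
    if incident.severity == "critical":
        targets.append("executive")
    if incident.regulated_data:
        targets.append("legal_compliance")
    return targets


print(escalation_targets(Incident("ransomware", "critical", regulated_data=True)))
print(escalation_targets(Incident("phishing", "medium", regulated_data=False)))
```

The first call returns all three targets and the second returns an empty list, meaning the alert stays with Tier 1 under these example rules.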

The bottom line

An effective SOC is built on clear process, right-sized staffing for the chosen model, a technology stack optimized for analyst workflow, and metrics that measure security outcomes rather than activity volume. The build vs. hybrid vs. MDR decision should be driven by a realistic assessment of available budget, talent, and the organization's required response capability, not by an aspiration to build a world-class in-house SOC that the budget cannot support. A well-operated MDR service typically outperforms an understaffed in-house team.

Frequently asked questions

What does a Security Operations Center (SOC) do?

A SOC is the team (and function) responsible for monitoring security telemetry, detecting threats, investigating incidents, and coordinating response. SOC operations include: continuous monitoring of SIEM alerts, threat hunting for undetected threats, incident triage and investigation, containment and remediation coordination, and security metrics reporting. The SOC is the organization's primary defense against active attacks and the first line of investigation when a breach occurs.

How many people do you need for a 24/7 SOC?

A minimal 24/7 in-house SOC covering Tier 1 triage with two analysts per shift requires approximately 9-10 FTEs in the Tier 1 role alone (accounting for shift coverage, weekends, vacation, and sick time). Adding Tier 2 analysts, a Tier 3 senior analyst, detection engineer, and SOC manager, a realistic minimum for an in-house 24/7 SOC is 15-20 total FTEs. Most organizations with fewer than 5,000 employees find MDR or a hybrid model more cost-effective.

What is the difference between a SOC and an MDR service?

An internal SOC is a team built and operated within the organization using the organization's own tools, processes, and analysts. An MDR (Managed Detection and Response) service is a third-party provider that supplies analysts, tooling, and 24/7 monitoring as a service. MDR typically includes the provider's own SIEM and EDR stack. The hybrid model — internal team for Tier 2-3, MDR for Tier 1 triage and 24/7 coverage — is the most common approach for mid-market enterprises.

What technology does a SOC need?

Essential SOC technology: (1) SIEM for telemetry aggregation, detection, and investigation; (2) SOAR for alert enrichment automation and playbook execution; (3) EDR for endpoint visibility and remote response; (4) Threat Intelligence Platform for IOC enrichment; and (5) case management for incident tracking. Optional but valuable: NDR for network threat detection, deception technology, UEBA for behavioral analytics. The stack should be evaluated for workflow integration — tools that require analysts to manually copy data between systems waste investigation time.

What are the most important SOC metrics?

The metrics that best measure SOC effectiveness: Mean Time to Detect (MTTD — how long before threats are found), Mean Time to Respond (MTTR — how long to contain after detection), alert-to-incident ratio (what percentage of alerts are confirmed threats — reflects detection tuning quality), false positive rate by rule (identifies rules that waste analyst time), and SLA compliance rate (whether alerts are investigated within defined timeframes). Avoid metrics that measure activity without measuring outcomes: raw alert count, tickets closed, emails sent.

How do you reduce SOC analyst burnout?

SOC analyst burnout is primarily driven by alert fatigue — too many low-quality alerts requiring manual, repetitive work. The most effective interventions: (1) SOAR automation of routine alert enrichment (eliminate manual context gathering); (2) aggressive rule tuning to reduce false-positive rate below 15%; (3) capping analyst alert loads to a manageable daily volume; (4) rotation of analysts through different functions (hunting, engineering, response) to break alert triage monotony; (5) clear career progression paths so analysts see a future beyond Tier 1 triage.

Sources & references

  1. SANS — Building a World-Class Security Operations Center
  2. NIST SP 800-61 — Computer Security Incident Handling Guide
  3. Exabeam — State of the SOC Report 2024
  4. Gartner — Market Guide for Managed Detection and Response

Eric Bang
Author

Founder & Cybersecurity Evangelist, Decryption Digest

Cybersecurity professional with expertise in threat intelligence, vulnerability research, and enterprise security. Covers zero-days, ransomware, and nation-state operations for 50,000+ security professionals weekly.
