PRACTITIONER GUIDE | DETECTION ENGINEERING

Practitioner GuideMay 15, 202616 min read

How to Write KQL Queries for Microsoft Sentinel: Detection Engineering Guide

Sources:Microsoft: KQL Quick Reference for Microsoft Sentinel|Microsoft Learn: Write Your First KQL Query|Microsoft Sentinel: Scheduled Analytics Rules|Microsoft Sentinel GitHub: Community Detection Rules|MITRE ATT&CK: Techniques and Mitigations

Eric Bang

Founder & Cybersecurity Evangelist

15,000+

Microsoft Sentinel customers globally across enterprise and government sectors

30+

Native data connector categories available in the Sentinel content hub

Top 5

Most-searched Sentinel topic on Microsoft Learn is KQL query writing

Under 5 minutes

Near-real-time (NRT) detection rule latency from event ingestion to alert generation

Microsoft Sentinel processes billions of security events daily across its customer base, but a SIEM is only as useful as the queries running against it. KQL is the layer where raw log data becomes actionable detection. Every scheduled detection rule, every near-real-time alert, every threat hunting workbook, and every investigation query in Sentinel is written in KQL. It is not optional knowledge for anyone operating Sentinel seriously.

The good news is that KQL's pipe-based model is logically intuitive once you stop trying to map it onto SQL. The bad news is that writing KQL that is both correct and performant at enterprise log volumes requires deliberate practice and an understanding of how Azure Log Analytics executes queries under the hood. This guide covers KQL fundamentals, the core operators security analysts use daily, production-grade detection scenarios with annotated queries, and performance optimization patterns that matter when querying tables with hundreds of millions of rows.

KQL Syntax Fundamentals for Security Analysts

Every KQL query starts with a table name and then applies a series of pipe-separated operators. The result of each operator becomes the input to the next. Reading a KQL query from top to bottom is reading the transformation pipeline in order.

The simplest useful query is a table with a time filter:

SecurityEvent
| where TimeGenerated > ago(24h)
| where EventID == 4625
| limit 100

This retrieves the last 24 hours of failed logon events (EventID 4625) from the Windows Security Event log, returning the first 100 rows. The ago() function is one of KQL's most important time helpers: ago(1h) means one hour ago, ago(7d) means seven days ago. This is far more readable than SQL's WHERE timestamp > DATEADD(hour, -24, GETDATE()).

KQL is case-sensitive for table names and column names but case-insensitive for operators and functions. SecurityEvent and securityevent refer to different things; where and WHERE are identical. String comparisons are case-sensitive by default: where AccountName == "Administrator" will not match "administrator". Use the =~ operator for case-insensitive string equality: where AccountName =~ "administrator".

The pipe symbol | (not to be confused with SQL's OR operator) chains operators. Unlike SQL, there is no concept of a subquery or nested SELECT; instead, operators compose linearly. This makes KQL queries very readable once the operator vocabulary is learned.

Key syntax rules for security analysts:

String literals use double quotes: "value"
Comments use // (single line) or /* */ (block)
Time literals: 1h, 30m, 7d, 1y. No quotes needed when used with ago() or between()
List membership: where EventID in (4624, 4625, 4648). The in operator accepts a comma-delimited list
Null handling: where isnotempty(TargetUserName) : use isempty() and isnotempty() rather than == null
The let statement defines a variable or inline function: let threshold = 5; followed by the query that references threshold

Understanding these fundamentals eliminates the most common syntax errors that block new KQL writers.

Core Operators Every Security Analyst Must Know

Six operators cover the majority of security detection and investigation queries: where, summarize, project, join, extend, and parse. Adding mv-expand for handling multi-value fields rounds out the essential set.

where filters rows. It is the most performance-critical operator because it reduces the row count early in the pipeline. Always filter TimeGenerated first, then indexed fields like account names, host names, and IP addresses before computed fields.

summarize aggregates rows into groups, equivalent to SQL's GROUP BY. The security analyst's most common pattern is counting events per entity:

SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4625
| summarize FailedCount = count() by TargetAccount, IpAddress
| where FailedCount > 10
| order by FailedCount desc

This counts failed logins per account and source IP, filtering to cases with more than 10 failures in an hour, a basic brute-force pattern.

project selects and renames columns, equivalent to SQL's SELECT with column aliasing. Use it after summarize to trim results to only what matters for the alert: | project TargetAccount, IpAddress, FailedCount.

extend adds computed columns without removing existing ones. It is the right operator for enrichment: | extend Hour = bin(TimeGenerated, 1h) adds a rounded time bucket for time-series analysis, or | extend IsExternalIP = ipv4_is_private(IpAddress) == false adds a boolean flag.

join combines two tables on a common key. The most common security pattern is leftouter (return all rows from the left even without a match) or inner (return only matching rows). The kind parameter is required: | join kind=inner (AnotherTable | where ...) on AccountName. Always filter the right-side table inside parentheses before the join, not after, to reduce the row set being joined.

parse extracts structured fields from unstructured strings using pattern matching. For log sources that arrive as raw strings (Syslog, custom application logs), parse extracts fields inline: | parse RawData with * "src=" SrcIP " " * extracts the source IP from a firewall log line.

mv-expand expands a column containing a dynamic array or property bag into multiple rows, one per element. This is essential for working with JSON-formatted fields in Azure logs: | mv-expand Entities expands a Sentinel alert's entity list so each entity becomes its own row for filtering.

Free daily briefing

Briefings like this, every morning before 9am.

Threat intel, active CVEs, and campaign alerts, distilled for practitioners. 50,000+ subscribers. No noise.

Writing Scheduled and NRT Detection Rules

Detection rules in Sentinel are analytics rules that run KQL queries on a schedule and generate alerts and incidents when the query returns results. Understanding rule types and their configuration options is essential for building a detection program that produces signal without alert fatigue.

Scheduled rules are the most flexible rule type. Configuration parameters include:

Query: the KQL query to run
Run frequency: how often the rule executes (minimum 5 minutes, configurable up to 14 days)
Lookback period: how far back the query searches for data (up to 14 days)
Alert threshold: how many results trigger an alert (zero results, at least N results, or more than N results)
Event grouping: group all matching events into one alert, or create one alert per event row
Entity mapping: map columns in the query result to Sentinel entity types (Account, Host, IP, URL, FileHash, Process) to enable entity-based investigation and UEBA enrichment
Alert enrichment: alert name and description can include dynamic values from the query result using the {{ColumnName}} syntax
Incident creation: automatically create incidents from alerts, with grouping rules to merge related alerts into a single incident

NRT (Near-Real-Time) rules run with less than 5-minute latency from event ingestion. They are designed for time-critical detections where a 15-minute scheduled rule cycle is too slow. Limitations: NRT rules can only query a single table and cannot use join or union operators. They are best suited for high-confidence, single-table detections like impossible-to-miss privilege escalation events (EventID 4728, 4732, 4756: members added to privileged groups) or high-severity authentication events.

A well-structured scheduled rule for detecting password spray attacks:

let threshold = 20;
let distinctAccountThreshold = 10;
SigninLogs
| where TimeGenerated > ago(1h)
| where ResultType != "0"  // Exclude successful sign-ins
| where ResultType in (50126, 50053, 50055, 50056)  // Specific failure codes
| summarize
    FailureCount = count(),
    DistinctAccounts = dcount(UserPrincipalName),
    DistinctIPs = dcount(IPAddress),
    Accounts = make_set(UserPrincipalName, 20)
  by IPAddress, bin(TimeGenerated, 1h)
| where FailureCount > threshold
| where DistinctAccounts > distinctAccountThreshold
| project TimeGenerated, IPAddress, FailureCount, DistinctAccounts, DistinctIPs, Accounts

This detects IP addresses that fail authentication against more than 10 distinct accounts more than 20 times per hour, the signature pattern of a password spray. The let statements at the top make the thresholds easy to tune without modifying the query logic.

Common Security Detection Scenarios with Annotated Queries

The following detection patterns cover the scenarios most commonly requested by security teams building a Sentinel detection program from scratch.

Failed logins followed by success (credential stuffing indicator):

let failWindow = 10m;
let lookback = 1h;
SigninLogs
| where TimeGenerated > ago(lookback)
| where ResultType != "0"
| summarize FailCount = count(), FailTime = max(TimeGenerated)
  by UserPrincipalName, IPAddress
| where FailCount >= 5
| join kind=inner (
    SigninLogs
    | where TimeGenerated > ago(lookback)
    | where ResultType == "0"
    | project SuccessTime = TimeGenerated, UserPrincipalName, IPAddress
  ) on UserPrincipalName, IPAddress
| where SuccessTime > FailTime
| where SuccessTime - FailTime < failWindow
| project UserPrincipalName, IPAddress, FailCount, FailTime, SuccessTime

Privilege escalation: user added to a privileged group:

SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID in (4728, 4732, 4756)  // Member added to security-enabled groups
| where TargetUserName in~ ("Domain Admins", "Enterprise Admins", "Schema Admins",
    "Administrators", "Account Operators", "Backup Operators")
| project TimeGenerated, SubjectUserName, MemberName, TargetUserName, Computer

Lateral movement via PsExec detection (process creation):

DeviceProcessEvents
| where TimeGenerated > ago(24h)
| where FileName =~ "PSEXESVC.exe" or InitiatingProcessFileName =~ "psexec.exe"
| project TimeGenerated, DeviceName, AccountName, FileName,
    InitiatingProcessFileName, InitiatingProcessCommandLine, RemoteIP

DNS exfiltration: high-frequency unique subdomain queries:

DnsEvents
| where TimeGenerated > ago(1h)
| where QueryType == "A" or QueryType == "TXT"
| extend Domain = tostring(split(Name, ".", -2))
| summarize SubdomainCount = dcount(Name), QueryCount = count()
  by Computer, Domain
| where SubdomainCount > 100
| where Domain !in ("microsoft.com", "windows.com", "azure.com")  // Allowlist common domains
| order by SubdomainCount desc

Each of these queries should be validated against at least 7 days of historical data before deployment. Review the result sets carefully: false positives in privileged-group queries often come from automated provisioning scripts, and DNS exfiltration queries generate noise from legitimate CDNs and DNS-based load balancers that use many subdomains.

Performance Optimization for Large Tables

Microsoft Sentinel's largest tables, SecurityEvent, SigninLogs, AuditLogs, CommonSecurityLog, and DeviceEvents, regularly contain hundreds of millions of rows per day in enterprise deployments. A poorly written query against these tables can time out, return incomplete results, or consume excessive compute units that increase the workspace cost.

Rule 1: Always filter TimeGenerated first. Azure Log Analytics uses TimeGenerated as the primary partition key. Every query should begin with | where TimeGenerated > ago(Xh) as the first operator after the table name. Filters on other columns applied before TimeGenerated force the engine to scan across time partitions.

Rule 2: Use specific EventID filters immediately. SecurityEvent contains every Windows event log event category. Without an EventID filter, a 24-hour SecurityEvent query can scan 50 to 500 million rows depending on audit policy configuration. Always filter to the specific event IDs relevant to your detection before any summarize or join: | where EventID in (4624, 4625, 4648, 4688, 4698).

Rule 3: Avoid contains with short strings. The contains operator performs a full substring scan that cannot use the inverted index. For strings longer than 4 characters, has uses the index and is 10 to 100 times faster. For exact matches, == is fastest. Use contains only when pattern matching is genuinely required and no alternative exists.

Rule 4: Filter before joining. When joining two tables, filter each table to the minimum necessary row set inside the join's parentheses before the join executes. A join that receives 10 million rows from the right table is far more expensive than one that receives 1,000 pre-filtered rows.

Rule 5: Use summarize to reduce cardinality before joins. When building correlation detections that join aggregated counts to raw events, compute the aggregated counts first (summarize), filter on the aggregate result, then join back to raw events for context. This pattern is called "summarize then join" and it keeps the join operand small.

Rule 6: Use materialized views for frequently queried patterns. Azure Log Analytics supports summary rules (in preview as of early 2026) that pre-aggregate data at ingestion time into a target table. For detection patterns that run every 5 minutes against large tables, a summary rule that pre-computes hourly aggregates can reduce per-query compute by 90 percent or more.

Monitoring query performance: use the | extend QueryExecutionTime = now() trick at the end of a query and compare against the startofquery to estimate runtime, or check the Azure Monitor Logs query performance metrics in the workspace's Monitoring section.

Testing and Deploying Rules Safely

A detection rule that produces hundreds of false positive alerts per day is worse than no rule at all: it trains analysts to ignore alerts, increases triage fatigue, and obscures real detections buried in the noise. Safe rule deployment requires deliberate validation before enabling.

Step 1: Run the query against historical data. Before creating any rule, run the KQL query in the Logs blade with a 14-day lookback window. Review every row in the result set. Ask: is this a real threat or a known-good process, account, or IP address? Document the legitimate activities that match the rule and build exclusions for them.

Step 2: Use the rule wizard's simulation. The Sentinel analytics rule creation wizard includes a "Results simulation" section that previews how many alerts the rule would have generated over the past 50 days based on the current query and threshold settings. If the simulation shows more than 5 to 10 alerts per day for a rule that should fire rarely, the threshold or exclusion logic needs adjustment.

Step 3: Start with a lower severity and disabled alerting. New detection rules can be configured to create alerts at Informational severity with no incident creation enabled. This lets the rule run in production and generate alerts that analysts can review without the alert appearing in the incident queue. After confirming the signal-to-noise ratio, escalate severity and enable incident creation.

Step 4: Use watchlists for dynamic exclusions. Rather than hardcoding exclusion values in the KQL query (which requires editing the rule each time the exclusion list changes), use a Sentinel watchlist to store exclusion values. The watchlist is queryable via _GetWatchlist("ExclusionList") and can be updated without modifying the rule. This is the production-grade approach for exclusions that change frequently, such as IT admin accounts, known-good IPs, or authorized scanning tools.

Step 5: Validate entity mappings. Entity mappings are critical for Sentinel's UEBA (User and Entity Behavior Analytics) enrichment and for the entity timeline investigation workflow. Verify that Account, Host, and IP entity mappings point to the correct columns in your query results by checking the alert entities after the first alert fires.

Step 6: Document rule logic in the description field. Production rules should include a description that explains what the rule detects, why the threshold was set to its current value, which ATT&CK techniques it covers, and known false positive patterns with their exclusion rationale. This documentation is critical for analyst triage and future rule maintenance.

The bottom line

KQL proficiency is the highest-leverage skill for any analyst or detection engineer operating Microsoft Sentinel. The core operator set (where, summarize, project, join, extend, parse, mv-expand) covers 90 percent of security use cases. The discipline that separates good detection rules from noisy ones is the same discipline that separates good KQL from slow KQL: filter early, filter specifically, and validate against real data before deploying. Build a library of tested query patterns, use watchlists for dynamic exclusions, and document every rule's logic and false positive profile for the analyst who will inherit it.

Frequently asked questions

What is KQL and how is it different from SQL?

KQL (Kusto Query Language) is a read-only, pipe-based query language where each operator receives a tabular result set from the previous operator and transforms it. SQL is declarative with SELECT, FROM, WHERE, and GROUP BY clauses that can appear in any order; KQL queries are strictly linear pipelines read left-to-right, top-to-bottom. KQL has no INSERT, UPDATE, or DELETE operations. Its syntax is optimized for time-series and log data: operators like `ago()`, `between()`, and `bin()` make time-based filtering natural, while `parse`, `extract`, and `mv-expand` handle semi-structured fields common in security logs. Analysts coming from SQL typically need two to three weeks of practice before KQL syntax becomes fluent.

What are the most important tables in Sentinel for security detection?

The highest-value tables for security detection are SecurityEvent (Windows Event Log forwarded via the Log Analytics agent or Azure Monitor Agent), SigninLogs and AADNonInteractiveUserSignInLogs (Azure AD authentication events), AuditLogs (Azure AD directory changes), OfficeActivity (Microsoft 365 operations), DeviceEvents and DeviceProcessEvents (Microsoft Defender for Endpoint telemetry via Microsoft 365 Defender connector), CommonSecurityLog (CEF-formatted events from firewalls and network appliances), and Syslog (Linux host events). SecurityEvent is the most queried table but also the highest-volume; filtering early with `where EventID in (4624, 4625, 4648, 4688, 4698, 4720, 4732, 4768, 4769)` before summarizing dramatically improves query performance.

How do I write a detection rule for impossible travel in Sentinel?

An impossible travel detection joins SigninLogs to itself on the same user account, looking for two successful sign-ins from different geographic locations within a time window too short for physical travel. The core logic uses `summarize` to get the first and last sign-in location per user per time window, then calculates the distance between them using the `geo_distance()` function (available in KQL). You filter where the distance exceeds a threshold (commonly 500 km) and the time delta is under a threshold (commonly 1 hour). Microsoft Sentinel ships a built-in impossible travel rule in the analytics content hub under the "Azure Active Directory" solution; reviewing and tuning that rule's logic is a better starting point than writing one from scratch.

What is the difference between Scheduled, NRT, and Fusion rules?

Scheduled rules run your KQL query on a configurable interval (minimum 5 minutes, commonly 1 to 24 hours) against a lookback window and generate alerts when the query returns results. NRT (Near-Real-Time) rules run with under 5-minute latency and are designed for high-priority detections where timing matters, but are limited to single-table queries without joins. Fusion rules are machine-learning-based correlation rules maintained by Microsoft that combine signals across multiple data sources to detect multi-stage attacks; they cannot be written by analysts and only fire when the Fusion engine's model identifies a correlated sequence. Microsoft Security rules automatically create Sentinel alerts from alerts generated by connected Microsoft security products (Defender for Endpoint, Defender for Cloud, etc.) without requiring KQL authoring.

How do I optimize slow KQL queries?

The single most impactful optimization is filtering on time and high-cardinality columns as early in the pipeline as possible, before any join or summarize operation. Use `where TimeGenerated > ago(1h)` at the top of every query, and add indexed column filters (AccountName, Computer, IPAddress) immediately after. Avoid using `contains` or `has` with short strings; prefer `startswith` or exact equality checks which use the inverted index. For joins, place the smaller table on the left side and use `join kind=leftsemi` when you only need to filter, not enrich. Replace `where * has "value"` full-text scans with specific column references. Use `summarize` with `bin(TimeGenerated, 1h)` to aggregate before joining when building time-bucketed correlation rules.

What is the best way to test a Sentinel detection rule before enabling it?

The safest testing workflow is to run the KQL query directly in the Logs blade against a representative time window (at least 7 days) before creating the rule. Examine the result set for false positives and tune the `where` filters accordingly. Use the rule creation wizard's "Test with current data" button to preview how many alerts the rule would have generated over the past 50 days. Set the alert threshold appropriately and use alert grouping to avoid alert storms if the query returns multiple rows per incident. For high-confidence new rules, start with a lower severity (Informational or Low) and monitor for a week before escalating to Medium or High. The Microsoft Sentinel GitHub repository contains community-contributed rules with known false positive patterns documented in rule metadata.

How do I convert a Sigma rule to KQL for Sentinel?

The primary tool for Sigma-to-KQL conversion is the `sigma` Python CLI with the `pySigma-backend-microsoft365defender` or `pySigma-backend-azuremonitor` backend depending on whether your data is in Microsoft 365 Defender tables or Azure Monitor / Sentinel tables. Install with `pip install sigma-cli pySigma-backend-azuremonitor` and run `sigma convert -t azuremonitor rule.yml`. The output KQL requires manual review: field name mappings between Sigma's generic schema and Sentinel's actual table field names (for example, Sigma's `CommandLine` maps to `Process.CommandLine` in DeviceProcessEvents) may need correction. The Uncoder.IO online tool provides browser-based Sigma conversion without requiring local tooling setup and supports multiple Sentinel table targets.

Sources & references

Free resources

Free download

Critical CVE Reference Card 2025–2026

25 actively exploited vulnerabilities with CVSS scores, exploit status, and patch availability. Print it, pin it, share it with your SOC team.

Free download

Ransomware Incident Response Playbook

Step-by-step 24-hour IR checklist covering detection, containment, eradication, and recovery. Built for SOC teams, IR leads, and CISOs.

Free newsletter

Get threat intel before your inbox does.

50,000+ security professionals read Decryption Digest for early warnings on zero-days, ransomware, and nation-state campaigns. Free, weekly, no spam.

Unsubscribe anytime. We never sell your data.

Author

Eric BangCISSP

Founder & Cybersecurity Evangelist, Decryption Digest

Cybersecurity professional with expertise in threat intelligence, vulnerability research, and enterprise security. Covers zero-days, ransomware, and nation-state operations for 50,000+ security professionals weekly.

View profile →LinkedIn

Back to all briefings

Subscribe for Updates

KQL Microsoft Sentinel Kusto Query Language Microsoft Sentinel detection rules threat hunting KQL Sentinel scheduled rules NRT detection rules detection engineering SecurityEvent table SigninLogs KQL SIEM query language

Free Brief

The Mythos Brief is free.

AI that finds 27-year-old zero-days. What it means for your security program.