Shadow AI: How to Discover, Govern, and Reduce Risk from Unauthorized AI Tool Use
Shadow AI follows the same pattern as shadow IT: technology adoption by employees outpaces governance by security teams, creating uncontrolled risk. The difference is scale and sensitivity. When employees paste source code, customer records, M&A documents, or internal financial data into a personal ChatGPT account, that data leaves the corporate environment and may be used to train future model versions, stored in breach-accessible cloud infrastructure, or subject to foreign government data requests depending on the provider's jurisdiction.
CASB telemetry from 2026 shows that the average enterprise employee uses 4.7 distinct AI applications, of which the security team has approved an average of 1.2. Closing that gap requires discovery, classification, policy, and enforcement in that order.
Discovering Shadow AI Usage Across the Enterprise
You cannot govern what you cannot see. Discovery is the prerequisite step, and it requires data from multiple sources because AI tool usage happens across web browsers, API calls, installed applications, and mobile devices.
CASB (Cloud Access Security Broker) as the primary discovery source
CASBs that operate inline (as a forward proxy or via API connectors) can identify AI application traffic by destination domain, URL patterns, and application fingerprinting. Major CASB vendors (Netskope, Microsoft Defender for Cloud Apps, Zscaler, Palo Alto Prisma Access) have built dedicated AI application catalogs that classify hundreds of AI tools by vendor, data handling practices, risk rating, and compliance certifications. Start here.
Key CASB queries to run (a scripted sketch follows the list):
- List all AI applications accessed by category (generative AI, code AI, image generation, AI-enhanced SaaS)
- Volume of data uploaded to each AI application (bytes uploaded, number of sessions)
- Users with the highest AI application usage outside approved tools
- AI applications with high-risk data handling ratings (no enterprise data agreements, training on user data, non-EU jurisdiction)
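These queries can be scripted against your CASB's reporting API for recurring review. Below is a minimal sketch assuming a generic REST reporting endpoint; the URL, parameters, and response fields are placeholders rather than any specific vendor's API, so map them to your CASB's actual reporting interface.

```python
# Sketch: pull AI-category application usage from a CASB reporting API
# and flag unapproved tools. The endpoint path, parameters, and response
# fields are placeholders; substitute your vendor's actual API.
import os
import requests

CASB_URL = "https://casb.example.com/api/v1/app-usage"  # placeholder URL
TOKEN = os.environ["CASB_API_TOKEN"]

resp = requests.get(
    CASB_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"category": "generative-ai", "period": "90d"},
    timeout=30,
)
resp.raise_for_status()

APPROVED = {"Microsoft 365 Copilot", "GitHub Copilot Enterprise", "Grammarly Business"}
for app in resp.json().get("applications", []):
    if app["name"] not in APPROVED:
        print(f"{app['name']}: {app['sessions']} sessions, "
              f"{app['bytes_uploaded']} bytes uploaded, "
              f"risk={app.get('risk_rating', 'unknown')}")
```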
Secure Web Gateway (SWG) and DNS logs
URL categorization databases in SWGs typically classify AI domains. Pull reports of traffic to the AI category over 30-90 days. DNS query logs are useful for identifying AI tools accessed from non-proxied devices (direct internet from corporate laptops, mobile devices on carrier networks).
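Where no proxy is in the path, even a simple sweep of exported DNS logs surfaces AI destinations. A minimal sketch below, assuming one query per line with the queried name as the last field; the domain list is a small illustrative sample to extend from your vendor's AI catalog.

```python
# Sketch: flag DNS queries to known AI tool domains in an exported log.
# Assumes the queried name is the last whitespace-separated field on
# each line; adjust parsing to your resolver's actual export format.
AI_DOMAINS = (
    "chatgpt.com", "chat.openai.com", "api.openai.com",
    "claude.ai", "api.anthropic.com", "gemini.google.com",
)

with open("dns_queries.log") as log:
    for line in log:
        fields = line.split()
        if not fields:
            continue
        qname = fields[-1].rstrip(".").lower()
        if any(qname == d or qname.endswith("." + d) for d in AI_DOMAINS):
            print(line.rstrip())
```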
Endpoint DLP telemetry
Endpoint DLP agents (Microsoft Purview, Forcepoint, Symantec DLP) can log clipboard paste events and file upload events by destination application. If an employee copies content from an internal document and pastes it into a browser tab pointing to chat.openai.com, endpoint DLP captures both the source (SharePoint document containing PII) and the destination (ChatGPT).
Developer toolchain scanning
Code-focused AI tools (GitHub Copilot, Cursor, Tabnine, Codeium, JetBrains AI Assistant) are used heavily by engineering teams and may not appear in standard CASB reports if they operate via IDE plugins rather than browser tabs. Audit installed IDE extensions and developer tool configurations, as in the sketch below. Review repositories for .env files and tool configurations containing AI provider API keys, which indicate direct API usage.
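As a starting point for the extension audit on managed endpoints, VS Code exposes a `code --list-extensions` command. The sketch below flags a few known AI assistant extensions; the ID list is a partial, illustrative sample, so build the real one from your IDE marketplace inventory.

```python
# Sketch: list installed VS Code extensions and flag known AI assistants.
# The AI_EXTENSIONS set is a small illustrative sample, not exhaustive.
import subprocess

AI_EXTENSIONS = {
    "github.copilot",
    "github.copilot-chat",
    "tabnine.tabnine-vscode",
    "codeium.codeium",
}

result = subprocess.run(
    ["code", "--list-extensions"],
    capture_output=True, text=True, check=True,
)

for ext in result.stdout.lower().split():
    if ext in AI_EXTENSIONS:
        print(f"AI extension installed: {ext}")
```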
Survey-based discovery
Combine technical discovery with a voluntary self-reporting survey. Ask employees which AI tools they use for work. Survey data identifies tools that evade technical controls (e.g., AI tools accessed via personal mobile devices) and provides insight into use case categories (writing assistance, code generation, data analysis, customer research).
Risk Classification Framework for AI Applications
Not all shadow AI carries equal risk. Apply a classification framework to prioritize remediation and policy decisions.
Risk dimension 1: Data handling practices
The highest-risk AI applications use submitted content to train future models and do not offer enterprise data agreements that opt out of training. Verify the following for each discovered AI tool:
- Does the provider use user-submitted content for training? (Check Terms of Service and Enterprise agreements)
- Does the provider offer a Business Associate Agreement (BAA) for HIPAA-regulated data?
- Where is data processed and stored? (EU AI Act, China PIPL, and GDPR jurisdiction implications)
- What is the data retention period for submitted content?
Risk dimension 2: Access to sensitive data categories
Classify discovered AI tools by the type of data employees are submitting, using CASB DLP inspection of uploaded content:
- Source code and proprietary algorithms
- Customer PII and financial data
- Internal documents classified as Confidential or Restricted
- Credentials, API keys, or secrets in pasted content
- M&A, legal, or HR sensitive information
Risk dimension 3: Tool category and capability
Agentic AI tools that can take actions (browse the web, execute code, send email) carry higher risk than passive generation tools. Browser-integrated AI assistants that see all browser content carry higher risk than single-prompt tools.
Risk tiers (a classification sketch follows the list):
- Tier 1 (Approved): Tools with enterprise agreements, data processing agreements, opt-out from training, and security certifications (SOC 2, ISO 27001)
- Tier 2 (Conditional): Tools acceptable for non-sensitive use cases with user training and monitoring
- Tier 3 (Blocked): Tools with no enterprise agreements, no data handling commitments, or that operate from high-risk jurisdictions
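Encoding the tier rules in code keeps assessments consistent and auditable across reviewers. A minimal sketch, with illustrative attribute names and thresholds to tune to your own framework:

```python
# Sketch: map an assessed AI tool to a risk tier. Attribute names and
# tiering rules are illustrative; align them to your own framework.
from dataclasses import dataclass

@dataclass
class AIToolAssessment:
    name: str
    enterprise_agreement: bool    # signed enterprise/data processing agreement
    trains_on_user_data: bool     # provider trains on submitted content
    certified: bool               # e.g., SOC 2 or ISO 27001
    high_risk_jurisdiction: bool

def assign_tier(tool: AIToolAssessment) -> int:
    if not tool.enterprise_agreement or tool.high_risk_jurisdiction:
        return 3  # Blocked
    if not tool.trains_on_user_data and tool.certified:
        return 1  # Approved
    return 2      # Conditional

tool = AIToolAssessment("ExampleChat", enterprise_agreement=False,
                        trains_on_user_data=True, certified=False,
                        high_risk_jurisdiction=False)
print(f"{tool.name}: Tier {assign_tier(tool)}")  # -> ExampleChat: Tier 3
```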
Building an AI Acceptable Use Policy
An AI acceptable use policy (AUP) is the governance foundation. Without it, enforcement is arbitrary and employees have no clear guidance. The policy must be practical enough that employees follow it rather than route around it.
Core policy components:
Approved tool list with clear use cases. Name the specific tools employees may use (example: Microsoft 365 Copilot for internal content, GitHub Copilot Enterprise for code with internal context, Grammarly Business for writing assistance). Specify approved use cases for each tool. Vague permission creates ambiguity.
Data classification rules. Specify which data classifications may not be submitted to AI tools under any circumstances. Example: Confidential and Restricted data (as defined in the Data Classification Policy) must never be submitted to AI tools, including approved tools, unless the tool has a validated enterprise data agreement and explicit CISO approval.
Specific prohibitions. Name behaviors that are always prohibited regardless of tool: submitting customer PII, source code from unreleased products, legal communications, M&A documents, credentials or API keys, unpublished financial data.
Attribution and disclosure requirements. Specify when AI-generated content must be disclosed (customer-facing content, regulatory filings, academic submissions, contractual representations).
Exception and approval process. Define how employees can request approval for tools not on the approved list. Make the process fast (2-5 business days) or employees will bypass it. Automate approvals for tools already assessed as Tier 1.
Consequences. State the disciplinary consequences of policy violation clearly. Ambiguity signals that the policy is not enforced.
Technical Enforcement Controls
Policy without enforcement is a wishlist. Implement technical controls aligned to the policy tiers.
CASB blocking for Tier 3 applications
Configure the CASB to block access to Tier 3 AI applications by application category or specific domains. Implement a user-facing block page that explains why the tool is blocked and provides links to approved alternatives. Blind blocking without explanation drives shadow usage to mobile devices; a clear message with alternatives reduces workarounds.
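One way to keep the blocklist aligned with tiering decisions is to generate it from the tool catalog rather than maintain it by hand. A minimal sketch, with a hypothetical catalog structure:

```python
# Sketch: derive a CASB/SWG domain blocklist from the tiered tool catalog.
# Catalog structure and entries are hypothetical.
catalog = [
    {"name": "ExampleChat", "tier": 3, "domains": ["examplechat.ai"]},
    {"name": "Approved Assistant", "tier": 1, "domains": ["assistant.example.com"]},
]

blocklist = sorted(
    domain
    for tool in catalog if tool["tier"] == 3
    for domain in tool["domains"]
)
print("\n".join(blocklist))  # feed into the CASB/SWG custom URL category
```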
DLP rules for sensitive data submission to AI
Deploy inline CASB DLP inspection on uploads to AI application categories, not just Tier 3. Even approved tools should not receive data classified as Restricted or Confidential. Create DLP policies that detect source code patterns, PII data types, and internal document classification markers in content uploaded to any AI domain. Alert or block on policy-matching submissions rather than allowing them silently.
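To illustrate the kind of content matching these policies perform, here is a simplified sketch of regex checks on an outbound upload body. The patterns are deliberately minimal; production DLP engines use validated detectors (checksums, proximity rules, exact data matching) for each data type.

```python
# Sketch: simplified content checks of the kind a DLP policy applies to
# uploads bound for AI domains. Patterns are illustrative, not production-grade.
import re

DLP_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "classification_marker": re.compile(r"\b(?:CONFIDENTIAL|RESTRICTED)\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def dlp_findings(body: str) -> list[str]:
    return [name for name, pattern in DLP_PATTERNS.items() if pattern.search(body)]

upload = "Draft for review. CONFIDENTIAL. Applicant SSN 123-45-6789."
print(dlp_findings(upload))  # -> ['us_ssn', 'classification_marker']
```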
Endpoint DLP clipboard monitoring
Configure endpoint DLP agents to monitor clipboard paste events to AI application browser tabs. This catches sensitive data copied from internal applications (Outlook, SharePoint, Salesforce) and pasted into AI chat interfaces. Apply a lighter-touch response (alert and log) rather than block for initial rollout to avoid employee productivity complaints while building baseline data.
Browser extension management
Many AI tools deploy as browser extensions (Grammarly, Otter, various AI writing assistants) that have permissions to read all page content. Audit and enforce allowed browser extensions via Chrome Enterprise Policy or Microsoft Edge management. Block extensions from unknown publishers or those with high-risk permissions (read all website data, access to all browser tabs).
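Chrome Enterprise implements this with the `ExtensionInstallBlocklist` and `ExtensionInstallAllowlist` policies: block everything by default, then allow the reviewed set. The sketch below writes a managed policy file for Chrome on Linux; the extension ID is a placeholder, and Windows or macOS fleets set the same policies via GPO, Intune, or MDM instead.

```python
# Sketch: emit a Chrome managed policy that blocks all extensions except
# an approved allowlist. The extension ID below is a placeholder; use the
# real 32-character IDs from the Chrome Web Store.
import json

policy = {
    "ExtensionInstallBlocklist": ["*"],  # deny by default
    "ExtensionInstallAllowlist": [
        "abcdefghijklmnopabcdefghijklmnop",  # placeholder extension ID
    ],
}

# Managed policy directory for Chrome on Linux.
with open("/etc/opt/chrome/policies/managed/extensions.json", "w") as f:
    json.dump(policy, f, indent=2)
```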
API key and credential scanning
Developers frequently paste OpenAI, Anthropic, or Google AI API keys into shared documents, Slack messages, or version control. Deploy secrets scanning in GitHub/GitLab (GitHub Advanced Security, Gitleaks), Slack DLP, and SharePoint to detect and alert on AI provider API keys. Rotating compromised keys is more cost-effective than investigating API abuse after the fact.
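Alongside the platform-native scanners, a lightweight sweep for AI provider key formats is easy to script. The prefixes below reflect publicly documented key formats at the time of writing and can change, so treat the patterns as a starting point rather than a complete detector.

```python
# Sketch: scan a directory tree for strings matching common AI provider
# API key formats. Prefixes are publicly documented but may change.
import re
from pathlib import Path

KEY_PATTERNS = {
    "anthropic": re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}"),
    "openai": re.compile(r"\bsk-(?!ant-)[A-Za-z0-9_-]{20,}"),
    "google_ai": re.compile(r"\bAIza[A-Za-z0-9_-]{35}\b"),
}

for path in Path(".").rglob("*"):
    if not path.is_file():
        continue
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        continue
    for provider, pattern in KEY_PATTERNS.items():
        if pattern.search(text):
            print(f"{path}: possible {provider} API key")
```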
Measuring Shadow AI Risk Reduction Over Time
Shadow AI governance is not a one-time project. Track these metrics to measure program effectiveness and demonstrate progress to leadership.
Discovery metrics:
- Number of distinct AI applications in use (decreasing over time as policy takes hold)
- Percentage of AI application usage covered by approved tools
- Volume of data submitted to Tier 3 applications (target: approaching zero after blocking)
Policy compliance metrics:
- Percentage of employees who have completed AI acceptable use training
- Number of DLP policy violations involving AI tools (trending down over time)
- Number of exception requests processed and average approval time
Risk reduction metrics:
- Number of sensitive data submissions to non-approved AI tools blocked by DLP
- Number of AI-related API keys detected and rotated via secrets scanning
- Risk score of the AI application portfolio (weighted average of data handling risk across all tools; see the sketch below)
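The portfolio risk score in the last metric can be computed directly from the tool catalog. A minimal sketch, assuming each tool carries a 1-5 data handling risk rating weighted by observed usage volume:

```python
# Sketch: usage-weighted average risk score of the AI application
# portfolio. Risk ratings (1 = low, 5 = high) and weights are assumptions.
def portfolio_risk_score(tools: list[dict]) -> float:
    total_usage = sum(t["usage"] for t in tools)
    if total_usage == 0:
        return 0.0
    return sum(t["risk"] * t["usage"] for t in tools) / total_usage

catalog = [
    {"name": "Approved copilot", "risk": 1, "usage": 9_000},   # bytes uploaded
    {"name": "Unapproved chat tool", "risk": 5, "usage": 1_000},
]
print(round(portfolio_risk_score(catalog), 2))  # -> 1.4
```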
Report these metrics quarterly to the CISO and annually to the board or risk committee, alongside the AI tool catalog and policy update status.
The bottom line
Shadow AI is not going away. The productivity value of AI tools is real, and blocking all AI usage is both impractical and counterproductive. The correct response is rapid, pragmatic governance: discover what is in use, classify by risk, build an approved tool list with clear use cases, enforce DLP on sensitive data submission, and block only the highest-risk tools. Organizations that govern AI usage effectively capture the productivity benefit while protecting sensitive data. Those that ignore it are already experiencing the data leakage consequences.
Frequently asked questions
How common is shadow AI in enterprise environments?
Extremely common. Netskope and Salesforce research from 2026 consistently shows that more than 65% of enterprise employees use AI tools their IT department has not approved. The tools used most frequently include personal ChatGPT and Claude accounts, AI-enhanced browser extensions, AI writing tools (Grammarly, Quillbot), AI coding assistants, and AI-powered meeting note tools (Otter, Fireflies). The gap between employee AI usage and approved AI tools is wider than the gap was for shadow SaaS five years ago.
What data is most commonly submitted to unauthorized AI tools?
CASB telemetry and incident data consistently show: internal communications and email drafts (highest volume), source code and technical documentation, customer-facing content being drafted, presentation decks and strategy documents, and data analysis tasks involving exports from internal databases. The source code category is particularly high-risk because it may contain embedded credentials, proprietary algorithms, and customer data in test fixtures.
Does Microsoft 365 Copilot eliminate shadow AI risk?
No, it reduces it for Microsoft-specific workflows but does not eliminate it. Employees with M365 Copilot still use other AI tools for tasks where Copilot is not well-suited: image generation, long-form video transcription, specialized coding tools, AI-enhanced research tools, and use cases where the employee prefers a different model. Deploying an approved enterprise AI suite reduces shadow AI volume but governance and monitoring are still required.
What are the regulatory implications of shadow AI for GDPR and HIPAA-covered organizations?
Significant. Under GDPR, submitting EU personal data to a third-party AI provider requires a Data Processing Agreement (DPA) with the provider and a valid legal basis for the transfer. If an employee pastes customer PII into an unauthorized AI tool, the organization may have processed EU personal data without a valid DPA, creating Article 28 and Article 46 violations. Under HIPAA, submitting Protected Health Information (PHI) to an AI tool that lacks a Business Associate Agreement (BAA) is a HIPAA violation regardless of the employee's intent. Discovery and blocking of sensitive data submission to non-compliant tools is a compliance requirement, not just a security best practice.
How do I write an AI acceptable use policy that employees will actually follow?
Three principles: First, make the approved list generous enough to cover real use cases. If employees need AI writing assistance and you only approve enterprise tools that require IT provisioning, they will use personal accounts. Second, explain the why behind prohibitions. Employees who understand that pasting customer data into ChatGPT may feed it into training data are more compliant than those who see unexplained blocks. Third, make the exception process fast. A 2-5 business day turnaround for tool approval requests prevents workarounds. Pair the policy with training, not just a signature acknowledgment.
Which CASB vendors have the best AI application coverage?
Netskope leads in AI application catalog depth with over 800 classified AI applications as of 2026. Microsoft Defender for Cloud Apps integrates natively with M365 environments and has strong coverage of Microsoft-adjacent AI tools. Zscaler and Palo Alto Prisma Access both have AI-specific application categories with risk ratings. When evaluating CASB for AI governance, ask specifically about the number of AI applications in the catalog, the frequency of catalog updates, and whether the vendor provides data handling risk ratings alongside application identity.
Should we monitor employee AI usage after deploying governance controls?
Yes, with appropriate disclosure. Inform employees in the AI acceptable use policy and privacy notice that AI application usage is monitored for security and compliance purposes, consistent with your existing acceptable use monitoring disclosures. Focus monitoring on anomalous patterns (large data uploads to AI tools, usage of blocked tools via workarounds) rather than individual prompt content. Aggregate, anonymized reporting is sufficient for most governance needs and raises fewer employee relations concerns than individual-level surveillance.