# Creating an AI Agent Security Policy

AI Agent Security policies are configured as exfiltration policies in Nightfall. This guide walks through each step of the policy creation wizard.

***

### Getting Started

1. Navigate to Configuration > Policies > Exfiltration.
2. Click + New Policy.
3. Select AI Agent Security as the integration type.

***

### Step 1: Choose Hook Types

Enable one or more hook types. Each can be independently toggled:

| Hook Type       | Can Block | What It Scans                                   |
| --------------- | --------- | ----------------------------------------------- |
| User Prompts    | Yes       | Prompt text before it reaches the AI model      |
| Tool Calls      | Yes       | Tool name and input parameters before execution |
| Tool Responses  | No        | Tool output after execution (monitor only)      |
| Model Responses | No        | Model response after execution (monitor only)   |
| Shell Commands  | Yes       | Shell command string before execution           |
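The blocking capabilities in the table can be captured as a small lookup. The sketch below is illustrative only: the `Action` enum and `effective_action` function are hypothetical names, and the fallback from Block to Monitor on monitor-only hooks is an assumption, not documented Nightfall behavior.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    MONITOR = "monitor"


# Blocking capability per hook type, copied from the table above.
CAN_BLOCK = {
    "User Prompts": True,
    "Tool Calls": True,
    "Tool Responses": False,
    "Model Responses": False,
    "Shell Commands": True,
}


def effective_action(hook_type: str, requested: Action) -> Action:
    """Return the action that can actually apply to a hook type.

    Assumption: a Block request on a monitor-only hook falls back to Monitor.
    """
    if requested is Action.BLOCK and not CAN_BLOCK[hook_type]:
        return Action.MONITOR
    return requested
```

Under this assumption, `effective_action("Tool Responses", Action.BLOCK)` yields `Action.MONITOR`, because tool output is only seen after execution.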

***

### Step 2: MCP Server Scope

This step defines which MCP servers the policy is evaluated against. The scope also applies to tool responses (data coming back from the server), not just outbound calls. What happens when a match occurs (block, alert, and so on) is configured separately under Remediation Actions.

* All MCP servers - the policy applies to every connected MCP server.
* Specific MCP servers - the policy applies only to a chosen list of servers.
* All except these MCP servers - the policy applies to every server except a chosen list of excluded servers.
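The three scope options behave like an allow-all, an allowlist, and a denylist. A minimal sketch of that logic follows; the `McpScope` type and the mode labels `"all"`, `"specific"`, and `"all_except"` are hypothetical and do not reflect Nightfall's internal representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class McpScope:
    # Hypothetical mode labels: "all", "specific", or "all_except".
    mode: str
    servers: frozenset = field(default_factory=frozenset)


def in_scope(scope: McpScope, server: str) -> bool:
    """Return True if the policy should be evaluated for this server."""
    if scope.mode == "all":
        return True
    if scope.mode == "specific":
        return server in scope.servers
    if scope.mode == "all_except":
        return server not in scope.servers
    raise ValueError(f"unknown scope mode: {scope.mode!r}")
```

For example, `in_scope(McpScope("all_except", frozenset({"github"})), "github")` is `False`, while any other server name is in scope.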

When you select "Specific MCP servers" or "All except these MCP servers", a drop-down picker appears:

#### 1: MCP Server Collections

Select one or more named server collections. All servers across selected collections are combined.

Pre-defined collections are organized by category:

* Code Hosting
* Databases
* Communication
* Cloud Infrastructure
* Observability
* Project Management
* File System
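Combining servers across the selected collections amounts to a set union. The sketch below assumes a simple mapping from collection name to server names; the server names are made up for illustration, and `servers_in_collections` is a hypothetical helper.

```python
# Hypothetical collection contents; real collections are managed in Nightfall.
COLLECTIONS = {
    "Code Hosting": {"github", "gitlab"},
    "Databases": {"postgres"},
    "Communication": {"slack"},
}


def servers_in_collections(
    collections: dict[str, set[str]], selected: list[str]
) -> set[str]:
    """Union the servers of every selected collection."""
    combined: set[str] = set()
    for name in selected:
        combined |= collections[name]
    return combined
```

Selecting "Code Hosting" and "Databases" here would produce the combined set of `github`, `gitlab`, and `postgres`.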

To manually add a new MCP server, or tools for a server, go to the Collections list page under AI Governance > Collections. You can select individual servers and optionally limit the selection to specific tools within each server. Nightfall captures the tool inventory and makes it available on the Collections list page through the Add server and Add tools buttons. There is no blanket collection that contains every discovered server and tool.

For example, you could allow the GitHub MCP server but only for read operations. To do so, specify this in the MCP server collection and configure an appropriate policy.

#### 2: Wildcard Patterns

#### How servers are identified

Nightfall identifies the MCP server from the tool name reported by each AI client. Because clients format these names differently, the server is not always identifiable. The table below uses the fetch tool on a server named github as an example.

| AI client      | Tool name format            | Example                | Server identified? |
| -------------- | --------------------------- | ---------------------- | ------------------ |
| Claude Code    | mcp\_\_\<server>\_\_\<tool> | mcp\_\_github\_\_fetch | Yes                |
| GitHub Copilot | mcp\_\<server>\_\<tool>     | mcp\_github\_fetch     | Usually            |
| Cursor         | MCP:\<tool>                 | MCP:fetch              | No                 |
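The formats in the table can be told apart by prefix. The following sketch shows one way a parser might do this; `parse_tool_name` and `ParsedTool` are hypothetical names, and splitting the Copilot format at the first underscore is an assumption made for illustration, not Nightfall's actual logic.

```python
from typing import NamedTuple, Optional


class ParsedTool(NamedTuple):
    server: Optional[str]  # None when the server cannot be identified
    tool: str


def parse_tool_name(name: str) -> ParsedTool:
    """Split a client-reported tool name into (server, tool)."""
    if name.startswith("MCP:"):
        # Cursor format: no server name is present.
        return ParsedTool(None, name[len("MCP:"):])
    if name.startswith("mcp__"):
        # Claude Code format: double underscores delimit server and tool.
        server, sep, tool = name[len("mcp__"):].partition("__")
        if sep:
            return ParsedTool(server, tool)
    elif name.startswith("mcp_"):
        # Copilot format: assume the first underscore ends the server name.
        server, sep, tool = name[len("mcp_"):].partition("_")
        if sep:
            return ParsedTool(server, tool)
    return ParsedTool(None, name)
```

The `mcp__` check has to come before the `mcp_` check, since every Claude Code name also starts with `mcp_`.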

#### What this means for your policies

* Claude Code - Server-specific scoping works as expected.
* GitHub Copilot - Server-specific scoping works in most cases. When a server or tool name contains underscores, Nightfall may not be able to tell the server and tool apart reliably.
* Cursor - Cursor does not include the server name in its tool names. A Specific MCP servers or All except these MCP servers policy therefore cannot match Cursor traffic by server, and Cursor activity is treated as if All MCP servers were selected.
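The GitHub Copilot ambiguity is easy to see by listing every way a single-underscore name could be split. The helper below is a hypothetical illustration (`candidate_splits` is not a Nightfall function), and the server name `my_server` is made up.

```python
def candidate_splits(name: str) -> list[tuple[str, str]]:
    """List every possible (server, tool) split of an mcp_<server>_<tool> name."""
    parts = name.removeprefix("mcp_").split("_")
    return [
        ("_".join(parts[:i]), "_".join(parts[i:]))
        for i in range(1, len(parts))
    ]
```

`candidate_splits("mcp_github_fetch")` has exactly one result, `("github", "fetch")`, whereas `candidate_splits("mcp_my_server_get_issue")` has three, so the server name cannot be determined from the string alone.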

Recommendation: If you need to scope policies by server and your organization uses Cursor, pair the policy with a broader All MCP servers rule so Cursor traffic is still covered.

***

### Step 4: Shell Command Patterns (Optional)

When Shell Commands monitoring is enabled, you can optionally scope to specific command patterns. Leaving this field empty scans all shell commands.

Enter patterns as chips (type + Enter to add). Recommended patterns are shown as clickable suggestions below the input.
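The scoping rule can be sketched as follows. This page does not specify the pattern syntax, so the example assumes glob-style patterns matched with Python's `fnmatch`; `should_scan` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def should_scan(command: str, patterns: list[str]) -> bool:
    """Decide whether a shell command falls within the policy's scope.

    Assumption: patterns are glob-style. An empty list scans every command.
    """
    if not patterns:
        return True
    return any(fnmatchcase(command, pattern) for pattern in patterns)
```

With patterns `["curl *", "scp *"]`, a command such as `curl https://example.com -d @secrets.txt` would be scanned and `ls -la` would not.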

***

### Step 5: Detection Rules

Select the Nightfall detectors that define what sensitive data to look for. This works the same as any other exfiltration policy:

* Built-in detectors: PII (SSN, credit cards, phone numbers), credentials (API keys, passwords, tokens), source code patterns
* Custom detectors: Regular expressions, dictionaries, or ML-based classifiers you have created
* Detection rule logic: Combine multiple detectors with AND/OR logic and set confidence thresholds
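To illustrate how AND/OR logic and confidence thresholds interact, here is a minimal sketch. It assumes numeric confidence scores between 0 and 1 and uses hypothetical names (`detector_fired`, `rule_matches`); Nightfall's own confidence levels and rule engine may differ.

```python
def detector_fired(
    findings: dict[str, float], detector: str, min_confidence: float
) -> bool:
    """True if the detector produced a finding at or above the threshold."""
    return findings.get(detector, 0.0) >= min_confidence


def rule_matches(
    findings: dict[str, float],
    conditions: list[tuple[str, float]],
    mode: str = "any",
) -> bool:
    """Combine detector conditions with OR (mode="any") or AND (mode="all")."""
    if not conditions:
        return False  # a rule with no detectors never matches
    results = [detector_fired(findings, d, t) for d, t in conditions]
    return all(results) if mode == "all" else any(results)
```

Given findings `{"API Keys": 0.9, "Passwords": 0.4}` and a 0.8 threshold on both detectors, the OR rule matches and the AND rule does not.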

***

### Step 6: Actions and Alerts

#### Enforcement action

| Action  | Behavior                                                          |
| ------- | ----------------------------------------------------------------- |
| Block   | The AI agent action is denied. The end-user sees a block message. |
| Monitor | The action proceeds. An incident is created for review.           |

#### Admin alerting

Configure where violation alerts are sent:

* Slack - post to a channel
* Jira - create a ticket
* Email - send to specified recipients
* Webhook - POST to a custom endpoint

#### End-user notification

Separate end-user notifications are not available for AI Agent Security policies at this time. Instead, the custom block message is displayed directly in AI clients such as Cursor, Claude Code, and VS Code.

* The text shown to the end-user is the custom block message (e.g., "This action was blocked because it contains sensitive data. Contact <security@company.com> for help.")

#### Policy metadata

* Policy name and description
* Risk score - use the Nightfall default or set a custom severity (Critical, High, Medium, Low)

***

### Example: Block Credentials in Prompts

Here is an example of a common policy configuration:

1. AI Clients: Claude Code, Cursor, VS Code
2. Hook Types: User Prompts (Block), Tool Calls (Block), Shell Commands (Monitor)
3. MCP Server Scope: All MCP servers
4. Detection Rules: API Keys, Passwords, AWS Credentials (High confidence)
5. Action: Block
6. Alerts: Slack #security-alerts + Email to security team

This policy prevents developers from accidentally pasting API keys or credentials into AI prompts or tool calls, while monitoring shell commands for credential exposure.
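For review or documentation purposes, the same configuration can be written down as plain data. The structure below is a hypothetical representation with made-up key names; it is not a Nightfall API schema, and policies are created through the wizard described above.

```python
# Hypothetical, human-readable record of the example policy.
EXAMPLE_POLICY = {
    "ai_clients": ["Claude Code", "Cursor", "VS Code"],
    "hook_types": {
        "User Prompts": "block",
        "Tool Calls": "block",
        "Shell Commands": "monitor",
    },
    "mcp_server_scope": "all",
    "detection_rules": {
        "detectors": ["API Keys", "Passwords", "AWS Credentials"],
        "confidence": "high",
    },
    "action": "block",
    "alerts": {"slack": "#security-alerts", "email": "security team"},
}


def hooks_with_action(policy: dict, action: str) -> list[str]:
    """List the hook types configured with the given action."""
    return [hook for hook, a in policy["hook_types"].items() if a == action]
```

Here `hooks_with_action(EXAMPLE_POLICY, "block")` lists User Prompts and Tool Calls, matching the description above, with Shell Commands left in monitor mode.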

