1 of 100

Firewall for AI

Welcome

Welcome to the amazing world of the Nightfall Firewall for AI (formerly known as Nightfall Developer Platform). Here you can find all the information about Nightfall's APIs, and SDKs, and also usage examples of these APIs and SDKs.

Introduction

Welcome to Firewall for AI

Welcome to Nightfall's Firewall for AI Developers Scan and Workflow APIs documentation. This documentation helps developers leverage Nightfall AI's industry-leading detection engine to identify and protect sensitive customer and corporate data anywhere. It prevents unauthorized access and data breaches and allows you to focus on innovation.

Scan APIs

Scan prompts, text, documents, spreadsheets, logs, zips, JSON, images, etc., for PII, PHI, PCI, banking information, API keys, passwords, and network information with the highest accuracy and lightning-fast response times. Redact sensitive findings with customizable formatting.

Workflow APIs

Leverage the full potential of the Nightfall console application through our Workflow APIs. Customize your SIEM workflows and reporting, take actions, update support tickets, alert users, search violations, annotate findings, create reports, and more.

Key Features

AI-Powered Identification: Utilize advanced AI models to detect and prevent security threats in real-time.
Comprehensive Sensitive Data Detection: Identify PII, PHI, PCI, banking information, API keys, passwords, and network information across various formats including text, documents, spreadsheets, logs, zips, and images.
Customizable Redaction: Tailor data protection to your needs with fully customizable redaction for each sensitive entity type.
Flexible Detectors: Leverage Nightfall’s comprehensive list of machine learning-based detectors, customize them, or create your own with specialized logic.
High Accuracy and Performance: Achieve precision and recall rates of 95% or higher, handle over 1K requests per second, and experience latency of less than 100 ms.
Seamless Integration: Easily integrate with your existing AI development and data engineering tools for smooth and efficient operation.

Customizable and Built-in Machine Learning-based Detectors

You can leverage Nightfall’s machine learning-based detectors or create your own detectors with customized logic to scan third-party apps, internal services, and data silos to identify instances of potentially sensitive types of data such as:

Personally Identifiable Information (PII) including Social Security Numbers, passport numbers, email addresses, or date of birth
Protected Health Information (PHI) such as insurance claim numbers or ICD10 codes
Financial information like credit card numbers or bank routing numbers
Secrets such as API and cryptographic Keys, database connection strings, passwords, etc.
Network information such as IP Address or MAC Address

A Flexible Data Security Solution

Key features of Nightfall’s detection engine include:

Defining minimum confidence thresholds and minimum finding counts on detectors to reduce the chance of false positives.
Specifying context rules and exclusion rules on detectors to fine-tune their accuracy to better suit your use cases.
Choosing which detectors are triggered for each policy.

Using the API

The Nightfall API consumes arbitrary data as input either as strings or as files and allows you to use any combination of detectors to return a collection of “findings" objects.

The detectors may be defined in our web app and referenced in an API call or defined as part of the payload to an API call.

The findings display the relevant detector, the likelihood of a match, and the location within the given data where the matched token occurred (not only in terms bytes — there is support for tabular and JSON data as well).

You can take protective action on sensitive text by redacting, substituting, or encrypting it with the API. You may also set up webhooks to receive asynchronous notifications when findings are detected.

The Nightfall API is RESTful and uses JSON for its payloads. Our API is designed to have predictable, resource-oriented URLs for each endpoint and uses HTTP response codes to indicate any API errors.

You may test out the API through the interactive reference documentation .

Where to Go From Here

The following guide will walk you through getting started and describe the API functionality in more detail. If you want to execute an API call immediately, see our Quickstart guide to see how to obtain an API Key and make a simple scan request.

After that, you can learn about Nightfall with our Key Concepts section, which will also help you get set up with Nightfall.

If you’re looking for more ideas about best to leverage Nightfall’s functionality, see our Use Cases guide.

We have created numerous tutorials and example implementations that demonstrate how to implement DLP for a variety of platforms (including OpenAI, LangChang, Amazon, Datadog, and Elasticsearch) and handle various scenarios (such as detecting sensitive data in GenAI prompts or detecting PII on your machine in real-time).

We also have several language-specific SDKs to get you up and running in Java, Python, Go, Node.js, and Ruby.

You can also quickly test out Nightfall detectors or your custom Detection Rules in the Nightfall Playground. Please also consult our Detector Glossary to see the variety of built-in detectors that Nightfall offers.

The Firewall for AI Overview page allows you to create API keys and manage Detectors and Detection Rules through a straightforward user interface. Log in here to access the Dashboard, or sign up to create a free account.

For frequently asked questions, feedback, and other help, please contact Nightfall support at support@nightfall.ai. We also host Nightfall Developer Office Hours on Wednesdays at 12pm PT to help answer questions, talk through any ideas, and chat about data security. We would love to see you there!

Quickstart

The Document will guide you in making your first API request.

This page will get you up and running with the Nightfall API so you can start scanning for sensitive data.

Obtain an API key

The Nightfall API requires a valid API key to authenticate your API requests.

You can create API keys in the Dashboard.

Learn more about Authentication and Security.

Make an API Scan Request

Below is an example request to the scan endpoint.

To run this example yourself, replace the API key (NF-rEpLaCe...) with the one you created in the dashboard or set it as the environment variable NIGHTFALL_API_KEY as necessary.

The cURL example may be run from the command line without any additional installation. To run the Python example, you will need to download the corresponding SDK.

curl --request POST \
     --url https://api.nightfall.ai/v3/scan \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer  NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
     --header 'Content-Type: application/json' \
     --data '
{
     "policy": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "minNumFindings": 1,
                              "minConfidence": "VERY_LIKELY",
                              "displayName": "US Social Security Number",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER"
                         },
                         {
                              "redactionConfig": {
                                   "maskConfig": {
                                        "charsToIgnore": [
                                             "-"
                                        ],
                                        "maskingChar": "X",
                                        "maskRightToLeft":true,
                                        "numCharsToLeaveUnMasked":4
                                   }
                              },
                              "minNumFindings": 1,
                              "minConfidence": "VERY_LIKELY",
                              "displayName": "Credit Card Number",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "nightfallDetector": "CREDIT_CARD_NUMBER"
                         }
                    ],
                    "name": "My Match Rule",
                    "logicalOp": "ANY"
               }
          ]
     },
     "payload": [
          "The customer social security number is 458-02-6124",
          "No PII in this string",
          "My credit card number is 5310-2768-6832-9293"
     ]
}
'

// By default, the client reads your API key from the environment variable NIGHTFALL_API_KEY
const nfClient = new Nightfall();

const payload = [
          "The customer social security number is 458-02-6124",
          "No PII in this string",
          "My credit card number is 5310-2768-6832-9293"
     ];
     

const policy = {
		 "detectionRules": [
               {
                    "detectors": [
                         {
                              "minNumFindings": 1,
                              "minConfidence": "LIKELY",
                              "displayName": "US Social Security Number",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER"
                         },
                         {
                              "redactionConfig": {
                                   "maskConfig": {
                                        "charsToIgnore": [
                                             "-"
                                        ],
                                        "maskingChar": "#"
                                   }
                              },
                              "minNumFindings": 1,
                              "minConfidence": "LIKELY",
                              "displayName": "Credit Card Number",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "nightfallDetector": "CREDIT_CARD_NUMBER"
                         }
                    ],
                    "name": "My Match Rule",
                    "logicalOp": "ANY"
               }
          ]
     };
     
const response = await nfClient.scanText(payload, policy);

if (response.isError) {
  console.log(response.getError());
} else {
  response.data.findings.forEach((finding) => {
    if (finding.length > 0) {
      finding.forEach((result) => {
        console.log(`Finding: ${result.finding}, Confidence: ${result.confidence}`);
      });
    }
  });
}// Some code

>>> from nightfall import Confidence, DetectionRule, Detector, Nightfall

>>> # By default, the client reads the API key from the environment variable NIGHTFALL_API_KEY
>>> nightfall = Nightfall()

>>> # A rule contains a set of detectors to scan with
>>> cc = Detector(min_confidence=Confidence.LIKELY, nightfall_detector="CREDIT_CARD_NUMBER")
>>> ssn = Detector(min_confidence=Confidence.POSSIBLE, nightfall_detector="US_SOCIAL_SECURITY_NUMBER")
>>> detection_rule = DetectionRule([cc, ssn])
>>> payload = ["hello world", "my SSN is 678-99-8212", "4242-4242-4242-4242"]
>>> findings, _ = nightfall.scan_text( payload, detection_rules=[detection_rule])

The Policy (policy) you define indicates what to scan for in your payload with a logical grouped (ANY or ALL) set of Detection Rules (detectionRules).

Detection Rules can be defined two ways:

inline as code, as shown above
in the Nightall app, which you will then reference by UUID.

Learn more about setting up Nightfall in the Nightfall app to create your own Detectors, Detection Rules, and Policies. See Using Pre-Configured Detection Rules for an example as to how to execute queries using an existing Detection Rules UUID.

In the example above, two of Nightfall's native Detectors are being used: US_SOCIAL_SECURITY_NUMBER and CREDIT_CARD_NUMBER.

You can find a full list of native Detectors in the Detector Glossary.

If you don't want to create your Detectors, Detection Rules, and Policies in the Nightfall app, but would prefer to do it in code, it is possible to define Detectors inline with your own regular expressions or word list as well as extend our native Detectors with exclusion and context rules.

When defining a Detection Rule, you configure the minimum confidence level (minConfidence) and minimum number of times the match must be found (minNumFindings) for the rule to be triggered.

Another feature Nightfall offers is the ability to redact sensitive findings. Detectors may be configured (via redactionConfig) to replace the text that triggered them with a variety of customizable masks, including an encrypted version of the text.

In the payload body, you can see that we are submitting a list of three different strings to scan (payload). The first will trigger the U.S. Social Security Detector. The last will trigger the credit card Detector. The middle example will trigger neither.

Example Nightfall API Scan Response

The Nightfall API returns a response with an array (findings) with a length that corresponds to the length of the payload array. In this example, only the first and last items in the request payload triggered the Detectors, so the second element of the array is empty.

In the first element of the array, you can see details about which Detection Rule was triggered and the data that was found (finding). The response also provides a confidence level (confidence), as well as the location within the original text where the data was found either in terms of bytes (byteRange) or characters (codepointRange).

{
  "findings": [
    [
      {
        "finding": "458-02-6124",
        "redactedFinding": "XXX-XXXX-XXXX-9293",
        "detector": {
          "name": "US Social Security Number",
          "uuid": "e30d9a87-f6c7-46b9-a8f4-16547901e069"
        },
        "confidence": "VERY_LIKELY",
        "location": {
          "byteRange": {
            "start": 39,
            "end": 50
          },
          "codepointRange": {
            "start": 39,
            "end": 50
          },
          "rowRange": null,
          "columnRange": null,
          "commitHash": ""
        },
        "matchedDetectionRuleUUIDs": [],
        "matchedDetectionRules": [
          "My Match Rule"
        ]
      }
    ],
    [],
    [
      {
        "finding": "5310-2768-6832-9293",
        "redactedFinding": "XXXX-XXXX-XXXX-9293",
        "detector": {
          "name": "Credit Card Number",
          "uuid": "74c1815e-c0c3-4df5-8b1e-6cf98864a454"
        },
        "confidence": "VERY_LIKELY",
        "location": {
          "byteRange": {
            "start": 25,
            "end": 44
          },
          "codepointRange": {
            "start": 25,
            "end": 44
          },
          "rowRange": null,
          "columnRange": null,
          "commitHash": ""
        },
        "redactedLocation": {
          "byteRange": {
            "start": 25,
            "end": 44
          },
          "codepointRange": {
            "start": 25,
            "end": 44
          },
          "rowRange": null,
          "columnRange": null,
          "commitHash": ""
        },
        "matchedDetectionRuleUUIDs": [],
        "matchedDetectionRules": [
          "My Match Rule"
        ]
      }
    ]
  ],
  "redactedPayload": [
    "",
    "",
    "My credit card number is XXXX-XXXX-XXXX-9293"
  ]
}

Congratulations! You have successfully completed the Nightfall Quickstart.

You can modify the Detectors or payload in the example request to get more practice with the Nightfall API.

Use Cases

There are many use cases for a high accuracy data classification and protection system like Nightfall. Here are some of the most popular to spark your imagination.

We can't wait to hear more about what you're planning to build: reach out to us anytime at support@nightfall.ai to discuss your use case.

Protect sensitive data from transferring to downstream 3rd party services like LLM APIs.

Motivation

Third-party APIs provide services that greatly augment the capabilities of your applications.
- For example, GenAI LLMs can automatically generate content. These LLMs can be accessed via APIs, such as OpenAI or Anthropic APIs.
- Another example are telecom/communications APIs like SendGrid and Twilio that provide communications infrastructure.
The challenge is that these services may unnecessarily receive sensitive or confidential information from your application that is calling these APIs, which can pose data privacy risks because customer data is being shared outside the intended scope. For example, LLMs can handle very large inputs, or prompts, and these prompts may contain sensitive customer information.

Benefits

By filtering out customer data from API inputs, you will be able to leverage cutting-edge third-party services and APIs without introducing data privacy risks by oversharing sensitive or confidential information.

Sanitize user input to prevent unnecessary collection or proliferation of sensitive customer data.

Motivation

Applications collect and store sensitive information from consumers. Users may “overshare” or incorrectly input information, leading to sensitive data ending up in places it is not expected, or internal services may proliferate or handle this data in unexpected ways.
- Fintech applications that intake, store, and generate files with PII like W-2s and paystubs.
- Healthcare applications that handle protected health information or SSNs.
Marketplaces and social media applications allow for user generated content that may contain sensitive or illicit information, such as profanity, toxicity.
Support channels receive any inbound information from consumers, and can include highly sensitive information or over-sharing that is then exposed to support agents.
This data can come in a variety of unstructured formats - whether that be screenshots, images, documents, plaintext, compressed folders or archives, so to inspect this content requires high quality text extraction.

Benefits

Reduce the possibility of users inputting sensitive data that should not be collected or retained within your application or service by scanning data upon submission. Warn or prevent users from inputting sensitive data into form fields or file uploads.
Diminish collection of sensitive data types that could result in regulatory fines or brand damage, if leaked or breached.
Limit exposure of sensitive data to internal personnel like support agents that could lead to accidental misuse or intentional theft.

Audit and remove sensitive data in data silos and processing workflows for compliance.

Motivation

Compliance regimes like FedRAMP, PCI, and HIPAA may require that sensitive data is not proliferating into unsanctioned data silos, like project management systems, data warehouses, and logging infrastructure.
Many different development teams may be writing data into these internal services like logging and data warehousing, so it is challenging to enforce data sanitization on data ingress.
CDP tools like Segment and Fivetran can further proliferate sensitive data into a broader set of data silos than its original location.
Data analytics and data science teams may replicate and transform data, leading to further copies and versions across internal systems.
Edge cases, unexpected errors, and stack traces can lead to sensitive data landing or replicating in application logs.

Benefits

Identify and remove sensitive data from places that it shouldn’t be.
Monitor data at rest in data silos instead of at points of ingress/egress that would be hard to monitor or track.
Scan extremely high volumes of unstructured data at scale.
Build workflows to delete data, redact data, or alert the right teams when sensitive data is found where it shouldn’t be.

Build data classification and DLP features directly into your SaaS application.

Motivation

Data classification and DLP capabilities are increasingly expected by regulated institutions such as big banks.
Building data classification and DLP from scratch is complex and has high opportunity costs in moving developers away from working on the core product offering. Building a half-baked solution erodes customer trust, especially when there is already a high degree of skepticism around the quality of traditional DLP solutions.
SaaS and security vendors can deliver additional customer value and drive additional revenue through premium enterprise feature tiers that include security features like DLP, SAML SSO, audit logging, and more.

Benefits

Reduce time-to-market by leveraging out of the box components.
Reduce the overhead of an in-house data classification service that requires text extraction services, detector research and tuning, machine learning model development and deployment, maintenance & support.
Deliver best in class accuracy, reducing the risk of alert fatigue or missing sensitive data that erodes customer trust.

Centralize detection logic, custom detectors & regexes all in one place instead of embedded directly in code, and reduce the number of regexes required.

Motivation

Detecting a single type of sensitive data well (e.g. a credit card number) can be complex - requiring research and maintenance as the detector evolves over time. This becomes especially challenging for esoteric detectors, for example those that are region or industry-specific.
Managing regexes and input validation is complex and evolving. For example, a regex embedded in code to validate a Google Docs link may need to be updated over time as the format for Google Docs links changes, false positives are identified and accounted for, any performance implications are observed.
Many data types cannot be detected accurately with a regex because they require a certain level of validation, are heavily context dependent, or are highly variable or entropic in nature leading to a regex being overly sensitive or overly specific.

Benefits

Leverage out of the box detectors so no engineering time is spent on research, training, tuning detectors. No need to reinvent the wheel. These detectors span the categories of PII, PCI, PHI, credentials & secrets, ID numbers, and more.
Reduce time spent finding, tuning, and sharing regular expressions.
Build upon out of the box detectors with custom logic, instead of having to start from scratch with a regex or custom validation logic.

Improve accuracy of existing content inspection systems.

Motivations

Existing content inspection systems may yield a high degree of false positives (i.e. noise), leading to alert fatigue and significant time wasted on inaccurate alerts.
On the contrary, existing solutions may also be very limited in detection scope, leading to a high degree of false negatives (i.e. misses), putting the business at risk when sensitive data is missed.

Benefits

Replace existing, brittle solutions with a highly accurate content inspection system.
Reduce engineering time spent analyzing false positives and attempting to tune them out.

Sanitize inputs to labeled data used to train machine learning models.

Motivation

In training complex learning models, data scientists must compile and use large corpuses of data to improve the accuracy of the trained model. Unknowingly leveraging sensitive data in this effort can lead to violations of compliance regimes like HIPAA, GDPR, or PCI.
Models that focus on health, finance, public sector applications are particularly at risk for ingesting sensitive data that may violate industry specific compliance mandates.
Labeled data is often ingested from unregulated sources like customer communications, emails, public repos, and more. Inspecting all of these input sources manually is untenable.
Additionally, the data being leveraged may be in a variety of unstructured formats like screenshots, images, documents, plaintext, compressed folders or archives – to inspect this content requires high quality text extraction.

Benefits

Ensure the hygiene of the labeled data you are using to train your machine learning models
Diminish collection of sensitive data types that could result in regulatory fines or brand damage, if leaked or breached.

Example use cases by team and industry.

Healthcare: Detect PHI to ensure HIPAA compliance in your apps
Financial services: Secure PII and PCI like bank account numbers, payment card details, and social security numbers
E-commerce: Prevent costly data breaches of PII and PCI that can damage brand reputation
Education: Protect student and faculty privacy within applications
Customer support: Redact sensitive data in customer support system, shielding agents from information they shouldn’t see
IT Operations: Search for API keys, credentials, and secrets across internal and external data silos
Product: Create custom solutions for data classification, DLP, content moderation and more within your applications
Compliance: Address PCI-DSS, HIPAA, FedRAMP, GDPR, CCPA, GLBA, FERPA, PHIPA, and more
People & Community: Content moderation to detect profanity, toxicity
Gaming: Detecting profanity, toxicity, or even personal or financial information being shared in community chat rooms

Authentication and Security

The Nightfall API uses API keys to authenticate requests. You can create and view your API keys in the Nightfall app on the Manage API Keys page.

Your API keys carry many privileges, so be sure to keep them secure. Do not share your secret API keys in publicly accessible areas such as GitHub, client-side code, or anywhere else that would compromise their secrecy. If you believe one of your API Keys has been compromised, you should delete it through the Dashboard.

All API requests must be made over HTTPS.

Calls made over plain HTTP will fail.

API requests without authentication will fail.

Pricing

The Nightfall Developer Platform offers two different types of subscription plans: Free, and Enterprise. Pricing is based on the uncompressed data volume scanned by Nightfall

Free Plan When you sign up for the Nightfall Developer Platform, you are automatically enrolled in the Free plan, which comes with a set volume limit of 3 GB of data scanned per month.

Enterprise Plans If you are consistently scanning significant data volumes each month, you may want to reach out to sales@nightfall.ai to discuss our Enterprise plans which offer custom pricing and rate limits for DLP APIs and SaaS APIs.

For additional information, please contact our team to discuss your use cases.

Key Concepts

Entities and Terms to Know

This section describes the terms you will need to know when using the API.

Detectors

Detectors provide the logic to find potentially sensitive pieces of data.

When this logic detects such data, the Detector is considered "triggered."

Nightfall's has numerous pre-built Detectors that are trained via machine learning. Detectors may also be defined with regular expressions or dictionaries. Their accuracy may be further refined with exclusion rules and context rules. Whether a Detector is triggered may be controlled by a minimum confidence threshold per Detector and minimum number of findings per Detector as set on a Detection Rule.

The built-in set of Detectors cover a number of different categories of data, including:

Standard PII (e.g. social security number, driver's license number, ID card image)
PCI (Credit Card Number, credit card image)
Healthcare (e.g. PHI, US Medicare Beneficiary Number)
Finance - Banking (e.g. SWIFT code, IBAN code, US bank routing number)
Network (e.g. an IP Address)

The full set is enumerated in the Detector Glossary.

Custom Detectors

Nightfall also supports RE2 regexes and word lists for any custom detectors that you may want to implement.

Over time, we've aggregated the following regex library, which you're welcome to select from to save you some time. Please note that a regular expression is an established yet limited method that searches for pre-defined patterns, so your mileage may vary.

You can test regular expressions here.

You can input custom detectors in two ways: directly in the Nightfall Dashboard by navigating to Detectors → New Detector → Regular expression, or define them inline .

Exclusion Rules

An exclusion rule is a regular expression or word list that will be used once a Detector is triggered by its primary expression or word list to eliminate false positives.

For instance, you may have a Detector designed to detect phone numbers. However, you may have a particular set of phone numbers that you use for testing purposes that are known not to be valid (e.g. they start with the prefix 555) and this should be ignored. Adding an exclusion rule would allow you to prevent those matches from being returned by the API.

See: Using Exclusion Rules

Context Rules

Context Rules are additional matching expressions for a Detector that may be used to adjust the confidence score of a match.

You may provide a regular expression and the number of leading or trailing characters within which a match of that expression must occur in order to adjust the confidence level to a particular level.

For instance, if you found a sequence that appeared to be a social security number based on its length or formatting, you might boost the confidence score if it was preceded by the text like “SSN” or “Social Security Number.”

Returning Surrounding Context

You may request that a sequence of bytes of a given length be provided from before and after the text that triggers a Detection Rule.

This information can help you better understand whether or not something is an actual violation by observing the circumstances within which the detected text was found.

You are limited to a maximum of 40 bytes of this context text preceding and trailing the match for a total of 80 bytes overall.

See: Using Context

Detection Rules

Detection Rules are aggregations of Detectors that are assigned a minimum confidence level. The identifiers of Detection Rules are used as a parameter to the API.

You may create Detection Rules as described in the section Creating Detection Rules and use their identifier as part of API calls to scan content.

Alternatively you may specify Detection Rules programmatically in each API call, as described in the scan method documentation below.

A Detection Rule is composed of a list of Detectors with which you wish to scan each request payload, where any or all Detectors may be satisfied in order to trigger the rule. You can add up to 50 total Detectors with a limit of 30 regular expression type custom detectors.

Additionally, each Detector in the Detection Rule is assigned a “minimum confidence” level (see below and a minimum number of findings to determine if the Detection Rule should be considered triggered.

Confidence Levels

Detection results will be returned with one of the following confidence values.

In practice, the API will only return detections assigned a POSSIBLE or higher confidence level.

VERY_LIKELY (recommended)
LIKELY
POSSIBLE
UNLIKELY
VERY_UNLIKELY

Learn more about what different confidence levels mean and how to choose the right minimum confidence level for your detection rule here.

Policies

Policies allow you to create templates for the most common workflows by unifying a set of Detection Rules with the actions to be taken when those rules are triggered, including:

automated actions such as redaction of findings
alerting through webhooks

Once defined, a Policy may be used in requests to the Nightfall API, such as calls to scan file uploads, though automated redactions are not available for uploaded files at this time.

Setting Up Nightfall

Before you use the scan endpoint, there are a number of actions to do within the Nightfall dashboard to get your environment set up properly.

See to see how to create the necessary Authentication token for making API calls.
See for how to define your own custom logic for detecting sensitive data
See for how to aggregate Detectors for use in the scan endpoint
See for how to set up common workflows that combine your Detection Rules with remediation actions such as alerting.

Creating API Key

The API expects an API Key to be passed via the Authorization: Bearer <key> HTTP header.

To create and manage API keys:

Log in to Nightfall.
Click Overview under the Firewall for AI section.

Click Create key.

The Generate API Key window is displayed.

Enter a name for the API key and click Create.

The API key is generated and displayed (blurred in the following image). Click the copy button to copy the API key and store it in a. secure location. Once you click the Got it button, you cannot retrieve the API key again.

🚧Be Sure to Record the API Key's Value
For security reasons, after closing the window, you will not be able to recover the key's value.

Once you close the window, the My API Keys page will display your newly generated key, with the majority of the Key redacted.

You can return to the Overview page at any time to create new keys (assuming your license allows you to generate additional keys) or delete old keys.

Creating Detectors

You can customize your Detection Rules by creating custom detectors in the .

To create a Detector, select "Detectors" from the left-hand navigation and click the + New Detector button

Custom detectors can add context and exclusion rules on top of pre-built Nightfall detectors, or can be built off your own custom regular expressions.

Be aware that you may not have two detectors based on the same Nightfall data type within the same detection rule.

A full glossary of Nightfall's prebuilt detectors can be found in the Detector Glossary

Updated 2 months ago

Creating Detection Rules

You can define Detection Rules “inline” in the body of each request to the scan endpoint. See the example in the walk through of the scan endpoint Creating an Inline Detection Rule.

You can also use the > to predefine your Detection Rules. Once you have created a Detection Rule, you will receive a UUID, which you can pass in as part of your API request payloads.

You may add up to 50 detectors to your detection rule.

To create a Detection Rule in the Nightfall UI, Select "Detection Rules" from the left hand navigation.

Click the + New Detection Rule button in the upper right hand corner.

First, enter a name for your Detection Rule as well as an optional description.

Then click the + Detectors button to add Detectors to your Detection Rule.

In this example we have selected the US drivers license and Canada Government ID detectors.

Click the Add button in the lower right hand corner at the end of the detector list when you are done adding detectors.

Now that your Detectors are set, choose a minimum confidence level and a minimum # of findings for each detector.

If these minimums for a Detector are not met, the Detection Rule will not be triggered.

Save your Detection Rule in the lower left hand corner once you are done.

Once the Detection Rule is saved, it is available for use in requests to the Nightfall API to scan your data for sensitive information. Pass in the UUID of the Detection Rule as the detectionRuleUUIDs field of your requests to the the scan endpoints.

The UUID may be obtained by clicking the "copy" icon, the left most icon in the set of icons that appear next to the Detection Rules name when your cursor highlights a Detection Rule in the list of Detection Rules.

See Using Pre-Configured Detection Rules for an example of using a Detection Rule UUID.

Creating Policies

This document applies only to the Nightfall Firewall for AI customers. If you are a Nightfall SaaS application customer, refer to .

Policies allow customers to create templates for their most common workflows by unifying a set of Detection Rules with the actions to be taken when those rules are triggered, including:

automated actions such as redaction of findings
alerting through webhooks

Once defined, a Policy may be used in requests to the Nightfall API, such as calls to scan file uploads, though automated redactions are not available for uploaded files at this time.

To create a policy:

Log in to Nightfall.
Click Overview under the Firewall for AI section.

Click Create Policy.

The policy creation page is displayed as follows.

If you click the Policies button under the Setting Up section, you need to execute a couple of additional steps to reach the policy creation page, as displayed in the following image.

Enter a name for the policy.
(Optional) Enter a Description for the policy.
Click + Detection rule to add a Detector rule to the policy.
Select the check box of the Detector rules that you wish you add to the Policy.

Select the Redact Violations check box to mask sensitive information found in your transmitted data.
Select one of the alerting method available.
- Click + HTTP Alerts to configure a website as alert notification channel.
- Click + Email to notify recipients through an Email.
- Click + Slack to select a Slack channel to which the alerts must be sent.
Click Save Policy.

Configuring Webhook Alerts

When you click + Application Webhook, the following window is displayed.

If you have custom headers you would like to add to requests sent to the Webhook URL, you can do this from the overlay that appears when you click the "+ Webhook" button on the policy creation and edit page. These headers may be used for the purpose of authentication as well as integrating with Security Incidents and Event Management (SIEMs) or similar tools that aggregate content through HTTP event collection.

Click the "Add Header" button to add your custom headers.

Once your header key and value is entered you may obfuscate it by clicking on the "lock" icon next to the value field for the header. Click the "Save" button to persist your changes to the headers.

When you have completed configuring your Webhook URL and Headers, click the "Save" button.

🚧Limits On Webhook Headers
It is currently not possible to configure headers for webhooks programmatically when defining policies through the API.

After you click the "Save Policy" button, your policy should be immediately available for use. You can refer to the API Docs for the comprehensive list of endpoints that support policy UUIDs.

Alerting

Nightfall has the ability to send alerts when a violation is detected.

Policies for alerting may be configured through the Nightfall app user interface or they may be set up . Policies that are configured under Developer Platform > Overview > Policies may be used in the API by referencing their Policy UUID.

The way that an alert notification presents itself depends on the platform in question.

For example, notifications sent to Slack will appear as formatted messages sent by the Nightfall Alerts Bot. Other destinations such as email, SIEM url, and webhooks, will present the information as JSON objects.

In the case of webhooks, detailed information about the finding will be sent. For other destinations, sensitive information is redacted.

Supported Alert Platforms

Slack

In order to use asynchronous notifications with Slack, you must install the Nightfall Alerts plugin from the Slack Marketplace.

See our end user documentation on installing for more details.

Once you have authenticated Nightfall to your Slack workspace, you can provide any public channel name (e.g. #general) as part of a request to the Nightfall API.

To send notifications to a private channel, a member of the channel should invite the Nightfall bot to the specific private channel and allow channel access to the bot.

Follow the steps below to invite Nightfall Alerts bot to a private channel:

Go to the Slack channel in question
Type /invite @Nightfall Alerts as a message
Press 'Enter' (you should see a message that Nightfall Alerts has now joined the channel)

If any findings are detected as part of that request, then the Nightfall Alerts bot will send a message to the channel you configured. Conversely, if there are no findings in the request payload, then Nightfall will not send an alert message.

Teams

Documentation TBD

Email

Email is unauthenticated, so you can get started using Nightfall to send email alerts without any initial setup work.

Nightfall will send an email to the provided address only if findings were detected as part of the request. The findings themselves will be attached in a JSON file.

SIEM

You may send your alerts to a designated url, such as an endpoint hosted by SIEM software for log collection.

In addition to the url, you may provide headers, either for security or logging purposes.

Webhook

You may use a webhook server to programmatically handle a finding, allowing you to create your own custom workflows with your own or 3rd party systems.

Nightfall will always send an alert to the client's webhook server if it is provided as part of an API request, even if the scan request yielded no findings.

Alert Schemas

The request body sent by Nightfall is JSON, and uses the schemas in the section documented below.

File Scans

Since file scans can produce a large number of results, findings are not transmitted directly in the notification that Nightfall sends. The notification object looks like the following:

The requestMetadata field contains arbitrary contents provided by the client at request time, and can be used by the client to correlate this response to the original request.

The value of the findingsURL field is a pre-signed URL, which means anyone with the link can download the file. Therefore, this URL itself should be treated as sensitive and must not be leaked. The object stored at this URL is a JSON file containing a single key findings containing a list of all data detected from the request. The schema for the finding object inside the list is shared between the text-based and file-based API endpoints.

Text Scans

Scanning Text

The scan endpoint allows you to apply Policies and Detection Rules to a list of text strings provided as a payload.

curl --request POST \
     --url https://api.nightfall.ai/v3/scan \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer  NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
     --header 'Content-Type: application/json' \
     --data '
{
     "policy": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "minNumFindings": 1,
                              "minConfidence": "LIKELY",
                              "displayName": "US Social Security Number",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER"
                         }
                    ],
                    "name": "My Match Rule",
                    "logicalOp": "ANY"
               }
          ]
     },
     "payload": [
          "The customer social security number is 458-02-6124",
          "No PII in this string"
     ]
}

You may use Pre-Configured Detection Rules or Create Inline Detection Rules

Text scanning supports the use of Exclusion Rules, Context Rules, and Redaction as well as other Scanning Features.

For scanning files, see Scanning Files.

Note that you must generate an API key to send requests to the Nightfall API.

Scanning Files

Nightfall’s file scan API allows a user to upload a file in chunks, then to scan it with Detection Rules once the upload is complete.

The scan will then be processed asynchronously before sending the results to the webhook URL that is provided along with your Detection Rules.

The following sequence diagram illustrates the full process for scanning a binary file with Nightfall.

For a detailed walkthrough of the API calls necessary to upload and scan a file and full script that shows the entire process, see Uploading and Scanning Files .

Prerequisites

In order to utilize the File Scanning API you need the following:

An active API Key authorized for file scanning passed via the header Authorization: Bearer <key> — see Authentication and Security
A Nightfall Detection Policy associated with a webhook URL
A web server configured to listen for file scanning results (detailed information to follow)

File scanning also support Nightfall's functionality for Using Exclusion Rules and Using Context Rules as part of your scan requests.

Supported File Types

The file scan API has first-class support for text extraction and scanning on all MIME types enumerated below.

Certain file types receive special handling, such as tabular data and archives of Git repositories, that results in more precise information about the location of findings within the source file..

Handling of MIME Types Not Listed

Files with a MIME type not listed below are processed using an unoptimized text extractor. As a result, the quality of the text extraction for unrecognized types may vary.

Accepted Text and Derivatives

application/json
application/x-ndjson
application/x-php
text/calendar
text/css
text/csv (treated as tabular data and may be redacted )
text/html
text/javascript
text/plain
text/tab-separated-values (treated as tabular data)
text/tsv (treated as tabular data)
text/x-php

Accepted Office Formats

application/pdf
application/vnd.openxmlformats-officedocument.presentationml.presentation
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet (treated as tabular data)
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/vnd.ms-excel (treated as tabular data)

Accepted Archive and Compressed File Types

application/bzip2
application/ear
application/gzip
application/jar
application/java-archive
application/tar+gzip
application/vnd.android.package-archive
application/war
application/x-bzip2
application/x-gzip
application/x-rar-compressed
application/x-tar
application/x-webarchive
application/x-zip-compressed
application/x-zip
application/zip

Accepted Image File Types

image/apng
image/avif
image/gif
image/jpeg
image/jpg
image/png
image/svg+xml
image/tiff
image/webp

Rejected MIME Types

The file scan API explicitly rejects requests with MIME types that are not conducive to extracting or scanning text. Sample rejected MIME types include:

application/photoshop
audio/midi
audio/wav
video/mp4
video/quicktime

Spreadsheets and Tabular Data

File scans of Microsoft Office, Apache parquet, csv, and tab separated files will provide additional properties to locate findings within the document beyond the standard byteRange, codepointRange, and lineRange properties.

Findings will contain a columnRange and a rowRange that will allow you to identify the specific row and column within the tabular data wherein the finding is present.

This functionality is applicable to the following mime types:

text/csv
text/tab-separated-values
text/tsv
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.ms-excel

Apache parquet data files are also accepted.

Below is a sample match of a spreadsheet containing dummy PII where a SSN was detected in the 2nd column and 55th row.

{
   "findings":[
      {
         "path":"Sheet1 (5)",
         "detector":{
            "id":"e30d9a87-f6c7-46b9-a8f4-16547901e069",
            "name":"US social security number (SSN)",
            "version":1
         },
         "finding":"624-84-9182",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":2505,
               "end":2516
            },
            "codepointRange":{
               "start":2452,
               "end":2463
            },
            "lineRange":{
               "start":55,
               "end":55
            },
            "rowRange":{
               "start":55,
               "end":55
            },
            "columnRange":{
               "start":2,
               "end":2
            },
            "commitHash":""
         },
         "matchedDetectionRuleUUIDs":[
            "950833c9-8608-4c66-8a3a-0734eac11157"
         ],
         "matchedDetectionRules":[
            
         ]
      },
...

Redacting CSV Files

Findings within csv files may be redacted.

To enable redaction in files, set the enableFileRedaction flag of your policy to "true"

The csv file will be redacted based on the configuration of the defaultRedactionConfig of the policy

Below is an example curl request for a csv file that has already been uploaded .

curl --request POST \
     --url https://api.nightfall.ai/v3/upload/02a0c5e1-c950-4e28-a988-f6fffefc4205/scan \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer NF-<Your API Key>' \
     --header 'Content-Type: application/json' \
     --data '
{
     "policy": {
          "detectionRuleUUIDs": [
               "950833c9-8608-4c66-8a3a-0734eac11157"
          ],
          "alertConfig": {
               "email": {
                    "address": "<your email addres>"
               }
          },
          "defaultRedactionConfig": {
               "maskConfig": {
                    "charsToIgnore": [
                         "-",
                         "@"
                    ],
                    "maskingChar": "*"
               }
          },
          "enableFileRedaction": true

     },
     "requestMetadata": "csv redaction test"
}
'

When results are sent to the location specified in the alertConfig (in this case an email address) a redactedFile property will be set with a fileURL in addition the findingsURL

{
   "errors":null,
   "findingsPresent":true,
   "findingsURL":"https://files.nightfall.ai/asdfc5e1-c950-4e28-a988-f6fffefc4205.json?Expires=1655324479&Signature=zjo1nT-PECHC-fiTvAgdA8aDnceoY~6iGfzOBCcBjscKqOHnIar8hoH4gGufffiulBw5BpfJuvWwBW~lXO~ZNhN139LDwoTsfLJswJiQCB2Hj-Az0Em6go~1j8WBqCS8G0Gk17M-zcPedHGX3z~1pw8nm5sh6Pa-jJwfw9NIEiqmBb3Vdcj3J-~Wzag~ENV4499rnG299ee-ig5Ms1oVlzycb4YxzgTMrTL5Q07ozNenwFZcGDNQre1inLXmV-m8teLX-K3boklenp9KXiNDDV0wi74ADN-QfIR1q1oU7mEI1f3aVC3kju0QRErp2lsfs08EtZKLE3C4N17jDJdYcw__&Key-Pair-Id=K24YOPZ1EKX0YC",
   "redactedFile":{
      "fileURL":"https://files.nightfall.ai/asdfc5e1-c950-4e28-a988-f6fffefc4205-redacted.csv?Expires=1655324479&Signature=Hx8kRh88maLeStysy3fsLbFVG9VELEtfemtQe2lWUnFjAMd9HqlEksTmirqAWFWV4zPVUB73izlMj5cSer8v2N5ZCcnD3dz~nnwR4P5LewGJ2CQzGnDnXgh70HW5qp04gnUD-pYWp~bGPVspkJKCkl1zH-EoGonvcNVq3SNsVzOlsVIjep7Y7otQKEEyAZ7JmHiVfuBxrvn8pleuC5lEJ3f9miPyoRqH9DyPlNTJTIuijqe9q32Qcui2RsDR6IT-foFX52dy6rRa01ZV0gZMDWJokMlCr8Iu5An~qnhxC49bqTtI82oz9FcBaP-Yea8cq1TiAfGxX7CJ0~JeTLvr6g__&Key-Pair-Id=K24YOPZ1EKX0YC",
      "validUntil":"2022-06-15T20:21:19.750990823Z"
   },
   "requestMetadata":"csv redaction test",
   "uploadID":"02a0c5e1-c950-4e28-a988-f6fffefc4205",
   "validUntil":"2022-06-15T20:21:19.723045787Z"
}

This redacted file will be a modified version of the original csv file.

Below is an example of a redacted csv file.

name,email,phone,alphanumeric
Ulric Burton,*****@*************,*-***-***-****,TEL82EBM1GQ
Wade Jones,******************@***********,(********-****,VVF64PJV2EF
Molly Mccullough,*****************@**********,(********-****,OHO41SFZ2BR
Raja Riggs,************@**********,(********-****,UVD51JTE5NZ
Colin Carter,**********************@*********,(********-****,LNI34LLC5WV// Some code

Git Repositories

Nightfall provides special handling for archives of Git repositories.

Nightfall will scan the repository history to discover findings in particular checkin, returning the hash for the checkin.

In order to scan the repository, you will need to create a clone, i.e.

git clone https://github.com/nightfallai/nightfall-go-sdk.git

This creates a clone of the Nightfall go SDK.

You will then need to create an archive that can be uploaded using Nightfall's file scanning sequence.

zip -r directory.zip directory

Note that in order to work, the hidden directory .github must be included in the archive.

When you initiate the file upload sequence with this file, you will receive scan results that contain the commitHash property filled in.

Using the Nightfall go SDK archive created above, a simple example would be to scan for URLs (i.e. strings starting with http:// or https://), which will send results such as the following:

{
   "findings":[
      {
         "path":"f607a067..53e59684/nightfall.go",
         "detector":{
            "id":"6123060e-2d9f-4f35-a7a1-743379ea5616",
            "name":"URL"
         },
         "finding":"https://api.nightfall.ai/\"",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":142,
               "end":168
            },
            "codepointRange":{
               "start":142,
               "end":168
            },
            "lineRange":{
               "start":16,
               "end":16
            },
            "rowRange":{
               "start":0,
               "end":0
            },
            "columnRange":{
               "start":0,
               "end":0
            },
            "commitHash":"53e59684d9778ceb0f0ed6a4b949c464c24d35ce"
         },
         "beforeContext":"tp\"\n\t\"os\"\n\t\"time\"\n)\n\nconst (\n\tAPIURL = \"",
         "afterContext":"\n\n\tDefaultFileUploadConcurrency = 1\n\tDef",
         "matchedDetectionRuleUUIDs":[
            "cda0367f-aa75-4d6a-904f-0311209b3383"
         ],
         "matchedDetectionRules":[
            
         ]
      },
 ...

Support for Large Repositories

Currently, processing is limited to repositories with a total number of commits lower than 5000.

Large repositories result in a large volume of data sent at once. We are working on changes to allow these and other large surges of data to be processed in a more controlled manner, and will increase the limit or remove it altogether once those changes are complete.

Sensitive Data in GitHub Repositories

If the finding in a GitHub repository is considered to be sensitive, it should be considered compromised and appropriate mitigation steps (i.e. secrets should be rotated).

To retrieve the specific checkout, you will need to clone the repository, i.e.

git clone https://github.com/nightfallai/nightfall-go-sdk.git

You can then checkout the specific commit using the commit hash returned by Nightfall.

cd nightfall-go-sdk
git checkout 53e59684d9778ceb0f0ed6a4b949c464c24d35ce

Note that you are in a 'detached HEAD' state when workin with this sort of check out of a repository.

File Scanning and Webhooks

As part of submitting a file scan request, the request payload must contain a reference to a webhook server URL defined as part of a policy defined inline.

When Nightfall prepares a file scan operation, it will issue a challenge to the webhook server to verify its legitimacy.

After the file scan has been processed asynchronously, the results will be delivered to the webhook.

Webhook Payload and Findings for File Scans

For a file scan, your webhook will receive a request body that will be a JSON payload containing:

the upload UUID (uploadID)
a boolean indicating whether or not any data in the file matched the provided detection rules (findingsPresent)
a pre-signed S3 URL where the caller may fetch the findings for the scan (findingsURL). if there are no findings in the file, this field will be empty.
the date until which the findingsURL is valid (validUntil) formatted to RFC 3339. Results are valid for 24 hours after scan completion. The time will be in UTC.
the value you supplied for requestMetadata. Callers may opt to use this to help identify their input file upon receiving a webhook response. Maximum length 10 KB.

Below is an example of a payload sent to the webhook URL.

{
    "findingsURL": "https://files.nightfall.ai/asdfasdf-asdf-asdf-asdf-asdfasdfasdf.json?Expires=1635135397&Signature=asdfasdfQ2qTmPFnS9uD5I3QGEqHY2KlsYv4S-WOeEEROj~~x6W2slP2GvPPgPlYs~lwdr-mtJjVFu4LtyDhdfYezC7B0ysfJytyMIyAFriVMqOGsRJXqoQfsg8Ckd2b6kRcyDZXJE25cW8zBS08lyVwMBCsGS0BKSin8uSuD7pQu3QAubT7p~MPkfc6PSXYIJREBr3q4-8c7UnrYOAiXfSW1AmFE47rr3Wxh2TpU3E-Fxu-6e3DKN4q6meACdgZb2KHZo3e-NK7ug9f8sxBp1YT0n5oiVuW4KXguIyXWN~aKEHMa6DzZ4cUJ61LmnMzGndc2sVKhii39FHwTsYog__&Key-Pair-Id=asdfOPZ1EKX0YC",
    "validUntil": "2021-10-25T04:16:37.734633129Z",
    "uploadID": "152848af-2ac9-4e0a-8563-2b82343d964a",
    "findingsPresent": true,
    "requestMetadata": "",
    "errors": []
}

If you follow the URL (before it expires) it will return a JSON representation of the findings similar to those returned by the Scan Plain Text endpoint.

In this example, we have uploaded a zip file with a python script (upload.py) and a README.md file. A Detector in our DetectionRule checks for the presence of the string http://localhost

{
   "findings":[
      {
         "path":"fileupload/upload.py",
         "detector":{
            "id":"58861dee-b213-4dbc-97fa-a148acb8bd1a",
            "name":"localhost url"
         },
         "finding":"http://localhost",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":105,
               "end":121
            },
            "codepointRange":{
               "start":105,
               "end":121
            },
            "lineRange":{
               "start":7,
               "end":7
            }
         },
         "beforeContext":"PLOAD_URL = getenv(\"FILE_UPLOAD_HOST\", \"",
         "afterContext":":8080/v3\")\nNF_API_KEY = getenv(\"NF_API_K",
         "matchedDetectionRuleUUIDs":[
            "950833c9-8608-4c66-8a3a-0734eac11157"
         ],
         "matchedDetectionRules":[
            
         ]
      },
      {
         "path":"fileupload/README.md",
         "detector":{
            "id":"58861dee-b213-4dbc-97fa-a148acb8bd1a",
            "name":"localhost url"
         },
         "finding":"http://localhost",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":570,
               "end":586
            },
            "codepointRange":{
               "start":570,
               "end":586
            },
            "lineRange":{
               "start":22,
               "end":22
            }
         },
         "beforeContext":"t the script will send the requests to `",
         "afterContext":":8080`, but this can be overridden using",
         "matchedDetectionRuleUUIDs":[
            "950833c9-8608-4c66-8a3a-0734eac11157"
         ],
         "matchedDetectionRules":[
            
         ]
      },
      {
         "path":"fileupload/README.md",
         "detector":{
            "id":"58861dee-b213-4dbc-97fa-a148acb8bd1a",
            "name":"localhost url"
         },
         "finding":"http://localhost",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":965,
               "end":981
            },
            "codepointRange":{
               "start":965,
               "end":981
            },
            "lineRange":{
               "start":26,
               "end":26
            }
         },
         "beforeContext":"ice deployment you want to connect to | ",
         "afterContext":":8080 |\n| `NF_API_KEY`      | the API Ke",
         "matchedDetectionRuleUUIDs":[
            "950833c9-8608-4c66-8a3a-0734eac11157"
         ],
         "matchedDetectionRules":[
            
         ]
      }
   ]
}

Uploading and Scanning API Calls

Nightfall's upload process is built to accommodate files of any size. Once files are uploaded, they may be scanned with Detection Rules and Policies to detect potential violations.

Many users will find it more convenient to use our our native language SDKs to complete the upload process.

Uploading files using Client SDK libraries requires fewer steps as all the required API operations are wrapped in a single function call. Furthermore these SDKs handle all the programmatic logic necessary to send files in smaller chunks to Nightfall.

For users that are looking to understand the entire upload process end-to-end, that is also outlined in this document. We will walk you through the order of operations necessary to upload the file.

Using Nightfall's SDKs to Upload Files

Rather than implementing the full sequence of API calls for the upload functionality yourself, the Nightfall’s native language SDKs provide a single method that wraps the steps required to upload your file.

Below is an example of uploading a file from our Python SDK and our Node SDK.

>>> from nightfall import Confidence, DetectionRule, Detector, Nightfall, EmailAlert, AlertConfig
>>> import os

>>> # use your API Key here
>>> nightfall = Nightfall("NF-y0uRaPiK3yG03sH3r3")

>>> # A rule contains a set of detectors to scan with
>>> cc = Detector(min_confidence=Confidence.LIKELY, nightfall_detector="CREDIT_CARD_NUMBER")
>>> ssn = Detector(min_confidence=Confidence.POSSIBLE, nightfall_detector="US_SOCIAL_SECURITY_NUMBER")
>>> detection_rule = DetectionRule([cc, ssn])
>>> # The scanning is done asynchronously, so provide a valid email address as the simplest way of getting results
>>> alertconfig = alert_config=AlertConfig(email=EmailAlert("whatever@example.com"))
    

>>> # Upload the file and start the scan.
>>> id, message = nightfall.scan_file( "./README.md", detection_rules=[detection_rule], alert_config=alertconfig)
>>> print("started scan", id, message)

//this script assumes the node sdk has been installed locally with `npm install` and `npm run build`
import { Nightfall } from "./nightfall-nodejs-sdk/dist/nightfall.js";
import { Detector } from "./nightfall-nodejs-sdk/dist/types/detectors.js";


// By default, the client reads your API key from the environment variable NIGHTFALL_API_KEY
const uploadit = async() => {
    var data = null;
    
    const nfClient = new Nightfall();
    	
    try{
   
		const response = await nfClient.scanFile('./README.md', {
		  detectionRules: [
			{
			  name: 'Secrets Scanner',
			  logicalOp: 'ANY',
			  detectors: [
				{
				  minNumFindings: 1,
				  minConfidence: Detector.Confidence.Possible,
				  displayName: 'Credit Card Number',
				  detectorType: Detector.Type.Nightfall,
				  nightfallDetector: 'CREDIT_CARD_NUMBER',
				},
			  ],
			},
		  ],
		  alertConfig: {
				email: {
						address: "whatever@example.com"
					}
		   }
		});

		if (response.isError) {
		  data = response.getError();
		}
		else{ 
			data = (response.data.id);
		}
	 
    }
	catch(e){
		console.log(e);
	}


	return data;

}

uploadit().then(data => console.log(data));

To run the node sample script you must compile it as TypesScript. Save it as a .ts file and run

tsc <yourfilename>.ts -lib ES2015,DOM

You can then run the resulting JavaScript file:

NIGHTFALL_API_KEY=<YourApiKey> node yourscriptname.js

Note that these examples use an email address to receive the results for simplicity.

You may also want to use a webhook. See Webhooks and Asynchronous Notifications for additional information on how to set up Webhook server to receive these results.

The Upload Process

The upload process consists of 3 stages:

Initializing
Uploading
Completing

Once the upload is complete, you may initiate the file scan.

After we discuss each API call in the sequence, you will find a script that walks through the full sequence at the end of this guide.

Initializing Phase

POST /v3/upload

The first step in the process of scanning a binary file is to initiate an upload in order to get a fileId through the Initiate a File Upload endpoint.

As part of the initialization you must provide the total byte size of the file being uploaded.

You may also provide the mime-type, otherwise the system will attempt to determine it once the upload is complete.

curl --location --request POST 'https://api.nightfall.ai/v3/upload' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--data-raw '{
    "fileSizeBytes": 73891,
    "mimeType" : "image/png"
}'

The id of the returned JSON object will be used as the fileId in subsequent requests.

The chunkSize is the maximum number of bytes to upload during the uploading phase.

{
    "id": "f9dbdb15-c9fa-46ff-86ec-cd5c09aa550d",
    "fileSizeBytes": 73891,
    "chunkSize": 10485760,
    "mimeType": "image/png"
}

Uploading Phase

PATCH /v3/upload/<uploadUUID>

Use the Upload a Chunk of a File endpoint to upload the file contents in chunks.

The size of these chunks are determined by the chunkSize value returned by POST /upload endpoint used in the previous step.

Below is a simple example where the file is less than the chunkSize so may safely be uploaded with one call to the upload endpoint.

curl --location --request PATCH 'https://api.nightfall.ai/v3/upload/f9dbdb15-c9fa-46ff-86ec-cd5c09aa550d' \
--header 'X-Upload-Offset: 0' \
--header 'Content-Type: application/octet-stream' \
--header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--data-binary '@/Users/myname/Documents/work/Nightfall/Nightfall Upload Sequence.png'

If your file's size exceeds the chunkSize, to upload the complete file you will need to send iterative requests as you read portions of the file's contents. This means you will send multiple requests to the upload endpoint as shown above. As you do so, you will be updating the value of the X-Upload-Offset header based on the portion of the file being sent.

Each request should send a chunk of the file exactly chunkSize bytes long except for the final uploaded chunk. The final uploaded chunk is allowed to contain fewer bytes as the remainder of the file may be less than the chunkSize returned by the initialization step.

The request body should be the contents of the chunk being uploaded.

The value of the X-UPLOAD-OFFSET header should be the byte offset specifying where to insert the data into the file as an integer. This byte offset is zero-indexed.

Successful calls to this endpoint return an empty response with an HTTP status code of 204

See the full example script below for an illustration as to how this upload process can be done programmatically.

Completion Phase

POST /v3/upload/<uploadUUID>/finish

Once all chunks are uploaded, mark the upload as completed using the Complete a File Upload endpoint.

curl --location --request POST 'https://api.nightfall.ai/v3/upload/f9dbdb15-c9fa-46ff-86ec-cd5c09aa550d/finish' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer  NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--data-raw '""'

When an upload completes successfully, the returned payload will indicate the mimeType the system determined to file to be if it was not provided during upload initialization.

{
    "id": "152848af-2ac9-4e0a-8563-2b82343d964a",
    "fileSizeBytes": 2349,
    "chunkSize": 10485760,
    "mimeType": "application/zip"
}

Once a file has been marked as completed, you may initiate a scan of the uploaded file.

Scanning Uploaded Files

After an upload is finalized, it can be scanned against a Detection Policy. A Detection Policy represents a pairing of:

a webhook URL
a set of detection rules to scan data against

The scanning process is asynchronous, with results being delivered to the webhook URL configured on the detection policy. See Webhooks and Asynchronous Notifications for more information about creating a Webhook server.

Exactly one policy should be provided in the request body, which includes a webhookURL to which the callback will be made once the file scan has been completed (this must be an HTTPS URL) as well as a Detection Rule as either an a list of UUIDs or as a rule that has been defined in-line.

You may also supply a value to the requestMetadata field to help identify the input file upon receiving a response to your webhook. This field has a maximum length 10 KB.

curl --request POST \
     --url https://api.nightfall.ai/v3/upload/f9dbdb15-c9fa-46ff-86ec-cd5c09aa550d/scan \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
     --header 'Content-Type: application/json' \
     --data '
{
     "policy": {
          "detectionRuleUUIDs": [
               "950833c9-8608-4c66-8a3a-0734eac11157"
          ],
          "webhookURL": "https://mycompany.org/webhookservice"
     },
     "requestMetadata": "your file metadata"
}
'

Webhook Verification

Nightfall will verify that the webhook URL is valid before launching its asynchronous scan by issuing a challenge.

Full Upload Process Example Script

Below is a sample Python script that handles the complete sequence of API calls to upload a file using a path specified as an argument.

from os import getenv, path

import fire
import requests


BASE_UPLOAD_URL = getenv("FILE_UPLOAD_HOST", "http://api.nightfall.ai/v3")
NF_API_KEY = getenv("NF_API_KEY")


def upload(filepath, mimetype, policy_uuid):
    """Upload the given file using the provided MIMEType and PolicyUUID.

    Arguments:
        file_path -- an absolute or relative path to the file that will be
            uploaded to the API.
        mimetype -- (optional) The mimetype of the file being uploaded.
        policy_uuid -- The UUID corresponding to an existing policy. This
            policy must be active and have a webhook URL associated with it.
    """
    default_headers = {
        "Authorization": F"Bearer {NF_API_KEY}",
    }

    # =*=*=*=*=* Initiate Upload =*=*=*=*=*=*
    file_size = path.getsize(filepath)
    upload_request_body = {"fileSizeBytes": file_size, "mimeType": mimetype}
    r = requests.post(F"{BASE_UPLOAD_URL}/upload",
                      headers=default_headers,
                      json=upload_request_body)
    upload = r.json()
    if not r.ok:
        raise Exception(F"Unexpected error initializing upload - {upload}")

    # =*=*=*=*=*=* Upload Chunks =*=*=*=*=*=*
    chunk_size = upload["chunkSize"]
    i = 0
    with open(filepath, "rb") as file:
        while file.tell() < file_size:
            upload_chunk_headers = {
                **default_headers,
                "X-UPLOAD-OFFSET": str(file.tell())
            }
            r = requests.patch(F"{BASE_UPLOAD_URL}/upload/{upload['id']}",
                               headers=upload_chunk_headers,
                               data=file.read(chunk_size))
            if not r.ok:
                raise Exception(F"Unexpected error uploading chunk - {r.text}")
            i += 1

    # =*=*=*=*=*=* Finish Upload =*=*=*=*=*=*
    r = requests.post(F"{BASE_UPLOAD_URL}/upload/{upload['id']}/finish",
                      headers=default_headers)
    if not r.ok:
        raise Exception(F"Unexpected error finalizing upload - {r.text}")

    # =*=*=*=*=* Scan Uploaded File =*=*=*=*=*
    r = requests.post(F"{BASE_UPLOAD_URL}/upload/{upload['id']}/scan",
                      json={"policyUUID": policy_uuid},
                      headers=default_headers)
    if not r.ok:
        raise Exception(F"Unexpected error initiating scan - {r.text}")

    print("Scan Initiated Successfully - await response on configured webhook")
    quota_remaining = r.headers.get('X-Quota-Remaining')
    if quota_remaining is not None and int(quota_remaining) <= 0:
        print(F"Scan quota exhausted - Quota will reset on {r.headers['X-Quota-Period-End']}")


if __name__ == "__main__":
    fire.Fire(upload)

Special File Types

Spreadsheets and Tabular Data

Findings will contain a columnRange and a rowRange that will allow you to identify the specific row and column within the tabular data wherein the finding is present.

This functionality is applicable to the following mime types:

text/csv
text/tab-separated-values
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
application/vnd.ms-excel

Apache parquet data files are also accepted.

Below is a sample match of a spreadsheet containing dummy PII where a SSN was detected in the 2nd column and 55th row.

{
   "findings":[
      {
         "path":"Sheet1 (5)",
         "detector":{
            "id":"e30d9a87-f6c7-46b9-a8f4-16547901e069",
            "name":"US social security number (SSN)",
            "version":1
         },
         "finding":"624-84-9182",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":2505,
               "end":2516
            },
            "codepointRange":{
               "start":2452,
               "end":2463
            },
            "lineRange":{
               "start":55,
               "end":55
            },
            "rowRange":{
               "start":55,
               "end":55
            },
            "columnRange":{
               "start":2,
               "end":2
            },
            "commitHash":""
         },
         "matchedDetectionRuleUUIDs":[
            "950833c9-8608-4c66-8a3a-0734eac11157"
         ],
         "matchedDetectionRules":[
            
         ]
      },
...

Git Repositories

Nightfall provides special handling for archives of GitHub repositories.

Nightfall will scan the repository history to discover findings in particular checkin, returning the hash for the checkin.

In order to scan the repository, you will need to create a clone, i.e.

git clone https://github.com/nightfallai/nightfall-go-sdk.git

This creates a clone of the Nightfall go SDK.

You will then need to create an archive that can be uploaded using Nightfall's file scanning sequence.

zip -r directory.zip directory

Note that in order to work, the hidden directory .github must be included in the archive.

When you initiate the file upload sequence with this file, you will receive scan results that contain the commitHash property filled in.

Using the Nightfall go SDK archive created above, a simple example would be to scan for URLs (i.e. strings starting with http:// or https://), which will send results such as the following:

{
   "findings":[
      {
         "path":"f607a067..53e59684/nightfall.go",
         "detector":{
            "id":"6123060e-2d9f-4f35-a7a1-743379ea5616",
            "name":"URL"
         },
         "finding":"https://api.nightfall.ai/\"",
         "confidence":"LIKELY",
         "location":{
            "byteRange":{
               "start":142,
               "end":168
            },
            "codepointRange":{
               "start":142,
               "end":168
            },
            "lineRange":{
               "start":16,
               "end":16
            },
            "rowRange":{
               "start":0,
               "end":0
            },
            "columnRange":{
               "start":0,
               "end":0
            },
            "commitHash":"53e59684d9778ceb0f0ed6a4b949c464c24d35ce"
         },
         "beforeContext":"tp\"\n\t\"os\"\n\t\"time\"\n)\n\nconst (\n\tAPIURL = \"",
         "afterContext":"\n\n\tDefaultFileUploadConcurrency = 1\n\tDef",
         "matchedDetectionRuleUUIDs":[
            "cda0367f-aa75-4d6a-904f-0311209b3383"
         ],
         "matchedDetectionRules":[
            
         ]
      },
 ...

Sensitive Data in GitHub Repositories

If the finding in a GitHub repository is considered to be sensitive, it should be considered compromised and appropriate mitigation steps (i.e. secrets should be rotated).

To retrieve the specific checkout, you will need to clone the repository, i.e.

git clone https://github.com/nightfallai/nightfall-go-sdk.git

You can then checkout the specific commit using the commit hash returned by Nightfall.

cd nightfall-go-sdk
git checkout 53e59684d9778ceb0f0ed6a4b949c464c24d35ce

Note that you are in a 'detached HEAD' state when working with this sort of check out of a repository.

Specialized File Detectors

Nightfall supports Detectors that will scan for file names, file types, and file finger prints.

Detecting File Names

In addition to scanning the content of files, you may configure the Detectors to scan file names as well.

This is done through the “scope” attribute of a Detector.

The scope attribute allows you to scan either within file contents, the file name, or both the file contents and file name.

File extensions can be scanned for by creating a Regular Expression type custom Detector with a scope to scan only file names ("File") or both the content and file name ("ContentAndFile"), as shown in the example request below.

Note that confidence sensitivity does not apply to file names. Sensitive findings will always be reported on.

Detecting File Types

Nightfall’s File Type detection allows you to implement compliance policies that detect and alert you when particular file types that are not allowed in a given location are discovered.

This functionality is implemented by creating a specific Detector called a “File Type Detector”

To create a File Type Detector, select “Detectors” from the left hand navigation and click the button labeled “+New Detector” in the upper right hand corner. From there a drop down list of Detector types will be displayed which will include the “File Type” Detector type.

You can either scroll through the list of mime-types in the select box or you may type in a portion of the mime-type and the contents of the select box will be filtered to match your input.

File Type Detectors vary from other Nightfall Detectors in that the attributes of scope and confidence are not relevant to File Type Detectors

Detecting Files Through Fingerprinting

Nightfall allows you to discover the location of specific files that you have deemed sensitive and want to avoid sharing.

This discovery is done through document fingerprinting. Fingerprinting is the process of algorithmically creating a unique identifier for a file by mapping the data of the document to a signature that can be recalled quickly. This allows the file to be identified in a manner akin to how human fingerprints uniquely identify individual people.

This functionality is achieved in Nightfall by creating a specific Detector type called a File Fingerprint Detector.

The Fingerprint Detector allows you to create a fingerprint for one more files (a sort “handful” of fingerprints, if you would).

To create a Fingerprint Detector, select “Detectors” from the left hand navigation and click the button labeled “+New Detector” in the upper right hand corner. From there a drop down list of Detector types will be displayed which will include the “Fingerprint” Detector type.

When you create a File Fingerprint Detector you can upload up to 50 files that need to be fingerprinted. The file size limit is 25MB.

Once the fingerprint is generated, the actual content of the file is discarded so no sensitive content is stored on Nightfall’s system.

These Detectors may only be created through the console.

Updates to Fingerprinted Files

You can not update Fingerprint Detectors, so any modification to the original file or underlying requires that you create a brand new Fingerprint Detector.

Webhooks and Asynchronous Notifications

The Nightfall API supports the ability to send asynchronous notifications when findings are detected as part of a scan request.

The supported destinations for these notifications include external platforms, such as Slack, email, or url to a SIEM log collector as well as to a webhook server.

Nightfall issues notifications under the following scenarios:

to notify a client about the results of a . File scans themselves are always performed asynchronously because of complexity relating to text extraction and data volume.
to notify a client about results from a text scan request. Although results are already delivered synchronously in the response object, clients may configure the request to forward results to other platforms such a webhook, SIEM endpoint, or email through a

To create a webhook you will need to and then set up a

For more information on how webhooks and asynchronous notifications are used please see our guides on:

Accessing Your Webhook Signing Key

In order to accept requests from Nightfall, a Webhook server must use a signing key to verify requests.

To access or generate your Webhook signing key, start by logging in to the Nightfall .

Select the Developer Platform > Manage API Keys using the navigation bar on the left side of the page. You will see the Webhook signing section:

Unlike the API Key, it is possible to reveal the signature via the "eye" icon furtherest to the left of the three icons displayed.

You may copy the current value to your clipboard with the "copy" icon in the center of the three icons displayed.

You may also regenerate the key with the circular arrow icon furthest to the right.

Use this value as shown in the code examples that are used in the following sections.

Creating a Webhook Server

Learn how to set up a server to handle results of file scans and alerts sent based on policy alert configurations.

Webhook Challenges

Nightfall will send a POST request with a JSON payload with a single field challenge containing randomly-generated bytes when it sends a message to a user-provided webhook address. This is to ensure that the caller owns the server.

In order to authenticate your webhook server to Nightfall, you must reply with (1) a 200 HTTP Status Code, and (2) a plaintext request body containing only the value of the challenge key.

If Nightfall receives the expected value back, then the file scan operation will proceed; otherwise it will be aborted.

When a server responds successfully to a challenge request, the validity of that URL will be cached for up to 24 hours, after which it will need to be validated again.

If the webhook cannot be reached, you will receive an error with the code "40012" and the description "Webhook URL validation failed" when you initiate the scan.

If the webhook challenge fails, you will receive an error with the code "42201" and the description "Webhook returned incorrect challenge response" when you initiate the scan.

Webhook Signature Verification

When a customer signs up for the developer platform, Nightfall automatically generates a unique for them.

This secret is used to sign requests to the customer's configured webhook URL.

Signing Secret Security

The signing secret should never be stored in plaintext, as a leak compromises the authenticity of webhook requests.

If you has any concerns that their signing secret may have leaked, you can request rotation at any time by reaching out to Nightfall Customer Success.

For security purposes, the webhook includes a signature header containing an HMAC-SHA256 digital signature that customers may use to authenticate the client.

In order to authenticate requests to the webhook URL, customers may use the following algorithm:

Check for the presence of the headers X-Nightfall-Signature and X-Nightfall-Timestamp. If these headers are not both present, discard the request.
Read the entire request body into a string body.
Verify that the value in the X-Nightfall-Timestamp header (the POSIX time in seconds) occurred recently. This is to protect against replay attacks, so a threshold on the order of magnitude of minutes should be reasonable. If a request occurred too far in the past, it should be discarded.
Concatenate the timestamp and body with a colon delimiter, i.e. timestamp:body.
Compute the HMAC SHA-256 hash of the payload from the previous step, using your unique signing secret as the key. Encode this computed value in hex.
Compare the value of the X-Nightfall-Signature header to the value computed in the previous step. If the values match, authentication is successful, and processing should proceed. Otherwise, the request must be discarded.

The snippet below shows how you might implement this authentication validation in Python:

Example Webhook Server

An example implementation of a simple webhook server is below.

In the above example, the webhook server is running on port 8075. To route ngrok requests to this server, once you run the python script (having installed the necessary dependencies such getenv and Flask), you would run ngrok as follow:

./ngrok http 8075

Scanning Features

Nightfall offers many useful features beyond its detectors, including:

The ability to use and to narrow the scope of matches.

The ability to create in a way that is highly configurable so that sensitive data is appropriately obfuscated.

The ability to create that determine how leaks of sensitive information should be mitigated (i.e. through alerts sent to email or Slack).

Using Pre-Configured Detection Rules

In this example, we'll walk through making a request to the scan endpoint.

The endpoint inspects the data you provide via the request body and reports any detected occurrences of the sensitive data types you are searching for.

Please refer to the API reference of the scan endpoint for more detailed information on the request and response schemas.

In this sample request, we provide two main fields:

a policy and its detection rules that we want to use when scanning the text payload
a list of text strings to scan

In the example below we will use a Detection Rule that has been configured in the by supplying its UUID.

The aggregate length of all strings in payload list must not exceed 500 KB, and the number of items in the payload may not exceed 50,000.

Executing the curl request will yield a response as follows.

The API call returns a list, where the item at each index is a sublist of matches for the provided detector types.

The indices of the response list correspond directly to the indices of the list provided in the request payload.

In this example, the first item in the response list contains a finding because one credit card number was detected in the first string we provided. The second item in the response list is an empty list because there is no sensitive data in the second input string we provided. The third item in the returned list contains multiple findings as a result of multiple Detectors within the Detection Rule being triggered.

Creating an Inline Detection Rule

In addition to using pre-defined Detection Rules, you may define Detection Rules within the body of your scan method by either supplying:

the identifier of one of Nightfall's native detectors
the UUID of an a Detector defined through the UI
a Regular Expression
a Word List.

Built In Detectors

Out of the box, Nightfall comes with an extensive library of native detectors.

In the example below two of Nightfall's native Detectors (detectorType = "NIGHTFALL_DETECTOR") are being used:

US_SOCIAL_SECURITY_NUMBER
CREDIT_CARD_NUMBER.

When defining a Detection Rule in line, you configure the minimum confidence level (minConfidence) and minimum number of times the match must be found (minNumFindings) for the rule to be triggered. .

For more information on the parameters related to redaction, see Using Redaction.

curl --request POST \
     --url https://api.nightfall.ai/v3/scan \
     --header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
     --header 'Content-Type: application/json' \
     --data '{
       "policy": {
            "detectionRules": [
                 {
                      "detectors": [
                           {
                                "detectorType": "NIGHTFALL_DETECTOR",
                                "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER",
                                "minNumFindings": 1,
                                "minConfidence": "LIKELY",
                                "displayName": "US Social Security Number"
                           },
                           {
                                "detectorType": "NIGHTFALL_DETECTOR",
                                "nightfallDetector": "CREDIT_CARD_NUMBER",
                                "minNumFindings": 1,
                                "minConfidence": "LIKELY",
                                "displayName": "Credit Card Number",
                                "redactionConfig": {
                                    "maskConfig": {
                                        "maskingChar": "👀",
                                        "charsToIgnore": ["-"]
                                    }
                                }
                           }
                      ],
                      "name": "My Match Rule",
                      "logicalOp": "ANY"
                 }
            ]
       },
       "payload": [
            "The customer social security number is 458-02-6124",
            "No PII in this string",
            "My credit card number is 5310-2768-6832-9293"
       ]
     }'

Below is the response payload to the previous request.

{
  "findings": [
    [
      {
        "finding": "458-02-6124",
        "detector": {
          "name": "US Social Security Number",
          "uuid": "e30d9a87-f6c7-46b9-a8f4-16547901e069"
        },
        "confidence": "VERY_LIKELY",
        "location": {
          "byteRange": {
            "start": 39,
            "end": 50
          },
          "codepointRange": {
            "start": 39,
            "end": 50
          }
        },
        "matchedDetectionRuleUUIDs": [],
        "matchedDetectionRules": [
          "My Match Rule"
        ]
      }
    ],
    [],
    [
      {
        "finding": "5310-2768-6832-9293",
       "redactedFinding": "👀👀👀👀-👀👀👀👀-👀👀👀👀-👀👀👀👀",
        "detector": {
          "name": "Credit Card Number",
          "uuid": "74c1815e-c0c3-4df5-8b1e-6cf98864a454"
        },
        "confidence": "VERY_LIKELY",
        "location": {
          "byteRange": {
            "start": 25,
            "end": 44
          },
          "codepointRange": {
            "start": 25,
            "end": 44
          }
        },
        "redactedLocation": {
          "byteRange": {
            "start": 25,
            "end": 44
          },
          "codepointRange": {
            "start": 25,
            "end": 44
          }
        },
        "matchedDetectionRuleUUIDs": [],
        "matchedDetectionRules": [
          "My Match Rule"
        ]
      }
    ]
  ],
  "redactedPayload": [
    "",
    "",
    "My credit card number is 👀👀👀👀-👀👀👀👀-👀👀👀👀-👀👀👀👀"
  ]
}

Regular Expression Example

The following example shows a Detection Rule composed of two Detectors defined using regular expressions – one for the format of an International Standard Recording Code (ISRC) and one for the format of an International Standard Musical Work Code (ISWC) – matching either of which will trigger the Detection Rule (by using the logicalOp “Any”).

We will provide a payload of two strings, one of which will match the ISRC and one of which will match the ISWC.

curl --location --request POST 'https://api.nightfall.ai/v3/scan' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--header 'Content-Type: application/json' \
--data-raw '{
     "config": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "regex": {
                                   "isCaseSensitive": false,
                                   "pattern": "[A-Z]{2}-?\\w{3}-?\\d{2}-?\\d{5}"
                              },
                              "minNumFindings": 1,
                              "minConfidence": "POSSIBLE",
                              "detectorType": "REGEX",
                              "displayName": "ISRC Code Detector"
                         },
                         {
                              "regex": {
                                   "isCaseSensitive": false,
                                   "pattern": "T-[0-9]{3}\\.[0-9]{3}\\.[0-9]{3}-[0-9]"
                              },
                              "minNumFindings": 1,
                              "minConfidence": "POSSIBLE",
                              "detectorType": "REGEX",
                              "displayName": "ISWC Code Detector"
                         }                         
                    ],
                    "name": "ISRC and ISWC Code Detection Rule",
                    "logicalOp": "ANY"
               }
          ]
     },
     "payload": [
          "Non Matching Payload",
          "US-S1Z-99-00001 is an example ISRC Code: ",
          "The ISWC for Symphony No. 9 is T-905.029.737-5"
     ]
}
'

The returned response demonstrates how findings are returned, with a finding per payload entry and the Detection Rule and Detector that matched the payload, if any.

The byte range that triggered the match is also provided. In the case of the 2nd item in the payload, since the match occurred at the beginning of the string, it has a location where the byteRange start is 0. In the case of the 3rd payload entry the location offset is 31.

{
    "findings": [
        [],
        [
            {
                "finding": "US-S1Z-99-00001",
                "detector": {
                    "name": "ISRC Code Detector",
                    "uuid": "d8be87c9-4b44-41fd-b78c-8d638fe56069"
                },
                "confidence": "LIKELY",
                "location": {
                    "byteRange": {
                        "start": 0,
                        "end": 15
                    },
                    "codepointRange": {
                        "start": 0,
                        "end": 15
                    }
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "ISRC and ISWC Code Detection Rule"
                ]
            }
        ],
        [
            {
                "finding": "T-905.029.737-5",
                "detector": {
                    "name": "ISWC Code Detector",
                    "uuid": "faf4c830-f2ac-4934-bf9c-ff20f5a6f420"
                },
                "confidence": "LIKELY",
                "location": {
                    "byteRange": {
                        "start": 31,
                        "end": 46
                    },
                    "codepointRange": {
                        "start": 31,
                        "end": 46
                    }
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "ISRC and ISWC Code Detection Rule"
                ]
            }
        ]
    ]
}

Word List Example

The following example shows how a word list may be used instead of a regular expression.

curl --location --request POST 'https://api.nightfall.ai/v3/scan' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'x-api-key: NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--data-raw '{
     "config": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "wordList": {
                                   "values": [
                                        "cat",
                                        "dog",
                                        "rat"
                                   ],
                                   "isCaseSensitive": false
                              },
                              "minNumFindings": 1,
                              "minConfidence": "POSSIBLE",
                              "displayName": "animals",
                              "detectorType": "WORD_LIST"
                         }
                    ],
                    "name": "WordListExamples",
                    "logicalOp": "ANY"
               }
          ]
     },
     "payload": [
          "THE CAT SAT ON THE MAT",
          "The dog and the rat are on the west bank of the river",
          "No one here but use chickens"
     ]
}'

Below is the resulting payload with the findings detected in our different payload strings.

Note that since the isCaseSensitive flag is set to "false" for the detector, so the first string in our payload matches a word from our word list.

Also note that the confidence level for a word list match defaults to "LIKELY" so you should not set a minConfidence level higher than that if you want matches to result.

{
    "findings": [
        [
            {
                "finding": "cat",
                "detector": {
                    "name": "animals",
                    "uuid": "c033e224-034a-417f-9c0d-0c8d13f462bb"
                },
                "confidence": "LIKELY",
                "location": {
                    "byteRange": {
                        "start": 4,
                        "end": 7
                    },
                    "codepointRange": {
                        "start": 4,
                        "end": 7
                    }
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "WordListExamples"
                ]
            }
        ],
        [
            {
                "finding": "dog",
                "detector": {
                    "name": "animals",
                    "uuid": "c033e224-034a-417f-9c0d-0c8d13f462bb"
                },
                "confidence": "LIKELY",
                "location": {
                    "byteRange": {
                        "start": 4,
                        "end": 7
                    },
                    "codepointRange": {
                        "start": 4,
                        "end": 7
                    }
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "WordListExamples"
                ]
            },
            {
                "finding": "rat",
                "detector": {
                    "name": "animals",
                    "uuid": "c033e224-034a-417f-9c0d-0c8d13f462bb"
                },
                "confidence": "LIKELY",
                "location": {
                    "byteRange": {
                        "start": 16,
                        "end": 19
                    },
                    "codepointRange": {
                        "start": 16,
                        "end": 19
                    }
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "WordListExamples"
                ]
            }
        ],
        []
    ],
    "redactedPayload": [
        "",
        "",
        ""
    ]
}

Using Exclusion Rules

An Exclusion Rule allows you to refine a Detector to make sure false positives are not surfaced by Nightfall.

For instance you may want to detect whether credit card numbers are being shared inappropriately in your organization. However, there may be cases where members of your QA are sharing test credit card numbers, which should not be considered a violation and should be ignored by Nightfall.

In the following example, we define a Detector with a regular expression to match credit cards.

We then add an exclusion for some known test credit cards.

curl --location --request POST 'https://api.nightfall.ai/v3/scan' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--header 'Content-Type: application/json' \
--data-raw '{
    "policy": {
        "detectionRules": [
            {
                "detectors": [
                    {
                        "regex": {
                            "pattern": "(?:(4[0-9]{12}(?:[0-9]{3})?)|(5[1-5][0-9]{14})|(6(?:011|5[0-9]{2})[0-9]{12})|(3[47][0-9]{13})|(3(?:0[0-5]|[68][0-9])[0-9]{11})|((?:2131|1800|35[0-9]{3})[0-9]{11}))",
                            "isCaseSensitive": false
                        },
                        "exclusionRules": [
                            {
                                "wordList": {
                                    "values": [
                                        "4111111111111111",
                                        "5105105105105100"
                                    ]
                                },
                                "exclusionType": "WORD_LIST",
                                "matchType": "FULL"
                            }
                        ],
                        "minNumFindings": 1,
                        "minConfidence": "POSSIBLE",
                        "displayName": "Credit Card Reg Ex",
                        "detectorType": "REGEX"
                    }
                ],
                "name": "Credit Card Detection Rule",
                "logicalOp": "ALL"
            }
        ]
    },
    "payload": [
        "5105105105105100",
        "4111111111111111",
        "4012888888881881"
    ]
}'

As the resulting payload shows, only the 3rd provided Credit Card number matches because the first two items in the payload are included in our ExclusionRules word list.

{
   "findings":[
      [
         
      ],
      [
         
      ],
      [
         {
            "finding":"4012888888881881",
            "detector":{
               "name":"Credit Card Reg Ex",
               "uuid":"93024e88-e6de-4c84-8295-75157cdd1b52"
            },
            "confidence":"LIKELY",
            "location":{
               "byteRange":{
                  "start":0,
                  "end":16
               },
               "codepointRange":{
                  "start":0,
                  "end":16
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "Credit Card Detection Rule"
            ]
         }
      ]
   ],
   "redactedPayload":[
      "",
      "",
      ""
   ]
}

Using Context Rules

You can use the surrounding context of a match to help determine how likely it is that your potential match should actually be considered as a match by adjusting its confidence rating.

You can also tell the Detection Rule to return a portion of the surrounding context for manual review.

In the following example, in addition to providing a regular expression to match Social Security Numbers, we also look to see if someone has written the text “SSN” before and after the match, which might be a label indicating it is indeed a social security number. In which case, we change our confidence score to “VERY_LIKELY.” We then provide two possible matches in our payload, the first of which contains the string “SSN”.

curl --location --request POST 'https://api.nightfall.ai/v3/scan' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--header 'Content-Type: application/json' \
--data-raw '{
     "policy": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "regex": {
                                   "isCaseSensitive": false,
                                   "pattern": "\\d{3}-\\d{2}-\\d{4}"
                              },
                              "contextRules": [
                                   {
                                        "regex": {
                                             "pattern": "SSN",
                                             "isCaseSensitive": false
                                        },
                                        "proximity": {
                                             "windowBefore": 20,
                                             "windowAfter": 20
                                        },
                                        "confidenceAdjustment": {
                                             "fixedConfidence": "VERY_LIKELY"
                                        }
                                   }
                              ],
                              "minNumFindings": 1,
                              "minConfidence": "POSSIBLE",
                              "detectorType": "REGEX",
                              "displayName": "SSN Match Detector"
                         }
                    ],
                    "name": "SSN Match Detection Rule",
                    "logicalOp": "ALL"
               }
          ],
          "contextBytes": 20
     },
     "payload": [
          "My SSN is 555-55-5555",
          "Here it is : 555-55-5555"
     ]
}
'

In the results, you can see the confidence for the first finding in the payload has been set to VERY_LIKELY while the second item is only LIKELY.

{
   "findings":[
      [
         {
            "finding":"555-55-5555",
            "beforeContext":"My SSN is ",
            "detector":{
               "name":"SSN Match Detector",
               "uuid":"6131f41c-dbdd-47a9-8c6f-1819c9baf388"
            },
            "confidence":"VERY_LIKELY",
            "location":{
               "byteRange":{
                  "start":10,
                  "end":21
               },
               "codepointRange":{
                  "start":10,
                  "end":21
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "SSN Match Detection Rule"
            ]
         }
      ],
      [
         {
            "finding":"555-55-5555",
            "beforeContext":"Here it is : ",
            "detector":{
               "name":"SSN Match Detector",
               "uuid":"6131f41c-dbdd-47a9-8c6f-1819c9baf388"
            },
            "confidence":"LIKELY",
            "location":{
               "byteRange":{
                  "start":13,
                  "end":24
               },
               "codepointRange":{
                  "start":13,
                  "end":24
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "SSN Match Detection Rule"
            ]
         }
      ]
   ],
   "redactedPayload":[
      "",
      ""
   ]
}

Using Redaction

The Nightfall API is capable of returning a redacted version of your scanned text when a Detector is triggered.

This functionality allows you to hide potentially sensitive information while retaining the original context in which that information appeared.

Specifying a RedactionConfig

In order to redact content, when you call the scan endpoint you must provide a RedactionConfig as part of the definition of your Detection Rule.

You may specify one of the following different methods to redact content:

apply masking (e.g. asterisks)
substitute a custom phrase
substitute the name of the Detector triggered (referred to as "InfoType substitution")
use encryption

A RedactionConfig is defined per Detector in a Detection Rule, allowing you to specify a different redaction method for each type of Detector in the rule.

By default, the redaction feature will return both the sensitive finding and the redacted version of that finding. You may set the removeFinding field to true if you want only the redacted version of the finding returned in the response.

Masking Characters

Specifying a MaskConfig as part of your RedactionConfig substitutes a character for each character in the matched text. By default the masking character is an asterisk (*). You may specify an alternate character to use instead (maskingChar).

You may also choose to only mask a portion of the original text by specifying a number of characters to leave unmasked (numCharsToLeaveUnmasked). For instance, if you want to mask all but the last 4 digits of a credit card number, set this value to 4 so that the redacted finding would be rendered as ***************4242.

In the case where you want to leave characters unmasked at the front of the string you may use the maskLeftToRight flag. This flag determines if masking is applied left to right (*****/1984) instead of right to left (01/01*****). By default, this value is false.

Below is an example of how a RedactionConfig would be configured to redact the text that triggers a DATE_OF_BIRTH Detector such that the text 01/11/1995 becomes ??/??/??95

{
  "minNumFindings":1,
  "minConfidence":"POSSIBLE",
  "detectorType":"NIGHTFALL_DETECTOR",
  "nightfallDetector":"DATE_OF_BIRTH",
  "redactionConfig":{
     "maskConfig":{
     "charsToIgnore":[
        "/"
     ],
     "maskingChar":"?",
     "maskRightToLeft":true,
     "numCharsToLeaveUnMasked":2
     }
   }
 }

Phrase Substitution

The SubstitutionConfig substitutes a sensitive finding with the value assigned to the property substitutionPhrase.

If no value is assigned to substitutionPhrase, the finding will be replaced with an empty string.

InfoType Substitution

It is possible to replace a sensitive finding with the name of the NIGHTFALL_DETECTOR that triggered it by using an InfoTypeSubstitutionConfig.

If you use the built in credit card Detector, the string 4242-4242-4242-4242 will be redacted to [CREDIT_CARD_NUMBER]

This config is only valid for Detector's with a detectorType of NIGHTFALL_DETECTOR.

Encryption

A CryptoConfig will encrypt a sensitive finding with a public key (provided as the publicKey property of the config) using RSA encryption.

Note that you are responsible for passing public keys for encryption and handling any decryption of the response payload. Nightfall will not store your keys.

Below is an example of a CryptoConfig being used to redact an EMAIL_ADDRESS detector.

{
  "minNumFindings":1,
  "minConfidence":"POSSIBLE",
  "detectorType":"NIGHTFALL_DETECTOR",
  "nightfallDetector":"EMAIL_ADDRESS",
  "displayName":"email",
  "redactionConfig":{
	 "cryptoConfig":{
		"publicKey":"-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAydYMwOYUGyBXDgHkzv19YR/dYQES4kYTMUps39qv/amNDywz4nsBDvCUqUvcN3nEpplHlYGH5ShSeA4G/FcmRqynSLVyFPZat/8E7n+EeHsgihFrr8oDWo5UBjCwRinTrC0m11q/5SeNzwVCWkf9x40u94QBz13dQoa9yPwaZBX5uBzyH86R7yeZHpad2cLq0ltpmJ3j5UfsFilkOb3JB60TNpNDdfabprot/y30CEnDDOgAXGtV1m0AhQpQjKRnkUs39DntqSbS+i0UgbyqzEGNUkeR1WsotXekW4KnbWA7k6S8SfkO27vnTSY5b9g/KKaOdysn5YaWJPfTVT/nywIDAQAB\n-----END PUBLIC KEY-----"
	 }
  }
}

Redactions in the Scan Response

The results of applying redactions are returned in the response payload for requests made to the scan endpoint as both part of an array named redactedPayload as well as additional properties of the finding object.

The original input payload with redactions made inline are returned as a list of strings under the redactedPayload property. Each item in the list of redacted payloads corresponds to the list of strings in the original input payload and, if a Detector was triggered, it will contain a redacted version of that corresponding string.

If an item in the input payload did not have any findings, the entry for that index will be an empty string ("").

The redactedPayload property is omitted if no RedactionConfig was provided.

Additionally, the fields redactedFinding and redactedLocation are added to the finding object when the redaction feature is invoked.

The redactedFinding field contains the redacted version of only the text of the finding without its surrounding context. This is useful when you are masking a portion of the text that triggered a Detector.

The redactedLocation property will be returned as part of the finding that corresponds to an item in the payload. This may be distinct from the location property that is returned for a finding by default.

In the unlikely case where there are findings that overlap, Nightfall will default to replacing the text of the overlapping findings with [REDACTED BY NIGHTFALL].

Example Redaction Call

The following example shows how the redaction functionality may be invoked, with a variety of different redaction methods applied to the different Detectors being used.

curl --location --request POST 'https://api.nightfall.ai/v3/scan' \
--header 'x-api-key: NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
--header 'Content-Type: text/plain' \
--data-raw '{
   "payload":[
      "my ssn is 123-45-5555 and date of birth is 01/11/1995 and my credit card number is  4242 4242 4242 4242 and my email is james@gmail.com.",
      "my date of birth is 03 23 4242 4242 4242 4242 amex"
   ],
   "policy":{
      "detectionRules":[
         {
            "detectors":[
               {
                  "minNumFindings":1,
                  "minConfidence":"POSSIBLE",
                  "detectorType":"NIGHTFALL_DETECTOR",
                  "nightfallDetector":"CREDIT_CARD_NUMBER",
                  "displayName":"cc",
                  "redactionConfig":{
                     "infoTypeSubstitutionConfig":{
                        
                     },
                     "removeFinding":true
                  }
               },
               {
                  "minNumFindings":1,
                  "minConfidence":"POSSIBLE",
                  "detectorType":"NIGHTFALL_DETECTOR",
                  "nightfallDetector":"US_SOCIAL_SECURITY_NUMBER",
                  "displayName":"ssn",
                  "redactionConfig":{
                     "substitutionConfig":{
                        "substitutionPhrase":"*REDACTED*"
                     }
                  }
               },
               {
                  "minNumFindings":1,
                  "minConfidence":"POSSIBLE",
                  "detectorType":"NIGHTFALL_DETECTOR",
                  "nightfallDetector":"EMAIL_ADDRESS",
                  "displayName":"email",
                  "redactionConfig":{
                     "cryptoConfig":{
                        "publicKey":"-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAydYMwOYUGyBXDgHkzv19YR/dYQES4kYTMUps39qv/amNDywz4nsBDvCUqUvcN3nEpplHlYGH5ShSeA4G/FcmRqynSLVyFPZat/8E7n+EeHsgihFrr8oDWo5UBjCwRinTrC0m11q/5SeNzwVCWkf9x40u94QBz13dQoa9yPwaZBX5uBzyH86R7yeZHpad2cLq0ltpmJ3j5UfsFilkOb3JB60TNpNDdfabprot/y30CEnDDOgAXGtV1m0AhQpQjKRnkUs39DntqSbS+i0UgbyqzEGNUkeR1WsotXekW4KnbWA7k6S8SfkO27vnTSY5b9g/KKaOdysn5YaWJPfTVT/nywIDAQAB\n-----END PUBLIC KEY-----"
                     }
                  }
               },
               {
                  "minNumFindings":1,
                  "minConfidence":"POSSIBLE",
                  "detectorType":"NIGHTFALL_DETECTOR",
                  "nightfallDetector":"DATE_OF_BIRTH",
                  "redactionConfig":{
                     "maskConfig":{
                        "charsToIgnore":[
                           "/"
                        ],
                        "maskingChar":"?",
                        "maskRightToLeft":true,
                        "numCharsToLeaveUnMasked":2
                     }
                  }
               }
            ],
            "name":"cc",
            "logicalOp":"ANY"
         }
      ]
   }
}'

You can see in the response how the RedactionConfig associated with the various Detectors affects the different findings.

Note that because the 2nd item the payload matches multiple detectors, the redacted text in the redactedPayload property becomes [REDACTED BY NIGHTFALL]

{
   "findings":[
      [
         {
            "finding":"james@gmail.com",
            "redactedFinding":"X8QL0mZGHZ+N47nPEccjsLHf2F/5cFqjF16P6wgYJhy8IaxHipHWMBRAufKR4T8FFkvTuTEanu6ZAA+V8NTkNmTLxHarcWPSVClJ8kjXAPltLuR4I2H4eeT+sWEvUP3ik/BF1KcxRpsYWDQO1bNYk+WReXkWlW72Q7rbWuTGFj2uDFCPS+DUraDh9wNBsMPELFOnh1GSQIKCp9U5GMp/kkpo/0idh83RVHXyjZPT4ReKEST2oG2lQ9UuP5LJy/mHX1VYgd8DwlETn8nkhqJ1T0mGs6kHSh22G6N0ic0PjHnj73RiMnQdPwlLw3qyPmFf6RRLKtFuzmFan8ZGtZhcKA==",
            "detector":{
               "name":"email",
               "uuid":"c0235299-0f26-4ad6-ad8c-71f83daf44e9"
            },
            "confidence":"VERY_LIKELY",
            "location":{
               "byteRange":{
                  "start":120,
                  "end":135
               },
               "codepointRange":{
                  "start":120,
                  "end":135
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":120,
                  "end":135
               },
               "codepointRange":{
                  "start":120,
                  "end":135
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         },
         {
            "finding":"01/11/1995",
            "redactedFinding":"??/??/??95",
            "detector":{
               "name":"DATE_OF_BIRTH",
               "uuid":"540856cb-99cb-42e7-b8aa-cd4f22f019d7"
            },
            "confidence":"LIKELY",
            "location":{
               "byteRange":{
                  "start":43,
                  "end":53
               },
               "codepointRange":{
                  "start":43,
                  "end":53
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":43,
                  "end":53
               },
               "codepointRange":{
                  "start":43,
                  "end":53
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         },
         {
            "finding":"",
            "redactedFinding":"[CREDIT_CARD_NUMBER]",
            "detector":{
               "name":"cc",
               "uuid":"74c1815e-c0c3-4df5-8b1e-6cf98864a454"
            },
            "confidence":"VERY_LIKELY",
            "location":{
               "byteRange":{
                  "start":84,
                  "end":103
               },
               "codepointRange":{
                  "start":84,
                  "end":103
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":84,
                  "end":103
               },
               "codepointRange":{
                  "start":84,
                  "end":103
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         },
         {
            "finding":"123-45-5555",
            "redactedFinding":"*REDACTED*",
            "detector":{
               "name":"ssn",
               "uuid":"e30d9a87-f6c7-46b9-a8f4-16547901e069"
            },
            "confidence":"VERY_LIKELY",
            "location":{
               "byteRange":{
                  "start":10,
                  "end":21
               },
               "codepointRange":{
                  "start":10,
                  "end":21
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":10,
                  "end":21
               },
               "codepointRange":{
                  "start":10,
                  "end":21
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         }
      ],
      [
         {
            "finding":"",
            "redactedFinding":"[CREDIT_CARD_NUMBER]",
            "detector":{
               "name":"cc",
               "uuid":"74c1815e-c0c3-4df5-8b1e-6cf98864a454"
            },
            "confidence":"VERY_LIKELY",
            "location":{
               "byteRange":{
                  "start":26,
                  "end":45
               },
               "codepointRange":{
                  "start":26,
                  "end":45
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":26,
                  "end":45
               },
               "codepointRange":{
                  "start":26,
                  "end":45
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         },
         {
            "finding":"03 23 4242",
            "redactedFinding":"????????42",
            "detector":{
               "name":"DATE_OF_BIRTH",
               "uuid":"540856cb-99cb-42e7-b8aa-cd4f22f019d7"
            },
            "confidence":"LIKELY",
            "location":{
               "byteRange":{
                  "start":20,
                  "end":30
               },
               "codepointRange":{
                  "start":20,
                  "end":30
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "redactedLocation":{
               "byteRange":{
                  "start":20,
                  "end":30
               },
               "codepointRange":{
                  "start":20,
                  "end":30
               },
               "rowRange":null,
               "columnRange":null,
               "commitHash":""
            },
            "matchedDetectionRuleUUIDs":[
               
            ],
            "matchedDetectionRules":[
               "cc"
            ]
         }
      ]
   ],
   "redactedPayload":[
      "my ssn is *REDACTED* and date of birth is ??/??/??95 and my credit card number is  [CREDIT_CARD_NUMBER] and my email is X8QL0mZGHZ+N47nPEccjsLHf2F/5cFqjF16P6wgYJhy8IaxHipHWMBRAufKR4T8FFkvTuTEanu6ZAA+V8NTkNmTLxHarcWPSVClJ8kjXAPltLuR4I2H4eeT+sWEvUP3ik/BF1KcxRpsYWDQO1bNYk+WReXkWlW72Q7rbWuTGFj2uDFCPS+DUraDh9wNBsMPELFOnh1GSQIKCp9U5GMp/kkpo/0idh83RVHXyjZPT4ReKEST2oG2lQ9UuP5LJy/mHX1VYgd8DwlETn8nkhqJ1T0mGs6kHSh22G6N0ic0PjHnj73RiMnQdPwlLw3qyPmFf6RRLKtFuzmFan8ZGtZhcKA==.",
      "my date of birth is [REDACTED BY NIGHTFALL] amex"
   ]
}

Using Policies to Send Alerts

Policies allow customers to create templates for their most common workflows such as sending alerts when detection rules are triggered.

These policies may be created manually through the dashboard or may be defined programmatically.

When defining an a Policy inline, in addition to specifying the Detection Rules (either by referencing the UUID of an existing Detection Rule or defining a Detection Rule and its Detectors inline), you must define an alertConfig which will determine where findings are sent.

The alertConfig can be either:

an email address
a Slack channel
a webhook url
a url to a SIEM host as well authentication and other headers

Below is a simple example of a payload with a policy that will send alerts to an email address that you would use with our endpoint for Scan Plain Text.

{     
  "policy": {
    "detectionRules": [
               {
                    "detectors": [
                        {
                            "detectorType": "NIGHTFALL_DETECTOR",
                            "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER",
                            "minNumFindings": 1,
                            "minConfidence": "LIKELY",
                            "displayName": "US Social Security Number"
                        }
                    ],
                    "name": "SSN Match Detection Rule",
                    "logicalOp": "ALL"
               }
          ],
    "contextBytes": 5,
    "alertConfig": {
      "email": {
        "address": "youremail@nightfall.ai"
      }
    }
  },
  "payload": [
        "The customer's social security number is 555-55-5555",
        "No SSN in this string"
   ]
}

You will receive the following response:

{
    "findings": [
        [
            {
                "finding": "555-55-5555",
                "beforeContext": "r is ",
                "detector": {
                    "name": "US Social Security Number",
                    "uuid": "e30d9a87-f6c7-46b9-a8f4-16547901e069"
                },
                "confidence": "VERY_LIKELY",
                "location": {
                    "byteRange": {
                        "start": 41,
                        "end": 52
                    },
                    "codepointRange": {
                        "start": 41,
                        "end": 52
                    },
                    "rowRange": null,
                    "columnRange": null,
                    "commitHash": ""
                },
                "matchedDetectionRuleUUIDs": [],
                "matchedDetectionRules": [
                    "SSN Match Detection Rule"
                ]
            }
        ],
        []
    ],
    "redactedPayload": [
        "",
        ""
    ]
}

Note that you may also use a pre-defined policy defined under Developer Platform > Overview > Policies by copying the Policy UUID and sending a request as shown below.

curl --request POST \
     --url https://api.nightfall.ai/v3/scan \
     --header 'accept: application/json' \
     --header 'authorization: Bearer <InsertYourApiKeyHere>' \
     --header 'content-type: application/json' \
     --data '
{
     "policyUUIDs": [
          "2b2ced32-80c3-4a89-8757-489743ec4640"
     ],
     "payload": [
          "My payload to scan"
     ]
}
'

policy vs. policyUUIDs vs. config

The policy object supersedes the config object. The use of config objects will still continue to be supported, but its use should be considered deprecated. If you specify policy object you cannot also specify a config object.

Also note that previous iterations of the API allowed for a simple list of policyUUIDs to be specified instead of of a policy object. This has been preserved for backwards compatibility, but it is recommended you use the policy object as it has a richer set of features. You may not use both a policyUUIDs list and a policy object.

The following payload will be sent to the given email address with the subject "🚨 Findings Detected by Nightfall! 🚨" as an attachment with the name nightfall-findings.json:

{
  "redactedPayload": [
    "", 
    ""
  ], 
  "findings": [
    [
      {
        "confidence": "LIKELY", 
        "matchedDetectionRules": [
          "SSN Match Detection Rule"
        ], 
        "matchedDetectionRuleUUIDs": [], 
        "location": {
          "codepointRange": {
            "start": 41, 
            "end": 52
          }, 
          "rowRange": null, 
          "byteRange": {
            "start": 41, 
            "end": 52
          }, 
          "columnRange": null, 
          "commitHash": ""
        }, 
        "finding": "555-55-5555", 
        "detector": {
          "name": "SSN Match Detector", 
          "uuid": "7270ccd5-07c5-44e5-b280-c768e0028963"
        }, 
        "beforeContext": "r is "
      }
    ], 
    []
  ]
}

This attachment has the same content as the response payload to the initial request.

Note that the sender address will be no-reply@nightfall.ai

This email address will not respond to messages sent to it.

Using Webhooks with Policies

Policies also allow you to send findings to a callback designated URL using the url property of the alertConfig object.

This mechanism allows you to programmatically consume findings and the data sent will contain sensitive information as well as additional metadata like the location of the findings in the payload. For this reason the URL must be an HTTPS URL and the service backing it be implemented to properly respond with your Webhook signing key and act as a Webhook Server .

Below is what Webhook URL should like in your policy's alertConfig in a payload sent to our endpoint used for scanning plain text.

{     
  "policy": {
    "detectionRuleUUIDs": [
      "c8d43147-0a63-4c01-8a57-83d8108422f5"
    ],
    "alertConfig": {
        "url": {
            "address": "https://mywebhookurl.com"
        }
    }
  },
  "payload": [
        "The customer's social security number is 555-55-5555"
   ]
}

Using Slack Channels With Policies

Another option supported by Policies is sending finding data to a designated Slack channel.

This feature requires that you have configured the Nightfall Slack integration.

Below is a sample payload for scanning plain text.

{     
     "policy": {
          "detectionRules": [
               {
                    "detectors": [
                        {
                            "detectorType": "NIGHTFALL_DETECTOR",
                            "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER",
                            "minNumFindings": 1,
                            "minConfidence": "LIKELY",
                            "displayName": "US Social Security Number"
                        }
                    ],
                    "name": "Simple SSN Match Detection Rule",
                    "logicalOp": "ALL"
               }
          ],
        "alertConfig": {
            "slack": {
                "target": "#securityalert"
            }
        }
    },
     "payload": [
          "The customer's social security number is 555-55-5555",
          "No SSN in this string"
     ]
}

Below is an example as to how the violation will appear in Slack.

See the section on Slack in the overview on Alerting for more details.

Sending Alerts to SIEMs and other HTTP Event Collectors

SIEM (pronounced “sim”) is a combination of security information management (SIM) and security event management systems. SIEM technology collects event log data for analysis in order to provide visibility into network activity.

It is possible to send findings from a policy to a SIEM service such as LogRhythm, SumoLogic, or Splunk using the siem alertConfig.

This configuration will require a URL to a collector that uses an HTTPS endpoint.

Note that the URL for the siem alertConfig must:

use the HTTPS scheme
be able to accept requests made with the POST verb
respond with a 200 status code upon receipt of the event

See the documentation for your SIEM service for how to set up this URL.

Unlike the url alertConfig option, the siem alertConfig does not require that the endpoint for the service implement a custom challenge response. Events sent to the siem alertConfig endpoint contain a subset of what is sent to the url alertConfig. Furthermore the findings are sent in a redacted form similar to Slack or email alerts.

In addition to the URL, you may provide headers such as those that are used for authorization.

The headers in the SIEM alertConfig are divided into sensitiveHeaders and plainTextHeaders header mappings.

The sensitiveHeaders field is specifically for header values like authentication. Nightfall ensures that these header values are always hidden in our service. They are never logged or saved in analytic events.

You can use plainTextHeaders for all other type of information you would like passed along with Nightfall alerts to you HTTP endpoint. Nightfall assumes that the values stored plainTextHeaders do not contain any sensitive information so we do not take any action to hide or protect these values.

Below is an example of a payload using a siem alertConfig.

{
     "policy": {
          "detectionRules": [
               {
                    "detectors": [
                         {
                              "nightfallDetector": "CREDIT_CARD_NUMBER",
                              "detectorType": "NIGHTFALL_DETECTOR",
                              "minConfidence": "POSSIBLE",
                              "minNumFindings": 1
                         }
                    ],
                    "logicalOp": "ALL"
               }
          ],
		"alertConfig": {
		   "email": {
			"address": "<your email>"
		   },
		   "siem": {
				"sensitiveHeaders": {
					 "Authorization": "Splunk <your token value>"
				},
				"address": "https://http-inputs-<yourhost>.splunkcloud.com:8088/services/collector/event"
		   }
		}
     },
     "payload": [
          "4916-6734-7572-5015 is my credit card number",
          "This string does not have any sensitive data",
          "my api key is yr+ZWwIZp6ifFgaHV8410b2BxbRt5QiAj1EZx1qj and my 💳 credit card number 💰 is 30204861594838"
     ]
}

Other Policy Features

Using Redaction Within a Policy

A policy may be configured with default redaction rules as a defaultRedactionConfig that will affect the content of the redactedPayload field of the content that is sent to the alert locations specified in the policy alertConfig. Note that this redaction does not affect the findings themselves.

These redaction rules will be applied to Detection Rules that do not have a specified redaction configuration.

The redactionConfig specified must be one and only one of the four available redaction types:

maskConfig
infoTypeSubstitutionConfig
substitutionConfig
cryptoConfig

For more information on Redactions see: Using Redaction

Below is a simple example of a payload for scanning plain text using a policy set up to use a defaultRedactionConfig

{
  "policy": {
    "detectionRules": [
      {
        "detectors": [
          {
            "detectorType": "NIGHTFALL_DETECTOR",
            "nightfallDetector": "US_SOCIAL_SECURITY_NUMBER",
            "minNumFindings": 1,
            "minConfidence": "LIKELY",
            "displayName": "US Social Security Number"
          }
        ],
        "name": "Simple SSN Match Detection Rule",
        "logicalOp": "ALL"
      }
    ],
      "defaultRedactionConfig": {
        "maskConfig": {
          "charsToIgnore": [
            "-"
          ],   
            "maskingChar": "#",
              "numCharsToLeaveUnmasked": 4,
                "maskLeftToRight": true
        }
      },
        "contextBytes": 5,
          "alertConfig": {
            "email": {
              "address": "eric@nightfall.ai"
            }
          }
  },
    "payload": [
      "The customers social security number is 555-55-5555",
      "No SSN in this string"
    ]
}

Using Context Bytes Within a Policy

In additional to a defaultRedactionConfig it is possible to set the number of bytes to include as before and after a given finding as the contextBytes. This context can provide meaning to how the finding appears within the text to allow human readers to better understand the meaning of the finding. The maximum value for contextBytes is 40.

Detecting Secrets

Leaked secrets, such as credentials needed to authenticate and authorize a cloud provider’s API request, expose company software, services, infrastructure, and data to hackers.

Nightfall has developed technology to detect secrets and label findings to speed SecOPs workflows from being clogged and eliminate false positive alerts.

Overall Coverage

Nightfall uses machine learning models trained on a large (millions of lines of code) diverse dataset (including all programming languages and application types) to ensure best-in-class secret detection accuracy and coverage.

Explicit Labeling and Endpoint Validation for Popular Services

For a growing set of the most popular services, Nightfall will:

label detected secrets by vendor and service type (returned the kind field of the response)
label detected secrets as active risks by validating supported credential types with their associated service endpoints (returned as the status of the service)

Our current solution supports the following vendors covering a diverse set of use cases, including cloud storage/infrastructure, communication, social networks, software development, banking, observability, and payment processing.

This list is not static and will continue to grow as we add support for detecting API keys from additional services. If you want to detect API keys from a service not listed below, please contact us.

Key Detection Example

Below is an example of how an AWS Key would be shown in a finding.


{
  "finding": "zImaKNJJ8u/seIbm1UszokVz3SSARukJs6cghEBXD",
  "detector": {
    "name": "API key",
    "uuid": "0e95732f-bc5c-448f-9d15-bd1417177360"
  },
  "confidence": "VERY_LIKELY",
  ...
  "findingMetadata": {
    "apiKeyMetadata": {
      "status": "ACTIVE",
      "kind": "AWS",
      "description": "Access Key ID: AKIA52FSMBPZS1JIDTPX"
    }
  }
}

The following values are returned for the status field:

ACTIVE
EXPIRED
UNVERIFIED

This value will be based on what information is returned by the corresponding service when attempting the validate the key. If no data is returned fro the service, it will be considered UNVERIFIED.

To use this functionality, you use our existing built-in API_KEY detector to scan a data source such as Git Repository. Below is an example using a detection rule defined in line for a text scan.

curl --request POST \
     --url https://api.nightfall.ai/v3/scan \
     --header 'Authorization: Bearer NF-rEpLaCeM3w1ThYoUrNiGhTfAlLKeY123' \
     --header 'Content-Type: application/json' \
     --data '{
       "policy": {
            "detectionRules": [
                 {
                      "detectors": [
                           {
                                "detectorType": "NIGHTFALL_DETECTOR",
                                "nightfallDetector": "API_KEY",
                                "minNumFindings": 1,
                                "minConfidence": "LIKELY",
                                "displayName": "API Key"
                           }
                      ],
                      "name": "My Match Rule",
                      "logicalOp": "ANY"
                 }
            ]
       },
       "payload": [
            "Is this an active nightfall key? NF-OZ6F9fzF2z5mRxMrUdfL8FddFS51kPzE"
       ]
     }'

PHI Detection Rules

Protected health information (PHI), also referred to as personal health information, describes a patient's medical history — including ailments, various treatments, and outcomes. PHI may include:

demographic information
test and laboratory results
mental health conditions
insurance information

The Health Insurance Portability and Accountability Act (HIPAA) of 1996 is the primary law that oversees the use of, access to, and disclosure of PHI in the United States. HIPAA lists 18 different personal information identifiers (PII) that, when paired with health information, become PHI. In order to more accurately detect potential PHI, Nightfall has introduced specific new detectors that allow for specialized combinations.

These HIPAA PII and PHI-specific detectors intelligently aggregate Nightfall's built-in detector to ensure compliance with governing law. For example, finding a patient's name in a document or message is not considered HIPAA PII as it does not uniquely identify an individual, many people can share the same name. However, the information would be considered HIPAA PII if the patient's name and address were in the same message.

Specific PHI and HIPAA PII can be detected with greater confidence, especially as they relate to specific medical codes or terms in association with specific logical combinations of other PII. For instance when the patient's name and date of birth or a person's name and street address or any of a set of particular PII (phone number email, SSN, etc) it would be considered HIPAA PII.

If the combined detectors all match with a confidence of "Very Likely" it would match our "HIPAA PII Very Likely" Detection Rule. Otherwise if these detectors match with a confidence of "Likely" it would match our "HIPAA PII Likely" Detection Rule.

Alternatively when any of the above PII options are found in conjunction with a specific set of medical related codes or terms (IDC Codes, FDA Drug Names or Codes, Procedures), that finding could be flagged as PHI.

When all the detectors within these PHI Detection Rules make findings that have a confidence of "Very Likely," that would match our "PHI Very Likely" Detection Rule, while if some are all are met with a confidence of "Likely" that would match our "PHI Likely" Detection Rule.

Test Datasets

The following sample datasets can be used to test Nightfall's advanced AI-based detection capabilities.

This data has been fully de-identified and can be used to test any data loss prevention (DLP) platform.

Errors

While using Nightfall's Scan API, you may encounter some of the common errors outlined below. Try following the provided troubleshooting steps.

If problems persist, please for further assistance.

HTTP Error Codes

The following error codes are returned as part of a standard HTTP response.

HTTP Error Code

Description

Troubleshooting

Nightfall Playground

The Nightfall Developer Playground () is a sample app that you may use to test out API functionality before writing any code.

Our playground environment allows you to:

Test Detectors and Detection Rules. Here are some .
Generate sample data for DLP testing.
Explore a sample app built on our APIs

Nightfall APIs

DLP APIs - Firewall for AI Platform

Firewall for AI DLP APIs enables developers to write custom code to sanitize data anywhere–RAG data sets, analytics data stores, data pipelines, and unsupported SaaS applications.

Scan Plain Text

Initiate File Upload

Upload File Chunk

Complete File Upload

Scan Uploaded File

Rate Limits for Firewall APIs

To prevent misuse and ensure the stability of our platform, we enforce a rate limit on an API Key and endpoint basis, similar to the way many other APIs enforce rate limits.

When operating under our Free plan, accounts and their corresponding API Keys have a rate limit of 5 requests per second on average, with support for bursts of 15 requests per second. If you upgrade to a paid plan – the Enterprise plan – this rate increases to a limit of 10 requests per second on average and bursts of 50 requests per second.

Plan

Requests Per Second (Avg)

Burst

The Nightfall API follows standard practices and conventions to signal when these rate limits have been exceeded.

Successful requests return a header X-Rate-Limit-Remaining with the integer number of requests remaining before errors will be returned to the client.

When your application exceeds the rate limit for a given API endpoint, the Nightfall API will return an HTTP response code of 429 "Too Many Requests.” If your use case requires increased rate limiting, please reach out to support@nightfall.ai.

Additionally, these unsuccessful requests return the number of seconds to wait before retrying the request in a Retry-After Header.

Request Rate Limiting

Your Request Rate Limiting throttles how frequently you can make requests to the API. You can monitor your rate limit usage via the `X-Rate-Limit-Remaining` header, which tells you how many remaining requests you can make within the next second before being throttled.

Quotas

Your Quota limits how many bytes of data you're permitted to scan within a given period. Your current remaining quota and the end of your current quota period are denoted by the following response headers.

Response Headers

Type

Description

DLP APIs - Native SaaS Apps

The native SaaS app APIs can be utilized by customers using Nightfall’s SaaS apps, supported natively, to fetch violations, search violations by app meta-data attributes, and fetch findings within violations. These DLP APIs do not provide access to violations for apps scanned via the developer platform. These APIs require you to create an API key as outlined in the Getting Started with the Developer Platform section. However, to use these APIs, you need not create any detectors, detection rules, and policies in the developer platform.

If you are using Nightfall SaaS apps, you can use APIs to fetch violations, search through the violations, and fetch specific findings within the Violations. To scan data in any custom apps or cloud infrastructure services like AWS S3, you must use the APIs in the DLP APIs - Firewall for AI Platform section.

Fetch Violations

Fetch Violations by ID

Search Violations

Fetch Violation Findings

Take Action on Violations

Fetch Annotions

Annotate Findings

Remove Annotation Findings

Fetch GitHub Repositories

Policy User Scope Update API

Note

Internal only endpoint. This will change once Nightfall introduces CRUD API's for policies.

Rate Limits for Native SaaS app APIs

To prevent misuse and ensure the stability of our platform, we enforce a rate limit on an API Key and endpoint basis, similar to the way many other APIs enforce rate limits.

Plan

Requests Per Second (Avg)

Burst

The Nightfall API follows standard practices and conventions to signal when these rate limits have been exceeded.

Successful requests return a header X-Rate-Limit-Remaining with the integer number of requests remaining before errors will be returned to the client.

Additionally, these unsuccessful requests return the number of seconds to wait before retrying the request in a Retry-After Header.

Request Rate Limiting

Quotas

Your Quota limits how many requests you can make within a given period. Your current remaining quota and the end of your current quota period are denoted by the following response headers.

Response Headers

Type

Description

For the free plan, we allow 5 requests per second and 10000 requests in a day.

Nightfall Software Development Kit (SDK)

Nightfall SDKs

Leverage our software development kits (SDKs) to enable easier, faster, and more stable engagement with the Nightfall APIs. Nightfall has a growing library of language specific SDKs including for:

If there is a language-specific SDK that you would find valuable but is not here, please don't hesitate to reach out to product@nightfall.ai.

Language Specific Guides

Nightfall provides you the flexibility to easily integrate into applications using programming languages. The supported languages are as follows.

Python
Ruby
Java

Python

This guide describes how to use Nightfall with the Python programming language.

The example below will demonstrate how to use Nightfall’s text scanning functionality to verify whether a string contains sensitive PII using the Nightfall Python SDK.

To request the Nightfall API you will need:

A Nightfall API key
An existing Nightfall Detection Rule
Data to scan. Note that the API interprets data as plaintext, so you may pass it in any structured or unstructured format.

You can read more about obtaining a Nightfall API key or about our available data detectors in the linked reference guides.

In this tutorial, we will be downloading, setting up, and using the Python SDK provided by Nightfall.

We recommend you first set up a virtual environment. You can learn more about that here.

You can download the Nightfall SDK from PyPi like this:

We will be using the built-in os library to help run this sample API script. This will be used to help extract the API Key from the OS as an environment variable.

import os

from nightfall import Confidence, DetectionRule, Detector, LogicalOp, Nightfall

Next, we extract our API Key, and abstract a nightfall class from the SDK, for it. In this example, we have our API key set via an environment variable called NIGHTFALL_API_KEY. Your API key should never be hard-coded directly into your script.

nightfall = Nightfall(os.environ['NIGHTFALL_API_KEY'])

Next we define the Detection Rule with which we wish to scan our data. The Detection Rule can be pre-made in the Nightfall web app and referenced by UUID.

detection_rule_uuid = os.environ.get('DETECTION_RULE_UUID')

In this example, we will use some example data in the payload List.

🚧Payload Limit
Payloads must be under 500 KB when using the Scan API. If your file is larger than the limit, consider using the file api, which is also available via the Python SDK.

We will ignore the second parameter as we do not have redaction configured for this request.

With the Nightfall API, you can redact and mask your findings. You can add a Redaction Config, as part of your Detection Rule. For more information on how to use redaction, and its specific options, please refer to the guide here.

payload = [
    "The customer social security number is 458-02-6124",
    "No PII in this string",
    "My credit card number is 4916-6734-7572-5015"
]

result, _ = nightfall.scan_text(
        payload,
        detection_rule_uuids=[detection_rule_uuid]
    )

payload = [
    "The customer social security number is 458-02-6124",
    "No PII in this string",
    "My credit card number is 4916-6734-7572-5015"
]

result, _ = nightfall.scan_text(
    payload,
    detection_rules=[
        DetectionRule(
            name="Sample_Detection_Rule",
            logical_op=LogicalOp.ANY,
            detectors=[
                Detector(
                    min_confidence=Confidence.VERY_LIKELY,
                    min_num_findings=1,
                    display_name="Credit Card",
                    nightfall_detector="CREDIT_CARD_NUMBER",
                ),
                Detector(
                    min_confidence=Confidence.VERY_LIKELY,
                    min_num_findings=1,
                    display_name="Social",
                    nightfall_detector="US_SOCIAL_SECURITY_NUMBER",
                )
            ]
        )
    ]
)

Reviewing Results

Now we are ready to review the results from the Nightfall SDK to check if there is any sensitive data in our file. Since the results will be in a dataclass, we can use the built-in __repr__ functions to format the results in a user-friendly and readable manner.

All data and sample findings shown below are validated, non-sensitive, examples of sample data.

If there are no sensitive findings in our payload, the response will be as shown in the 'empty response' pane below:

[
    [Finding(finding='458-02-6124', redacted_finding=None, before_context=None, after_context=None, detector_name='US social security number (SSN)', detector_uuid='e30d9a87-f6c7-46b9-a8f4-16547901e069', confidence=<Confidence.VERY_LIKELY: 'VERY_LIKELY'>, byte_range=Range(start=39, end=50), codepoint_range=Range(start=39, end=50), matched_detection_rule_uuids=['c67e3dd7-560e-438f-8c72-6ec54979396f'], matched_detection_rules=[])],
    [],
    [Finding(finding='4916-6734-7572-5015', redacted_finding=None, before_context=None, after_context=None, detector_name='Credit card number', detector_uuid='74c1815e-c0c3-4df5-8b1e-6cf98864a454', confidence=<Confidence.VERY_LIKELY: 'VERY_LIKELY'>, byte_range=Range(start=25, end=44), codepoint_range=Range(start=25, end=44), matched_detection_rule_uuids=['c67e3dd7-560e-438f-8c72-6ec54979396f'], matched_detection_rules=[])]
]

[
    [Finding(finding='458-02-6124', redacted_finding=None, before_context=None, after_context=None, detector_name='Social', detector_uuid='e30d9a87-f6c7-46b9-a8f4-16547901e069', confidence=<Confidence.VERY_LIKELY: 'VERY_LIKELY'>, byte_range=Range(start=39, end=50), codepoint_range=Range(start=39, end=50), matched_detection_rule_uuids=[], matched_detection_rules=['Sample_Detection_Rule'])],
    [],
    [Finding(finding='4916-6734-7572-5015', redacted_finding=None, before_context=None, after_context=None, detector_name='Credit Card', detector_uuid='74c1815e-c0c3-4df5-8b1e-6cf98864a454', confidence=<Confidence.VERY_LIKELY: 'VERY_LIKELY'>, byte_range=Range(start=25, end=44), codepoint_range=Range(start=25, end=44), matched_detection_rule_uuids=[], matched_detection_rules=['Sample_Detection_Rule'])],
]

[[], [], []]

You are now ready to use the Python SDK for other scenarios.

Java

This guide describes how to use Nightfall with the Java programming language.

The example below will demonstrate how to use Nightfall’s text scanning functionality to verify whether a string contains sensitive PII using the Nightfall Java SDK.

In this tutorial, we will be downloading, setting up, and using the Java SDK provided by Nightfall.

To make a request to the Nightfall API you will need:

A Nightfall API key
Plaintext data to scan.

You can read more about obtaining or about our available from the linked reference guides.

You can add the Nightfall package to your project by adding a dependency to your pom.xml:

First add the required imports to the top of the file.

These are the objects we will use from the Nightfall SDK, as well as some collection classes for data handling.

We can then declare some data to scan in a List:

Create a ScanTextRequest to scan the payload with. First create a new instance of the credit card detector, and set to trigger if there are any findings that are confidence LIKELY or above.

Add a second detector, looking for social security numbers. Set it to be triggered if there is at least a possible finding.

Combine these detectors into a detection rule, which will return findings if either of these detectors are triggered.

Finally, combine the payload and configuration together as a new ScanTextRequest, and return it.

Use the ScanTextRequest instance with a NightfallClient to send your request to Nightfall.

The resulting ScanTextResponse may be used to print out the results:

You are now ready to use the Java SDK for other scenarios.

Tutorials

GenAI Protection

This section consists of various documents that assist you in scanning various popular SaaS GenAI services and frameworks using Nightfall APIs.

OpenAI Prompt Sanitization Tutorial

Protecting Sensitive Information in AI Interactions: The Critical Role of Content Filtering

Generative AI systems like OpenAI's ChatGPT have revolutionized how we interact with technology, but they come with a significant risk: the inadvertent exposure of sensitive information (OWASP LLM06). Without proper safeguards, these AI platforms may receive, process, and potentially retain confidential data, including:

Personally Identifiable Information (PII)
Protected Health Information (PHI)
Financial details (e.g., credit card numbers, bank account information)
Intellectual property

Real-world scenarios highlight the urgency of this issue:

Support Chatbots: Imagine a customer service AI powered by OpenAI. Users, in their quest for help, might unknowingly share credit card numbers or Social Security information. Without content filtering, this sensitive data could be transmitted to OpenAI and logged in your support system.
Healthcare Applications: Consider an AI-moderated health app that processes patient and doctor communications. These exchanges may contain protected health information (PHI), which, if not filtered, could be unnecessarily exposed to the AI system.

Content filtering is a crucial safeguard, removing sensitive data before it reaches the AI system. This ensures that only necessary, non-sensitive information is used for content generation, effectively preventing the spread of confidential data to AI platforms.

Steps to Identify and Sanitize ChatGPT Prompts

Let's look at a Python example using OpenAI and Nightfall's Python SDK. You can download this sample code here.

import os
from nightfall import Confidence, DetectionRule, Detector, RedactionConfig, MaskConfig, Nightfall
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Example user input with sensitive information
user_input = "My credit card number is 4916-6734-7572-5015 and the card is getting declined. My transaction number is 4916-6734-7572-5015."
payload = [user_input]

# 1) Get the Nightfall API key
nightfall = Nightfall() # By default Nightfall will read the NIGHTFALL_API_KEY environment variable

print("\nHere's the user's question before sanitization:\n", user_input)

# 2) Configure Nightfall detection and redaction
detection_rule = [DetectionRule(
    [Detector(
        min_confidence=Confidence.VERY_LIKELY,
        nightfall_detector="CREDIT_CARD_NUMBER",
        display_name="Credit Card Number",
        redaction_config=RedactionConfig(
            remove_finding=False,
            mask_config=MaskConfig(
                masking_char="X",
                num_chars_to_leave_unmasked=4,
                mask_right_to_left=True,
                chars_to_ignore=["-"])
        )
    )]
)]

# 3) Classify, Redact, Filter Your User Input
# Send the message to Nightfall to scan it for sensitive data
# Nightfall returns the sensitive findings and a copy of your input payload with sensitive data redacted
findings, redacted_payload = nightfall.scan_text(
    payload,
    detection_rules=detection_rule
)

# If the message has sensitive data, use the redacted version otherwise, use the original message
if redacted_payload[0]:
    user_input_sanitized = redacted_payload[0]
else:
    user_input_sanitized = payload[0]

print("\nHere's the user's question after sanitization:\n", user_input_sanitized)

# 4) Send prompt to OpenAI model for AI-generated response
completion = client.chat.completions.create(model="gpt-4",
messages=[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": user_input_sanitized}
],
max_tokens=1024)

print("\nHere's a generated response you can send the customer:\n", completion.choices[0].message.content)

Step 1: Setup Nightfall

Get an API key for Nightfall and set environment variables. Learn more about creating an API key here.

Step 2: Configure Detection

Create an inline detection rule with the Nightfall API or SDK client, or use a pre-configured detection rule in the Nightfall account. In this example, we will do the former.

If you specify a redaction config, you can automatically get de-identified data back, including a reconstructed, redacted copy of your original payload. Learn more about redaction here.

Step 3: Classify, Redact, Filter Your User Input

Send your outgoing prompt text in a request payload to the Nightfall API text scan endpoint. The Nightfall API will respond with detections and the redacted payload.

For example, let’s say we send Nightfall the following:

The customer said: 'My credit card number is 4916-6734-7572-5015 and the card is getting declined.' How should I respond to the customer?

We get back the following redacted text:

The customer said: 'My credit card number is XXXX-XXXX-XXXX-5015 and the card is getting declined.' How should I respond to the customer?

Step 4: Send Redacted Prompt to OpenAI

Review the response to see if Nightfall has returned sensitive findings:

If there are sensitive findings:
- You can choose to specify a redaction config in your request so that sensitive findings are redacted automatically.
- Without a redaction config, you can simply break out of the conditional statement, throw an exception, etc.
If no sensitive findings or you chose to redact findings with a redaction config:
- Initialize the OpenAI SDK client (e.g. OpenAI Python client), or use the API directly to construct a request.
- Construct your outgoing prompt.
- If you specified a redaction config and want to replace raw sensitive findings with redacted ones, use the redacted payload that Nightfall returns to you.
- Use the OpenAI API or SDK client to send the prompt to the AI model.

Safely Leveraging Generative AI

You'll see that the message we originally intended to send had sensitive data:

The customer said: 'My credit card number is 4916-6734-7572-5015 and the card is getting declined. My transaction number is 4916-6734-7572-5015.' How should I respond to the customer?

And the message we ultimately sent was redacted, and that’s what we sent to OpenAI:

The customer said: 'My credit card number is XXXX-XXXX-XXXX-5015 and the card is getting declined. My transaction number is 4916-6734-7572-5015.' How should I respond to the customer?

OpenAI sends us the same response either way because it doesn’t need to receive sensitive data to generate a cogent response. This means we were able to leverage ChatGPT just as easily but we didn’t risk sending OpenAI any unnecessary sensitive data. Now, you are one step closer to leveraging generative AI safely in an enterprise setting.

LangChain Prompt Sanitization Tutorial

LangChain Tutorial: Integrating Nightfall for Secure Prompt Sanitization

Generative AI systems like OpenAI's ChatGPT have revolutionized how we interact with technology, but they come with a significant risk: the inadvertent exposure of sensitive information (). Without proper safeguards, these AI platforms may receive, process, and potentially retain confidential data, including:

Personally Identifiable Information (PII)
Protected Health Information (PHI)
Financial details (e.g., credit card numbers, bank account information)
Intellectual property

Real-world scenarios highlight the urgency of this issue:

Support Chatbots: Imagine a customer service AI powered by OpenAI. Users, in their quest for help, might unknowingly share credit card numbers or Social Security information. Without content filtering, this sensitive data could be transmitted to OpenAI and logged in your support system.
Healthcare Applications: Consider an AI-moderated health app that processes patient and doctor communications. These exchanges may contain protected health information (PHI), which, if not filtered, could be unnecessarily exposed to the AI system.

Python Example

Let's examine this in a Python example using the LangChain, Anthropic, and Nightfall Python SDKs. You can download this sample code .

Step 1: Setup Nightfall

Install the necessary packages using the command line:

Set up environment variables. Create a .env file in your project directory:

Step 2: Configure Detection

Step 3: Classify, Redact, Filter Your User Input

to integrate content filtering into our LangChain pipeline seamlessly. We'll create a custom LangChain component for Nightfall sanitization. This allows us to seamlessly integrate content filtering into our LangChain pipeline.

Explanation

We start by importing necessary modules and loading environment variables.
We initialize the Nightfall client and define detection rules for credit card numbers.
The NightfallSanitizationChain class is a custom LangChain component that handles content sanitization using Nightfall.
We set up the Anthropic LLM and create a prompt template for customer service responses.
We create separate chains for sanitization and response generation, then combine them using SimpleSequentialChain.
The process_customer_input function provides an easy-to-use interface for our chain.

Error Handling and Logging

In a production environment, you might want to add more robust error handling and logging. For example:

Usage

To use this script, you can either run it directly or import the process_customer_input function in another script.

Running the Script Directly

Simply run the script:

This will process the example customer input and print the sanitized input and final response.

Using in Another Script

You can import the process_customer_input function in another script:

Expected Output

What does success look like?

If the example runs properly, you should expect to see an output demonstrating the sanitization process and the final response from Claude. Here's what the output might look like:

SaaS Protection

This section consists of various documents that assist you in scanning various popular SaaS applications using Nightfall APIs.

Nightfall Use Cases

How do I know my data is secure?

At Nightfall, data security and privacy are our top priorities. We have implemented stringent security measures to protect your sensitive data at every stage of the scanning process. All data transmitted to our API is encrypted in transit using industry-standard protocols. We adhere to best practices for secure coding, undergo regular security audits, and maintain compliance with relevant security standards. Visit our security and compliance page at nightfall.ai/security for more details on our commitment to data protection.

Search violations

Fetch a list of violations based on some filters

GEThttps://api.nightfall.ai/dlp/v1/violations/search

Query parameters

createdAfterinteger

Unix timestamp in seconds, filters records created ≥ the value, defaults to -90 days UTC

createdBeforeinteger

Unix timestamp in seconds, filters records created < the value, defaults to end of the current day UTC

updatedAfterinteger

Unix timestamp in seconds, filters records updated > the value

limitinteger

The maximum number of records to be returned in the response

pageTokenstring

Cursor for getting the next page of results

sortViolationSearchSortKey (enum)

Sort key and direction, defaults to descending order by creation time

TIME_ASCTIME_DESCRELEVANCERISK_ASCRISK_DESC

query*string

The query containing filter clauses

Search query language

Query structure and terminology

A query clause consists of a field followed by an operator followed by a value:

term	value
clause	user_email:"amy@rocketrides.io"
field	user_email
operator	:
value	amy@rocketrides.io

You can combine multiple query clauses in a search by separating them with a space.

Field types, substring matching, and numeric comparators

Every search field supports exact matching with a :. Certain fields such as user_email and user_name support substring matching.

Quotes

You may use quotation marks around string values. Quotation marks are required in case the value contains spaces. For example:

user_mail:john@example.com
user_name:"John Doe"

Special Characters

+ - && || ! ( ) { } [ ] ^ " ~ * ? : are special characters need to be escaped using \. For example:

a value like (1+1):2 should be searched for using \(1\+1)\:2

Search Syntax

The following table lists the syntax that you can use to construct a query.

SYNTAX	USAGE	DESCRIPTION	EXAMPLES
`:`	field:value	Exact match operator (case insensitive)	`state:"pending"` returns records where the currency is exactly `"PENDING"` in a case-insensitive comparison
(space)	field1:value1 field2:value2	The query returns only records that match both clauses	`state:active slack.channel_name:general`
`OR`	field:(value1 OR value2)	The query returns records that match either of the values (case insensitive)	`state:(active OR pending)`

Query Fields

param	description
state	the violation states to filter on
user_email	the emails of users updating the resource resulting in the violation
user_name	the usernames of users updating the resource resulting in the violation
integration_name	the integration to filter on
confidence	one or more likelihoods/confidences
policy_id	one or more policy IDs
detection_rule_id	one or more detection rule IDs
detector_id	one or more detector IDs
risk_label	the risk label to filter on
risk_source	the risk determination source to filter on
slack.channel_name	the slack channel names to filter on
slack.channel_id	the slack channel IDs to filter on
slack.workspace	the slack workspaces to filter on
confluence.parent_page_name	the names of the parent pages in confluence to filter on
confluence.space_name	the names of the spaces in confluence to filter on
gdrive.drive	the drive names in gdrive to filter on
jira.project_name	the jira project names to filter on
jira.ticket_number	the jira ticket numbers to filter on
salesforce.org_name	the salesforce organization names to filter on
salesforce.object	the salesforce object names to filter on
salesforce.record_id	the salesforce record IDs to filter on
github.author_email	the github author emails to filter on
github.branch	the github branches to filter on
github.commit	the github commit ids to filter on
github.org	the github organizations to filter on
github.repository	the github repositories to filter on
github.repository_owner	the github repository owners to filter on
teams.team_name	the m365 teams team names to filter on
teams.channel_name	the m365 teams channels to filter on
teams.channel_type	the m365 teams channel types to filter on
teams.team_sensitivity	the m365 teams sensitivities to filter on
teams.sender	the m365 teams senders to filter on
teams.msg_importance	the m365 teams importance to filter on
teams.msg_attachment	the m365 teams attachment names to filter on
teams.chat_id	the m365 teams chat ID to filter on
teams.chat_type	the m365 teams chat type to filter on
teams.chat_topic	the m365 teams chat topic to filter on
teams.chat_participant	the m365 teams chat participant's display name to filter on
onedrive.drive_owner	drive owner's display name to filter on
onedrive.drive_owner_email	drive owner's email to filter on
onedrive.file_name	the file name to filter on
onedrive.created_by	the m365 user, who created the file in the drive, display name to filter on
onedrive.created_by_email	the m365 users, who created the file in the drive, email to filter on
onedrive.modified_by	the m365 users, who last modified the file in the drive, display name to filter on
onedrive.modified_by_email	the m365 users, who last modified the file in the drive, email to filter on
zendesk.ticket_status	the zendesk ticket status to filter on
zendesk.ticket_title	the zendesk ticket titles to filter on
zendesk.ticket_group_assignee	the zendesk ticket assignee groups to filter on
zendesk.current_user_role	the zendesk ticket current assignee user's roles to filter on
notion.created_by	the names of the users creating a resource in notion to filter on
notion.last_edited_by	the names of the users editing a resource in notion to filter on
notion.page_title	the page names in notion to filter on
notion.workspace_name	the workspace names in notion to filter on
gmail.user_name	the names of the sender to filter on
gmail.from	the email of sender to filter on
gmail.to	the email or name of recipients to filter on
gmail.cc	the email or name of cc to filter on
gmail.bcc	the email or name of bcc to filter on
gmail.thread_id	the thread id of email to filter on
gmail.subject	the subject of email to filter on
gmail.attachment_name	the name of attachment to filter on
gmail.attachment_type	the type of attachment to filter on

Response

Successful response

Headers

X-Rate-Limit-Remaininginteger

How many remaining requests you can make within the next second before being throttled

X-Quota-Remaininginteger

How many remaining requests you can make within the next quota period

X-Quota-Period-Endstring (date-time)

When the current quota period expires

Body

violationsarray of Violation (object)

idstring

The violation id

integrationIntegration (enum)

SLACKGDRIVEJIRACONFLUENCESALESFORCEZENDESKBROWSERNOTIONGITHUBM365_TEAMSM365_ONEDRIVEINLINE_EMAIL

createdAtinteger

Unix timestamp when the violation was created

updatedAtinteger

Unix timestamp when the violation was updated

possibleActionsarray of Action (enum)

Possible actions for the violation

itemsAction (enum)

ACKNOWLEDGEREDACTQUARANTINEALLOW_QUARANTINEREJECT_QUARANTINEREMOVE_INTERNAL_USERSREMOVE_EXTERNAL_USERSDOMAIN_WIDE_LINKRESTRICTED_LINKDELETEIGNORENOTIFY_SLACKNOTIFY_EMAILUNACKNOWLEDGEDISABLE_DOWNLOADCREATE_JIRA_ISSUEMARK_AS_PRIVATEDELETE_ATTACHMENTMANUAL_UNDORESOLVENOTIFY_TEAMSNOTIFY_GITHUBSOFT_DELETEHARD_DELETERESTRICT_TO_OWNER

stateViolationState (enum)

ACTIVEPENDINGRESOLVEDEXPIRED

resourceLinkstring

The link to the resource on the integration

metadataMetadata (object)

slackMetadataSlackMetadata (object)

locationstring

The channel name in case of a message in a channel

locationTypestring

Type of location

usernamestring

User name

userIDstring

ID - user

messagePermalinkstring

Link to message

locationMembersarray of string

Members for the location

itemsstring

locationMemberCountinteger

Count of members for the location

channelIDstring

ID - channel

workspaceNamestring

Name of workspace

confluenceMetadataConfluenceMetadata (object)

itemNamestring

Name of item

itemTypestring

Type of item

isArchivedboolean

Archived status

createdAtinteger

Unix timestamp

updatedAtinteger

Unix timestamp

labelsarray of string

List of labels

itemsstring

spaceNamestring

Name of space

spaceKeystring

Key of space

spaceNameLinkstring

Link of space

parentPageNamestring

Parent page

authorNamestring

Name of author

authorEmailstring

Email of author

authorNameLinkstring

Link of author name

permalinkstring

Link to resource

confluenceIDstring

ID - Confluence internal

confluenceUserIDstring

ID - Confluence user

itemVersioninteger

Version of item

parentPageIDstring

ID - parent page

parentVersioninteger

Version of parent page

gdriveMetadataGdriveMetadata (object)

fileIDstring

ID of file

fileNamestring

The name of the file

fileTypestring

Type of file

fileSizestring

File size

fileLinkstring

Link to file

permissionSettingstring

Permissions

sharingExternalUsersarray of string

User list shared with - external

itemsstring

sharingInternalUsersarray of string

User list shared with - internal

itemsstring

canViewersDownloadboolean

Available for viewers to download

fileOwnerstring

File owner

isInTrashboolean

In trash

createdAtinteger

Unix timestamp, when the file was created

updatedAtinteger

Unix timestamp, when the file was updated

drivestring

Drive name

updatedBystring

Updated by user

jiraMetadataJiraMetadata (object)

projectNamestring

Name of project

ticketNumberstring

Ticket number

projectTypestring

Type of project

issueIDstring

ID for the issue

projectLinkstring

Link to project

ticketLinkstring

Link to ticket

commentLinkstring

Link to comment

attachmentLinkstring

Link to attachment

githubMetadataGithubMetadata (object)

branchNamestring

Branch on which violation occurred

organizationstring

Name of the organization or username in case of an individual account

repositorystring

Name of the repository

authorEmailstring

Email of the user who pushed the changes to GitHub

authorUsernamestring

Username of the user who pushed the changes to GitHub

createdAtinteger

Unix timestamp

isRepoPrivateboolean

Boolean to check if the repo is private or public

filePathstring

Path of the file on which violation occurred

githubPermalinkstring

Permalink to the version of the file where sensitive content was identified

repositoryOwnerstring

Owner of the repository

githubRepoLinkstring

Link to the repository

salesforceMetadataSalesforceMetadata (object)

orgNamestring

Name of the Salesforce organization

recordIDstring

ID of the record

objectNamestring

Name of the object

contentTypestring

Attachment or Object

userIDstring

ID of the user

userNamestring

Salesforce username of the author

updatedAtinteger

Unix timestamp when the object was last updated

fieldsarray of string

Fields of the Object

itemsstring

fileTypestring

File Type

attachmentLinkstring

Link to the attachment

attachmentNamestring

Name of the attachment

objectLinkstring

Link to the object

zendeskMetadataZendeskMetadata (object)

ticketStatusstring

Status of the ticket

ticketTitlestring

Title of the ticket

ticketRequestorstring

Ticket requested by

ticketGroupAssigneestring

Group the ticket is assigned to

ticketAgentAssigneestring

Agent the ticket is assigned to

currentUserRolestring

User role

ticketIDinteger

ID of the ticket

ticketFollowersarray of string

Followers of the ticket

itemsstring

ticketTagsstring

Tags for the ticket

createdAtinteger

Unix timestamp

UpdatedAtinteger

Unix timestamp

locationstring

Location

subLocationstring

Sub-location

ticketCommentIDinteger

ID - ticket comment

ticketGroupIDinteger

ID - ticket group

ticketGroupLinkstring

Link to the ticket group

ticketAgentIDinteger

ID - ticket agent

ticketAgentLinkstring

Link - ticket agent

ticketEventstring

Ticket event

userRolestring

Role of the user

attachmentNamestring

Name of the attachment

attachmentLinkstring

Link for the attachment

notionMetadataNotionMetadata (object)

createdBystring

Page creator

updatedBystring

Page update by

workspaceNamestring

Workspace name

workspaceLinkstring

Link to workspace

pageIDstring

ID of the page

pageTitlestring

Title of the page

createdAtinteger

Unix timestamp

updatedAtinteger

Unix timestamp

privatePageLinkstring

Private page link

publicPageLinkstring

Public page link

sharedExternallyboolean

Externally shared state

attachmentIDstring

ID of the attachment

browserMetadataBrowserMetadata (object)

locationstring

Page URL where the extension is launched

subLocationstring

Specific location on the page

browserNamestring

Browser type

userCommentstring

Remediation comment from the user

m365TeamsMetadataM365TeamsMetadata (object)

teamNamestring

Name of the team containing the channel where the message was sent

tenantIDstring

ID of the tenant

tenantDomainstring

Domain name of the tenant

teamIDstring

ID of the team containing the channel where the message was sent

teamVisibilitystring

Visibility of the team containing the channel where the message was sent

teamWebURLstring

Web URL of the team containing the channel where the message was sent

channelIDstring

ID of the channel where the message was sent

channelNamestring

Name of the channel where the message was sent

channelTypestring

Type of the channel where the message was sent

channelWebURLstring

Web URL of the channel where the message was sent

messageIDstring

ID of the message

createdAtinteger

Unix timestamp

updatedAtinteger

Unix timestamp

chatMessageSenderstring

Sender of the chat message

userIDstring

ID of the user who sent the message

userPrincipalNamestring

Principal name of the user who sent the message

attachmentsarray of M365TeamsAttachment (object)

Attachment details

attachmentIDstring

ID of the attachment present in the message

attachmentNamestring

Name of the attachment present in the message

attachmentURLstring

URL of the attachment present in the message

chatMessageImportancestring

Importance of the sent message

chatIDstring

ID of the chat conversation

chatTypestring

Type of the chat conversation (one-on-one, group, meeting)

chatTopicstring

Topic or subject of the chat conversation

chatParticipantsarray of M365TeamsChatParticipant (object)

userIDstring

ID of the user participating in the chat conversation

emailstring

email address of the chat participant

displayNamestring

display name of the chat participant

m365OnedriveMetadataM365OnedriveMetadata (object)

tenantIDstring

ID of the tenant

tenantDomainstring

Domain name of the tenant

driveItemIDstring

ID of the drive item

driveItemNamestring

Name of the drive item

driveItemURLstring

URL of the drive item

driveItemMimeTypestring

Mime type of the drive item

driveItemSizeinteger

Size of the drive item in bytes

parentPathstring

Path to the drive item relative to the root of the drive

createdByIDstring

ID of the user who created the drive item

updatedByEmailstring

Email of the user who last updated the drive item

updatedByIDstring

ID of the user who last updated the drive item

updatedByNamestring

Name of the user who last updated the drive item

createdAtinteger

Unix timestamp when the drive item was created

updatedAtinteger

Unix timestamp when the drive item was last updated

specialFolderNamestring

Name of the special folder if drive item is inside one

driveIDstring

ID of the drive where the drive item is present

driveOwnerNamestring

Name of user who owns the drive where the drive item is present

driveOwnerEmailstring

Email of user who owns the drive where the drive item is present

driveOwnerIDstring

ID of user who owns the drive where the drive item is present

inlineEmailMetadataInlineEmailMetadata (object)

domainstring

Domain of the company where email was sent from

user_namestring

User Name who sent the email

fromstring

Email of the sender

toarray of string

Recipients of the Email

itemsstring

ccarray of string

Recipients mentioned in the CC field of the Email

itemsstring

bccarray of string

Recipients mentioned in the BCC field of the Email

itemsstring

subjectstring

Subject of the email

sent_atinteger

Unix timestamp of when email was sent

thread_idstring

ThreadID of the email

attachment_namestring

Name of the attachment

attachment_typestring

Type of attachment

fileDetailsFileDetails (object)

fileNamestring

The name of the file

mimeTypestring

The file mime type

permalinkstring

The link to the resource on the integration

policyUUIDsarray of string

Policies violated

itemsstring

detectionRuleUUIDsarray of string

Detection rules triggered

itemsstring

detectorUUIDsarray of string

Detectors triggered

itemsstring

riskViolationRisk (enum)

UNSPECIFIEDLOWMEDIUMHIGHCRITICALNO_RISK

riskSourceViolationRiskSource (enum)

NIGHTFALLCUSTOMOVERRIDDEN

riskScorenumber (float)

The calculated score of the risk for this violation

userInfoUserInformation (object)

usernamestring

Username as on the integration

userEmailstring

User email as on the integration, may be empty

nextPageTokenstring

Next page cursor, omitted if end of results reached

Request

const response = await fetch('https://api.nightfall.ai/dlp/v1/violations/search?query=text', {
    method: 'GET',
    headers: {},
});
const data = await response.json();

Response

{
  "violations": [
    {
      "id": "text",
      "integration": "SLACK",
      "possibleActions": [
        "ACKNOWLEDGE"
      ],
      "state": "ACTIVE",
      "resourceLink": "text",
      "metadata": {
        "slackMetadata": {
          "location": "text",
          "locationType": "text",
          "username": "text",
          "userID": "text",
          "messagePermalink": "text",
          "locationMembers": [
            "text"
          ],
          "channelID": "text",
          "workspaceName": "text"
        },
        "confluenceMetadata": {
          "itemName": "text",
          "itemType": "text",
          "isArchived": false,
          "labels": [
            "text"
          ],
          "spaceName": "text",
          "spaceKey": "text",
          "spaceNameLink": "text",
          "parentPageName": "text",
          "authorName": "text",
          "authorEmail": "text",
          "authorNameLink": "text",
          "permalink": "text",
          "confluenceID": "text",
          "confluenceUserID": "text",
          "parentPageID": "text"
        },
        "gdriveMetadata": {
          "fileID": "text",
          "fileName": "text",
          "fileType": "text",
          "fileSize": "text",
          "fileLink": "text",
          "permissionSetting": "text",
          "sharingExternalUsers": [
            "text"
          ],
          "sharingInternalUsers": [
            "text"
          ],
          "canViewersDownload": false,
          "fileOwner": "text",
          "isInTrash": false,
          "drive": "text",
          "updatedBy": "text"
        },
        "jiraMetadata": {
          "projectName": "text",
          "ticketNumber": "text",
          "projectType": "text",
          "issueID": "text",
          "projectLink": "text",
          "ticketLink": "text",
          "commentLink": "text",
          "attachmentLink": "text"
        },
        "githubMetadata": {
          "branchName": "text",
          "organization": "text",
          "repository": "text",
          "authorEmail": "text",
          "authorUsername": "text",
          "isRepoPrivate": false,
          "filePath": "text",
          "githubPermalink": "text",
          "repositoryOwner": "text",
          "githubRepoLink": "text"
        },
        "salesforceMetadata": {
          "orgName": "text",
          "recordID": "text",
          "objectName": "text",
          "contentType": "text",
          "userID": "text",
          "userName": "text",
          "fields": [
            "text"
          ],
          "fileType": "text",
          "attachmentLink": "text",
          "attachmentName": "text",
          "objectLink": "text"
        },
        "zendeskMetadata": {
          "ticketStatus": "text",
          "ticketTitle": "text",
          "ticketRequestor": "text",
          "ticketGroupAssignee": "text",
          "ticketAgentAssignee": "text",
          "currentUserRole": "text",
          "ticketFollowers": [
            "text"
          ],
          "ticketTags": "text",
          "location": "text",
          "subLocation": "text",
          "ticketGroupLink": "text",
          "ticketAgentLink": "text",
          "ticketEvent": "text",
          "userRole": "text",
          "attachmentName": "text",
          "attachmentLink": "text"
        },
        "notionMetadata": {
          "createdBy": "text",
          "updatedBy": "text",
          "workspaceName": "text",
          "workspaceLink": "text",
          "pageID": "text",
          "pageTitle": "text",
          "privatePageLink": "text",
          "publicPageLink": "text",
          "sharedExternally": false,
          "attachmentID": "text"
        },
        "browserMetadata": {
          "location": "text",
          "subLocation": "text",
          "browserName": "text",
          "userComment": "text"
        },
        "m365TeamsMetadata": {
          "teamName": "text",
          "tenantID": "text",
          "tenantDomain": "text",
          "teamID": "text",
          "teamVisibility": "text",
          "teamWebURL": "text",
          "channelID": "text",
          "channelName": "text",
          "channelType": "text",
          "channelWebURL": "text",
          "messageID": "text",
          "chatMessageSender": "text",
          "userID": "text",
          "userPrincipalName": "text",
          "attachments": [
            {
              "attachmentID": "text",
              "attachmentName": "text",
              "attachmentURL": "text"
            }
          ],
          "chatMessageImportance": "text",
          "chatID": "text",
          "chatType": "text",
          "chatTopic": "text",
          "chatParticipants": [
            {
              "userID": "text",
              "email": "text",
              "displayName": "text"
            }
          ]
        },
        "m365OnedriveMetadata": {
          "tenantID": "text",
          "tenantDomain": "text",
          "driveItemID": "text",
          "driveItemName": "text",
          "driveItemURL": "text",
          "driveItemMimeType": "text",
          "parentPath": "text",
          "createdByID": "text",
          "updatedByEmail": "text",
          "updatedByID": "text",
          "updatedByName": "text",
          "specialFolderName": "text",
          "driveID": "text",
          "driveOwnerName": "text",
          "driveOwnerEmail": "text",
          "driveOwnerID": "text"
        },
        "inlineEmailMetadata": {
          "domain": "text",
          "user_name": "text",
          "from": "text",
          "to": [
            "text"
          ],
          "cc": [
            "text"
          ],
          "bcc": [
            "text"
          ],
          "subject": "text",
          "thread_id": "text",
          "attachment_name": "text",
          "attachment_type": "text"
        }
      },
      "fileDetails": {
        "fileName": "text",
        "mimeType": "text",
        "permalink": "text"
      },
      "policyUUIDs": [
        "text"
      ],
      "detectionRuleUUIDs": [
        "text"
      ],
      "detectorUUIDs": [
        "text"
      ],
      "risk": "UNSPECIFIED",
      "riskSource": "NIGHTFALL",
      "riskScore": 0,
      "userInfo": {
        "username": "text",
        "userEmail": "text"
      }
    }
  ],
  "nextPageToken": "text"
}