Sample Datasets

Use the sample datasets provided by Nightfall to test its detection capabilities.

The following datasets can be used to test Nightfall's advanced AI-based detection capabilities. The data has been fully de-identified and can be used to test any data loss prevention (DLP) platform.

PII Samples

This dataset showcases Nightfall’s ability to detect Personally Identifiable Information (PII) with exceptional precision and minimal noise across text, spreadsheets, and screenshots. Samples include names, U.S. social security numbers, driver’s license numbers, and more. See Image ID Samples for image samples of driver’s licenses and other ID types.

PII clipboard sample
Hi Support - My name is Julie Walsh. I tried to purchase a life insurance policy online yesterday however the site said, "an unexpected error occurred." I tried to pay with a credit card. My DOB is 02-10-97 and SSN is 523-23-6145. Could you take a look on your end?
This .ZIP contains the positive samples shown above, along with additional examples and negative lookalike samples for testing.

PCI / Banking Samples

This sample dataset demonstrates Nightfall's ability to detect sensitive banking and payment information with high precision and low noise in text, spreadsheets, and screenshots. Samples include positive and negative examples of credit card numbers, routing numbers, IBAN codes, and SWIFT codes.

PCI clipboard sample
Hi Support - This is Julie Walsh. I tried to purchase an electric bike using my credit card 6771-8979-6102-7961. The app is telling me the card was declined. Could you take a look on your end?
This .ZIP contains the positive samples shown above, along with additional examples and negative lookalike samples for testing.
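To illustrate what separates a positive card sample from a negative lookalike, here is a minimal Python sketch of the Luhn checksum, a check that payment card numbers are generally expected to pass. This is not a description of how Nightfall detects card numbers; the function name `luhn_valid`, the 12-digit minimum, and the lookalike number (the sample above with its last digit changed) are assumptions made for this example.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Keep only digits so separators such as "-" or spaces are ignored.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:  # assumed lower bound for a card-like number
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# The card number from the clipboard sample above.
print(luhn_valid("6771-8979-6102-7961"))  # True
# Hypothetical lookalike: same number with the last digit changed.
print(luhn_valid("6771-8979-6102-7962"))  # False
```

A checksum alone cannot tell whether a number is a real card, so it is only useful here as a quick sanity check on which samples are structurally valid.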

API Keys

Nightfall AI's fine-tuned API key detection LLM detects secrets with high precision and dramatically reduces false positives.

API key clipboard sample
import stripe
stripe.api_key = "sk_live_4eC39HqLyjWDarjtT1zdp7dcTYooMQauvdEDq54NiTphI7jx"

stripe.Charge.create(
    amount=2000,
    currency="usd",
    source="tok_amex",  # obtained with Stripe.js
    description="Charge for [email protected]"
)
This .ZIP contains the positive samples shown above, along with additional examples and negative lookalike samples for testing.

Testing note: If a key's status is marked "Active", rotate the key immediately. Not all vendors provide an "Inactive" response code. In those cases, or if the vendor service is offline, the finding status is marked "Unverified".
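The status rules in the testing note can be sketched as a small decision function. This is a hypothetical illustration only, not Nightfall's implementation: the names `KeyStatus` and `classify_key_status` and both parameters are assumptions introduced for this example.

```python
from enum import Enum
from typing import Optional


class KeyStatus(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    UNVERIFIED = "Unverified"


def classify_key_status(vendor_reachable: bool,
                        key_accepted: Optional[bool]) -> KeyStatus:
    """Map a (hypothetical) vendor check result to a finding status.

    key_accepted is True if the vendor accepted the key, False if the
    vendor explicitly reported it as inactive, and None if the vendor
    gave no definitive answer.
    """
    # Vendor offline, or no definitive response: the key cannot be verified.
    if not vendor_reachable or key_accepted is None:
        return KeyStatus.UNVERIFIED
    return KeyStatus.ACTIVE if key_accepted else KeyStatus.INACTIVE


status = classify_key_status(vendor_reachable=True, key_accepted=True)
if status is KeyStatus.ACTIVE:
    print("Rotate this key immediately.")
```

Treating a missing answer as "Unverified" rather than "Inactive" matches the note above: the absence of a response is not evidence that a key is safe.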

Password Samples

Nightfall AI detects passwords shared in conversational text and code.

Password clipboard sample
Alex, Here are the credentials to get onto the new training platform. 
loginid=fitnessFreak99 passphrase=Activ3Life22!
This .ZIP contains the positive samples shown above, along with additional examples and negative lookalike samples for testing.

PHI Samples

Nightfall’s PHI model surpasses traditional entity-based detectors by combining multiple signals, including PII and medical indicators, and analyzing their relationships and context so that only patient health-related content is flagged.

PHI clipboard sample
The patient, Anthony Smith (DOB 05/10/1993), presents with a sustained elevated heart rate. 

The patient has a past medical history of atrial fibrillation. 
Attending Physician: Harwood, Andrew MD 
This .ZIP contains the positive PHI samples shown above, along with additional examples and negative lookalike samples for testing.

Crypto Key Samples

This sample dataset demonstrates Nightfall's ability to detect cryptographic keys.

Cryptographic key clipboard sample
-----BEGIN EC PRIVATE KEY-----
MHcCAQEEIGNDB1AYI5yJ4ysmzfnMzAe/gFJup+pY0qt7U7SaQiK/oAoGCCqGSM49
AwEHoUQDQgAEN+yEGcEGA6x31zryD4HUcbHhNVS8nkzhlNR4NWJN2HsCzjBvpq0j
e8CV5iMmLaaQA5BFng0ZbGUPOgLNHhVq1g==
-----END EC PRIVATE KEY-----
This .ZIP contains the positive samples shown above, along with additional examples and negative lookalike samples for testing.
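For comparison when testing, a simple pattern-based baseline for PEM-encoded private keys might look like the sketch below. It is an assumption-laden illustration, not Nightfall's detector: the pattern `PEM_PRIVATE_KEY` and the function `find_private_key_labels` are names invented for this example, and the public-key string is a made-up negative lookalike.

```python
import re
from typing import List

# Matches a PEM block whose label ends in "PRIVATE KEY", with a base64 body
# and an END line that repeats the same label.
PEM_PRIVATE_KEY = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]*PRIVATE KEY)-----"
    r"(?P<body>[A-Za-z0-9+/=\s]+?)"
    r"-----END (?P=label)-----"
)


def find_private_key_labels(text: str) -> List[str]:
    """Return the label of every PEM private key block found in `text`."""
    return [m.group("label") for m in PEM_PRIVATE_KEY.finditer(text)]


# The cryptographic key from the clipboard sample above.
sample = """-----BEGIN EC PRIVATE KEY-----
MHcCAQEEIGNDB1AYI5yJ4ysmzfnMzAe/gFJup+pY0qt7U7SaQiK/oAoGCCqGSM49
AwEHoUQDQgAEN+yEGcEGA6x31zryD4HUcbHhNVS8nkzhlNR4NWJN2HsCzjBvpq0j
e8CV5iMmLaaQA5BFng0ZbGUPOgLNHhVq1g==
-----END EC PRIVATE KEY-----"""
print(find_private_key_labels(sample))  # ['EC PRIVATE KEY']

# Hypothetical negative lookalike: a public key block is not flagged.
lookalike = "-----BEGIN PUBLIC KEY-----\nQUJD\n-----END PUBLIC KEY-----"
print(find_private_key_labels(lookalike))  # []
```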

Image ID Samples

Nightfall’s computer vision (CV) transformer model outperforms legacy Optical Character Recognition (OCR) text scanning at identifying driver’s licenses, passports, credit cards, and U.S. social security cards, even when images are degraded (rotated, glossy, low contrast, blurry, skewed, or cropped).

This .ZIP contains the positive samples shown above, along with additional examples.

All Sample Datasets

This ZIP file includes all positive and negative lookalike samples across PII, PCI, Banking, PHI, credentials, and image-based datasets. It’s designed to help you evaluate Nightfall’s detection precision and compare performance in your DLP proof of value (POV) testing.
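When comparing platforms in a POV, one straightforward way to score a run is to compute precision and recall over the positive and negative lookalike samples. The sketch below assumes you have recorded, for each sample, whether it is a positive sample and whether the platform flagged it. The function name `precision_recall` and the example outcomes are illustrative assumptions, not real results.

```python
from typing import Iterable, Tuple


def precision_recall(results: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (precision, recall) from (is_positive_sample, was_flagged) pairs."""
    rows = list(results)  # materialize so the data can be scanned more than once
    tp = sum(1 for positive, flagged in rows if positive and flagged)
    fp = sum(1 for positive, flagged in rows if not positive and flagged)
    fn = sum(1 for positive, flagged in rows if positive and not flagged)
    # Guard against division by zero when nothing was flagged or no positives exist.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up outcomes for five samples: (is_positive_sample, was_flagged).
outcomes = [
    (True, True),    # positive sample, detected
    (True, True),    # positive sample, detected
    (True, False),   # positive sample, missed
    (False, False),  # negative lookalike, correctly ignored
    (False, True),   # negative lookalike, false positive
]
precision, recall = precision_recall(outcomes)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Flagged negative lookalikes lower precision, while missed positive samples lower recall, so both sample types are needed for a fair comparison.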
