Say you have a number of files containing customer or patient data, and you are not sure which of them are OK to share over a less secure channel. With Nightfall’s API, you can check whether a file contains sensitive PII, PHI, or PCI data.
To make a request to the Nightfall API you will need:
A Nightfall API key
A list of data types you wish to scan for
Data to scan. Note that the API interprets data as plaintext, so you may pass it in any structured or unstructured format.
To run the following API call, we will be using Python's built-in json and os modules, plus the third-party requests library.
import json
import os
import requests
First we define the endpoint we want to reach with our API call.
endpoint = 'https://api.nightfall.ai/v1/scan'
Next we define the headers of our API request. In this example, we have our API key set via an environment variable called "NIGHTFALL_API_KEY". Your API key should never be hard-coded directly into your script.
h = {
    'Content-Type': 'application/json',
    'x-api-key': os.getenv('NIGHTFALL_API_KEY')
}
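Note that os.getenv returns None when the variable is unset, so the request would go out without a usable key. A minimal sketch of failing early instead, assuming a hypothetical helper named build_headers (it is not part of the Nightfall API):

```python
import os


def build_headers(env=os.environ):
    """Return the request headers, raising if the API key is not configured.

    build_headers is a hypothetical helper used for illustration only.
    """
    api_key = env.get('NIGHTFALL_API_KEY')
    if not api_key:
        raise RuntimeError('Set the NIGHTFALL_API_KEY environment variable first.')
    return {
        'Content-Type': 'application/json',
        'x-api-key': api_key,
    }
```

Calling h = build_headers() would then stop the script with a clear message before any request is made.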
Next we define the detectors with which we wish to scan our data. The detectors must be formatted as a list of key-value pairs of the form {'name': 'DETECTOR_NAME'}.
detector_list = ['US_SOCIAL_SECURITY_NUMBER', 'ICD9_CODE', 'US_DRIVERS_LICENSE_NUMBER']
detector_object = [{'name':detector} for detector in detector_list]
Next, we build the request body, which contains the detectors from above, as well as the raw data that you wish to scan. In this example, we will read it from a file called sample_data.csv.
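The request body described here can be sketched as follows. This is only an illustration: the field names 'detectors', 'payload', and 'items' are assumptions about the v1 request schema and should be confirmed against the Nightfall API reference, and build_request_body is a hypothetical helper name.

```python
def build_request_body(detector_object, path):
    """Read the file as plain text and wrap it in a request body.

    ASSUMPTION: the 'detectors' and 'payload' / 'items' field names are
    illustrative; confirm them against the Nightfall API reference.
    """
    with open(path, 'r', encoding='utf-8') as f:
        raw_data = f.read()
    return {
        'detectors': detector_object,
        'payload': {'items': [raw_data]},
    }


# Usage in this walkthrough (requires sample_data.csv to exist):
# d = build_request_body(detector_object, 'sample_data.csv')
```

The variable d is the request body that is later serialized with json.dumps and sent to the endpoint.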
Here we assume that the file is under the 500 KB payload limit of the Scan API. If your file is larger than the limit, consider breaking it down into smaller pieces across multiple API requests.
import os
if os.stat('sample_data.csv').st_size < 500000:
    print('This file will fit in a single API call.')
else:
    print('This file will need to be broken into pieces across multiple calls.')
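For files over the limit, one possible way to break the data into pieces is to split on line boundaries, which keeps CSV rows intact. The sketch below assumes a hypothetical helper named split_into_chunks and reuses the 500,000-byte figure from the check above. The rest of the request body also counts toward the payload size, so a somewhat lower max_bytes may be the safer choice.

```python
def split_into_chunks(text, max_bytes=500_000):
    """Split text on line boundaries into pieces of at most max_bytes (UTF-8).

    A single line longer than max_bytes is emitted as its own oversized
    chunk; this sketch does not split inside a line.
    """
    chunks, current, current_size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode('utf-8'))
        # Start a new chunk when adding this line would exceed the limit.
        if current and current_size + line_size > max_bytes:
            chunks.append(''.join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += line_size
    if current:
        chunks.append(''.join(current))
    return chunks
```

Each chunk can then be sent in its own API request, and the file treated as sensitive if any chunk produces findings.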
Now we are ready to call the Nightfall API to check if there is any sensitive data in our file. If there are no sensitive findings in our file, the response will be "[[]]".
# d is the request body built in the step above.
response = requests.post(endpoint, headers=h, data=json.dumps(d))
if response.status_code == 200 and len(response.content.decode()) > 4:
    print('This file contains sensitive data.')
    print(json.loads(response.content.decode()))
elif response.status_code == 200:
    print('No sensitive data detected. Hooray!')
else:
    print(f'Something went wrong -- Response {response.status_code}.')
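Comparing the response length against 4 characters works for the exact string "[[]]", but it would misfire if the body contained extra whitespace or covered more than one scanned item. A slightly more robust sketch parses the JSON instead. It assumes the response is a list holding one inner list of findings per scanned item, as the empty case suggests, and has_findings is a hypothetical helper name.

```python
import json


def has_findings(response_text):
    """Return True if any scanned item produced at least one finding.

    ASSUMPTION: the response body is a JSON list with one inner list of
    findings per scanned item, as the empty case "[[]]" suggests.
    """
    findings_per_item = json.loads(response_text)
    return any(len(item_findings) > 0 for item_findings in findings_per_item)
```

With this helper, the first condition above could be written as response.status_code == 200 and has_findings(response.text).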