Using Scan API (with Python)

Say you have a number of files containing customer or patient data, and you are not sure which of them are OK to share over a less secure channel. With Nightfall’s API you can check whether a file contains sensitive PII, PHI, or PCI data.

To make a request to the Nightfall API you will need:

  • A Nightfall API key

  • A list of data types you wish to scan for

  • Data to scan. Note that the API interprets data as plaintext, so you may pass it in any structured or unstructured format.

You can read more about obtaining a Nightfall API key or about our available data detectors in the linked reference guides.

To run the following API call, we will be using Python's built-in json and os modules, along with the third-party requests library.

First we define the endpoint we want to reach with our API call.

Next we define the headers of our API request. In this example, we have our API key set via an environment variable called "NIGHTFALL_API_KEY". Your API key should never be hard-coded directly into your script.
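
As a minimal sketch of reading the key from the environment, the helper below builds the headers and fails early when the variable is missing, rather than sending a request with an empty key. The function name build_headers and the error message are illustrative assumptions, not part of the Nightfall API.

```python
import os

def build_headers(env_var='NIGHTFALL_API_KEY'):
    """Return request headers, failing early if the API key is not set.

    build_headers is a hypothetical helper; only the header names come
    from the example in this guide.
    """
    api_key = os.getenv(env_var)
    if not api_key:
        raise RuntimeError(f'Environment variable {env_var} is not set.')
    return {
        'Content-Type': 'application/json',
        'x-api-key': api_key,
    }
```

Calling build_headers() before the request surfaces a missing key as a clear local error instead of an authentication failure from the API.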

Next we define the detectors with which we wish to scan our data. The detectors must be formatted as a list of key-value pairs in the format {'name': 'DETECTOR_NAME'}.

Next, we build the request body, which contains the detectors from above, as well as the raw data that you wish to scan. In this example, we will read it from a file called sample_data.csv.

Here we assume that the file is under the 500 KB payload limit of the Scan API. If your file is larger than the limit, consider breaking it down into smaller pieces across multiple API requests.
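
One way to break a larger file into pieces is to split it on line boundaries, so that no row of a CSV is cut in half. The sketch below is an illustration under assumptions: split_into_chunks is a hypothetical helper, and the default of 450,000 bytes is an arbitrary margin chosen to leave room for the JSON wrapper and detector list, not a documented figure.

```python
def split_into_chunks(text, max_bytes=450_000):
    """Split text on line boundaries into pieces of at most max_bytes (UTF-8).

    max_bytes defaults to an assumed safety margin below the 500 KB limit.
    """
    chunks, current, current_size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode('utf-8'))
        if line_size > max_bytes:
            raise ValueError('A single line exceeds the size limit.')
        # Start a new chunk when adding this line would exceed the limit.
        if current and current_size + line_size > max_bytes:
            chunks.append(''.join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += line_size
    if current:
        chunks.append(''.join(current))
    return chunks
```

Each chunk can then be sent as the payload of its own request. Any offsets reported in the findings would presumably be relative to the chunk that was scanned rather than to the original file.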

Now we are ready to call the Nightfall API to check if there is any sensitive data in our file. If there are no sensitive findings in our file, the response will be "[[]]".

[[]]

detector_list = ['US_SOCIAL_SECURITY_NUMBER', 'ICD9_CODE', 'US_DRIVERS_LICENSE_NUMBER']

detector_object = [{'name':detector} for detector in detector_list]
[{'name':'US_SOCIAL_SECURITY_NUMBER'}, 
 {'name':'ICD9_CODE'}, 
 {'name':'US_DRIVERS_LICENSE_NUMBER'}]
import json
import os
import requests
endpoint = 'https://api.nightfall.ai/v1/scan'
h = {
    'Content-Type': 'application/json',
    'x-api-key': os.getenv('NIGHTFALL_API_KEY')
}
with open('sample_data.csv', 'r') as f:
  raw_data = f.read()

d = {
    'detectors': detector_object,
    'payload':{'items':[raw_data]}
}
import os

if os.stat('sample_data.csv').st_size < 500000:
  print('This file will fit in a single API call.')
else:
  print('This file will need to be broken into pieces across multiple calls.')
# Send the scan request; the body is serialized to a JSON string.
response = requests.post(endpoint, headers=h, data=json.dumps(d))

if response.status_code == 200:
  # The response holds one list of findings per item in the payload.
  findings = response.json()
  if any(findings):
    print('This file contains sensitive data.')
    print(findings)
  else:
    print('No sensitive data detected. Hooray!')
else:
  print(f'Something went wrong -- Response {response.status_code}.')
[
  [
  {'fragment': '172-32-1176',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 122, 'end': 133},
    'unicodeRange': {'start': 122, 'end': 133}}},
  {'fragment': '514-14-8905',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 269, 'end': 280},
    'unicodeRange': {'start': 269, 'end': 280}}},
  {'fragment': '213-46-8915',
   'detector': 'US_SOCIAL_SECURITY_NUMBER',
   'confidence': {'bucket': 'LIKELY'},
   'location': {'byteRange': {'start': 418, 'end': 429},
    'unicodeRange': {'start': 418, 'end': 429}}}
  ]
 ]
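
For a file with many findings, a per-detector count is often easier to read than the raw response. The sketch below assumes the response shape shown above (a list of findings per scanned item, each with a 'detector' key); count_by_detector is a hypothetical helper, and the sample data is an abbreviated copy of the response above.

```python
from collections import Counter

def count_by_detector(findings):
    """Count findings per detector across all scanned items."""
    return Counter(f['detector'] for item in findings for f in item)

# Abbreviated copy of the sample response above (location fields omitted).
sample = [[
    {'fragment': '172-32-1176', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
    {'fragment': '514-14-8905', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
    {'fragment': '213-46-8915', 'detector': 'US_SOCIAL_SECURITY_NUMBER'},
]]

counts = count_by_detector(sample)
print(counts)  # Counter({'US_SOCIAL_SECURITY_NUMBER': 3})
```

An empty response such as [[]] yields an empty Counter, so the same helper works for files with no findings.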