GenAI Content Filtering: How to prevent exposure of sensitive data

LangChain/Anthropic Tutorial: Integrating Nightfall for Secure Prompt Sanitization

LLMs like ChatGPT and Claude can inadvertently receive sensitive information from user inputs, posing significant privacy concerns (OWASP LLM06). Without content filtering, these AI platforms can process and retain confidential data such as health records, financial details, and personal identifying information.

Consider the following real-world scenarios:

  • Support Chatbots: You use LangChain/Claude to power a level-1 support chatbot to help users resolve issues. Users will likely overshare sensitive information like credit card and Social Security numbers. Without content filtering, this information would be transmitted to Anthropic and added to your support ticketing system.

  • Healthcare Apps: You are using LangChain/Claude to moderate content sent by patients or doctors in your developing health app. These queries may contain sensitive protected health information (PHI), which could be unnecessarily transmitted to Anthropic.

Implementing robust content filtering mechanisms is crucial to protect sensitive data and comply with data protection regulations. In this guide, we will explore how to sanitize prompts using Nightfall before sending them to Claude.
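To build intuition for what sanitization should do to a prompt, here is a toy sketch of the masking behavior we will configure later: every card digit except the last four is replaced with "X", and dashes are left in place. This uses a plain regex as a simplified stand-in; the real pipeline uses Nightfall's CREDIT_CARD_NUMBER detector, not this function.

```python
import re

def mask_credit_card(text: str, keep: int = 4, mask_char: str = "X") -> str:
    """Mask every digit of a 16-digit card number except the last `keep`,
    leaving dashes intact (a rough stand-in for Nightfall's MaskConfig)."""
    def _mask(match: re.Match) -> str:
        remaining = sum(c.isdigit() for c in match.group(0))
        out = []
        for c in match.group(0):
            if c.isdigit():
                # Keep only the final `keep` digits visible
                out.append(c if remaining <= keep else mask_char)
                remaining -= 1
            else:
                out.append(c)
        return "".join(out)
    return re.sub(r"\b(?:\d{4}-){3}\d{4}\b", _mask, text)

print(mask_credit_card("My card is 4916-6734-7572-5015."))
# My card is XXXX-XXXX-XXXX-5015.
```

The sanitized text still reads naturally to the LLM, so the downstream response quality is preserved while the raw card number never leaves your service.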

LangChain/Anthropic Example

If you're not using LangChain, see our other tutorials.

Here's what this looks like in Python, using the LangChain, Anthropic, and Nightfall Python SDKs:

Setup your environment

Install the necessary packages:

```bash
pip install langchain anthropic nightfall python-dotenv
```

Set up environment variables. Create a .env file in your project directory:

```bash
ANTHROPIC_API_KEY=your_anthropic_api_key
NIGHTFALL_API_KEY=your_nightfall_api_key
```

Implementing Nightfall Sanitization as a LangChain Component

We'll create a custom LangChain component for Nightfall sanitization. This allows us to integrate content filtering into our LangChain pipeline seamlessly. Save the following as secure_langchain.py:

```python
import os
from typing import Dict, List

from dotenv import load_dotenv
from langchain.llms import Anthropic
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.chains.base import Chain
from nightfall import Confidence, DetectionRule, Detector, RedactionConfig, MaskConfig, Nightfall

# Load environment variables
load_dotenv()

# Initialize Nightfall client (reads NIGHTFALL_API_KEY from the environment)
nightfall = Nightfall()

# Define Nightfall detection rules: mask credit card numbers,
# leaving only the last four digits visible
detection_rules = [DetectionRule(
    [Detector(
        min_confidence=Confidence.VERY_LIKELY,
        nightfall_detector="CREDIT_CARD_NUMBER",
        display_name="Credit Card Number",
        redaction_config=RedactionConfig(
            remove_finding=False,
            mask_config=MaskConfig(
                masking_char="X",
                num_chars_to_leave_unmasked=4,
                mask_right_to_left=True,
                chars_to_ignore=["-"])
        )
    )]
)]

class NightfallSanitizationChain(Chain):
    input_key: str = "input"
    output_key: str = "sanitized_input"

    @property
    def input_keys(self) -> List[str]:
        return [self.input_key]

    @property
    def output_keys(self) -> List[str]:
        return [self.output_key]

    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        text = inputs[self.input_key]
        payload = [text]
        try:
            findings, redacted_payload = nightfall.scan_text(
                payload,
                detection_rules=detection_rules
            )
            sanitized_text = redacted_payload[0] if redacted_payload[0] else text
        except Exception as e:
            print(f"Error in sanitizing input: {e}")
            sanitized_text = text
        return {self.output_key: sanitized_text}

# Initialize the Anthropic LLM
llm = Anthropic(model="claude-v1")

# Create a prompt template
template = "The customer said: '{customer_input}' How should I respond to the customer?"
prompt = PromptTemplate(template=template, input_variables=["customer_input"])

# Create chains
sanitization_chain = NightfallSanitizationChain()
response_chain = LLMChain(llm=llm, prompt=prompt)

# Combine chains: sanitize first, then generate the response
full_chain = SimpleSequentialChain(
    chains=[sanitization_chain, response_chain],
    verbose=True
)

def process_customer_input(customer_input: str) -> str:
    """Sanitize the customer input with Nightfall, then generate a response."""
    return full_chain.run(customer_input)

if __name__ == "__main__":
    customer_input = "My credit card number is 4916-6734-7572-5015, and the card is getting declined."
    print("\nFinal Response:", process_customer_input(customer_input))
```

Explanation

  1. We start by importing the necessary modules and loading environment variables.

  2. We initialize the Nightfall client and define a detection rule for credit card numbers.

  3. The NightfallSanitizationChain class is a custom LangChain component that handles content sanitization using Nightfall.

  4. We set up the Anthropic LLM and create a prompt template for customer service responses.

  5. We create separate chains for sanitization and response generation, then combine them using SimpleSequentialChain.

  6. The process_customer_input function provides an easy-to-use interface for our chain.

Error Handling and Logging

In a production environment, you might want to add more robust error handling and logging. For example:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def sanitize_input(text):
    payload = [text]
    try:
        findings, redacted_payload = nightfall.scan_text(
            payload,
            detection_rules=detection_rules
        )
        if findings:
            logger.info("Sensitive information detected and redacted")
        return redacted_payload[0] if redacted_payload[0] else text
    except Exception as e:
        logger.error(f"Error in sanitizing input: {e}")
        # Depending on your use case, you might want to return the original text or an error message
        return text
```

Usage

To use this script, you can either run it directly or import the process_customer_input function in another script.

Running the Script Directly

Simply run the script:

```bash
python secure_langchain.py
```

This will process the example customer input and print the sanitized input and final response.

Using in Another Script

You can import the process_customer_input function in another script:

```python
from secure_langchain import process_customer_input

customer_input = "My credit card 4111-1111-1111-1111 isn't working. Contact me at [email protected]."
response = process_customer_input(customer_input)
print(response)
```

Expected Output

What does success look like?

If the example runs properly, you should see output demonstrating the sanitization process and the final response from Claude. Here's what the output might look like:

```
> Entering new SimpleSequentialChain chain...

Sanitized input: The customer said: 'My credit card number is XXXX-XXXX-XXXX-5015, and the card is getting declined.' How should I respond to the customer?

> Finished chain.

Final Response: I understand you're having trouble with your credit card being declined. I apologize for the inconvenience. To assist you better, I'll need some additional information...
```
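When hardening this pipeline, you may also want to unit-test the sanitization step without calling the Nightfall API. One approach is a test double that mimics the (findings, redacted_payload) return shape of scan_text; note that FakeNightfall below is a hypothetical stub for testing, not part of the SDK, and sanitize mirrors the chain's redaction logic minus the LangChain plumbing.

```python
class FakeNightfall:
    """Test double mimicking the (findings, redacted_payload) shape
    returned by Nightfall's scan_text. Not part of the real SDK."""
    def scan_text(self, payload, detection_rules=None):
        redacted = [t.replace("4916-6734-7572-5015", "XXXX-XXXX-XXXX-5015")
                    for t in payload]
        findings = [["stub-finding"]] if redacted != payload else []
        return findings, redacted

def sanitize(client, text):
    # Mirrors the sanitization step: fall back to the original
    # text when no redaction was applied.
    findings, redacted_payload = client.scan_text([text], detection_rules=[])
    return redacted_payload[0] if redacted_payload[0] else text

print(sanitize(FakeNightfall(), "My card 4916-6734-7572-5015 is declined."))
# My card XXXX-XXXX-XXXX-5015 is declined.
```

This keeps your test suite fast and free of API keys while still asserting that card numbers never reach the LLM prompt.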