LangChain Tutorial: Integrating Nightfall for Secure Prompt Sanitization
Generative AI systems like OpenAI's ChatGPT have revolutionized how we interact with technology, but they come with a significant risk: the inadvertent exposure of sensitive information (OWASP LLM06). Without proper safeguards, these AI platforms may receive, process, and potentially retain confidential data, including:
Personally Identifiable Information (PII)
Protected Health Information (PHI)
Financial details (e.g., credit card numbers, bank account information)
Intellectual property
Real-world scenarios highlight the urgency of this issue:
Support Chatbots: Imagine a customer service AI powered by OpenAI. Users, in their quest for help, might unknowingly share credit card numbers or Social Security information. Without content filtering, this sensitive data could be transmitted to OpenAI and logged in your support system.
Healthcare Applications: Consider an AI-moderated health app that processes patient and doctor communications. These exchanges may contain protected health information (PHI), which, if not filtered, could be unnecessarily exposed to the AI system.
Content filtering is a crucial safeguard, removing sensitive data before it reaches the AI system. This ensures that only necessary, non-sensitive information is used for content generation, effectively preventing the spread of confidential data to AI platforms.
Python Example
Let's examine this in a Python example using the LangChain, Anthropic, and Nightfall Python SDKs. You can download this sample code here.
import os
from dotenv import load_dotenv
from nightfall import Confidence, DetectionRule, Detector, RedactionConfig, MaskConfig, Nightfall
from typing import Dict, List
from langchain.chains.base import Chain
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.prompt_template import BasePromptTemplate
from langchain.prompts import PromptTemplate
from langchain_anthropic import ChatAnthropic
from langchain.schema.runnable import RunnableSequence, RunnablePassthrough
from pydantic import Field

# Load environment variables
load_dotenv()

# 1) Setup Nightfall
# By default Nightfall will read the NIGHTFALL_API_KEY environment variable
nightfall = Nightfall()

# 2) Define a Nightfall detection rule
detection_rule = [DetectionRule(
    [Detector(
        min_confidence=Confidence.VERY_LIKELY,
        nightfall_detector="CREDIT_CARD_NUMBER",
        display_name="Credit Card Number",
        redaction_config=RedactionConfig(
            remove_finding=False,
            mask_config=MaskConfig(
                masking_char="X",
                num_chars_to_leave_unmasked=4,
                mask_right_to_left=True,
                chars_to_ignore=["-"])
        )
    )])]

# 3) Classify, Redact, Filter Your User Input
# Setup Nightfall Chain element
class NightfallSanitizationChain(Chain):
    input_key: str = "input"
    output_key: str = "sanitized_input"

    @property
    def input_keys(self) -> List[str]:
        return [self.input_key]

    @property
    def output_keys(self) -> List[str]:
        return [self.output_key]

    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        text = inputs[self.input_key]
        payload = [text]
        try:
            findings, redacted_payload = nightfall.scan_text(
                payload,
                detection_rules=detection_rule
            )
            sanitized_text = redacted_payload[0] if redacted_payload[0] else text
            print(f"\nsanitized input:\n{sanitized_text}")
        except Exception as e:
            print(f"Error in sanitizing input: {e}")
            sanitized_text = text
        return {self.output_key: sanitized_text}

# Initialize the Anthropic LLM
llm = ChatAnthropic(model="claude-2.1")

# Create a prompt template
template = "The customer said: '{customer_input}' How should I respond to the customer?"
prompt = PromptTemplate(template=template, input_variables=["customer_input"])

# Create the sanitization chain
sanitization_chain = NightfallSanitizationChain()

# Create the full chain using RunnableSequence
full_chain = (
    RunnablePassthrough()
    | sanitization_chain
    | (lambda x: {"customer_input": x["sanitized_input"]})
    | prompt
    | llm
)

# Use the combined chain
customer_input = "My credit card number is 4916-6734-7572-5015, and the card is getting declined."
print(f"\ncustomer input:\n{customer_input}")
try:
    response = full_chain.invoke({"input": customer_input})
    print("\nmodel response:\n", response.content)
except Exception as e:
    print("An error occurred:", e)
Step 1: Setup Nightfall
If you don't yet have a Nightfall account, sign up here.
Create a Nightfall key. Here are the instructions.
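The Nightfall client reads its key from the NIGHTFALL_API_KEY environment variable, and ChatAnthropic reads ANTHROPIC_API_KEY; since the script calls load_dotenv(), you can keep both in a .env file. A minimal .env might look like this (placeholder values):

NIGHTFALL_API_KEY=your-nightfall-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key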
Install the necessary packages using the command line:
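Assuming you're using pip, the install might look like this (package names are the current PyPI names for the SDKs used below):

pip install langchain langchain-anthropic nightfall python-dotenv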
Step 2: Define a Detection Rule
Create an inline detection rule with the Nightfall API or SDK client, or use a pre-configured detection rule from your Nightfall account. In this example, we will do the former.
If you specify a redaction config, you can automatically get de-identified data back, including a reconstructed, redacted copy of your original payload. Learn more about redaction here.
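As a quick sanity check, you can call scan_text directly with the rule defined above; the second return value is the redacted copy of your payload. This snippet assumes the nightfall client and detection_rule from the full example:

# Assumes `nightfall` and `detection_rule` from the example above
findings, redacted_payload = nightfall.scan_text(
    ["My card number is 4916-6734-7572-5015."],
    detection_rules=detection_rule
)
# Should print something like: My card number is XXXX-XXXX-XXXX-5015.
print(redacted_payload[0])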
Step 3: Classify, Redact, Filter Your User Input
We'll create a custom LangChain component for Nightfall sanitization, which lets us integrate content filtering seamlessly into our LangChain pipeline.
Explanation
We start by importing necessary modules and loading environment variables.
We initialize the Nightfall client and define detection rules for credit card numbers.
The NightfallSanitizationChain class is a custom LangChain component that handles content sanitization using Nightfall.
We set up the Anthropic LLM and create a prompt template for customer service responses.
We create separate chains for sanitization and response generation, then combine them into a RunnableSequence using LangChain's pipe syntax.
The process_customer_input function wraps the chain in an easy-to-use interface; a minimal sketch follows below.
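The function isn't shown in the listing above; here is a minimal sketch of what it might look like, assuming the full_chain object defined earlier:

def process_customer_input(customer_input: str) -> str:
    # Run the sanitize-then-respond pipeline and return the model's reply text
    response = full_chain.invoke({"input": customer_input})
    return response.content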
Error Handling and Logging
In a production environment, you might want to add more robust error handling and logging. For example:
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def sanitize_input(text):
    payload = [text]
    try:
        findings, redacted_payload = nightfall.scan_text(
            payload,
            detection_rules=detection_rule
        )
        if findings:
            logger.info("Sensitive information detected and redacted")
        return redacted_payload[0] if redacted_payload[0] else text
    except Exception as e:
        logger.error(f"Error in sanitizing input: {e}")
        # Depending on your use case, you might want to return the original text or an error message
        return text
Usage
To use this script, you can either run it directly or import the process_customer_input function in another script.
Running the Script Directly
Simply run the script:
python secure_langchain.py
This will process the example customer input and print the sanitized input and final response.
Using in Another Script
You can import the process_customer_input function in another script:
from secure_langchain import process_customer_input

customer_input = "My credit card 4916-6734-7572-5015 isn't working. Contact me at alice@example.com."
response = process_customer_input(customer_input)
print(response)
Expected Output
What does success look like?
If the example runs properly, you should see output demonstrating the sanitization process and the final response from Claude. Here's what the output might look like:
customer input:
My credit card number is 4916-6734-7572-5015, and the card is getting declined.

sanitized input:
My credit card number is XXXX-XXXX-XXXX-5015, and the card is getting declined.

model response:
I understand you're having trouble with your credit card (XXXX-XXXX-XXXX-5015) being declined. I apologize for the inconvenience. To assist you better, I'll need some additional information...