# Scanning Images for Patterns Using Custom Regex Detectors

Using regex to identify long patterns in images can be challenging because OCR systems are not perfectly accurate. Even Nightfall may not achieve 100% character-by-character accuracy. To improve results, build more flexibility into your regex patterns to accommodate common OCR inconsistencies. Here are some typical OCR challenges to keep in mind:

* **Spell-check noise**: Spell-checking tools can add artifacts like red underlines, which may interfere with text recognition.
* **Character ambiguity**:
  * The digit 0 may be misinterpreted as the letter O (or vice versa), depending on the font.
  * The character l (lowercase L) may be read as the digit 1.
  * The letter B may appear as the digit 8.
* **Underscore handling**: An underscore (`_`) is sometimes interpreted as a space, particularly when spell-check artifacts are present.
* **Line wrapping**: OCR may introduce unexpected newlines when text wraps across multiple lines.
* **Periods and punctuation**: Spell-check artifacts or font issues may result in extraneous periods (`.`) or other punctuation being added to the output. En dashes (`–`) and hyphens (`-`) may be interchanged.

For reference, OCR tools like Tesseract typically achieve 85-98% character accuracy for similar input, and our system operates within a similar range. Given this, tuning your regex to be more forgiving (e.g., allowing for optional characters or slight variations) can significantly improve detection rates.
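As a rough illustration of "more forgiving", the sketch below expands each character of a literal string into a character class covering the confusions listed above. It uses Python's `re` module purely for demonstration; the `CONFUSABLE_GROUPS` table and the `loosen_literal` helper are hypothetical names, not part of any Nightfall API, and the regex dialect accepted by your detector may differ.

```python
import re

# Hypothetical confusion groups, based on the OCR ambiguities listed above:
# 0/O, l/1, B/8, underscore/space, hyphen/en dash.
CONFUSABLE_GROUPS = ["0O", "l1", "B8", "_ ", "-\u2013"]


def loosen_literal(literal: str) -> str:
    """Return a regex matching `literal` even if OCR swapped confusable characters."""
    parts = []
    for ch in literal:
        # Find the confusion group this character belongs to, if any.
        group = next((g for g in CONFUSABLE_GROUPS if ch in g), None)
        if group is None:
            parts.append(re.escape(ch))
        else:
            # Emit a character class containing every member of the group.
            parts.append("[" + "".join(re.escape(c) for c in group) + "]")
    return "".join(parts)


# Illustrative literal and a plausible OCR misreading of it.
pattern = loosen_literal("key_B0l")
matches_clean = re.fullmatch(pattern, "key_B0l") is not None
matches_noisy = re.fullmatch(pattern, "key 8O1") is not None
matches_other = re.fullmatch(pattern, "key_B0x") is not None
```

Expanding only the characters known to be ambiguous keeps the rest of the pattern strict, which limits the extra false positives that loosening introduces.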

**Example Regex (original and loosened)**

original: `ATATT3xFfGF0[A-Za-z0-9=_\-]*[=A-Za-z0-9]{9}`

loosened: `ATATT[A-Za-z0-9_\-– @.\n=]*[A-Za-z0-9_\- @.\n]{7,11}`

* shortened the literal match prefix
* excluded the literal zero (`0`) from the prefix
* added period (`.`) and newline (`\n`) chars
* relaxed the char length from exactly 9 to between 7 and 11
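
To see the effect of these changes, the following sketch runs both patterns against a clean string and a version with typical OCR noise. It assumes Python's `re` module as a stand-in for the detector's regex engine, and the token is made up for illustration (it is not a real credential); `detects` is a hypothetical helper.

```python
import re

ORIGINAL = re.compile(r"ATATT3xFfGF0[A-Za-z0-9=_\-]*[=A-Za-z0-9]{9}")
LOOSENED = re.compile(r"ATATT[A-Za-z0-9_\-– @.\n=]*[A-Za-z0-9_\- @.\n]{7,11}")

# Made-up token as it would appear in clean text.
clean_text = "token: ATATT3xFfGF0abcDEF_123-xyzABC123def"

# The same token with simulated OCR noise: 0 read as O, underscore read as
# a space, hyphen read as an en dash, and a newline from line wrapping.
ocr_text = "token: ATATT3xFfGFOabcDEF 123\u2013xyz\nABC123def"


def detects(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for name, pattern in (("original", ORIGINAL), ("loosened", LOOSENED)):
    print(name, "clean:", detects(pattern, clean_text), "ocr:", detects(pattern, ocr_text))
```

The loosened pattern also accepts more non-token text that happens to start with `ATATT`, so it is worth checking it against representative benign content before relying on it.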

<figure><img src="/files/AO25jkfja9pRoldkz0rc" alt=""><figcaption></figcaption></figure>

