PII Discovery

PII (Personally Identifiable Information) Discovery is the first step in identifying sensitive data in your databases. Gigantics provides multiple options to customize how you discover PII in your data.

PII Discovery Options

Select Entities

You can choose which database entities (tables, collections, etc.) to include in the discovery process:

  • Include specific entities: Select only certain tables or collections for analysis
  • Exclude specific entities: Scan all entities except those you specify
  • Scan all entities: Analyze every table and collection in your database

This selection is important because it:

  • Focuses discovery on the entities most likely to hold sensitive data
  • Reduces processing time by excluding system tables and other non-sensitive data
  • Enables compliance scans limited to regulated data sets
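The three selection modes can be pictured as a simple filter over the list of entity names. The sketch below is illustrative only: the function name select_entities and the mode labels "all", "include", and "exclude" are assumptions made for this example, not part of the Gigantics interface.

```python
def select_entities(all_entities, mode="all", names=()):
    """Return the entities to scan for a given selection mode.

    mode is one of "all", "include", or "exclude" (hypothetical labels
    chosen for this sketch).
    """
    chosen = set(names)
    if mode == "all":
        return list(all_entities)
    if mode == "include":
        # Keep only the entities that were explicitly named.
        return [e for e in all_entities if e in chosen]
    if mode == "exclude":
        # Keep everything except the entities that were named.
        return [e for e in all_entities if e not in chosen]
    raise ValueError(f"unknown selection mode: {mode!r}")


tables = ["users", "orders", "pg_stat_activity", "payments"]
print(select_entities(tables, "include", ["users", "payments"]))
print(select_entities(tables, "exclude", ["pg_stat_activity"]))
```

Excluding a system table such as pg_stat_activity is the kind of case where the exclude mode is shorter to express than listing every entity to include.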

Regular Expressions

Use regular expressions to select entities by name:

  • Pattern matching: Automatically include entities that match specified patterns
  • Flexible filtering: Create complex rules to select tables based on naming conventions
  • Bulk selection: Efficiently select multiple entities without manual checkboxes

Examples of useful regex patterns:

  • user.* - Match all tables starting with "user"
  • .*_pii$ - Match all tables ending with "_pii"
  • [a-z]+_[0-9]+ - Match tables named with lowercase letters, an underscore, then digits (for example, orders_2023)
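To check a pattern before using it, you can try it against a list of table names with Python's re module. This is a minimal sketch, not the product's matching engine: the helper match_entities and the sample table names are hypothetical, and whether Gigantics matches the whole name or any part of it is an assumption you should verify, since the two behaviors give different results.

```python
import re


def match_entities(pattern, names, anchored=True):
    """Return the names selected by a regex pattern.

    anchored=True requires the whole name to match (re.fullmatch);
    anchored=False accepts a match anywhere in the name (re.search).
    """
    regex = re.compile(pattern)
    test = regex.fullmatch if anchored else regex.search
    return [n for n in names if test(n)]


# Hypothetical table names used only for illustration.
names = ["users", "user_profiles", "superuser", "customers_pii", "orders_2023"]

print(match_entities(r"user.*", names))                  # ['users', 'user_profiles']
print(match_entities(r"user.*", names, anchored=False))  # ['users', 'user_profiles', 'superuser']
print(match_entities(r".*_pii$", names))                 # ['customers_pii']
print(match_entities(r"[a-z]+_[0-9]+", names))           # ['orders_2023']
```

With unanchored matching, user.* also selects superuser. Writing the pattern as ^user.* makes the "starts with" intent explicit under either behavior.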

Discovery Scope Configuration

You can configure the scope of your discovery with these parameters:

Scan Depth

  • Sample size: Define how many rows to scan for accurate detection
  • Column coverage: Specify whether to analyze all columns or focus on specific types
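The effect of sample size can be illustrated with a toy detector that estimates how many values in a column look like email addresses. This is a sketch under stated assumptions, not how Gigantics detects PII: the function email_match_ratio, the simplified EMAIL pattern, and the fixed random seed are all hypothetical choices for the example.

```python
import random
import re

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def email_match_ratio(values, sample_size, seed=0):
    """Fraction of sampled values that look like email addresses."""
    rng = random.Random(seed)
    # Scan every value when the column is smaller than the sample size.
    if len(values) <= sample_size:
        sample = list(values)
    else:
        sample = rng.sample(list(values), sample_size)
    if not sample:
        return 0.0
    hits = sum(1 for v in sample if isinstance(v, str) and EMAIL.fullmatch(v))
    return hits / len(sample)


column = ["a@x.com", "b@y.org", "n/a", None]
print(email_match_ratio(column, sample_size=100))
```

A larger sample gives a more reliable estimate at the cost of reading more rows, which is the trade-off the sample size setting controls.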

Processing Options

  • Rate limit: Control the speed of discovery to avoid impacting database performance
  • Concurrency: Set how many entities to scan in parallel
  • Schedule: Run discovery immediately or schedule for off-peak hours
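Concurrency and rate limiting interact: the first caps how many scans run at once, the second spaces out when scans may start. The sketch below shows one way to combine them using Python's standard library. It is not the product's scheduler; scan_entities, scan_fn, and min_interval are hypothetical names introduced for this example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def scan_entities(entities, scan_fn, concurrency=2, min_interval=0.0):
    """Run scan_fn on each entity with bounded parallelism.

    concurrency caps how many scans run at once; min_interval is the
    minimum number of seconds between the start of two scans.
    """
    lock = threading.Lock()
    next_start = [time.monotonic()]  # earliest time the next scan may start

    def throttled(entity):
        # Reserve a start slot under the lock, then sleep outside it.
        with lock:
            now = time.monotonic()
            wait = max(0.0, next_start[0] - now)
            next_start[0] = max(now, next_start[0]) + min_interval
        if wait:
            time.sleep(wait)
        return entity, scan_fn(entity)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return dict(pool.map(throttled, entities))


# len stands in for a real per-entity scan in this illustration.
print(scan_entities(["users", "orders", "payments"], len, concurrency=2, min_interval=0.01))
```

Lowering concurrency or raising min_interval reduces load on the database and lengthens the total run time.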

Running PII Discovery

To start a PII Discovery job:

  1. Navigate to the Model section
  2. Select "Discovery" from the sidebar
  3. Click "New Discovery Job"
  4. Choose the "PII Discovery" option
  5. Configure your entity selection method
  6. Set processing parameters
  7. Review and start the job

After discovery completes, you can Review Labels and adjust classifications as needed.
