PII Discovery
PII (Personally Identifiable Information) Discovery is the first step in identifying sensitive data in your databases. Gigantics provides multiple options to customize how you discover PII in your data.
PII Discovery Options
Select Entities
You can choose which database entities (tables, collections, etc.) to include in the discovery process:
- Include specific entities: Select only certain tables or collections for analysis
- Exclude specific entities: Scan all entities except those you specify
- Scan all entities: Analyze every table and collection in your database
This selection is important because it:
- Focuses discovery on the entities that actually hold sensitive data
- Reduces processing time by excluding system tables and non-sensitive data
- Lets you limit compliance scans to regulated data sets
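The three selection modes can be pictured as a simple filter over the list of entity names. The following is a minimal sketch for illustration only, not Gigantics code: the function `select_entities` and the sample table names are hypothetical.

```python
from typing import Iterable, List, Optional


def select_entities(
    all_entities: Iterable[str],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the entities to scan, preserving their original order.

    Hypothetical helper: exactly one of the three modes applies.
    """
    if include is not None and exclude is not None:
        raise ValueError("use either include or exclude, not both")
    if include is not None:
        # Include specific entities: keep only the named ones.
        wanted = set(include)
        return [name for name in all_entities if name in wanted]
    if exclude is not None:
        # Exclude specific entities: keep everything except the named ones.
        skipped = set(exclude)
        return [name for name in all_entities if name not in skipped]
    # Scan all entities.
    return list(all_entities)


# Hypothetical table names, for illustration only.
tables = ["users", "orders", "pg_stat_activity", "customers_pii"]

print(select_entities(tables, include=["users", "customers_pii"]))
print(select_entities(tables, exclude=["pg_stat_activity"]))
print(select_entities(tables))
```

Excluding a system table such as `pg_stat_activity` leaves the three application tables in the scan, which is the typical way to cut processing time without listing every entity by hand.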
Regular Expressions
Use regular expressions to select entities by name instead of picking them one by one:
- Pattern matching: Automatically include entities that match specified patterns
- Flexible filtering: Create complex rules to select tables based on naming conventions
- Bulk selection: Efficiently select multiple entities without manual checkboxes
Examples of useful regex patterns:
- user.* - Match all tables starting with "user"
- .*_pii$ - Match all tables ending with "_pii"
- [a-z]+_[0-9]+ - Match tables with a letters-underscore-numbers pattern
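To check a pattern before using it in a discovery job, you can try it against a list of table names locally. The sketch below uses Python's standard `re` module and assumes each pattern must match the whole entity name (`re.fullmatch`). How Gigantics anchors patterns is not stated here, so treat that as an assumption. The helper `match_entities` and the table names are hypothetical.

```python
import re
from typing import Iterable, List


def match_entities(names: Iterable[str], pattern: str) -> List[str]:
    """Return the names that the pattern matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical table names, for illustration only.
tables = ["users", "user_sessions", "customers_pii", "orders_2023", "audit_log"]

for pattern in (r"user.*", r".*_pii$", r"[a-z]+_[0-9]+"):
    print(pattern, "->", match_entities(tables, pattern))
# user.* -> ['users', 'user_sessions']
# .*_pii$ -> ['customers_pii']
# [a-z]+_[0-9]+ -> ['orders_2023']
```

If patterns are instead matched anywhere in the name, an unanchored pattern such as user.* would also match a table like "app_users", so adding ^ and $ explicitly is the safer habit.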
Discovery Scope Configuration
You can configure the scope of your discovery with these parameters:
Scan Depth
- Sample size: Define how many rows to scan for accurate detection
- Column coverage: Specify whether to analyze all columns or focus on specific types
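To see why sample size and column coverage matter, consider a rough sketch of sample-based detection. This is not how Gigantics detects PII internally: the email pattern is deliberately simple, the function `email_ratio_by_column` is hypothetical, and only string values are checked, which stands in for focusing on specific column types.

```python
import random
import re
from typing import Dict, Sequence

# Deliberately simple, illustrative detector; real PII detection is broader.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def email_ratio_by_column(
    rows: Sequence[Dict[str, object]], sample_size: int, seed: int = 0
) -> Dict[str, float]:
    """Estimate, per column, the share of sampled rows that look like emails."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    all_rows = list(rows)
    if len(all_rows) <= sample_size:
        sample = all_rows
    else:
        sample = rng.sample(all_rows, sample_size)
    ratios: Dict[str, float] = {}
    if not sample:
        return ratios
    for column in sample[0].keys():
        # Column coverage: only string values are candidates for a match.
        hits = sum(
            1
            for row in sample
            if isinstance(row.get(column), str) and EMAIL.fullmatch(row[column])
        )
        ratios[column] = hits / len(sample)
    return ratios


# Hypothetical data: 100 rows with a numeric id and an email-like contact.
rows = [{"id": i, "contact": f"user{i}@example.com"} for i in range(100)]
print(email_ratio_by_column(rows, sample_size=10))
```

A larger sample gives a more reliable ratio when a column mixes PII with other values, at the cost of reading more rows from the database.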
Processing Options
- Rate limit: Control the speed of discovery to avoid impacting database performance
- Concurrency: Set how many entities to scan in parallel
- Schedule: Run discovery immediately or schedule for off-peak hours
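The interplay between rate limit and concurrency can be sketched with Python's standard library: a thread pool bounds how many entities are scanned in parallel, while a shared limiter spaces out when scans start. This is an illustrative model only, not the Gigantics implementation. `RateLimiter`, `run_discovery`, and the `scan` callback are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


class RateLimiter:
    """Space call starts at least 1/per_second seconds apart across threads."""

    def __init__(self, per_second: float) -> None:
        self._interval = 1.0 / per_second
        self._lock = threading.Lock()
        self._next_allowed = time.monotonic()

    def wait(self) -> None:
        # Reserve the next start slot under the lock, then sleep outside it.
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_allowed)
            self._next_allowed = start + self._interval
        delay = start - now
        if delay > 0:
            time.sleep(delay)


def run_discovery(
    entities: Iterable[str],
    scan: Callable[[str], T],
    concurrency: int,
    per_second: float,
) -> Dict[str, T]:
    """Scan entities with bounded parallelism and a bounded start rate."""
    names = list(entities)
    limiter = RateLimiter(per_second)

    def limited_scan(name: str) -> T:
        limiter.wait()
        return scan(name)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(limited_scan, names))
    return dict(zip(names, results))


# Hypothetical scan function: here it only returns the name length.
print(run_discovery(["users", "orders", "invoices"], len, concurrency=2, per_second=50))
```

Raising concurrency shortens the job only until the rate limit becomes the bottleneck, so the two settings are best tuned together against what the database can tolerate.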
Running PII Discovery
To start a PII Discovery job:
- Navigate to the Model section
- Select "Discovery" from the sidebar
- Click "New Discovery Job"
- Choose the "PII Discovery" option
- Configure your entity selection method
- Set processing parameters
- Review and start the job
After the discovery completes, you'll be able to Review Labels and adjust classifications as needed.