Data Lists
Data Lists are collections of values that anonymization and synthesis rules use to replace or generate data. They let you define custom sets of values for your transformations.
What are Data Lists?
A Data List is a named collection of values that your data transformation rules can reference. You can enter the values directly as plain text or populate the list from a CSV file.
When to Use Data Lists
Data Lists are particularly useful when:
- You need to replace specific values with a predefined set of alternatives (e.g., replacing real city names with a list of fictional cities)
- You want to generate synthetic data from a specific pool of values (e.g., generating realistic names from a list of common names)
- You have domain-specific values that aren't covered by built-in generators (e.g., specific product codes, internal department names)
- You need consistent data across multiple fields or tables
Creating a Data List
From Plain Text Values
- Click the Create button in the Data Lists view
- Enter a name and description for your data list
- Select "Plain text" as the source type
- Add comma-separated values in the text area
- Click Save
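Before pasting values into the text area, it can help to check how a comma-separated string breaks into individual entries. The following is a minimal Python sketch for that purpose. The function name `parse_plain_text_values` is hypothetical, and trimming whitespace and dropping empty entries are assumptions; how the product itself treats stray spaces or empty entries is not stated here.

```python
def parse_plain_text_values(raw: str) -> list[str]:
    """Split a comma-separated string into trimmed, non-empty values.

    Assumption: whitespace around values is not significant and
    empty entries (for example from ",,") should be ignored.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


values = parse_plain_text_values("Avalon, Rivendell , Gotham,,Metropolis")
print(values)  # ['Avalon', 'Rivendell', 'Gotham', 'Metropolis']
```

Note that a value containing a comma of its own (such as "Smith, John") would be split in two by this approach, so the CSV source type may be a better fit for such data.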
From CSV Files
- Click the Create button in the Data Lists view
- Enter a name and description for your data list
- Select "CSV" as the source type
- Configure CSV parsing options:
  - Check "Does the CSV contain headers?" if your file has a header row
  - Specify the separator character (default is comma)
- Upload your CSV file
- Click Save
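To see what the header and separator options mean for a given file, you can preview the file locally before uploading it. The sketch below uses Python's standard `csv` module. The function `preview_csv` and its parameters are hypothetical names chosen to mirror the two options above, not part of the product.

```python
import csv
import io


def preview_csv(text: str, has_headers: bool = True, separator: str = ","):
    """Return (headers, rows) as the text reads with the given options.

    headers is None when has_headers is False.
    """
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    rows = [row for row in reader if row]  # skip blank lines
    if has_headers and rows:
        return rows[0], rows[1:]
    return None, rows


# Illustrative sample using a semicolon as the separator.
sample = "city;country\nAvalon;Albion\nRivendell;Eriador\n"
headers, rows = preview_csv(sample, has_headers=True, separator=";")
print(headers)  # ['city', 'country']
print(rows)     # [['Avalon', 'Albion'], ['Rivendell', 'Eriador']]
```

If the header option is left unchecked for a file that does have a header row, the header text would be read as an ordinary data row, which is worth verifying before saving.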
Using Data Lists in Rules
Data Lists can be used in both anonymization and synthesis rules.
Anonymization Example
Replace sensitive data with values from a Data List:
- In an anonymization rule, select the "List" function
- Choose your Data List from the dropdown
- Configure how values are inserted:
  - Sequential: Values are used in order
  - Random: Values are selected randomly
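The difference between the two insertion methods can be illustrated with a short Python sketch. This is a conceptual model only, not the product's implementation. In particular, wrapping back to the start of the list once sequential values run out is an assumption, and the function names are hypothetical.

```python
import itertools
import random


def sequential_picker(values):
    """Yield values in list order.

    Assumption: the list wraps around once it is exhausted.
    """
    return itertools.cycle(values)


def random_picker(values, seed=None):
    """Yield values chosen uniformly at random, with repetition."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(values)


cities = ["Avalon", "Rivendell", "Gotham"]    # the Data List
original = ["Paris", "Berlin", "Madrid", "Rome"]  # column to anonymize

seq = sequential_picker(cities)
print([next(seq) for _ in original])
# ['Avalon', 'Rivendell', 'Gotham', 'Avalon']

rnd = random_picker(cities, seed=42)
print([next(rnd) for _ in original])  # some mix of the three cities
```

Sequential insertion gives a predictable, repeatable mapping by row position, while random insertion avoids an obvious pattern in the output.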
Synthesis Example
Generate synthetic data using values from a Data List:
- In a synthesis rule, select the "List" function
- Choose your Data List from the dropdown
- Configure the insertion method (Sequential or Random)
Benefits of Data Lists
- Reusability: Create once, use in multiple rules
- Consistency: Ensure the same values are used across transformations
- Flexibility: Use either simple plain-text lists or data loaded from CSV files
- Organization: Keep your transformation data organized and named
Best Practices
- Use descriptive names for your Data Lists
- Keep lists updated when your data requirements change
- When using CSV files, ensure they're properly formatted and that the header and separator settings match the file
- Test your rules after making changes to Data Lists
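One way to check that a CSV file is consistently formatted before uploading it is to confirm that every row has the same number of fields. The sketch below does this with Python's standard `csv` module. Treating "properly formatted" as "consistent field count" is an assumption, and `find_malformed_rows` is a hypothetical helper name.

```python
import csv
import io


def find_malformed_rows(text: str, separator: str = ","):
    """Return (line_number, field_count) for each row whose field
    count differs from that of the first non-empty row."""
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    expected = None
    problems = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if expected is None:
            expected = len(row)
        elif len(row) != expected:
            problems.append((reader.line_num, len(row)))
    return problems


sample = "city,country\nAvalon\nRivendell,Eriador\n"
print(find_malformed_rows(sample))  # [(2, 1)]
```

An empty result means every row matched the first row's field count. A long list of problems often points to a wrong separator rather than bad data.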