Limit
The Limit operation allows you to restrict the number of records in your dataset output. You can limit by absolute number of rows or by percentage, and apply the limit to all entities collectively or to individual entities separately.
Overview
The Limit operation provides flexible ways to reduce dataset size:
- Limit by a fixed number of rows
- Limit by a percentage of available records
- Apply limits to all entities together or to each entity individually
- Choose which records to keep (first, last, or random)
Configuration Options
Scope
The scope determines whether the limit is applied to all entities collectively or to each entity individually:
All entities: Applies the limit to the entire dataset, regardless of entity types. For example, if you have 1000 customer records and 1000 order records (2000 total), a limit of 500 would return 500 records total from any combination of entities.
By entity: Applies the limit separately to each entity type. For example, if you have customer and order entities, a limit of 500 would return up to 500 customer records AND up to 500 order records (1000 total records maximum).
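The difference between the two scopes can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the record layout (a dict with an `entity` key) and the function names `limit_all_entities` and `limit_by_entity` are assumptions made for the example, and both functions simply keep the first matching records.

```python
from collections import defaultdict


def limit_all_entities(records, limit):
    """Keep at most `limit` records across the whole dataset (hypothetical helper)."""
    return records[:limit]


def limit_by_entity(records, limit):
    """Keep at most `limit` records for each entity type (hypothetical helper)."""
    kept_per_entity = defaultdict(int)
    kept = []
    for record in records:
        entity = record["entity"]
        if kept_per_entity[entity] < limit:
            kept.append(record)
            kept_per_entity[entity] += 1
    return kept


# 1000 customer records followed by 1000 order records (2000 total).
records = (
    [{"entity": "customer", "id": i} for i in range(1000)]
    + [{"entity": "order", "id": i} for i in range(1000)]
)

print(len(limit_all_entities(records, 500)))  # 500
print(len(limit_by_entity(records, 500)))     # 1000
```

In this sketch the "All entities" result happens to contain only customer records, because they come first in the list. Which entities end up in the result depends on record order and on the row position you choose.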
Limit Type
You can specify how the limit should be applied:
By number of rows: Specify a fixed number of records to include. For example, limit the output to 1000 records.
By percentage: Specify a percentage of the total available records. For example, limit to 20% of all records. You can also set minimum and maximum row constraints so that the result stays a reasonable size when the percentage alone would yield too few or too many records.
Row Position
Determines which records are selected when applying the limit:
First records: Selects records from the beginning of the dataset (for example, the most recent records when the data is sorted newest first).
Last records: Selects records from the end of the dataset (for example, the oldest records when the data is sorted newest first).
Random records: Selects records randomly from the dataset (useful for sampling data).
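The three positions can be sketched as follows. This is a minimal illustration under stated assumptions: `select_rows` and its `seed` parameter are hypothetical names, and the random case uses Python's standard `random.Random.sample`, which may differ from how the product samples.

```python
import random


def select_rows(rows, count, position="first", seed=None):
    """Return `count` rows from the start, the end, or a random sample (hypothetical helper)."""
    # Never ask for more rows than exist, and never for a negative number.
    count = max(0, min(count, len(rows)))
    if position == "first":
        return rows[:count]
    if position == "last":
        # Slice from an explicit start index so that count == 0 yields an empty list.
        return rows[len(rows) - count:]
    if position == "random":
        # Sampling without replacement; the original row order is not preserved.
        return random.Random(seed).sample(rows, count)
    raise ValueError(f"unknown position: {position!r}")


rows = list(range(10))
print(select_rows(rows, 3, "first"))  # [0, 1, 2]
print(select_rows(rows, 3, "last"))   # [7, 8, 9]
print(len(select_rows(rows, 3, "random", seed=1)))  # 3
```

Passing a seed makes the random selection repeatable in this sketch. Whether the Limit operation offers a comparable setting is not stated here.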
Percentage Constraints
When using percentage-based limits, you can set additional constraints:
Min rows: Sets a floor on the number of rows returned. For example, if you set 5% but want at least 1000 rows, you get 1000 rows even when 5% of your dataset is fewer than that.
Max rows: Sets a cap on the number of rows returned. For example, if you set 50% but want no more than 50000 rows, the output is capped at 50000 rows even when 50% of your dataset is more than that.
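The interaction between the percentage and the two constraints can be expressed as a small calculation. The sketch below is an assumption-laden illustration, not the product's exact logic: the function name `rows_for_percentage` is hypothetical, fractional results are rounded up, the maximum wins if the minimum exceeds it, and the result never exceeds the number of rows available.

```python
import math


def rows_for_percentage(total_rows, percentage, min_rows=None, max_rows=None):
    """Number of rows to keep for a percentage limit (hypothetical helper)."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Assumption: fractional row counts are rounded up.
    target = math.ceil(total_rows * percentage / 100)
    if min_rows is not None:
        target = max(target, min_rows)  # raise to the floor
    if max_rows is not None:
        target = min(target, max_rows)  # lower to the cap
    # Assumption: you cannot receive more rows than the dataset contains.
    return min(target, total_rows)


# 5% of 10,000 rows is 500, so the minimum of 1000 applies.
print(rows_for_percentage(10_000, 5, min_rows=1000))       # 1000
# 50% of 200,000 rows is 100,000, so the maximum of 50,000 applies.
print(rows_for_percentage(200_000, 50, max_rows=50_000))   # 50000
# With no constraints, 20% of 2000 rows is 400.
print(rows_for_percentage(2000, 20))                       # 400
```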
Examples
Limiting by Number of Rows
To limit your dataset to 1000 records:
- Set Scope to "All entities"
- Set Limit Type to "By number of rows"
- Enter "1000" in the value field
- Choose which records to keep (First, Last, or Random)
Limiting by Percentage
To get 20% of your dataset:
- Set Scope to "All entities"
- Set Limit Type to "By percentage"
- Enter "20" in the value field
- Choose which records to keep (First, Last, or Random)
Per-Entity Limiting
To get up to 500 records from each entity:
- Set Scope to "By entity"
- Configure each entity with:
  - Limit Type: "By number of rows"
  - Value: 500
  - Position: "First" (or your preferred selection)
This approach is particularly useful when working with related entities where you want to maintain balanced representation across all types.
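As a rough model of the per-entity configuration above, the sketch below applies one settings entry per entity. The dictionary keys (`limit_type`, `value`, `position`), the function name `apply_entity_limits`, and the pass-through behavior for entities without settings are all assumptions made for illustration. Only row-count limits with "first" or "last" positions are handled.

```python
def apply_entity_limits(dataset, limits):
    """Apply a row-count limit per entity.

    dataset: entity name -> list of rows
    limits:  entity name -> settings dict (hypothetical layout)
    """
    limited = {}
    for entity, rows in dataset.items():
        settings = limits.get(entity)
        if settings is None:
            # Assumption: entities without settings pass through unchanged.
            limited[entity] = list(rows)
            continue
        count = min(settings["value"], len(rows))
        if settings["position"] == "last":
            limited[entity] = rows[len(rows) - count:]
        else:
            limited[entity] = rows[:count]
    return limited


dataset = {"customer": list(range(1000)), "order": list(range(300))}
limits = {
    "customer": {"limit_type": "rows", "value": 500, "position": "first"},
    "order": {"limit_type": "rows", "value": 500, "position": "first"},
}

limited = apply_entity_limits(dataset, limits)
print({entity: len(rows) for entity, rows in limited.items()})
# {'customer': 500, 'order': 300}
```

The order entity has only 300 rows, so all of them are kept. This is what "up to 500 records from each entity" means in practice.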