New Job modal
The New Job modal opens whenever you click the New Job button on the Jobs page (or when another workflow hands off to the job engine). It is a multi-panel form that adapts to your selections to produce either a one-off job or a reusable pipeline.
When Manual execution or Repeat is selected in the Schedule panel, the footer reveals a required Pipeline name field and the primary button switches to Save as Pipeline. In all other cases the button reads Run Now or Run Later depending on the selected schedule.
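The footer rule above can be sketched as follows. The function name primaryButtonLabel and the schedule tokens are illustrative, not the actual implementation:

```typescript
// Hedged sketch of the primary-button label logic described above.
// Token names ("now", "schedule", "manual", "repeat") are assumptions.
type ScheduleChoice = "now" | "schedule" | "manual" | "repeat";

function primaryButtonLabel(choice: ScheduleChoice): string {
  // Manual execution and Repeat both save a pipeline instead of running.
  if (choice === "manual" || choice === "repeat") return "Save as Pipeline";
  // Otherwise the label depends on whether the run is immediate or scheduled.
  return choice === "now" ? "Run Now" : "Run Later";
}
```

The same condition would also control whether the required Pipeline name field is shown in the footer.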
Panel overview
| Panel | Purpose | Summary text when collapsed |
|---|---|---|
| From | Choose the source tap or a saved dataset. | Shows the tap (with driver/env badges) or the dataset name. |
| To | Route the data into a sink, tap, or dataset. | Lists the sink or dataset action (new, overwrite, merge). |
| Rule | Attach an optional masking rule and schema version. | Displays the selected rule (or “No rule”). |
| Load options | Tune batches and write mode for the selected destination. | Condenses the selected write mode and read/write batch sizes. |
| Transform options (conditional) | Configure determinism, dictionary behaviour, and foreign-key handling when a rule is active. | Shows determinism + dictionary state. |
| Schedule | Decide when and how often to run the job. | Shows “Now”, the scheduled timestamp, or pipeline cadence. |
The following sections break down every configurable field.
From panel
| Field | Control | Description | Dependencies & defaults |
|---|---|---|---|
| Source selector | Radio (Tap, Dataset) | Swap between streaming from the model tap or starting from an existing dataset. | Defaults to Tap. Switching to Dataset enables the dataset dropdown. |
| Tap summary | Read-only text with badges | Displays tap name, environment tag, and driver tag fetched from model.tap. | Always visible so you confirm the environment before launching. |
| Dataset | Dropdown (DatasetSelect) | Pick the dataset that will act as the source when Dataset is selected. | Required when from = from-dataset. Disabled otherwise. |
Internal behaviour:
- Choosing Tap keeps jobType aligned with the target (pump/dump).
- Selecting a dataset stores the ID for downstream panels (merge and overwrite require it).
To panel
| Destination option | Extra controls | Description | Dependencies & defaults |
|---|---|---|---|
| Sink | Dropdown (SinkSelect) | Stream the output to one of the model sinks. | Required when selected. Sets runType = load-to-sink. Job type becomes pump (tap source) or load (dataset source). |
| Overwrite current tap | None | Write back into the tap itself (only for drivers that support it). | Disabled unless the tap driver is in the overwrite allow-list. Forces write mode update and runType = overwrite-tap. |
| New Dataset | Text input | Create a new dataset; leave blank to auto-generate a name. | Available for all drivers. Sets jobType = dump for tap sources, or pump for dataset copies. |
| Overwrite dataset | Dropdown (DatasetSelect) | Replace all rows in an existing dataset. | Required when selected. Keeps job type in sync with the source (dump/pump). |
| Merge into dataset | Dropdown (DatasetSelect) | Append + merge into an existing dataset. | Required when selected. Available for both tap and dataset sources. |
Behind the scenes, the modal recalculates jobType, runType, and the default load mode whenever the From/To pair changes. This guards the later panels against incompatible combinations (for example, overwriting the tap automatically disables the Truncate and Drop load modes).
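A minimal sketch of that recalculation, using the dependencies listed in the two tables above. The name deriveJobConfig is illustrative, and the load-to-dataset runType for dataset destinations is an assumption (the tables only name load-to-sink and overwrite-tap):

```typescript
// Hypothetical sketch of how the modal might derive jobType/runType
// from the From/To pair. Names are illustrative, not the real code.
type Source = "tap" | "dataset";
type Destination =
  | "sink"
  | "overwrite-tap"
  | "new-dataset"
  | "overwrite-dataset"
  | "merge-dataset";

interface JobConfig {
  jobType: "pump" | "dump" | "load";
  runType: "load-to-sink" | "overwrite-tap" | "load-to-dataset";
  forcedWriteMode?: "update";
}

function deriveJobConfig(from: Source, to: Destination): JobConfig {
  switch (to) {
    case "sink":
      // Tap → sink streams (pump); dataset → sink loads.
      return { jobType: from === "tap" ? "pump" : "load", runType: "load-to-sink" };
    case "overwrite-tap":
      // Writing back into the tap forces the update write mode.
      return { jobType: "pump", runType: "overwrite-tap", forcedWriteMode: "update" };
    default:
      // Any dataset destination ("load-to-dataset" is an assumed name):
      // tap sources dump, dataset-to-dataset copies pump.
      return { jobType: from === "tap" ? "dump" : "pump", runType: "load-to-dataset" };
  }
}
```

Running this whenever either panel changes is what keeps the collapsed summaries and the Load options defaults consistent with the current selection.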
Rule panel
| Field | Control | Description | Notes |
|---|---|---|---|
| Rule | Dropdown (RuleSelect) | Attach a transformation/anonymisation rule. | Defaults to none. Fetches rule options so Load/Transform panels receive pre-filled values. |
| Schema version | Dropdown (Select) | Choose the schema snapshot that should be used during execution. | Populated from Model.schemaVersions. The latest version is labelled (Latest). |
| Concurrency | Embedded component (RuleConcurrency) | Configure parallel entity execution limits inherited from the rule. | Lets you balance performance vs resource usage per job. |
Selecting a rule unlocks the Transform options panel described below.
Transform options (conditional)
| Field | Description | Key details |
|---|---|---|
| Determinism | Toggle between Random and Deterministic. | Deterministic mode requires a seed and guarantees repeatable outputs. |
| Seed | Numeric input with “Generate” helper. | Enabled only in deterministic mode. Accepts values 1-9,999,999. |
| Check foreign keys | Dropdown (Auto, Do not check, Force). | Influences throughput vs referential integrity guarantees. |
| Dictionary | Radio group (No dictionary, Field, Label, Global) plus Cache, Keep, Overwrite toggles. | Mirrors the dictionary behaviour documented in the Dictionary guide. Disabled when mode = none. |
| Continue streaming on fail | Checkbox | Keeps streaming after entity-level errors (default: enabled). |
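The Seed field's "Generate" helper can be sketched as below. Only the 1-9,999,999 range comes from the table; the function name is an assumption:

```typescript
// Illustrative sketch of the Seed "Generate" helper: picks an integer
// in the documented 1-9,999,999 range. generateSeed is an assumed name.
function generateSeed(): number {
  const MAX_SEED = 9_999_999;
  // Math.random() is in [0, 1), so the result is 1 … 9,999,999 inclusive.
  return 1 + Math.floor(Math.random() * MAX_SEED);
}
```

In deterministic mode the same seed yields the same outputs across runs, which is why the field is disabled (and irrelevant) in random mode.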
Load options panel
Load options only appears when the destination implies writing to a sink or tap (jobType = load or pump with tap source). For dataset-only copies the panel hides automatically.
| Field | Control | Description | Behaviour & defaults |
|---|---|---|---|
| Target entities (load) | Dropdown | Write strategy: Truncate, Drop, Append, Update, Merge. | Default Truncate. Options are filtered by the selected sink’s writeModes. Overwriting the tap restricts the list to Update. |
| Stream content (stream) | Dropdown | Decide how verbose the job log stream should be. | Choices: Data and metadata (default), Data only, Metadata only. |
| Read batch size (readBatch) | Number input | Records pulled per batch from the source. | Default 32768. Rules can override via saved options. |
| Write batch size (writeBatch) | Number input | Records pushed per batch to the destination. | Default 32768. |
| Concurrent workers (concurrent) | Number input (fed by RuleConcurrency) | Sets parallel workers for the run. | Defaults originate from the selected rule or modal initial values. |
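The write-mode filtering described in the Target entities row can be sketched as follows. The function name availableWriteModes and the lowercase mode strings are assumptions; the sink's writeModes list is the one mentioned in the table:

```typescript
// Sketch of the Target entities filtering: the full mode list is narrowed
// to what the sink supports, and overwriting the tap leaves only Update.
// availableWriteModes and the lowercase strings are illustrative.
const ALL_WRITE_MODES = ["truncate", "drop", "append", "update", "merge"] as const;

function availableWriteModes(sinkWriteModes: string[], overwriteTap: boolean): string[] {
  if (overwriteTap) return ["update"];
  return ALL_WRITE_MODES.filter((mode) => sinkWriteModes.includes(mode));
}
```

This mirrors the guard described earlier: destinations that cannot be truncated or dropped simply never offer those modes.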
Schedule panel
| Choice | Additional fields | Outcome | Primary button label |
|---|---|---|---|
| Run now | None | Job is submitted immediately (when = now). | Run Now |
| Schedule | Run at (DatePicker) | Creates a one-time scheduledDate; appears in the Scheduled tab until execution. | Run Later |
| Manual execution | Pipeline name (footer) | Saves the configuration as a manual pipeline; no run is queued. | Save as Pipeline |
| Repeat | Next run on (DatePicker), Repeat every (number), Repeat type (Hour/Day/Month/Year), Pipeline name | Produces a pipeline with recurrence metadata (nextRunDate, repeatEvery, repeatType). | Save as Pipeline |
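A sketch of how the recurrence fields might advance nextRunDate after each run, assuming repeatType maps directly onto calendar units; advanceNextRun is an illustrative name, not the scheduler's actual API:

```typescript
// Hedged sketch: advance nextRunDate by repeatEvery units of repeatType.
// Name and behaviour are assumptions based on the metadata fields above.
type RepeatType = "hour" | "day" | "month" | "year";

function advanceNextRun(nextRunDate: Date, repeatEvery: number, repeatType: RepeatType): Date {
  const next = new Date(nextRunDate.getTime());
  switch (repeatType) {
    case "hour": next.setHours(next.getHours() + repeatEvery); break;
    case "day": next.setDate(next.getDate() + repeatEvery); break;
    case "month": next.setMonth(next.getMonth() + repeatEvery); break;
    case "year": next.setFullYear(next.getFullYear() + repeatEvery); break;
  }
  return next;
}
```

Using calendar arithmetic (setMonth rather than a fixed millisecond offset) keeps monthly and yearly cadences anchored to the same day-of-month across months of different lengths.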
Schedule summaries when collapsed
When collapsed, the Schedule header shows “Now” for immediate runs, the scheduled timestamp for one-time runs, or the pipeline cadence for manual and repeating pipelines.
Putting it together
- Start with From and To. These two panels decide the internal jobType (dump, load, or pump) and preconfigure defaults for downstream panels.
- Optionally select a Rule to apply masking or transformation logic; doing so unlocks saved load/transform presets.
- Use Load options and Transform options to fine-tune throughput, determinism, and dictionary reuse.
- Finish in the Schedule panel. If you opt for a pipeline, provide a unique name in the footer before saving.
- Review the collapsed summaries. Each header highlights missing required fields in orange so misconfigurations are easy to spot.
Closing the modal clears the cached selections (prevFrom, prevTo, prevRule) so every new launch starts from clean defaults.
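That reset can be sketched as below, assuming a simple cache object keyed by the field names mentioned above; clearCache is an illustrative name:

```typescript
// Sketch of the close handler: drop the cached selections so the next
// launch of the modal starts from clean defaults. Names are illustrative.
interface ModalCache {
  prevFrom?: string;
  prevTo?: string;
  prevRule?: string;
}

function clearCache(_cache: ModalCache): ModalCache {
  // Returning a fresh object guarantees no stale selection survives.
  return { prevFrom: undefined, prevTo: undefined, prevRule: undefined };
}
```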