New Job modal

The New Job modal opens when you click the New Job button on the Jobs page, or when another workflow hands off to the job engine. It is a multi-panel form that adapts to your selections and produces either a one-off job or a reusable pipeline.

┌─────────────────────────────────────────────────────────────────────┐
│ New Job                                                             │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ From ▾ | To ▾ | Rule ▾ | Load options ▾ | Transform ▾ | …     │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ Panels collapse once configured so you can scan the summary rows │
└─────────────────────────────────────────────────────────────────────┘

When Manual execution or Repeat is selected in the Schedule panel, the footer reveals a required Pipeline name field and the primary button switches to Save as Pipeline. In all other cases the button reads Run Now or Run Later depending on the selected schedule.
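
The footer behaviour can be sketched roughly as follows. This is an illustrative sketch, not the modal's actual code: the `ScheduleChoice` type, its string values, and the `footerState` function are hypothetical names. Only the button labels and the Pipeline name requirement come from the description above.

```typescript
// Hypothetical identifiers for the four Schedule panel choices.
type ScheduleChoice = "now" | "schedule" | "manual" | "repeat";

interface FooterState {
  primaryLabel: "Run Now" | "Run Later" | "Save as Pipeline";
  pipelineNameRequired: boolean;
}

// Manual execution and Repeat both produce a pipeline, so the footer asks
// for a name and the primary button saves instead of running.
function footerState(choice: ScheduleChoice): FooterState {
  if (choice === "manual" || choice === "repeat") {
    return { primaryLabel: "Save as Pipeline", pipelineNameRequired: true };
  }
  if (choice === "schedule") {
    return { primaryLabel: "Run Later", pipelineNameRequired: false };
  }
  return { primaryLabel: "Run Now", pipelineNameRequired: false };
}
```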

Panel overview

| Panel | Purpose | Summary text when collapsed |
| --- | --- | --- |
| From | Choose the source tap or a saved dataset. | Shows the tap (with driver/env badges) or the dataset name. |
| To | Route the data into a sink, tap, or dataset. | Lists the sink or dataset action (new, overwrite, merge). |
| Rule | Attach an optional masking rule and schema version. | Displays the selected rule (or “No rule”). |
| Load options | Tune batches and write mode for the selected destination. | Condenses the selected write mode and read/write batch sizes. |
| Transform options (conditional) | Configure determinism, dictionary behaviour, and foreign-key handling when a rule is active. | Shows determinism + dictionary state. |
| Schedule | Decide when and how often to run the job. | Shows “Now”, the scheduled timestamp, or pipeline cadence. |

The following sections break down every configurable field.

From panel

| Field | Control | Description | Dependencies & defaults |
| --- | --- | --- | --- |
| Source selector | Radio (Tap, Dataset) | Swap between streaming from the model tap or starting from an existing dataset. | Defaults to Tap. Switching to Dataset enables the dataset dropdown. |
| Tap summary | Read-only text with badges | Displays tap name, environment tag, and driver tag fetched from model.tap. | Always visible so you confirm the environment before launching. |
| Dataset | Dropdown (DatasetSelect) | Pick the dataset that will act as the source when Dataset is selected. | Required when from = from-dataset. Disabled otherwise. |

Internal behaviour:

  • Choosing Tap keeps jobType aligned with the destination (pump for a sink, dump for a dataset).
  • Selecting a dataset stores its ID for the downstream panels (merge and overwrite require it).

To panel

| Destination option | Extra controls | Description | Dependencies & defaults |
| --- | --- | --- | --- |
| Sink | Dropdown (SinkSelect) | Stream the output to one of the model sinks. | Required when selected. Sets runType = load-to-sink. Job type becomes pump (tap source) or load (dataset source). |
| Overwrite current tap | None | Write back into the tap itself (only for drivers that support it). | Disabled unless the tap driver is in the overwrite allow-list. Forces the write mode to Update and sets runType = overwrite-tap. |
| New Dataset | Text input | Create a new dataset; leave blank to auto-generate a name. | Available for all drivers. Sets jobType = dump for tap sources, or pump for dataset copies. |
| Overwrite dataset | Dropdown (DatasetSelect) | Replace all rows in an existing dataset. | Required when selected. Keeps job type in sync with the source (dump/pump). |
| Merge into dataset | Dropdown (DatasetSelect) | Append + merge into an existing dataset. | Required when selected. Available for both tap and dataset sources. |

Behind the scenes the modal recalculates jobType, runType, and default load mode whenever the From/To pair changes. That guards the later panels from incompatible combinations (for example, overwriting the tap automatically disables Truncate/Drop load modes).
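
As a rough illustration of that recalculation, the sketch below maps a From/To pair to the values the To panel table documents. The type names, the destination identifiers, and `deriveJob` are hypothetical. Where this page does not state a value (for example the jobType when overwriting the tap or merging), the sketch leaves the field undefined rather than guessing.

```typescript
// Hypothetical names; only the jobType and runType values come from the table.
type Source = "tap" | "dataset";
type Destination =
  | "sink"
  | "overwrite-tap"
  | "new-dataset"
  | "overwrite-dataset"
  | "merge-dataset";
type JobType = "dump" | "load" | "pump";

interface DerivedJob {
  jobType?: JobType; // undefined where this page states no value
  runType?: "load-to-sink" | "overwrite-tap";
  forcedWriteMode?: "Update";
}

function deriveJob(source: Source, destination: Destination): DerivedJob {
  switch (destination) {
    case "sink":
      // pump when streaming from the tap, load when starting from a dataset
      return { jobType: source === "tap" ? "pump" : "load", runType: "load-to-sink" };
    case "overwrite-tap":
      return { runType: "overwrite-tap", forcedWriteMode: "Update" };
    case "new-dataset":
    case "overwrite-dataset":
      // dump for tap sources, pump for dataset-to-dataset copies
      return { jobType: source === "tap" ? "dump" : "pump" };
    default:
      // merge-dataset: no jobType or runType is documented on this page
      return {};
  }
}
```

Calling `deriveJob` again whenever either panel changes keeps the later panels from seeing a stale combination.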

Rule panel

| Field | Control | Description | Notes |
| --- | --- | --- | --- |
| Rule | Dropdown (RuleSelect) | Attach a transformation/anonymisation rule. | Defaults to none. Fetches rule options so Load/Transform panels receive pre-filled values. |
| Schema version | Dropdown (Select) | Choose the schema snapshot that should be used during execution. | Populated from Model.schemaVersions. The latest version is labelled (Latest). |
| Concurrency | Embedded component (RuleConcurrency) | Configure parallel entity execution limits inherited from the rule. | Lets you balance performance vs resource usage per job. |

Selecting a rule unlocks the Transform options panel described below.

Transform options (conditional)

| Field | Description | Key details |
| --- | --- | --- |
| Determinism | Toggle between Random and Deterministic. | Deterministic mode requires a seed and guarantees repeatable outputs. |
| Seed | Numeric input with “Generate” helper. | Enabled only in deterministic mode. Accepts values 1-9,999,999. |
| Check foreign keys | Dropdown (Auto, Do not check, Force). | Influences throughput vs referential integrity guarantees. |
| Dictionary | Radio group (No dictionary, Field, Label, Global) plus Cache, Keep, Overwrite toggles. | Mirrors the dictionary behaviour documented in the Dictionary guide. Disabled when mode = none. |
| Continue streaming on fail | Checkbox | Keeps streaming after entity-level errors (default: enabled). |
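
The seed rules in the table above could be checked along these lines. This is a minimal sketch: `validateSeed` and `generateSeed` are hypothetical helpers, and treating the seed as a whole number is an assumption, since the table only gives the accepted range.

```typescript
const SEED_MIN = 1;
const SEED_MAX = 9999999;

type Determinism = "random" | "deterministic";

// Returns an error message, or null when the seed setting is acceptable.
function validateSeed(mode: Determinism, seed: number | undefined): string | null {
  if (mode === "random") return null; // the seed input is disabled in random mode
  if (seed === undefined) return "Deterministic mode requires a seed";
  // Assumption: seeds are whole numbers within the documented range.
  if (Math.floor(seed) !== seed || seed < SEED_MIN || seed > SEED_MAX) {
    return "Seed must be a whole number from " + SEED_MIN + " to " + SEED_MAX;
  }
  return null;
}

// Stand-in for the "Generate" helper: a random seed inside the accepted range.
function generateSeed(): number {
  return Math.floor(Math.random() * SEED_MAX) + SEED_MIN;
}
```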

Load options panel

The Load options panel appears only when the destination implies writing to a sink or tap (jobType = load, or jobType = pump with a tap source). For dataset-only copies the panel is hidden automatically.

| Field | Control | Description | Behaviour & defaults |
| --- | --- | --- | --- |
| Target entities (load) | Dropdown | Write strategy: Truncate, Drop, Append, Update, Merge. | Default Truncate. Options are filtered by the selected sink’s writeModes. Overwriting the tap restricts the list to Update. |
| Stream content (stream) | Dropdown | Decide how verbose the job log stream should be. | Choices: Data and metadata (default), Data only, Metadata only. |
| Read batch size (readBatch) | Number input | Records pulled per batch from the source. | Default 32768. Rules can override via saved options. |
| Write batch size (writeBatch) | Number input | Records pushed per batch to the destination. | Default 32768. |
| Concurrent workers (concurrent) | Number input (fed by RuleConcurrency) | Sets parallel workers for the run. | Defaults originate from the selected rule or modal initial values. |
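
To make the visibility rule and the write-mode filtering concrete, here is a small sketch. The function names, the `LOAD_DEFAULTS` object shape, and the way the sink's `writeModes` is passed in are assumptions for illustration. The default values and the filtering rules are taken from the table above.

```typescript
type WriteMode = "Truncate" | "Drop" | "Append" | "Update" | "Merge";
type JobType = "dump" | "load" | "pump";

const ALL_WRITE_MODES: WriteMode[] = ["Truncate", "Drop", "Append", "Update", "Merge"];

// Defaults as listed in the Load options table (object shape is hypothetical).
const LOAD_DEFAULTS = {
  load: "Truncate" as WriteMode,
  stream: "Data and metadata",
  readBatch: 32768,
  writeBatch: 32768,
};

// The panel shows for load jobs, and for pump jobs that read from the tap.
function showLoadOptions(jobType: JobType, source: "tap" | "dataset"): boolean {
  return jobType === "load" || (jobType === "pump" && source === "tap");
}

// Keep only the modes the sink supports; overwriting the tap leaves Update.
function availableWriteModes(sinkWriteModes: WriteMode[], overwriteTap: boolean): WriteMode[] {
  if (overwriteTap) return ["Update"];
  return ALL_WRITE_MODES.filter((mode) => sinkWriteModes.indexOf(mode) !== -1);
}
```

For example, `availableWriteModes(["Append", "Truncate"], false)` yields `["Truncate", "Append"]`, preserving the dropdown order rather than the sink's order.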

Schedule panel

| Choice | Additional fields | Outcome | Primary button label |
| --- | --- | --- | --- |
| Run now | None | Job is submitted immediately (when = now). | Run Now |
| Schedule | Run at (DatePicker) | Creates a one-time scheduledDate; appears in the Scheduled tab until execution. | Run Later |
| Manual execution | Pipeline name (footer) | Saves the configuration as a manual pipeline; no run is queued. | Save as Pipeline |
| Repeat | Next run on (DatePicker), Repeat every (number), Repeat type (Hour/Day/Month/Year), Pipeline name | Produces a pipeline with recurrence metadata (nextRunDate, repeatEvery, repeatType). | Save as Pipeline |
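
The four outcomes can be pictured as a small payload builder. This is a sketch under assumptions: the `ScheduleInput` and `SchedulePayload` shapes, the `kind` field, and the use of ISO 8601 strings for dates are invented for illustration. Only the field names `when`, `scheduledDate`, `nextRunDate`, `repeatEvery`, and `repeatType` appear in the table above.

```typescript
type RepeatType = "Hour" | "Day" | "Month" | "Year";

// Hypothetical input shape: one variant per Schedule panel choice.
type ScheduleInput =
  | { choice: "now" }
  | { choice: "schedule"; runAt: Date }
  | { choice: "manual"; pipelineName: string }
  | { choice: "repeat"; pipelineName: string; nextRunOn: Date; repeatEvery: number; repeatType: RepeatType };

interface SchedulePayload {
  kind: "job" | "pipeline"; // hypothetical discriminator
  when?: "now";
  scheduledDate?: string; // assumption: dates serialised as ISO 8601 strings
  pipelineName?: string;
  nextRunDate?: string;
  repeatEvery?: number;
  repeatType?: RepeatType;
}

function buildSchedule(input: ScheduleInput): SchedulePayload {
  switch (input.choice) {
    case "now":
      return { kind: "job", when: "now" };
    case "schedule":
      return { kind: "job", scheduledDate: input.runAt.toISOString() };
    case "manual":
      // Saved as a pipeline; no run is queued.
      return { kind: "pipeline", pipelineName: input.pipelineName };
    default:
      // Repeat: a pipeline carrying recurrence metadata.
      return {
        kind: "pipeline",
        pipelineName: input.pipelineName,
        nextRunDate: input.nextRunOn.toISOString(),
        repeatEvery: input.repeatEvery,
        repeatType: input.repeatType,
      };
  }
}
```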

Schedule summaries when collapsed

┌────────────┬────────────────────────────────────────────┐
│ Run now    │ Summary reads: “Run: Now”                  │
├────────────┼────────────────────────────────────────────┤
│ Schedule   │ Shows the scheduled timestamp (UTC offset) │
├────────────┼────────────────────────────────────────────┤
│ Manual     │ Displays “Run pipeline manually”           │
├────────────┼────────────────────────────────────────────┤
│ Repeat     │ Shows next run timestamp + cadence (e.g.   │
│            │ “Next: 2024-07-24 18:00 — Every 1 day”)    │
└────────────┴────────────────────────────────────────────┘

Putting it together

  1. Start with From and To. These two panels decide the internal jobType (dump, load, or pump) and preconfigure defaults for downstream panels.
  2. Optionally select a Rule to apply masking or transformation logic; doing so unlocks saved load/transform presets.
  3. Use Load options and Transform options to fine-tune throughput, determinism, and dictionary reuse.
  4. Finish in the Schedule panel. If you opt for a pipeline, provide a unique name in the footer before saving.
  5. Review the collapsed summaries. Each header highlights missing required fields in orange so misconfigurations are easy to spot.
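
The review in step 5 amounts to a required-field check across the panels. The sketch below collects the requirements stated on this page (source dataset, sink or target dataset, seed in deterministic mode, pipeline name). The `ModalState` shape and `missingRequired` are hypothetical, and the real modal may track additional fields.

```typescript
// Hypothetical snapshot of the modal's selections.
interface ModalState {
  from: "tap" | "dataset";
  fromDatasetId?: string;
  to: "sink" | "overwrite-tap" | "new-dataset" | "overwrite-dataset" | "merge-dataset";
  sinkId?: string;
  toDatasetId?: string;
  ruleId?: string;
  determinism?: "random" | "deterministic";
  seed?: number;
  schedule: "now" | "schedule" | "manual" | "repeat";
  pipelineName?: string;
}

// Returns the areas that still have a required field missing.
function missingRequired(s: ModalState): string[] {
  const missing: string[] = [];
  if (s.from === "dataset" && !s.fromDatasetId) missing.push("From");
  const needsSink = s.to === "sink" && !s.sinkId;
  const needsDataset =
    (s.to === "overwrite-dataset" || s.to === "merge-dataset") && !s.toDatasetId;
  if (needsSink || needsDataset) missing.push("To");
  // Transform options only apply when a rule is selected.
  if (s.ruleId && s.determinism === "deterministic" && s.seed === undefined) {
    missing.push("Transform options");
  }
  const isPipeline = s.schedule === "manual" || s.schedule === "repeat";
  if (isPipeline && (s.pipelineName || "").trim() === "") missing.push("Pipeline name");
  return missing;
}
```

An empty result means every requirement described on this page is satisfied for the current selections.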

Closing the modal clears the cached selections (prevFrom, prevTo, prevRule) so every new launch starts from clean defaults.