Jobs

The Jobs page is your operational hub for managing all data processing activity in your model. Here you can see everything that has run, is running, or is scheduled to run in the future.

Page overview

┌───────────────────────────────────────────────────────────────┐
│ Jobs Page                                                     │
│ ┌─────────────┬────────────┐   ┌────────────────────────────┐ │
│ │ Last Jobs ▣ │ Scheduled  │   │ New Job ⊕ • Sort ▾         │ │
│ └─────────────┴────────────┘   └────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Status | Name | Type | Rule | Created By | Info | Actions │ │
│ │   ●    | tap→sink dump                                    │ │
│ │   ○    | nightly discover                                 │ │
│ └───────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────┘
UI area | What it shows | How to interact
--- | --- | ---
Tabs (Last Jobs, Scheduled Jobs) | Filter your view between recent activity and future scheduled work. | Click tabs to switch between views.
Toolbar (New Job, Sort) | Primary actions to create new jobs or sort existing ones. | Use the New Job button to create a new data processing job.
Jobs table | Table showing all your data processing activities with their status and details. | Click on any column header to filter results. Click on a job name to see details.

Jobs table columns

Column | Sample value | What it shows
--- | --- | ---
Status | Running indicator or Completed checkmark | Current state of the job.
Name | tap-to-s3 (2024-05-06 01:00) | Descriptive name of the job. Click to see detailed logs.
Model (project-wide view only) | Customer Data | Which data model this job belongs to.
Type | dump, load, pump, discover, scan | What kind of operation the job performs.
Rule | Anonymize PII | If applicable, which rule was applied to your data.
Created By | jane.doe | Who initiated the job.
Info | Started: 12:21 • Duration: 00:03:18 or Next: 07/24/2024 22:00 | When the job started/finished or when it's scheduled to run.
Actions | Menu with available actions | Context-sensitive actions based on the job's current status.
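
If it helps to think of each row as a record, the sketch below shows one way the columns could be represented in code. It is purely illustrative, a reader's mental model; the field names and types are assumptions, not the platform's actual schema.

```typescript
// Illustrative only: an assumed shape for one row of the Jobs table.
// The field names mirror the columns above; this is not an official schema.
type JobType = "dump" | "load" | "pump" | "discover" | "scan";
type JobStatus = "queued" | "running" | "completed" | "failed" | "scheduled";

interface JobRow {
  status: JobStatus;
  name: string;       // e.g. "tap-to-s3 (2024-05-06 01:00)"
  model?: string;     // only shown in the project-wide view
  type: JobType;
  rule?: string;      // e.g. "Anonymize PII", when a rule was applied
  createdBy: string;  // e.g. "jane.doe"
  info: string;       // start time and duration, or the next scheduled run
}

const example: JobRow = {
  status: "completed",
  name: "tap-to-s3 (2024-05-06 01:00)",
  model: "Customer Data",
  type: "dump",
  rule: "Anonymize PII",
  createdBy: "jane.doe",
  info: "Started: 12:21 • Duration: 00:03:18",
};
```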

Status indicators

Status | What it means | What you can do
--- | --- | ---
queued | Job accepted and waiting to start. | Cancel if needed, otherwise wait for it to begin.
running | Job is currently processing data. | Monitor progress through the job details page; cancel if needed.
completed | Job finished successfully. | View results or rerun if needed.
failed | Job encountered an error. | Restart to retry the failed parts or rerun completely.
scheduled | Job is set to run at a future time. | Edit or cancel the scheduled time.
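
Read another way, the table is a mapping from status to the actions you are likely to find in the Actions menu. A minimal sketch, using the action names from the sections below; the real menu contents are decided by the platform, not this list:

```typescript
// Rough, illustrative mapping of job status to likely Actions-menu entries.
type JobStatus = "queued" | "running" | "completed" | "failed" | "scheduled";

const actionsByStatus: Record<JobStatus, string[]> = {
  queued: ["cancel"],
  running: ["cancel"],
  completed: ["rerun", "delete history"],
  failed: ["restart", "rerun"],
  scheduled: ["edit schedule", "cancel schedule"],
};
```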

Working with jobs

Starting new jobs

The New Job button (+) in the toolbar opens the New Job modal, the primary interface for creating data processing jobs.

Key features of the New Job modal (a rough sketch of how the panels fit together follows this list):

  • From panel: Choose your source (tap or dataset) with environment and driver information
  • To panel: Route data to sinks, taps, or create new datasets
  • Rule panel: Apply optional transformation or anonymization rules
  • Load options: Fine-tune batch sizes, write modes, and performance settings
  • Schedule panel: Choose between immediate execution, one-time scheduling, or recurring pipelines
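
As a mental model, these panels combine into a single job configuration. The sketch below is illustrative only; every field name and value is an assumption made for the example, not the modal's actual data format.

```typescript
// Hypothetical sketch of how the New Job panels fit together.
// All names here are assumptions made for illustration, not a real schema.
interface NewJobConfig {
  from: { source: string; kind: "tap" | "dataset"; environment?: string };
  to: { target: string; kind: "sink" | "tap" | "new-dataset" };
  rule?: string;                                  // optional transformation/anonymization rule
  loadOptions?: { batchSize?: number; writeMode?: "append" | "overwrite" };
  schedule:
    | { mode: "run-now" }
    | { mode: "run-later"; at: string }           // one-time, ISO date-time
    | { mode: "pipeline"; recurrence?: string };  // reusable or recurring
}

const nightlyAnonymize: NewJobConfig = {
  from: { source: "production-db", kind: "tap", environment: "prod" },
  to: { target: "s3-staging", kind: "sink" },
  rule: "Anonymize PII",
  loadOptions: { batchSize: 1000, writeMode: "overwrite" },
  schedule: { mode: "pipeline", recurrence: "daily" },
};
```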

How to use:

  1. Click the New Job (+) button in the top-right toolbar
  2. Follow the step-by-step panels to configure your data flow
  3. Choose to run immediately (Run Now), schedule for later (Run Later), or save as a reusable pipeline
  4. Review the collapsed panel summaries to ensure all required fields are configured

For detailed information about all available options and configuration settings, see the complete New Job modal documentation.

Managing existing jobs

  • Click any job name to view its details and logs
  • Use the Actions menu (⋮) to perform context-appropriate actions, described in the sections below.

Cancel jobs that are queued or running

Sometimes you may need to stop a job that's currently running or waiting in the queue. This is useful when:

  • You've started a job by mistake
  • You realize you need to make changes to the configuration before proceeding
  • The job is taking longer than expected and blocking other operations
  • You've identified an issue that makes the job unnecessary

Important: Canceling a job may leave it in an undesired state, so use this action with care. Any data processing that was already completed will remain, but partial operations may need to be cleaned up manually.

Restart jobs that failed

When a job fails, you can restart it from the point of failure rather than starting over completely. This action:

  • Skips entities that were already processed successfully
  • Retries only the entities that failed or hadn't started yet
  • Continues processing where it left off

This is particularly useful for jobs that process large volumes of data where most entities were successful, and you only need to retry the failed ones.
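
The behavior described above amounts to a simple "resume from failure" rule: skip what already succeeded, retry everything else. The sketch below is a generic illustration of that idea, not Gigantics internals:

```typescript
// Generic illustration of restart behavior: entities that already succeeded
// are skipped; failed and not-yet-started entities are retried.
type EntityState = "succeeded" | "failed" | "pending";

function entitiesToProcessOnRestart(entities: Map<string, EntityState>): string[] {
  return [...entities]
    .filter(([, state]) => state !== "succeeded") // skip completed work
    .map(([name]) => name);                       // retry the rest
}

// Example: only "orders" and "invoices" would be reprocessed.
const states = new Map<string, EntityState>([
  ["customers", "succeeded"],
  ["orders", "failed"],
  ["invoices", "pending"],
]);
console.log(entitiesToProcessOnRestart(states)); // ["orders", "invoices"]
```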

Rerun jobs that completed

This action starts a job over completely from the beginning, reusing all the original job settings. Why would you want to do this?

  • Repeat the same operation: Run the exact same scan, discovery, or data processing again with the same parameters
  • Refresh data: Get updated results based on the current state of your data sources
  • Test consistency: Verify that the job produces the same results when run multiple times
  • Apply to new data: If your data source has been updated, rerun the job to process the new information

This is an efficient way to repeat operations without having to reconfigure all the settings.

Delete job history

Remove completed jobs from your job history list. This action:

  • Cleans up your job list for better organization
  • Removes old jobs that are no longer relevant
  • Helps maintain a focused view of recent and active operations

Deleting job history only removes the record from this list - it doesn't affect any data that was processed or created by the job itself.

Download rule configuration as YAML

Export the rule configuration used in a job as a YAML file. This is helpful when:

  • You want to share job configurations with team members
  • You need to send configuration details to Gigantics support for debugging
  • You want to replicate an issue or specific job configuration
  • You need to audit or document the rules applied to your data
  • You're migrating configurations between environments
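
If you want to inspect or diff an exported file outside the UI, a short script can load it. A minimal sketch, assuming Node.js with the js-yaml package installed; the file name is a placeholder, and the structure of the export depends entirely on your rule:

```typescript
// Minimal sketch: load an exported rule configuration and list its top-level keys.
// Assumes Node.js + js-yaml; "rule-config.yaml" is a placeholder file name.
import { readFileSync } from "node:fs";
import { load } from "js-yaml";

const exported = load(readFileSync("rule-config.yaml", "utf8"));
console.log(Object.keys(exported as Record<string, unknown>));
```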

Scheduled jobs

Jobs can be scheduled to run automatically in several ways:

Schedule method | What it does
--- | ---
Run now | Execute the job immediately.
One-time schedule | Set a specific date and time for the job to run.
Manual pipeline | Save job configuration as a reusable pipeline template.
Recurring pipeline | Create an automatically repeating job (daily, weekly, etc.).

Scheduled jobs appear in the Scheduled Jobs tab until they run, making it easy to see what's coming up.
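
Viewed as data, the four methods differ only in when and how often a job fires. A generic sketch with assumed names, just to make the distinctions concrete; it does not reflect how the platform actually stores schedules:

```typescript
// Generic illustration of the four scheduling methods; all names are assumptions.
type Schedule =
  | { method: "run-now" }
  | { method: "one-time"; runAt: Date }
  | { method: "manual-pipeline" }   // saved template, run on demand
  | { method: "recurring-pipeline"; every: "daily" | "weekly" | "monthly" };

const oneOff: Schedule = { method: "one-time", runAt: new Date("2024-07-24T22:00:00") };
```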

Where jobs originate

Jobs come from various actions in the platform:

  • Discover — Scanning your data sources for sensitive information
  • Rules — Data anonymization or transformation operations
  • Datasets — Data export, copy, or merge operations
  • Sinks — Loading processed data to destinations
  • Pipelines — Automated sequences of jobs

Whenever you configure one of these operations to run in the future, it appears in your Scheduled Jobs tab.