Welcome to the Gigantics Documentation

What is Gigantics?

Gigantics brings data risk analysis, anonymization, and synthetic data generation together in one local-first platform. It continuously profiles your databases to surface personally identifiable information (PII), assesses field-level risk, and enriches metadata with an AI-powered labeling engine so that you can mask, anonymize, or synthesize data with confidence.

With Gigantics, you can:

✅ Analyze database schemas and compare them against previous revisions
✅ Identify PII automatically and review the risk level of each field
✅ Generate anonymized and fully synthetic datasets for safe testing
✅ Produce audit-ready security reports on demand
✅ Govern, share, and download curated datasets
✅ Move data between environments without exposing sensitive records
✅ Control access through granular roles and permissions

Data Operations: Discovery, Dump, Load, Pump

Gigantics streamlines four operations that define secure data movement. Each is backed by built-in anonymization pipelines and metadata awareness, so sensitive values are always handled safely.

Discovery

Bring metadata in from your taps (sources). Gigantics inspects structure, tags PII with labels, and stores it all in the Model.

Dump

Bring data in from your taps (sources), anonymizing it or synthesizing new values on the way. Gigantics uses labels to transform sensitive information and stores a governed copy as datasets in the Model.

Load

Move curated data out of the Model to sinks (targets). Masking, anonymization, or synthesis is applied in flight whenever a dataset is not already anonymized or new data is being generated, so only compliant payloads ever leave the platform.

Pump

Continuously refresh downstream environments with synthetic or anonymized snapshots. Automated pump jobs orchestrate recurring extractions, transformations, and deliveries without re-exposing raw production records. Each run can either overwrite the target or update it with only the new data.

Taps and Sinks

Taps represent the databases, warehouses, or APIs you pull from; sinks are the destinations where Gigantics delivers protected data. Every tap and sink inherits your data rules so anonymization and synthesis happen consistently across the entire flow.

Security & Collaboration

In sensitive data environments, proper access control and organizational structure are critical for maintaining data security while enabling team productivity. Gigantics provides a flexible, self-service security framework through its hierarchical organization system.

Each user has their own space, called an Organization, which serves as the top-level container for all data processing work. Within each Organization, you can create multiple Projects to isolate different workstreams, databases, or team responsibilities. This separation keeps sensitive data access properly compartmentalized and lets teams work independently without interfering with each other's workflows.

Projects function as individual workspaces where users can:

  • Create and manage data models
  • Configure database connections (taps and sinks)
  • Define anonymization and synthesis rules
  • Invite team members with appropriate permission levels
  • Share datasets and collaborate securely

The security model is designed to be self-service, allowing teams to create their own organizations and projects without requiring administrative intervention. This flexibility enables rapid experimentation and development while maintaining security boundaries through granular role-based access controls. Users can be assigned different permission levels within each project, ensuring that only authorized personnel can access or modify sensitive datasets.

This hierarchical approach to security enables organizations to implement a "zero-trust" data handling model where:

  • Access to sensitive data is controlled at the organization level
  • Teams can work independently within their projects
  • Data governance policies are consistently applied across all workspaces
  • Compliance requirements can be met through proper access logging and controls

Fast, Gigantic

Gigantics is engineered for high-performance data processing, leveraging cutting-edge technologies to handle large-scale datasets with remarkable speed and efficiency. Our platform processes data flows in parallel using three core optimization strategies:

Local Managed AI Rules

All AI processing happens locally on your infrastructure, eliminating network latency and data transfer bottlenecks. This approach not only ensures maximum security and compliance but also delivers faster processing times since data doesn't need to travel across networks for analysis.

Fast JavaScript Functions

Custom data transformations and anonymization operations are executed through highly optimized JavaScript functions. These lightweight functions provide rapid processing speeds while maintaining the flexibility to implement complex data manipulation logic tailored to your specific requirements.
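As a rough illustration of the kind of lightweight transformation such a function might implement, here is a small masking function. The name `maskPhone` and its signature are illustrative assumptions for this sketch, not part of the Gigantics API:

```javascript
// Hypothetical custom transformation: mask every digit of a phone
// number except the last two, so records stay distinguishable while
// the number itself is hidden. (Illustrative only; not the actual
// Gigantics function interface.)
function maskPhone(value) {
  const text = String(value);
  // Count the digits first, then walk them left to right.
  let remaining = text.replace(/\D/g, '').length;
  return text.replace(/\d/g, (digit) => (--remaining < 2 ? digit : '*'));
}

console.log(maskPhone('555-0142')); // ***-**42
```

Because such functions are pure and operate on single values, they can be applied to millions of rows with minimal overhead.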

Node Streams Architecture

Gigantics employs Node.js streams to process data incrementally rather than loading entire datasets into memory. This streaming approach enables efficient memory usage and allows data to flow through processing pipelines without waiting for complete dataset loads, significantly reducing processing times for large databases.

These technologies work in concert to deliver parallel processing capabilities that scale with your data volume. Whether you're anonymizing a small database or synthesizing massive datasets, Gigantics maintains consistent performance through intelligent resource management and concurrent processing streams.

Ready to dive deeper into Gigantics? Here are the essential topics to explore next:

  • Taps - Learn how to connect Gigantics to your source databases and systems
  • Sinks - Understand how to configure output destinations for your processed data
  • Models - Discover the core framework where you define data processing rules
  • Rules - Master the configuration of data transformation and anonymization workflows
  • Labels - Explore how Gigantics automatically identifies sensitive data fields
  • Functions - Customize data transformations with JavaScript functions
  • Dictionaries - Ensure consistency in data anonymization across your datasets
  • Datasets - Learn how to create, manage, and export processed data collections
  • Audit - Generate compliance-ready security reports for your data operations
  • Synthesize - Create entirely new, realistic datasets from scratch
  • Anonymize - Protect sensitive information while maintaining data utility
  • Discovery - Automatically scan and classify data in your databases
  • Jobs - Execute and monitor data processing operations
  • Pipelines - Schedule recurring data operations with automated workflows
  • Environments - Organize your database connections for different use cases
  • Transform - Apply custom JavaScript logic to reshape your data
  • Glossary - 📘 Decode the Gigantics universe - your passport to data privacy mastery!
  • PII Detection - Advanced techniques for identifying sensitive personal information
  • Data Lists - Manage curated collections of data for targeted processing
  • Installation & Setup Guide - 🚀 Start here to get Gigantics up and running on your system
