Operations

Query

The Query operation allows you to write custom SQL queries to extract data from your sources instead of relying on the default discovery and extraction mechanisms. This operation provides complete control over your data retrieval process and is executed before any transformation operations in the pipeline.

Overview

The Query operation enables you to:

  • Write custom SQL SELECT statements for precise data extraction
  • Override automatic data discovery with targeted queries
  • Join tables, filter records, and aggregate data as needed
  • Apply different queries to different entity types
  • Optimize data retrieval performance with custom queries
  • Access specific subsets of data based on complex business requirements

Configuration

Entity-Level Query Customization

The Query operation is configured at the entity level, allowing you to specify different SQL queries for each entity in your dataset:

  1. Select which entities should use custom SQL queries
  2. Write SQL statements tailored to each entity's requirements
  3. Test and validate your queries before execution

Query Requirements

When writing custom SQL queries, ensure they meet these requirements:

  • Queries must be valid SQL SELECT statements
  • Queries should return all required fields for the entity
  • Queries can include JOINs, WHERE clauses, ORDER BY, and other SQL features
  • Queries are executed in the context of the configured data source
  • Query results should match the expected schema for downstream operations

Examples

Basic Custom Query

To replace the default extraction for a customers entity with a custom query:

  1. Add the Query operation to your rule
  2. Select the customers entity
  3. Write a custom query like:
SELECT customer_id, first_name, last_name, email, phone, created_date
FROM customers
WHERE status = 'active'
ORDER BY created_date DESC

Complex Query with Joins

To create a unified view of orders and customer data:

  1. Select the orders entity in the Query configuration
  2. Write a query that joins multiple tables:
SELECT o.order_id, o.order_date, o.total_amount,
       c.first_name, c.last_name, c.email,
       p.product_name, p.category
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE o.order_date >= DATEADD(month, -6, GETDATE())

Aggregated Data Query

To extract summarized data instead of individual records:

  1. Select the sales_summary entity
  2. Write an aggregation query:
SELECT DATE_TRUNC('month', order_date) as month,
       COUNT(*) as order_count,
       SUM(total_amount) as monthly_revenue,
       AVG(total_amount) as avg_order_value
FROM orders
WHERE order_date >= '2023-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month

Best Practices

  • Use parameterized queries when possible to prevent SQL injection
  • Test queries in your database environment before adding to rules
  • Keep queries as simple as possible for better performance
  • Use indexes appropriately in your queries
  • Limit result sets when you don't need all records
  • Ensure query column names match expected entity field names

This operation provides advanced users with the flexibility to extract exactly the data they need using custom SQL, bypassing the need for multiple filtering operations.

On this page