Omnitrace

Production architecture

From lakehouse telemetry to verified action.

Omnitrace connects Databricks and cloud signals to detectors, Atlas Playbook agents, governed remediation, and verified outcomes.

Architecture diagram

From signals to verified outcomes.

Omnitrace turns lakehouse metadata into governed agent work without requiring access to customer data.

Customer data stays outside

1. Signals (Databricks + cloud): Billing usage, System tables, Query history, Jobs, Cluster config, Cloud cost metadata

2. Platform (Omnitrace control plane): Evidence graph, Detector engine, Policy store, Tenant boundary, Action ledger, Verification records

3. Agents (Atlas Playbook): FinOps agent, Reliability agent, Investigator workers, Remediation workers, Verifier workers, Model boundary controls

4. Outcomes (Governed action): Owner context, Approval gates, Scoped MCP tools, Safe remediation, Read-back checks, Outcome evidence

Metadata only, Policy-gated autonomy, Cloud-hosted model option, Verified outcomes

Zone 01

Databricks + cloud signals

Operational metadata and telemetry enter through scoped connections. Omnitrace does not need customer rows, files, query results, or business records.

Billing usage

System tables

Query history

Jobs and workflows

Cluster configuration

Cloud cost metadata

Zone 02

Omnitrace control, data, and agent plane

Signals are normalized into an evidence graph where detectors, policies, ownership, and action history stay connected.

Tenant boundary

Evidence graph

Detector engine

Policy store

Action ledger

Verification records
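
As a rough illustration of how signals, findings, ownership, and action history could stay connected, the sketch below models a very small evidence graph in Python. The class names, node kinds, and identifiers are hypothetical and chosen for illustration; they are not Omnitrace's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    kind: str  # hypothetical kinds: "signal", "finding", "policy", "owner", "action"
    key: str


@dataclass
class EvidenceGraph:
    tenant_id: str  # each graph belongs to exactly one tenant
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def link(self, a: Node, b: Node) -> None:
        # Store undirected links so context can be walked from either side.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def context(self, start: Node) -> set:
        # Collect every node reachable from `start` (iterative depth-first walk).
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node] - seen)
        seen.discard(start)
        return seen
```

Linking a signal to a finding and the finding to an owner means that asking for the context of any one of them returns the other two, which is the property the zone description relies on.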

Zone 03

Atlas Playbook + worker agents

Atlas Playbook decides what should happen next, then delegates focused tasks to worker agents under policy and autonomy controls.

FinOps agent

Reliability agent

Investigator workers

Remediation workers

Verifier workers

Model boundary controls

Zone 04

Work, remediation, and verification

Findings become owned work, approved changes, and verified outcomes instead of ending as dashboard observations.

Jira-ready context

Approval gates

Scoped MCP tools

Guarded remediation

Read-back checks

Outcome evidence

Operating loop

The platform loop is intentionally simple.

The detailed implementation can include many detectors, workers, and policy gates. The product loop stays easy to explain: sense, reason, route, act, and verify.

01 Sense operational metadata

02 Detect waste and drift

03 Build evidence and owner context

04 Select policy and autonomy level

05 Apply approved action

06 Verify the final state
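
The six steps above can be sketched as a small Python pipeline. This is a minimal illustration under stated assumptions: the cluster records, the idle-time detector, the policy dictionary, and every function name are hypothetical, and the "state" is an in-memory dictionary standing in for a real platform.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    resource: str
    issue: str
    owner: str = "unassigned"
    mode: str = "manual"
    applied: bool = False
    verified: bool = False


def sense():
    # 01: operational metadata only (hypothetical cluster records).
    return [
        {"cluster": "etl-nightly", "idle_minutes": 240, "auto_terminate": False, "team": "data-eng"},
        {"cluster": "adhoc-sql", "idle_minutes": 5, "auto_terminate": True, "team": "analytics"},
    ]


def detect(records, idle_threshold=120):
    # 02: flag clusters that idle for a long time with no auto-termination.
    return [
        Finding(r["cluster"], "idle without auto-termination")
        for r in records
        if r["idle_minutes"] >= idle_threshold and not r["auto_terminate"]
    ]


def add_owner(finding, records):
    # 03: attach owner context taken from the same metadata.
    teams = {r["cluster"]: r["team"] for r in records}
    finding.owner = teams.get(finding.resource, "unassigned")
    return finding


def select_mode(finding, policy):
    # 04: the policy decides the autonomy level; unknown issues stay manual.
    finding.mode = policy.get(finding.issue, "manual")
    return finding


def act(finding, state, approved):
    # 05: apply only when the mode allows it or a reviewer approved.
    if finding.mode == "automatic" or (finding.mode == "approval_required" and approved):
        state[finding.resource]["auto_terminate"] = True
        finding.applied = True
    return finding


def verify(finding, state):
    # 06: read the state back instead of trusting the write.
    finding.verified = finding.applied and state[finding.resource]["auto_terminate"]
    return finding


def run_loop(policy, approved=False):
    records = sense()
    state = {r["cluster"]: dict(r) for r in records}
    results = []
    for finding in detect(records):
        finding = select_mode(add_owner(finding, records), policy)
        results.append(verify(act(finding, state, approved), state))
    return results
```

Calling `run_loop({"idle without auto-termination": "approval_required"}, approved=True)` yields one finding that is owned, applied, and verified; the same call without approval leaves it unapplied and unverified.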

Security boundary

Serious automation without expanding the data boundary.

Omnitrace is designed to do operational work from metadata and telemetry. That keeps the product focused on waste, reliability, ownership, action, and verification rather than customer business data.

Metadata-only by design

The agent uses operational metadata, configuration, cost, workflow, and telemetry signals. Customer table contents and business records stay outside the product boundary.
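
One way to picture a metadata-only boundary is an allowlist applied before anything is stored or sent to an agent. The Python sketch below assumes a hypothetical set of field names; it does not describe Omnitrace's real schema or connector behavior.

```python
# Hypothetical allowlist of metadata fields; not Omnitrace's actual schema.
ALLOWED_FIELDS = frozenset({
    "workspace_id", "job_id", "cluster_id", "node_type",
    "num_workers", "duration_ms", "dbu_usage", "status",
})


def to_metadata(record: dict) -> dict:
    """Keep allowlisted fields only; everything else stays outside the boundary."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def dropped_fields(record: dict) -> list:
    """Names of fields that were held back, useful for an audit log."""
    return sorted(k for k in record if k not in ALLOWED_FIELDS)
```

An allowlist fails closed: a field nobody anticipated is dropped by default, whereas a blocklist would let it through until someone noticed.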

Cloud-bound reasoning option

Enterprise deployments can use model endpoints hosted by the customer's cloud provider so prompt context remains inside the selected cloud boundary.

Governed action path

Manual, approval-required, and low-risk automatic modes let teams decide which remediations can run and which require review.
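
The three modes can be thought of as a small decision function. The sketch below is an assumption-laden illustration in Python: the remediation names, risk labels, and return values are hypothetical, and a real policy store would carry far more context.

```python
from enum import Enum


class Mode(Enum):
    MANUAL = "manual"
    APPROVAL_REQUIRED = "approval_required"
    AUTOMATIC = "automatic"


# Hypothetical policy: remediation type -> (mode, risk level).
POLICY = {
    "enable_auto_termination": (Mode.AUTOMATIC, "low"),
    "resize_cluster": (Mode.APPROVAL_REQUIRED, "medium"),
    "delete_job": (Mode.MANUAL, "high"),
}


def decide(remediation: str, approved: bool = False) -> str:
    # Unknown remediations fall back to the most restrictive mode.
    mode, risk = POLICY.get(remediation, (Mode.MANUAL, "high"))
    if mode is Mode.AUTOMATIC and risk == "low":
        return "run"
    if mode is Mode.APPROVAL_REQUIRED:
        return "run" if approved else "await_approval"
    # Manual mode, or automatic mode on anything that is not low risk.
    return "open_ticket"
```

Here `decide("resize_cluster")` waits for approval, `decide("resize_cluster", approved=True)` runs, and anything not named in the policy is routed to a person.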

Review the architecture with us.

Review the metadata boundary, Atlas Playbook agents, policy gates, and verification loop for your Databricks environment.