# Databricks

## Requirements

* A Databricks user with the Account Admin role
* A Teleskope account with the Admin role
* A **Teleskope service principal** created in Databricks
* A **personal access token (PAT)** or **OAuth credentials** for API access
* An active **SQL warehouse** per workspace
  * Use the same warehouse name in every workspace, ideally one dedicated to Teleskope
* Permissions granted to the Teleskope service principal on all target data assets

## Integration

Teleskope integrates with Databricks using Unity Catalog metadata APIs, SQL query execution APIs, and system tables. The connector supports the capabilities described below.

#### Metadata Discovery

Teleskope scans Unity Catalog using the following object hierarchy:

```
Workspace → Catalog → Schema → Table → Column
                             ↘︎ Volume (optional)
```

It discovers:

* Table metadata including schema, data types, tags, and masking policies
* Volume objects, including the paths of files stored in them (e.g., CSV, JSON, Parquet)
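
The traversal of this hierarchy can be sketched as follows. This is a minimal illustration, not Teleskope's implementation: it assumes the Unity Catalog REST list endpoints under `/api/2.1/unity-catalog`, a hypothetical `fetch` callable that performs the authenticated GET and returns parsed JSON, and it ignores pagination.

```python
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical helper type: fetch(path, query_params) -> parsed JSON body.
Fetch = Callable[[str, Dict[str, str]], dict]

UC = "/api/2.1/unity-catalog"


def walk_unity_catalog(fetch: Fetch) -> Iterator[Tuple[str, str]]:
    """Yield (kind, full_name) pairs for tables, columns, and volumes."""
    for catalog in fetch(f"{UC}/catalogs", {}).get("catalogs", []):
        cat = catalog["name"]
        schemas = fetch(f"{UC}/schemas", {"catalog_name": cat}).get("schemas", [])
        for schema in schemas:
            scope = {"catalog_name": cat, "schema_name": schema["name"]}
            for table in fetch(f"{UC}/tables", scope).get("tables", []):
                yield ("table", table["full_name"])
                for column in table.get("columns", []):
                    yield ("column", f'{table["full_name"]}.{column["name"]}')
            # Volumes sit next to tables inside a schema.
            for volume in fetch(f"{UC}/volumes", scope).get("volumes", []):
                yield ("volume", volume["full_name"])
```

Injecting `fetch` keeps the traversal independent of the HTTP client and of the authentication method (PAT or OAuth).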

#### Data Sampling

Teleskope executes parameterized SQL queries against the assigned SQL warehouse using:

* `TABLESAMPLE` clause for row-level sampling
* Optional Genie-based sampling (for distribution and profiling of string fields)
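
As an illustration of row-level sampling, the sketch below builds a request body for the Databricks SQL Statement Execution API (`POST /api/2.0/sql/statements`). The function name, the `30s` wait timeout, and the choice of a percentage-based sample are assumptions made for this example; the queries Teleskope actually issues may differ.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_sample_request(
    warehouse_id: str, catalog: str, schema: str, table: str, percent: int
) -> dict:
    """Return a Statement Execution API payload that samples `percent` of rows."""
    if not isinstance(percent, int) or not 0 < percent <= 100:
        raise ValueError("percent must be an integer between 1 and 100")
    full_name = ".".join(quote_ident(part) for part in (catalog, schema, table))
    statement = f"SELECT * FROM {full_name} TABLESAMPLE ({percent} PERCENT)"
    return {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # assumption: wait synchronously for up to 30 seconds
    }
```

The sample size is validated and inlined rather than bound as a parameter, and identifiers are quoted, so the statement text stays safe even when table names come from discovery output.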

#### Tagging and Governance

Using Unity Catalog, Teleskope can:

* Apply governance tags directly to tables and columns (requires the `APPLY TAG` privilege)
* Detect and audit existing column masks using `INFORMATION_SCHEMA.COLUMN_MASKS`
* Track existing masking policies and, optionally, define new ones for sensitive data
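
In Databricks SQL, tags are set with `ALTER TABLE ... SET TAGS`, which requires the `APPLY TAG` privilege on the object. The helper below is a hedged sketch of how such statements could be generated; the function name and the example tag keys are hypothetical, and values containing quotes or backslashes are rejected rather than escaped.

```python
from typing import Dict, Optional, Tuple


def _ident(name: str) -> str:
    """Backtick-quote one identifier."""
    return "`" + name.replace("`", "``") + "`"


def _literal(value: str) -> str:
    """Single-quote a tag key or value, refusing characters that need escaping."""
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in tag text: {value!r}")
    return f"'{value}'"


def set_tags_statement(
    table: Tuple[str, str, str], tags: Dict[str, str], column: Optional[str] = None
) -> str:
    """Build a SET TAGS statement for a table, or for one of its columns."""
    target = ".".join(_ident(part) for part in table)
    pairs = ", ".join(f"{_literal(k)} = {_literal(v)}" for k, v in tags.items())
    if column is None:
        return f"ALTER TABLE {target} SET TAGS ({pairs})"
    return f"ALTER TABLE {target} ALTER COLUMN {_ident(column)} SET TAGS ({pairs})"
```

For example, `set_tags_statement(("main", "sales", "orders"), {"pii": "email"}, column="email")` produces a column-level statement that can be run on the SQL warehouse.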

#### Policy Management (optional)

Teleskope can integrate with Policy Maker to:

* Deploy masking policies via SQL commands (`CREATE FUNCTION`, `SET MASK`)
* Automate row-level security using dynamic row filter functions (`SET ROW FILTER`)
* Track and log data access patterns for alerting or escalation
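
A column mask in Unity Catalog is a SQL function attached to a column. The sketch below generates the two statements involved, under stated assumptions: the `mask_<column>` naming scheme, the `'****'` placeholder, and the use of `is_account_group_member` to decide who sees clear text are illustrative choices, not the policies Policy Maker necessarily deploys.

```python
import re
from typing import List

_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _checked(name: str) -> str:
    """Accept only plain identifiers so names can be inlined without quoting."""
    if not _NAME.fullmatch(name):
        raise ValueError(f"unsupported name: {name!r}")
    return name


def masking_statements(
    catalog: str, schema: str, table: str, column: str, reader_group: str
) -> List[str]:
    """Return CREATE FUNCTION and SET MASK statements for a STRING column."""
    cat, sch, tbl, col, grp = map(
        _checked, (catalog, schema, table, column, reader_group)
    )
    function = f"{cat}.{sch}.mask_{col}"  # hypothetical naming convention
    return [
        f"CREATE OR REPLACE FUNCTION {function}(col_value STRING) RETURNS STRING "
        f"RETURN CASE WHEN is_account_group_member('{grp}') "
        f"THEN col_value ELSE '****' END",
        f"ALTER TABLE {cat}.{sch}.{tbl} ALTER COLUMN {col} SET MASK {function}",
    ]
```

Members of the reader group see the original value; everyone else sees the placeholder.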

#### Access Monitoring

Databricks system tables allow Teleskope to monitor query and access history:

* `system.query.history` for user-level query logging
* Audit logs for data access, table modifications, and privilege changes (if enabled)
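
A query-history lookup could be sketched as below. It assumes the query history system table is exposed as `system.query.history` with `executed_by`, `statement_text`, and `start_time` columns, and that the request goes through the SQL Statement Execution API with a named parameter; adjust the table and column names to what is available in your account.

```python
from datetime import datetime


def build_query_history_request(warehouse_id: str, since: datetime, limit: int = 100) -> dict:
    """Return a Statement Execution API payload listing queries run since `since`."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    statement = (
        "SELECT executed_by, statement_text, start_time "
        "FROM system.query.history "  # assumption: table name as documented by Databricks
        "WHERE start_time >= :since "
        f"ORDER BY start_time DESC LIMIT {limit}"
    )
    return {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "parameters": [
            {
                "name": "since",
                "value": since.strftime("%Y-%m-%d %H:%M:%S"),
                "type": "TIMESTAMP",
            }
        ],
    }
```

Binding the timestamp as a parameter keeps the statement text constant across polling runs.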

## Enrollment

To enroll Databricks with Teleskope:

{% stepper %}
{% step %}

### Create a Teleskope Service Principal

* Set up a dedicated [service principal](https://docs.databricks.com/aws/en/admin/users-groups/service-principals) in Databricks for Teleskope access.
* Generate a personal access token (PAT) or configure OAuth credentials.
  {% endstep %}

{% step %}

### Assign the Account Admin Role

The service principal requires the account admin role so that it can list workspaces.
{% endstep %}

{% step %}

### Assign Required Permissions per Catalog

The service principal must be granted the following minimum permissions:

| Object Type  | Privileges Needed     |
| ------------ | --------------------- |
| Catalog      | `USE CATALOG`         |
| Schema       | `USE SCHEMA`          |
| Tables/Views | `SELECT`, `APPLY TAG` |
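
The privileges in the table above map to `GRANT` statements. The following is a minimal sketch that generates them for one table; it assumes the service principal is identified by its application ID, and the function name and example identifiers are hypothetical. It is not a substitute for the full enrollment script.

```python
from typing import List


def _ident(name: str) -> str:
    """Backtick-quote one identifier or principal name."""
    return "`" + name.replace("`", "``") + "`"


def grant_statements(principal: str, catalog: str, schema: str, table: str) -> List[str]:
    """Return the minimum GRANT statements for one table, top of the hierarchy first."""
    who = _ident(principal)  # assumption: the service principal's application ID
    cat = _ident(catalog)
    sch = f"{cat}.{_ident(schema)}"
    tbl = f"{sch}.{_ident(table)}"
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who}",
        f"GRANT SELECT, APPLY TAG ON TABLE {tbl} TO {who}",
    ]
```

Because Unity Catalog privileges are inherited downward, the same privileges can instead be granted once at the catalog or schema level when Teleskope should cover every object beneath it.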

For advanced features such as **Access Monitoring**, grant the following additional permissions:

* Access to `system.query.history` and audit logs (for user activity tracking)

[Full Python enrollment script](https://docs.teleskope.ai/connectors/saas/databricks/enrollment-script)
{% endstep %}

{% step %}

### Collect Integration Details

Have the following information available to configure the connector:

* Databricks workspace URL
* Account ID
* SQL Warehouse ID
* Client ID / Client Secret for the Teleskope service principal
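
Before entering these values, it can help to check them in one place. The sketch below is illustrative only: the class name is hypothetical, and the `token_url` property assumes the workspace-level OAuth token endpoint at `/oidc/v1/token` used for service principal (client credentials) authentication.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class DatabricksConnectorConfig:  # hypothetical name, for illustration
    workspace_url: str
    account_id: str
    warehouse_id: str
    client_id: str
    client_secret: str

    def __post_init__(self) -> None:
        parsed = urlparse(self.workspace_url)
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError("workspace_url must be an https:// URL")
        for name in ("account_id", "warehouse_id", "client_id", "client_secret"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")

    @property
    def token_url(self) -> str:
        """Workspace-level OAuth token endpoint (assumed path)."""
        return f"https://{urlparse(self.workspace_url).netloc}/oidc/v1/token"
```

Validating eagerly in `__post_init__` surfaces a malformed URL or a missing ID before any API call is attempted.
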
  {% endstep %}
  {% endstepper %}
