# Databricks

> Configure Pelanor to ingest cost and usage data from Databricks system tables through Unity Catalog.

<Alert type="info">
  This integration is available only on <strong>Databricks Premium tier and above.</strong>
</Alert>

<Alert type="warning">
  This feature is currently in <strong>Private Beta</strong>. Contact your Pelanor account manager to join the program.
</Alert>

## Prerequisites

| Requirement               | Purpose                                                     |
| ------------------------- | ----------------------------------------------------------- |
| **Unity Catalog enabled** | Exposes Databricks <code>system</code> tables for querying. |
| **Metastore admin role**  | Needed to enable system schemas.                            |
| **Databricks CLI**        | Used to enable schemas and list workspace metadata.         |

***

## Step-by-Step Setup

<Steps>
  <Step title="Enable Unity Catalog">
    Ensure Unity Catalog is active for the target workspace. Follow Databricks’ official documentation if it is not already enabled.
  </Step>

  <Step title="Enable System Schemas">
    <SubSteps>
      1. Install the Databricks CLI, then authenticate:
         ```bash theme={null}
         databricks auth login
         # Provide profile name, host, and account ID
         ```

      2. Retrieve the workspace ID:
         ```bash theme={null}
         databricks account workspaces list
         ```

      3. Look up the metastore assigned to the workspace and note its metastore ID:
         ```bash theme={null}
         databricks account metastore-assignments get <workspace-id>
         ```

      4. Enable required schemas on the metastore:
         ```bash theme={null}
         databricks system-schemas enable <METASTORE-ID> compute
         databricks system-schemas enable <METASTORE-ID> billing
         databricks system-schemas enable <METASTORE-ID> lakeflow
         ```
    </SubSteps>
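
    The three `enable` calls above can also be scripted. The following is a minimal sketch, assuming the Databricks CLI is already authenticated. The function name `enable_system_schemas` and the `DRY_RUN` variable are illustrative and not part of the CLI. By default the function only prints the commands so they can be reviewed first.

    ```bash
    # Sketch: enable the system schemas listed in this guide, one CLI call each.
    # enable_system_schemas and DRY_RUN are hypothetical names for this example.
    enable_system_schemas() {
      local metastore_id="$1"
      local schema
      for schema in compute billing lakeflow; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
          # Dry run (default): print the command instead of running it.
          echo "databricks system-schemas enable ${metastore_id} ${schema}"
        else
          databricks system-schemas enable "${metastore_id}" "${schema}"
        fi
      done
    }
    ```

    Calling the function with the metastore ID prints the three commands. Setting `DRY_RUN=0` before the call runs them instead.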
  </Step>

  <Step title="(Optional) Create a Warehouse">
    Pelanor can query any existing warehouse, but a small <strong>serverless warehouse</strong> with auto-stop is recommended for cost efficiency.
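
    If no suitable warehouse exists, one can be created from the CLI. The sketch below only prints a candidate command for review. The helper `warehouse_create_cmd`, the default name `pelanor-ingest`, and the 10-minute auto-stop are illustrative assumptions, not Pelanor requirements. The flag names reflect the Databricks CLI as commonly documented and should be confirmed with `databricks warehouses create --help` for the installed version.

    ```bash
    # Sketch: build a CLI command for a small serverless warehouse with auto-stop.
    # warehouse_create_cmd is a hypothetical helper; it prints, it does not create.
    warehouse_create_cmd() {
      local name="${1:-pelanor-ingest}"   # assumed name, without spaces
      local auto_stop_mins="${2:-10}"     # assumed idle timeout in minutes
      echo "databricks warehouses create" \
        "--name ${name}" \
        "--cluster-size 2X-Small" \
        "--warehouse-type PRO" \
        "--auto-stop-mins ${auto_stop_mins}" \
        "--enable-serverless-compute"
    }
    ```

    Run `warehouse_create_cmd` with no arguments to see the defaults, or pass a name and an auto-stop value, then execute the printed command once it looks right.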
  </Step>

  <Step title="Create a Service Principal">
    1. Open <strong>Account Console → Service principals</strong>.
    2. Click <strong>Add Service principal</strong>, assign a clear name, then <strong>Generate Secret</strong>.
    3. Save the <strong>Client ID</strong> and <strong>Secret</strong>; you will enter both in Pelanor.
  </Step>

  <Step title="Grant Workspace & Warehouse Access">
    1. In <strong>Account Console → Workspaces</strong>, add the Service Principal to the workspace with <strong>User</strong> permission.
    2. Inside the workspace, open the warehouse → <strong>Permissions</strong> → grant <strong>Can Use</strong>.
    3. Confirm the principal has the <strong>Databricks SQL access</strong> entitlement.
  </Step>

  <Step title="Grant System-Table Privileges">
    Run the following SQL, replacing the placeholder with the service principal's application ID (the <strong>Client ID</strong> saved earlier):

    ```sql theme={null}
    -- Compute schema
    GRANT USE SCHEMA ON SCHEMA system.compute TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.compute.clusters TO `<service_principal_id>`;

    -- Billing schema
    GRANT USE SCHEMA ON SCHEMA system.billing TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.billing.list_prices TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.billing.usage TO `<service_principal_id>`;

    -- Lakeflow schema
    GRANT USE SCHEMA ON SCHEMA system.lakeflow TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.lakeflow.job_run_timeline TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.lakeflow.job_task_run_timeline TO `<service_principal_id>`;
    GRANT SELECT ON TABLE system.lakeflow.jobs TO `<service_principal_id>`;
    ```
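
    Because the same placeholder appears in every statement, it can help to generate the SQL with the ID filled in once. The following is a minimal sketch; `print_grants` is a hypothetical helper that only prints statements and does not run them. It wraps the principal in backticks, the identifier quoting Databricks SQL uses for principal names.

    ```bash
    # Sketch: print the GRANT statements from this step for one service principal.
    # print_grants is a hypothetical helper; paste its output into a SQL editor.
    print_grants() {
      local principal="$1"
      local entry schema tables table
      # Each entry is "<schema>:<space-separated tables>", as listed in this step.
      for entry in \
        "compute:clusters" \
        "billing:list_prices usage" \
        "lakeflow:job_run_timeline job_task_run_timeline jobs"
      do
        schema="${entry%%:*}"
        tables="${entry#*:}"
        printf 'GRANT USE SCHEMA ON SCHEMA system.%s TO `%s`;\n' "$schema" "$principal"
        for table in $tables; do
          printf 'GRANT SELECT ON TABLE system.%s.%s TO `%s`;\n' "$schema" "$table" "$principal"
        done
      done
    }
    ```

    Calling `print_grants` with the application ID as its only argument emits nine statements: one `USE SCHEMA` grant per schema and one `SELECT` grant per table.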
  </Step>
</Steps>

***

## Additional Notes

* **Multiple Workspaces** – Granting access in one workspace lets Pelanor collect spend data for the entire Databricks account, so no per-workspace connection is needed.
* **Pricing Source** – The current adaptor uses Databricks list prices. Custom price books are not yet supported.
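
As a rough illustration of what list-price costing means, cost can be estimated as usage quantity multiplied by the list price per unit for the SKU. The helper `list_cost` and the numbers in the usage note are made up for this sketch; the source does not describe Pelanor's actual calculation.

```bash
# Sketch: estimated list cost = usage quantity x list price per unit.
# list_cost is a hypothetical helper; it does not query Databricks.
list_cost() {
  awk -v qty="$1" -v price="$2" 'BEGIN { printf "%.2f\n", qty * price }'
}
```

For example, `list_cost 120 0.55` prints `66.00` for a hypothetical 120 units at a list price of 0.55 per unit. Negotiated discounts would not be reflected in a figure derived this way.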
