From 0d3a4c12c64df0305fc434eb21f61365850f570f Mon Sep 17 00:00:00 2001
From: Salar Rahmanian
Date: Mon, 28 Apr 2025 10:03:53 -0700
Subject: [PATCH] Correction to unicode character - 3

---
 content/post/the-data-surrender-trap/index.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/post/the-data-surrender-trap/index.md b/content/post/the-data-surrender-trap/index.md
index 1022a40..2ef868c 100644
--- a/content/post/the-data-surrender-trap/index.md
+++ b/content/post/the-data-surrender-trap/index.md
@@ -97,8 +97,8 @@ Key take-aways:
 - External AI vendors have two strictly-controlled entry points
-- Model-Serving API (top) They can request predictions from your governed endpoint—no direct data access required.
-- Delta Sharing (middle-right) If a vendor truly needs training or feature data, you grant a scoped, revocable token that exposes only the tables/columns you approve.
+- Model-Serving API - They can request predictions from your governed endpoint—no direct data access required.
+- Delta Sharing - If a vendor truly needs training or feature data, you grant a scoped, revocable token that exposes only the tables/columns you approve.
 All other layers—compute, governance, storage—live inside your VPC / cloud account, so raw data never leaves your perimeter unless you explicitly share it through the Delta Sharing gateway.
@@ -112,10 +112,10 @@ Each step below tightens control, reduces copies, and shows how to give an exter
 | Land everything in open, governed tables | | Open formats + immutable history make later audits and deletions possible. |
 | Switch on a unified catalog | | One policy engine ≫ dozens of per-tool ACLs. |
 | Harden the perimeter | | Keeps “shadow ETL” from copying data out the side door. |
-| Safely share with an external AI vendor (zero-copy) |
  1. Minimise first - aggregate, pseudonymise, or drop columns the vendor doesn't need.
  2. Create a Share (Delta Sharing / Iceberg REST / Arrow Flight):  
    • Grant only the filtered table or view.
    • Attach row-level filters & column masks.
    • Issue a time-boxed bearer token (7-, 30-, or 90-day TTL) and pin it to the vendor's IP range.
  3. Contract & controls - DPA, usage policy, no onward sharing.
  4. Monitor - streaming audit of every query; set alerts for unusually large scans.
  5. Revoke or rotate the token the moment the engagement ends (one CLI/API call).
| Zero-copy protocols let the vendor query live tables without replicating them. Instant revocation closes the door the second you're done. |
+| Safely share with an external AI vendor (zero-copy) |
  1. Minimise first - aggregate, pseudonymise, or drop columns the vendor doesn't need.
  2. Create a Share (Delta Sharing / Iceberg REST / Arrow Flight):
    • Grant only the filtered table or view.
    • Attach row-level filters & column masks.
    • Issue a time-boxed bearer token (7-, 30-, or 90-day TTL) and pin it to the vendor's IP range.
  3. Contract & controls - DPA, usage policy, no onward sharing.
  4. Monitor - streaming audit of every query; set alerts for unusually large scans.
  5. Revoke or rotate the token the moment the engagement ends (one CLI/API call).
| Zero-copy protocols let the vendor query live tables without replicating them. Instant revocation closes the door the second you're done. |
 | Move internal ML pipelines onto the platform | | No more exporting giant CSVs to Jupyter on someone's laptop. |
 | Expose governed model endpoints | | External apps can call for predictions without direct data access. |
-| Automate audits & drift detection | | Governance-as-code keeps guard-rails from eroding over time. |
+| Automate audits & drift detection | | Governance-as-code keeps guard-rails from eroding over time. |
 **Result**: engineers still use the notebooks, SQL editors, and BI dashboards they love—but every byte of sensitive data stays in your buckets, under traceable, revocable control. External AI vendors get exactly the slice you permit, for exactly as long as you permit, with a full audit trail to keep everyone honest.