mirror of https://github.com/softinio/softinio.com.git
synced 2025-09-03 10:36:42 -07:00
Correction to unicode character - 3
This commit is contained in:
parent 60f14550c6
commit 0d3a4c12c6
1 changed files with 4 additions and 4 deletions
@@ -97,8 +97,8 @@ Key take-aways:
- External AI vendors have two strictly-controlled entry points
- Model-Serving API (top) They can request predictions from your governed endpoint—no direct data access required.
- Delta Sharing (middle-right) If a vendor truly needs training or feature data, you grant a scoped, revocable token that exposes only the tables/columns you approve.
- Model-Serving API - They can request predictions from your governed endpoint—no direct data access required.
- Delta Sharing - If a vendor truly needs training or feature data, you grant a scoped, revocable token that exposes only the tables/columns you approve.
All other layers—compute, governance, storage—live inside your VPC / cloud account, so raw data never leaves your perimeter unless you explicitly share it through the Delta Sharing gateway.
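To make the "scoped, revocable token" idea concrete, here is a minimal in-memory sketch of the checks such a grant implies: revocation, expiry, IP pinning, and a table/column allow-list. It is not the Delta Sharing API. `ShareGrant`, `authorize`, the table name, the vendor name, and the CIDR range are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from ipaddress import ip_address, ip_network


@dataclass
class ShareGrant:
    """Hypothetical record of what one vendor token may read."""
    vendor: str
    allowed_columns: dict   # table name -> set of approved column names
    allowed_network: str    # vendor CIDR range the token is pinned to
    expires_at: datetime    # end of the time-boxed TTL
    revoked: bool = False

    def revoke(self) -> None:
        # Revocation is a single state change; every later request is refused.
        self.revoked = True


def authorize(grant, table, columns, client_ip, now):
    """Return (allowed, reason) for one read request against a grant."""
    if grant.revoked:
        return False, "token revoked"
    if now >= grant.expires_at:
        return False, "token expired"
    if ip_address(client_ip) not in ip_network(grant.allowed_network):
        return False, "client IP outside vendor range"
    approved = grant.allowed_columns.get(table)
    if approved is None:
        return False, "table not shared"
    extra = set(columns) - approved
    if extra:
        return False, f"columns not shared: {sorted(extra)}"
    return True, "ok"


# Illustrative grant: one minimised table, a 30-day TTL, one documentation-range CIDR.
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
grant = ShareGrant(
    vendor="example-vendor",
    allowed_columns={"sales.orders_minimised": {"order_id", "region", "amount"}},
    allowed_network="203.0.113.0/24",
    expires_at=issued + timedelta(days=30),
)
```

In a real deployment the catalog and sharing server enforce these rules; the sketch only shows why each request can be refused without the vendor ever holding a copy of the data.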
@@ -112,10 +112,10 @@ Each step below tightens control, reduces copies, and shows how to give an exter
| Land everything in open, governed tables | <ul><li>Convert CSV/Parquet to Delta / Iceberg with schema enforcement & time-travel.</li><li>Store in your S3 buckets / Google Cloud Storage; enable server-side encryption and object-lock.</li></ul> | Open formats + immutable history make later audits and deletions possible. |
| Switch on a unified catalog | <ul><li>Unity Catalog / Lake Formation / Purview / Dataplex / Lakekeeper.</li><li>Import IAM groups, apply column masks, row filters, dynamic data tags (“pii = true”).</li></ul> | One policy engine ≫ dozens of per-tool ACLs. |
| Harden the perimeter | <ul><li>Private subnets, VPC peering, and storage firewall rules so only approved compute can touch raw data.</li><li>Disable public buckets & open egress unless justified.</li></ul> | Keeps “shadow ETL” from copying data out the side door. |
| Safely share with an external AI vendor (zero-copy) | <ol><li>Minimise first - aggregate, pseudonymise, or drop columns the vendor doesn't need.</li><li>Create a Share (Delta Sharing / Iceberg REST / Arrow Flight): <ul><li>Grant only the filtered table or view.</li><li>Attach row-level filters & column masks.</li><li>Issue a time-boxed bearer token (7-, 30-, or 90-day TTL) and pin it to the vendor's IP range.</li></ul></li><li>Contract & controls - DPA, usage policy, no onward sharing.</li><li>Monitor - streaming audit of every query; set alerts for unusually large scans.</li><li>Revoke or rotate the token the moment the engagement ends (one CLI/API call).</li></ol> | Zero-copy protocols let the vendor query live tables without replicating them. Instant revocation closes the door the second you're done. |
| Safely share with an external AI vendor (zero-copy) | <ol><li>Minimise first - aggregate, pseudonymise, or drop columns the vendor doesn't need.</li><li>Create a Share (Delta Sharing / Iceberg REST / Arrow Flight):<ul><li>Grant only the filtered table or view.</li><li>Attach row-level filters & column masks.</li><li>Issue a time-boxed bearer token (7-, 30-, or 90-day TTL) and pin it to the vendor's IP range.</li></ul></li><li>Contract & controls - DPA, usage policy, no onward sharing.</li><li>Monitor - streaming audit of every query; set alerts for unusually large scans.</li><li>Revoke or rotate the token the moment the engagement ends (one CLI/API call).</li></ol> | Zero-copy protocols let the vendor query live tables without replicating them. Instant revocation closes the door the second you're done. |
| Move internal ML pipelines onto the platform | <ul><li>Use Spark + MosaicML (or SageMaker/Vertex/Azure ML) inside the governed workspace.</li><li>Log models to a central registry; tag each with source-data lineage.</li></ul> | No more exporting giant CSVs to Jupyter on someone's laptop. |
| Expose governed model endpoints | <ul><li>Deploy behind Model Serving (or cloud equivalent).</li><li>Protect with catalog-level ACLs, network policies, and request logging.</li></ul> | External apps can call for predictions without direct data access. |
| Automate audits & drift detection | <ul><li>Scheduled jobs that flag: - Tables without tags / owners - Shares approaching token expiry - Models trained on untagged data</li><li>Pipe findings to Slack / JIRA for triage.</li></ul> | Governance-as-code keeps guard-rails from eroding over time. |
| Automate audits & drift detection | <ul><li>Scheduled jobs that flag: - Tables without tags / owners - Shares approaching token expiry - Models trained on untagged data</li><li>Pipe findings to Slack / JIRA for triage.</li></ul> | Governance-as-code keeps guard-rails from eroding over time. |
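The three audit checks in the last row (tables without tags or owners, shares approaching token expiry, models trained on untagged data) could be sketched as a single scheduled function. This is an illustration over hypothetical inventory records, not a catalog API. `TableInfo`, `ShareInfo`, `ModelInfo`, `audit`, the 14-day warning window, and all sample names are assumptions; a real job would read this metadata from your catalog and model registry.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TableInfo:
    name: str
    owner: Optional[str] = None
    tags: dict = field(default_factory=dict)


@dataclass
class ShareInfo:
    name: str
    token_expires: date


@dataclass
class ModelInfo:
    name: str
    source_tables: list  # lineage: tables the model was trained on


def audit(tables, shares, models, today, warn_days=14):
    """Return a list of human-readable findings for triage."""
    findings = []
    untagged = set()
    for t in tables:
        if not t.tags:
            untagged.add(t.name)
            findings.append(f"table {t.name}: no tags")
        if not t.owner:
            findings.append(f"table {t.name}: no owner")
    for s in shares:
        # days_left is negative when the token has already expired.
        days_left = (s.token_expires - today).days
        if days_left <= warn_days:
            findings.append(f"share {s.name}: token expires in {days_left} days")
    for m in models:
        bad = sorted(set(m.source_tables) & untagged)
        if bad:
            findings.append(f"model {m.name}: trained on untagged data {bad}")
    return findings


# Hypothetical inventory snapshot.
today = date(2025, 9, 1)
tables = [
    TableInfo("crm.customers", owner="data-eng", tags={"pii": "true"}),
    TableInfo("tmp.export"),
]
shares = [
    ShareInfo("vendor_share", token_expires=date(2025, 9, 8)),
    ShareInfo("partner_share", token_expires=date(2025, 12, 1)),
]
models = [ModelInfo("churn_v2", ["crm.customers", "tmp.export"])]
findings = audit(tables, shares, models, today)
```

Each string in `findings` is what would be piped to Slack or JIRA; keeping the checks as plain functions makes them easy to version and review alongside the policies they guard.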
**Result**: engineers still use the notebooks, SQL editors, and BI dashboards they love—but every byte of sensitive data stays in your buckets, under traceable, revocable control. External AI vendors get exactly the slice you permit, for exactly as long as you permit, with a full audit trail to keep everyone honest.