• caglararli@hotmail.com
  • 05386281520

How to secure data on production environment

Çağlar Arlı      -    12 Views

How to secure data on production environment

Question

Suppose you design a data lake with sensitive data. Due to immaturity of tools, dynamic data masking is unavailable. You have MFA, encryption at rest, audit logging, ETL processing data and people that develop/support/debug that ETL.

The standard answer is a separation of concerns between developer and support engineer (no access to sensitive data from developer, only support engineer).

But how to protect sensitive data from support engineer(s)?

Considerations

Support engineer need to debug issues, run jobs, etc. If ETL logic depends on sensistive data (i.e. join by IP address, or filter by medical status) this inherently mean support engineer has to have access to sensitive data.

Column-level encryption (besides a lot of burden) looks to not solve the issue: deterministic encryption isn't secure, moreover support engineer has to have indirect access to decryption keys in order to run the jobs.

Audit of queries also may not solve the issue, because query results may be downloaded.