I know that in BigQuery I can attach a policy tag with a UDF-based masking rule, i.e. a user-defined function that alters the values of the columns the tag is applied to. Other users then see only the altered value of that field, not the original. The UDF is stored in BigQuery, in a dataset of the same project where I want to apply the data masking rule. For example, a suitable UDF would be sha256(concat(field, "123456")), where 123456 could be replaced by a more complex "encryption" parameter.
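To make the rule concrete outside BigQuery, here is a minimal Python sketch of the same salted-hash idea. The function name `mask_value`, the sample e-mail address, and the hex encoding of the digest are assumptions made for illustration; the salt "123456" is the placeholder from the example above, not a real secret.

```python
import hashlib

# Placeholder salt from the example above; a real deployment would use
# something far less guessable.
SALT = "123456"


def mask_value(value: str, salt: str = SALT) -> str:
    """Return the hex SHA-256 digest of the value concatenated with the salt."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


# The same input always produces the same 64-character hex digest.
masked = mask_value("alice@example.com")
```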

If I want the field to remain usable in some way (e.g., for joins), the alteration algorithm must be deterministic, so that the same original value in different columns yields the same masked value.
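The join requirement can be sketched in Python as follows. The two "tables" are hypothetical in-memory lists, and `mask_value` is an illustrative stand-in for the masking UDF; the point is only that equal original values still match after masking.

```python
import hashlib


def mask_value(value: str, salt: str = "123456") -> str:
    """Illustrative deterministic mask: hex SHA-256 of value + salt."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


# Hypothetical tables sharing a sensitive key column (an e-mail address).
orders = [("alice@example.com", "order-1"), ("bob@example.com", "order-2")]
tickets = [("alice@example.com", "ticket-9")]

# What a user without access to the raw values would see.
masked_orders = [(mask_value(email), order) for email, order in orders]
masked_tickets = [(mask_value(email), ticket) for email, ticket in tickets]

# An inner join on the masked key still pairs rows that shared an original value.
joined = [
    (order, ticket)
    for masked_o, order in masked_orders
    for masked_t, ticket in masked_tickets
    if masked_o == masked_t
]
```

Here `joined` contains only the pair for the address present in both tables, even though neither masked table exposes that address.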

However, this approach has a drawback: if the end user knows the original value and the alteration rule (for example, because they can read the actual UDF used for data masking, since they belong to a group that has enough roles to create tables, datasets, etc.), they can compute the masked value themselves and match it against the masked column, which defeats the masking.
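A short Python sketch of that weakness, under the assumption that the user can read the rule (including the salt) and has a list of candidate original values. The names `mask_value` and `unmask` and the sample data are hypothetical.

```python
import hashlib


def mask_value(value: str, salt: str = "123456") -> str:
    """The masking rule, as a user who can read the UDF would reproduce it."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


def unmask(masked_values, candidates, salt="123456"):
    """Map masked values back to originals by hashing every known candidate."""
    lookup = {mask_value(candidate, salt): candidate for candidate in candidates}
    return {m: lookup[m] for m in masked_values if m in lookup}


# Masked column as seen by the user, plus originals they already know or can guess.
masked_column = [mask_value("alice@example.com"), mask_value("carol@example.com")]
known_candidates = ["alice@example.com", "bob@example.com"]

recovered = unmask(masked_column, known_candidates)
```

Only rows whose original value is among the candidates are recovered, which is exactly the case described above: the user already knows the original value and the rule.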

From this reasoning, the possible solutions seem to be:

  • Use a deterministic algorithm, but keep it unknown to the end user
  • Prevent users with admin roles on that project from reading the UDF
  • Find a totally different approach
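As a rough illustration of the first option, the sketch below swaps the plain salted hash for a keyed hash (HMAC-SHA256) in Python: the algorithm can be public while the key stays secret. This is an assumption-laden sketch, not a BigQuery recipe. The function name `keyed_mask` and both keys are hypothetical, and how to keep such a key out of reach of project admins is precisely the open question here.

```python
import hashlib
import hmac


def keyed_mask(value: str, key: bytes) -> str:
    """Deterministic keyed mask: hex HMAC-SHA256 of the value under a secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keys; in practice the real key would have to live somewhere
# the end user cannot read.
secret_key = b"held-outside-the-users-reach"
guessed_key = b"123456"

masked = keyed_mask("alice@example.com", secret_key)

# Still deterministic, so joins on the masked value keep working.
same_again = keyed_mask("alice@example.com", secret_key)

# Knowing the algorithm and the original value is not enough without the key.
attempt = keyed_mask("alice@example.com", guessed_key)
```

With the same key, `masked` and `same_again` are equal; `attempt` differs because it was computed with a different key.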
