PII masking for privacy-grade machine learning

出处：engineering.grab.com

摘要

At Grab, data engineers work with large sets of data on a daily basis. They design and build advanced machine learning models that provide strategic insights using all of the data that flow through the Grab Platform. This enables us to provide a better experience to our users, for example by increasing the supply of drivers in areas where our predictive models indicate a surge in demand in a timely fashion.

Grab has a mature privacy programme that complies with applicable privacy laws and regulations and we use tools to help identify, assess, and appropriately manage our privacy risks. To ensure that our users’ data are well-protected and avoid any human-related errors, we always take extra measures to secure this data.

However, data engineers will still require access to actual production data in order to tune effective machine learning models and ensure the models work as intended in production.

In this article, we will describe how the Grab’s data streaming team (Coban), along with the data platform and user teams, have enforced Personally Identifiable Information (PII) masking on machine learning data streaming pipelines. This ensures that we uphold a high standard and embody a privacy by design culture, while enabling data engineers to refine their models with sanitised production data.

阅读原文

xiaozi 于 2023-06-06 分享

2866

关联话题： #Grab

欢迎在评论区写下你对这篇文章的看法。

PII masking for privacy-grade machine learning

PII masking for privacy-grade machine learning

摘要

评论

文库