Ensuring data reliability and observability in risk systems
Grab has an in-house risk management platform called GrabDefence, which ingests large amounts of data from upstream services to power our heuristic risk rules and data science models in real time.
Fig 1. GrabDefence aggregates data from different upstream services
As Grab’s business grows, so does the amount of data. The data that fuels our risk systems must therefore be of reliable quality, because any discrepancy or missing data could impair our fraud detection and prevention capabilities.
We need to quickly detect any data anomalies, which is where data observability comes in.
Data observability as a solution
Data observability is a data operations (DataOps) practice, similar in spirit to DevOps, in which teams build visibility over the health and quality of their data pipelines. This lets teams be notified of data quality issues and investigate and resolve them faster.
We needed a solution that met the following requirements:
- Alerts for data quality issues as soon as possible, which means the observability tool had to work in real time.
- A neat and scalable way to observe hundreds of data points, so that users can quickly pinpoint which ones are having issues.
- A consistent way to compare, analyse, and compute data that might have different formats.
Hence, we decided to use Flink to standardise data transformations and to compute and observe data trends quickly (in real time) and at scale.
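To make the three requirements concrete, here is a minimal plain-Python sketch of the underlying idea: normalise events that arrive in different formats, count them per data point in fixed time windows, and flag windows whose volume drops. This is not Flink code and not the GrabDefence implementation. The field names (`source`, `service`, `ts`, `timestamp`), the 60-second window, and the `min_ratio` threshold are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import Counter
from datetime import datetime

WINDOW_SECONDS = 60  # assumed tumbling-window size


def normalise(event: dict) -> tuple[str, int]:
    """Map a raw event to (data_point, epoch_seconds).

    Hypothetical input formats: the data point name is under "source" or
    "service", and the time is under "ts" (epoch seconds) or "timestamp"
    (ISO-8601 string with a UTC offset).
    """
    name = event.get("source") or event.get("service")
    raw_ts = event.get("ts", event.get("timestamp"))
    if isinstance(raw_ts, str):
        ts = int(datetime.fromisoformat(raw_ts).timestamp())
    else:
        ts = int(raw_ts)
    return name, ts


def window_counts(events: list[dict]) -> Counter:
    """Count events per (data_point, window_start) tumbling window."""
    counts: Counter = Counter()
    for event in events:
        name, ts = normalise(event)
        window_start = ts - ts % WINDOW_SECONDS
        counts[(name, window_start)] += 1
    return counts


def flag_drops(counts: Counter, baseline: dict[str, int],
               min_ratio: float = 0.5) -> list[tuple[str, int]]:
    """Return windows whose count is below min_ratio * the expected baseline.

    Only windows that received at least one event are checked; a fully
    silent window would need separate handling.
    """
    return sorted(
        key for key, n in counts.items() if n < min_ratio * baseline[key[0]]
    )
```

A streaming engine performs the same grouping continuously and across many machines, which is what makes it suitable for hundreds of data points in real time; the sketch above only shows the shape of the computation on an in-memory list.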
Flink SQL is a powerful, flexible tool for performing real-time analy...