Reducing Logging Cost by Two Orders of Magnitude Using CLP
Long, long ago, our systems logged so little data that we could retain every log file. This let our engineers analyze the logs freely, whether to troubleshoot our systems or to improve applications. But as Uber's business grew rapidly, the volume of logged data increased dramatically. Retaining the log files became prohibitively expensive, so we were forced to discard them after only a short period. That changed when we integrated CLP into the logging library (Log4j) of our big data platform. In aggregate, CLP achieves a 169x compression ratio on our log data, saving storage, memory, and disk/network bandwidth at every level. As a result, we can now retain all logs at a fraction of the cost without throwing away any insights, and the compressed logs can be searched efficiently without decompression.
At Uber, we rely on data-driven decisions at every level. To support this, we have built a large-scale big data platform that runs over 250,000 Spark analytics jobs per day, where each job can consist of hundreds of thousands of executors, processing over a hundred petabytes of analytical data. The platform also generates a large amount of log data, and the rapid growth of Uber's business has driven equally rapid growth in these logs. On a busy day, our Spark cluster alone can generate up to 200TB of logs (at the default INFO verbosity level).
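To put these figures in perspective, here is a back-of-envelope sketch in Python. It assumes the aggregate 169x ratio applies uniformly to the 200TB peak-day Spark volume, which may not hold for every workload, and the 30-day retention window is a hypothetical value chosen only for illustration.

```python
# Back-of-envelope estimate. The two inputs below are figures quoted in the
# article; applying the aggregate ratio to Spark logs alone is an assumption.
DAILY_RAW_LOGS_TB = 200   # peak Spark log volume on a busy day
COMPRESSION_RATIO = 169   # aggregate CLP compression ratio

# Hypothetical retention window, not a figure from the article.
RETENTION_DAYS = 30


def compressed_size_tb(raw_tb: float, ratio: float) -> float:
    """Return the stored size in TB after compressing raw_tb at the given ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_tb / ratio


daily_compressed_tb = compressed_size_tb(DAILY_RAW_LOGS_TB, COMPRESSION_RATIO)
raw_retained_tb = DAILY_RAW_LOGS_TB * RETENTION_DAYS
compressed_retained_tb = daily_compressed_tb * RETENTION_DAYS

print(f"compressed per day: {daily_compressed_tb:.2f} TB")
print(f"{RETENTION_DAYS} days raw: {raw_retained_tb} TB")
print(f"{RETENTION_DAYS} days compressed: {compressed_retained_tb:.1f} TB")
```

Under these assumptions, a peak day of Spark logs shrinks from 200TB to a little over 1TB, which is the "two orders of magnitude" in the title.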
These logs are critical to both the platform engineers and data scientists using Spark. Analyzing logs can be used to improve the q...