One Billion Row Challenge in Golang - From 95s to 1.96s
Renato Pereira
Mar 18, 2024
The One Billion Row Challenge (1BRC) is quite simple: the task is to develop a program capable of reading a file with 1 billion lines, aggregating the information contained in each line, and printing a report with the result. Each line in the file contains a weather station name and a temperature reading in the format <station name>;<temperature>, where the station name may contain spaces and other special characters except ;, and the temperature is a floating-point number ranging from -99.9 to 99.9 with precision limited to one decimal place. The expected output format is {<station name>=<min>/<mean>/<max>, ...}, sorted alphabetically by station name, where min, mean and max denote the computed minimum, average and maximum temperature readings for each station.
Example of a measurement file:
Yellowknife;16.0
Entebbe;32.9
Porto;24.4
Vilnius;12.4
Fresno;7.9
Maun;17.5
Panama City;39.5
...
Example of the expected output:
{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, ...}
Because a file with 1 billion lines takes approximately 13GB, the official repository does not provide a fixed database; instead, it offers a script that generates synthetic data with random readings. Just follow the instructions to create your own database.
Although the challenge is primarily targeted at Java developers, the problem presents an interesting toy exercise to experiment with in any language. As I’ve been working with Golang on a daily basis at Gamers Club, I decided to give it a try and see how deep I could go. But before going forward with this article, I want to acknowledge that, despite being well-versed, I am no specialist in Golang and ...