Elasticsearch Indexing Strategy in Asset Management Platform (AMP)
By Burak Bacioglu, Meenakshi Jindal
Asset Management at Netflix
At Netflix, all of our digital media assets (images, videos, text, etc.) are stored in secure storage layers. We built an asset management platform (AMP), codenamed Amsterdam, in order to easily organize and manage the metadata, schema, relations and permissions of these assets. It is also responsible for asset discovery, validation, sharing, and for triggering workflows.
The Amsterdam service uses various solutions such as Cassandra, Kafka, Zookeeper, EvCache, etc. In this blog post, we focus on how we use Elasticsearch to index and search the assets.
Amsterdam is built on top of three storage layers.
The first layer, Cassandra, is our source of truth. It consists of close to a hundred tables (column families), the majority of which are reverse indices that help us query the assets in a more optimized way.
The second layer is Elasticsearch, which is used to discover assets based on user queries. This is the layer we'd like to focus on in this blog post. More specifically, we cover how we index and query over 7TB of data in a read-heavy, continuously growing environment while keeping our Elasticsearch cluster healthy.
And finally, we have an Apache Iceberg layer which stores assets in a denormalized fashion to help answer heavy queries for analytics use cases.
Elasticsearch Integration
Elasticsearch is one of the best and widely adopted distributed, open source search and analytics engines for all types of data, including textual, numer...