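Company: Netflix
Netflix (/ˈnɛtflɪks/; official Chinese name 网飞, unofficial Chinese name 奈飞) is an OTT service company that originated in the United States and provides online video on demand worldwide. It also operates a flat-rate disc-by-mail rental service in the United States, which ships DVD and Blu-ray rental discs in return-mail envelopes to the address the customer specifies. The company was founded by Reed Hastings and Marc Randolph on August 29, 1997, is headquartered in Los Gatos, California, and began offering a subscription service in 1999. By 2009, Netflix offered more than 100,000 film titles on DVD and had more than 10 million subscribers. As of June 2022, its streaming service had 220 million subscribers worldwide, including 73.3 million in the United States. Its main competitors include Disney+, Hulu, HBO Max, Amazon Prime Video, YouTube Premium, and Apple TV+.
Netflix has appeared in numerous rankings. On June 6, 2017, it placed 92nd in the 2017 BrandZ Top 100 Most Valuable Global Brands. In October 2018, it ranked eighth on Fortune's Future 50 list. In December 2018, it ranked 88th in the World Brand Lab's 2018 World Brand 500. It ranked 261st in Fortune's 2018 500 ranking, and its position has risen year after year. In October 2019, it placed 46th on the 2019 Forbes Top 100 Digital Companies list. Also in October 2019, it ranked 65th in Interbrand's Best Global Brands. On January 22, 2020, it placed 16th on Fortune's 2020 World's Most Admired Companies list. In February 2022, Netflix was the world's second-largest media and entertainment company by market capitalization. In 2019, Netflix joined the Motion Picture Association (MPA). Some media outlets also count Netflix among the Big Tech companies.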
Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform
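Netflix built Auto Remediation to automatically fix failed jobs. The feature integrates a rule-based classifier with a machine learning service, so it can handle errors caused by misconfiguration and classify new errors. Auto Remediation uses the rule-based classifier for initial classification, then uses the machine learning service to generate a recommended configuration. The recommended configuration is stored in the configuration service and applied automatically. A production deployment against failed Spark jobs has demonstrated the effectiveness and potential of Auto Remediation. Its optimizer explores the space of Spark configuration parameters to recommend a configuration that minimizes both the probability of a failed retry and the cost. The prediction model is retrained offline every day, and the optimizer calls it to evaluate each candidate set of configuration parameter values. If the optimizer finds a feasible configuration, the response contains that recommended configuration, which ConfigService uses to change the configuration for the retry. If there is no feasible solution, the response contains a flag that disables retries, which eliminates wasted compute cost. The model is a Feedforward Multilayer Perceptron (MLP) trained on the job's metadata features, which are modeled with feature hashing and embedding layers. Once validated, the model is stored in Metaflow Hosting, where the optimizer invokes it for each configuration request.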
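The optimizer loop described above can be sketched roughly as follows. This is a minimal illustration, not Netflix's implementation: the grid search, the feasibility threshold `max_failure_prob`, the weighting `cost_weight`, the parameter names `executor_memory_gb` and `executor_cores`, and the toy predictors are all assumptions that stand in for the trained MLP and the real search strategy.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterable, List, Optional

Config = Dict[str, int]


@dataclass
class Recommendation:
    config: Optional[Config]  # None when no feasible configuration exists
    disable_retry: bool       # True tells the caller to skip the retry


def candidate_grid(memory_gb: List[int], cores: List[int]) -> List[Config]:
    """Enumerate a small grid of hypothetical Spark-like settings."""
    return [
        {"executor_memory_gb": m, "executor_cores": c}
        for m, c in product(memory_gb, cores)
    ]


def recommend(
    candidates: Iterable[Config],
    predict_failure: Callable[[Config], float],
    predict_cost: Callable[[Config], float],
    max_failure_prob: float = 0.5,  # assumed feasibility threshold
    cost_weight: float = 0.1,       # assumed trade-off between risk and cost
) -> Recommendation:
    """Pick the feasible candidate with the lowest combined score."""
    best, best_score = None, float("inf")
    for cfg in candidates:
        p_fail = predict_failure(cfg)
        if p_fail > max_failure_prob:
            continue  # too likely to fail again, so not feasible
        score = p_fail + cost_weight * predict_cost(cfg)
        if score < best_score:
            best, best_score = cfg, score
    if best is None:
        # No feasible configuration: disable the retry to avoid wasted compute.
        return Recommendation(config=None, disable_retry=True)
    return Recommendation(config=best, disable_retry=False)


# Toy stand-ins for the learned model (assumptions for illustration only).
def toy_failure(cfg: Config) -> float:
    return 0.9 if cfg["executor_memory_gb"] < 8 else 0.1


def toy_cost(cfg: Config) -> float:
    return cfg["executor_memory_gb"] * cfg["executor_cores"] / 64.0
```

Calling `recommend(candidate_grid([4, 8, 16], [2, 4]), toy_failure, toy_cost)` returns a recommendation whose `config` would be handed to the configuration service, while a predictor that rates every candidate as likely to fail yields `disable_retry=True`.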
Bending pause times to your will with Generational ZGC
The surprising and not so surprising benefits of generations in the Z Garbage Collector.
Introducing SafeTest: A Novel Approach to Front End Testing
In this post, we’re excited to introduce SafeTest, a revolutionary library that offers a fresh perspective on End-To-End (E2E) tests for web-based User Interface (UI) applications.
Sequential A/B Testing Keeps the World Streaming Netflix Part 1: Continuous Data
Can you spot any difference between the two data streams below? Each observation is the time interval between a Netflix member hitting the play button and playback commencing, i.e., play-delay. These observations are from a particular type of A/B test that Netflix runs called a software canary or regression-driven experiment. More on that below — for now, what’s important is that we want to quickly and confidently identify any difference in the distribution of play-delay — or conclude that, within some tolerance, there is no difference.
In this blog post, we will develop a statistical procedure to do just that, and describe the impact of these developments at Netflix. The key idea is to switch from a “fixed time horizon” to an “any-time valid” framing of the problem.
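To make the "any-time valid" idea concrete, here is a small sketch of a different and much simpler procedure than the one the post develops: a mixture sequential probability ratio test (mSPRT) for the mean of Gaussian differences with known variance. The post targets differences between whole distributions of play-delay, so treat this only as an illustration of why a test can be checked after every observation. The function names and the parameters `sigma`, `tau`, and `alpha` are assumptions.

```python
import math
from typing import Iterable, Optional


def msprt_log_statistic(n: int, mean: float, sigma: float, tau: float) -> float:
    """Log of the mixture likelihood ratio against H0: mean difference = 0.

    Assumes differences are N(theta, sigma^2) and mixes over theta ~ N(0, tau^2).
    """
    var, prior = sigma ** 2, tau ** 2
    shrink = 0.5 * math.log(var / (var + n * prior))
    evidence = (n ** 2) * prior * mean ** 2 / (2.0 * var * (var + n * prior))
    return shrink + evidence


def first_rejection(
    diffs: Iterable[float],
    sigma: float,
    tau: float = 1.0,
    alpha: float = 0.05,
) -> Optional[int]:
    """Return the first sample size at which H0 is rejected, or None.

    The statistic is a nonnegative martingale under H0, so rejecting when it
    reaches 1/alpha keeps the false positive rate at most alpha no matter how
    often, or when, the data is inspected (Ville's inequality).
    """
    threshold = math.log(1.0 / alpha)
    total = 0.0
    for n, d in enumerate(diffs, start=1):
        total += d
        if msprt_log_statistic(n, total / n, sigma, tau) >= threshold:
            return n
    return None
```

A fixed-horizon test would only be valid at one pre-chosen sample size; here `first_rejection` can be evaluated on a live stream of treatment-minus-control differences and stopped as soon as it returns a value.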
Rebuilding Netflix Video Processing Pipeline with Microservices
This is the first blog in a multi-part series on how Netflix rebuilt its video processing pipeline with microservices, so we can maintain our rapid pace of innovation and continuously improve the system for member streaming and studio operations. This introductory blog focuses on an overview of our journey. Future blogs will provide deeper dives into each service, sharing insights and lessons learned from this process.
Causal Machine Learning for Creative Insights
A framework to identify the causal impact of successful visual components.
Incremental Processing using Netflix Maestro and Apache Iceberg
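This post describes the incremental processing pattern Netflix uses to handle late-arriving data. The authors explain the scenarios and use cases for the pattern in detail and walk through an example data pipeline rebuilt with incremental processing. With this pattern, Netflix reduced both compute cost and execution time. The post also covers the changes required in the business logic, such as JOINing playback_daily_table with playback_daily_icdc_table to handle late-arriving data. These changes greatly improved pipeline efficiency: the new IPS-based pipeline needs only about 10% of the resources, measured as execution time, that the original required. The authors also look ahead to future development of IPS and thank the colleagues who contributed suggestions and feedback during its development.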
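The gist of the pattern can be sketched in a few lines. This is a simplified illustration in plain Python rather than the SQL and Iceberg tables used in the post: the row schema, the function names, and the idea of treating the change-capture table as a list of newly landed rows are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row schema: (event_date, account_id, seconds_played)
Row = Tuple[str, str, int]


def full_rebuild(table: List[Row]) -> Dict[str, int]:
    """Baseline: recompute every daily total from the whole table."""
    totals: Dict[str, int] = defaultdict(int)
    for event_date, _, seconds in table:
        totals[event_date] += seconds
    return dict(totals)


def incremental_update(
    totals: Dict[str, int],
    table: List[Row],
    change_rows: List[Row],
) -> Dict[str, int]:
    """Recompute only the dates touched by newly arrived rows.

    `table` already contains the new rows; `change_rows` plays the role of
    the change-capture table and tells us which dates are affected.
    """
    touched = {event_date for event_date, _, _ in change_rows}
    updated = dict(totals)
    for event_date in touched:
        updated[event_date] = 0
    # Equivalent to joining the main table against the changed dates.
    for event_date, _, seconds in table:
        if event_date in touched:
            updated[event_date] += seconds
    return updated
```

In a real engine the filter on touched dates would prune partitions, so untouched days are never read; the Python loop above only mirrors the logic, not the I/O savings.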
Streamlining Membership Data Engineering at Netflix with Psyberg
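Netflix's Membership and Finance Data Engineering team handles diverse data related to plans, pricing, membership lifecycle, and revenue in order to support analytics, power various dashboards, and enable data-driven decisions. Managing this data becomes a significant challenge when it arrives late. To address this, the team developed Psyberg, an incremental data processing framework. The framework handles late-arriving data while keeping the data accurate and complete. This blog series explains Psyberg's internals, its distinctive features, and how it integrates with data pipelines. With Psyberg, data processing becomes more efficient, accurate, and timely.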
Diving Deeper into Psyberg: Stateless vs Stateful Data Processing
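Psyberg is a data processing platform that supports both stateless and stateful processing modes. In stateless mode, it detects changes in Iceberg snapshots based on the provided inputs and stores the relevant information in the psyberg_session_f table. In stateful mode, it can handle multiple input streams and tracks changes in source table snapshots using different timestamp fields. In either mode, Psyberg resolves the partition information for each Iceberg snapshot.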
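The stateless flow can be pictured with a small sketch. It does not use the Iceberg API or Psyberg's actual schema: `Snapshot`, `SessionRecord`, `detect_changes`, and the high-watermark convention are hypothetical names chosen to illustrate detecting new snapshots, resolving their partitions, and producing the record that would be written to a session table such as psyberg_session_f.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: int           # commit time as epoch seconds (assumed)
    partitions: FrozenSet[str]  # partitions written by this snapshot


@dataclass(frozen=True)
class SessionRecord:
    from_ts: int                  # previous high watermark
    to_ts: int                    # new high watermark after this run
    partitions: Tuple[str, ...]   # partitions that need processing


def detect_changes(
    snapshots: List[Snapshot], high_watermark: int
) -> Optional[SessionRecord]:
    """Collect partitions touched by snapshots committed after the watermark."""
    new = [s for s in snapshots if s.committed_at > high_watermark]
    if not new:
        return None  # nothing changed since the last run
    touched = sorted(set().union(*(s.partitions for s in new)))
    return SessionRecord(
        from_ts=high_watermark,
        to_ts=max(s.committed_at for s in new),
        partitions=tuple(touched),
    )
```

A downstream job would read the returned partitions, process only those, and use `to_ts` as the watermark for the next run, so a late write to an old partition is picked up because its snapshot is new even though its partition is not.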
Psyberg: Automated end to end catch up
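This post describes how Psyberg helps automate end-to-end catch-up for data pipelines, including dimension tables. It first reviews Psyberg's core operating modes, stateless and stateful data processing, and then describes the state of the pipelines after integrating Psyberg. It explains in detail how Psyberg handles late-arriving data and presents a general processing flow. It closes by highlighting Psyberg's benefits and applicability, and summarizes how integrating Psyberg with the four components of the customer lifecycle achieves automated catch-up.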
Detecting Speech and Music in Audio Content
When you enjoy the latest season of Stranger Things or Casa de Papel (Money Heist), have you ever wondered about the secrets to fantastic story-telling, besides the stunning visual presentation? From the violin melody accompanying a pivotal scene to the soaring orchestral arrangement and thunderous sound-effects propelling an edge-of-your-seat action sequence, the various components of the audio soundtrack combine to evoke the very essence of story-telling. To uncover the magic of audio soundtracks and further improve the sonic experience, we need a way to systematically examine the interaction of these components, typically categorized as dialogue, music and effects.
In this blog post, we will introduce speech and music detection as an enabling technology for a variety of audio applications in Film & TV, as well as introduce our speech and music activity detection (SMAD) system which we recently published as a journal article in EURASIP Journal on Audio, Speech, and Music Processing.
The Next Step in Personalization: Dynamic Sizzles
At Netflix, we strive to give our members an excellent personalized experience, helping them make the most successful and satisfying selections from our thousands of titles. We already personalize artwork and trailers, but we hadn’t yet personalized sizzle reels — until now.
A sizzle reel is a montage of video clips from different titles strung together into a seamless A/V asset that gets members excited about upcoming launches (for example, our Emmys nominations or holiday collections). Now Netflix can create a personalized sizzle reel dynamically in real time and on demand. The order of the clips and included titles are personalized per member, giving each a unique and effective experience. These new personalized reels are called Dynamic Sizzles.
In this post, we will dive into the exciting details of how we create Dynamic Sizzles with minimal human intervention, including the challenges we faced and the solutions we developed.
Building In-Video Search
Empowering video editors with multimodal machine learning to discover perfect moments across the entire Netflix catalog.
Streaming SQL in Data Mesh
Data powers much of what we do at Netflix. On the Data Platform team, we build the infrastructure used across the company to process data at scale.
In our last blog post, we introduced “Data Mesh” — A Data Movement and Processing Platform. When a user wants to leverage Data Mesh to move and transform data, they start by creating a new Data Mesh pipeline. The pipeline is composed of individual “Processors” that are connected by Kafka topics. The Processors themselves are implemented as Flink jobs that use the DataStream API.
Since then, we have seen many use cases (including Netflix Graph Search) adopt Data Mesh for stream processing. We were able to onboard many of these use cases by offering some commonly used Processors out of the box, such as Projection, Filtering, Unioning, and Field Renaming.
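As a rough mental model of how such built-in Processors compose, the sketch below chains Projection, Filtering, Unioning, and Field Renaming over in-memory records. In Data Mesh these are Flink jobs connected by Kafka topics; the Python generators, function names, and record shape here are assumptions used only to show the composition.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Processor = Callable[[Iterable[Record]], Iterator[Record]]


def projection(fields: List[str]) -> Processor:
    """Keep only the listed fields of each record."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        for r in records:
            yield {k: r[k] for k in fields if k in r}
    return run


def filtering(predicate: Callable[[Record], bool]) -> Processor:
    """Drop records for which the predicate is false."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return run


def field_renaming(mapping: Dict[str, str]) -> Processor:
    """Rename fields according to mapping; other fields pass through."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        for r in records:
            yield {mapping.get(k, k): v for k, v in r.items()}
    return run


def unioning(*streams: Iterable[Record]) -> Iterator[Record]:
    """Merge several input streams into one."""
    return chain(*streams)


def pipeline(source: Iterable[Record], *processors: Processor) -> Iterator[Record]:
    """Feed the source through each processor in order."""
    stream: Iterable[Record] = source
    for processor in processors:
        stream = processor(stream)
    return iter(stream)
```

For example, `pipeline(unioning(a, b), filtering(...), projection([...]), field_renaming({...}))` mirrors a pipeline of four connected Processors, with each generator standing in for one job reading from the previous job's output.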
Zero Configuration Service Mesh with On-Demand Cluster Discovery
Netflix’s service mesh adoption: history, motivations, and how we worked with the Envoy community on a feature to streamline mesh adoption.
AVA Discovery View: Surfacing Authentic Moments
At Netflix, we have created millions of pieces of artwork to represent our titles. Each piece of artwork tells a story about the title it represents. From our testing on promotional assets, we know which of these assets have performed well and which ones haven’t. Through this, our teams have developed an intuition of what visual and thematic artwork characteristics work well for what genres of titles. A piece of promotional artwork may resonate more in certain regions, for certain genres, or for fans of particular talent. The complexity of these factors makes it difficult to determine the best creative strategy for upcoming titles.
Our assets are often created by selecting static image frames directly from our source videos. To improve this process, we decided to invest in creating a Media Understanding Platform, which enables us to extract meaningful insights from media that we can then surface in our creative tools. In this post, we will take a deeper look into one of these tools, AVA Discovery View.