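Company: Pinterest
Pinterest (Chinese name: 缤趣) is a web and mobile application that lets users treat the platform as a visual discovery tool for personal creative ideas and project work. Some also regard it as an image-sharing social network: users can add and organize image collections by topic and share them with friends. The site uses a waterfall-flow layout (Pinterest-style layout).
Pinterest is operated by a team called Cold Brew Labs in Palo Alto, California, and was founded by Ben Silbermann, Paul Sciarra, and Evan Sharp. It officially launched in 2010. The name "Pinterest" combines the words "Pin" and "interest". Among social networking sites, its traffic ranks behind only Facebook, YouTube, VKontakte, and Twitter.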
Bring Your Own Algorithm to Anomaly Detection
In this blog, we present a pragmatic way of integrating analytics, written in Python, with our distributed anomaly detection platform, written in Java. The approach here could be generalized to integrate processing done in one language/paradigm into a platform in another language/paradigm.
Lessons from debugging a tricky direct memory leak
To support metrics reporting for ads from external advertisers and real-time ad budget calculations at Pinterest, we run streaming pipelines using Apache Flink. These jobs have guaranteed an overall 99th percentile availability to our users; however, every once in a while some tasks get hit with nasty direct out-of-memory (OOM) errors on multiple operators.
Training Foundation Improvements for Closeup Recommendation Ranker
Pinterest’s mission is to bring everyone the inspiration to create a life they love. The closeup team helps with this mission by providing a feed of relevant and context-and-user-aware recommendations when a Pinner closes up on any Pin.
The recommendations are powered by innovative and cutting-edge machine learning technologies. We have published a detailed blog post on the modeling architecture. While adopting the newest architectures improves a model’s capabilities, building a solid training foundation stabilizes the model and further raises its potential.
Training foundations cover many aspects, from training preparation (training data logging, feature freshness, sampling strategies, hyperparameter tuning, etc.), to training efficiency optimization (distributed training, model refreshes, GPU training, etc.), to post-training validation (offline replay, etc.).
Building for Inclusivity: The Technical Blueprint of Pinterest’s Multidimensional Diversification
Pinterest’s mission as a company is to bring everyone the inspiration to create a life they love. “Everyone” has been the north star for our Inclusive AI and Inclusive Product teams. These teams work together to ensure algorithmic fairness, inclusive design, and representation are an integral part of our platform and product experience.
Our commitment is evidenced by our history of building products that champion inclusivity. In 2018, Pinterest announced the skin tone signal and skin tone ranges. In 2020, we announced the integration of skin tone ranges into Try on for Beauty. In 2021, we announced hair pattern search. In early 2023, we announced how we have been using our skin tone signal to shape our recommendations to increase skin tone representation across several surfaces. Now, we are expanding this work to also include body type representation in fashion-related results across search and closeup recommendations (also known as related feeds).
Last Mile Data Processing with Ray
Our mission at Pinterest is to bring everyone the inspiration to create the life they love. Machine Learning plays a crucial role in this mission. It allows us to continuously deliver high-quality inspiration to our 460 million monthly active users, curated from billions of pins on our platform. Behind the scenes, hundreds of ML engineers iteratively improve a wide range of recommendation engines that power Pinterest, processing petabytes of data and training thousands of models using hundreds of GPUs.
MLEnv: Standardizing ML at Pinterest Under One ML Engine to Accelerate Innovation
Pinterest’s mission is to bring everyone the inspiration to create a life they love. We rely on an extensive suite of AI-powered products to connect over 460M users to hundreds of billions of Pins, resulting in hundreds of millions of ML inferences per second and hundreds of thousands of ML training jobs per month, all run by just a couple of hundred ML engineers.
In 2021, ML was siloed at Pinterest with 10+ different ML frameworks relying on different deep learning frameworks, framework versions, and boilerplate logic to connect with our ML platform. It was a major bottleneck for ML innovation at Pinterest because the amount of engineering resources spent by each ML team to maintain their own ML stack was immense and there was limited knowledge sharing across teams.
Securely Scaling Big Data Access Controls At Pinterest
Businesses collect many different types of data. Each dataset needs to be securely stored with minimal access granted to ensure it is used appropriately and can easily be located and disposed of when necessary. As businesses grow, so does the variety of these datasets and the complexity of their handling requirements. Consequently, access control mechanisms also need to scale constantly to handle the ever-increasing diversification. Pinterest decided to invest in a new technical framework that implements finer-grained access control (FGAC). The result is a multi-tenant Data Engineering platform that gives users and services access to only the data they require for their work. In this post, we focus on how we enhanced and extended Monarch, Pinterest’s Hadoop-based batch processing system, with FGAC capabilities.
Analyzing Time Series for Pinterest Observability
Time series are a critical part of Observability at Pinterest, powering 60,000 alerts and 5,000 dashboards. A time series is an identifier with a set of values, each associated with a timestamp. Given the widespread use and critical nature of time series, it’s important to give engineers the ability to express what operations to perform on the time series in a readable, understandable, and efficient manner. In this post, we will cover the background of time series at Pinterest, the goals of designing an expressive time series language, and some examples of how we are using this language today.
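To make the definition above concrete, here is a minimal Python sketch of a time series as an identifier plus timestamped values, with one example operation (downsampling into fixed-width buckets). This is only an illustration under our own assumptions: the `TimeSeries` class, the `downsample` function, and the metric name and tags are hypothetical and are not taken from Pinterest's time series language.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class TimeSeries:
    """An identifier (name plus tags) with timestamped values. Hypothetical type."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)
    # Each point is (unix_timestamp_seconds, value).
    points: List[Tuple[int, float]] = field(default_factory=list)


def downsample(series: TimeSeries, interval: int,
               agg: Callable[[List[float]], float] = sum) -> TimeSeries:
    """Group points into fixed-width buckets and aggregate each bucket."""
    buckets: Dict[int, List[float]] = {}
    for ts, value in series.points:
        bucket_start = ts - (ts % interval)  # align to the start of the bucket
        buckets.setdefault(bucket_start, []).append(value)
    new_points = [(start, agg(values)) for start, values in sorted(buckets.items())]
    return TimeSeries(series.name, dict(series.tags), new_points)


# Illustrative data only; the metric name and tags are made up.
requests = TimeSeries(
    name="http.requests",
    tags={"service": "example"},
    points=[(0, 1.0), (30, 2.0), (60, 4.0), (90, 8.0)],
)
per_minute = downsample(requests, interval=60)
print(per_minute.points)  # [(0, 3.0), (60, 12.0)]
```

Passing a different aggregator, for example `downsample(requests, 60, max)`, changes how each bucket is reduced without touching the bucketing logic.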
Tuning Flink Clusters for Stability and Efficiency
At Pinterest, stream data processing powers a wide range of real-time use cases. Our Flink clusters are multitenant and run jobs that concurrently process more than 20M messages per second across 12 clusters. Over the course of 2022 and early 2023, we spent significant time optimizing our Flink runtime environment and cluster configurations, and we’d like to share our learnings with you.
Deep Multi-task Learning and Real-time Personalization for Closeup Recommendations
At Pinterest, Closeup recommendations (aka Related Pins) are typically a feed of recommended content (primarily Pins) that we serve on any Pin closeup. Closeup recommendations generate the largest number of impressions among all recommendation surfaces at Pinterest and are uniquely critical for our users’ inspiration-to-realization journey. It’s important that we surface high-quality, relevant, and context-and-user-aware recommendations for people on Pinterest.
Representation online matters: practical end-to-end diversification in search and recommender systems
Pinterest is a platform designed to bring everyone the inspiration to create a life they love. This is not only our company’s core mission but something that has become increasingly important in today’s interconnected world. As technology becomes increasingly integrated into the daily lives of billions of people globally, it is crucial for online platforms to reflect the diverse communities they serve. Improving representation online can facilitate content discovery for a more diverse user base by reflecting their inclusion on the platform. This, in turn, demonstrates the platform’s ability to meet their needs and preferences. In addition to improved user experience and satisfaction, this can have a positive business impact through increased engagement, retention, and trust in the platform.
In this post, we show how we improved diversification on Pinterest for three different surfaces: Search, Related Products, and New User Homefeed. Specifically, we have developed and deployed scalable diversification mechanisms that utilize a visual skin tone signal to support representation of a wide range of skin tones in recommendations, as shown in Figure 1 for fashion recommendations in the Related Products surface.
Pacer: Pinterest’s New Generation of Asynchronous Computing Platform
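Pinlater, Pinterest’s asynchronous job execution platform, suffered from scalability bottlenecks, low hardware efficiency, a lack of isolation, and usability problems, so Pacer redesigned the architecture and introduced new components and mechanisms. By dividing job queues into partitions that are managed through Helix and Zookeeper, Pacer addresses Pinlater’s problems: it improves the isolation and performance of job execution, reduces lock contention, and raises hardware utilization. Pacer’s dequeue broker service resolves the lock contention problem and uses buffers to make job fetching more efficient. Pacer also uses Helix to manage the large number of partitions and assign them to the appropriate dequeue brokers, which optimizes resource management. This improvement was the result of collaboration across multiple teams, with contributions from Core Services, Data Org, Storage and Caching, Cloud Runtime, and Notifications.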
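The partition-and-broker idea described above can be sketched in a few lines of Python. This is a simplified illustration, not Pacer's implementation: the `Partition` and `DequeueBroker` classes and the `assign_partitions` helper are hypothetical names, in-memory deques stand in for persistent job storage, and a plain round-robin assignment stands in for the partition management that Helix and Zookeeper perform in the real system.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Partition:
    """One slice of a job queue; holds job payloads in FIFO order."""

    def __init__(self, partition_id: int) -> None:
        self.partition_id = partition_id
        self.jobs: Deque[str] = deque()


class DequeueBroker:
    """Owns a set of partitions exclusively and prefetches jobs into a buffer."""

    def __init__(self, name: str, buffer_size: int) -> None:
        self.name = name
        self.buffer_size = buffer_size
        self.partitions: List[Partition] = []
        self.buffer: Deque[str] = deque()

    def refill(self) -> None:
        # Only this broker reads its partitions, so brokers never contend
        # with each other. Visit partitions in turn until the buffer is full
        # or every owned partition is empty.
        progressed = True
        while len(self.buffer) < self.buffer_size and progressed:
            progressed = False
            for partition in self.partitions:
                if partition.jobs and len(self.buffer) < self.buffer_size:
                    self.buffer.append(partition.jobs.popleft())
                    progressed = True

    def dequeue(self) -> Optional[str]:
        # Workers are served from the in-memory buffer; refill when it is empty.
        if not self.buffer:
            self.refill()
        return self.buffer.popleft() if self.buffer else None


def assign_partitions(partitions: List[Partition],
                      brokers: List[DequeueBroker]) -> None:
    # Assumption: round-robin stands in for the real partition assignment.
    for index, partition in enumerate(partitions):
        brokers[index % len(brokers)].partitions.append(partition)


partitions = [Partition(i) for i in range(4)]
for n in range(8):
    partitions[n % 4].jobs.append(f"job-{n}")

brokers = [DequeueBroker("broker-a", 2), DequeueBroker("broker-b", 2)]
assign_partitions(partitions, brokers)

drained: Dict[str, List[str]] = {b.name: [] for b in brokers}
for broker in brokers:
    while (job := broker.dequeue()) is not None:
        drained[broker.name].append(job)
print(drained)
```

Because each partition has exactly one owning broker, two brokers never compete for the same job, and the buffer lets a broker answer most dequeue requests without touching the underlying partitions.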
Warden: Real Time Anomaly Detection at Pinterest
Detecting anomalous events has become increasingly important at Pinterest in recent years. Anomalous events, broadly defined, are rare occurrences that deviate from normal or expected behavior. Because these types of events can be found almost anywhere, opportunities and applications for anomaly detection are vast. At Pinterest, we have explored leveraging anomaly detection, specifically our Warden Anomaly Detection Platform, for several use cases (which we’ll get into in this post). With the positive results we are seeing, we plan to continue expanding our anomaly detection work and use cases.
An ML based approach to proactive advertiser churn prevention
In this blog post, we describe a Machine Learning (ML) powered proactive churn prevention solution that was prototyped with our small & medium business (SMB) advertisers. Results from our initial experiment suggest that we can detect future churn with a high degree of predictive power and consequently empower our sales partners in mitigating churn. ML-powered proactive churn prevention can achieve better results than traditional reactive manual effort.
Large-scale User Sequences at Pinterest
Understanding and responding to user actions and preferences is critical to delivering a personalized, high-quality user experience. In this blog post, we’ll discuss how multiple teams joined together to build a new large-scale, highly flexible, and cost-efficient user signal platform service, which indexes the relevant user events in near real-time, constructs them into user sequences, and makes them easy to use for both online service requests and ML training and inference.
Pinterest is now on HTTP/3
Now Pinterest operates on HTTP/3. We have enabled HTTP/3 for major Pinterest production domains on our multi-CDN edge network, and we’ve upgraded client apps’ network stack to support the new protocol. This allows us to catch up with industry trends. Most importantly, faster and more reliable networking improves Pinners’ experience and business metrics.