
Company: Pinterest

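Pinterest (Chinese name: 缤趣) is a web and mobile application that lets users treat the platform as a visual discovery tool for personal creative ideas and projects. Some also regard it as an image-sharing social network: users can add and organize their image collections by topic and share them with friends. The site uses a waterfall layout (also known as a Pinterest-style layout).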

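Pinterest is operated by Cold Brew Labs, a team based in Palo Alto, California, and was founded by Ben Silbermann, Paul Sciarra, and Evan Sharp. It officially launched in 2010. The name “Pinterest” combines the words “pin” and “interest”. Among social networking sites, its traffic ranks behind only Facebook, YouTube, VKontakte, and Twitter.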

Handling Network Throttling with AWS EC2 at Pinterest

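While running on AWS EC2 instances, Pinterest hit network performance bottlenecks, in particular network throttling when handling high traffic and bulk data uploads. By introducing monitoring based on ENA metrics, Pinterest gained fine-grained control over its network traffic, and it applied rate limiting, data compression, and other techniques to optimize network performance and reduce the impact of throttling on critical services. The experience offers a useful reference on network performance management for other AWS EC2 users.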
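To make the ENA-based monitoring idea concrete, here is a minimal sketch that parses the allowance-exceeded counters the ENA driver reports through `ethtool -S <interface>` and flags the ones that grew between two samples. The function names and the sampling scheme are illustrative assumptions, not Pinterest's tooling.

```python
import re

# Allowance-exceeded counters exposed by the ENA driver via `ethtool -S`.
ALLOWANCE_COUNTERS = (
    "bw_in_allowance_exceeded",
    "bw_out_allowance_exceeded",
    "pps_allowance_exceeded",
    "conntrack_allowance_exceeded",
    "linklocal_allowance_exceeded",
)


def parse_ena_stats(ethtool_output):
    """Extract the allowance-exceeded counters from `ethtool -S` text."""
    stats = {}
    for line in ethtool_output.splitlines():
        match = re.match(r"\s*(\w+):\s*(\d+)\s*$", line)
        if match and match.group(1) in ALLOWANCE_COUNTERS:
            stats[match.group(1)] = int(match.group(2))
    return stats


def throttled_counters(previous, current):
    """Return how much each counter grew between two samples (hypothetical helper).

    A counter that grew means packets were queued or dropped because the
    instance exceeded that allowance during the sampling interval.
    """
    return {
        name: value - previous.get(name, 0)
        for name, value in current.items()
        if value > previous.get(name, 0)
    }
```

In practice the two samples would come from running `ethtool -S` on a fixed interval and feeding the growth into an alerting system; that wiring is left out here.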

Improving Pinterest Search Relevance Using Large Language Models

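Pinterest uses large language models (LLMs) to improve search relevance. A cross-encoder architecture predicts the relevance between a query and a Pin, and several text features (such as titles, descriptions, and synthetic image captions) enrich the Pin representation. Because serving an LLM in real time is expensive, the LLM is compressed through knowledge distillation into a lightweight student model that can be served online. Experiments show that the approach significantly improves search relevance and user satisfaction, and that it performs especially well in multilingual settings. Future work will explore servable LLMs and the integration of multimodal models to improve the system further.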
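The distillation step can be illustrated with a generic soft-label loss: the student is trained to match the teacher's softened distribution over relevance levels. This is a minimal sketch of textbook knowledge distillation, not the loss Pinterest reports; the temperature value and the number of relevance levels are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened distribution.

    For a fixed teacher, the loss is smallest when the student reproduces
    the teacher's distribution exactly.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


# Hypothetical logits over five relevance levels for one (query, Pin) pair.
teacher_logits = [0.1, 0.3, 1.2, 2.5, 0.4]
good_student = [0.0, 0.4, 1.0, 2.4, 0.5]
poor_student = [2.5, 1.2, 0.3, 0.1, 0.4]
```

A student whose logits track the teacher's (`good_student`) yields a lower loss than one that ranks the relevance levels differently (`poor_student`).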

Building Holiday Finds: How Pinterest Engineers Reimagined Gift Discovery

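Pinterest launched the “Holiday Finds” feature to simplify holiday gift discovery. It refines the personalized recommendation system, generates wishlists automatically, and pairs them with a dynamic UI framework to give users an immersive shopping experience. The technical core consists of candidate generation, ranking, and blending, which keep recommendations diverse and suited to holiday needs. Automatically generated wishlists spare users tedious manual work and make shopping more efficient. The overall design emphasizes cross-platform compatibility and user experience, laying the groundwork for future shopping features.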

Module Relevance on Homefeed

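Pinterest's Homefeed introduced modular content, dynamically blending modules with Pins to improve the user experience. Modules come in two types, landing page and carousel, which give users more control over browsing and more context for recommendations. A fatigue mechanism reduces the display of modules that users are not interested in. A module ranking model uses user engagement data to order modules. Skip slot blending adjusts module positions dynamically based on predicted engagement, so that a module is shown only when its predicted engagement is higher than that of the competing Pin. Future plans include further improvements to the ranking and blending strategies.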
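One way to read the skip slot idea is sketched below: at each position where a module is allowed, the best remaining module is compared with the next Pin, and the slot is skipped when the Pin's predicted engagement is higher. The function name, the candidate slot list, and the scores are hypothetical; the production blender is not described in this summary.

```python
def blend_with_skip_slots(pins, modules, slots):
    """Blend ranked modules into a ranked list of Pins.

    pins, modules: lists of (item_id, predicted_engagement), sorted descending.
    slots: feed positions (0-based) where a module is allowed to appear.
    A module takes a slot only if it beats the Pin that would otherwise
    occupy it; otherwise the slot is skipped and the Pin is shown.
    """
    allowed = set(slots)
    feed = []
    pin_idx = 0
    module_idx = 0
    while pin_idx < len(pins):
        position = len(feed)
        if position in allowed and module_idx < len(modules):
            if modules[module_idx][1] > pins[pin_idx][1]:
                feed.append(modules[module_idx][0])
                module_idx += 1
                continue
        feed.append(pins[pin_idx][0])
        pin_idx += 1
    return feed


pins = [("pin_1", 0.9), ("pin_2", 0.5), ("pin_3", 0.4)]
modules = [("module_1", 0.6)]
blended = blend_with_skip_slots(pins, modules, slots=[0, 1])
```

Here the module loses to `pin_1` at position 0, so that slot is skipped, and it wins against `pin_2` at position 1.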

The Quest to Understand Metric Movements

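Pinterest uses three methods to analyze why key metrics move. First, “slice analysis” breaks a metric down along its dimensions to find the segments that changed significantly. Second, “general similarity” scans other metrics to find ones that moved in a similar or opposite way to the target metric. Finally, “experiment effects” analyzes A/B tests to identify the experiments with the largest impact on the metric. Used together, these methods narrow the root-cause search space and make analysis more efficient.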
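The first two methods can be sketched in a few lines. The code below attributes a metric's total change to its segments and computes a Pearson correlation between two metric series as one possible similarity measure. Both functions are illustrative assumptions; the summary does not say which statistics Pinterest actually uses.

```python
import math


def slice_contributions(before, after):
    """Rank segments by how much they moved between two periods.

    before, after: dicts mapping segment name -> metric value.
    Returns (segment, delta, share_of_total_change) tuples, largest move first.
    """
    total_change = sum(after.values()) - sum(before.values())
    rows = []
    for segment in sorted(set(before) | set(after)):
        delta = after.get(segment, 0) - before.get(segment, 0)
        share = delta / total_change if total_change else 0.0
        rows.append((segment, delta, share))
    rows.sort(key=lambda row: abs(row[1]), reverse=True)
    return rows


def pearson(xs, ys):
    """Pearson correlation; near +1 means similar movement, near -1 opposite."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily values of one metric, split by platform.
before = {"ios": 100, "android": 100, "web": 50}
after = {"ios": 100, "android": 60, "web": 50}
top_segment = slice_contributions(before, after)[0]
```

In this toy data the whole drop comes from the `android` segment, which is the kind of signal slice analysis is meant to surface.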

Advancements in Embedding-Based Retrieval at Pinterest Homefeed

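The Pinterest Homefeed team improves personalized recommendations with embedding-based retrieval, adopting advanced feature crossing and ID embeddings and upgrading the serving corpus to strengthen machine-learned retrieval. The improved two-tower model incorporates the MaskNet and DHEN frameworks to raise user engagement. Pre-trained ID embeddings further improve recommendation precision, while multi-embedding retrieval and conditional retrieval increase diversity and user interaction and make the recommendation system more efficient.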

Establishing a Large Scale Learned Retrieval System at Pinterest

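Within its multi-stage recommendation system, Pinterest uses a two-tower model with automatic retraining to improve content retrieval and user engagement. The model generates embeddings from users' long-term and short-term behavior, and it combines online serving with offline indexing so that model versions stay synchronized and fast rollback is supported. Since launch, the new system has significantly improved user coverage and save rate, replaced other candidate generators, and lifted overall user engagement.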
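A minimal sketch of the serving side of a two-tower retriever follows: the user embedding is scored against indexed Pin embeddings by dot product, and the request is rejected if the user tower and the index come from different model versions. The index layout, the version strings, and the brute-force scan are assumptions made for illustration; a production system would use an approximate nearest neighbor index.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_embedding, user_model_version, index, k=2):
    """Return the ids of the top-k Pins by dot product with the user embedding.

    index: {"model_version": str, "embeddings": {pin_id: vector}} (hypothetical).
    Embeddings from different model versions live in different vector spaces,
    so a version mismatch is treated as an error instead of being scored.
    """
    if user_model_version != index["model_version"]:
        raise ValueError(
            "user tower %s does not match index %s"
            % (user_model_version, index["model_version"])
        )
    scored = (
        (dot(user_embedding, vector), pin_id)
        for pin_id, vector in index["embeddings"].items()
    )
    return [pin_id for _, pin_id in heapq.nlargest(k, scored)]


index = {
    "model_version": "v2",
    "embeddings": {
        "pin_a": [1.0, 0.0],
        "pin_b": [0.0, 1.0],
        "pin_c": [0.7, 0.7],
    },
}
top_pins = retrieve([1.0, 0.2], "v2", index, k=2)
```

Keeping the version check next to the scoring code is one simple way to make a rollback safe: reverting the index and the user tower together restores a consistent pair.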

How Optimizing Memory Management with LMDB Boosted Performance on Our API Service

NGAPI, the API platform for serving all first party client API requests, requires optimized system performance to ensure a high success rate of requests and allow for maximum efficiency to provide Pinners worldwide with engaging content. Recently, our team made a significant improvement in handling memory pressure to our API service by implementing a Lightning Memory-Mapped Database (LMDB) to streamline memory management and enhance the overall efficiency of our fleet of hosts. To handle parallelism, NGAPI relies on a multi-process architecture with gevent for per-process concurrency. However, at Pinterest scale, this can cause an increase in memory pressure, leading to efficiency bottlenecks. Moving to LMDB reduced our memory usage by 4.5%, freeing up 4.5 GB per host, which allowed us to increase the number of processes running on each host from 64 to 66, resulting in a greater number of requests each host could handle and better CPU utilization, thus reducing our overall fleet size. [1] The result? More happy Pinners, per host!

Simplify Pinterest Conversion Tracking with NPM Packages

Pinterest conversions are critical for businesses looking to optimize their campaigns and track the performance of their advertisements. By leveraging Pinterest’s Conversion API and Conversion Tag, advertisers can gain deeper insights into user behavior and fine-tune their marketing efforts.

To make this process seamless for developers, we’ve created two NPM packages: pinterest-conversions-server and pinterest-conversions-client. These packages simplify the integration of Pinterest’s Conversion API and Conversion Tag, offering robust solutions for server-side and client-side tracking.

How Pinterest Leverages Honeycomb to Enhance CI Observability and Improve CI Build Stability

At Pinterest, our mobile infrastructure is core to delivering a high-quality experience for our users. In this blog, I’ll showcase how the Pinterest Mobile Builds team is leveraging Honeycomb (starting in 2021) to enhance observability and performance in our mobile builds and continuous integration (CI) workflows.

Resource Management with Apache YuniKorn™ for Apache Spark™ on AWS EKS at Pinterest

Monarch, Pinterest’s Batch Processing Platform, was initially designed to support Pinterest’s ever-growing number of Apache Spark and MapReduce workloads at scale. During Monarch’s inception in 2016, the most dominant batch processing technology around to build the platform was Apache Hadoop YARN. Now, eight years later, we have made the decision to move off of Apache Hadoop and onto our next generation Kubernetes (K8s) based platform.

Ray Batch Inference at Pinterest (Part 3)

Offline batch inference involves operating over a large dataset and passing the data in batches to a ML model which will generate a result for each batch. Offline batch inference jobs generally consist of a series of steps: dataloading, preprocessing, inference, post processing, and result writing. These offline batch inference jobs can be both I/O and compute intensive.
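The sequence of steps can be sketched as a small single-process pipeline. This is only an illustration of the data flow; it does not use Ray, and `preprocess`, `toy_model`, and `postprocess` are hypothetical stand-ins for real feature extraction, a real model, and real output formatting.

```python
from itertools import islice


def batched(rows, batch_size):
    """Yield consecutive lists of up to batch_size rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def preprocess(row):
    # Hypothetical feature: the length of the text field.
    return len(row["text"])


def toy_model(features):
    # Stand-in for a real model: one score per input feature.
    return [feature * 2 for feature in features]


def postprocess(row, score):
    return {"id": row["id"], "score": score}


def run_batch_inference(rows, model, batch_size, sink):
    """Load -> preprocess -> infer -> postprocess -> write, one batch at a time."""
    for batch in batched(rows, batch_size):
        features = [preprocess(row) for row in batch]
        scores = model(features)
        sink.extend(postprocess(row, score) for row, score in zip(batch, scores))


dataset = [{"id": i, "text": "x" * i} for i in range(5)]
results = []
run_batch_inference(dataset, toy_model, batch_size=2, sink=results)
```

Because the loading and writing steps are I/O bound while inference is compute bound, real systems typically run these stages concurrently; the sequential loop here only shows the order of operations.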

Structured DataStore (SDS): Multi-model Data Management With a Unified Serving Stack

In this blog, we will show how the team transitioned from supporting multiple query serving stacks to provide different data models to a brand new data serving platform with a unified multi model query serving stack called Structured DataStore (SDS).

Feature Caching for Recommender Systems w/ Cachelib

At Pinterest, we operate a large-scale online machine learning inference system, where feature caching plays a critical role to achieve optimal efficiency. In this blog post, we will discuss our decision to adopt Cachelib project by Meta Open Source (“Cachelib”) and how we have built a high-throughput, flexible feature cache by leveraging and expanding upon the capabilities of Cachelib.
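As a rough illustration of what a feature cache does in front of a feature store, here is a small read-through cache with least-recently-used eviction. It is a toy Python stand-in and not Cachelib, which is a C++ library; the class name, the eviction policy, and the `fetch` callback are assumptions.

```python
from collections import OrderedDict


class FeatureCache:
    """Read-through LRU cache keyed by entity id (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch):
        """Return cached features, calling fetch(key) on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = fetch(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value


fetch_calls = []


def fetch_features(key):
    # Stand-in for a remote feature store lookup.
    fetch_calls.append(key)
    return {"entity": key, "features": [len(key)]}


cache = FeatureCache(capacity=2)
for key in ["pin_a", "pin_b", "pin_a", "pin_c", "pin_b"]:
    cache.get(key, fetch_features)
```

In this access pattern the second `pin_a` lookup is served from the cache, while `pin_b` is evicted when `pin_c` arrives and has to be fetched again.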

Pinterest Tiered Storage for Apache Kafka®️: A Broker-Decoupled Approach

When it comes to PubSub solutions, few have achieved higher degrees of ubiquity, community support, and adoption than Apache Kafka®️, which has become the industry standard for data transportation at large scale. At Pinterest, petabytes of data are transported through PubSub pipelines every day, powering foundational systems such as AI training, content safety and relevance, and real-time ad bidding, bringing inspiration to hundreds of millions of Pinners worldwide. Given the continuous growth in PubSub-dependent use cases and organic data volume, it became paramount that PubSub storage must be scaled to meet growing storage demands while lowering the per-unit cost of storage.

Improving Efficiency Of Goku Time Series Database at Pinterest (Part — 3)

At Pinterest, one of the pillars of the observability stack provides internal engineering teams (our users) the opportunity to monitor their services using metrics data and set up alerting on it. Goku is our in-house time series database that provides cost efficient and low latency storage for metrics data.
