Pinterest is operated by Cold Brew Labs, a team based in Palo Alto, California, founded by Ben Silbermann, Paul Sciarra, and Evan Sharp. The site officially launched in 2010. The name "Pinterest" is a blend of the words "pin" and "interest"; among social networking sites, its traffic trails only Facebook, YouTube, VKontakte, and Twitter.
Since its inception, Pinterest's philosophy has centered on data. As a data-driven company, we store all ingested data for further use: about 600 terabytes of new data every day, and over 500 petabytes in total. At this scale, big data tooling plays a critical role in enabling the company to gather meaningful insights. This is where the workflow team comes in. We facilitate over 4,000 workflows, which produce 10,000 daily flow executions and 38,000 daily job executions on average.
In our last blog post, we discussed how we decided to move from our legacy system, Pinball, to our new system, Spinner, which is built on top of the Apache Airflow project, and how we carried out that move. As a reminder, Spinner is based on a custom branch cut from Airflow's 1.10-stable branch, with some features cherry-picked from the master branch.
In this post, we will explain how we approached and designed the migration, identified requirements, and coordinated with all of our engineering teams to seamlessly migrate 3,000+ workflows to Airflow. We will dive deep into the trade-offs we made, but first we want to share our learnings.
Engineers hate migrations. What do engineers hate more than migrations? Data migrations. Especially critical, terabyte-scale, online serving migrations which, if done badly, could bring down the site, enrage customers, or cripple hundreds of critical internal services.
So why did the Key-Value Systems Team at Pinterest embark on a two-year realtime migration of all our online key-value serving data to a single unified storage system? Because the cost of not migrating was too high. In 2019, Pinterest had four separate key-value systems owned by different teams, each with its own API and feature set. This resulted in duplicated development effort, high operational overhead, elevated incident counts, and confusion among engineering customers.
In unifying all of Pinterest's 500+ key-value use cases (over 4PB of unique data serving hundreds of millions of QPS) onto a single interface, we not only made huge gains in reducing system complexity and lowering operational overhead; we also achieved a 40–90% performance improvement by moving to the most efficient storage engine, and saved the company a significant amount per year by moving to the most optimal replication and versioning architecture.
In this blog post, we selected three (out of many more) innovations to dive into that helped us notch all these wins.
Despite the explosive growth of the internet over the past couple of decades, much of its digitized knowledge has been curated for human understanding and remains unfriendly to machine comprehension. Even promising efforts toward a semantic web, like the Resource Description Framework in Attributes (RDFa), the Web Ontology Language (OWL), JSON-LD, and the Open Graph Protocol, are in their infancy and fall short for commercial applications due to data sparsity and high variance in data quality across websites. Hence, Web Information Extraction (WIE), colloquially known as scraping, is the dominant knowledge acquisition strategy for many organizations in advertising, commerce, search, travel, and beyond. Pinterest uses this approach to bring high-level information (like price and product description) from saved websites to the Pin level, to provide Pinners with more information along with a link back to the original website, and to ultimately help them take action.
Reading both parts of this series will give you insight into some of the debugging techniques we use on the Pinterest Engineering Key-Value Systems team (a team that grew out of the previous Serving Systems team). Related projects owned by this team are covered in blog posts and presentations on Terrapin, Rocksplicator (parts 1 and 2), Aperture, and Realpin.
Ideas fuel innovation. Innovation drives our product toward our mission of bringing everyone the inspiration to create a life they love. The speed of innovation is determined by how quickly we can get a signal or feedback on the promise of an idea so we can learn whether to pursue or pivot. Online experimentation is often used to evaluate product ideas, but it is costly and time-consuming. Could we predict experiment outcomes without even running an experiment? Could it be done in hours instead of weeks? Could we rapidly pick only the best ideas to run an online experiment? This post will describe how Pinterest uses offline replay experimentation to predict experiment results in advance.
In our efforts to shift left (in which testing is performed earlier, or moved left on the project timeline), this blog covers how we began running a large end-to-end UI test suite before every commit to our Android and iOS repositories. This project involved careful coordination of UI testing, test infrastructure, and developer productivity.
Like many companies, Pinterest sees an increase in traffic in the last three months of the year. We need to make sure our systems are ready for this increase so we don't run into unexpected problems, especially as Pinners come to Pinterest at this time for holiday planning and shopping. Therefore, we run a yearly exercise of testing our systems under additional load, verifying that they can handle the expected traffic increase. For Druid, we run several checks to verify:
- Queries: We make sure the service is able to handle the expected increase in QPS while at the same time supporting the P99 Latency SLA our clients need.
- Ingestion: We verify that the real-time ingestion is able to handle the increase in data.
- Data size: We confirm that the storage system has sufficient capacity to handle the increased data volume.
In this post, we’ll provide details about how we run the holiday load test and verify Druid is able to handle the expected increases mentioned above.
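The query-side checks above boil down to two measurable criteria: sustained QPS and a p99 latency ceiling. Below is a minimal sketch of how such a check could be expressed; the function names, the nearest-rank percentile method, and the pass/fail criteria are illustrative assumptions, not Pinterest's actual load-test harness.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def passes_load_test(latencies_ms, duration_s, p99_sla_ms, target_qps):
    """Check the two query-side criteria: p99 latency stays under
    the SLA while the service sustains at least the target QPS."""
    observed_qps = len(latencies_ms) / duration_s
    p99 = percentile(latencies_ms, 99)
    return observed_qps >= target_qps and p99 <= p99_sla_ms
```

A real holiday load test would replay or synthesize production-shaped queries against the cluster; this sketch only shows how the acceptance criteria might be evaluated once the latency samples are collected.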
As Pinterest continues to evolve from a place to simply save ideas into a platform for discovering content that inspires action, there has been an increase in native content from creators publishing directly to Pinterest. With the creator ecosystem on Pinterest growing, we're committed to ensuring Pinterest remains a positive and inspiring environment through initiatives like the Creator Code, a content policy that requires creators to accept guidelines (such as "be kind" and "check facts") before they can publish Idea Pins. We also have guardrails in place on Idea Pin comments, including positivity reminders, tools for comment removal and keyword filtering, and spam prevention signals. On the technical side, we use cutting-edge machine learning techniques to identify and enforce against policy-violating comments in near real time. We also use these techniques to surface the most inspiring and highest-quality comments first, bringing Pinners a more productive experience and driving engagement.
Since machine learning solutions were introduced in March to automatically detect potentially policy-violating comments before they’re reported and take appropriate action, we’ve seen a 53% decline in comment report rates (user comment reports per 1 million comment impressions).
Here, we share how we built a scalable near-real time machine learning solution to identify policy-violating comments and rank comments by quality.
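Conceptually, the pipeline combines two model outputs per comment: a policy-violation score used for enforcement and a quality score used for ranking. The sketch below is a toy illustration of that two-step shape; the field names, threshold, and scores are hypothetical and stand in for the real model outputs described later in the post.

```python
def moderate_and_rank(comments, violation_threshold=0.9):
    """Toy two-step pipeline (hypothetical, not Pinterest's actual models):
    1. Filter out comments whose policy-violation score exceeds a threshold.
    2. Surface the remaining comments by descending quality score."""
    kept = [c for c in comments if c["violation_score"] < violation_threshold]
    return sorted(kept, key=lambda c: c["quality_score"], reverse=True)
```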
Pinterest is a visual discovery engine that helps Pinners find inspirational ideas. Advertisers use Pinterest to connect with Pinners on these journeys to inspiration, and seek to promote products or services efficiently.
The Ads Intelligence team at Pinterest builds products that help advertisers maximize the value they get out of their ad campaigns. As part of that initiative, we have recently launched the Campaign Budget Optimization product for Pinterest Ads.
Campaign Budget Optimization, or CBO, is an automated ads product that benefits advertisers by automatically distributing each campaign's advertising budget across its underlying ad groups. The goals of Campaign Budget Optimization are to:
- Maximize advertiser value, for example driving clicks or conversions, depending on the campaign
- Improve the budget utilization of the campaign by allowing the budget to be shared across ad groups
- Simplify the advertiser experience and eliminate the need for manual budget adjustments
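To make the idea of automated distribution concrete, here is a toy allocator that splits a campaign budget across ad groups in proportion to each group's predicted value (e.g. expected clicks or conversions). This is purely illustrative; CBO's actual allocation algorithm is not shown here, and the function and field names are assumptions.

```python
def distribute_budget(campaign_budget, predicted_values):
    """Toy proportional allocator (hypothetical, not Pinterest's algorithm):
    each ad group receives a share of the campaign budget proportional
    to its predicted value."""
    total = sum(predicted_values.values())
    if total == 0:
        # No signal yet: fall back to an even split across ad groups.
        even = campaign_budget / len(predicted_values)
        return {group: even for group in predicted_values}
    return {group: campaign_budget * value / total
            for group, value in predicted_values.items()}
```

For example, a $100 campaign with one ad group predicted to drive three times the value of another would be split 75/25, rather than relying on the advertiser to rebalance the two budgets by hand.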
The Logging Platform powers all data ingestion and transportation at Pinterest. At its heart are distributed PubSub systems that help our customers transport and buffer data and consume it asynchronously.
In this blog post we introduce MemQ (pronounced "mem queue"), an efficient, scalable PubSub system developed for the cloud at Pinterest. It has been powering near-real-time data transportation use cases for us since mid-2020 and complements Kafka while being up to 90% more cost-efficient.
Pinterest surfaces billions of ideas to people every day, and the neural modeling of embeddings for content, users, and search queries are key in the constant improvement of these machine learning-powered recommendations. Good embeddings — representations of discrete entities as vectors of numbers — enable fast candidate generation and are strong signals to models that classify, retrieve and rank relevant content.
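As a minimal illustration of how embeddings enable candidate generation, the sketch below retrieves the Pins whose embedding vectors are most similar to a query embedding under cosine similarity. The brute-force scan and all names are assumptions for illustration; production retrieval at this scale uses approximate nearest-neighbor indexes rather than exhaustive search.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_candidates(query_emb, pin_embs, k=2):
    """Brute-force nearest-neighbor retrieval: rank every Pin embedding
    by similarity to the query embedding and keep the top k."""
    scored = sorted(pin_embs.items(),
                    key=lambda item: cosine(query_emb, item[1]),
                    reverse=True)
    return [pin_id for pin_id, _ in scored[:k]]
```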
We began our representation learning workstream with Visual Embeddings, a convolutional neural network (CNN) based Image representation, then moved toward PinSage, a graph-based multi-modal Pin representation. We expanded into more use cases such as PinnerSage, a user representation based on clustering a user’s past Pin actions, and have since worked with even more entities including search queries, Idea Pins, shopping items and content creators.
In this blog post we focus on SearchSage, our search query representation, and detail how we built and launched SearchSage for search retrieval and ranking to increase relevance of recommendations and engagement in search across organic Pins, Product Pins, and ads. Now used for 15+ use cases, this embedding is one of the most important features in both our organic and ads relevance models, and has led to metric wins such as an 11% increase in 35s+ click-throughs on product Pins in search, and a 42% increase in related searches.
Pinterest's Batch Processing Platform, Monarch, runs most of the company's batch processing workflows. At the scale shown in Table 1, it is important to manage platform resources to provide quality of service (QoS) while achieving cost efficiency. This article shares how we do that and what we plan next.
The Pinterest ads business has grown multi-fold over the past couple of years, with respect to both advertisers and users. As we scale our revenue, it becomes imperative to:
- Distribute advertiser spend smoothly over the course of the day
- Avoid overspending beyond the advertiser's daily or lifetime budget
- Maximize advertiser value
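A common way to meet the first two goals is a pacing rule that admits an ad into the auction only while its cumulative spend tracks a schedule, with a hard stop at the budget cap. The sketch below uses a uniform schedule purely for illustration; the function name, the uniform curve, and the decision rule are assumptions, not Pinterest's actual pacer (which would typically follow the expected traffic curve over the day).

```python
def pacing_decision(spent_so_far, daily_budget, fraction_of_day_elapsed):
    """Toy pacing rule (hypothetical): participate in the auction only
    while cumulative spend is at or below a uniform spend schedule,
    and always stop once the daily budget is exhausted."""
    if spent_so_far >= daily_budget:
        return False  # hard stop: never exceed the daily budget
    target_spend = daily_budget * fraction_of_day_elapsed
    return spent_so_far <= target_spend
```

Under this rule, a campaign that has spent well ahead of schedule is throttled until the schedule catches up, which smooths spend across the day instead of exhausting the budget in the first few hours.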
Pinterest is a place where users (Pinners) can save and discover content on both web and mobile platforms, and where, increasingly, Creators can publish native content directly to Pinterest. We hold billions of pieces of content (Pins) in our corpus and serve personalized recommendations that inspire Pinners to create a life they love. One of the key and most complex surfaces at Pinterest is the home feed, where Pinners see feeds personalized to their engagement and interests. In this blog, we discuss how we unified our lightweight scoring layer across the various candidate generators that power home feed recommendations.
In this blog post series, we discuss Pinterest's Analytics as a Platform on Druid and share some learnings from using Druid. This third post in the series covers learnings from optimizing Druid for real-time use cases.