Company: Meta
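Meta Platforms, Inc. (trade name: Meta) is an American internet technology company whose products span social networking services, virtual reality, and the metaverse. It is headquartered in Menlo Park, California, and owns social apps including Facebook, Instagram, and WhatsApp. It was founded by Mark Zuckerberg together with his roommates and fellow Harvard College students Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes. It was originally named TheFacebook.com, later simplified to Facebook, and on October 28, 2021, Zuckerberg announced that the company would be renamed Meta.
Beyond social networking, Meta offers other products and services, including Facebook Messenger, Facebook Watch, and Facebook Portal. It went on to acquire Instagram, WhatsApp, Oculus, Giphy, and Mapillary, and it holds a 9.9% stake in Jio Platforms.
Meta is one of the most valuable companies in the world and, together with Microsoft, Amazon, Apple, and Alphabet, is regarded as one of the Big Five technology companies.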
How we think about Threads’ iOS performance
How did the Threads iOS team maintain the app’s performance during its incredible growth? Here’s how Meta’s Threads team thinks about performance, including the key metrics we mon…
Sequence learning: A paradigm shift for personalized ads recommendations
AI plays a fundamental role in creating valuable connections between people and advertisers within Meta’s family of apps. Meta’s ad recommendation engine, powered by deep learning recommendation mo…
How Meta built large-scale cryptographic monitoring
Cryptographic monitoring at scale has been instrumental in helping our engineers understand how cryptography is used at Meta. Monitoring has given us a distinct advantage in our efforts to proactiv…
IPLS: Privacy-preserving storage for your WhatsApp contacts
Your contact list is fundamental to the experiences you love and enjoy on WhatsApp. With contacts, you know which of your friends and family are on WhatsApp, you can easily message or call them, an…
How Meta enforces purpose limitation via Privacy Aware Infrastructure at scale
Purpose limitation, a core data protection principle, is about ensuring data is only processed for explicitly stated purposes. A crucial aspect of purpose limitation is managing data as it flows across systems and services. Commonly, purpose limitation can rely on “point checking” controls at the point of data processing. This approach involves using simple if statements in code (“code assets”) or access control mechanisms for datasets (“data assets”) in data systems. However, this approach can be fragile as it requires frequent and exhaustive code audits to ensure the continuous validity of these controls, especially as the codebase evolves. Additionally, access control mechanisms manage permissions for different datasets to reflect various purposes using mechanisms like access control lists (ACLs), which requires the physical separation of data into distinct assets to ensure each maintains a single purpose. When Meta started to address more and larger-scope purpose limitation requirements that crossed dozens of our systems, these point checking controls did not scale.
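To make the "point checking" idea concrete, here is a minimal sketch of both kinds of control described above. It is an illustration only, not Meta's implementation: the dataset names, the `Request` type, and both functions are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ACL: each dataset (a "data asset") is tied to a single purpose,
# which is why the data must be physically separated per purpose.
DATASET_ACL = {
    "messaging_contacts": {"messaging"},
    "ads_signals": {"ads"},
}


@dataclass(frozen=True)
class Request:
    caller: str   # hypothetical service name
    purpose: str  # the purpose the caller declares


def read_dataset(name: str, request: Request) -> str:
    """Point check on a data asset: the ACL decides who may read it."""
    allowed = DATASET_ACL.get(name, set())
    if request.purpose not in allowed:
        raise PermissionError(f"{request.purpose!r} may not read {name!r}")
    return f"rows from {name}"


def build_ad_features(request: Request) -> str:
    """Point check in a code asset: a plain if statement guards processing."""
    if request.purpose != "ads":
        raise PermissionError("ad features are limited to the 'ads' purpose")
    return read_dataset("ads_signals", request)
```

Every new code path that touches the data needs its own check like these, which is the audit burden the excerpt describes.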
How Meta animates AI-generated images at scale
We launched Meta AI with the goal of giving people new ways to be more productive and unlock their creativity with generative AI (GenAI). But GenAI also comes with challenges of scale. As we deploy new GenAI technologies at Meta, we also focus on delivering these services to people as quickly and efficiently as possible.
Meta AI’s animate feature, which lets people generate a short animation of a generated image, carried unique challenges in this regard. To deploy and run at scale, our model to generate image animations had to be able to serve billions of people who use our products and services, do so quickly – with fast generation times and minimal errors, and remain resource efficient.
Here’s how we were able to deploy Meta AI’s animate feature using a combination of latency optimizations, traffic management, and other novel techniques.
Taming the tail utilization of ads inference at Meta scale
Tail utilization is a significant system issue and a major factor in overload-related failures and low compute utilization. The tail utilization optimizations at Meta have had a profound impact on …
Logarithm: A logging engine for AI training workflows and services
Systems and application logs play a key role in operations, observability, and debugging workflows at Meta. Logarithm is a hosted, serverless, multitenant service, used only internally at Meta, tha…
Simple Precision Time Protocol at Meta
We’ve previously spoken in great detail about how Precision Time Protocol is being deployed at Meta, including the protocol itself and Meta’s precision time architecture.
As we deployed PTP into one of our data centers, we were also evaluating and testing alternative PTP clients. In doing so, we soon realized that we could eliminate a lot of complexity in the PTP protocol itself that we experienced during data center deployments while still maintaining complete hardware compatibility with our existing equipment.
This is how the idea of Simple Precision Time Protocol (SPTP) was born.
But before we dive under the hood of SPTP we should explore why the IEEE 1588 G8265.1 and G8275.2 unicast profiles (here, we just call them PTP) weren’t a perfect fit for our data center deployment.
AI debugging at Meta with HawkEye
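HawkEye is a powerful toolkit for monitoring, observability, and debugging that supports machine learning workflows. By analyzing model snapshots and feature importance, it gives users a ranked list of the features responsible for prediction anomalies. HawkEye can also trace the lineage of upstream data and pipelines, and its visual workflows help pinpoint the root cause of a problem. In addition, it can diagnose issues with model snapshots and training data, and it provides tools and visualizations for doing so. Going forward, HawkEye will continue to evolve, offering product teams more debugging workflows and expanding its capabilities. Thanks to all current and past members of the HawkEye team and its partners for their support, with special thanks to Girish Vaitheeswaran, Atul Goyal, YJ Liu, Shiblee Sadik, Peng Sun, Adwait Tumbde, Karl Gyllstrom, Sean Lee, Dajian Li, Yu Quan, Robin Tafel, Ankit Asthana, Gautam Shanbhag, and Prabhakar Goyal for their contributions.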
Automating data removal
This is the third and final post in our series on Meta’s Systematic Code and Asset Removal Framework (SCARF). SCARF contains a combination of subsystems that analyze code and data usage throughout Meta to aid in the safe and efficient removal of deprecated products. In our first post on automating product deprecation, we discussed the complexities of product deprecations and introduced SCARF’s workflow management tools that guide engineers through a coordinated process to safely deprecate products. In the second post on automating dead code cleanup, we discussed SCARF’s dead code subsystem and its ability to analyze both static and dynamic usage of code to automatically generate change requests to remove unused code. Throughout this series, we have referred to the example of the deprecation of a photo sharing app called Moments, which Meta launched in 2015 and eventually shut down in 2019.
In this post, we introduce the subsystem responsible for automating the identification and safe removal of unused data types at Meta. This process can be unexpectedly difficult, because large software systems are inevitably interconnected. Moments relied on several pieces of shared Facebook functionality and infrastructure, and deleting it was more complicated than simply turning off servers and removing data tables.
Automating dead code cleanup
In our last blog post on automatic product deprecation, we talked about the complexities of product deprecations, and a solution Meta has built called the Systematic Code and Asset Removal Framework (SCARF). As an example, we looked at Moments, the photo sharing app Meta launched in 2015 and eventually shut down in 2019, and how SCARF can help with the deprecation process through its workflow management capabilities. We discussed how SCARF saves engineering time by identifying the correct order of tasks for cleaning up a product and how it can be blocked from automating the cleanup when there are intersystem dependencies. This naturally leads to the question: How do we automatically unblock SCARF when there is code that references an asset?
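The ordering problem mentioned above can be pictured as a dependency graph: an asset can only be removed once everything that references it is gone. The sketch below is an assumption for illustration, not a description of how SCARF works internally. The asset names are hypothetical, and topological sorting is simply one standard way to derive such an order.

```python
from graphlib import CycleError, TopologicalSorter


def removal_order(referenced_by: dict[str, set[str]]) -> list[str]:
    """Return an order in which assets can be removed.

    `referenced_by` maps each asset to the assets that still reference it.
    Referencing assets must be removed first, so they act as predecessors.
    Raises CycleError when assets reference each other, in which case no
    automatic order exists and the cleanup is blocked.
    """
    return list(TopologicalSorter(referenced_by).static_order())


# Hypothetical example: the UI calls an endpoint, which reads a table.
order = removal_order({
    "photos_table": {"upload_endpoint"},
    "upload_endpoint": {"share_ui"},
})
print(order)  # ['share_ui', 'upload_endpoint', 'photos_table']
```

A cycle in the graph corresponds to the blocked case: some code still references an asset that would otherwise be ready for removal.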
Automating product deprecation
At Meta, we are constantly innovating and experimenting by building and shipping many different products, and those products comprise thousands of individual features. As part of this healthy technology lifecycle, it is inevitable that certain products or features will be deprecated. For example, in 2015 we launched a photo-sharing app called Moments, which was later deprecated in 2019. So, how did we efficiently and safely remove all of the code and data related to Moments without adversely affecting Meta’s other products and services?
In this three-part blog series, we will discuss the complexities involved in removing a product from a complex portfolio of products and the framework Meta has built to drive the automation of this process, our Systematic Code and Asset Removal Framework (SCARF). SCARF has had an important impact at Meta. In the last year, it has removed petabytes of unused data across 12.8M different data types stored in 21 different data systems. Over the last five years it has deleted over 100M lines of code.
Scaling the Instagram Explore recommendations system
AI plays an important role in what people see on Meta’s platforms. Every day, hundreds of millions of people visit Explore on Instagram to discover something new, making it one of the largest recommendation surfaces on Instagram.
To build a large-scale system capable of recommending the most relevant content to people in real time out of billions of available options, we’ve leveraged machine learning (ML) to introduce a task-specific domain-specific language (DSL) and a multi-stage approach to ranking.
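A multi-stage approach to ranking generally means narrowing a large candidate pool with a cheap scorer before spending a costlier scorer on the survivors. The sketch below shows that general pattern only. The scoring functions, weights, and field names are hypothetical and are not taken from the Explore system.

```python
import heapq


def cheap_score(item: dict) -> float:
    """Hypothetical first-stage scorer: fast, uses a single signal."""
    return item["popularity"]


def expensive_score(item: dict) -> float:
    """Hypothetical second-stage scorer: stands in for a heavier model."""
    return 0.7 * item["affinity"] + 0.3 * item["popularity"]


def rank(candidates: list[dict], first_stage_k: int, final_k: int) -> list[dict]:
    """Shortlist with the cheap scorer, then rerank with the expensive one."""
    shortlist = heapq.nlargest(first_stage_k, candidates, key=cheap_score)
    return heapq.nlargest(final_k, shortlist, key=expensive_score)


candidates = [
    {"id": "a", "popularity": 0.9, "affinity": 0.1},
    {"id": "b", "popularity": 0.8, "affinity": 0.9},
    {"id": "c", "popularity": 0.1, "affinity": 1.0},
    {"id": "d", "popularity": 0.5, "affinity": 0.5},
]
top = [item["id"] for item in rank(candidates, first_stage_k=3, final_k=2)]
print(top)  # ['b', 'd']
```

Item `c` has the highest affinity but never reaches the second stage, which shows the trade-off: the expensive scorer only sees what the cheap stage lets through.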
How Meta is improving password security and preserving privacy
Meta is developing new privacy-enhancing technologies (PETs) to innovate and solve problems with less data. These technologies enable teams to build and launch privacy-enhanced products in a way that’s verifiable and safeguards user data. Using state-of-the-art cryptographic techniques, we have developed Private Data Lookup (PDL), which allows users to privately query a server-side data set. PDL is based on a secure multiparty computation mechanism called Private Set Intersection, where two parties holding sets can compute the intersection of the two sets without revealing their sets to the counterpart. With PDL, we further ensure that only one party (i.e., Meta users) can see the result, preventing Meta from learning the result of the intersection and thus enhancing the privacy of users’ data.
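The Private Set Intersection idea can be illustrated with a toy Diffie-Hellman-style exchange in which only the client learns the result. This is a sketch under stated assumptions: the excerpt does not say which PSI construction PDL uses, the modulus is toy-sized and not a secure parameter choice, both parties are simulated inside one function, and every name here is hypothetical.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized, NOT a secure choice


def hash_to_group(item: str) -> int:
    """Map an item to a nonzero residue modulo P."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P
    return h or 1


def new_key() -> int:
    """Pick a secret exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values: list[int], key: int) -> list[int]:
    """Raise each value to a secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def private_lookup(client_items: list[str], server_items: list[str]) -> set[str]:
    a = new_key()  # client secret, never sent
    b = new_key()  # server secret, never sent

    # Client -> server: items blinded with a (server cannot read them).
    client_blinded = blind([hash_to_group(x) for x in client_items], a)
    # Server -> client: the same values blinded again with b, order kept,
    # plus the server's own items blinded with b.
    double_blinded = blind(client_blinded, b)
    server_blinded = blind([hash_to_group(y) for y in server_items], b)

    # Client only: apply a to the server values; (h^b)^a equals (h^a)^b.
    server_double = set(blind(server_blinded, a))
    return {x for x, d in zip(client_items, double_blinded) if d in server_double}


print(private_lookup(["alice", "bob", "carol"], ["bob", "carol", "dave"]))
```

The final comparison happens on the client side, so in this sketch the server never sees which of the client's items matched.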
Building and deploying MySQL Raft at Meta
At Meta, we run one of the largest deployments of MySQL in the world. The deployment powers the social graph along with many other services, like Messaging, Ads, and Feed. Over the last few years, we have implemented MySQL Raft, a Raft consensus engine that was integrated with MySQL to build a replicated state machine. We have migrated a large portion of our deployment to MySQL Raft and plan to fully replace the current MySQL semisynchronous databases with it. The project has delivered significant benefits to the MySQL deployment at Meta, including higher reliability, provable safety, significant improvements in failover time, and operational simplicity — all with equal or comparable write performance.