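Meta Platforms, Inc. (trade name: Meta) is an American internet technology company that operates social networking services, virtual reality, metaverse, and related products. It is headquartered in Menlo Park, California, and owns social apps including Facebook, Instagram, and WhatsApp. It was founded by Mark Zuckerberg together with his roommates and fellow Harvard College students Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes. Originally named TheFacebook.com, it was later shortened to Facebook, and on October 28, 2021, Zuckerberg announced that the company would be renamed Meta.
Beyond social networking, Meta offers other products and services, including Facebook Messenger, Facebook Watch, and Facebook Portal. It has also acquired Instagram, WhatsApp, Oculus, Giphy, and Mapillary, and holds a 9.9% stake in Jio Platforms.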
We’ve previously spoken in great detail about how Precision Time Protocol is being deployed at Meta, including the protocol itself and Meta’s precision time architecture.
As we deployed PTP into one of our data centers, we were also evaluating and testing alternative PTP clients. In doing so, we soon realized that we could eliminate a lot of complexity in the PTP protocol itself that we experienced during data center deployments while still maintaining complete hardware compatibility with our existing equipment.
This is how the idea of Simple Precision Time Protocol (SPTP) was born.
But before we dive under the hood of SPTP, we should explore why the IEEE 1588 G8265.1 and G8275.2 unicast profiles (here, we just call them PTP) weren’t a perfect fit for our data center deployment.
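HawkEye is a powerful toolkit for monitoring, observability, and debuggability that supports machine learning workflows. By analyzing model snapshots and feature importance, it gives users a ranked list of the features responsible for prediction anomalies. HawkEye can also trace the lineage of upstream data and pipelines, and its visual workflows help pinpoint the root cause of a problem. In addition, it can diagnose issues with model snapshots and training data, and it provides the corresponding tools and visualizations. Going forward, HawkEye will continue to evolve, offering product teams more debugging workflows and expanding its capabilities. Thanks to all current and past members of the HawkEye team and its partners for their support, with special thanks to Girish Vaitheeswaran, Atul Goyal, YJ Liu, Shiblee Sadik, Peng Sun, Adwait Tumbde, Karl Gyllstrom, Sean Lee, Dajian Li, Yu Quan, Robin Tafel, Ankit Asthana, Gautam Shanbhag, and Prabhakar Goyal for their contributions.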
This is the third and final post in our series on Meta’s Systematic Code and Asset Removal Framework (SCARF). SCARF contains a combination of subsystems that analyze code and data usage throughout Meta to aid in the safe and efficient removal of deprecated products. In our first post on automating product deprecation, we discussed the complexities of product deprecations and introduced SCARF’s workflow management tools that guide engineers through a coordinated process to safely deprecate products. In the second post on automating dead code cleanup, we discussed SCARF’s dead code subsystem and its ability to analyze both static and dynamic usage of code to automatically generate change requests to remove unused code. Throughout this series, we have referred to the example of the deprecation of a photo sharing app called Moments, which Meta launched in 2015 and eventually shut down in 2019.
In this post, we introduce the subsystem responsible for automating the identification and safe removal of unused data types at Meta. This process can be unexpectedly difficult, because large software systems are inevitably interconnected. Moments relied on several pieces of shared Facebook functionality and infrastructure, and deleting it was more complicated than simply turning off servers and removing data tables.
In our last blog post on automatic product deprecation, we talked about the complexities of product deprecations, and a solution Meta has built called the Systematic Code and Asset Removal Framework (SCARF). As an example, we looked at Moments, the photo sharing app Meta launched in 2015 and eventually shut down in 2019, and how SCARF can help with the deprecation process through its workflow management capabilities. We discussed how SCARF saves engineering time by identifying the correct order of tasks for cleaning up a product and how it can be blocked from automating the cleanup when there are intersystem dependencies. This naturally leads to the question: How do we automatically unblock SCARF when there is code that references an asset?
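The ordering problem described above can be sketched as a topological sort over a reference graph. This is a minimal illustration and not SCARF's implementation: the asset names and the `references` mapping are hypothetical, and the sketch assumes an asset may be removed only after everything that references it is gone.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical reference graph: each key maps to the assets that reference it
# and therefore have to be removed first.
references = {
    "moments_photo_table": {"moments_upload_endpoint", "moments_sync_job"},
    "moments_upload_endpoint": {"moments_mobile_client"},
    "moments_sync_job": set(),
    "moments_mobile_client": set(),
}


def removal_order(graph):
    """Return an order in which every asset comes after all of its referrers."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # A cycle means no safe order exists until one reference is removed.
        cycle = err.args[1]
        raise ValueError(f"cleanup is blocked by a reference cycle: {cycle}") from err


order = removal_order(references)
print(order)
```

When the graph contains a cycle, the function raises instead of guessing. That mirrors the blocked state the post describes, where automation stops until the offending reference is removed.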
At Meta, we are constantly innovating and experimenting by building and shipping many different products, and those products comprise thousands of individual features. As part of this healthy technology lifecycle, it is inevitable that certain products or features will be deprecated. For example, in 2015 we launched a photo-sharing app called Moments, which was later deprecated in 2019. So, how did we efficiently and safely remove all of the code and data related to Moments without adversely affecting Meta’s other products and services?
In this three-part blog series, we will discuss the complexities involved in removing a product from a complex portfolio of products and the framework Meta has built to drive the automation of this process, our Systematic Code and Asset Removal Framework (SCARF). SCARF has had an important impact at Meta. In the last year, it has removed petabytes of unused data across 12.8M different data types stored in 21 different data systems. Over the last five years it has deleted over 100M lines of code.
AI plays an important role in what people see on Meta’s platforms. Every day, hundreds of millions of people visit Explore on Instagram to discover something new, making it one of the largest recommendation surfaces on Instagram.
To build a large-scale system capable of recommending the most relevant content to people in real time out of billions of available options, we’ve leveraged machine learning (ML) to introduce a task-specific domain-specific language (DSL) and a multi-stage approach to ranking.
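A multi-stage approach can be pictured as a funnel: a cheap scorer trims the candidate pool, and a costlier scorer re-ranks only the survivors. The sketch below is a generic illustration and not the Explore system. The function name, the two scoring functions, and the sample items are all hypothetical.

```python
import heapq


def multi_stage_rank(candidates, cheap_score, expensive_score, first_stage_k, final_k):
    """Rank candidates in two passes, spending the expensive model on few items."""
    # Stage 1: a lightweight scorer keeps only the top first_stage_k candidates.
    shortlist = heapq.nlargest(first_stage_k, candidates, key=cheap_score)
    # Stage 2: a heavier scorer re-ranks the shortlist and keeps final_k items.
    return heapq.nlargest(final_k, shortlist, key=expensive_score)


# Hypothetical items: (id, popularity, predicted affinity for this viewer).
items = [("a", 0.9, 0.1), ("b", 0.8, 0.9), ("c", 0.7, 0.5), ("d", 0.2, 1.0)]

ranked = multi_stage_rank(
    items,
    cheap_score=lambda item: item[1],
    expensive_score=lambda item: item[1] * item[2],
    first_stage_k=3,
    final_k=2,
)
print([item[0] for item in ranked])
```

Item `d` never reaches the second stage even though its affinity is the highest, which shows the trade-off: the first stage bounds cost, so its recall limits what later stages can surface.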
Meta is developing new privacy-enhancing technologies (PETs) to innovate and solve problems with less data. These technologies enable teams to build and launch privacy-enhanced products in a way that’s verifiable and safeguards user data. Using state-of-the-art cryptographic techniques, we have developed Private Data Lookup (PDL) that allows users to privately query a server-side data set. PDL is based on a secure multiparty computation mechanism called Private Set Intersection, where two parties holding sets can compute the intersection of the two sets without revealing their sets to the counterpart. With PDL, we further ensure that only one party (i.e., Meta users) can see the result, disabling Meta from learning the result of the intersection and thus enhancing the privacy of users’ data.
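The idea behind Private Set Intersection can be sketched with the textbook Diffie-Hellman-style construction, in which both parties blind hashed items with secret exponents that commute. This is a toy under stated assumptions and not the PDL protocol itself. The modulus is far too small for real use, working in the full multiplicative group leaks information that a production scheme would hide, and every function name here is hypothetical. Both parties are simulated in one process for brevity.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy prime modulus, chosen for readability and not for security


def hash_to_group(item: str) -> int:
    """Map an item to a nonzero residue modulo P."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)


def new_secret_key() -> int:
    """Pick an exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 1)
        if key > 1 and math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    return [pow(value, key, P) for value in values]


def private_lookup(client_items, server_items):
    client_key, server_key = new_secret_key(), new_secret_key()

    # Client -> server: client items blinded with the client's key only.
    client_blinded = blind([hash_to_group(x) for x in client_items], client_key)

    # Server -> client: the client's values blinded again (same order),
    # plus the server's own set blinded with the server's key.
    double_blinded = blind(client_blinded, server_key)
    server_blinded = blind([hash_to_group(y) for y in server_items], server_key)

    # Client only: finish blinding the server's values and compare locally.
    # Since (h^a)^b == (h^b)^a mod P, equal items produce equal values.
    server_double = set(blind(server_blinded, client_key))
    return {x for x, d in zip(client_items, double_blinded) if d in server_double}


print(private_lookup(["alice", "bob", "carol"], ["bob", "dave", "carol"]))
```

The final comparison happens on the client side alone, which matches the property described above: the server sees only blinded values and never learns which items matched.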
At Meta, we run one of the largest deployments of MySQL in the world. The deployment powers the social graph along with many other services, like Messaging, Ads, and Feed. Over the last few years, we have implemented MySQL Raft, a Raft consensus engine that was integrated with MySQL to build a replicated state machine. We have migrated a large portion of our deployment to MySQL Raft and plan to fully replace the current MySQL semisynchronous databases with it. The project has delivered significant benefits to the MySQL deployment at Meta, including higher reliability, provable safety, significant improvements in failover time, and operational simplicity — all with equal or comparable write performance.
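To make the replicated state machine idea concrete, here is a sketch of the commit rule from the Raft paper: a leader treats an entry as committed once it is stored on a majority of replicas and belongs to the leader's current term. This is not code from MySQL Raft. The function names and the list-based log representation are assumptions made for illustration.

```python
def majority_match_index(match_index):
    """Highest log index stored on a majority of replicas (leader included)."""
    ordered = sorted(match_index, reverse=True)
    quorum = len(ordered) // 2 + 1
    return ordered[quorum - 1]


def advance_commit_index(commit_index, match_index, log_terms, current_term):
    """Return the leader's new commit index under the Raft commit rule.

    log_terms[i] is the term of the entry at index i; index 0 is a sentinel.
    """
    candidate = majority_match_index(match_index)
    # Raft only commits entries from the current term by counting replicas;
    # earlier entries become committed indirectly once such an entry commits.
    if candidate > commit_index and log_terms[candidate] == current_term:
        return candidate
    return commit_index


# Five replicas; the leader and one follower hold index 5, the rest lag behind.
print(advance_commit_index(1, [5, 5, 3, 2, 1], [0, 1, 1, 2, 2, 2], current_term=2))
```

Because terms never decrease along the log, checking only the highest majority-replicated index is enough: if that entry is from an older term, every earlier entry is too.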
We’re sharing our latest research and analysis into malware campaigns that are targeting online businesses — including newer malware posing as AI tools.
Device Verification on WhatsApp helps protect users’ accounts from on-device malware while allowing uninterrupted access to calls and messages.
With key transparency, WhatsApp provides a set of proofs that affirms the correctness of public encryption keys.
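One common building block for proofs of this kind is a Merkle inclusion proof, which lets a client check that a (user, public key) pair is part of a directory summarized by a single root hash. The sketch below is a generic Merkle tree and should not be read as WhatsApp's actual construction. The hashing layout, function names, and sample entries are all assumptions.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(user: str, public_key: bytes) -> bytes:
    # Distinct prefixes keep leaf hashes and interior hashes from colliding.
    return _h(b"\x00" + user.encode("utf-8") + b"\x00" + public_key)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _padded(level):
    # Odd-sized levels duplicate their last node (a simplification).
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves):
    """Return every level of the tree, from the leaves up to the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = _padded(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((_padded(level)[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    acc = leaf
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


entries = [("alice", b"key-a"), ("bob", b"key-b"), ("carol", b"key-c")]
levels = build_levels([leaf_hash(u, k) for u, k in entries])
root = levels[-1][0]
print(verify(leaf_hash("carol", b"key-c"), prove(levels, 2), root))
```

A verifier needs only the root, the leaf, and a logarithmic number of sibling hashes, so a client can check its own key without downloading the whole directory.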
Facebook for iOS (FBiOS) is the oldest mobile codebase at Meta. Since the app was rewritten in 2012, it has been worked on by thousands of engineers and shipped to billions of users, and it can support hundreds of engineers iterating on it at a time.
Here's what we've learned from making architecture changes to Meta’s event-driven asynchronous computing platform.
Before jumping into the details of the migration story, we’d like to take a step back and try to explain the motivation and rationale for this migration.
Over time, the data platform has morphed into various forms as the needs of the company have grown. What was a modest data platform in the early days has grown into an exabyte-scale platform. Some systems serving a smaller scale began showing signs of being insufficient for the increased demands that were placed on them. Most notably, we’ve run into some concrete reliability and efficiency issues related to data (de)serialization, which has made us rethink the way we log data and revisit the ideas from first principles to address these pressing issues.
Logger is at the heart of the data platform. The system is used to log analytical and operational data to Scuba, Hive, and stream processing pipelines via Scribe. Every product and data platform team interacts with logging. The data format for logging was either Hive Text Delimited or JSON, for legacy reasons. The limitations of these formats are described in our previous article on Tulip.
Anonymous Credential Service (ACS) is a highly available multitenant service that allows clients to authenticate in a de-identified manner.
Nullsafe is a new static analysis tool that is used at Meta to detect NullPointerException (NPE) errors in Java code.