Company: Meta
Meta Platforms, Inc. (doing business as Meta) is an American internet technology company operating social networking, virtual reality, and metaverse products, headquartered in Menlo Park, California. It owns the social applications Facebook, Instagram, and WhatsApp. The company was founded by Mark Zuckerberg together with his roommates and fellow Harvard College students Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes. Originally named TheFacebook.com and later shortened to Facebook, it was renamed Meta on October 28, 2021, as announced by Zuckerberg.
Beyond social networking, Meta offers other products and services, including Facebook Messenger, Facebook Watch, and Facebook Portal. It has since acquired Instagram, WhatsApp, Oculus, Giphy, and Mapillary, and holds a 9.9% stake in Jio Platforms.
Meta is one of the world's most valuable companies and, alongside Microsoft, Amazon, Apple, and Alphabet, is considered one of the Big Five technology companies.
Revolutionizing software testing: Introducing LLM-powered bug catchers
Meta's Automated Compliance Hardening (ACH) tool uses LLMs to generate targeted code faults (mutants) and the tests that catch them, improving code robustness. Rather than simply increasing coverage, ACH targets specific classes of faults to ensure tests are genuinely effective. By using LLMs to automatically generate realistic mutants and corresponding tests, ACH reduces manual effort. The tool has already been applied to platforms such as Facebook and Instagram, hardening code against specific classes of problems. By modernizing automated test generation, ACH can turn concerns from a variety of sources into effective tests, streamlining the software testing workflow.
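The mutation-testing idea behind ACH can be illustrated with a minimal sketch (the function names here are invented for illustration, not Meta's actual tooling): a "mutant" injects one specific fault into real logic, and a generated test counts as effective only if it passes on the original code but fails on (i.e., "kills") the mutant.

```python
# Minimal mutation-testing sketch: inject a targeted fault (a "mutant")
# and check that a generated test kills it.

def apply_discount(price: float, rate: float) -> float:
    """Original logic under test."""
    return price * (1 - rate)

def apply_discount_mutant(price: float, rate: float) -> float:
    """Mutant: the injected fault flips subtraction to addition."""
    return price * (1 + rate)

def generated_test(fn) -> bool:
    """An effective test passes on the original and fails on the mutant."""
    return fn(100.0, 0.25) == 75.0

assert generated_test(apply_discount)             # passes on the original
assert not generated_test(apply_discount_mutant)  # kills the mutant
```

A test suite that kills no mutants may have high line coverage yet catch nothing; measuring kills is what ties test generation to fault classes that actually matter.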
Data logs: The latest evolution in Meta’s access tools
In February 2024, Meta updated its Download Your Information tool to add data logs, giving users more detailed access to their data. The feature is built on the Hive data warehouse and lets users retrieve detailed usage data from Meta's platforms. To handle queries over this massive volume of data, Meta batches requests, using an internal job-scheduling service and data pipeline systems to optimize query efficiency. After privacy-protecting and user-friendly transformations, the data logs are delivered to users as ZIP files. The feature reflects Meta's ongoing investment in data transparency and user control over data.
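The core of the batching idea can be sketched as follows (a simplified illustration with invented names, not Meta's internal pipeline): instead of running one warehouse query per user, pending requests are queued and then served together by a single scan of the dataset.

```python
# Sketch: serve many per-user data-log requests with one table scan,
# rather than one query per request.
from collections import defaultdict

PENDING: list[int] = []  # queued user IDs awaiting the next batch run

def enqueue_request(user_id: int) -> None:
    PENDING.append(user_id)

def run_batch(table_rows: list[tuple[int, str]]) -> dict[int, list[str]]:
    """One pass over the table serves every pending request."""
    wanted = set(PENDING)
    results: dict[int, list[str]] = defaultdict(list)
    for user_id, log_line in table_rows:  # single scan of the dataset
        if user_id in wanted:
            results[user_id].append(log_line)
    PENDING.clear()
    return dict(results)

enqueue_request(1)
enqueue_request(3)
rows = [(1, "viewed_page"), (2, "clicked_ad"), (3, "sent_message")]
print(run_batch(rows))  # {1: ['viewed_page'], 3: ['sent_message']}
```

With warehouse-scale tables, amortizing the scan across many requests is what makes user-level data access tractable.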
How Precision Time Protocol handles leap seconds
In a fast-moving digital era, introducing new leap seconds poses more risk than benefit to data centers. Meta advocates discontinuing leap seconds, especially now that PTP provides nanosecond-level synchronization. The traditional NTP approach of smearing leap seconds does not translate to PTP environments; Meta handles leap seconds in PTP with automatic time-adjustment algorithms and recommends using International Atomic Time (TAI) rather than Coordinated Universal Time (UTC). Meta supports introducing no new leap seconds after 2035, to simplify time-synchronization infrastructure and improve timing precision.
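The appeal of TAI is that it is a continuous timescale, while UTC lags it by the accumulated leap seconds. A minimal sketch of the relationship (a simplification: converting historical timestamps correctly requires the full leap-second table, not a single constant):

```python
# TAI is continuous; UTC = TAI - (accumulated leap seconds).
# Since the leap second of 2017-01-01, TAI - UTC has been 37 s; if no
# new leap seconds are introduced, this offset stays fixed forever,
# which is exactly what makes TAI attractive for infrastructure.
TAI_UTC_OFFSET_S = 37  # constant since 2017-01-01 (announced by IERS)

def utc_to_tai(utc_seconds: float) -> float:
    """Convert a post-2017 UTC timestamp (in seconds) to TAI."""
    return utc_seconds + TAI_UTC_OFFSET_S

assert utc_to_tai(1_700_000_000.0) - 1_700_000_000.0 == 37
```

A fixed offset means no smearing, no discontinuities, and no per-event table lookups in the hot path of time-sensitive systems.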
How Meta discovers data flows via lineage at scale
Data lineage, part of Meta's Privacy Aware Infrastructure (PAI), is a key tool for protecting user privacy: by tracing the paths data takes through systems, it helps developers implement privacy controls effectively. The post walks through how religious-affiliation data is tracked in the Facebook Dating app, from data collection and signal analysis to the application of privacy controls. Using static and runtime analysis tools, Meta builds a comprehensive data-flow graph that helps developers identify and manage data flows, making privacy protection more efficient. Going forward, Meta plans to extend lineage coverage, improve the developer experience, and explore new application areas.
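At its core, lineage is graph reachability: a sketch under stated assumptions (the asset names and graph here are invented for illustration), where each asset maps to its direct downstream consumers and a traversal finds everything a sensitive field can reach, so controls can be placed along the way.

```python
# Illustrative data-flow graph: asset -> direct downstream consumers.
from collections import deque

FLOW_GRAPH = {
    "dating:religion_field": ["logger:events", "service:matching"],
    "logger:events": ["warehouse:daily_table"],
    "service:matching": [],
    "warehouse:daily_table": [],
}

def downstream_assets(source: str) -> set[str]:
    """BFS over the flow graph to collect every reachable asset."""
    seen: set[str] = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in FLOW_GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(downstream_assets("dating:religion_field"))
```

The real system builds this graph automatically from static and runtime signals; once it exists, "where could this field end up?" becomes a cheap query rather than a manual audit.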
Strobelight: A profiling service built on open source technology
Meta's Strobelight is a profiling orchestration service that combines several open-source technologies to help engineers improve server efficiency. Strobelight collects data such as CPU usage and memory allocations to identify performance bottlenecks and guide code optimization. Its 42 profilers span memory, AI/GPU, latency, and other domains, with support for dynamic sampling and data normalization. Code improvements surfaced by Strobelight, such as adding a single reference symbol ("&") to avoid an unnecessary copy, have yielded significant server-capacity savings, demonstrating its potential for performance optimization.
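The aggregation step common to sampling profilers can be sketched in a few lines (illustrative only, not Strobelight's actual pipeline): periodic stack samples are folded into counts per unique stack, which is the "folded stacks" format flame graphs are built from.

```python
# Fold periodic stack samples into counts; the hottest stacks dominate.
from collections import Counter

# Each sample is a semicolon-joined call stack, as a flame-graph
# tool would consume it.
samples = [
    "main;handle_request;serialize",
    "main;handle_request;serialize",
    "main;handle_request;db_query",
]

def fold(stacks: list[str]) -> Counter:
    """Count identical stacks across all samples."""
    return Counter(stacks)

counts = fold(samples)
print(counts.most_common(1))  # [('main;handle_request;serialize', 2)]
```

Because sampling only records where the program is at each tick, its overhead stays low enough to run continuously in production, which is what makes fleet-wide finds like the missing "&" possible.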
How we think about Threads’ iOS performance
How did the Threads iOS team maintain the app’s performance during its incredible growth? Here’s how Meta’s Threads team thinks about performance, including the key metrics we mon…
Sequence learning: A paradigm shift for personalized ads recommendations
AI plays a fundamental role in creating valuable connections between people and advertisers within Meta’s family of apps. Meta’s ad recommendation engine, powered by deep learning recommendation mo…
How Meta built large-scale cryptographic monitoring
Cryptographic monitoring at scale has been instrumental in helping our engineers understand how cryptography is used at Meta. Monitoring has given us a distinct advantage in our efforts to proactiv…
IPLS: Privacy-preserving storage for your WhatsApp contacts
Your contact list is fundamental to the experiences you love and enjoy on WhatsApp. With contacts, you know which of your friends and family are on WhatsApp, you can easily message or call them, an…
How Meta enforces purpose limitation via Privacy Aware Infrastructure at scale
Purpose limitation, a core data protection principle, is about ensuring data is only processed for explicitly stated purposes. A crucial aspect of purpose limitation is managing data as it flows across systems and services. Commonly, purpose limitation can rely on “point checking” controls at the point of data processing. This approach involves using simple if statements in code (“code assets”) or access control mechanisms for datasets (“data assets”) in data systems. However, this approach can be fragile as it requires frequent and exhaustive code audits to ensure the continuous validity of these controls, especially as the codebase evolves. Additionally, access control mechanisms manage permissions for different datasets to reflect various purposes using mechanisms like access control lists (ACLs), which requires the physical separation of data into distinct assets to ensure each maintains a single purpose. When Meta started to address more and larger-scope purpose limitation requirements that crossed dozens of our systems, these point checking controls did not scale.
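The "point checking" approach described above can be made concrete with a small sketch (the data types and purposes here are invented for illustration): an if statement at each point of processing guards a use of data against its allowed purposes. The fragility is visible in the code itself: every new call site needs its own check, and nothing stops data from flowing onward unchecked.

```python
# Illustrative "point check": an if statement guards each use of data.
ALLOWED_PURPOSES = {"user_location": {"delivery", "safety"}}

class PurposeViolation(Exception):
    pass

def process(data_type: str, purpose: str) -> str:
    # The point check: valid only as long as every call site has one.
    if purpose not in ALLOWED_PURPOSES.get(data_type, set()):
        raise PurposeViolation(f"{data_type} may not be used for {purpose}")
    return f"processing {data_type} for {purpose}"

print(process("user_location", "delivery"))  # allowed purpose
try:
    process("user_location", "ads")          # not an allowed purpose
except PurposeViolation as exc:
    print(exc)
```

Scaling beyond this is precisely why the post moves from per-call-site checks toward infrastructure that understands data flow across systems.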
How Meta animates AI-generated images at scale
We launched Meta AI with the goal of giving people new ways to be more productive and unlock their creativity with generative AI (GenAI). But GenAI also comes with challenges of scale. As we deploy new GenAI technologies at Meta, we also focus on delivering these services to people as quickly and efficiently as possible.
Meta AI’s animate feature, which lets people generate a short animation of a generated image, carried unique challenges in this regard. To deploy and run at scale, our model to generate image animations had to be able to serve billions of people who use our products and services, do so quickly – with fast generation times and minimal errors, and remain resource efficient.
Here’s how we were able to deploy Meta AI’s animate feature using a combination of latency optimizations, traffic management, and other novel techniques.
Taming the tail utilization of ads inference at Meta scale
Tail utilization is a significant system issue and a major factor in overload-related failures and low compute utilization. The tail utilization optimizations at Meta have had a profound impact on …
Logarithm: A logging engine for AI training workflows and services
Systems and application logs play a key role in operations, observability, and debugging workflows at Meta. Logarithm is a hosted, serverless, multitenant service, used only internally at Meta, tha…
Simple Precision Time Protocol at Meta
We’ve previously spoken in great detail about how Precision Time Protocol is being deployed at Meta, including the protocol itself and Meta’s precision time architecture.
As we deployed PTP into one of our data centers, we were also evaluating and testing alternative PTP clients. In doing so, we soon realized that we could eliminate a lot of complexity in the PTP protocol itself that we experienced during data center deployments while still maintaining complete hardware compatibility with our existing equipment.
This is how the idea of Simple Precision Time Protocol (SPTP) was born.
But before we dive under the hood of SPTP we should explore why the IEEE 1588 G8265.1 and G8275.2 unicast profiles (here, we just call them PTP) weren’t a perfect fit for our data center deployment.
AI debugging at Meta with HawkEye
HawkEye is a powerful toolkit providing monitoring, observability, and debugging capabilities for machine-learning workflows. By analyzing model snapshots and feature importance, it gives users a ranked list of the features that drove an anomalous prediction. HawkEye can also trace the provenance of upstream data and pipelines, and its visualized workflows help pinpoint the root cause of a problem. In addition, HawkEye can diagnose issues in model snapshots and training data, with corresponding tools and visualizations. Going forward, HawkEye will continue to evolve, offering product teams more debugging workflows and expanded capabilities. Thanks go to all current and past members of the HawkEye team and their partners, with special thanks to Girish Vaitheeswaran, Atul Goyal, YJ Liu, Shiblee Sadik, Peng Sun, Adwait Tumbde, Karl Gyllstrom, Sean Lee, Dajian Li, Yu Quan, Robin Tafel, Ankit Asthana, Gautam Shanbhag, and Prabhakar Goyal for their contributions.
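The ranked-feature idea can be sketched simply (a hypothetical illustration, not HawkEye's actual API): given per-feature attribution scores for an anomalous prediction, rank features by absolute contribution so engineers inspect the likeliest culprits first.

```python
# Rank features by the magnitude of their attribution scores.
def rank_features(attributions: dict[str, float]) -> list[str]:
    """Largest absolute contribution first, regardless of sign."""
    return sorted(attributions, key=lambda f: abs(attributions[f]), reverse=True)

scores = {"ctr_7d": 0.02, "age_bucket": -0.45, "device_type": 0.31}
print(rank_features(scores))  # ['age_bucket', 'device_type', 'ctr_7d']
```

Ranking by magnitude rather than raw value matters because a feature can push a prediction anomalously in either direction.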
Automating data removal
This is the third and final post in our series on Meta’s Systematic Code and Asset Removal Framework (SCARF). SCARF contains a combination of subsystems that analyze code and data usage throughout Meta to aid in the safe and efficient removal of deprecated products. In our first post on automating product deprecation, we discussed the complexities of product deprecations and introduced SCARF’s workflow management tools that guide engineers through a coordinated process to safely deprecate products. In the second post on automating dead code cleanup, we discussed SCARF’s dead code subsystem and its ability to analyze both static and dynamic usage of code to automatically generate change requests to remove unused code. Throughout this series, we have referred to the example of the deprecation of a photo sharing app called Moments, which Meta launched in 2015 and eventually shut down in 2019.
In this post, we introduce the subsystem responsible for automating the identification and safe removal of unused data types at Meta. This process can be unexpectedly difficult, because large software systems are inevitably interconnected. Moments relied on several pieces of shared Facebook functionality and infrastructure, and deleting it was more complicated than simply turning off servers and removing data tables.