Company: Slack

Slack is a cloud-based instant messaging application developed by Slack Technologies and now owned by Salesforce. The name "Slack" is an acronym for "Searchable Log of All Conversation and Knowledge".

Executing Cron Scripts Reliably At Scale

Cron scripts are responsible for critical Slack functionality. They ensure reminders execute on time, email notifications are sent, and databases are cleaned up, among other things. Over the years, both the number of cron scripts and the amount of data these scripts process have increased. While these cron scripts generally executed as expected, the reliability of their execution occasionally faltered over time, and maintaining and scaling their execution environment became increasingly burdensome. These issues led us to design and build a better way to execute cron scripts reliably at scale.

Running cron scripts at Slack started in the way you might expect. There was one node with a copy of all the scripts to run and one crontab file with the schedules for all the scripts. The node was responsible for executing the scripts locally on their specified schedule. Over time, the number of scripts grew, and the amount of data each script processed also grew. For a while, we could keep moving to bigger nodes with more CPU and more RAM; that kept things running most of the time. But the setup still wasn’t that reliable — with one box running, any issues with provisioning, rotation, or configuration would bring the service to a halt, taking some key Slack functionality with it. After continuously adding more and more patches to the system, we decided it was time to build something new: a reliable and scalable cron execution service. This article will detail some key components and considerations of this new system.

Traffic 101: Packets Mostly Flow

Slack handles billions of inbound network requests per day, all of which traverse our edge network and ingress load balancing tiers. In this blog post, we’ll talk about how a request flows, from a Slack user’s perspective, across the vast ether of the network to reach AWS and then Slack’s internal services.

Real-time Messaging

Did you know that ground stations transmit signals to satellites 22,236 miles above the equator in geostationary orbits, and that those signals are then beamed down to the entire North American subcontinent? Satellite radios today serve hundreds of channels across 9,540,000 square miles. Unless you’re working at a secret military facility, deep underground, you can enjoy satellite radio everywhere.

Just like the satellites, Slack sends millions of messages every day across millions of channels in real time all around the world. If we look at the traffic on a typical work day, it shows that most users are online between 9am and 5pm local time, with peaks at 11am and 2pm and a small dip in between for lunch hour. Though the working hours are similar across regions, looking at the two peaks in the graph below, it is evident that prime time is not the same: It’s post-noon in some regions and pre-noon in other regions. Each colored line in the below graph represents a region.

Tracing Notifications

Notifications are a key aspect of the Slack user experience. Users rely on timely notifications of mentions and DMs to keep on top of important information. Poor notification completeness erodes the trust of all Slack users.

Technology Lifecycle

This blog post discusses the strategies that Slack uses to manage the lifecycle (development, support, and eventual retirement) of infrastructure projects, through the lens of the migration through three successive internal “platform” offerings.

Hakana: Taking Hack Seriously

We started migrating to a different language called Hack in 2016. Hack was created by Facebook after they had struggled to scale their operations with PHP. It offered more type-safety than PHP, and it came with an interpreter (called HHVM) that could run PHP code faster than PHP’s own interpreter.

Mobile Developer Experience at Slack

The mobile developer experience team empowers developers to ship code with confidence while enjoying a pleasant and productive engineering experience.

BuildRock: A Build Platform at Slack

Our build platform is an essential piece of delivering code to production efficiently and safely at Slack. Over time it has undergone a lot of changes, and in 2021 the Build team started looking at the long-term vision.

Some questions the Build team wanted to answer were:

  • When should we invest in modernizing our build platform?
  • How do we deal with our build platform tech debt issues?
  • Can we move faster and more safely while building and deploying code?
  • Can we make that investment without impacting our existing production builds?
  • What do we do with existing build methodologies?

In this article we will explore how the Build team at Slack is investing in developing a build platform to solve some existing issues and to handle scale in the future.

Slowing Down to Speed Up - Circuit Breakers for Slack's CI/CD

How Slack increased developer productivity and prevented cascading internal failures by implementing orchestration-level circuit breakers.
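
For readers unfamiliar with the pattern, the following is a minimal, generic sketch of a circuit breaker in Python. It is not Slack's implementation; the class name, the `failure_threshold` and `reset_after` parameters, and the injectable `clock` are all assumptions made for illustration. The breaker opens after a run of consecutive failures, rejects work while open, and lets a trial request through once the cool-down has elapsed.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker sketch (hypothetical names, not Slack's code)."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # seconds to wait before a trial request
        self.clock = clock                          # injectable so the timing can be tested
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def allow(self):
        """Return True if a request may proceed right now."""
        if self.opened_at is None:
            return True
        # Half-open: once the cool-down has passed, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        """Close the circuit and forget earlier failures."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure; open (or re-open) the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

A caller would check `allow()` before dispatching a job and report the outcome with `record_success()` or `record_failure()`. Deferring work while the circuit is open is what keeps one failing dependency from cascading into the systems that depend on it.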

AutoTransform: Efficient Codebase Modification

How Slack is bringing automation to bear to solve the problem of maintaining, modifying, and upgrading codebases.

Building Background Effects for Clips

Last September, Slack released Clips, allowing users to capture video, audio, and screen recordings in messages to help distributed teams connect and share their work. We’ve continued iterating on Clips since its release, adding thumbnail selection, background blur, and most recently, background image replacement.

This blog post provides a deep dive into our implementation of background effects (background blur and background image replacement) for browsers and the desktop client. We’ve used a variety of web technologies, including WebGL and WebAssembly, to make background effects as performant as possible on our desktop platforms.

Scaling Slack’s Mobile Codebases: Modernization

In the first two posts about the Duplo initiative, we described why we decided to revamp our mobile codebases, the initial phase to clean up tech debt, and our efforts to modularize our iOS and Android codebases (post 1, post 2). In this final post, we will discuss the last theme of the Duplo initiative, Modernization, and look at the overall results and impact on developers.

Slack’s Incident on 2-22-22

Slack experienced a major incident on February 22 this year, during which time many users were unable to connect to Slack, including the author — which certainly made my role as Incident Commander more challenging!

Handling Flaky Tests at Scale: Auto Detection & Suppression

This post describes the path we have taken to minimize the number of flaky tests through an approach of automated test failure detection and suppression. This is not a new problem that we are trying to solve; many companies have published articles on systems created for handling flaky tests. This article outlines how test flakiness is an increasing problem at scale and how we got it under control at Slack.

Balancing Safety and Velocity in CI/CD at Slack

A story of evolving socio-technical workflows that increased developer velocity and redefined confident testing and deploy workflows at Slack.

The Case of the Recursive Resolvers

On September 30th 2021, Slack had an outage that impacted less than 1% of our online user base, and lasted for 24 hours. This outage was the result of our attempt to enable DNSSEC — an extension intended to secure the DNS protocol, required for FedRAMP Moderate — but which ultimately led to a series of unfortunate events.

The internet relies very heavily on the Domain Name System (DNS) protocol. DNS is like a phone book for the entire internet. Web sites are accessed through domain names, but web browsers interact using IP addresses. DNS translates domain names to IP addresses, so that browsers can load the sites you need. Refer to ‘What is DNS?’ by Cloudflare to read more about how DNS works and all the necessary steps to do a domain name lookup.
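
As a small illustration of the name-to-address translation described above, the sketch below asks the operating system's resolver for the IP addresses behind a hostname, using Python's standard `socket.getaddrinfo`. The helper name `resolve` is an assumption made for this example, and the code does not model DNSSEC validation or the behavior of recursive resolvers.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the distinct IP addresses the system resolver reports for hostname."""
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address for both IPv4 and IPv6 entries.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for *_, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep first-seen order, drop duplicates
            addresses.append(ip)
    return addresses
```

Calling `resolve("slack.com")` on a networked machine returns whatever addresses the configured resolver hands back at that moment. A literal IP address passed as the hostname is returned unchanged, since no lookup is needed.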
