Mapping Medium’s Tags

This was originally published Jan 18, 2018 on Hatch, Medium’s internal instance, to explain a hack week project to the company.
When you publish a post on Medium, you’re prompted to add labels to your post that describe what your post is about. These tags are mostly free-form. Authors can write whatever they think describes their post.

Adding some tags to a Medium post.
As far as data goes, these tags are a gold mine. Authors are labelling their posts with a succinct word or phrase that other people understand. We can use (and have used) these tags to inform the algorithms that show and organize content.
However, there are big issues with tags that limit their usefulness. One of the issues is that tags are scattered. At this point, authors have defined over 1 million unique tags. Many tags are essentially duplicates of other tags, or are so close that they have the same audience. Here are some examples:
- Global Warming = Climate Change
- Hillary Clinton = Hilary Clinton (common misspelling)
- Poetry = Poem = Poems = Poetry on Medium
- Startup = Entrepreneurship = Startup Lessons = Founder Stories
To a computer, each tag is just a string of text; by default, it carries no meaning and no relation to any other tag. This makes it hard for us to wield tags cohesively in algorithmic battles.
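To make the "just a string" problem concrete, here is a minimal sketch using Python's standard-library difflib. It is only an illustration, not how Medium handles tags: the helper names normalize and surface_similarity are hypothetical, and the tag pairs come from the examples above. Character-level comparison tends to catch a misspelling such as "Hilary Clinton", but it has no way of knowing that "Global Warming" and "Climate Change" describe the same topic.

```python
from difflib import SequenceMatcher


def normalize(tag: str) -> str:
    """Lowercase and collapse whitespace (a hypothetical cleanup step)."""
    return " ".join(tag.lower().split())


def surface_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; knows nothing about meaning."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Tag pairs taken from the examples in the post.
pairs = [
    ("Hillary Clinton", "Hilary Clinton"),   # misspelling: strings look alike
    ("Poem", "Poems"),                       # plural: strings look alike
    ("Global Warming", "Climate Change"),    # same topic, unrelated strings
    ("Startup", "Founder Stories"),          # same audience, unrelated strings
]

for a, b in pairs:
    same = normalize(a) == normalize(b)
    score = surface_similarity(a, b)
    print(f"{a!r} vs {b!r}: equal={same}, similarity={score:.2f}")
```

Under these assumptions, spelling variants score high while synonyms score low, which is the gap that a string-only view of tags cannot close.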

The “Climate Change” tag is asked about the “Global Warming” tag
Tags as multi-dimensional characters
Instead of each tag just being represented by a string, what if we could represent it by its qualities and how it relates to other tags? When we talk about people, we ...