Airbnb Brandometer: Using AI to Measure Brand Perception on Social Media Data
How we quantify brand perceptions from social media platforms through deep learning
Introduction
At Airbnb, we have developed Brandometer, a state-of-the-art natural language understanding (NLU) technique for understanding brand perception based on social media data.
Brand perception refers to the general feelings and experiences customers have with a company. Measuring brand perception quantitatively is an extremely challenging task. Traditionally, we rely on customer surveys to find out what customers think about a company. The downsides of such qualitative studies are sampling bias and limited data scale. Social media data, on the other hand, is the largest consumer database where users share their experiences, which makes it an ideal complementary data source for capturing brand perceptions.
Unlike traditional approaches that extract the top relevant topics from co-occurrence and counts, Brandometer learns word embeddings and uses embedding distances to measure how closely a brand is related to perception concepts (e.g., ‘belonging’, ‘connected’, ‘reliable’). A word embedding represents a word as a real-valued vector, and it performs well at preserving the semantic meaning and relatedness of words. Word embeddings obtained from deep neural networks are arguably the most popular and rapidly evolving approach in NLU. We explored a variety of word embedding models, from the quintessential algorithms Word2Vec and FastText to the more recent language model DeBERTa, and compared how reliably each generates brand perception scores.
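To make the idea of measuring relatedness through embedding distance concrete, here is a minimal sketch. It assumes cosine similarity as the distance measure, which is a common choice for word embeddings but is not confirmed here as the exact metric Brandometer uses. The vectors in `toy_embeddings` and the helper `perception_scores` are hypothetical and exist only for illustration; in practice the vectors would come from a trained model such as Word2Vec, FastText, or DeBERTa.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 4-dimensional embeddings, invented for this example.
# Real embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "airbnb": [0.8, 0.1, 0.5, 0.2],
    "belonging": [0.7, 0.2, 0.6, 0.1],
    "reliable": [0.1, 0.9, 0.2, 0.4],
}


def perception_scores(brand, concepts, embeddings):
    """Score each concept by its similarity to the brand vector (assumed scoring rule)."""
    brand_vec = embeddings[brand]
    return {c: cosine_similarity(brand_vec, embeddings[c]) for c in concepts}


scores = perception_scores("airbnb", ["belonging", "reliable"], toy_embeddings)
for concept, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {score:.3f}")
```

With these made-up vectors, 'belonging' scores higher than 'reliable' because its vector points in nearly the same direction as the brand vector. The ranking says nothing about the real brand; it only shows how the comparison works.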
For concepts represented...