Paper Title

Posting Bot Detection on Blockchain-based Social Media Platform using Machine Learning Techniques

Authors

Kim, Taehyun; Shin, Hyomin; Hwang, Hyung Ju; Jeong, Seungwon

Abstract

Steemit is a blockchain-based social media platform where authors can earn author rewards, paid in the cryptocurrencies STEEM and SBD (Steem Blockchain Dollars), if their posts are upvoted. Interestingly, curators (or voters) can also earn rewards by voting on others' posts; this is called a curation reward. The reward is proportional to the curator's STEEM stake. Through this process, Steemit hopes that "good" content will be discovered automatically by users in a decentralized way, a mechanism known as Proof-of-Brain (PoB). However, many bot accounts are programmed to post automatically and collect rewards, which discourages real human users from creating good content. We call this type of bot a posting bot. While many papers have studied bots on traditional centralized social media platforms such as Facebook and Twitter, we are the first to study posting bots on a blockchain-based social media platform. Compared with bot detection on conventional social media platforms, the features we created have the advantage that posting bots can be detected without limiting the number or length of posts. We extract the features of posts by clustering distances between blog data or replies. These features are obtained from the Minimum Average Cluster from Clustering Distance between Frequent words and Articles (MAC-CDFA), which has not been used in any previous social media research. With these enriched features, we improved the quality of the classification tasks. In a comparison of F1-scores, the features we created outperformed the features used for bot detection on Facebook and Twitter.
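
The abstract compares feature sets by F1-score. As a reminder of what that metric measures, here is a minimal Python sketch of the standard definition (the harmonic mean of precision and recall) for the positive "bot" class. The function name and the counts in the usage note are hypothetical illustrations, not code or results from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 for the positive (bot) class from confusion counts.

    tp: bots correctly flagged, fp: humans wrongly flagged,
    fn: bots that were missed. All inputs here are illustrative.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with made-up counts of 6 bots correctly flagged, 2 humans wrongly flagged, and 4 bots missed, precision is 0.75, recall is 0.6, and f1_score(6, 2, 4) is about 0.667.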
