Paper Title

KeypartX: Graph-based Perception (Text) Representation

Paper Author

Yang, Peng

Paper Abstract

The availability of big data has opened up major opportunities for individuals, businesses and academics to gain insight into what is happening in their world. Previous work on text representation has mostly focused on informativeness derived from word frequency or co-occurrence across massive collections of words. However, big data is a double-edged sword: it is big in volume but unstructured in format. The unstructured edge requires specific techniques to transform 'big' into something meaningful rather than merely informative. This study presents KeypartX, a graph-based approach that represents perception (text in general) by key parts of speech. Unlike bag-of-words/vector-based machine learning, this technique is a human-like form of learning that can extract meaning from linguistic (semantic, syntactic and pragmatic) information. Moreover, KeypartX is big-data capable but not big-data hungry: it is applicable even to the minimum unit of text, the sentence.
