Paper Title
A Hierarchical Location Normalization System for Text
Paper Authors
Paper Abstract
People increasingly learn about local events from large volumes of documents. Many texts contain location information, such as city or road names, but this information is often incomplete or implicit. Extracting the administrative area that a text refers to and organizing it into its area hierarchy, a task called location normalization, is therefore important. Existing location detection systems either omit hierarchical normalization or cover only a few specific regions. We propose ROIBase, a system that normalizes text against the hierarchical administrative divisions of China. ROIBase uses a co-occurrence constraint as its basic framework for scoring administrative-area hits, performs inference with special embeddings, and expands recall using ROIs (regions of interest). It is efficient and interpretable because it is built mainly on definite knowledge and has less complex logic than supervised models. We demonstrate that ROIBase outperforms feasible alternative solutions and is useful as a strong support system for location normalization.
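To make the co-occurrence idea concrete, the following is a minimal sketch and not the ROIBase implementation. It assumes a tiny hand-written gazetteer (`GAZETTEER`) and two hypothetical helpers (`score_paths`, `normalize`); the scoring rule, which simply counts how many names on an administrative path co-occur in the text, is an illustrative simplification. The embedding-based inference and ROI expansion mentioned in the abstract are not modeled here.

```python
# Toy illustration of co-occurrence scoring over an administrative hierarchy.
# GAZETTEER, score_paths and normalize are hypothetical names for this sketch.

# Each entry is a (province-level, city-level, district-level) path.
GAZETTEER = [
    ("Beijing", "Beijing", "Chaoyang"),
    ("Beijing", "Beijing", "Haidian"),
    ("Jilin", "Changchun", "Chaoyang"),
    ("Jilin", "Changchun", "Nanguan"),
]


def score_paths(text):
    """Score each path by the number of its distinct names found in the text."""
    lowered = text.lower()
    scores = {}
    for path in GAZETTEER:
        hits = {name for name in path if name.lower() in lowered}
        if hits:
            scores[path] = len(hits)
    return scores


def normalize(text):
    """Return the unique best-scoring path, or None if absent or ambiguous."""
    scores = score_paths(text)
    if not scores:
        return None
    best = max(scores.values())
    winners = [path for path, score in scores.items() if score == best]
    # A tie means the text alone does not disambiguate the area.
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    print(normalize("A fire broke out in Chaoyang, Changchun"))
    print(normalize("Traffic jam in Chaoyang"))
```

In this sketch, "Chaoyang" alone is ambiguous because a district of that name exists under both Beijing and Changchun, so `normalize` returns `None`; once "Changchun" co-occurs in the same text, the Jilin path scores higher and is selected.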