Paper title
Identifying New X-ray Binary Candidates in M31 using Random Forest Classification
Authors
Abstract
Identifying X-ray binary (XRB) candidates in nearby galaxies requires distinguishing them from possible contaminants, including foreground stars and background active galactic nuclei. This work investigates the use of supervised machine learning algorithms to identify high-probability X-ray binary candidates. Using a catalogue of 943 Chandra X-ray sources in the Andromeda galaxy (M31), we trained and tested several classification algorithms on the X-ray properties of 163 sources with previously known types. Amongst the algorithms tested, we find that random forest classifiers give the best performance, and that they work better in a binary classification (XRB/non-XRB) context than with multiple classes. Evaluating our method against classifications from visible-light and hard X-ray observations obtained as part of the Panchromatic Hubble Andromeda Treasury, we find compatibility at the 90% level, although we caution that the number of sources in common is rather small. The estimated probability that an object is an X-ray binary agrees well between the random forest binary and multiclass approaches, and we find that the classifications with the highest confidence are in the X-ray binary class. The most discriminating X-ray bands for classification are the 1.7-2.8, 0.5-1.0, 2.0-4.0, and 2.0-7.0 keV photon flux ratios. Of the 780 unclassified sources in the Andromeda catalogue, we identify 16 new high-probability X-ray binary candidates and tabulate their properties for follow-up.
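The binary (XRB/non-XRB) workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which software was used, so scikit-learn is an assumption here. The feature values and labels are synthetic stand-ins rather than catalogue data, and the number of features, the 0.9 probability threshold, and all variable names are hypothetical. Only the source counts (163 labelled, 780 unlabelled) come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Source counts from the abstract; the number of features is an assumption.
N_LABELLED, N_UNLABELLED, N_FEATURES = 163, 780, 4

# Synthetic stand-ins for X-ray features such as photon flux ratios
# (NOT real catalogue data).
X_labelled = rng.normal(size=(N_LABELLED, N_FEATURES))
# Hypothetical labels: 1 = XRB, 0 = non-XRB, loosely tied to feature 0.
noise = 0.5 * rng.normal(size=N_LABELLED)
y_labelled = (X_labelled[:, 0] + noise > 0).astype(int)
X_unlabelled = rng.normal(size=(N_UNLABELLED, N_FEATURES))

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Estimate performance on the labelled sources with 5-fold cross-validation.
cv_accuracy = cross_val_score(clf, X_labelled, y_labelled, cv=5).mean()

# Fit on all labelled sources, then score the unlabelled ones.
clf.fit(X_labelled, y_labelled)

# Column 1 of predict_proba corresponds to clf.classes_[1], i.e. label 1 (XRB).
p_xrb = clf.predict_proba(X_unlabelled)[:, 1]

# Hypothetical cut for "high-probability" candidates.
THRESHOLD = 0.9
candidates = np.flatnonzero(p_xrb >= THRESHOLD)

# Relative importance of each feature in the fitted forest.
importances = clf.feature_importances_

print(f"cross-validated accuracy: {cv_accuracy:.2f}")
print(f"candidates above threshold: {candidates.size}")
print("feature importances:", np.round(importances, 3))
```

In a sketch like this, `feature_importances_` plays the role of ranking which inputs are most discriminating, and the per-source `p_xrb` values are what a candidate list would be drawn from. The numbers it prints reflect the synthetic data only and say nothing about the paper's results.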