Machines Learn to Infer Stellar Parameters Just by Looking at a Large Number of Spectra

Paper Title

Machines Learn to Infer Stellar Parameters Just by Looking at a Large Number of Spectra

Authors

Sedaghat, Nima, Romaniello, Martino, Carrick, Jonathan E., Pineau, François-Xavier

Abstract

Machine learning has been widely applied to clearly defined problems in astronomy and astrophysics. However, deep learning and its conceptual differences from classical machine learning have been largely overlooked in these fields. The broad hypothesis behind our work is that letting the abundant real astrophysical data speak for itself, with minimal supervision and no labels, can reveal interesting patterns that may facilitate the discovery of novel physical relationships. Here, as a first step, we seek to interpret the representations a deep convolutional neural network chooses to learn, and to find correlations between them and current physical understanding. We train an encoder-decoder architecture on the self-supervised auxiliary task of reconstruction, allowing it to learn general representations without bias towards any specific task. By exerting weak disentanglement at the information bottleneck of the network, we implicitly enforce interpretability in the learned features. We develop two independent statistical and information-theoretical methods for finding the number of learned informative features, as well as for measuring their true correlation with astrophysical validation labels. As a case study, we apply this method to a dataset of ~270000 stellar spectra, each comprising ~300000 dimensions. We find that the network clearly assigns specific nodes to estimate (notions of) parameters such as radial velocity and effective temperature without being asked to do so, all in a completely physics-agnostic process. This supports the first part of our hypothesis. Moreover, we find with high confidence that there are ~4 more independently informative dimensions that do not show a direct correlation with our validation parameters, presenting potential room for future studies.
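As a rough illustration of what "measuring correlation between bottleneck nodes and validation labels" can look like, the sketch below computes the absolute Pearson correlation between every latent dimension and every label on synthetic data. This is not the statistical or information-theoretical method developed in the paper; Pearson correlation is swapped in as a simple stand-in, and the function name `latent_label_correlation`, the array shapes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def latent_label_correlation(latents, labels):
    """Return |Pearson r| between every latent dimension and every label.

    latents: array of shape (n_samples, n_latent), e.g. bottleneck activations
    labels:  array of shape (n_samples, n_labels), e.g. validation parameters
    Result:  array of shape (n_latent, n_labels)
    """
    # Standardize each column (population std), so the mean product is Pearson r.
    z = (latents - latents.mean(axis=0)) / latents.std(axis=0)
    y = (labels - labels.mean(axis=0)) / labels.std(axis=0)
    return np.abs(z.T @ y) / len(z)


# Synthetic stand-in data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
n = 2000
labels = rng.normal(size=(n, 2))   # two made-up "validation parameters"
latents = rng.normal(size=(n, 6))  # six made-up bottleneck nodes

# Make node 1 track label 0 and node 4 (anti-)track label 1, plus noise.
latents[:, 1] = 0.9 * labels[:, 0] + 0.1 * rng.normal(size=n)
latents[:, 4] = -0.8 * labels[:, 1] + 0.2 * rng.normal(size=n)

corr = latent_label_correlation(latents, labels)
best_node = corr.argmax(axis=0)  # most correlated latent node for each label
print(best_node)
```

In this toy setup the two planted nodes stand out clearly from the uncorrelated ones. A linear measure like this would miss nonlinear dependencies, which is one reason an information-theoretical measure may be preferable in practice.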
