Machine learning for total cloud cover prediction

Paper title

Machine learning for total cloud cover prediction

Authors

Ágnes Baran, Sebastian Lerch, Mehrez El Ayari, Sándor Baran

Abstract

Accurate and reliable forecasting of total cloud cover (TCC) is vital for many areas such as astronomy, energy demand and production, or agriculture. Most meteorological centres issue ensemble forecasts of TCC; however, these forecasts are often uncalibrated and exhibit worse forecast skill than ensemble forecasts of other weather variables. Hence, some form of post-processing is strongly required to improve predictive performance. As TCC observations are usually reported on a discrete scale taking just nine different values called oktas, statistical calibration of TCC ensemble forecasts can be considered a classification problem with outputs given by the probabilities of the oktas. This is a classical area where machine learning methods are applied. We investigate the performance of post-processing using multilayer perceptron (MLP) neural networks, gradient boosting machines (GBM) and random forest (RF) methods. Based on the European Centre for Medium-Range Weather Forecasts global TCC ensemble forecasts for 2002-2014, we compare these approaches with the proportional odds logistic regression (POLR) and multiclass logistic regression (MLR) models, as well as the raw TCC ensemble forecasts. We further assess whether improvements in forecast skill can be obtained by incorporating ensemble forecasts of precipitation as an additional predictor. Compared to the raw ensemble, all calibration methods result in a significant improvement in forecast skill. RF models provide the smallest increase in predictive performance, while MLP, POLR and GBM approaches perform best. The use of precipitation forecast data leads to further improvements in forecast skill and, except for very short lead times, the extended MLP model shows the best overall performance.
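
The classification framing in the abstract can be illustrated with a minimal sketch of the raw-ensemble baseline: each ensemble member is mapped to one of the nine oktas, and the relative frequencies form a probability vector over the classes. The rounding rule in `to_okta`, the function names, and the choice of the ranked probability score as a verification measure are all assumptions made for illustration; the abstract does not state how the paper discretizes forecasts or which scores it uses.

```python
from collections import Counter

N_OKTAS = 9  # oktas take the integer values 0, 1, ..., 8


def to_okta(tcc_fraction):
    """Map a cloud-cover fraction in [0, 1] to an okta.

    Assumption: round to the nearest eighth, with halves rounded up.
    """
    if not 0.0 <= tcc_fraction <= 1.0:
        raise ValueError("cloud cover fraction must lie in [0, 1]")
    return int(tcc_fraction * 8 + 0.5)


def ensemble_okta_probabilities(members):
    """Relative frequency of each okta among the ensemble members."""
    counts = Counter(to_okta(m) for m in members)
    return [counts.get(k, 0) / len(members) for k in range(N_OKTAS)]


def ranked_probability_score(probs, observed_okta):
    """Sum of squared differences between forecast and observed CDFs.

    Lower is better; 0 means all probability sits on the observed okta.
    """
    score = 0.0
    cumulative = 0.0
    for k in range(N_OKTAS):
        cumulative += probs[k]
        observed_cdf = 1.0 if k >= observed_okta else 0.0
        score += (cumulative - observed_cdf) ** 2
    return score


if __name__ == "__main__":
    # Hypothetical five-member ensemble of cloud-cover fractions.
    members = [0.0, 0.1, 0.5, 0.5, 1.0]
    probs = ensemble_okta_probabilities(members)
    print(probs)
    print(ranked_probability_score(probs, observed_okta=4))
```

A post-processing model in this setting would replace `ensemble_okta_probabilities` with a trained classifier that outputs a calibrated nine-element probability vector, which could then be scored in the same way.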
