Paper Title


Learning the Hypotheses Space from Data Part II: Convergence and Feasibility

Paper Authors

Diego Marcondes, Adilson Simonis, Junior Barrera

Paper Abstract


In Part \textit{I} we proposed a structure for a general Hypotheses Space $\mathcal{H}$, the Learning Space $\mathbb{L}(\mathcal{H})$, which can be employed to avoid \textit{overfitting} when estimating in a complex space with a relative shortage of examples. We also presented the U-curve property, which can be exploited to select a Hypotheses Space without exhaustively searching $\mathbb{L}(\mathcal{H})$. In this paper, we carry our agenda further by showing the consistency of a model selection framework based on Learning Spaces, in which one selects from data the Hypotheses Space on which to learn. The method developed in this paper adds to the state of the art in model selection by extending Vapnik-Chervonenkis Theory to \textit{random} Hypotheses Spaces, i.e., Hypotheses Spaces learned from data. In this framework, one estimates a random subspace $\hat{\mathcal{M}} \in \mathbb{L}(\mathcal{H})$ which converges with probability one to a target Hypotheses Space $\mathcal{M}^{\star} \in \mathbb{L}(\mathcal{H})$ with desired properties. As the convergence implies asymptotically unbiased estimators, we have a consistent framework for model selection, showing that it is feasible to learn the Hypotheses Space from data. Furthermore, we show that the generalization errors of learning on $\hat{\mathcal{M}}$ are smaller than those we commit when learning on $\mathcal{H}$, so it is more efficient to learn on a subspace learned from data.
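To make the two-step idea in the abstract concrete, here is a minimal sketch of "select a subspace from data, then learn on it." It is not the paper's algorithm: depth-bounded decision trees stand in for the candidate subspaces of $\mathbb{L}(\mathcal{H})$, a hold-out error stands in for the error estimator, and the chain of depths with an early stop plays the role of a U-curve-style search; the function and parameter names (`select_subspace_and_learn`, `max_depth_chain`) are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def select_subspace_and_learn(X, y, max_depth_chain=range(1, 16), seed=0):
    """Illustrative two-step procedure: (1) estimate a subspace M-hat from data
    by walking a chain of nested candidate spaces and stopping once the
    estimated error starts to rise (a U-curve-style stopping rule), then
    (2) learn the final hypothesis inside the selected subspace.

    Depth-bounded trees and hold-out error are assumptions of this sketch,
    not the paper's construction.
    """
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    best_depth, best_err = None, np.inf
    for d in max_depth_chain:
        clf = DecisionTreeClassifier(max_depth=d, random_state=seed)
        clf.fit(X_tr, y_tr)
        err = 1.0 - clf.score(X_val, y_val)
        if err < best_err:
            best_depth, best_err = d, err
        else:
            # Estimated error stopped decreasing along the chain: under a
            # U-curve assumption the minimum has been passed, so we stop
            # instead of searching the whole chain exhaustively.
            break

    # Learn on the selected subspace M-hat using all available data.
    final_model = DecisionTreeClassifier(max_depth=best_depth, random_state=seed)
    final_model.fit(X, y)
    return best_depth, final_model
```

For a fixed dataset, `select_subspace_and_learn(X, y)` returns the selected depth (the stand-in for $\hat{\mathcal{M}}$) together with the hypothesis learned inside it; restricting the final fit to that subspace is what the abstract argues yields smaller generalization errors than learning directly on $\mathcal{H}$.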
