Paper Title

Time-Constrained Learning

Authors

Sergio Filho, Eduardo Laber, Pedro Lazera, Marco Molinaro

Abstract

Consider a scenario in which we have a huge labeled dataset ${\cal D}$ and a limited time to train some given learner using ${\cal D}$. Since we may not be able to use the whole dataset, how should we proceed? Questions of this nature motivate the definition of the Time-Constrained Learning Task (TCL): given a dataset ${\cal D}$ sampled from an unknown distribution $\mu$, a learner ${\cal L}$ and a time limit $T$, the goal is to obtain, in at most $T$ units of time, the classification model with the highest possible accuracy w.r.t. $\mu$ among those that can be built by ${\cal L}$ using the dataset ${\cal D}$. We propose TCT, an algorithm for the TCL task designed based on principles from Machine Teaching. We present an experimental study involving 5 different learners and 20 datasets in which we show that TCT consistently outperforms two other algorithms: the first is a teacher for black-box learners proposed in [Dasgupta et al., ICML 19], and the second is a natural adaptation of random sampling to the TCL setting. We also compare TCT with Stochastic Gradient Descent training, and our method is again consistently better. While our work is primarily practical, we also show that a stripped-down version of TCT has provable guarantees. Under reasonable assumptions, the time our algorithm takes to achieve a certain accuracy is never much bigger than the time it takes the batch teacher (which sends a single batch of examples) to achieve similar accuracy, and in some cases it is almost exponentially better.
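To make the TCL setting concrete, the following is a minimal sketch of a time-budgeted training loop that retrains a learner on random samples of doubling size until the time limit is reached. It is an illustration only: it is not the TCT algorithm, and it is not necessarily the random-sampling baseline evaluated in the paper. The names `train_threshold`, `accuracy`, `time_constrained_random_sampling`, the doubling schedule, and the toy one-dimensional learner are all assumptions introduced here.

```python
import random
import time


def train_threshold(sample):
    """Hypothetical toy learner: choose the threshold t on a 1-D feature that
    minimizes training error for the rule `x >= t -> label 1`."""
    candidates = sorted({x for x, _ in sample})
    best_t, best_err = candidates[0], len(sample) + 1
    for t in candidates:
        err = sum((x >= t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def accuracy(threshold, data):
    """Fraction of (x, y) pairs classified correctly by the threshold rule."""
    return sum((x >= threshold) == bool(y) for x, y in data) / len(data)


def time_constrained_random_sampling(data, learner, time_limit,
                                     start_size=8, clock=time.monotonic,
                                     seed=0):
    """Assumed baseline: train on random samples of doubling size.

    Returns the last model whose training finished within `time_limit`
    (in the units of `clock`), or None if no run finished in time.
    """
    rng = random.Random(seed)
    deadline = clock() + time_limit
    model = None
    size = min(start_size, len(data))
    while clock() < deadline:
        candidate = learner(rng.sample(data, size))
        if clock() > deadline:
            break  # this run overshot the budget, so its model is discarded
        model = candidate
        if size == len(data):
            break  # the whole dataset has been used; nothing left to add
        size = min(2 * size, len(data))
    return model
```

A call such as `time_constrained_random_sampling(data, train_threshold, time_limit=5.0)` returns whichever model was completed last before the deadline. The `clock` parameter exists only so that the budget can be simulated deterministically.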
