Paper title
scikit-dyn2sel -- A Dynamic Selection Framework for Data Streams
Authors
Abstract
Mining data streams is a challenge per se. A stream learner must be ready to deal with an enormous amount of data and with problems not present in batch machine learning, such as concept drift. Therefore, applying a batch-designed technique, such as dynamic selection of classifiers (DCS), also presents a challenge: the dynamic nature of ensembles that deal with streams creates barriers to applying traditional DCS techniques to such classifiers. scikit-dyn2sel is an open-source Python library tailored for dynamic selection techniques in streaming data. Its development follows code quality and testing standards, including PEP8 compliance and high test coverage, automated using codecov.io and circleci.com. Source code, documentation, and examples are available on GitHub at https://github.com/luccaportes/Scikit-DYN2SEL.