Paper Title
Reducing Information Overload: Because Even Security Experts Need to Blink
Paper Authors
Paper Abstract
Computer Emergency Response Teams (CERTs) face increasing challenges in processing the growing volume of security-related information. Daily manual analysis of threat reports, security advisories, and vulnerability announcements leads to information overload, contributing to burnout and attrition among security professionals. This work evaluates 196 combinations of clustering algorithms and embedding models across five security-related datasets to identify optimal approaches for automated information consolidation. We demonstrate that clustering can reduce information processing requirements by over 90% while maintaining semantic coherence, with deep clustering achieving a homogeneity of 0.88 for security bug reports (SBR) and partition-based clustering reaching 0.51 for advisory data. Our solution requires minimal configuration, preserves all data points, and processes new information within five minutes on consumer hardware. The findings suggest that clustering approaches can significantly enhance CERT operational efficiency, potentially saving over 3,750 work hours annually per analyst while maintaining analytical integrity. However, complex threat reports require careful parameter tuning to achieve acceptable performance, indicating areas for future optimization. The code is available at https://github.com/PEASEC/reducing-information-overload.
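The abstract reports two kinds of numbers: cluster homogeneity and the reduction in items to be processed. The sketch below illustrates how such figures could be computed from cluster assignments. It is not taken from the linked repository. The function names `homogeneity` and `reduction` are hypothetical, the labels are toy data, and the reduction formula assumes an analyst reviews one representative per cluster, which the abstract does not state. Homogeneity follows the standard definition h = 1 - H(C|K) / H(C), the same metric that scikit-learn provides as `homogeneity_score`.

```python
import math
from collections import Counter


def homogeneity(labels_true, labels_pred):
    """Homogeneity h = 1 - H(C|K) / H(C) for classes C and clusters K."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("label lists must be non-empty and of equal length")

    class_counts = Counter(labels_true)
    cluster_counts = Counter(labels_pred)
    joint_counts = Counter(zip(labels_true, labels_pred))

    # Entropy of the ground-truth classes, H(C).
    h_c = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    if h_c == 0.0:
        # A single class is trivially homogeneous.
        return 1.0

    # Conditional entropy of the classes given the cluster assignment, H(C|K).
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_counts[k])
        for (_, k), n_ck in joint_counts.items()
    )
    return 1.0 - h_c_given_k / h_c


def reduction(labels_pred):
    """Share of items skipped if one representative per cluster is reviewed.

    Assumption for illustration only; the paper may define reduction differently.
    """
    return 1.0 - len(set(labels_pred)) / len(labels_pred)


if __name__ == "__main__":
    # Toy data: six reports from two true topics, grouped into three clusters.
    true_topics = ["phishing", "phishing", "phishing", "rce", "rce", "rce"]
    clusters = [0, 0, 1, 2, 2, 2]
    print("homogeneity:", homogeneity(true_topics, clusters))
    print("reduction:", reduction(clusters))
```

Every cluster in the toy data contains a single topic, so its homogeneity is 1.0 even though the phishing reports are split across two clusters. Homogeneity rewards pure clusters but does not penalize over-splitting, which is why it is informative only alongside a measure of how many clusters remain to be reviewed.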