Scheme of Concept Drift Self-Adaptation Based on Few-Shot Learning

Cao Kailun; Wang Jian; Liu Yujie; Chu Tianye; Tian Yuxin

First published: 2026-05-12

Cao Kailun ¹
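Cao Kailun (born 2004), male; field: cyberspace security
Wang Jian ¹
Wang Jian (born 1975), male, professor and doctoral supervisor; main research interests: cryptography and privacy computing, security of artificial intelligence systems, quantum intelligent computing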
Liu Yujie ¹ Chu Tianye ¹ Tian Yuxin ²

1. School of Cyberspace Science and Technology, Beijing Jiaotong University, Beijing 100044
2. School of Mathematics and Statistics, Beijing Jiaotong University, Beijing 100044

Abstract: Machine learning methods can detect Android malware with very high accuracy, but these classifiers have a fatal weakness: concept drift. As malicious and benign applications continue to evolve, the performance of a deployed detection model deteriorates rapidly. Existing concept drift adaptation methods based on active learning adapt poorly when early samples of new malware are extremely scarce and follow a long-tailed distribution; as a result, even under the best existing method, new malware remains misclassified for several months on average. To address this, this paper proposes a few-shot continual learning framework based on prototypical networks. The framework constructs class prototypes for benign and malicious samples, integrates multiple active learning sampling strategies to dynamically select high-value drifting samples under a limited labeling budget, and combines historical data replay with a warm-start strategy to update the model efficiently and continually. Experimental evaluations on several real-world concept drift datasets, including APIGraph, AndroZoo, BODMAS, and Content, show that the proposed method maintains overall detection performance while significantly improving detection of new few-shot malware and long-tailed malware families. It reduces the average misclassification duration of new malware by more than 50% and outperforms existing mainstream methods on fine-grained evaluation metrics.

Keywords: information security; malware detection; concept drift adaptation; active learning; few-shot learning; prototypical network
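
The abstract names its building blocks only at a high level. As a rough illustration of two of them (class prototypes for benign and malicious samples, and choosing which drifting samples to label under a limited budget), the following is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the use of raw 2-D vectors in place of learned embeddings, squared Euclidean distance, and the smallest-margin selection rule. The paper's actual encoder, sampling strategies, data replay, and warm-start procedure are not described on this page and are not reproduced here.

```python
from statistics import fmean


def class_prototypes(embeddings, labels):
    """Return {label: mean embedding} for every class present in `labels`."""
    protos = {}
    for c in sorted(set(labels)):
        members = [e for e, y in zip(embeddings, labels) if y == c]
        # Average each dimension over the members of class c.
        protos[c] = [fmean(dim) for dim in zip(*members)]
    return protos


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(embedding, protos):
    """Assign the label of the nearest prototype."""
    return min(protos, key=lambda c: sq_dist(embedding, protos[c]))


def margin(embedding, protos):
    """Gap between the two smallest prototype distances (small = ambiguous)."""
    d = sorted(sq_dist(embedding, p) for p in protos.values())
    return d[1] - d[0]


def select_for_labeling(embeddings, protos, budget):
    """Indices of the `budget` most ambiguous samples, smallest margin first."""
    order = sorted(range(len(embeddings)),
                   key=lambda i: margin(embeddings[i], protos))
    return order[:budget]


# Toy data with made-up numbers: label 0 = benign, 1 = malicious.
train = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
labels = [0, 0, 1, 1]
protos = class_prototypes(train, labels)

incoming = [[0.5, 1.0], [3.5, 1.0], [2.1, 1.0], [1.0, 1.0]]
predictions = [predict(e, protos) for e in incoming]
to_label = select_for_labeling(incoming, protos, budget=2)
print(predictions, to_label)
```

In a continual setting, the samples picked by the hypothetical `select_for_labeling` would presumably be sent for annotation and the prototypes recomputed on the enlarged labeled set; how the paper combines this with replay and warm start is not stated in the abstract.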

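Cao Kailun, Wang Jian, Liu Yujie, et al. Scheme of Concept Drift Self-Adaptation Based on Few-Shot Learning [EB/OL]. Beijing: Sciencepaper Online (中国科技论文在线) [2026-05-12]. https://www.paper.edu.cn/releasepaper/content/202605-37.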

Paper number: 202605-37