A Self-Training Approach to Domain Document-Level Relation Extraction with Large Language Models
First published: 2026-03-03
Abstract: For document-level information extraction in low-resource domains, dispersed cross-sentence evidence in long documents, noisy candidate entity pairs, and scarce annotations often jointly constrain recall. This paper proposes an LLM self-training algorithm based on data augmentation. It first performs supervised fine-tuning on a small amount of labeled data to align task definitions and structured outputs; then, guided by probabilistic re-sampling and rule-based verification, it uses an LLM to generate and filter high-confidence pseudo-labeled data to enrich long-tail relation patterns; finally, it introduces reward-based preference alignment, suppressing stochastic fluctuations through format, validity, and relation-similarity-set rewards to enable stable decision-making under noise and long contexts. Experiments on DocRED and Re-DocRED validate the effectiveness of the proposed method, and results on the cybersecurity AZERG dataset show higher recall and more robust structured extraction, indicating that this paradigm is well suited to information-dense yet annotation-scarce domain document-level extraction scenarios.
Keywords: document-level relation extraction; few-shot learning; self-augmented data generation; preference alignment; STIX threat intelligence
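The pseudo-labeling stage summarized in the abstract (generate candidate relation triples, then keep only high-confidence outputs that also pass rule-based verification) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the names `Triple`, `KNOWN_RELATIONS`, the confidence field, and the 0.9 threshold are all assumptions for demonstration.

```python
# Hypothetical sketch of confidence + rule-based filtering of LLM-generated
# pseudo-labels, as described in the abstract. All identifiers here are
# illustrative assumptions, not the paper's actual code or schema.
from dataclasses import dataclass

# Assumed relation schema; a real system would load the dataset's label set.
KNOWN_RELATIONS = {"uses", "targets", "located_in"}

@dataclass
class Triple:
    head: str
    relation: str
    tail: str
    confidence: float  # e.g. a model score mapped to [0, 1]

def rule_check(t: Triple) -> bool:
    """Rule-based verification: schema-legal relation, non-empty, distinct arguments."""
    return (
        t.relation in KNOWN_RELATIONS
        and bool(t.head.strip())
        and bool(t.tail.strip())
        and t.head != t.tail
    )

def filter_pseudo_labels(triples, threshold=0.9):
    """Keep only high-confidence triples that also pass the rule checks."""
    return [t for t in triples if t.confidence >= threshold and rule_check(t)]

candidates = [
    Triple("APT29", "uses", "Cobalt Strike", 0.95),    # kept
    Triple("APT29", "uses", "APT29", 0.97),            # rejected: head == tail
    Triple("malware_x", "drops", "file_y", 0.99),      # rejected: unknown relation
    Triple("APT29", "targets", "gov networks", 0.55),  # rejected: low confidence
]
kept = filter_pseudo_labels(candidates)
print([(t.head, t.relation, t.tail) for t in kept])
```

The surviving triples would then be added to the training pool for the next self-training round; in practice the threshold and rules would be tuned per dataset.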