Prompt-based Passage Retrieval Algorithm for Open-domain Question-answering
First published: 2023-04-17
Abstract: Open-domain question answering (QA) systems aim to answer questions that users may pose in informal natural language. Because no structured information is directly available, the system must find the answer in a large collection of passages. Open-domain QA methods typically adopt a two-stage retriever-reader paradigm: the retriever selects candidate contexts that contain the answer from a passage corpus, and the reader examines those contexts to return the final answer span. The retriever is key to improving QA system performance, but training a retrieval model requires a large number of supervised samples. To address this problem, this paper proposes PPR, a passage retriever algorithm based on prompt learning, which uses replaceable prompt templates to improve question-answering retrieval and to bridge the gap between different data domains. Multiple parts of the prompt template are replaced with different prompt words according to the characteristics of the input sample. In this way PPR models knowledge that generalizes across the questions and passages of every data domain while retaining the knowledge specific to each domain. PPR is pre-trained on a large-scale corpus in an unsupervised manner to strengthen its general retrieval ability. Comparative experiments against other retrievers show that PPR improves retriever performance in full-data fine-tuning, few-shot, and zero-shot settings, demonstrating the effectiveness and novelty of the algorithm.
Keywords: Artificial Intelligence; Dense Retrieval; Open-domain Question Answering; Prompt Learning; Few-shot Learning