Prompt-based Passage Retrieval Algorithm for Open-domain Question-answering

ZHU Cheng


First published: 2023-04-17

ZHU Cheng ¹
ZHU Cheng, male, master's student; main research interest: natural language processing

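1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876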

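Abstract: Open-domain question answering systems aim to answer questions that people may pose in informal natural language. Because no structured information is directly available, the system must find answers in a large collection of passages. Open-domain QA methods usually follow a two-stage retriever-reader paradigm: the retriever selects candidate contexts that contain the answer from a passage corpus, and the reader examines those contexts to return the final answer span. The retriever is key to the performance of a question answering system, but training a retrieval model requires a large number of supervised samples. To address this problem, this paper proposes PPR, a passage retriever algorithm based on prompt learning, which uses replaceable prompt templates to improve question answering retrieval and to bridge the gap between data domains. Depending on the characteristics of the input sample, multiple parts of the prompt template are replaced with different prompt words. PPR models the knowledge that generalizes across the questions and passages of every data domain while retaining the knowledge specific to each domain. PPR is pre-trained on a large-scale corpus without supervision to strengthen its general retrieval ability. Comparative experiments against other retrievers show that PPR improves retriever performance under full-data fine-tuning, few-shot, and zero-shot settings, demonstrating the effectiveness and novelty of the algorithm.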

Keywords: artificial intelligence; dense vector retrieval; open-domain question answering; prompt learning; few-shot learning


Prompt-based Passage Retrieval Algorithm for Open-domain Question-answering

ZHU Cheng ¹

1. School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876

Abstract: Open-domain question answering systems aim to answer questions from people who may phrase them in informal natural language. Without direct structured information, the system must find answers in a large number of passages. Open-domain QA methods typically adopt a two-stage retriever-reader paradigm, in which the retriever selects candidate contexts containing the answer from a passage corpus and the reader examines those contexts to return the final answer span. The retriever is key to improving the performance of question answering systems. However, retrieval models require a large number of supervised samples for training. In response, this paper proposes PPR, a prompt-learning-based passage retrieval algorithm that uses replaceable prompt templates to improve question answering retrieval and bridge the gap between different data domains. Based on the characteristics of the input sample, multiple parts of the prompt template are replaced with different prompt words. PPR models the knowledge that generalizes across the questions and passages of each data domain while preserving the knowledge specific to each domain. PPR is pre-trained on a large-scale corpus without supervision to enhance its general retrieval capability. Comparative experiments with other retrievers show that PPR improves retriever performance in full-data fine-tuning, few-shot, and zero-shot scenarios, demonstrating the effectiveness and novelty of the algorithm.

Keywords: Artificial Intelligence; Dense Retrieval; Open-domain Question Answering; Prompt Learning; Few-shot Learning
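
The retriever stage and the replaceable prompt template described in the abstract can be illustrated with a small sketch. This is not the PPR implementation: the prompt words in `PROMPT_WORDS`, the function names, and the example passages are all hypothetical, and a token-count vector with cosine similarity stands in for a learned dense encoder. The reader stage, which would extract the answer span from the retrieved passages, is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical prompt words per data domain; the abstract does not list PPR's templates.
PROMPT_WORDS = {
    "general": {"question": "question", "passage": "passage"},
    "biomedical": {"question": "biomedical question", "passage": "biomedical passage"},
}


def apply_template(text, role, domain="general"):
    """Fill the replaceable slot of a fixed template with a domain-specific prompt word."""
    words = PROMPT_WORDS.get(domain, PROMPT_WORDS["general"])
    return f"{words[role]}: {text}"


def encode(text):
    """Toy stand-in for a learned dense encoder: a sparse token-count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * v[token] for token, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, passages, domain="general", k=2):
    """Retriever stage: return the k passages most similar to the prompted question."""
    q_vec = encode(apply_template(question, "question", domain))
    scored = [
        (cosine(q_vec, encode(apply_template(p, "passage", domain))), p)
        for p in passages
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


# Illustrative corpus (made up for this sketch).
passages = [
    "Paris is the capital of France",
    "The mitochondria is the powerhouse of the cell",
    "Berlin is the capital of Germany",
]
top = retrieve("What is the capital of France", passages, k=2)
print(top)
```

In a real system the `encode` function would be a trained neural encoder and the candidate passages would be handed to a reader model; only the template-filling and top-k ranking steps are shown here.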


Citation


ZHU Cheng. Prompt-based Passage Retrieval Algorithm for Open-domain Question-answering [EB/OL]. Beijing: Sciencepaper Online (中国科技论文在线) [2023-04-17]. https://www.paper.edu.cn/releasepaper/content/202304-252.


Paper No.	202304-252
Paper title	Prompt-based Passage Retrieval Algorithm for Open-domain Question-answering