Improvement of a dedicated model for open domain persona-aware dialogue generation

August 27, 2020 · Entered Twilight · 🏛 arXiv.org

"Last commit was 5.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, AssignPersonality, AttentionRouting, AttentionRoutingPlus, LICENSE, PersonalityTraitFusion, README.md, metrics.py, utils.py

Authors: Qiang Han
arXiv ID: 2008.11970
Category: cs.CL: Computation & Language
Citations: 0
Venue: arXiv.org
Repository: https://github.com/ghosthamlet/persona ⭐ 41
Last Checked: 2 months ago

Abstract

This paper analyzes methods proposed in recent years for improving the speed and performance of the Transformer architecture, focusing on their application to training a dedicated model. The dedicated model studied here is an open-domain persona-aware dialogue generation model, and the dataset consists of multi-turn short dialogues in which the total length of a single input sequence is no more than 105 tokens. Therefore, the many improvements to the Transformer architecture and its attention mechanism that target long-sequence processing are not discussed in this paper. The source code of the experiments has been open sourced: https://github.com/ghosthamlet/persona