Old Age
PLACES: Prompting Language Models for Social Conversation Synthesis
February 07, 2023 · Entered Twilight · Findings
Repo contents: CODE_OF_CONDUCT.md, CONTRIBUTING.md, LICENSE, README.md, THIRD-PARTY-LICENSES.txt, conversation_synthesis.py, load_models.py, parse_topical_chat.py, prompts, utils.py, write_prompts.py
Authors
Maximillian Chen, Alexandros Papangelis, Chenyang Tao, Seokhwan Kim, Andy Rosenbaum, Yang Liu, Zhou Yu, Dilek Hakkani-Tur
arXiv ID
2302.03269
Category
cs.CL: Computation & Language
Cross-listed
cs.AI, cs.IR
Citations
102
Venue
Findings
Repository
https://github.com/alexa/PLACES
⭐ 11
Last Checked
1 month ago
Abstract
Collecting high-quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompting large language models. In this work, we use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting. We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations. These include human evaluation of the synthesized conversations along various dimensions of conversation quality, as well as interactive human evaluation of chatbots fine-tuned on the synthetically generated dataset. We additionally demonstrate that this prompting approach generalizes to multi-party conversations, offering the potential to create new synthetic data for multi-party tasks. Our synthetic multi-party conversations were rated more favorably across all measured dimensions compared to conversation excerpts sampled from a human-collected multi-party dataset.
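The core recipe is simple enough to outline in code. The Python sketch below shows one way such few-shot conversation synthesis could look: expert-written dialogues are concatenated as in-context examples, followed by a fresh topic for the model to continue. The example dialogues, topics, and the `complete` stub are illustrative placeholders, not the paper's actual prompts or pipeline (the repository's prompts directory and conversation_synthesis.py contain those).

```python
# Minimal sketch of few-shot dialogue synthesis: a handful of expert-written
# conversations serve as in-context examples, and an LLM is prompted to write
# a new conversation on an unseen topic in the same style. All dialogues and
# topics here are hypothetical placeholders, not the paper's actual prompts.

EXPERT_EXAMPLES = [
    {
        "topic": "weekend plans",
        "dialogue": [
            "A: Any plans for the weekend?",
            "B: I'm thinking of going hiking if the weather holds up.",
            "A: That sounds fun. Which trail are you looking at?",
        ],
    },
    {
        "topic": "cooking",
        "dialogue": [
            "A: Have you tried baking bread at home?",
            "B: I have! My first loaf was dense, but it's getting better.",
            "A: I hear kneading technique makes a big difference.",
        ],
    },
]


def build_prompt(examples: list, new_topic: str) -> str:
    """Concatenate expert-written conversations as in-context examples,
    then open a new conversation on `new_topic` for the model to complete."""
    parts = ["The following are casual social conversations between two people.", ""]
    for ex in examples:
        parts.append(f"Topic: {ex['topic']}")
        parts.extend(ex["dialogue"])
        parts.append("")  # blank line separates examples
    parts.append(f"Topic: {new_topic}")
    parts.append("A:")  # the LLM continues the dialogue from here
    return "\n".join(parts)


def complete(prompt: str) -> str:
    """Placeholder for a call to a large language model; swap in whichever
    model or API you use for generation."""
    raise NotImplementedError


if __name__ == "__main__":
    # Prints the assembled few-shot prompt; feed it to an LLM to synthesize
    # a new conversation on the target topic.
    print(build_prompt(EXPERT_EXAMPLES, "favorite books"))
```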
Similar Papers
In the same crypt · Computation & Language
Old Age · BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P.
👻 Ghosted · Language Models are Few-Shot Learners · R.I.P.
👻 Ghosted · RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P.
👻 Ghosted · BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P.