Old Age
PLACES: Prompting Language Models for Social Conversation Synthesis
February 07, 2023 · Entered Twilight · Findings
Repo contents: CODE_OF_CONDUCT.md, CONTRIBUTING.md, LICENSE, README.md, THIRD-PARTY-LICENSES.txt, conversation_synthesis.py, load_models.py, parse_topical_chat.py, prompts, utils.py, write_prompts.py
Authors
Maximillian Chen, Alexandros Papangelis, Chenyang Tao, Seokhwan Kim, Andy Rosenbaum, Yang Liu, Zhou Yu, Dilek Hakkani-Tur
arXiv ID
2302.03269
Category
cs.CL: Computation & Language
Cross-listed
cs.AI, cs.IR
Citations
102
Venue
Findings
Repository
https://github.com/alexa/PLACES
⭐ 11
Last Checked
1 month ago
Abstract
Collecting high-quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompting large language models. In this work, we use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting. We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations. These include human evaluation of the synthesized conversations along various dimensions of conversation quality, as well as interactive human evaluation of chatbots fine-tuned on the synthetically generated dataset. We additionally demonstrate that this prompting approach generalizes to multi-party conversations, offering the potential to create new synthetic data for multi-party tasks. Our synthetic multi-party conversations were rated more favorably across all measured dimensions compared to conversation excerpts sampled from a human-collected multi-party dataset.
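The core recipe is simple enough to outline in code. The Python sketch below shows one way such few-shot conversation synthesis could look: expert-written dialogues are concatenated as in-context examples, followed by a fresh topic for the model to continue. The example dialogues, topics, and the `complete` stub are illustrative placeholders, not the paper's actual prompts or pipeline (the repository's prompts directory and conversation_synthesis.py contain those).

```python
# Minimal sketch of few-shot dialogue synthesis: a handful of expert-written
# conversations serve as in-context examples, and an LLM is prompted to write
# a new conversation on an unseen topic in the same style. All dialogues and
# topics here are hypothetical placeholders, not the paper's actual prompts.

EXPERT_EXAMPLES = [
    {
        "topic": "weekend plans",
        "dialogue": [
            "A: Any plans for the weekend?",
            "B: I'm thinking of going hiking if the weather holds up.",
            "A: That sounds fun. Which trail are you looking at?",
        ],
    },
    {
        "topic": "cooking",
        "dialogue": [
            "A: Have you tried baking bread at home?",
            "B: I have! My first loaf was dense, but it's getting better.",
            "A: I hear kneading technique makes a big difference.",
        ],
    },
]


def build_prompt(examples: list, new_topic: str) -> str:
    """Concatenate expert-written conversations as in-context examples,
    then open a new conversation on `new_topic` for the model to complete."""
    parts = ["The following are casual social conversations between two people.", ""]
    for ex in examples:
        parts.append(f"Topic: {ex['topic']}")
        parts.extend(ex["dialogue"])
        parts.append("")  # blank line separates examples
    parts.append(f"Topic: {new_topic}")
    parts.append("A:")  # the LLM continues the dialogue from here
    return "\n".join(parts)


def complete(prompt: str) -> str:
    """Placeholder for a call to a large language model; swap in whichever
    model or API you use for generation."""
    raise NotImplementedError


if __name__ == "__main__":
    # Prints the assembled few-shot prompt; feed it to an LLM to synthesize
    # a new conversation on the target topic.
    print(build_prompt(EXPERT_EXAMPLES, "favorite books"))
```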
Similar Papers
In the same crypt · Computation & Language
Old Age · BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P.
👻 Ghosted · Language Models are Few-Shot Learners · R.I.P.
👻 Ghosted · RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P.
👻 Ghosted · BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P.