Testing LLM performance on the Physics GRE: some observations
December 07, 2023 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Pranav Gupta
arXiv ID
2312.04613
Category
physics.ed-ph
Cross-listed
cs.LG
Citations
3
Venue
arXiv.org
Last Checked
1 month ago
Abstract
With the recent developments in large language models (LLMs) and their widespread availability through open source models and/or low-cost APIs, several exciting products and applications are emerging, many of which are in the field of STEM educational technology for K-12 and university students. There is a need to evaluate these powerful language models on several benchmarks, in order to understand their risks and limitations. In this short paper, we summarize and analyze the performance of Bard, a popular LLM-based conversational service made available by Google, on the standardized Physics GRE examination.
Similar Papers
In the same crypt: physics.ed-ph
Use of Eye-Tracking Technology to Investigate Cognitive Load Theory (Ghosted)
Beyond Answers: Large Language Model-Powered Tutoring System in Physics Education for Deep Learning and Precise Understanding (Ghosted)
How Peripheral Interactive Systems Can Support Teachers with Differentiated Instruction: Using FireFlies as a Probe (Ghosted)
Combining surveys and sensors to explore student behaviour (Ghosted)
Innovative Approaches to Teaching Quantum Computer Programming and Quantum Software Engineering (Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (Ghosted)
XGBoost: A Scalable Tree Boosting System (Ghosted)