Perceiving Systems Conference Paper 2021

How much coffee was consumed during EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI

Many real-world problems require the combined application of multiple reasoning abilities -- employing suitable abstractions, commonsense knowledge, and creative synthesis of problem-solving strategies. To help advance AI systems towards such capabilities, we propose a new reasoning challenge, namely Fermi Problems (FPs), which are questions whose answers can only be approximately estimated because their precise computation is either impractical or impossible. For example, "How much would the sea level rise if all ice in the world melted?" FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans. To do the same for AI systems, we present two datasets: 1) A collection of 1k real-world FPs sourced from quizzes and olympiads; and 2) a bank of 10k synthetic FPs of intermediate complexity to serve as a sandbox for the harder real-world challenge. In addition to question-answer pairs, the datasets contain detailed solutions in the form of an executable program and supporting facts, helping in supervision and evaluation of intermediate steps. We demonstrate that even extensively fine-tuned large-scale language models perform poorly on these datasets, on average making estimates that are off by two orders of magnitude. Our contribution is thus the crystallization of several unsolved AI problems into a single, new challenge that we hope will spur further advances in building systems that can reason.
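
As a rough illustration of two ideas in the abstract, a solution expressed as an executable program over supporting facts and an error measured in orders of magnitude, the following is a minimal sketch. The question, the facts, the numbers, and the names `solve` and `orders_of_magnitude_off` are all hypothetical and are not taken from the datasets; the error measure shown is a plain absolute log10 ratio, which may differ from the scoring used in the paper.

```python
import math


def orders_of_magnitude_off(estimate: float, reference: float) -> float:
    """Absolute base-10 log ratio between an estimate and a reference value."""
    if estimate <= 0 or reference <= 0:
        raise ValueError("both values must be positive")
    return abs(math.log10(estimate / reference))


# Hypothetical supporting facts for a toy question:
# "How many cups of coffee does a 3-day, 2000-person conference consume?"
# Every number here is invented for illustration only.
facts = {
    "attendees": 2000,
    "days": 3,
    "cups_per_person_per_day": 2,
}


def solve(facts: dict) -> float:
    """Toy 'program': combine the supporting facts into a single estimate."""
    return facts["attendees"] * facts["days"] * facts["cups_per_person_per_day"]


estimate = solve(facts)
```

Under this sketch, an estimate of 100 against a reference of 1 would count as two orders of magnitude off, which is the size of the average error the abstract reports for fine-tuned language models.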

Author(s): Ashwin Kalyan and Abhinav Kumar and Arjun Chandrasekaran and Ashish Sabharwal and Peter Clark
Book Title: 2021 Conference on Empirical Methods in Natural Language Processing - Proceedings of the Conference
Pages: 7318--7328
Year: 2021
Month: November
Editors: Moens, Marie-Francine and Huang, Xuanjing and Specia, Lucia and Yih, Scott Wen-tau
Publisher: Association for Computational Linguistics
Bibtex Type: Conference Paper (inproceedings)
Address: Stroudsburg, PA
DOI: 10.18653/v1/2021.emnlp-main.582
Event Name: Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
Event Place: Punta Cana, Dominican Republic (and virtual)
State: Published
Electronic Archiving: grant_archive
ISBN: 978-1-955917-09-4

BibTeX

@inproceedings{Kalyan_FermiProblems_2021,
  title = {How much coffee was consumed during {EMNLP} 2019? Fermi Problems: A New Reasoning Challenge for {AI}},
  booktitle = {2021 Conference on Empirical Methods in Natural Language Processing - Proceedings of the Conference},
  abstract = {Many real-world problems require the combined application of multiple reasoning abilities -- employing suitable abstractions, commonsense knowledge, and creative synthesis of problem-solving strategies. To help advance AI systems towards such capabilities, we propose a new reasoning challenge, namely Fermi Problems (FPs), which are questions whose answers can only be approximately estimated because their precise computation is either impractical or impossible. For example, "How much would the sea level rise if all ice in the world melted?" FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans. To do the same for AI systems, we present two datasets: 1) A collection of 1k real-world FPs sourced from quizzes and olympiads; and 2) a bank of 10k synthetic FPs of intermediate complexity to serve as a sandbox for the harder real-world challenge. In addition to question-answer pairs, the datasets contain detailed solutions in the form of an executable program and supporting facts, helping in supervision and evaluation of intermediate steps. We demonstrate that even extensively fine-tuned large-scale language models perform poorly on these datasets, on average making estimates that are off by two orders of magnitude. Our contribution is thus the crystallization of several unsolved AI problems into a single, new challenge that we hope will spur further advances in building systems that can reason.},
  pages = {7318--7328},
  editor = {Moens, Marie-Francine and Huang, Xuanjing and Specia, Lucia and Yih, Scott Wen-tau},
  publisher = {Association for Computational Linguistics},
  address = {Stroudsburg, PA},
  month = nov,
  year = {2021},
  slug = {kalyan_fermiproblems_2021},
  author = {Kalyan, Ashwin and Kumar, Abhinav and Chandrasekaran, Arjun and Sabharwal, Ashish and Clark, Peter},
  month_numeric = {11}
}