Published December 2023
Journal Article

From undecidability of non-triviality and finiteness to undecidability of learnability

  • 1. California Institute of Technology
  • 2. Freie Universität Berlin
  • 3. Technical University of Munich

Abstract

Machine learning researchers and practitioners steadily enlarge the multitude of successful learning models. They achieve this through in-depth theoretical analyses and experiential heuristics. However, there is no known general-purpose procedure for rigorously evaluating whether newly proposed models indeed successfully learn from data. We show that such a procedure cannot exist. For PAC binary classification, uniform and universal online learning, and exact learning through teacher-learner interactions, learnability is in general undecidable, both in the sense of independence of the axioms in a formal system and in the sense of uncomputability. Our proofs proceed via computable constructions that encode the consistency problem for formal systems and the halting problem for Turing machines into whether certain function classes are trivial/finite or highly complex, which we then relate to whether these classes are learnable via established characterizations of learnability through complexity measures. Our work shows that undecidability appears in the theoretical foundations of artificial intelligence: There is no one-size-fits-all algorithm for deciding whether a machine learning model can be successful. We cannot in general automatize the process of assessing new learning models.
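The reduction idea in the abstract can be illustrated with a toy encoding. The sketch below is not the construction from the article; it is a minimal illustration under stated assumptions. A "machine" is modelled as a zero-argument Python generator function that yields once per computation step, and the hypothetical class attached to a machine contains the indicator function of a finite set S of natural numbers exactly when the machine has not halted within max(S) + 1 steps. If the machine halts, only finitely many sets qualify, so the class is finite; if it runs forever, the class shatters arbitrarily large sets. Since finite VC dimension characterizes PAC learnability of binary classes (under standard measurability assumptions), deciding learnability of this toy class would amount to deciding halting. All function names here are hypothetical.

```python
def halts_within(machine, steps):
    """Simulate `machine` for at most `steps` steps; True if it halted by then.

    Assumption: `machine` is a zero-argument generator function that yields
    once per computation step and returns when the computation halts.
    """
    run = machine()
    for _ in range(steps):
        try:
            next(run)
        except StopIteration:
            return True
    return False


def in_class(machine, support):
    """Membership test for the toy class attached to `machine`.

    The indicator function of the finite set `support` belongs to the class
    iff the machine has not halted within max(support) + 1 steps.
    """
    t = max(support) + 1 if support else 0
    return not halts_within(machine, t)


def vc_lower_bound(machine, budget):
    """Largest d <= budget such that {0, ..., d-1} is shattered by the class.

    {0, ..., d-1} is shattered iff the machine has not halted within d steps,
    because then every subset of it passes the membership test.
    """
    d = 0
    while d < budget and not halts_within(machine, d + 1):
        d += 1
    return d


def halts_after_three():
    # Toy machine: performs three steps, then halts.
    for _ in range(3):
        yield


def loops_forever():
    # Toy machine: never halts.
    while True:
        yield


if __name__ == "__main__":
    # Halting machine: the bound stops growing once the machine has halted.
    print(vc_lower_bound(halts_after_three, budget=100))
    # Non-halting machine: the bound always exhausts the simulation budget.
    print(vc_lower_bound(loops_forever, budget=100))
```

Any finite simulation budget only yields a lower bound on the VC dimension. No budget distinguishes a machine that never halts from one that halts later, which is where the uncomputability enters.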

Copyright and License

© 2023 Elsevier Inc. All rights reserved.

Acknowledgement

I thank Michael M. Wolf for stimulating discussions on questions of undecidability and for suggesting the reasoning leading to Corollary 4.7. I also thank the anonymous reviewers and the meta-reviewer from COLT 2021, the anonymous reviewers from ITCS 2022, the anonymous reviewers from COLT 2022, the anonymous reviewers from JMLR, and the anonymous reviewers at the International Journal of Approximate Reasoning for their feedback. Furthermore, I thank Artem Chernikov, Asaf Karagila, Aryeh Kontorovich, Vladimir Pestov, and Roi Weiss for pointing out to me the measure-theoretic subtleties around Theorem 2.3. Moreover, I thank Aryeh Kontorovich for bringing the references [21], [12], [15] to my attention. Finally, I thank Tom Sterkenburg for an illuminating discussion leading to a clarified statement of Remark 3.17.

Support from the TopMath Graduate Center of the TUM Graduate School at the Technische Universität München, Germany, from the TopMath Program at the Elite Network of Bavaria, from the German Academic Scholarship Foundation (Studienstiftung des Deutschen Volkes), from the BMWK (PlanQK), and from the DAAD PRIME Fellowship program is gratefully acknowledged.

Data Availability

No data was used for the research described in the article.

Additional details

Created: December 23, 2024
Modified: December 23, 2024