Abstract
We answer the present paper’s title in the negative. We begin by introducing and characterizing “real learning” (\(\mathcal {RL}\)) in the formal sciences, a phenomenon that has been firmly in place in homes and schools since at least Euclid. The defense of our negative answer pivots on an integration of reductio and proof by cases, and constitutes a general method for showing that any contemporary form of machine learning (ML) isn’t real learning. Along the way, we canvass the many different conceptions of “learning” in not only AI, but also psychology and its allied disciplines; none of these conceptions (with one exception, arising from the view of cognitive development espoused by Piaget) aligns with real learning. We explain in this context, in four steps, how to broadly characterize, and arrive at a focus on, \(\mathcal {RL}\).
Notes
- 1.
The need for the qualifications (i.e. determinate, non-question-begging) should be obvious. The answer to the present paper’s title according to which a machine that machine-learns by definition learns, since ‘learn’ appears in ‘machine-learn,’ assumes at the outset that what is called ‘machine learning’ today is real learning—but that’s precisely what’s under question; hence the petitio.
- 2.
All mathematical models of learning relevant to the present discussion that we are aware of take learning to consist fundamentally in the learning of number-theoretic functions from \(\mathbb {N} \times \mathbb {N} \times \cdots \times \mathbb {N}\) to \(\mathbb {N}\). Even when computational learning was firmly and exclusively rooted in classical recursion theory, and dedicated statistical formalisms were nowhere to be found, the target of learning was a function of this kind; see e.g. (Gold 1965; Putnam 1965), a modern, comprehensive version of which is given in (Jain et al. 1999). We have been surprised to hear that some in our audience aren’t aware of the basic, uncontroversial fact, readily appreciated by consulting the standard textbooks we cite here and below, that machine learning in its many guises takes the target of learning to be number-theoretic functions. A “shortcut” to grasping a priori that all systematic, rigorously described forms of learning in matters and activities computational and mechanistic must be rooted in number-theoretic functions is simply to note that computer science itself consists in the study and embodiment of number-theoretic functions, defined and ordered in hierarchies (e.g. see Davis and Weyuker 1983). We focus herein, by the way, on unary functions \(f : \mathbb {N} \rightarrow \mathbb {N}\), and do so only for ease of exposition.
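To make the target of learning concrete, the following is a minimal, hypothetical sketch of Gold-style identification in the limit, restricted to a toy class of unary number-theoretic functions; the hypothesis class, names, and coefficient bound are our own illustrative assumptions and are not drawn from the paper or from (Jain et al. 1999).

```python
# A toy, hypothetical illustration of Gold-style identification in the limit.
# The target of learning is a unary number-theoretic function f : N -> N,
# assumed here to lie in a small enumerable class: f(n) = a*n + b, 0 <= a, b < 10.

from itertools import product
from typing import Callable, Iterable, List, Tuple

# Fixed effective enumeration of the hypothesis class.
CANDIDATES: List[Tuple[int, int]] = list(product(range(10), repeat=2))

def conjecture(sample: Iterable[Tuple[int, int]]) -> Callable[[int], int]:
    """Return the first enumerated hypothesis consistent with the sample so far."""
    data = list(sample)
    for a, b in CANDIDATES:
        if all(a * x + b == y for x, y in data):
            return lambda n, a=a, b=b: a * n + b
    raise ValueError("target lies outside the assumed hypothesis class")

# Usage: feed ever-longer finite samples of a target in the class; the
# learner's conjectures stabilize on a correct hypothesis for the target.
target = lambda n: 3 * n + 2
sample = [(n, target(n)) for n in range(5)]
h = conjecture(sample)
assert all(h(n) == target(n) for n in range(100))
```

The learner simply conjectures the first enumerated hypothesis consistent with the data seen so far; for any target in an effectively enumerable class of this sort, its conjectures converge after finitely many examples, which is the sense of “learning” at issue in the identification-in-the-limit literature.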
- 3.
A biconditional isn’t needed. We use only a weaker set of necessary conditions, not a set of necessary and sufficient conditions.
- 4.
Not to be confused with RL, reinforcement learning, in which real learning, as revealed herein, doesn’t happen.
- 5.
As many readers will know, Searle’s (1980) Chinese Room Argument (CRA) is intended to show that computing machines can’t understand anything. It’s true that Bringsjord has refined, expanded, and defended CRA (e.g. see Bringsjord 1992; Bringsjord and Noel 2002; Bringsjord 2015), but bringing this argumentation to bear here in support of the present paper’s main claim would instantly demand an enormous amount of additional space. And besides, as we now explain, calling upon this argumentation is unnecessary.
- 6.
Since at bottom, as noted (see note 2), the target of learning should be taken for generality and rigor to be a number-theoretic function, it’s natural to consider learning in the realm of the formal sciences.
- 7.
Just as (computer) programs can be correct or incorrect, so too proofs can be correct or incorrect. For more on this, see e.g. (Arkoudas and Bringsjord 2007).
- 8.
If we regard Turing as having been speaking of modern AI in his famous (Turing 1950), note then too that his orientation is test-based: it is there, of course, that he gave the famous ‘Turing Test.’
- 9.
In fact, this is why real learning for humans in mathematics is challenging; see e.g. (Moore 1994).
- 10.
Our assumption here thus specifically invokes connectionist ML. But this causes no loss of generality, as we explain by way of the “tour” of ML taken in Sect. 6.1, and by the fact that the proof in the Appendix, as explained there, is a general method that will work for any contemporary form of ML.
- 11.
This is a rough-and-ready extraction from (Jain et al. 1999), and must be sufficient given the space limitations of the present short paper, at least for now. Of course, there are many forms of ML/machine learning in play in AI of today. In Sect. 6.1 we consider different forms of ML in contemporary AI. In Sect. 6.2 we consider different types of “learning” in psychology and allied disciplines.
- 12.
Lathrop (1996) shows, it might be asserted, that uncomputable functions can be machine-learned. But in his scheme, there is only a probabilistic approximation of real learning, and—in clear tension with (c1\('\))–(c3)—no proof in support of the notion that anything has been learned. The absence of such proofs is specifically called out in the formal deduction given in the Appendix.
- 13.
- 14.
We of course join epistemological cognoscenti in being aware of Gettier-style cases, but they can be safely left aside here. For the record, Bringsjord claims to have a solution anyway—one that is at least generally in the spirit of Chisholm’s (1966) proposed solution, which involves requiring that the justification in justified-true-belief accounts of knowledge be of a certain sort. For Gettier’s landmark paper, see (Gettier 1963).
- 15.
Specifically, we shall see that the formal deduction of the Appendix is actually a method for showing that other forms of “modern” ML, not just those that rely on ANNs, don’t enable machines to really learn anything. E.g., the method can take Bayesian learning as input, and yield as output the verdict that such learning isn’t real learning.
- 16.
Shakespeare himself, or better yet even Ibsen, or better better yet Bellow, couldn’t have invented a story dripping with this much irony—a story in which the machine-learning people persecuted the logicians for building “brittle” systems, and then the persecutors promptly proceeded to blithely build comically brittle systems as their trophies (given to themselves).
- 17.
In which, by the way, hypercomputational artificial neural networks are cited.
- 18.
E.g. even beginning textbooks introducing single-variable differential/integral calculus ask for verification of human learning by asking for proofs. The cornerstone and early-on-introduced concept of a limit is accordingly accompanied by requests to students that they supply proofs in order to confirm that they understand this concept. Thus we e.g. have on p. 67 of (Stewart, 2016) a request that the reader prove that
$$\begin{aligned} \lim _{x \rightarrow 3} g(x) = \lim _{x \rightarrow 3} (4x - 5) = 7, \end{aligned}$$
where \(g(x) = 4x - 5\). What machine-learning machine that has learned the function g here can do that?
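For readers who want a concrete sense of what such a request demands, here is a standard \(\varepsilon \)–\(\delta \) argument of the kind solicited (our own rendering, not a quotation from Stewart’s text):
$$\begin{aligned} &\text {Let } \varepsilon > 0 \text { and set } \delta = \varepsilon /4. \text { If } 0< |x - 3| < \delta , \text { then}\\ &\quad |(4x - 5) - 7| = |4x - 12| = 4\,|x - 3| < 4\delta = \varepsilon ,\\ &\text {and hence } \lim _{x \rightarrow 3} (4x - 5) = 7. \end{aligned}$$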
- 19.
This is essentially the Short Short Story Game of (Bringsjord 1998), much harder than such Turing-computable games as Checkers, Chess, and Go, which are all at the same easy level of difficulty (EXPTIME).
- 20.
Outside of the present paper, we have carried out a second analysis that confirms this, by examining learning in AI as characterized in (Russell and Norvig 2009), and invite skeptical readers to carry out their own analysis for this textbook, and indeed for any comprehensive, mainstream textbook. The upshot will be the stark fact that \(\mathcal {RL}\), firmly in place since Euclid as what learning in the formal sciences is, will be utterly absent.
- 21.
Instead of looking to published attempts to systematically present AI (such as the textbooks upon which we rely herein), one could survey practitioners in AI, and see if their views harmonize with the publications explicitly designed to present all of AI (from a high-altitude perspective). E.g., one could turn to such reports as (Müller and Bostrom 2016), in which the authors report on a specific question, given at a conference that celebrated AI’s “turning 50” (AI@50), which asked for an opinion as to the earliest date (computing) machines would be able to simulate human-level learning. It’s rather interesting that 41% of respondents said this would never happen. It would be interesting to know if, in the context of the attention ML receives these days, the number of these pessimists would be markedly smaller. If so, that may well be because, intuitively, plenty of people harbor suspicions that ML in point of fact hasn’t achieved any human-level real learning.
- 22.
Luger’s book revolves around a fundamental distinction between what he calls weak problem-solving versus strong problem-solving.
- 23.
There are a few exceptions. Hummel (2010) has explained that sophisticated and powerful forms of symbolic learning, ones aligned with second-order logic, are superior to associative forms of learning. Additionally, there’s one clear historical exception, but it’s now merely a sliver in psychology (specifically, in psychology of reasoning), and hence presently has insufficient adherents to merit inclusion in the ontology we now proceed to canvass. We refer here to the type of learning over the years of human development and formal education posited by Piaget; e.g. see (Inhelder and Piaget 1958). Piaget’s view, in a barbaric nutshell, is that, given solid academic education, nutrition, and parenting, humans develop the capacity to reason with and even eventually over first-order and modal logic—which means that such humans would develop the capacity to learn in \(\mathcal {RL}\) fashion, in school. Since attacks on Piaget’s view, starting originally with those of Wason and Johnson-Laird (e.g. see Wason and Johnson-Laird 1972), many psychologists have rejected Piaget’s position. For what it’s worth, Bringsjord has defended Piaget; see e.g. (Bringsjord et al. 1998).
- 24.
We are happy to concede that years of laborious (and tedious?) study of conditioning using appetitive and aversive reinforcement (and such phenomena as inhibitory conditioning, conditioned suppression, higher-order conditioning, conditioned reinforcement, and blocking) have revealed that conditioning can’t be literally reduced to new reflexes, but there is no denying that in conditioning, any new knowledge and representation that takes form falls light years short of \(\mathcal {RL}\).
- 25.
Note that, in keeping with the psychometric operationalization introduced at the outset in order not to rely on the murky concept of understanding, that operationalization could be invoked for all occurrences of ‘understanding’ in the itemized list that follows; but doing so would take much space and time, and be quite inelegant.
- 26.
Peano Arithmetic (PA) is rarely introduced by name in K–12 education, but all the axioms of it, save perhaps for the Induction Schema, are introduced and taught there.
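For reference, one standard presentation of these axioms, with \(s\) the successor function, is the following; the first six formulas are the axioms in question, and the final schema is Induction:
$$\begin{aligned} &\forall x\, \lnot (s(x) = 0) \qquad \forall x \forall y\, (s(x) = s(y) \rightarrow x = y)\\ &\forall x\, (x + 0 = x) \qquad \forall x \forall y\, (x + s(y) = s(x + y))\\ &\forall x\, (x \times 0 = 0) \qquad \forall x \forall y\, (x \times s(y) = (x \times y) + x)\\ &[\phi (0) \wedge \forall x\, (\phi (x) \rightarrow \phi (s(x)))] \rightarrow \forall x\, \phi (x) \quad \text {(for each formula } \phi \text {)} \end{aligned}$$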
- 27.
References
Achab, M., Bacry, E., Gaïffas, S., Mastromatteo, I., Muzy, J.F.: Uncovering causality from multivariate Hawkes integrated cumulants. In: Precup, D., Teh, Y.W. (eds) Proceedings of the 34th International Conference on Machine Learning, PMLR, International Convention Centre, Sydney, Australia. Proceedings of Machine Learning Research, vol. 70, pp. 1–10 (2017). http://proceedings.mlr.press/v70/achab17a.html
Arkoudas, K.: Denotational proof languages. Ph.D. thesis, MIT (2000)
Arkoudas, K., Bringsjord, S.: Computers, justification, and mathematical knowledge. Minds Mach. 17(2), 185–202 (2007)
Arkoudas, K., Musser, D.: Fundamental Proof Methods in Computer Science: A Computer-Based Approach. MIT Press, Cambridge (2017)
Bandura, A., Walters, R.H.: Social Learning Theory, vol. 1. Prentice-Hall, Englewood Cliffs (1977)
Bandura, A., Ross, D., Ross, S.A.: Transmission of aggression through imitation of aggressive models. J. Abnorm. Soc. Psychol. 63(3), 575 (1961)
Barrett, L.: Beyond the Brain: How Body and Environment Shape Animal and Human Minds. Princeton University Press, Princeton (2015)
Bellman, A., Bragg, S., Handlin, W.: Algebra 2: Common Core. Pearson, Upper Saddle River (2012). Series Authors: Charles, R., Kennedy, D., Hall, B., Consulting Authors: Murphy, S.G
Boolos, G.S., Burgess, J.P., Jeffrey, R.C.: Computability and Logic, 4th edn. Cambridge University Press, Cambridge (2003)
Bringsjord, S.: What Robots Can and Can’t Be. Kluwer, Dordrecht (1992)
Bringsjord, S.: Chess is too easy. Technol. Rev. 101(2), 23–28 (1998). http://kryten.mm.rpi.edu/SELPAP/CHESSEASY/chessistooeasy.pdf
Bringsjord, S.: Psychometric artificial intelligence. J. Exp. Theor. Artif. Intell. 23(3), 271–277 (2011)
Bringsjord, S.: The symbol grounding problem-remains unsolved. J. Exp. Theor. Artif. Intell. 27(1), 63–72 (2015). https://doi.org/10.1080/0952813X.2014.940139
Bringsjord, S., Arkoudas, K.: The modal argument for hypercomputing minds. Theor. Comput. Sci. 317, 167–190 (2004)
Bringsjord, S., Noel, R.: Real robots and the missing thought experiment in the Chinese room dialectic. In: Preston, J., Bishop, M. (eds.) Views into the Chinese Room: New Essays on Searle and Artificial Intelligence, pp. 144–166. Oxford University Press, Oxford (2002)
Bringsjord, S., Schimanski, B.: What is artificial intelligence? psychometric AI as an answer. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), pp. 887–893. Morgan Kaufmann, San Francisco (2003). http://kryten.mm.rpi.edu/scb.bs.pai.ijcai03.pdf
Bringsjord, S., Zenzen, M.: Superminds: People Harness Hypercomputation, and More. Kluwer Academic Publishers, Dordrecht (2003)
Bringsjord, S., Bringsjord, E., Noel, R.: In defense of logical minds. In: Proceedings of the 20th Annual Conference of the Cognitive Science Society, pp. 173–178. Lawrence Erlbaum, Mahwah (1998)
Bringsjord, S., Kellett, O., Shilliday, A., Taylor, J., van Heuveln, B., Yang, Y., Baumes, J., Ross, K.: A new Gödelian argument for hypercomputing minds based on the busy beaver problem. Appl. Math. Comput. 176, 516–530 (2006)
Chisholm, R.: Theory of Knowledge. Prentice-Hall, Englewood Cliffs (1966)
Davis, M., Weyuker, E.: Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science, 1st edn. Academic Press, New York (1983)
Dodig-Crnkovic, G., Giovagnoli, R. (eds.): Computing Nature: Turing Centenary Perspective. Springer, Berlin (2013). https://www.springer.com/us/book/9783642372247
Domjan, M.: The Principles of Learning and Behavior, 7th edn. Cengage Learning, Stamford (2015)
Gallistel, C.R.: Learning and representation. In: Learning and Memory: A Comprehensive Reference, vol. 1. Elsevier (2008)
Gettier, E.: Is justified true belief knowledge? Analysis 23, 121–123 (1963). http://www.ditext.com/gettier/gettier.html
Gold, M.: Limiting recursion. J. Symb. Logic 30(1), 28–47 (1965)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016). http://www.deeplearningbook.org
Goodstein, R.: On the restricted ordinal theorem. J. Symb. Logic 9, 33–41 (1944)
Huitt, W.: Classroom Instruction. Educational Psychology Interactive (2003)
Hummel, J.: Symbolic versus associative learning. Cogn. Sci. 34(6), 958–965 (2010)
Inhelder, B., Piaget, J.: The Growth of Logical Thinking from Childhood to Adolescence. Basic Books, New York (1958)
Jain, S., Osherson, D., Royer, J., Sharma, A.: Systems That Learn: An Introduction to Learning Theory, 2nd edn. MIT Press, Cambridge (1999)
Kearns, M., Vazirani, U.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)
Kitzelmann, E.: Inductive programming: a survey of program synthesis techniques. In: International Workshop on Approaches and Applications of Inductive Programming, pp 50–73. Springer (2009)
Lathrop, R.: On the learnability of the uncomputable. In: Saitta, L. (ed.) Proceedings of the 13th International Conference on Machine Learning (Italy, 3–6 July 1996), pp. 302–309. Morgan Kaufmann, San Francisco (1996). https://pdfs.semanticscholar.org/6919/b6ad91d9c3aa47243c3f641ffd30e0918a46.pdf
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Luger, G.: Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th edn. Pearson, London (2008)
Mackintosh, N.J.: Conditioning and Associative Learning. Clarendon Press, Oxford (1983)
Marblestone, A.H., Wayne, G., Kording, K.P.: Toward an integration of deep learning and neuroscience. Front. Comput. Neurosci. 10(94) (2016). https://doi.org/10.3389/fncom.2016.00094
Moore, R.C.: Making the transition to formal proof. Educ. Stud. Math. 27(3), 249–266 (1994)
Müller, V., Bostrom, N.: Future progress in artificial intelligence: a survey of expert opinion. In: Müller, V. (ed.) Fundamental Issues of Artificial Intelligence (Synthese Library), pp. 553–571. Springer, Berlin (2016)
Penn, D., Holyoak, K., Povinelli, D.: Darwin’s mistake: explaining the discontinuity between human and nonhuman minds. Behav. Brain Sci. 31, 109–178 (2008)
Putnam, H.: Trial and error predicates and a solution to a problem of Mostowski. J. Symb. Logic 30(1), 49–57 (1965)
Rado, T.: On non-computable functions. Bell Syst. Tech. J. 41, 877–884 (1963)
Ross, J.: Immaterial aspects of thought. J. Philos. 89(3), 136–150 (1992)
Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice Hall, Upper Saddle River (2009)
Schapiro, A., Turk-Browne, N.: Statistical learning. Brain Mapp. Encyclopedic Ref. 3, 501–506 (2015)
Searle, J.: Minds, brains and programs. Behav. Brain Sci. 3, 417–424 (1980)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)
Stewart, J.: Calculus, 8th edn. Cengage Learning, Boston (2016). We refer here to an electronic version of the print textbook; the “Student Edition” of the hard-copy textbook has ISBN 978-1-305-27176-0
Titley, H.K., Brunel, N., Hansel, C.: Toward a neurocentric view of learning. Neuron 95(1), 19–32 (2017)
Turing, A.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
Wason, P., Johnson-Laird, P.: Psychology of Reasoning: Structure and Content. Harvard University Press, Cambridge (1972)
Wolpert, D.H.: The lack of a priori distinctions between learning algorithms. Neural Comput. 8(7), 1341–1390 (1996)
Acknowledgement
We are deeply appreciative of feedback received at PT-AI 2017, the majority of which is addressed herein. The first author is also specifically indebted to John Hummel for catalyzing, in vibrant discussions at MAICS 2017, the search for formal arguments and/or theorems establishing the proposition Hummel and Bringsjord co-affirm: viz., statistical machine learning simply doesn’t enable machines to actually learn, period. Bringsjord is also thankful to Sergei Nirenburg for valuable conversations. Many readers of previous drafts have been seduced by it’s-not-really-learning forms of learning (including folks even worse off than those relying on artificial neural network (ANN) based deep learning (DL): Bayesians), and have offered spirited objections, all of which are refuted herein; yet we are grateful for the valiant tries. Bertram Malle stimulated and guided our sustained study of types of learning in play in psychology\(^+\), and we are thankful. Jim Hendler graciously read an early draft; his resistance has been helpful (though perhaps now he’s a convert). The authors are also grateful for five anonymous reviews, some portions of which reflected at least partial and passable understanding of our logico-mathematical perspective, from which informal notions of “learning” are inadmissible in such debates as the present one. Two perspicacious comments and observations from two particular PT-AI 2017 participants, subsequent to the conference, proved productive to deeply ponder. We acknowledge the invaluable support of “Advanced Logicist Machine Learning” from ONR, and of “Great Computational Intelligence” from AFOSR. Finally, without the wisdom, guidance, leadership, and raw energy of Vincent Müller, PT-AI 2017, and any ideas of ours that have any merit at all, and that were expressed there and/or herein, would not have formed.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Bringsjord, S., Govindarajulu, N.S., Banerjee, S., Hummel, J. (2018). Do Machine-Learning Machines Learn?. In: Müller, V. (eds) Philosophy and Theory of Artificial Intelligence 2017. PT-AI 2017. Studies in Applied Philosophy, Epistemology and Rational Ethics, vol 44. Springer, Cham. https://doi.org/10.1007/978-3-319-96448-5_14
DOI: https://doi.org/10.1007/978-3-319-96448-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96447-8
Online ISBN: 978-3-319-96448-5
eBook Packages: Computer Science (R0)