Deep Learning, Alien Knowledge and Other UFOs

Deep learning generates observations we can’t explain. Is this the end of theory or a rallying cry for deep explanations? A response to David Weinberger.

I initially dismissed David Weinberger’s report of alien knowledge as tabloid sensationalism. But as the recommendations for his essay accumulated, it gave me pause. Weinberger’s post rewards a close reading.

My intent here is to present a more incremental, less revolutionary perspective on AI and machine learning. I believe the historical antecedents paint a far more earthly, but perhaps no less sensational, picture.

I also believe Weinberger is accurately expressing the concerns (and possibly even hopes) of many within the community, expert and layperson alike. Many of the questions I raise here also surfaced in the comments to Weinberger’s post. He’s been generous with his responses, so I’ve tried to incorporate that extended discussion.

The end of theory and the scientific method?

Weinberger begins by resurrecting Chris Anderson’s argument from 2008, announcing the end of theory and the scientific method as we know it. It’s a shaky foundation. As Weinberger recalls, Anderson’s piece “kicked up a little storm.” Peter Norvig even maintained that Anderson “was being provocative, presenting a caricature of an idea, even though he knew the idea was not really true.”

“All models are wrong, and increasingly you can succeed without them.” To set the record straight: That’s a silly statement, I didn’t say it, and I disagree with it. The ironic thing is that even the article’s author, Chris Anderson, doesn’t believe the idea.

While crediting Anderson for surfacing an important development, Norvig asserted, “that does not mean we throw out the scientific method.”

I wouldn’t try building on an argument that Anderson disavowed and Norvig called silly. Yet Weinberger not only dismisses objections to Anderson’s essay as “quaint”, he leaps dramatically into an abyss of “alien knowledge”.

Apparently, Anderson now believes his argument, or at least believes that it’s “getting less wrong by the day. Which is not to say ‘right’, but…”

Yes, a lot of buts. Weinberger describes complex models we can’t understand. But that description applies to virtually all of our tools, mathematical or otherwise. Weinberger describes a new form of alien knowledge. But the output of these machines is better understood as generating new observations, data that reside at a fairly low level in the knowledge creation hierarchy. Weinberger describes explanatory knowledge as simple and reductive. But explanations are often deep, integrative, and defy a reductionist hierarchy.

I’ll address each of these buts in more detail below. I hope you’ll be persuaded that Norvig’s conclusion of a decade ago still stands: “Theory has not ended, it is expanding into new forms.”

Machines creating models we can’t understand?

“We are increasingly relying on machines that derive conclusions from models that they themselves have created, models that are often beyond human comprehension, models that ‘think’ about the world differently than we do.”

Deep learning models, corralling millions of parameters, are certainly beyond direct comprehension. But unless you’re a mathematical savant, you don’t have to venture beyond a handful of parameters before being so humbled. One of the brightest computer scientists who ever lived, John von Neumann, once said, “Young man, in mathematics you don’t understand things. You just get used to them.” And once you get used to them, you use them. There’s a reason machine learning is so closely aligned with applied statistics.

Weinberger’s characterization of incomprehensible models can be extended to virtually any tool. Consider one man’s effort to acquire a fundamental understanding of a toaster. Yes, a toaster. Or a television. Or a computer. Or a complex mathematical model. Even our eyes embody “alien knowledge” in this frame, converting photons to electrical signals in neurons, interpreted by hypotheses of what we expect to see. I don’t comprehend it, yet I see.

Raising the extended mind theory of Andy Clark and David Chalmers, Weinberger acknowledges, “knowing is something we’ve always done out in the world with tools.” And in a comment, he clarifies, “[Deep learning] makes that fact inescapable.” Inescapable, not in the sense that “thinking with tools” is new or somehow enslaves us, but merely that it’s now self-evident.

We don’t need to understand complex models to pursue the collective endeavor of knowledge creation, a collaboration across centuries. Our technological and scientific complex has long advanced beyond direct comprehension. Yes, deep learning adds another layer of complexity to an already vast and incomprehensibly complex system. But it’s incremental, not revolutionary.

The essence of Weinberger’s argument then is “that the nature of computer-based justification is not at all like human justification. It is alien.” Weinberger relays the advice of Mike Williams: “we need to be especially vigilant about the prejudices that often, and perhaps always, make their way into [experiments].” But deep learning seems to defy vigilance. In a follow-up email, Weinberger cautioned, “at least in some cases, we cannot interrogate [deep learning] and get understandable answers.”

Tools always carry a degree of inscrutability, and observations are inherently biased. Rather than treating these as an alien phenomenon, isn’t it better to consider them challenges of experimental design? Again, deep learning doesn’t introduce these challenges, but rather makes them self-evident.

More importantly, I think the standard of care becomes clearer when machine learning is framed as an observational tool. Observations, machine-generated or otherwise, shouldn’t be misconstrued as knowledge, much less edicts.

Are new observations knowledge?

It’s Weinberger’s post, so he can call the proceeds of machine learning anything he likes: knowledge, conclusions, outcomes, predictions, decisions. With tongue in cheek, he can even anthropomorphize this “understanding” and “thinking”. When Rick Fischer recommended in a comment that the terms “data” or “output” would be more appropriate, Weinberger found no reason to disagree.

But terminology (and consistent choices of terminology) matters, particularly when we’re elevating one class of information above another. I’m partial to the term “observations”. These machines predict new observations. Like telescopes, they reveal latent information, patterns and regularities.

Yes, observations may be awe-inspiring, now and always. Can you imagine the awe, the sense of “alien knowledge”, when Galileo first turned his telescope to the skies, revealing the Moon’s rocky surface, the satellites of Jupiter, and spots on the sun? I suspect it’s the exact sort of experience we’re having today with AI and machine learning, the illumination of brand new vistas.

But knowledge, as I understand it, constitutes a much higher standard than observations, even the awe-inspiring sort. It’s the deeper explanation of the phenomena. Consider the reach of explanations like quantum mechanics or evolution to explain a dizzying array of observations, both seen and unseen, expected and unimaginable.

The temporary absence of explanations doesn’t anoint a model as “alien knowledge”. It’s just the absence of explanations. To suggest that some regularity doesn’t have an underlying explanation is only to say that a deeper understanding of the phenomenon has yet to be discovered.

And humans are quick to fill that void. Weinberger surfaced a wonderful example of this process in AlphaGo, but he didn’t carry the example far enough to communicate this hierarchy of knowledge creation. In a comment, Thomas G. Dietterich observed, “After the triumph of AlphaGo, the Go players of the world are now studying its tactics and changing how they play.” AlphaGo’s moves are new observations; they’re not new explanations. And this is a difference that matters.

Even the seemingly unfathomable inner workings of deep learning will eventually succumb to explanations. Weinberger is obviously aware of the effort. In a comment, Weinberger replied, “To be clear, I do not mean to say that we can never understand how an instance of Deep Learning has come up with its results and can never discover a human-usable regularity. I know there is work going on to help us understand those decisions when possible.”

So what’s possible? Are there limits to this process of knowledge creation?

The limits of explanatory knowledge?

“We thought knowledge was about finding the order hidden in the chaos. We thought it was about simplifying the world. It looks like we were wrong. Knowing the world may require giving up on understanding it.”

I fear explanations are not well represented in Weinberger’s essay, which may lead some to fear the limits of our explanations. Repeatedly, he characterizes them as rules, “simple” and “reductive”, but these terms misrepresent our best knowledge. Elegant expressions like E=mc² get all the press, but many explanations are extremely complex. Far from “gross acts of simplification”, they can be mind-blowingly accurate. Explanations at one stratum are often autonomous, in that they don’t reduce to more “fundamental” strata. Consider the breadth of explanations across emergent phenomena such as thermodynamics, biology, psychology, economics, and yes, computer science.

Weinberger is a staunch critic of reductionism and views his entire essay as an argument against it. In an email, he asserted, “our knowledge has been constrained by our mind’s need to understand the world by reducing it to the rules our mind can understand, and by the limited amounts of data our instruments have been able to manage.” But I don’t believe we understand the world by reducing it; we understand it through increasingly deep explanations.

The physicist and philosopher David Deutsch summarized this optimistic viewpoint: “Scientific knowledge, like all human knowledge, consists primarily of explanations. Mere facts can be looked up, and predictions [and prediction generating machines] are important only for conducting crucial experimental tests to discriminate between competing scientific theories that have already passed the test of being good explanations. As new theories supersede old ones, our knowledge is becoming both broader (as new subjects are created) and deeper (as our fundamental theories explain more, and become more general). Depth is winning.”

I agree completely with this point from Weinberger: “Our machines have made obvious our epistemological limitations, and by providing a corrective, have revealed a truth about the universe.” But to be clear, he believes this moment in time marks the limit of our explanations. And in another comment, he wrote, “Knowledge has always been the surface of far vaster ignorance. My personal hope is that Deep Learning is making us more aware of, and comfortable with, that.”

I hope you’re uncomfortable with that. Yes, as it will always be, our current knowledge is the surface of far vaster ignorance, what Deutsch describes as The Beginning of Infinity. But the revolution of scientific knowledge marks the beginning of this process, not the end.

Deep explanations are winning. Deep learning should be celebrated, not for revealing the limits of knowledge, but as a powerful observational tool, the telescope of our time, at the service of explanations.

Special thanks to David Weinberger, Mat Wilson, and Nikhil Sriraman for their reviews of earlier drafts.