Doug Lenat, Ph.D.


Doug Lenat, Ph.D., contributor, is a pioneer in artificial intelligence. Dr. Lenat is founder of the long-standing Cyc project and CEO of Cycorp, a provider of semantic technologies for unprecedented common sense reasoning. He has been a Professor of Computer Science at Carnegie-Mellon and Stanford and has received numerous honors, including the biennial IJCAI Computers and Thought Award (the highest honor in Artificial Intelligence). Dr. Lenat is a founder and Advisory Board member of TTI Vanguard, where he continues to co-run four conferences each year. Doug also holds the distinction of being the only individual to have served on the Scientific Advisory Boards of both Microsoft and Apple.

Articles

Sometimes the Veneer of Intelligence is Not Enough


Original COGNITIVE WORLD publication date: May 2017


We all chafe against the brittleness of Siri, Alexa, Google… every day.  They’re so incredibly useful and yet so incredibly… well, stupid, at the same time.  For instance:


Me to Google: How tall was the President of the United States when Donald Trump was born?

Google to me: 37 million hits, none of which (okay, I didn’t really read through all of them) tell me the answer.  Almost all of them tell me that the Donald was born on June 14, 1946.

Me to Google: How tall was the President of the United States on June 14, 1946?

Google to me: only 6 million hits this time, but again it doesn’t appear that any contain the answer.

Me to Google: Who was the President of the United States on June 14, 1946?

Google to me: 1.4 million hits, and this time most of them are about Donald Trump but at least some of them contain sentences that reveal that the President then was Harry S. Truman.

Me to Google: How tall was Harry S. Truman?

Google to me: Now the hits (and info box) tell me the answer millions of times over: 5’ 9”.



Of course, using Google and other search engines day after day, decade after decade, has trained me by now, so I would never ask that original question to begin with: How tall was the President of the United States when Donald Trump was born?  I know better, and I’d put together a little plan to ask a sequence of queries, much like I ended up doing.
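For the curious, here is a minimal sketch of that little query plan, in Python. The lookup function and the canned fact table are stand-ins I have invented for illustration (they are not any real search API); the point is only that the compound question decomposes into a chain of single-fact lookups.

    # Illustrative only: answer a compound question by chaining three single-fact lookups,
    # the way a person does after years of search-engine training.
    def lookup(question, facts):
        """Pretend single-fact retrieval from a search engine's info box."""
        return facts[question]

    def height_of_president_when_born(person, facts):
        birth_date = lookup("When was " + person + " born?", facts)
        president = lookup("Who was the President of the United States on " + birth_date + "?", facts)
        return lookup("How tall was " + president + "?", facts)

    facts = {
        "When was Donald Trump born?": "June 14, 1946",
        "Who was the President of the United States on June 14, 1946?": "Harry S. Truman",
        "How tall was Harry S. Truman?": "5' 9\"",
    }
    print(height_of_president_when_born("Donald Trump", facts))   # -> 5' 9"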

And yet that logical inference is one of the hallmarks of human cognition.  We know a lot of things, but so do dogs and search engines.  What separates us is that we can also deduce things, at least if they only require a couple of reasoning steps, and we do that so quickly we aren’t even aware of doing it, and we expect everyone else to be able to do the same.  Witness every time you use a pronoun, or an ambiguous word, or an analogy or metaphor, or an ellipsis.

Fred was mad at Sam because he stole his lunch.

Who stole whose lunch?  Your brain decodes that into the image of Sam stealing Fred’s lunch.  Your brain doesn’t stop there; you can answer a ton of additional questions just from that one sentence, like: did Sam know he was going to steal it before he did it (yes, probably, at least a little before)?  Did Fred immediately know his lunch had been stolen (no, probably not until he went to fetch it from the refrigerator)?  Why did Sam do it (he might have been hungry, or wanted revenge on Fred for some wrong Fred committed, or maybe Sam just mistook one bag for another and the criminal intent is all in Fred’s mind)?  Are Fred and Sam about the same age (probably, otherwise it would have been worth mentioning)?  Were Fred and Sam in the same building on the day this happened?  And your brain goes even further and abduces a whole rich panoply of additional even less certain (but still quite likely) things: Where was the lunch (in a bag in a refrigerator)?

Some linguists will tell you, and themselves, that this assignment of pronouns is due to some grammar-related rule, like “Sam” being the proper noun closest to “he”.  To that I offer this alternate sentence I could have written:

Fred was mad at Sam so he stole his lunch.

I’ve just changed one word, “because” to “so”.  But now clearly Fred is the thief.  And notice that in neither case did you for an instant think that Fred stole Fred’s lunch or Sam stole Sam’s lunch.   In fact I could have given you this pair of sentences back to back:

Fred was mad at Sam because he stole his lunch.  So he stole his lunch.

Now Sam is the initial thief, and then Fred is stealing Sam’s lunch to get back at him (and/or just so he won’t go without lunch that day, if it’s all happening on the same day.)

All this is akin to what Dan Kahneman calls thinking slow; though, as we’ve seen here, short reasoning chains can happen so quickly we aren’t even consciously aware of going through them.  I mentioned ellipsis, above, and while sometimes that is an explicit “etc.”, sometimes it’s just another sentence which presumes you inferred certain things, or will infer them once you read this next sentence.  That’s part of what happened in the sentence pair above, but this third sentence is an even clearer example:

Fred was mad at Sam because he stole his lunch.  So he stole his lunch.  Their teacher just sighed.

Wait, what teacher?  Presumably Sam and Fred are students, and all this drama happened in school, and they are in the same class, and they brought their lunches to school, and they’ve done things like this before,…  and so on.  Those three dots contain multitudes, but even each innocuous-looking sentence break, above, contains multitudes.

This is a large part of the reason why unrestricted natural language understanding is so difficult to program.  No matter how good your elegant theory of syntax and semantics is, there’s always this annoying residue of pragmatics, which ends up being the lower 99% of the iceberg.  You can wish it weren’t so, and ignore it, which is easy to do because it’s out of sight (it’s not explicitly there in the letters, words, and sentences on the page; it’s lurking in the empty spaces around the letters, words, and sentences).  But lacking it, to any noticeable degree, gets a person labeled autistic.  They may be otherwise quite smart and charming (such as Raymond in Rain Man and Chauncey Gardiner in Being There), but it would be frankly dangerous to let them drive your car, mind your baby, cook your meals, act as your physician, manage your money, etc.  And yet those are the very applications the world is blithely handing over to severely autistic AI programs!


I’ll have some more to say about those worries in future columns, but my focus will be on constructive things that can be done, and are being done, to break the AI brittleness bottleneck once and for all. 


Knowing a lot of facts is at best a limited substitute for understanding.  Google is proud that its search bar knows 70 billion things, and it should be proud of that.  By contrast, the Cyc system only knows 15 million things, and relatively few of them are what one would call specific facts; they’re more general pieces of common sense knowledge like “If you own something, you own all its parts” and “If you find out that someone stole something of yours, you’re likely to be mad at them” and “If someone wants something to be true then they’re more likely to act in ways that bring that state of affairs about”.  Google’s 70 billion facts can answer 70 billion questions, but Cyc’s 15 million rules and assertions can answer trillions of trillions of trillions of queries – just like you and I can – because it can reason a few steps deep with/about what it knows.  Cyc’s reasoning “bottoms out” in needing specific facts, such as the birth date of Donald Trump, but fortunately those are exactly the sorts of things it can “look up” just as you or I would.
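To make that combinatorial point concrete, here is a toy sketch in Python – with invented facts, and nothing like Cyc’s actual engine or language – of how one general rule, “if you own something, you own all its parts,” combines with a handful of part-of facts to answer queries nobody wrote down explicitly:

    # Toy knowledge base: a few ground facts, stored as (relation, arg1, arg2) tuples.
    FACTS = {
        ("owns", "Fred", "Bicycle1"),
        ("partOf", "Wheel1", "Bicycle1"),
        ("partOf", "Spoke7", "Wheel1"),
    }

    def owns(kb, person, thing):
        """Backward-chain on one general rule:
        if person owns some whole and thing is part of that whole, person owns thing."""
        if ("owns", person, thing) in kb:
            return True
        return any(rel == "partOf" and part == thing and owns(kb, person, whole)
                   for (rel, part, whole) in kb)

    print(owns(FACTS, "Fred", "Spoke7"))   # True, reached in two reasoning steps

Each general rule multiplies, rather than merely adds to, the set of questions the same stock of facts can answer – which is the informal arithmetic behind “trillions of trillions of trillions” above.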

There are several important elements of this approach to Cognitive Computing that I haven’t talked about here, and plan to cover in future columns:

  • The importance of having a formal representation for knowledge, so that inference can be automated.  While it’s vastly more convenient to just leave things in natural language, natural language is vastly more difficult for programs to understand.  There are of course some things that programs can do without really understanding, like recognize patterns and make recommendations, and AI today is a wonderland of exactly such applications.
  • The importance of the formal representation language being expressive enough to capture all the sorts of things people say to each other, the things one might find in an ad or a novel or a news report.  Drastically less expressive, simpler formal languages (such as RDF and OWL) focus on what can readily be done efficiently, and then may try to add on some tricks to recoup a little of the lost expressivity, but that’s a bit like trying to get to the moon by building taller and taller towers.  By contrast, what I’ve found necessary is to force yourself to use a fully expressive language (higher order logic) and then try to add on some tricks to recoup lost efficiency.  (A short illustrative sketch of that expressivity gap appears just after this list.)
  • The general mechanism and the specific steps by which a system like I’ve been talking about can be “told about” various online data repositories and services, so that it can know how to access them when it’s appropriate and necessary to do so.  For instance, it needs to understand, just like you and I do, what sorts of queries Google is and is not likely to be able to answer.
  • Most significant of all, for Cognitive Computing in the coming decade, is how this sort of “left brain” deduction, induction, and abduction can collaborate and synergize with the sort of “right brain” thinking-fast that all the rest of AI is, today.
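Here is the expressivity sketch promised in the second bullet.  The notation below is invented for illustration (it is not RDF, OWL, or CycL syntax); the point is only the qualitative gap between ground binary facts and quantified, nested rules:

    # A triple store is comfortable with ground, binary facts:
    triples = [("HarrySTruman", "heightInInches", 69)]

    # But a commonsense rule needs variables, quantifiers, logical connectives, and
    # nesting.  One (invented) way to write "if you own something, you own all its
    # parts" as a nested expression:
    rule = ("forAll", ("?x", "?y", "?z"),
            ("implies",
             ("and", ("owns", "?x", "?y"), ("partOf", "?z", "?y")),
             ("owns", "?x", "?z")))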

To give just a sketch of what I mean by this synergy, I’ll close with an example from a project we did a few years ago for the National Library of Medicine (one of the NIH institutes).  Medical researchers were initially very optimistic about genome-wide association studies building up databases of correlations between patient DNA mutations and the particular disease or condition they were suffering from.  But they lamented the high “noise” level in that, because it couldn’t separate correlation from causation.  Cyc was able to reason its way to suggest various alternative plausible causal pathways, in some A↔Z correlation cases, where there were independently testable hypotheses here and there in the pathway, hypotheses which could then be tested statistically by going back to patient data.  It’s that back and forth reasoning, using big data and causal reasoning and big data again, which is the main path forward I see for achieving true Cognitive Computing – neither hemisphere of our brain suffices, on its own, and neither of the two AI paradigms suffices, on its own.


Not Good As Gold: Today's AIs Are Dangerously Lacking In AU (Artificial Understanding)


We would not be comfortable giving cognitively impaired people real-time decision-making authority over our family members’ health, our life savings, our cars or our missile defense systems. Yet we are hurtling in that direction with today’s cognitively impaired AIs.

They — those people, and those AI programs — have trouble doing multi-step abstract reasoning, and that limitation makes them brittle, especially when confronted by unfamiliar, unexpected and unusual situations.

Don’t worry, this is not one of those “Oh, woe is us!” AI fear-mongering articles of the sort we have been graced with by such uniquely qualified AI researchers as Henry Kissinger, Stephen Hawking, and Elon Musk. Yes, we are moving toward a nightmarish AI crisis, but no, it is not unavoidable: there is a clear path out of this devil’s bargain, and I’m going to articulate exactly what it is and how and when it’s going to save us.

Before I can explain that though, I need to say a few more words about today’s AIs. 

“I knew, and worked on, machine learning as a Stanford professor in the 1970s, decades before it was a new thing.”

I knew, and worked on, machine learning as a Stanford professor in the 1970s, decades before it was a new thing. Machine learning algorithms have scarcely changed at all in the last 40 years.[1] But several big things have happened in that time period that have breathed new life into applying that old AI technology:

(i) Computers are a hundred thousand times faster, and on top of that the video game market has given birth to cheap, fast, parallel GPUs which turned out to be well-matched to the voracious appetites of these AIs;

(ii) Data storage costs and transmission speeds have likewise improved by orders of magnitude;

(iii) The internet has grown up (well, at least grown); and

(iv) “Big data” has gone from scarce to scarcely avoidable. This means there are lots of patches of fertile ground, now, for successfully applying machine learning; I don’t need to survey them here — just try to avoid hearing about them these days.

“Machine Learning has changed much less than, say, the Honda Accord since 1982.”

Current AIs can form and recognize patterns, but they don’t really understand anything. That’s what we humans use our left brain hemispheres for — what Dan Kahneman calls “thinking slow.” That’s the other kind of thinking we do, and that’s also the other kind of AI technology that exists in the world.  It involves representing pieces of knowledge explicitly, symbolically, to build a model of (part of) the world, and then doing logical inference step by step to conclude things which can then become the grist for even deeper logical reasoning.  Think, e.g., of the Sherlock Holmes character’s dazzling displays of deduction.[2]

For most of this article, I want to talk about symbolic representation and reasoning (SR&R) — the “other AI” besides machine learning. So let’s try to contrast those two types of thinking; those two types of AI.

ML is a form of statistical inference:  multi-layer neural networks trained on big data.  By contrast, what I’m talking about here is knowledge-based inference. It’s much like the difference between correlation and causation.  Here are a handful of examples to illustrate the difference between these two very different types of thinking:

  • A police officer may statistically profile a person based on his/her appearance and body language (correlation), versus actually investigating and deducing the person’s guilt or innocence (causation).

  • Until WWII, the “engineering” of large bridges was done mostly by imitating other bridges and just hoping for the best (correlation). Today, we understand the material science of stress, load, elasticity, shear, etc., so mechanical engineering models can be built that prevent tragedies like those that the purely statistical approach led to (e.g., the 1940 collapse of the Tacoma Narrows Bridge) and can go back and analyze what went wrong in those cases (causation).

  • 700 years ago, sometime between Giotto and Brunelleschi, the creation of perspective in paintings went from a mysterious art, only transmitted via years of apprenticeship, to a well-understood technique mechanically created via horizon lines and geometric projections.

  • For millennia, people observed that if two non-redheads had a red-haired child, then about ¼ of all their children would turn out to be red-headed. Now that we understand genetics, we understand how and why an “rR” carrier of the recessive red-hair gene “r” has zero chance of having red hair themselves, but if two carriers have offspring, then on average half their children will be “rR” carriers and one quarter of their children will actually be “rr” and therefore have red hair.

  • Amazon or Netflix might strongly recommend Private School because you enjoyed the first two Hannah Kline Mysteries, but your friend — who knows that you just lost a baby, and that that’s an element of Private School — would understand it’s a terrible recommendation for you now.

It may surprise you that both types of reasoning have been harnessed in AIs since the 1970s.  Both paradigms looked promising, at first, back then, but then each approach encountered enormous obstacles which stalled its progress for several decades.  Several things have changed, in the last 50 years, which have made it cost-effective, finally, to revisit — and harness — both sources of power.

I’ve already described the changes that led to a resurgence of ML applications ((i)-(iv), above). What has changed that leads me to say that the knowledge-based AI approach — what used to be called “expert systems” — is viable, finally?

It turns out that there weren’t four roadblocks and missing technologies in this case; there were about 150 (in addition to the need for 100,000x faster/cheaper computers and storage, and access to online data). One by one, large-scale engineering efforts have found adequate engineering solutions (not scientific breakthroughs) for all 150!  I won’t go through them all, but here are a handful of the more important problems, and for each, a description of the engineering solution that successfully tamed it:

“It turns out that in 1969 there were 150 different roadblocks to knowledge-based expert systems succeeding; one by one each has since been removed by treating it as a large-scale engineering (not scientific) problem to overcome.”
  1. Reusability. Each new “expert system” application had to be built from scratch.  And each of those was a long, labor-intensive process, so expert system knowledge engineers inevitably “cut corners” in ways that made their IF/THEN rules almost never reusable in later systems.  For instance, one EMYCIN-based system about blood diseases had rules which acted as though all of a patient’s data was obtained on the same day; a different EMYCIN-based system about pulmonary dysfunction needed rules that carefully indicated what measurements were taken exactly when (e.g., tracking the patient’s smoking history over time).  Each system performed well, but simply unioning those two rule-sets would have led to horrible errors of commission when trying to get that mash-up to perform either application task. The large-scale engineering approach to remediating this problem was to painstakingly identify, collect, and formalize — once and for all, thankfully — the tens of millions of general rules of good guessing and good judgment that comprise human common sense and human expert knowledge in dozens of different application domains.  This is a case of making a problem harder in order to solve it: for the last 35 years that Manhattan-Project-like effort has occupied a team of over a hundred knowledge engineers (whom I dubbed “ontologists” back then) — that’s millions of person-hours of writing and testing and debugging IF/THEN rules.  The requirement was that the growing system continue to perform well on all of its past and present domains, plus common sense, and that requirement in turn forced all the rules to be stated in a sufficiently general, domain-independent, and hence reusable form.[3]

  2. Efficiency. Automated logical reasoning (running a set of IF/THEN rules, doing “Resolution” theorem-proving on them) was painfully slow, even when there were only a few hundred rules, and a few hundred “facts” (ground assertions, such as a patient’s medical data). The theory behind this automatic theorem proving was well understood, but in practice (especially with tens of millions of rules and billions of facts) it almost never would have returned answers to questions before the heat death of the universe.[4]

“We could separate the epistemological problem — what should the system know? — from the heuristic problem — how can the system represent that knowledge in a way that enables inference to happen fast (i.e., fast enough) on it?”

There were two independent large-scale engineering approaches that, working together, finally remediated this problem.  The first half of the solution was inspired by the insight that we could separate the epistemological problem — what should the system know? — from the heuristic problem — how can the system represent that knowledge in a way that enables inference to happen fast (i.e., fast enough) on it? While every rule can and should be represented in a nice, clean, logical “epistemological level” language (more on this later, and in my next posting), on which a general theorem prover could operate, it is also possible to redundantly represent the same rule or fact in many ways, each with its own idiosyncratic data structures and algorithms (that operate on those data structures) for doing certain kinds of reasoning super-fast.  By 1989, we had identified and implemented about 20 such special-case reasoners, each with its own data structures and algorithms.  Today there are over 1100 of these “heuristic level reasoning modules.”  These work together cooperatively, as a sort of community of agents, to obviate the need for a general (but hopelessly slow) theorem prover.  Some of these stylized reasoning agents are narrowly domain-dependent, such as one that knows how to efficiently balance a chemical equation, and some are very general, such as one that caches transitive binary relations like during and subOrganizations.
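As a flavor of what one such heuristic-level module might look like — a purely illustrative Python sketch with invented names, not Cyc’s actual code — here is a specialized reasoner that answers transitive-relation queries by cached graph reachability instead of general theorem proving:

    from functools import lru_cache

    # Hypothetical subOrganizations facts: each organization mapped to its parent organizations.
    EDGES = {
        "QualityTeam": {"ManufacturingDivision"},
        "ManufacturingDivision": {"ExampleCorp"},
    }

    @lru_cache(maxsize=None)
    def sub_organization_of(org, ancestor):
        """Is org (transitively) a sub-organization of ancestor?  Answered by cached reachability."""
        parents = EDGES.get(org, set())
        return ancestor in parents or any(sub_organization_of(p, ancestor) for p in parents)

    print(sub_organization_of("QualityTeam", "ExampleCorp"))   # True, with no general inference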

That sped up reasoning, but frustratingly it was still the case that one could speed it up even more by excising portions of the knowledge base — i.e., by removing parts of its brain! This radical surgical approach seems like a step in the wrong direction, whether one is dealing with AI programs or human beings.  So why doesn’t our having more knowledge slow us all down, all the time? We don’t become an expert at some task by forgetting everything we know about lots of other topics.

So what happens with humans, as we become an expert at some complicated task?  We learn the new domain concepts, rules, and so on, but we also learn new rules of thumb, rules of good guessing, rules of good judgment for how to approach problems in that domain, how to prioritize and so on.

We’ve been able to take that same approach successfully with our symbolic AI reasoners:  Whenever the system slows down, we just add more knowledge, more rules, to speed it up.  If it’s working in some domain application, we ask the human experts to look over its step-by-step reasoning trace, to diagnose where it was wasting time. Typically, there was some missing rule of thumb that let the expert get to an answer in a few seconds, whereas it took the program minutes to deduce the same answer. Adding that meta-level knowledge speeds the program up, incrementally approaching both the correctness and the efficiency of the best humans who solve that sort of domain problem.

“The largest symbolic representation and reasoning system today spends about 90% of its time working on one or another application domain problem, 9% of its time sitting back and doing meta-level tactical reasoning, and 1% of its time sitting even farther back and metaphorically puffing on its Meerschaum pipe and doing meta-meta-level strategic reasoning.”

In other words, we keep in the system’s knowledge base many meta-level rules that tactically plan and coordinate an attack on the current problem, much like a quarterback does in football.  Sometimes we even need to get experts to articulate their meta-meta-rules — strategies — that monitor how the tactician is doing and, like a sideline coach, decide when it’s time to pull the current quarterback from the game and let some other tactician take over.  The largest symbolic representation and reasoning system today spends about 90% of its time working on one or another application domain problem, 9% of its time sitting back and doing meta-level tactical reasoning, and 1% of its time sitting even farther back and metaphorically puffing on its Meerschaum pipe and doing meta-meta-level strategic reasoning.

So adding more and more meta-knowledge, then, is the basis of the second way that symbolic AI systems can be massively sped up.
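To make the tactician/quarterback idea concrete, here is a hedged, minimal sketch (invented tactic and meta-rule names; not how any production reasoner is actually organized) of meta-level control that consults “rules about which rules to try” before spending base-level effort:

    # Meta-rules: for a given problem type, how promising is each tactic?  (Invented values.)
    META_RULES = {
        ("geography", "look_up_fact"): 10,
        ("geography", "exhaustive_search"): 1,
    }

    def look_up_fact(problem):
        return {"capital of France": "Paris"}.get(problem)   # a fast, narrow tactic

    def exhaustive_search(problem):
        return None   # stand-in for a slow, general fallback tactic

    def solve(problem, problem_type, tactics, budget=2):
        """Rank tactics by the meta-rules, then try them in order within a budget."""
        ranked = sorted(tactics,
                        key=lambda t: META_RULES.get((problem_type, t.__name__), 0),
                        reverse=True)
        for tactic in ranked[:budget]:
            answer = tactic(problem)
            if answer is not None:
                return answer
        return None

    print(solve("capital of France", "geography", [exhaustive_search, look_up_fact]))   # Paris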

  3. Inconsistency. Rule-based systems did not deal well with the inevitable inconsistencies of rich, real-world information: once an expert system concluded False, bad things inevitably happened.[5] But the real world is full of inconsistency! How can we reconcile this with the need for knowledge bases to be logically consistent if we’re going to use anything like logic to infer new content?

To remediate this problem of ubiquitous inconsistency, we had to replace the requirement of global consistency of the knowledge base with the notion of local consistency.[6]  Every rule and ground assertion in the knowledge base then has n labels or tags that identify what portion of this n-dimensional knowledge base that rule or assertion holds true in.  A rule or assertion might be true at some time, in some place, in someone’s belief system or ideology, up to some level of granularity, etc.  Each of those — time, space, level of granularity, etc. — is a dimension of context-space, a dimension of the knowledge base. This explicitly models the context in which the rules’ premises and conclusions are true, and that ripples out to conclude, mechanically and automatically, in what context a final answer can and should be safely assumed to be valid.  For instance, the standard set of modern rules of thumb about bridge-building are going to get you into trouble if you’re bridging an active volcano in Hawaii, or you’re bridging a fissure on Venus, or you are a child trying to bridge from your bed to your chair.

John McCarthy, Guha, and others working on our team also had to figure out a way for our symbolic AI to reason not by theorem-proving – manipulating rigid “True” and “False” tokens – but rather by something called argumentation: coming up with all the pro- and con- arguments it possibly can, in each situation, eliminating the self-contradictory ones, and then reasoning about the remaining arguments to decide what to believe in that context. Each context, also sometimes called a “micro-theory,” is a first-class object in the system’s ontology of terms, and can be reasoned about just like oil wells and diseases.  That enables the symbolic AI to carry out the necessary meta-level reasoning it needs to: reasoning about arguments.
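A minimal sketch of the context-tagging idea (the dimension names, tags, and “covers” test below are invented for illustration; real context logics are far richer): every assertion carries tags for the portion of context-space in which it holds, and a query only sees assertions whose tags cover the query’s context.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Assertion:
        statement: str
        era: str      # a "time" dimension tag, e.g. "ModernEra"
        place: str    # a "space" dimension tag, e.g. "Earth" or something more specific

    KB = [
        Assertion("Standard steel-truss rules of thumb apply to bridge design", "ModernEra", "Earth"),
        Assertion("The bridge deck must tolerate sustained lava-field heat", "ModernEra", "ActiveLavaField"),
    ]

    def holds_in(a, era, place):
        # Crude "covers" test: an Earth-wide assertion applies everywhere on Earth.
        return a.era == era and a.place in (place, "Earth")

    def ask(kb, era, place):
        return [a.statement for a in kb if holds_in(a, era, place)]

    print(ask(KB, "ModernEra", "ActiveLavaField"))   # both assertions apply in that context
    print(ask(KB, "ModernEra", "Kansas"))            # only the generic rule of thumb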

  4. Automatically using “big data” as though it were part of the knowledge base. The general rules in a symbolic representation and reasoning system need to “run on data” – individual patient data, stock data, oil well sensor readings, etc.  And most of that data in the world “out there” is in the form of database content or accessible via web services, where the meaning of the data is a combination of the data itself plus the meaning of the relations, search fields, etc.  A human, or a custom-built application program, interprets the data accordingly; e.g., in one table of one relational database, a cell with the number “48.3” means “the employee represented by this row has an annual salary of USD $48,300.”  Often that slightly interpreted data is referred to as “information.”  The human (or custom program) further contextualizes that information: e.g., that entire database table contains information which was true in 2014, or represents what some company’s marketing department today wants potential customers to believe. That multi-step interpretation process needs to happen, somehow, before the results of a symbolic knowledge representation and reasoning system can and should be trusted.  I.e., there needs to be some semantic mapping between the terms in a symbolic knowledge representation and reasoning system’s ontology, and the schema elements in third-party information sources such as databases and web services.  Without that, the system is like a human who, no matter how smart they are, is limiting themselves by never accessing the wealth of relevant information available online.

To remediate this in the case of small data (say hundreds of megabytes or less) one can – once the above ontology alignment has been done – simply import 100% of that data into the knowledge base.  But in the case of terabytes/petabytes/exabytes of data that approach becomes, respectively, undesirable/unacceptable/unimaginable. To remediate this problem of big data, the knowledge-based AI system can have rules which effectively say “to find out the number of inhabitants of any geopolitical US entity, generate the following type of SQL query, where the table is the NGA-pop table, the relation is POP, etc., and ask that of the following database which can be reached via the following protocol…”  In other words, the knowledge-based AI system remotely queries relevant third party information sources when/as appropriate, just as you or I or a subject matter expert would.
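Here is a minimal sketch of such a mapping rule in Python, using an in-memory SQLite table as a stand-in for the remote source; the table name, column names, and the sample row are all invented for illustration:

    import sqlite3

    # The "rule": a recipe for turning a predicate into a query against a remote source.
    POPULATION_MAPPING = {
        "predicate": "numberOfInhabitants",
        "sql_template": "SELECT pop FROM nga_pop WHERE entity_name = ?",
    }

    def number_of_inhabitants(conn, geopolitical_entity):
        """Answer (numberOfInhabitants <entity> ?n) by remote query rather than KB lookup."""
        row = conn.execute(POPULATION_MAPPING["sql_template"], (geopolitical_entity,)).fetchone()
        return row[0] if row else None

    # Tiny in-memory stand-in for the third-party database (illustrative data only).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nga_pop (entity_name TEXT, pop INTEGER)")
    conn.execute("INSERT INTO nga_pop VALUES ('Exampleville', 12345)")
    print(number_of_inhabitants(conn, "Exampleville"))   # 12345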

  5. Explanation to end users (and browsing/editing/querying of the KB by end users). The vast majority of end users of these symbolic representation and reasoning AIs won’t want to make the effort to – and even if they tried wouldn’t be able to – make heads or tails of some long sequence of IF/THEN-rule-firings, especially if those rules are written in some sort of logical language. But this functionality — explanation of the system’s line of reasoning that led it to an answer — can’t be omitted: it is exactly that step-by-step reasoning chain which users need in order to audit, and therefore trust, the system. In cases where the user disagrees with the system’s reasoning, if he or she can follow the line of reasoning then he or she is easily able to offer feedback and provide his or her own knowledge to override and improve the system (at least in that context or any context in which that user is trusted).

So, for multiple reasons, it is imperative that each long trace of formal rule-firings can be automatically converted, somehow, into a terse, readable, understandable explanation, ideally in some natural language like English. So how is the remediation of this coming?  Well, there is bad news and good news. The bad news:  Unfortunately, open-ended unrestricted NLU (complete automatic translation of a natural language text into a formal representation language, without throwing away a lot of the meaning) is still years away from being a reality – the current state of the art is to recognize entities in text, recognize sentiment, recognize very simple binary relations (often with important modifiers like “not” missed!), and notice degrees of co-occurrence and frequency of word combinations.  In a typical English paragraph, this throws out about 90% of the baby – the meaning of the text – with the bathwater.

“Unfortunately, open-ended unrestricted NLU (complete automatic translation of a natural language text into a formal representation language, without throwing away a lot of the meaning) is still years away from being a reality…”

“… but for NLG (natural language generation), a surprisingly simple compositional recursive algorithm succeeds quite well.”

The good news: Fortunately, what’s needed to remediate the Explanation problem is not NLU but just NLG (natural language generation), and for that a surprisingly simple compositional recursive algorithm succeeds quite well.  E.g., the logical expression (biologicalMother X Y) can be translated into English as “Y is the biological mother of X,” where X and Y are, recursively, the translations of the expressions X and Y.  For example, the nested expression (biologicalMother (winnerOfIn USPresidentialElection 2016) MaryAnneMcLeod) turns into “Mary Anne McLeod is the biological mother of the winner of the 2016 US Presidential Election,” which is a bit stilted but fully understandable by an English speaker unfamiliar with formal logic.  This also forms the heart of an interface whereby such individuals can query, browse, and edit the knowledge base.
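A minimal sketch of that compositional recursion in Python (the template strings and lexicon below are invented for illustration, not the system’s actual generation rules):

    # English templates keyed by predicate; {0}, {1} are the recursively generated arguments.
    TEMPLATES = {
        "biologicalMother": "{1} is the biological mother of {0}",
        "winnerOfIn": "the winner of the {1} {0}",
    }
    LEXICON = {
        "MaryAnneMcLeod": "Mary Anne McLeod",
        "USPresidentialElection": "US Presidential Election",
    }

    def generate(expr):
        """Recursively turn a nested logical expression into an English phrase."""
        if not isinstance(expr, tuple):            # a constant term: look up its English name
            return LEXICON.get(expr, str(expr))
        predicate, *args = expr
        return TEMPLATES[predicate].format(*[generate(a) for a in args])

    expr = ("biologicalMother",
            ("winnerOfIn", "USPresidentialElection", 2016),
            "MaryAnneMcLeod")
    print(generate(expr) + ".")
    # Mary Anne McLeod is the biological mother of the winner of the 2016 US Presidential Election.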

The small residue of cases where this compositional approach fails – commonly occurring cases that lead to confusing or bizarre English sentences being generated – can be handled by idiosyncratic rules that generate natural-sounding glosses for those logical expressions.

This simple compositional approach to NLG also performs poorly on very long sentences that can be dozens of words long.  One way to remediate this is to automatically break them into a set of smaller logical pieces – the nested components of the compound logical expression – and then translate those short logical expressions into short natural language sentences one at a time.  This approach works but generally leads to translations where a single long sentence gets turned into a series of several short sentences that sound a bit like My First Reader but are nevertheless both understandable and complete (i.e., they do not omit any of the intended content which is present in the logical form of the representation).

Next time:  The other half of the story.  Everything I’ve discussed so far is only half of my argument about when and how we will have AIs with functioning left brain hemispheres, AIs which are not brittle in the face of novel situations.  In my next posting, I will go through the other half of the argument, the teaser for which is this:  

  • Some of the best AI systems today do have and make heavy use of some sort of symbolic representation and reasoning engine, but the representations of knowledge that they use (triple stores, RDF/OWL ontologies, knowledge graphs, etc.) are much too shallow.  They make those choices for efficiency reasons, but the result is a lot like the joke “We’re lost, but we’re making good time!”  Researchers and application builders tolerate their AI systems having just the thinnest veneer of intelligence, and that may be adequate for fast internet searching or party conversation or New York Times op-ed pieces, but that simple representation leads to inferences and answers which fall far short of the levels of competence and insight and adaptability that expert humans routinely achieve at complicated tasks, and leads to shallow explanations and justifications of those answers.

  • There is a way out of that trap, though it’s not pleasant or elegant or easy.  The solution is not a machine-learning-like “free lunch” or one clap-of-thunder insight about a clever algorithm:  it requires a lot of hard work just like all 5 of the bottleneck remediations I have discussed above, hard work involving higher order (e.g., modal) logics, writing down the formal statements in that language that capture the pragmatics of the real world (and, if we want to reason about it, the Marvel universe and other fictional worlds), and getting serious about pro- and con- argumentation.  The path is uphill and long, but it’s there, and it’s clear, and we can already see the first signs of successfully traversing it:  Yes, there are finally some AIs – AIs you’ve probably not heard about yet – on earth today that truly understand.


[1] A few tweaks have been made, such as increasing the number of hidden neural net layers, convolution, and rectified linear activation, but overall ML has changed much less than, say, the Honda Accord since 1982.

[2] which are actually something logicians call “abduction,” but let’s not worry about that yet.

[3] AI researchers started out forty years ago with object/attribute/value triples – much like today’s knowledge graphs – but it turned out to require more and more expressive logics to represent the full meaning of utterances and writings as tersely as they can be expressed in a natural language such as English.  I’ll discuss this more in my next posting.

[4] This is just another instance of W. Pascal’s well-known observation:  “In theory, there is no difference between theory and practice. But, in practice, there is.”

[5] Think of what happens in algebra when you accidentally divide by zero, or Tevye’s grappling with contradiction in Fiddler on the Roof, or almost any episode of Star Trek where a computer is inconsistent.

[6] A good analogy is how we all know that the surface of the earth is roughly spherical, but we live our everyday lives as though it were flat, and that works well for us almost all the time because it is locally flat.  In much the same way, we can organize our symbolic AI’s knowledge base into a multidimensional context space, with nearby contexts being mostly consistent with each other. As inference proceeds, it reaches farther- and farther-flung contexts, and the inevitable contradictions that are encountered are treated just as a sign to stop reasoning in that “direction.” All symbolic reasoners are resource-limited, so this is just a hint for it to “search elsewhere!”


"Doug Lenat, Ph.D., contributor, is a pioneer in artificial intelligence. Dr. Lenat is founder of the long-standing Cyc project and CEO of Cycorp, a provider of semantic technologies for unprecedented common sense reasoning. He has been a Professor of Computer Science at Carnegie-Mellon and Stanford and has received numerous honors, including the bi-annual IJCAI Computers and Thought Award (the highest honor in Artificial Intelligence). Dr. Lenat is a founder and Advisory Board member of TTI Vanguard, where he continues to co-run four conferences each year. Doug also holds the distinction of being the only individual to have served on the Scientific Advisory Boards of both Microsoft and Apple.

 

Our Team ID

Doug Lenat Ph.D.

Company

Associated Corporate Member