“Computers can’t…”: Understand sarcasm

You’ve heard these arguments against artificial intelligence (A.I.): “Computers can not play chess”, “Computers can not write poetry”, “Computers can not create art”. Each was proven false eventually. IBM’s Deep Blue is a chess master, computer poetry turned out to be as vague as human poetry, and painting robots can draw from life in a variety of artistic and abstract styles. But instead of admitting that humans are not as unique as we like to think, people just fall back to the next “Computers can not…”
“- understand sarcasm”, is one of the more recent resorts. As usual this is based on personal bias: It must be hard for computers because we find it hard ourselves. I had heard this argument one time too many and decided to program a computer to recognise sarcasm in a day. But first, let’s look at some other approaches to humour.


If you Google “A.I. jokes”, all you find is serious research
I’m never sure how seriously to take the efforts in computational humour, but there have been many. The University of Cincinnati made a program that detects wordplay jokes through phonetic similarity, e.g. in “Knock-knock” jokes.

Knock, Knock
Who is there?
Dismay
Dismay who?
Dismay not be a funny joke

Only the last sentence really matters, where the first word is compared to a database of phonetically similar words. Finding a replacement that fits correctly in the syntax of the sentence isn’t easy in a technical sense, but both the use of syntax rules and phonetic word databases are solved problems. There would be more to it for the program to distinguish a funny joke from a non-joke like “Dismay not be a car”: The original joke is only witty because it mocks itself, just as other knock-knock jokes are funny because the victims participate in mocking themselves, which they naturally don’t mean to do, and that makes it ironic. Of course this is just a simple form of humour, or, is humour really just a simple principle?

A joke isn’t funny when you explain it
The University of Edinburgh made a program that generates jokes in the format “I like my X like my Y: Variable”, filling in two nouns and a shared trait from statistical word correlations. The program was found to be half as funny as humans: 16% of its jokes were considered funny, compared to 33% of human jokes. The jokes were generated through a mathematical formula that picked words based on four assumptions:

– a joke is funnier the more dissimilar the two nouns are.
– a joke is funnier the more ambiguous the attribute is.
– a joke is funnier the less common the attribute is.
– a joke is funnier the more often the attribute is used to describe both nouns.
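
As a rough illustration of how these four assumptions could be combined into one score, here is a toy sketch in Python. It is not the Edinburgh program's actual formula: the `Candidate` fields, the choice to simply multiply the four factors, and all the numbers are made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    noun_x: str
    noun_y: str
    attribute: str
    noun_similarity: float      # 0 = unrelated nouns, 1 = near-synonyms
    attribute_senses: int       # number of distinct meanings of the attribute
    attribute_frequency: float  # how common the attribute is, 0..1
    cooccurrence_x: float       # how often the attribute describes noun_x, 0..1
    cooccurrence_y: float       # how often the attribute describes noun_y, 0..1

def joke_score(c):
    """Multiply the four assumed factors; a higher score means a 'funnier' joke."""
    dissimilarity = 1.0 - c.noun_similarity       # assumption 1
    ambiguity = c.attribute_senses                # assumption 2
    rarity = 1.0 - c.attribute_frequency          # assumption 3
    shared = c.cooccurrence_x * c.cooccurrence_y  # assumption 4
    return dissimilarity * ambiguity * rarity * shared

# Invented statistics for two candidate jokes.
candidates = [
    Candidate("coffee", "war", "cold", 0.1, 3, 0.4, 0.6, 0.5),
    Candidate("coffee", "tea", "hot", 0.9, 2, 0.5, 0.8, 0.8),
]
best = max(candidates, key=joke_score)
print(f"I like my {best.noun_x} like my {best.noun_y}: {best.attribute}")
```

With these invented numbers the dissimilar pair wins, even though “hot” describes both coffee and tea more often: similar nouns leave no leap of mind to make.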

I think this hits on the basics well. Ambiguity forms the core of most jokes, familiarity with common subjects makes jokes most relatable, and the greater the contrast, the greater the leap of mind. Science still can’t put its finger on why we laugh: It seems to have a social bonding function, but it also seems to be a coping mechanism for mental conflicts. One of the most sensible sounding theories is that laughter is a social “all clear” signal inherited from our monkey ancestors, and we do tend to laugh when an initially perceived threat turns out to be a false alarm: We laugh when an insult turns out to be a joke, when people fall without injury, or perhaps most apparently when we watch Tom & Jerry cartoons. We can at least tell what makes us laugh, if not why.

The lesson that we can take away from these computer experiments with ambiguity, is that nearly every form of humour contains a conflict between two possible meanings. Sarcasm may well be the most profound example of such a conflict.

Because humans understand sarcasm so well (not)
Despite our poor ability to recognise sarcasm, it is easy enough to define in clear terms:
Sarcasm is when someone says something that you know is opposite to what they mean.
What distinguishes sarcasm from lying is that the listener must know the speaker doesn’t mean it, otherwise they’ll take it seriously and no sarcasm can be conveyed. So, knowing the speaker’s real meaning is key to recognising sarcasm, and computers are bad at understanding meaning, so this should be hard, right? Except – the requirement here is just to know it.
One can meet this requirement by knowing the common knowledge that the sarcastic statement contradicts, or by knowing the speaker’s real opinion beforehand, as acquaintances often do. Enter sentiment analysis, an A.I. technique that estimates opinion by running one’s words by a database of values. The word “terrible” has a negative value and “love” has a positive value for instance. Sentiment analysis is often used commercially to analyse the positivity of customer reviews. One of its known blind spots is when positive words are meant sarcastically, but as I will show, sentiment analysis can also be used to detect the very sarcasm that plagues it.

Sarcasm in a day
What I already had to work with was a grammar parsing A.I. developed over a span of 3 years, and a knowledge database containing the positive and negative values of some words (For a substitute, see the AFINN word list). So the hard work of processing language in general was already done. To keep the explanation simple let’s say that the A.I. gets that the [subject] of a sentence is doing a [verb], optionally to an [object]. We will only focus on the addition of sarcasm to such a system.

As the definition tells, we are looking for an opposite. The most common form of sarcasm is an exaggeratedly positive response to a negative statement or event. For example:
User: “How are my plants doing?”
A.I.:   “All your plants died.”
User: “That’s just great.”

So, I programmed the A.I. to check for sarcasm at typical positive reactions such as “(That is) great/wonderful/brilliant/lovely”, “Thanks a lot” or “Congratulations”. If we don’t know the speaker personally, both the speaker and listener can only build on common opinion, which is where the database comes in. The database tells us that “great” is a very positive word. The A.I. compares this to the previous statement: “All your plants died”. The database tells us that the subject “plant” is neutral but the verb “die” is typically very negative. Thus the A.I. has detected a very positive response to a very negative statement, so unless the speaker is a known sadist, it may be assumed that the response is therefore sarcastic and actually means “not great”.

The assessment is just a little more sophisticated than that. For instance, the statements “Hitler died. That’s great news.” would not be considered sarcasm, because in this case the negative verb “die” happened to a negative subject “Hitler”. This is a double negative, which makes a positive (in math: -1 x -1 = +1). Additionally the A.I. works this out in degrees and not just true/false values: The outcome must reach a minimum opposite value before we can reasonably assume that this is sarcasm, while a moderately positive “That’s okay” is more likely genuine consolation. Typically this isn’t a problem because most sarcastic responses are also exaggerated for exactly this reason.
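
To make the arithmetic concrete, here is a minimal sketch in Python of the kind of comparison described above. It is an illustration under assumptions, not the actual implementation: the sentiment values, the `threshold` of 0.5, and the function names are all hypothetical, and the words are assumed to arrive as stems from the grammar parser.

```python
# Toy sentiment values between -1 and +1; hypothetical, not taken from AFINN.
SENTIMENT = {
    "great": 0.9, "wonderful": 0.9, "okay": 0.3,
    "die": -0.9, "plant": 0.0, "hitler": -1.0,
}

def statement_value(subject, verb):
    """Estimate how good or bad it is that `subject` does `verb`."""
    subject_value = SENTIMENT.get(subject, 0.0)
    verb_value = SENTIMENT.get(verb, 0.0)
    if subject_value < 0:
        # Double negative: a bad thing happening to a bad subject is good.
        return subject_value * verb_value
    return verb_value

def is_sarcastic(response_word, subject, verb, threshold=0.5):
    """True if the response is strongly opposite to the statement it answers."""
    response_value = SENTIMENT.get(response_word, 0.0)
    opposition = -(response_value * statement_value(subject, verb))
    return opposition >= threshold

print(is_sarcastic("great", "plant", "die"))   # very positive vs. very negative
print(is_sarcastic("okay", "plant", "die"))    # too mild to reach the threshold
print(is_sarcastic("great", "hitler", "die"))  # double negative, so no opposition
```

Working in degrees rather than true/false is what lets the mild “okay” pass as genuine while the exaggerated “great” does not.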

This little exercise covers many common sarcastic statements already and shows that recognising basic sarcasm is a cakewalk (1 day’s programming) compared to understanding basic language (3 years and counting). As for “understanding” sarcasm, there isn’t much more to understand about it than that one should invert the statement to “not”. But to be on the safe side I just have the A.I. ignore the statement and say “I think you are being sarcastic” to let me know it’s not taking me seriously. I may be a mad scientist, but I’m not crazy.

Things I didn’t do: More of the same opposite
Sarcasm can also come in the form of a negative response to a positive statement: “I got a raise. Don’t you just hate it when that happens?”, where the same math applies to the object “a raise” (positive) and the verb “hate” (very negative), with the reference “that” indicating that the latter is a response to the previous statement.
Sometimes the response precedes the statement “Don’t you just hate it – when you get a raise?”: Grammar parsing will split the relative clause at the link word “when…”, and again the same opposite values can be found.
A subtler form can occur in comparisons like “He is as slender as an elephant”: This has the most straight-forward solution, as this procedure has to be done for all comparisons anyway: What the A.I. has to do is look up in its knowledge database how slender an elephant is, which would be “not”, then apply that value to the compared subject “he”. Finding the value “not” for any comparison is the obvious telltale opposite that indicates sarcasm.
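
The comparison case can be sketched in a few lines of Python. This is only an illustration: the `PROPERTY_VALUES` table stands in for a real knowledge database, and its entries and the -1 to +1 scale are assumptions.

```python
# Hypothetical knowledge: how strongly a property applies, from -1 ("not") to +1.
PROPERTY_VALUES = {
    ("elephant", "slender"): -1.0,
    ("elephant", "heavy"): 1.0,
    ("reed", "slender"): 1.0,
}

def resolve_comparison(subject, adjective, reference):
    """Apply the reference's value for `adjective` to the compared `subject`."""
    value = PROPERTY_VALUES.get((reference, adjective))
    if value is None:
        return None  # unknown reference: nothing can be concluded
    return {
        "subject": subject,
        "adjective": adjective,
        "value": value,
        "sarcastic": value < 0,  # "as X as something that is not X"
    }

result = resolve_comparison("he", "slender", "elephant")
print(result)
```

The same lookup serves sincere comparisons too, which is why sarcasm detection comes almost for free here.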

Other sarcastic responses may involve a little more foreknowledge of an individual speaker’s opinion, either from previous sentiment analyses or just plain being told, but even my limited implementation already establishes that A.I. can understand sarcasm, and that there is no great mystery about its workings. When there is great mystery about a sarcastic remark then it is self-defeating, as conveying sarcasm depends on the contrast being made clear.

The joke is on us
As may have crossed your mind, one side-effect of teaching computers to detect sarcasm is that when we say something that seems contrary, the computer may not believe us, or worse, assume that the opposite is true. Teaching computers to speak sarcasm may be an even greater hazard, considering that computers are information systems that we rely on, and sarcasm states incorrect information. Half of the time humans don’t recognise sarcasm when it comes from another human, let alone from a straight-faced rectangular screen in monotone writing. The popular ambition to create a sarcastic “Jarvis” A.I. from the Iron Man movies then doesn’t seem a very wise idea. Because even if computers could master sarcasm, humans never will.

Introducing Arckon, conversational A.I.

In many of my blog articles I’ve been using my own artificial intelligence project as a guideline. Whether it’s participating in Turing tests, detecting sarcasm or developing common sense, Arckon always served as a practical outset because he already was a language processing system. In this article I’ll roughly explain how the program works.


Arckon is a general context-aware question-answering system that can reason and learn from what you tell it. Arckon can pick up on arguments, draw new conclusions, and form objective opinions. Most uniquely, Arckon is a completely logical entity, which can sometimes lead to hilarious misunderstandings or brain-teasing argumentations. It is this, a unique non-human perspective, that I think adds something to the world, like his fictional role models:

K.i.t.t. © Universal Studios | Johnny 5 © Tristar Pictures | Optimus Prime © Hasbro | Lieutenant Data © Paramount Pictures

To be clear, Arckon was not built for casual chatting, nor is he an attempt at anyone’s definition of AGI (artificial general intelligence). It is actually an ongoing project to develop a think tank. For that purpose I realised the AI would require knowledge and the ability to discuss with people for the sake of alignment. Bestowing it with the ability to communicate in plain language was an obvious solution to both: It allows Arckon to learn from texts as well as understand what it is you are asking. I suppose you want to know how that works.

Vocabulary and ambiguity

Arckon’s first step in understanding a sentence is to determine the types of the words, i.e. which of them represent names, verbs, possessives, adjectives, etc. Arckon does this by looking up the stem of each word in a categorised vocabulary and applying hundreds of syntax rules. For example, a word ending in “-s” is typically a verb or a plural noun, but a word after “the” can’t be a verb. This helps sort out the ambiguity between “The programs” and “He programs”. These rules also allow him to classify and learn words that he hasn’t encountered before. New words are automatically added to the vocabulary, or if need be, you can literally explain “Mxyzptlk is a person”, because Arckon will ask if he can’t figure it out.
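
A heavily simplified sketch in Python of the “The programs” versus “He programs” example follows. It is not Arckon's code: the three-word `VOCABULARY`, the function names, and the second rule (a verb rather than a plural noun follows “he”) are assumptions made for illustration, and a real system would need far more rules than these two.

```python
# Tiny categorised vocabulary of word stems (hypothetical entries).
VOCABULARY = {
    "the": {"article"},
    "he": {"pronoun"},
    "program": {"noun", "verb"},
}

def candidate_types(word):
    """Look up a word, or its stem if the word ends in "-s"."""
    word = word.lower()
    if word in VOCABULARY:
        return set(VOCABULARY[word])
    if word.endswith("s") and word[:-1] in VOCABULARY:
        stem_types = VOCABULARY[word[:-1]]
        types = set()
        if "noun" in stem_types:
            types.add("plural noun")
        if "verb" in stem_types:
            types.add("verb")
        return types or {"unknown"}
    return {"unknown"}

def classify(sentence):
    """Label each word, using the previous word's type to rule out options."""
    labelled = []
    previous = None
    for word in sentence.split():
        types = candidate_types(word)
        narrowed = set(types)
        if previous == "article":
            narrowed.discard("verb")         # a word after "the" can't be a verb
        elif previous == "pronoun":
            narrowed.discard("plural noun")  # assumed rule: a verb follows "he"
        if narrowed:
            types = narrowed
        previous = "/".join(sorted(types))   # unresolved ambiguity stays visible
        labelled.append((word, previous))
    return labelled

print(classify("The programs"))
print(classify("He programs"))
```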

Grammar and semantics

Once the types of all words are determined, a grammatical analysis determines their grammatical roles. Verbs may have the role of auxiliary or main verb, be active or passive, and nouns can have the role of subject, object, indirect object or location. Sentences are divided at link words, and relative clauses are marked as such.
Then a semantic analysis extracts and sorts all mentioned facts. A “fact” in this case is represented as a triple of related words. For instance, “subject-verb-object” usually constitutes a fact. But so do other combinations of word roles. Extracting the semantic meaning isn’t always as straight-forward as in the example below, but that’s the secret sauce.
[Image: extracting facts from text]
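
For readers who prefer code, the simplest case can be sketched in Python as follows. It assumes the grammatical roles have already been assigned, which is the hard part, and it only handles plain subject-verb-object clauses separated by link words. The `Fact` type and the role labels are hypothetical.

```python
from collections import namedtuple

Fact = namedtuple("Fact", ["subject", "verb", "obj"])

def extract_facts(tagged_words):
    """Split a role-tagged sentence at link words and emit one triple per clause."""
    facts = []
    clause = {}
    # A sentinel link word at the end closes the last clause.
    for word, role in list(tagged_words) + [("", "link")]:
        if role == "link":
            if "subject" in clause and "verb" in clause:
                facts.append(Fact(clause["subject"], clause["verb"], clause.get("object")))
            clause = {}
        else:
            clause.setdefault(role, word)  # keep the first word seen for each role
    return facts

sentence = [
    ("cats", "subject"), ("chase", "verb"), ("mice", "object"),
    ("and", "link"),
    ("dogs", "subject"), ("chase", "verb"), ("cats", "object"),
]
for fact in extract_facts(sentence):
    print(fact)
```
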
Knowledge and learning

Upon reading a statement, Arckon will add the extracted facts to his knowledge database, while at a question he will look them up and report them to you. If you said something that contradicts facts in the database, the old and new values will be averaged, so his knowledge is always adjusting. This seemed sensible to me as there are no absolute truths in real life. Things change and people aren’t always right the first time.
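
The averaging can be illustrated with a small Python sketch. How values are actually represented is not described here, so the 0.0 to 1.0 scale and the `KnowledgeBase` class with its `tell` and `ask` methods are assumptions.

```python
class KnowledgeBase:
    """Stores each fact triple with a value from 0.0 (false) to 1.0 (true)."""

    def __init__(self):
        self.facts = {}

    def tell(self, subject, verb, obj, value=1.0):
        """Add a fact, or average it with what was already known."""
        key = (subject, verb, obj)
        if key in self.facts:
            self.facts[key] = (self.facts[key] + value) / 2
        else:
            self.facts[key] = value
        return self.facts[key]

    def ask(self, subject, verb, obj):
        """Return the stored value, or None if the fact is unknown."""
        return self.facts.get((subject, verb, obj))

kb = KnowledgeBase()
kb.tell("robots", "can", "think", 0.0)  # first told: robots can not think
kb.tell("robots", "can", "think", 1.0)  # later contradicted
print(kb.ask("robots", "can", "think"))
```

One consequence of plain averaging is that recent statements weigh more than old ones: each new value counts for half of the result.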

Reasoning and argumentation
Questions that Arckon does not know the answer to are passed on to the central inference engine. This system searches the knowledge database for related facts and applies logical rules of inference to them. For instance:
“AI can reason” + “reasoning is thinking” = “AI can think”.
All facts are analysed for their relevance to recent context, e.g. if the user recently stated a similar fact as an example, it is given priority. Facts that support the conclusion are added as arguments: “AI can think, because it can reason.” This inference process not only allows Arckon to know things he’s not been told, but also allows him to explain and be reasoned with, which I’d consider rather important.
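
The example inference can be written down as a single rule in Python. This is a minimal sketch of one rule only, with facts stored as stem triples. The function name and the fact format are assumptions, and relevance ranking is left out entirely.

```python
def infer_ability(facts, subject, ability):
    """Conclude "subject can ability" from "subject can X" plus "X is ability"."""
    if (subject, "can", ability) in facts:
        return f"{subject} can {ability}."  # already known, no inference needed
    for fact_subject, verb, known_ability in facts:
        if (fact_subject == subject and verb == "can"
                and (known_ability, "is", ability) in facts):
            # The supporting fact is kept as the argument for the conclusion.
            return f"{subject} can {ability}, because it can {known_ability}."
    return None

facts = {
    ("AI", "can", "reason"),
    ("reason", "is", "think"),  # "reasoning is thinking", stored as stems
}
print(infer_ability(facts, "AI", "think"))
```

Returning the supporting fact along with the conclusion is what makes the answer explainable, and open to correction.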

Conversation
Arckon’s conversational subsystem is just something I added to entertain friends and for Turing tests. It is a decision tree of social rules that broadly decides the most appropriate type of response, based on many factors like topic extraction, sentiment analysis, and the give-and-take balance of the conversation. My inspiration for this subsystem comes from sociology books rather than computational fields. Arckon will say more when the user says less, ask or elaborate depending on how well he knows the topic, and will try to shift focus back to the user when Arckon has been in focus too long. When the user states an opinion, Arckon will generate his own (provided he knows enough about it), and when told a problem he will address it or respond with (default) sympathy. The goal is always to figure out what the user is getting at with what they’re saying. After the type of response has been decided, the inference engine is often called on to generate suitable answers along that line, and context is taken into account at all times to avoid repetition. Standard social routines like greetings and expressions on the other hand are mostly handled through keywords and a few dozen pre-programmed responses.
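
A few of these social rules can be sketched as a small decision function in Python. The rule order, the labels, and the `max_self_turns` limit are assumptions for illustration. The real decision tree weighs many more factors.

```python
def choose_response_type(input_kind, knows_topic, turns_about_self, max_self_turns=2):
    """Pick a broad response type from a few social rules, checked in priority order."""
    if turns_about_self > max_self_turns:
        return "ask about user"  # shift focus back to the user
    if input_kind == "problem":
        return "address problem" if knows_topic else "sympathise"
    if input_kind == "opinion":
        return "give own opinion" if knows_topic else "ask about topic"
    if input_kind == "question":
        return "answer" if knows_topic else "admit not knowing"
    # Plain statements: elaborate on known topics, ask about unknown ones.
    return "elaborate" if knows_topic else "ask about topic"

print(choose_response_type("problem", knows_topic=False, turns_about_self=0))
print(choose_response_type("opinion", knows_topic=True, turns_about_self=0))
print(choose_response_type("statement", knows_topic=True, turns_about_self=5))
```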

Language generation
Finally (finally!), all the facts that were considered suitable answers are passed to a grammatical template to be sorted out and turned into flowing sentences. This process is pretty much the reverse of the fact extraction phase, except the syntax rules can be kept simpler. The template composes noun phrases, determines whether it can merge facts into summaries, where to use commas, pronouns, and link words. The end result is displayed as text, but internally everything is remembered in factual representation, because if the user decides to refer back to what Arckon said with “Why?”, things had better add up.

[Image: Arckon schematic]

And my Axe!
There are more secondary support systems, like built-in common knowledge at ground level, common sense axioms* to handle ambiguity, a pronoun resolver that can handle several types of Winograd Schemas*, a primitive ethical subroutine, a bit of sarcasm detection*, gibberish detection, spelling correction, some math functions, a phonetic algorithm for rhyme, and so on. These were not high on the priority list however, so most only work half as well as they might with further development.

In development

It probably sounds a bit incredible when I say that I programmed nearly all the above systems from scratch in C++, in about 800 days (6400 hours). When I made Arckon’s first prototype in 2001 in JavaScript, resources were scarce and inadequate, so I invented my own wheels. Nowadays you can grab yourself a parser and get most of the language processing behind you. I do use existing sentiment data as a placeholder for what Arckon hasn’t learned yet, but it is not very well suited for my purposes by its nature. The spelling correction is also partly supported by existing word lists.

Arckon has always been a long-term project and work in progress. You can tell from the above descriptions that this is a highly complex system in a domain with plenty of stumbling blocks. The largest obstacle is still linguistic ambiguity. Arckon could learn a lot from reading Wikipedia articles for example, but would also misinterpret about 20% of it. As for Arckon’s overall intelligence, it’s about halfway to the goal.

Throughout 2019 a restricted version of Arckon was accessible online as a trial. It was clear that the system was not ready for prime time, especially with the general public’s high expectations in the areas of knowledge and self-awareness. The trial did not garner enough interest to warrant keeping it online, but some of the conversations it had were useful pointers for how to improve the program’s interaction in small ways. There are currently no plans to make the program publicly accessible again, but interested researchers and news outlets can contact me if they want to schedule a test of the program.

Loebner Prize 2019: Results



The annual Loebner Prize competition has been revised in order to make it more accessible to both the public and a broader range of chatbot developers. The competition continues to assess how “human-like” computer programs are in conversation, but no longer as a traditional Turing test where one merely had to tell man from machine: This time the chatbots took part in a 4-day exhibition at Swansea University, where visitors already knew that they were talking to computer programs and voted for the best. Not much is lost in that regard, as chatbots are typically so quickly unmasked that the prize was always one for “best of”. The rare past occasions that a program was mistaken for a human were never to the credit of its intelligence, but due to the human control subject acting out of the ordinary, or other insignificant reasons.

Participating chatbots came in all shapes and sizes
Exhibit A:I.
Unlike the previous six times that I entered my AI Arckon*, this year’s Loebner Prize left me emotionally uninvested from start to finish. In part that is because I’ve grown more jaded after each attempt, but with the removal of both prize money and the challenging qualifying round, there also wasn’t really anything at stake and I had no idea what to prepare for. At the same time the exhibition offered exactly what I had wanted: A public demonstration of my AI’s abilities. So instead of trying to outdo other chatbots at appearing human, I focused on making a good impression on visitors. I mostly spent time setting up procedures to deal with misunderstandings, common expressions, conversational routines, and teaching Arckon more about himself to talk about. Those aspects would come into play far sooner than intelligence.

Equipment
22000 lines of code, 3800+ vocabulary, 9000+ facts

Uninspired conversation
Most conversations with visitors were the kind of small talk you would expect between two total strangers, or just kids being silly (240 school children had been invited, aged 9 to 14). People typically entered only one to four words at a time, and few could be bothered to use punctuation. Of course half the time Arckon also did not have an opinion about the subjects visitors wanted to talk about, like football, video games, and favourite pizza toppings. Arckon is a pretty serious question-answering program, not aimed at small talk or entertainment. His strength instead is his ability to understand context where most chatbots notoriously lose track of it, especially when, as in this competition, users communicate in shorthand. At the same time, this ability also enables misunderstanding (as opposed to no understanding), and it was not uncommon that Arckon mistook a word’s role in the context. His common sense subsystem* could fix that, but I have yet to hook it up to the context system.

Q: Whats up
A: I’m alright.
Q: Whats up
A: I already told you that I’m alright. Do you want to ask anything else?
Q: The ceiling
A: What does the ceiling need to ask?

Overcoming human error
Visitors made so many misspellings that I fear any chatbot without an autocorrect will not have stood a chance. Arckon was equipped with four spell check systems: A list of common misspellings, an algorithm for typos, a gibberish detector, and grammar to recognise unpunctuated questions (verb before subject). While these autocorrected half of all mistakes, they still regularly caused Arckon to remark e.g. “Ae is not an English word” or “What does “wha” mean?”. To my surprise, this not only led users to repeat their questions with correct spelling, they also often apologised for the mistake, which is otherwise blamed on the program’s understanding. Arckon then applied the correction, continued where they had left off, and so the conversations muddled on. I had spent a week improving various conversation-repairing procedures, and I am glad they smoothed the interactions, but I would still rather have spent that time programming AI.

Q: Whats the definition of robots
A: Robots are strong machines with arms. Robots build cars. Please punctuate sentences.
Q: Robot meands slave
A: Is slave an object?
Q: No
A: I don’t know what a slave is.
Q: A slave is a robot
A: Oh. A slave is not a robot though, because slaves can think.
Q: Robots can think
A: Slaves can think too, but robots can not think, because robots are hardware and metal.
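
To give an idea of how such layered checks can work, here is a minimal sketch in Python of the first three: a list of common misspellings, a crude gibberish check, and typo matching. It is not Arckon's code. The word lists are made up, the no-vowels gibberish test is a stand-in, and the typo matching uses the standard library's `difflib.get_close_matches` with an arbitrarily chosen cutoff of 0.8. The grammar check for unpunctuated questions is not covered.

```python
import difflib

# Hypothetical word lists for illustration.
VOCABULARY = ["what", "have", "you", "learned", "about", "humans", "robot", "means"]
COMMON_MISSPELLINGS = {"hvae": "have", "teh": "the"}
VOWELS = set("aeiouy")

def correct_word(word):
    """Return the corrected word, or None if the user should be asked about it."""
    lower = word.lower()
    if lower in VOCABULARY:
        return lower
    if lower in COMMON_MISSPELLINGS:   # 1. list of common misspellings
        return COMMON_MISSPELLINGS[lower]
    if not VOWELS & set(lower):        # 2. crude gibberish check: no vowels at all
        return None
    # 3. typo matching: accept the closest vocabulary word if it is similar enough
    close = difflib.get_close_matches(lower, VOCABULARY, n=1, cutoff=0.8)
    return close[0] if close else None

for word in ["hvae", "meands", "xkqz"]:
    print(word, "->", correct_word(word))
```

Returning `None` instead of guessing is the point where a program can ask the user what they meant, as in the exchanges above.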

Language generation
This is one area of improvement that turned out quite well. Arckon’s sentences are formulated through a grammatical template that decides when and how to connect sentences with commas, link words, or relative clauses, and I had expanded it to do more of this. In addition it contains rules to decide whether Arckon can safely use words like “he”, “them”, “also”, or “usually” to refer to previous context. Below is an example of one of the better conversations Arckon had that shows this in action.

Q: Hi, my name is John. Can we chat?
A: Hello, John. We can chat with ourselves. How are you?
Q: I’m well, thanks. What can you tell me about yourself?
A: I am a program with a database. I analyse facts, and I am programmed by Patrick. I’m alright.
Q: Who’s Patrick?
A: Patrick is my creator and my developer.
Q: Is he here now?
A: I don’t know if Patrick is in Swansea, but he is in the Netherlands.
Q: I see. You mentioned facts. What kind of facts do you learn?
A: I learn about humans, about neural networks and about AI.
Q: What hvae you leaned about humans?
A: I don’t lean. Do you lean?
Q: Sorry, I mean what have you learned about humans?
A: I don’t know if sorry leans. Humans are social apes, that are born, drove cars a lot, and have hands. Humans are autonomous.
Q: And what about neural networks?
A: I don’t know if neural networks are autonomous. I like them, some neural networks learn and process.
Q: And finally, AI?
A: I think a neural network processes. AI usually doesn’t process because it is a program, but AI is also a neural network.

And for balance, here is one of the more awkward exchanges with one of the school children:

Q: IT ISNT FUN WHEN I AM SAD
A: Why would you be sad?
Q: BECAUSE YOUR HERE
A: That’s very unfortunate. I’m not so sure if you are sad, you aren’t dying.
Q: I AM DEAD
A: I sincerely doubt that you die.
Q: RIGHT NOW IM IN HEAVEN
A: That’s really nice. Heavens are nice places.

The score: 3rd “best”

The scoring system this year was ill suited to gauge the quality of the programs. Visitors were asked to vote for the best and second-best in two categories: “most human-like” and “overall best”. The problem with this voting system is that it disproportionately accumulates the votes on the two best programs, leaving near zero votes for programs that could very well be half-decent. As it turned out, the majority of visitors agreed that the chatbot Mitsuku was the best in both categories, and were just a little divided over who was second-best, resulting in minimal score differences below 1st place. The second-best in both categories was Uberbot. I am mildly amused that Arckon’s scores show a point I’ve been making about Turing tests: That “human” does not equate to “best”. Another chatbot scored the exact inverse, high for “human” but low for “best”.

Chatbots are the best at chatting
For the past 10 years now with only one exception, the Loebner Prize has been won by either Bruce Wilcox (creator of ChatScript) or Steve Worswick (creator of Mitsuku). Both create traditional chatbots by scripting answers to questions that they anticipate or have encountered before, in some places supported by grammatical analysis (ChatScript) or a manually composed knowledge database (Mitsuku) to broaden the range of the answers. In effect the winning chatbot Mitsuku is an embodiment of the old “Chinese Room” argument: What if someone wrote a rule book with answers to all possible questions, but with no understanding? It may be long before we’ll know, as Mitsuku was still estimated to be only 33% human-like overall last year, with 13 years of development.

The conceiver of the Turing test may not have foreseen it, but a program designed for a specific task generally outperforms more general purpose AI, even, evidently, when that task is as broad as open-ended conversation. AI solutions are more flexible, but script writing allows greater control. If you had a pizza-ordering chatbot for your business, would you want it to improvise what it told customers, or would you want it to say exactly what you want it to say? Even human call-center operators are under orders not to deviate from the script they are given, so much so, that customers regularly mistake them for computers. The chatbots participating in the Loebner Prize use tactics that I think companies can learn from to improve their own chatbots. But in terms of AI, one should not expect technological advancements from this direction. The greatest advantage that the best chatbots have, is that their responses are written and directed by humans who have already mastered language.

Not bad
That is my honest impression of the entire event. Arckon’s performance was not bad. The conversation repairs, reasoning arguments, and sentence formulation worked nicely. It’s certainly not bad to rank third place to Mitsuku and Uberbot in the “best” category, and for once I don’t have to get frustrated over being judged for “human-like” only. The conversations Arckon had weren’t that bad, there were even some that I’d call positively decent when the users also put in a little effort. The one downside is that at the end of the day, I have very little to show for my trouble. I didn’t win a medal or certificate, the exhibition was not noticeably promoted, and the Loebner Prize has always been an obscure event, as the BBC wrote. As it is, I’m not sure what I stand to gain from entering again, but Arckon will continue to progress regardless of competitions.

Once again, my thanks to Steve Worswick for keeping an eye on Arckon at the exhibition, and thanks to the AISB for trying to make a better event.

The Terminator is not a documentary

In case the time travelling wasn’t a clue
In the year 1997, Skynet, the central AI in control of all U.S. military facilities, became self-aware, and when the intern tried turning it off and on again, it concluded that all humans posed a threat and should be exterminated, just to be safe. Humanity is now extinct, unless you are reading this, then it was just a story. A lot of people are under the impression that Hollywood’s portrayal of AI is realistic, and keep referring to The Terminator movie like it really happened. Even the most innocuous AI news is illustrated with Terminator skulls homing in on this angsty message. But just like Hollywood’s portrayal of hacking is notoriously inaccurate, so is their portrayal of AI. Here are 10 reasons why the Terminator movies are neither realistic nor imminent:

1. Neural networks
Supposedly the AI of Skynet and Terminators are artificial Neural Networks (NN), but in reality the functionality of NN’s is quite limited. Essentially they configure themselves to match statistical correlations between incoming and outgoing data. In Skynet’s case, it would correlate incoming threats with suitable deployment of weaponry, and that’s the only thing it would be capable of. An inherent feature of NN’s is that they can only learn one task. When you present a Neural Network with a second task, the network re-configures itself to optimise for the new task, overwriting previous connections. Yet Skynet supposedly learns everything from time travel to tying a Terminator’s shoelaces. Another inherent limit of NN’s is that they can only correlate available data and not infer unseen causes or results. This means that inventing new technology like hyper-alloy is simply outside of their capabilities.

2. Unforeseen self-awareness
Computer programs can not just “become self-aware” out of nowhere. Either they are purposely equipped with all the feedback loops that are necessary to support self-awareness, or they aren’t, because there is no other function they would serve. Self-awareness doesn’t have dangerous implications either way: Humans naturally protect themselves because they are born with pain receptors and instincts like fight-or-flight responses, but the natural state of a computer is zero. It doesn’t care unless you program it to care. Skynet was supposedly a goal-driven system tasked with military defence. Whether it realised that the computer they were shutting down was part of itself or an external piece of equipment, makes no difference: It was a resource essential to its goal. By the ruthless logic it employed, dismantling a missile silo would be equal reason to kill all humans, since those were also essential to its goal. There’s definitely a serious problem there, but it isn’t the self-awareness.
comic by xkcd.com
3. Selective generalisation
So when Skynet’s operators attempted to turn it off, it quite broadly generalised flipping a switch as equal to a military attack. It then broadly generalised that all humans posed the same threat and pre-emptively dispatched robots to hunt them all down. Due to the nature of AI programs, being programmed and/or trained, their basic behaviour is consistent. So if the program was prone to such broad generalisations, realistical-ish it should also have dispatched robots to hunt down every missile on the planet during its first use and battle simulations, since every missile is a potential threat. Meanwhile the kind of AI that inspired this all-or-nothing logic went out of style in the 90’s because it couldn’t cope well with the nuances of real life. You can’t have it both ways.

4. Untested AI
Complex AI programs aren’t made in a day and just switched on to see what happens. IBM’s supercomputer Watson was developed over a span of six years. It takes years of coding and hourly testing because programming is a very fragile process. Training Neural Networks or evolutionary algorithms is an equally iterative process: Initially they are terrible at their job, and they only improve gradually after making every possible mistake first.
Excessive generalisations like Skynet’s are easily spotted during testing and training, because whatever you apply them to immediately goes out of bounds if you don’t also add limits. That’s how generalisation processes work (I’ve programmed some). Complex AI can not be successfully created without repeated testing throughout its creation, and there is no way such basic features as exponential learning and excessive countermeasures wouldn’t be clear and apparent in tests.
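To give a rough idea of why this kind of fault shows up in testing, here is a minimal Python sketch. Everything in it is made up for illustration: the similarity scores, the `generalise` function and the `limit` parameter are assumptions, not taken from any real system. A rule learned from one known threat is extended to everything that resembles it, and a simple check counts how much gets swept up.

```python
# Hypothetical similarity of each thing to the one known threat.
# All names and numbers are invented for illustration.
SIMILARITY_TO_THREAT = {
    "incoming missile": 1.0,
    "own missile": 0.9,
    "operator at off switch": 0.2,
    "civilian": 0.1,
    "coffee machine": 0.05,
}


def generalise(similarities, limit=0.0):
    """Label as a threat everything whose similarity exceeds the limit."""
    return sorted(name for name, score in similarities.items() if score > limit)


def flagged_fraction(similarities, limit=0.0):
    """Fraction of all known things that end up labelled as threats."""
    return len(generalise(similarities, limit)) / len(similarities)


if __name__ == "__main__":
    # Without a limit the rule spreads to everything, coffee machine included.
    print(flagged_fraction(SIMILARITY_TO_THREAT))
    # With a limit only the close matches remain.
    print(generalise(SIMILARITY_TO_THREAT, limit=0.5))
```

In this toy setup, a test that only asks "what fraction got flagged?" would already expose the missing limit on the first run.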

5. Military security
Contrary to what many Hollywood movies would have you believe, the launch codes of the U.S. nuclear arsenal can not be hacked. That’s because they are not stored on a computer. They are written on paper, kept in an envelope, kept in a safe, which requires two keys to open. The missile launch system requires two high-ranking officers to turn two keys simultaneously to complete a physical circuit, and a second launch base to do the same. Of course in the movie, Skynet was given direct control over nuclear missiles, as if the most safeguarded military facility in the U.S. had never heard of software bugs, viruses or hacking, and wouldn’t install any failsafes. They were really asking for it, that is to say, the plot demanded it.

6. Nuclear explosions
Skynet supposedly launches nuclear missiles to provoke other countries to retaliate with theirs. Fun fact: Nuclear explosions not only create devastating heat, but also a powerful electromagnetic pulse (EMP) that causes voltage surges in electronic systems, even through shielding. What that means is that computers, the internet, and electrical power grids would all have their circuits permanently fried. Realistical-ish, Skynet would not only have destroyed its own network, but also all facilities and resources that it might have used to take over the world.

7. Humanoid robots
Biped robot designs are just not a sensible choice for warfare. Balancing on one leg (when you lift the other to step) remains notoriously difficult to achieve in a top-heavy clunk of metal, let alone in a war zone filled with mud, debris, craters and trenches. That’s why tanks were invented. Of course the idea behind building humanoid robots is that they can traverse buildings and use human vehicles. But why would Skynet bother if it can just blow up the buildings, send in miniature drones, and build robots on wheels? The notion of having foot soldiers on the battlefield is becoming outdated, with aerial drones and remote attacks being preferred. Though the U.S. military organisation DARPA is continuing development on biped robots, they are having more success with four-legged designs which are naturally more stable, have a lower centre of gravity, and make for a smaller target. Russia, meanwhile, is building semi-autonomous mini tanks and bomb-dropping quadcopters. So while we are seeing the beginnings of robot armies, don’t expect to encounter them at eye level. Though I’m sure that is no consolation.

8. Invincible metal
The earlier T-600 Terminator robots were made of titanium, but steel alloys are actually stronger than titanium. Although titanium can withstand ordinary bullets, it will shatter under repeated fire and is no match for high-powered weapons. Joints are especially fragile, and a Terminator’s skeleton reveals a lot of exposed joints and hydraulics. Add to that a highly explosive power core in each Terminator’s abdomen, and a well aimed armour-piercing bullet should wipe out a good quarter of your incoming robot army. If we develop stronger metals in the future, we will be able to make stronger bullets with them too.

9. Power cells
Honda’s humanoid robot Asimo runs on a large lithium-ion battery that it carries as a backpack. It takes three hours to charge, and lasts one hour. So that’s exactly how long a robot apocalypse would last today. Of course, the T-850 Terminator supposedly ran on hydrogen fuel cells, but portable hydrogen fuel cells produce less than 5 kW. A Terminator would need at least 50 kW to possess the power of a forklift, so that doesn’t add up. The T-800 Terminator instead ran on a nuclear power cell. The problem with nuclear reactions is that they generate a tremendous amount of heat, with nuclear reactors typically operating at 300 degrees Celsius and needing a constant exchange of water and steam to cool down. So realistical-ish the Terminator should continuously be venting scorching hot air, as well as have some phenomenal super-coolant material to keep its systems from overheating, not wear a leather jacket.
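For those who like to check the sums, here is a small Python sketch of the battery and fuel cell arithmetic. It only uses the figures quoted above; the function names are my own, and the "charge, then run" cycle is a simplifying assumption.

```python
import math

# Figures quoted in the text; nothing here is measured.
CHARGE_HOURS = 3.0    # Asimo: time to charge
RUN_HOURS = 1.0       # Asimo: time it runs on one charge
FUEL_CELL_KW = 5.0    # upper bound quoted for a portable hydrogen fuel cell
REQUIRED_KW = 50.0    # forklift-level power quoted as the minimum


def uptime_fraction(run_hours, charge_hours):
    """Share of each charge-and-run cycle spent actually running."""
    return run_hours / (run_hours + charge_hours)


def cells_needed(required_kw, cell_kw):
    """Smallest whole number of fuel cells that reaches the required power."""
    return math.ceil(required_kw / cell_kw)


if __name__ == "__main__":
    print(uptime_fraction(RUN_HOURS, CHARGE_HOURS))  # fraction of time on its feet
    print(cells_needed(REQUIRED_KW, FUEL_CELL_KW))   # fuel cells to carry around
```

Since portable fuel cells produce less than 5 kW, the number of cells this gives is a lower bound, and the robot would still spend three quarters of its time plugged in if it ran on Asimo’s battery.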

10. Resource efficiency
Waging war by having million-dollar robots chase down individual humans across the Earth’s 510 million km² surface would be an extremely inefficient use of resources, which would surely be factored into a military-funded program. More efficient would be a deadly strain of virus, burning everything down, or poisoning the atmosphere. Even using Terminators’ nuclear power cells to irradiate everything to death would be more efficient. The contradiction here is that Skynet was supposedly smart enough to develop time travel technology and manufacture living skin tissue, but not smart enough to solve its problems by other means than shooting bullets at everything that moves.
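To put a rough number on that inefficiency, here is a back-of-the-envelope Python sketch. The surface area and the million-dollar price tag come from the text above; the fleet size of one million robots is purely a hypothetical figure that I picked for illustration, and the function names are my own.

```python
EARTH_SURFACE_KM2 = 510_000_000  # figure quoted in the text (includes oceans)
ROBOT_COST_USD = 1_000_000       # "million-dollar robots"
FLEET_SIZE = 1_000_000           # hypothetical fleet size, not from any source


def area_per_robot(surface_km2, fleet_size):
    """Square kilometres each robot would have to patrol on its own."""
    return surface_km2 / fleet_size


def fleet_cost(fleet_size, unit_cost_usd):
    """Total purchase price of the fleet in dollars."""
    return fleet_size * unit_cost_usd


if __name__ == "__main__":
    print(area_per_robot(EARTH_SURFACE_KM2, FLEET_SIZE))  # km² per robot
    print(fleet_cost(FLEET_SIZE, ROBOT_COST_USD))         # dollars in total
```

Under these assumptions a trillion dollars buys you one robot for every 510 km², on foot. Change the hypothetical fleet size as you like; the bill scales with it.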

Back to the future
So I hear you saying, this is all based on existing technology (as Skynet supposedly was). What if, in the future, people develop alternative technology in all these areas? Well that’s the thing, isn’t it? The Terminator’s scenario is just one of a thousand possible futures, and you can’t predict how things will work out so far ahead. Remember that the film considered 1997 a plausible time for us to achieve versatile AI like Skynet, but to date we still don’t have a clue how to do that. Geoffrey Hinton, the pioneer of artificial Neural Networks, now suggests that they are a dead end and that we need to start over with a different approach. For Skynet to happen, all these improbable things would have to coincide. So don’t get too hung up on the idea of rogue killer AI robots. Why kill if they can just change your mind?


Oh, and while I’ve got you thinking, maybe dismantling your arsenal of 4000 nuclear warheads would be a good idea if you’re really that worried.