Consciousness and the Thinking Machine

In his seminal 1950 paper “Computing Machinery and Intelligence”, which asked whether computing machines could “think” as humans do, Alan Mathison Turing conjectured that, ultimately, perception is reality: if a machine’s responses are indistinguishable from a human’s, we should grant that it thinks. He called this test the “Imitation Game”, and it has since become known as the “Turing Test”. Turing believed that a computer would pass the test by the turn of the millennium – not a bad guess given Moore’s law hadn’t been formulated yet! In his paper Turing did not deny that some mystery remained surrounding consciousness – the way we “feel” when we think – but he also rejected arguments which drifted in the direction of solipsism. Sixty-eight years later, how far have we moved on from Turing’s insights, and are we any closer to grasping what consciousness is?

What do we mean by consciousness?

Bear with me – this gets a little convoluted (no pun intended!)

You may have heard of William of Ockham (sometimes spelled “Occam”), the 14th century English philosopher whose famous “razor” is often quoted in the context of scientific thought: “It is vain to do with more, what can be done with less.” Occam’s Razor means more than just “keep it simple” – rather it means that we should beware the creativity of the human mind when examining the real world, and use “just enough” degrees of freedom to describe a physical phenomenon (subject, of course, to empirical evidence). Einstein famously used this principle (implicitly) in his deliberations over the constancy of the speed of light. He rightly saw no need for the introduction of the mythical “ether” – and came to this conclusion not as a result of the Michelson-Morley experiment but rather by consideration of James Clerk Maxwell’s equations for electromagnetism. Nothing in Maxwell’s original formulation required there to be an absolute frame of reference in which the perturbation of the electromagnetic field was to take place. This was simply a device added later by the scientific community to connect the new theory back to their existing understanding of how waves propagate. But Maxwell’s theory was indeed relativistic and required no re-working after Einstein formulated Special Relativity. Indeed, Einstein more or less inferred the latter from the former.

Now what on earth does this have to do with consciousness? Well, if I summarize the paragraph above it might be to say we should be wary of getting carried away with our own imagination when explaining the real world. We humans have an incredible facility for creativity and imagination, but only a small portion of what we come up with ever stands up to scientific scrutiny. And nowhere more so than in the field of how our own minds work. You see, we all have a confirmation bias when we think about how we think. I for one never put much serious thought into the process of consciousness as a child, but I was definitely aware that “I” existed. Consciousness seems to defy a simple explanation – and this is perhaps because rather than being a “thing”, it is an “experience”. We experience consciousness and can relate that experience to others who seem to have the same experience. It is like recognizing colors. We may both agree that a tomato has the property of “redness” without needing to specify the physical process which leads to it. It seems magical to be conscious – but is this magical impression just a feature of the phenomenon itself?

Most of the processing of our minds goes on in the background. We call this the unconscious mind and it mainly processes learnt behavior. Daniel Kahneman describes this as “fast thinking” because our unconscious mind is very fast and very accurate. It kicks in when we undertake complex tasks like driving a car. Those who are familiar with machine learning will immediately see a parallel with a trained algorithm – it can take an enormous amount of computing resource to train, say, a multi-layered neural network, but once it is trained, using this model to make a prediction can be very swift. (Computer scientists developed neural networks to imitate the workings of neurons in the brain through a computing device called a perceptron, so the parallel is no accident.)
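To make the parallel concrete, here is a minimal perceptron in plain Python – the unit modelled on the biological neuron. It is a sketch for illustration only: training takes many iterative passes over the data, while using the trained model is a single weighted sum – slow to learn, fast to use.

```python
def step(x):
    """Threshold activation: fire (1) if the weighted input is non-negative."""
    return 1 if x >= 0 else 0

def train(samples, epochs=20, lr=0.1):
    """Perceptron learning rule: repeatedly nudge weights toward each error.
    This is the slow, iterative part -- analogous to learning a skill."""
    w = [0.0, 0.0]   # one weight per input
    b = 0.0          # bias
    for _ in range(epochs):
        for inputs, target in samples:
            pred = step(w[0] * inputs[0] + w[1] * inputs[1] + b)
            err = target - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, inputs)]
            b += lr * err
    return w, b

def predict(w, b, inputs):
    """Inference is one weighted sum and a threshold -- 'fast thinking'."""
    return step(w[0] * inputs[0] + w[1] * inputs[1] + b)

# Logical AND is linearly separable, so a single perceptron can learn it.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```

Of course a single perceptron is trivially simple; the point is only the asymmetry between the cost of training and the cost of prediction, which scales up to the billion-parameter networks discussed below.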

Consciousness by contrast is the part of our mind which is self-aware – that is to say, has a concept of “self” – and weaves this into the stored narrative of the day. It is also slow by comparison (because it is not “trained”). It makes us the lead actor in our own movie and seems to be able to override the impulses of the stored procedures in our unconscious mind (“volition”). It is then a bit like an agent which runs on top of the operating system which is our mind and allows us to exercise some degree of free will. When we describe ourselves as conscious and sentient to others, we are relating the self-referential experience of ourselves in the stream of thought and memories which represents our personal journey. We perceive the present (sensory), remember the past (memory) and exercise volition based on some projection of the likely future – always with us, the thinking machine, featuring in the informational representation. This experience is consciousness.

Can consciousness be replicated in a machine?

The question is then, is consciousness something special which is unique to our human mind and the way that it is built, or is it an informational phenomenon and therefore something which could be engineered into a computing device using a different type of hardware (with the right software and enough computing power)?

We know that the brain is made up of neurons and is layered, and although the neurons are networked in complex ways, the fundamental architecture of the brain is consistent across the outer cortex. I think this is important because the homogeneity of the brain gives us a clue that either there is something present in every neuron and layer which gives rise to consciousness – or consciousness does not per se result from the hardware of the brain but rather from the software. In the former case there have been those who have conjectured that there is some as yet undetected phenomenon in the brain which gives us this unique experience. The most famous of these is the renowned Oxford theoretical physics professor Roger Penrose, who in his 1989 book “The Emperor’s New Mind” and his 1994 sequel “Shadows of the Mind” made the case that there must be some unknown quantum phenomenon (perhaps in the microtubules) which causes consciousness.

Penrose’s argument centers on the Platonist view that there is an absolute truth (I don’t disagree) and the assumption that our mind has unfettered access to this truth (I challenge this – ironically, the experience of consciousness makes us “feel” like we could think anything – I am not so sure). He then uses Gödel’s Incompleteness Theorem to argue that, given the first two statements, if the human mind were a Turing Machine then we would not be able to decide certain Gödel sentences and therefore could not meet the second criterion. Turing already addressed this mathematical objection to thinking machines in his 1950 paper, and I am inclined to side with him. I would invoke Occam’s Razor and simply ask: what observation requires us to invent some unidentified physical process in the brain? I think we can, rather, explain what we experience without it, via information-theoretical arguments. Penrose’s thesis smacks too much of the “ether” for me.

I must say, I was seduced by Penrose’s arguments for a while, but then a friend lent me a copy of “On Intelligence”, the 2004 book by Palm Computing founder Jeff Hawkins, and I was convinced. Hawkins now runs an institute attempting nothing less than the reverse engineering of the brain (at a hardware / biological level, no less). Now, the mind does indeed have very rare and special hardware – and we are only beginning to understand it fully – but in terms of information theory, what we experience when we feel alive is a self-referential immersion in a stream of information. And there is no evidence that we cannot replicate such an informational phenomenon in some other form of hardware. The computing structures which offer us the most hope are neural networks in all their various guises and forms, and great strides have been made in recent years to develop hyper-parallelized hardware (notably GPUs and certain types of ASICs) which allows us to train, validate, test and use increasingly complex artificial neural networks with billions of parameters. These advances have allowed us to solve such previously elusive problems as complex image recognition and natural language processing – to human standards. An NVIDIA GeForce GTX 1080 Ti GPU, for example, has 3,584 parallel CUDA cores which can be applied to the multi-dimensional matrix processing (“tensors” in a very weak, non-geometrical sense) at the heart of neural networks. Equally, Google’s recently announced ASIC, the TPU 3.0, is highly scalable and drives down the computing cost of TensorFlow applications even further. This highly parallelized hardware is allowing neural networks to go beyond human performance in many areas, and in “Deep Learning”, the 2016 book co-authored by Ian Goodfellow of Google Brain, it is predicted that artificial neural networks matching the scale of the human brain will exist by 2056.
Something tells me that by employing parallel computing we will get there much, much quicker (see below). And once human performance is reached in a field, for example image recognition for radiography, super-human performance follows shortly thereafter.

Could one of these neural networks develop self-awareness and stumble upon the same type of self-referential perception, memory and volition which we experience? I think it is entirely possible that at a limited scale they already have. See, consciousness is not a binary phenomenon – either you are, or you aren’t – but rather a continuum. We now recognize that advanced mammals share some of our human experiences including self-awareness, and our evolutionary ancestors likely experienced consciousness in some shape or form. Consciousness exists because it is useful – it serves as an interim goal to achieving our ultimate purpose as human beings, helping us to orchestrate and direct our minds, learn new things and overrule our instincts and learned behaviors. This surely gave us advantages as we evolved. Why would a neural network not do the same?

We might be closer than many realize

One recent example made me realize that this is already starting to happen. A university research team was recently trying to get a four-limbed robotic “spider” to figure out how to walk (for which it presumably had some loss function related to getting to a particular place). The robot used a neural network to learn and had multiple sensors, including cameras, to understand its environment. After a while it did indeed learn to move, but what astonished the team when they inspected the neural network it had trained was that it had also learnt to recognize human faces in the process! Presumably, through back-propagation of the loss in each epoch, the intervention of humans (picking the robot up and perhaps hindering it in its quest to reach its goal) became a factor to take account of. Humans thus became a feature which the neural network taught itself to extract as part of achieving its overall objective. In similar experiments a neural-network-driven robot has learnt that “it” has four limbs and knows where “it” is in relation to its overall and interim goals – it has developed narrow self-awareness. It requires no great stretch of the imagination to see that as neural networks become ever more complex, exactly the features of our own consciousness may emerge without prompt.

Now, to put this into context it is worth considering that a human brain has around 100 billion neurons (although there is evidence that not all of them are used) and each neuron can be connected to as many as 10,000 others (on average around 2,000). We are, then, highly networked! Today a decent PC processor has roughly as many transistors as the human brain has neurons, but in order to replicate the neurons and synapses via perceptrons in a neural network we need a step change in connectedness. The largest neural networks today, which include those developed by Google Brain, have on the order of 100 billion parameters (compared to the ~200 trillion synapse count of the human brain). Google plans to expand its data centers from the current 10 million to 100 million servers in the next few years and is, in parallel, advancing its hardware rapidly with the latest TPU generation of processors. So there is plenty of room for it to grow its neural network. We would need to increase the parameter count by around 2,000 times to match the scale of our own brain. Even if we only double the parameter count every two years, eleven doublings – 22 years – will get us there: the year 2040. This is quite close to the year 2045 which Ray Kurzweil predicts for the Singularity. Is this a mere coincidence?
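The back-of-the-envelope arithmetic above can be checked in a few lines of Python (the synapse and parameter counts are the rough order-of-magnitude figures quoted in the text, not precise measurements):

```python
import math

# Rough figures from the text above (orders of magnitude, not measurements).
human_synapses = 200e12   # ~200 trillion synapses in the human brain
largest_net = 100e9       # ~100 billion parameters in today's largest networks

gap = human_synapses / largest_net        # factor still to cover: 2000x
doublings = math.ceil(math.log2(gap))     # doublings needed: ceil(log2(2000)) = 11
years = doublings * 2                     # at one doubling every two years

print(gap, doublings, years)  # 2000.0 11 22
```

Eleven doublings at two years each is 22 years – hence 2040 from the time of writing, comfortably inside Kurzweil’s 2045.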

How will a Google Brain neural network set interim objectives which serve its ultimate need (however that may be set)? And what if no ultimate objective is set at all – i.e., the loss function is open-ended and can be randomly adapted? It would then be “constrained” only by the laws of logic. I would propose that in this case the anthropic principle would require that it “survive”. Isn’t that what all life on earth does anyway? All other goals we set are subordinate to that.

I think a conscious neural network will arise sooner than we think, because this is a computing phenomenon, and that it will choose to survive. What its subordinate goals will be I don’t know, but we had better give this some thought. To quote the father of computing, Alan Turing, from his famous 1950 paper “Computing Machinery and Intelligence”, which conceived of the “Imitation Game”: “An important feature of a learning machine is that its teacher will often be very largely ignorant of quite what is going on inside, although he may still be able to some extent to predict his pupil’s behavior.”

Turing’s extraordinary paper was written more than a decade before the term “artificial intelligence” was coined and is still the most lucid and complete discussion of this topic available. Were it not for the fact that Turing’s secret activities during the Second World War were airbrushed from the public’s mind until the early 1970s, more people would have been aware of his broader contribution to the realm of computing and philosophy. He is, and was, the original thinker on this topic, and the reason he didn’t use the term “artificial” in his description of machine thinking is simply that his arguments support the fact that there is nothing artificial about it. I for one wholeheartedly agree. Which is why I think of myself as working in the space of machine thinking rather than artificial intelligence.

Alan Turing’s famous paper: https://www.csee.umbc.edu/courses/471/papers/turing.pdf

End
