A short history of artificial neural networks

April 12, 2023

In 1972, computers in Australia were beginning to make a difference to the operation of large companies. The Australian Computer Society was forming and we were looking at systems analysis as a way of improving efficiency.

At that time I was also a member of the Australian Inventors Association which met monthly in Melbourne. These people were lateral thinkers who were prepared to try virtually anything. The main thing I learned from them was the value of ‘whiteboarding’ where a question was raised and everyone made a suggestion about achievability – many were way out but got people thinking along different lines.

I found this very useful in the ACS. It was only a short step from systems analysis to creating programs which could learn. However, it was ten years or so before computer memory advances allowed for any useful results to be achieved and another 25 years before meaningful responses were realised.

It must be remembered that in 1972 a large memory was 1 kilobyte, hand-assembled by women in Hong Kong who threaded ferrite rings onto three wires. This was followed by an equivalent memory printed onto a one square foot plastic sheet. Now, a one terabyte (1,000,000,000,000 bytes) external drive is common.

The step from machine learning to the development of a neural network mimicking that of the human brain was a large but logical one. The term neural network incorporates a wide range of systems, yet centrally, according to IBM, these “neural networks – also known as artificial neural networks (ANNs) or simulated neural networks (SNNs) – are a subset of machine learning and are at the heart of deep learning algorithms”.

Crucially, the term itself and their form and structure are “inspired by the human brain, mimicking the way that biological neurons signal to one another”.

Neural networks are now often understood to be the future of AI. They have big implications for us and for what it means to be human. We have heard echoes of these concerns recently with calls to pause new AI developments for a six-month period to ensure confidence in their implications.

As far back as 1989, a team at AT&T Bell Laboratories used back-propagation techniques to train a system to recognise handwritten postal codes. The recent announcement by Microsoft that Bing searches will be powered by AI, making it your “copilot for the web”, illustrates how the things we discover and how we understand them will increasingly be a product of this type of automation.

Drawing on vast amounts of data to find patterns, AI can similarly be trained to do things like image recognition at speed – resulting in neural networks being incorporated into facial recognition, for instance. This ability to identify patterns has led to many other applications, such as predicting stock markets. Neural networks are changing how we interpret and communicate too. Developed by the Google Brain Team, Google Translate is another prominent application of a neural network.

Looking back at the history of neural networks tells us something important about the automated decisions that define our present or those that will have a possibly more profound impact in the future. Their presence also tells us that we are likely to understand the decisions and impacts of AI even less over time.

These systems are not simply black boxes; they are not just hidden bits of a system that can’t be seen or understood.

The mystery is even coded into the very form and discourse of neural networks. They come with deeply piled layers – hence the phrase deep learning – and within those depths are the even more mysterious-sounding “hidden layers”. The mysteries of these systems lie deep below the surface.

There is a good chance that the greater the impact that artificial intelligence comes to have in our lives the less we will understand how or why. Today there is a strong push for AI that is explainable. We want to know how it works and how it arrives at decisions and outcomes.

This need for explainability is reflected in the European Union’s proposed AI rules, which demand that “for high-risk AI systems, the requirements of high-quality data, documentation and traceability, transparency, human oversight, accuracy and robustness, are strictly necessary to mitigate the risks to fundamental rights and safety posed by AI”.

This is not just about things like self-driving cars (although systems that ensure safety fall into the EU’s category of high-risk AI); there is also a worry that systems will emerge in the future with implications for human rights.

These neural networks may be complex systems, yet they have some core principles. Inspired by the human brain, they seek to copy or simulate forms of biological and human thinking. In terms of structure and design they are, as IBM also explains, made up of “node layers, containing an input layer, one or more hidden layers, and an output layer”.

Within this, “each node, or artificial neuron, connects to another”. Because they require inputs and information to create outputs, they “rely on training data to learn and improve their accuracy over time”. These technical details matter but so too does the wish to model these systems on the complexities of the human brain.
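As a rough illustration of that structure – invented for this article rather than drawn from IBM’s material – the short Python sketch below lays out an input layer, a single hidden layer and an output layer as weight matrices, so that every node in one layer connects to every node in the next. The layer sizes, the use of NumPy and the random starting values are all assumptions made purely for the example.

```python
import numpy as np

# A toy version of the structure described above: an input layer, one hidden
# layer and an output layer, with every node in one layer connected to every
# node in the next by a weight. Layer sizes are invented for illustration.
rng = np.random.default_rng(0)
input_nodes, hidden_nodes, output_nodes = 4, 3, 2

# Each connection between two nodes is a single number (a weight); training
# data is used to adjust these numbers so the network improves over time.
input_to_hidden = rng.normal(size=(input_nodes, hidden_nodes))
hidden_to_output = rng.normal(size=(hidden_nodes, output_nodes))

print(input_to_hidden.shape, hidden_to_output.shape)  # (4, 3) (3, 2)
```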

In 2018, technology journalist Richard Waters noted how neural networks “are modelled on a theory about how the human brain operates, passing data through layers of artificial neurons until an identifiable pattern emerges”.

This creates a knock-on problem, Waters proposed, as “unlike the logic circuits employed in a traditional software program, there is no way of tracking this process to identify exactly why a computer comes up with a particular answer”.

Waters’ conclusion is that these outcomes cannot be unpicked. The application of this type of model of the brain, taking the data through many layers, means that the answer cannot readily be retraced. The multiple layering is a good part of the reason for this.

As the layers of neural networks have piled higher, their complexity has grown, leading to the growth of ‘hidden layers’ within these depths. This problem needed to be reckoned with, especially as, he thought, it was something “the nervous system figured out a long time ago”. As the layers multiplied, deep learning plumbed new depths.

The neural network is trained using training data that, computer science writer Larry Hardesty explained, “is fed to the bottom layer – the input layer – and it passes through the succeeding layers, getting multiplied and added together in complex ways, until it finally arrives, radically transformed, at the output layer”.
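To make that description more concrete, here is a minimal Python sketch, again invented for illustration rather than taken from Hardesty: a made-up input is fed to the bottom layer and passed through successive layers, multiplied by weights and added together at each step, until a transformed output emerges at the top. The layer sizes, random weights and the ReLU nonlinearity are assumptions common in practice but not part of the quoted description.

```python
import numpy as np

rng = np.random.default_rng(1)

layer_sizes = [4, 5, 5, 2]  # input layer, two hidden layers, output layer
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.normal(size=n) for n in layer_sizes[1:]]

x = rng.normal(size=4)                 # a single example fed to the input layer
for W, b in zip(weights, biases):
    x = np.maximum(0.0, x @ W + b)     # multiply, add, then a simple nonlinearity

print(x)                               # the transformed output of the final layer
```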

The more layers, the greater the transformation and the greater the distance from input to output. The development of Graphics Processing Units (GPUs), in gaming for instance, enabled the one-layer networks of the 1960s and the two- to three-layer networks of the 1980s to blossom into the 10-, 15- or even 50-layer networks of today.

Neural networks are getting deeper. Indeed, it’s this adding of layers that is “what the ‘deep’ in ‘deep learning’ refers to”. This matters, he proposes, because “currently, deep learning is responsible for the best-performing systems in almost every area of artificial intelligence research”.

But the mystery gets deeper still. As the layers have piled higher, so too have what are referred to as “hidden layers” buried within these depths, and the discussion of the optimum number of hidden layers in a neural network is ongoing. The media theorist Beatrice Fazi has written that “because of how a deep neural network operates, relying on hidden neural layers sandwiched between the first layer of neurons (the input layer) and the last layer (the output layer), deep-learning techniques are often opaque or illegible even to the programmers that originally set them up”.
