The dead end in neural network research

Contemporary artificial neural networks are a (very profitable) dead end.

Hardware acceleration and hyperspecialized algorithms have allowed neural networks to become much more successful than other AI approaches. However, they have also ensured that neural networks are overspecialized. There are some tasks for which they are not suitable at all, but researchers have no incentive to change this.

If someone were to invent a revolutionary new technique for improving artificial neural networks, it would not be adopted unless it also benefited from hardware acceleration. Even if continuing that line of research would lead to amazing results in the long term, nobody would even try, because the immediate results simply cannot compete with the performance of overspecialized contemporary neural networks.

In terms that should be clear to AI researchers: If we view AI research itself as an optimization problem, then we are currently stuck in a local optimum. All our research to improve neural networks only drives us deeper into that local optimum. The global optimum, which represents a General Artificial Intelligence, will not be reached unless we change course.
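The local-optimum picture can be made concrete with a toy example. The sketch below is only an illustration: `loss` is an arbitrary one-dimensional curve with two valleys, chosen for this example and not a model of real research progress, and `descend` is plain gradient descent with a hypothetical learning rate and step count.

```python
def loss(x):
    # Hypothetical landscape with a shallow valley and a deeper one.
    return x**4 - 3 * x**2 + x


def gradient(x):
    # Derivative of loss(x).
    return 4 * x**3 - 6 * x + 1


def descend(x, learning_rate=0.01, steps=1000):
    # Plain gradient descent: always take a small step downhill.
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


stuck = descend(2.0)    # settles in the shallow valley near x = 1.13
better = descend(-2.0)  # settles in the deeper valley near x = -1.30

print(f"start  2.0 -> x = {stuck:.2f}, loss = {loss(stuck):.2f}")
print(f"start -2.0 -> x = {better:.2f}, loss = {loss(better):.2f}")
```

Every step from the first starting point improves the loss, yet the search never reaches the deeper valley, because getting there would require temporarily accepting worse results.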

What contemporary neural networks are missing

There are a number of features the brain has that artificial neural networks make no attempt to replicate. While many of these features are probably not strictly necessary and are just artifacts of evolution, some of them strike me as important enough that I think artificial neural networks are really missing out on something.

The reason artificial neural networks do not have these features is that replicating them would add mathematical complexity, which would make the networks harder to model and simulate.

Consider that our first research attempts at modelling these features would be unlikely to work, because that is the nature of research, and it should become clear why nobody is funding this type of research. Why spend your time on something that might not even work, when there is an existing alternative that works several orders of magnitude better from hardware acceleration alone?

The following is a short list of just a few of the features biological brains have that artificial neural networks do not emulate:

  • Brains are able to form new neurons as necessary, or remove or repurpose old ones that prove useless. If a concept in the brain turns out to be more important and complex than first anticipated, new neurons are created and allocated to it.

  • Brains are inherently recurrent and can iterate on themselves. Artificial neural networks are mostly feed-forward, with information flowing in one direction, and have some recurrence built in as a special case. Brains are mostly complex recurrent structures and have some feed-forward relations built in as a special case.

  • Brains can form new connections as necessary and delete old ones. Most artificial neural networks work in layers: Every node in one layer is connected to every node in one other layer, but not to any nodes in other layers. This rigidity is the reason why artificial neural networks can be optimized so effectively, but it also puts a strict limit on their flexibility.

  • Brains have many different types of neurons that all work in different ways. Sometimes the differences are subtle, sometimes enormous. Artificial neural networks usually have only one type of neuron.

  • Neurons can have internal states that change over time and influence their firing behavior and their growth.

  • The brain contains various hormones that can transmit information by altering the behavior of large numbers of neurons at a time. In an artificial neural network, this would be like having a mechanism that alters the hyperparameters of part of the network based on the firing behavior of the network. This has immense potential for many different purposes, but to my knowledge nobody has ever tried it, or if they have, the results weren't good enough to be published and used in practice.
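To make the hormone idea from the last item more tangible, here is a minimal sketch of what such a mechanism could look like. Everything in it is an assumption made for illustration: the class `ModulatedGroup`, the `hormone` variable, the `target` activity level and the `rate` constant are hypothetical names, and using the hormone to scale a shared gain is just one of many possible choices. It is not taken from any existing library or published method.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ModulatedGroup:
    """A group of units sharing one hormone-like global signal (hypothetical)."""

    def __init__(self, weights, target=0.6, rate=0.5):
        self.weights = weights  # one row of input weights per unit
        self.target = target    # desired mean activity of the group
        self.rate = rate        # how quickly the hormone level reacts
        self.hormone = 0.0      # slow global state shared by all units

    def gain(self):
        # The hormone acts on a hyperparameter: more hormone, lower gain.
        return 1.0 / (1.0 + self.hormone)

    def step(self, inputs):
        g = self.gain()
        activations = [
            sigmoid(g * sum(w * x for w, x in zip(row, inputs)))
            for row in self.weights
        ]
        mean_activity = sum(activations) / len(activations)
        # The group's own firing behavior drives the hormone level:
        # it rises while the group is overactive and falls otherwise.
        change = self.rate * (mean_activity - self.target)
        self.hormone = max(0.0, self.hormone + change)
        return mean_activity


group = ModulatedGroup([[2.0, 1.0], [1.5, 2.5], [3.0, 0.5]])
history = [group.step([1.0, 1.0]) for _ in range(200)]
first, last = history[0], history[-1]
print(f"mean activity: first step {first:.2f}, last step {last:.2f}")
print(f"hormone level: {group.hormone:.2f}, gain: {group.gain():.2f}")
```

With a constant strong input, the group starts out overactive; the hormone level then builds up and lowers the gain of every unit at once, without touching any individual weight. The same pattern could in principle modulate other hyperparameters, such as a learning rate.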