Biological & Artificial Neural Networks

This notebook sets out to describe biological and artificial neural networks and to explain some of the key concepts behind them.

Background

Biological Networks

A biological network is any network of interactions or relationships between entities in a biological system.

Biological systems are often represented as networks: complex sets of binary interactions or relations between different entities. Essentially, every biological entity interacts with other biological entities, from the molecular to the ecosystem level, providing us with the opportunity to model biology using many different types of networks, such as ecological, neurological, metabolic or molecular interaction networks.

Systems biology aims to understand biological entities at the systemic level, analysing them not only as individual components but also as interacting systems with emergent properties. Related to this is network biology, which allows the representation and analysis of biological systems using tools derived from graph theory.

Network theory is the study of graphs as representations of either symmetric or asymmetric relations between discrete objects. In computer science and network science, network theory is a part of graph theory: a network can be defined as a graph in which nodes and/or edges have attributes.
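As a minimal sketch of that definition, a small attributed graph can be written down in plain Python dictionaries; the entities and attribute values here are invented for illustration.

#A tiny attributed graph: nodes are biological entities, edges are interactions
nodes = {"A": {"type": "neuron"}, "B": {"type": "neuron"}, "C": {"type": "neuron"}}
#each edge carries an attribute describing the interaction between its endpoints
edges = {("A", "B"): {"strength": 0.8}, ("B", "C"): {"strength": 0.3}}
print(len(nodes), "nodes,", len(edges), "edges")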

Neuronal Networks

There are many kinds of networks in biology; the one of particular importance here is the neuronal network.

The complex interactions in the brain make it a perfect candidate for applying network theory. Neurons in the brain are deeply connected with one another, and this results in complex networks being present in both the structural and functional aspects of the brain.

A neural circuit is a population of neurons interconnected by synapses to carry out a specific function when activated.

One principle by which neurons work is neural summation: potentials arriving at the postsynaptic membrane sum in the cell body. If the depolarization at the axon hillock goes above threshold, an action potential will occur that travels down the axon to the terminal endings to transmit a signal to other neurons.

Summation, which includes both spatial and temporal summation, is the process that determines whether or not an action potential will be generated by the combined effects of excitatory and inhibitory signals, both from multiple simultaneous inputs (spatial summation), and from repeated inputs (temporal summation). Depending on the sum total of many individual inputs, summation may or may not reach the threshold voltage to trigger an action potential.

At any given moment, a neuron may receive postsynaptic potentials from thousands of other neurons. Whether threshold is reached, and an action potential generated, depends upon the spatial (i.e. from multiple neurons) and temporal (from a single neuron) summation of all inputs at that moment. It is traditionally thought that the closer a synapse is to the neuron's cell body, the greater its influence on the final summation.
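A toy sketch of this idea in Python; the potential values and threshold below are rough illustrative numbers, not measured physiology.

import numpy as np
#Toy neural summation - all values are illustrative, not measured physiology
#postsynaptic potentials (mV): positive = excitatory (EPSP), negative = inhibitory (IPSP)
psps = np.array([4.0, 3.5, -2.0, 5.0, 2.5])
resting_potential = -70.0   #typical resting membrane potential (mV)
threshold = -55.0           #typical firing threshold (mV)
#spatial/temporal summation: the potentials add up in the cell body
membrane_potential = resting_potential + np.sum(psps)
fires = membrane_potential >= threshold
print(membrane_potential, fires)   #-57.0 False - below threshold, no action potential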

Weights

In neuroscience and computer science, synaptic weight refers to the strength or amplitude of a connection between two nodes, corresponding in biology to the amount of influence the firing of one neuron has on another. The term is typically used in artificial and biological neural network research. Each connection between two neurons has a unique synapse with a unique weight attached to it.
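As a toy illustration, the same presynaptic spike has a different influence on each downstream neuron depending on the weight of the connection; the neuron names and weight values below are invented.

#Illustrative synaptic weights - connection strengths are invented for this example
spike = 1.0   #a presynaptic neuron fires
weights = {"neuron_B": 0.9, "neuron_C": 0.2}   #hypothetical downstream connections
influence = {post: w * spike for post, w in weights.items()}
print(influence)   #the stronger the weight, the larger the influence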

Artificial Intelligence

Deep Learning is a subset of Machine Learning, and Machine Learning is in turn a subset of Artificial Intelligence; the three fields nest inside one another like concentric circles.

Biological Neuron

It is easier to understand how a biological neuron works by breaking down the process into 3 phases:

  • The first phase can be referred to as the input phase: the dendrites gather inputs through synaptic pathways; in biological terms, these are impulses travelling from other neurons.
  • The second phase takes place in the cell body, which contains the nucleus, the cell's control centre. Here the incoming signals are summed, and if the resulting activation exceeds a certain threshold, an action potential is generated that propagates into the third phase.
  • The third phase is the output phase: the axon carries the signal on to the next cell or neuron, provided, as mentioned above, that the threshold has been reached.

Artificial Neuron

An artificial neuron is a mathematical function loosely modelled on the biological neuron.

Again, to make it easier to understand how artificial neurons work, we break the process down into 3 phases:

  • The first phase involves one or more inputs $x$ being multiplied by corresponding weights $w$.
  • The second phase involves summing all of the weighted inputs and adding a bias term; the result is known as the weighted sum.
  • The third phase involves a non-linear function, typically known as an activation function, which determines whether the weighted sum exceeds a threshold and, if so, triggers an output event (a minimal sketch combining all three phases follows this list).
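Putting the three phases together, here is a minimal sketch of a single artificial neuron in Python. The inputs, weights, bias and the choice of a sigmoid activation are illustrative assumptions, not values from any real network.

import numpy as np
#Minimal artificial neuron - all values are arbitrary examples
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
x = np.array([0.5, 0.3, 0.2])   #phase 1: inputs
w = np.array([0.4, 0.7, 0.2])   #phase 1: weights
b = 0.1                         #phase 2: bias term
z = np.dot(w, x) + b            #phase 2: weighted sum
y = sigmoid(z)                  #phase 3: activation
print(z, y)   #approximately 0.55 and 0.63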

Phase 1 - Input Phase

The input phase consists of one or more inputs, $x$, which are multiplied by an equivalent number of weights, $w$.

Inputs can be many things, such as images, words, sentences or sound, but for this tutorial we are going to use images as the input.

It is important to understand that whatever form the input takes, whether images or words, the computer has to convert it into numerical form before it can process it.

Images can be broken down into pixels. A pixel is generally thought of as the smallest single component of a digital image; in an 8-bit greyscale image, each pixel has a value between 0 and 255, where 0 typically refers to the absence of light, and hence black, and 255 refers to white.

Note that in the image below the colours have been inverted, which is why 0 shows as white and 255 as black.
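To make this concrete, here is a small sketch of how an image becomes numerical input; the 3x3 pixel values below are invented for illustration.

import numpy as np
#A hypothetical 3x3 greyscale image - each entry is a pixel intensity (0 = black, 255 = white)
image = np.array([[  0, 128, 255],
                  [ 64, 200,  32],
                  [255,  16, 180]])
#Flatten the 2D pixel grid into a 1D input vector and scale to the range 0 to 1
x = image.flatten() / 255.0
print(x)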

import numpy as np
#Create example input data
#create 20 evenly spaced numbers ranging from -10 to 10
x = np.linspace(-10, 10, 20)
print(x)
[-10.          -8.94736842  -7.89473684  -6.84210526  -5.78947368
  -4.73684211  -3.68421053  -2.63157895  -1.57894737  -0.52631579
   0.52631579   1.57894737   2.63157895   3.68421053   4.73684211
   5.78947368   6.84210526   7.89473684   8.94736842  10.        ]

Phase 2 - Weighted Sum

The second phase sums all of the weighted inputs and adds a bias term $b$; the result, known as the weighted sum, is

$z = \sum_{i} w_i x_i + b$
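A minimal sketch of the weighted sum in Python, reusing the illustrative inputs, weights and bias from the artificial neuron sketch above:

import numpy as np
#Weighted sum in python - inputs, weights and bias are arbitrary example values
x = np.array([0.5, 0.3, 0.2])
w = np.array([0.4, 0.7, 0.2])
b = 0.1
#element-wise multiply, sum the products, then add the bias
z = np.sum(w * x) + b
print(z)   #approximately 0.55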

Phase 3 - Activation Function

In artificial neural networks, the activation function of a node defines the output of that node, or "neuron," given an input or set of inputs. This output is then used as input for the next node, and so on, until a desired solution to the original problem is found.

It maps the resulting values into a desired range, such as 0 to 1 or -1 to 1, depending upon the choice of activation function. For example, the logistic activation function maps all inputs in the real number domain into the range 0 to 1.

There are a number of activation functions as described below.

Sigmoid Activation Function

A logistic function or logistic curve is a common "S"-shaped (sigmoid) curve, with equation:

$\sigma(x) = \dfrac{1}{1 + e^{-x}}$

import numpy as np
#Sigmoid function in python
#The function returns a value between 0 and 1
#if x is a large negative number, the sigmoid function will return a value closer to 0
#If x is a large positive number, the sigmoid function will return a value closer to 1
def sigmoid(x):
  return 1 / (1 + np.exp(-x))
#Large negative number - closer to 0
x = -7
sigmoid(x)
0.0009110511944006454
#Large positive number - closer to 1
x = 7
sigmoid(x)
0.9990889488055994
#plotting the sigmoid function
from matplotlib import pyplot as plt
x_vals = np.linspace(-10, 10, 100)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid(True)
ax.plot(x_vals, sigmoid(x_vals), c="r")
ax.set_title('Sigmoid Activation Function')

plt.show()

Tanh Activation Function

The tanh function, a.k.a. the hyperbolic tangent function, is a rescaling of the logistic sigmoid, such that its outputs range from -1 to 1: $\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\sigma(2x) - 1$.
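As a quick numerical check of that rescaling relationship, assuming the sigmoid function defined earlier in this notebook:

#Check that tanh(x) = 2*sigmoid(2x) - 1 at a handful of points
x = np.linspace(-5, 5, 11)
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))   #expected: True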

#Tanh function in python
#The function returns a value between -1 and 1
#if x is a large negative number, the tanh function will return a value closer to -1
#if x is a large positive number, the tanh function will return a value close to 1
def tanh(x):
    return np.tanh(x)
#large negative number - closer to -1
x = -7
tanh(x)
-0.9999983369439447
#Large positive number - closer to 1
x = 7
tanh(x)
0.9999983369439447
#plotting the tanh curve
x_vals = np.linspace(-10, 10, 100)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid(True)
ax.plot(x_vals, tanh(x_vals), c="r")
ax.set_title('Tanh Activation Function')

plt.show()

ReLU Activation Function

In the context of artificial neural networks, the rectifier is an activation function defined as the positive part of its argument:

$f(x) = x^{+} = \max(0, x)$

where $x$ is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering. This activation function was first introduced to a dynamical network by Hahnloser et al. in 2000, with strong biological motivations and mathematical justifications.[1][2] It was first demonstrated in 2011 to enable better training of deeper networks,[3] compared to the widely used activation functions that preceded it, e.g. the logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical[4] counterpart, the hyperbolic tangent. The rectifier is, as of 2018, the most popular activation function for deep neural networks.[5][6]

A unit employing the rectifier is also called a rectified linear unit (ReLU).

#ReLU function in python
#The function returns x if x is positive, and 0 otherwise
#If x is a negative number the function will return 0
#If x is a positive number the function will return the number itself
def relu(x):
    return max(0, x)
#Using a negative number
x = -7
relu(x)
0
#Using a positive number
x = 12
relu(x)
12
z = np.arange(-2, 2, .1)
#Vectorised ReLU: element-wise maximum of 0 and z
y = np.maximum(0, z)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(z, y)
ax.set_ylim([-1.0, 2.0])
ax.set_xlim([-1.0, 2.0])
ax.grid(True)
ax.set_xlabel('z')
ax.set_title('Rectified linear unit')

plt.show()

Softmax Activation Function

The softmax function takes as input a vector of K real numbers and normalizes it into a probability distribution consisting of K probabilities:

$\mathrm{softmax}(x)_i = \dfrac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}}$

#Softmax function in python
#The function returns probabilities that add up to 1
def softmax(x):
    return np.exp(x) / np.sum(np.exp(x), axis=0)
#x is an array of 3 numbers
x = [4.0, 2.5, 5.0]
#Softmax function returns an array of probabilities that add up to 1
softmax(x)
array([0.25371618, 0.05661173, 0.68967209])
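As a quick sanity check, the returned probabilities should sum to 1 (up to floating-point error):

#Sanity check - the probabilities sum to 1
print(np.sum(softmax(x)))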