
Artificial Neural Networks and the Intuition Behind Them

Artificial Neural Networks (ANNs): ANNs are collections of simple nodes called neurons that are connected in rather complex ways to make up intelligent computing systems. These systems are inspired by the biological neurons found in the human brain.

Neuron: A neuron is the smallest unit of a neural network; it implements a mathematical function relevant to the network it belongs to.
A neuron has three parts.
1. Input connections
2. Core
3. Output connections

A neuron takes input through its input connections, performs some mathematical operations on the inputs, and passes the result out through its output connections.

Below is a picture describing a simple Neuron.

A Simple Neuron

Here the values X₁, X₂, …, Xₙ are received as input to the neuron via the corresponding input connections. Each connection is associated with a value called a weight, and there is one extra connection called the bias. The neuron multiplies each input value by the weight of its connection, then sums all these products together with the bias to obtain a single value. This value is fed to the activation function, which produces the output Y.

What are the bias and the activation function?

A bias is an extra connection that feeds a constant value to the neuron. Its full purpose is a bit too technical to go into here; for simplicity, the bias gives the model extra freedom to fit the given data. An activation function is a function that describes a rule for the neuron's output. It can be something like: output 1 whenever the value is above a certain threshold, and 0 otherwise.
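The neuron just described can be sketched in a few lines of code. This is a minimal illustration, not any particular library's API; the function names and numbers are made up.

```python
# A minimal sketch of a single neuron: weighted sum of inputs,
# plus a bias, followed by a threshold activation function.

def step(value, threshold=0.0):
    """Activation function: output 1 above the threshold, 0 otherwise."""
    return 1 if value > threshold else 0

def neuron(inputs, weights, bias):
    """Multiply each input by its weight, sum with the bias, then activate."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(total)

# Inputs X1, X2, X3 with illustrative weights and a bias:
print(neuron([1.0, 0.5, 2.0], [0.4, -0.2, 0.1], bias=-0.3))  # -> 1
```

Here the weighted sum is 0.4 − 0.1 + 0.2 − 0.3 = 0.2, which is above the threshold 0, so the neuron outputs 1.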

That is pretty much all a neuron is.

Now, a bunch of interconnections between these simple neurons is what is called an Artificial Neural Network (ANN). Usually these connections are made in layers, where each neuron in one layer is connected to neurons in another layer.

The connections can be arranged in any way that suits the problem we're trying to solve. They can form a fully connected network, where every neuron in one layer is connected to all the neurons in the next layer, or a one-to-one connection between the neurons of two layers. Even a random set of connections qualifies as a neural network, though it might or might not solve the problem under consideration. The connections and architecture of a neural network strongly affect how it works and how well it can learn.
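A fully connected layer, the most common arrangement, can be sketched as follows. The weights and shapes here are illustrative, not from any real model.

```python
# A sketch of a fully connected layer: every neuron in the layer sees
# every input. Each neuron has its own row of weights and its own bias.

def dense_layer(inputs, weight_rows, biases):
    """Return one output per neuron: weighted sum of all inputs plus bias."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weight_rows, biases)]

# A layer of 2 neurons, fully connected to 3 inputs:
outputs = dense_layer([1.0, 2.0, 3.0],
                      [[0.1, 0.2, 0.3],    # weights of neuron 1
                       [0.0, -1.0, 1.0]],  # weights of neuron 2
                      [0.5, 0.0])
print(outputs)  # approximately [1.9, 1.0]
```

Stacking several such layers, each feeding the next, gives the layered networks described above.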

An important question to ponder: how does a neural network make a decision?

To answer this, let’s consider an example:

Let's say you're trying to decide whether to go out for dinner on a particular day. Before making any decision, you have some factors to consider; in this case, let's say they are the weather, your mood, and what day it is. Each of these factors carries a certain importance: you might go out if the weather and your mood are favorable even though it's a weekday, which means the weather and your mood matter more than the day. That doesn't mean the day is irrelevant. On a given day, after weighing these factors, you decide whether or not to go out for dinner.

In neural-network terminology, the factors are the inputs and their corresponding importance values are the weights.
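The dinner decision above maps directly onto a single neuron. The weights and threshold below are made-up numbers chosen to match the story: weather and mood matter more than the day.

```python
# The dinner decision as a single neuron. Inputs are 1 (favorable)
# or 0 (unfavorable); the weights encode each factor's importance.

def go_out(weather_good, good_mood, is_weekend):
    score = (weather_good * 0.5    # weather: high importance
             + good_mood * 0.4     # mood: high importance
             + is_weekend * 0.2)   # day: lower, but not zero
    return score > 0.6             # decision threshold

# Good weather and a good mood on a weekday is still enough to go out:
print(go_out(1, 1, 0))  # -> True
# A weekend alone, with bad weather and a bad mood, is not:
print(go_out(0, 0, 1))  # -> False
```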

Neural networks work the same way you did, but on a scale so large that no human could weigh that many factors and make a decision. For intuition: if we want to predict the future stock price of a particular company, there are thousands of factors at play, each with its own importance, and considering all of them is impossible for a human. Here neural networks come to the rescue: they can work with millions of parameters and give astonishingly accurate results.

How do neural networks get this good? This might be the question on everyone's mind.

The answer is simple.

They learn.

The learning method is simple: it's a reward-punishment scheme in which a model is asked to predict something; if the prediction is close to the correct answer, the model is rewarded, and if it's wrong, the model is punished. Whenever the model is rewarded, it tries to keep up the good score to gain future rewards; whenever it is punished, it tries to make fewer mistakes to reduce its chances of being punished in the future. These reward-punishment cycles can go on until the desired accuracy is achieved, or be stopped after a certain number of training steps.
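One concrete instance of this reward-punishment idea is the classic perceptron learning rule, sketched here on a toy task (learning logical AND). When the neuron is "punished" (a wrong prediction), its weights are nudged; when it is "rewarded" (a correct prediction), they are left alone.

```python
# Perceptron-style learning on logical AND: nudge the weights only
# when the prediction is wrong (the "punishment" case).

def predict(inputs, weights, bias):
    return 1 if sum(x * w for x, w in zip(inputs, weights)) + bias > 0 else 0

def train(samples, steps=20, lr=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        for inputs, target in samples:
            error = target - predict(inputs, weights, bias)  # 0 when rewarded
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train(data)
print([predict(x, weights, bias) for x, _ in data])  # -> [0, 0, 0, 1]
```

After a few cycles of corrections, the neuron predicts AND correctly on all four inputs.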

This type of learning is called Supervised Learning. It's called supervised because the model is supervised and rewarded based on its performance, like a teacher rewarding a student.

The other method of learning is Unsupervised Learning. Here the model doesn't have a supervisor to guide it through training; the model has to find the patterns in the data and come up with results on its own.

Unsupervised learning is learning without a teacher (like learning from online tutorials): it's a good learning experience, but not as good as having a teacher. Usually, supervised models outperform unsupervised ones.

Supervised learning

When training a neural network, the most important thing to remember is the loss function. A loss function evaluates the loss of the model, i.e., how badly the model is performing. During training, the goal of the algorithm is to minimize the loss value of the model. This is done by modifying the weights of the model so that the loss function gives a lower value than in the previous iteration, using backpropagation together with (most commonly) a gradient descent optimization algorithm.
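One common loss function is mean squared error (MSE): the average of the squared differences between predictions and targets. A tiny sketch:

```python
# Mean squared error: average squared gap between predictions and targets.

def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# A model whose predictions are closer to the targets gets a lower loss:
print(mse([2.0, 4.0], [2.0, 5.0]))  # -> 0.5
print(mse([0.0, 9.0], [2.0, 5.0]))  # -> 10.0
```

Training drives this number down by adjusting the weights.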

Backpropagation: Backpropagation is a repeated loop in which, after each iteration, the algorithm works backwards through the network to determine how the weights of its connections should change; the actual modification is carried out by the gradient descent algorithm. (The variant of backpropagation used for recurrent networks is called Backpropagation Through Time, or BPTT.)

Network Training

Gradient Descent: This algorithm tries to minimize the loss function by finding a local minimum. It takes small steps in the direction of the negative gradient until it reaches a local minimum.

Below is a picture describing gradient descent.

Gradient Descent Algorithm

J(w) is the loss function for weight w. In the first step the weights are initialized randomly, and every step after that follows the pattern below:

1. Calculate the gradient
2. Update the weights based on the Gradient
3. Go to step 1.
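The three steps above can be sketched on a toy loss J(w) = (w − 3)², whose minimum sits at w = 3 and whose gradient is dJ/dw = 2(w − 3). This is a hand-rolled illustration, not a real training framework.

```python
# Gradient descent on the toy loss J(w) = (w - 3)^2.

import random

def gradient(w):
    """dJ/dw for J(w) = (w - 3)^2."""
    return 2 * (w - 3)

w = random.uniform(-10, 10)       # step 0: random initialization
learning_rate = 0.1
for _ in range(100):              # repeat steps 1-3
    grad = gradient(w)            # step 1: calculate the gradient
    w = w - learning_rate * grad  # step 2: step against the gradient
print(round(w, 4))                # -> 3.0 (very close to the minimum)
```

Each update shrinks the distance to the minimum by a constant factor, so after 100 steps w has converged to 3 for any starting point in the range.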

The training of a neural network is influenced by a parameter called the learning rate, which the user has to set before training the model. Because it is set by the user rather than learned, it is called a hyperparameter.

The learning rate is essentially how big a step to take at each weight update. If the learning rate is too small, the steps are tiny and the model may never reach the local minimum in a reasonable number of steps. If it is too high, the model might overshoot the local minimum and bounce back and forth trying to find it.

Effect of different Learning rates

L(w) is the Loss function. Epoch is the training step.

Essentially, gradient descent can be viewed as a hiker (the weight) who wants to hike down to the bottom (the minimum) of a valley (the loss function). Each step depends on the steepness of the slope (the gradient) and the hiker's stride length (the learning rate).
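The effect of the learning rate can be seen numerically on another toy loss, J(w) = w² (gradient 2w), starting from w = 1. The numbers here are illustrative.

```python
# How the learning rate changes gradient descent on J(w) = w^2.

def descend(lr, steps=50, w=1.0):
    for _ in range(steps):
        w = w - lr * (2 * w)  # step against the gradient dJ/dw = 2w
    return w

print(abs(descend(0.1)))    # moderate rate: converges very close to 0
print(abs(descend(0.001)))  # too small: has barely moved after 50 steps
print(abs(descend(1.1)))    # too large: overshoots, bounces, and diverges
```

With the too-large rate, each step flips the sign of w and grows its magnitude: exactly the back-and-forth bouncing described above.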

Neural networks are found in many research domains, most often in deep learning architectures for computer vision, time-series prediction, natural language processing, et cetera.

The deep learning branch of ANNs is the closest to the human brain's architecture because of the depth of its neuron layers and its striking similarity to how the brain works. A typical deep learning model can contain anywhere from 5 layers to thousands of layers. Thanks to the exponential growth in computing power and the wide availability of data, deep learning is becoming the new hot favorite in the AI community.

Some of the deep learning architectures for computer vision include CNNs, U-Net, and ResNet.

LSTMs and GRUs are among the most sought-after architectures for time-series prediction and natural language processing (NLP).

Great! Now that you have finished the article, don't stop that curious mind, and never stop asking dumb questions, because small dumb steps are what make you intelligent in the end, just like our ANNs. ;D

Keep asking… Keep learning…
