Neural Networks
Leveraging Artificial Intelligence to further our own.
One of the most intriguing creations in machine learning, and the very centre of deep learning, neural networks mimic the way the human brain sends signals between neurons to identify patterns. Each artificial “neuron” in the network receives information, processes it, and passes the result forward until, at the final layer, a conclusion is reached. That conclusion can then be treated as a form of intelligence. In the usual supervised setting, the computer is fed many labelled inputs, learns the patterns that connect them to their labels, and can then classify new, unseen inputs with reasonable accuracy.
Neural Networks vs Climate Change
Artificial neural networks appear throughout our daily lives, from language translators and face-recognition software to the customized social media feeds we scroll through every day; however, they also operate on a much larger scale. A neural network called the Calving Front Machine (CALFIN) can analyse decades’ worth of satellite imagery to measure glacier edges and track how they are shrinking, with the same accuracy as trained scientists. It is dependable and autonomous, and it can monitor the rate of glacier loss taking place all over the world by processing far more glacier images than humans ever could. To prepare the network, it was trained on thousands of glacier images, with the weights and thresholds of its nodes tweaked until the final network could judge new inputs correctly.
Neural Networks in Cosmology
Another application, reported in an article by Carnegie Mellon University, is the use of neural networks and machine learning to speed up cosmological simulations. These simulations are an essential source of information because they let scientists analyse data about dark matter and dark energy in the universe. The problem researchers faced was a trade-off: they could run either low-resolution simulations of a large volume or high-resolution simulations of a very small one, which prevented them from analysing every region and scale accurately. To overcome this, researchers at Carnegie Mellon University created a machine-learning algorithm that converts a low-resolution simulation into a super-resolution simulation. The algorithm can generate simulations that resolve up to 512 times as many particles, in roughly a tenth of the original processing time.
The researchers heading this project, Yin Li and Yueying Ni, expect the algorithm to become much more efficient as it is continually trained on more data sets. They used an approach known as a generative adversarial network (GAN), in which two neural networks are pitted against each other: one generates new data, while the other tries to tell that generated data apart from the real thing. In this case, the first network takes low-resolution images as input and converts them into high-resolution images, and the second network tries to distinguish these artificially created images from images produced conventionally by scientists. Training is repeated until the second network can no longer tell the artificial simulations from the real ones, meaning the first network is producing simulations just as accurate as those made by conventional methods.
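To make the adversarial idea concrete, here is a minimal sketch of a GAN training loop in PyTorch. The toy models, layer sizes, learning rates and random stand-in data below are assumptions chosen purely for illustration; they are not the architecture, data or code used by the Carnegie Mellon team.

```python
import torch
import torch.nn as nn

# Toy generator: maps a "low-resolution" vector to a larger "high-resolution" one.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 256))
# Toy discriminator: outputs the probability that its input is a "real" simulation.
D = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    low_res = torch.randn(32, 16)    # stand-in for low-resolution inputs
    real_hi = torch.randn(32, 256)   # stand-in for conventionally produced high-res data

    # Train the discriminator to label real data as 1 and generated data as 0.
    fake_hi = G(low_res).detach()
    d_loss = bce(D(real_hi), torch.ones(32, 1)) + bce(D(fake_hi), torch.zeros(32, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Train the generator to fool the discriminator into labelling its output as real.
    g_loss = bce(D(G(low_res)), torch.ones(32, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```

The alternation in the loop mirrors the description above: the generator keeps improving until the discriminator can no longer reliably separate generated outputs from real ones.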
But the blazing efficiency we see today in generating such simulations did not come easily. The algorithm failed to work for over two years until the researchers finally made a breakthrough and the model began producing accurate simulations. These simulations focus primarily on dark matter and gravity, and so omit “smaller” events such as star formation and the effects of black holes; the team hopes to extend the algorithm to study these events as well. Studying something as vast as the universe can seem impossible, but with breakthroughs like these we come closer to understanding how it works.
How it works
The network itself is made up of many small nodes that are densely interconnected and usually organized into layers. Each node is connected to a multitude of other nodes, from which it receives data or to which it sends data. Every connection into a node carries a value called a “weight”, and each node has a pre-set threshold. When a node receives information, it multiplies each input by its weight, sums the results, and compares the total against the threshold. If the weighted sum falls below the threshold, nothing is passed onwards; if it exceeds the threshold, the node sends its output, the sum of all its weighted inputs, along all of its outward connections.
Mathematically, each input is first multiplied by the weight assigned to its connection into the node. A weight expresses the strength of the relationship between two nodes and decides how much or how little one influences the other; the weighted inputs are then combined through the following summation.
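Writing the inputs as $x_1, \dots, x_n$ and the corresponding weights as $w_1, \dots, w_n$ (notation introduced here for illustration), the summation is:

$$x_1 w_1 + x_2 w_2 + \cdots + x_n w_n = \sum_{i=1}^{n} x_i w_i$$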
Next, the inputs and weights are written as row vectors, and their dot product is calculated:
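With the inputs collected into a row vector $X = [x_1, \dots, x_n]$ and the weights into $W = [w_1, \dots, w_n]$, this is the same quantity as before:

$$X \cdot W = \sum_{i=1}^{n} x_i w_i$$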
The node’s bias is then added to this dot product to obtain the value that is passed forward:
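Denoting the bias by $b$ and the resulting value by $z$:

$$z = X \cdot W + b$$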
The last step of this forward propagation is to pass the value through an activation function, such as the sigmoid function, which applies a nonlinear transformation. This lets the network handle complex relationships that are not linear in nature. The sigmoid function, for example, is:
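In standard notation, applied to the value $z$ obtained above:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$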
Next, to measure the accuracy of the network, the difference between the actual (target) value and the value predicted by the network is evaluated using the mean squared error (MSE) formula:
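For $n$ training examples with actual values $y_i$ and predicted values $\hat{y}_i$ (notation assumed here), the mean squared error is:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$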
Once this error is known, the next step is to work out by how much the algorithm is overshooting or undershooting its predictions, and to adjust the weights and the bias accordingly so that the algorithm improves, in a sense allowing it to “learn.”
A cost function is used to quantify this loss in accuracy: the lower the cost, the smaller the difference between the actual and predicted values. In every iteration, the partial derivative of the cost function is therefore calculated with respect to each of the parameters, as follows:
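One common way to write this, as a sketch for a single sigmoid node with cost $C$ (the exact form used in the source is not reproduced here), applies the chain rule:

$$\frac{\partial C}{\partial w_i} = \frac{\partial C}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial w_i}, \qquad \frac{\partial C}{\partial b} = \frac{\partial C}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial b}$$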
This can be simplified to:
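For a single example with prediction $\hat{y} = \sigma(z)$ and target $y$, and squared-error cost $C = (y - \hat{y})^2$ (assumptions stated above), the derivatives reduce to:

$$\frac{\partial C}{\partial w_i} = 2(\hat{y} - y)\,\sigma(z)\bigl(1 - \sigma(z)\bigr)\,x_i, \qquad \frac{\partial C}{\partial b} = 2(\hat{y} - y)\,\sigma(z)\bigl(1 - \sigma(z)\bigr)$$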
The bias and the weights are then updated in the direction that reduces the cost, iteration after iteration, gradually bringing the predictions as close to the true values as possible. This iterative update is known as gradient descent.
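As a concrete illustration, here is a minimal sketch of the whole loop for a single sigmoid neuron trained by gradient descent on one input vector. The variable names and values (`learning_rate`, the toy inputs, and so on) are chosen for this example only.

```python
import numpy as np

# Illustrative data: one input vector and its target label.
x = np.array([0.5, 0.1, 0.9])   # inputs
y = 1.0                          # actual value

rng = np.random.default_rng(0)
w = rng.normal(size=3)           # one weight per input connection
b = 0.0                          # bias
learning_rate = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(100):
    # Forward propagation: weighted sum plus bias, then the activation.
    z = np.dot(x, w) + b
    y_hat = sigmoid(z)

    # Squared error between the actual and predicted values.
    cost = (y - y_hat) ** 2

    # Partial derivatives of the cost with respect to the weights and bias
    # (the simplified chain-rule expressions above).
    dcost = 2 * (y_hat - y) * y_hat * (1 - y_hat)
    grad_w = dcost * x
    grad_b = dcost

    # Gradient-descent update: step against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

    if step % 20 == 0:
        print(f"step {step}: cost={cost:.4f}, prediction={y_hat:.4f}")
```

Running the sketch shows the cost shrinking and the prediction approaching the target, which is exactly the “learning” described in the paragraphs above.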
Conclusion
In a sense, neural networks have opened us up to a world of new possibilities: a highly accurate and surprisingly fast analytical tool with skills that can almost match those of a human. In an age where information and knowledge are essentially the key to any step forward, neural networks are sure to take up an indispensable role across fields and industries, prompting continued research and a deeper understanding of their workings and implementations.
Written by Ishita Pandey and Yaswanth Biruduraju
References
- https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414
- https://www.ibm.com/cloud/learn/neural-networks
- https://news.climate.columbia.edu/2021/05/05/artificial-neural-network-joins-fight-against-receding-glaciers/
- https://www.cmu.edu/mcs/news-events/2021/0504_supersims.html
- https://towardsdatascience.com/introduction-to-math-behind-neural-networks-e8b60dbbdeba