Data Visualization
The aim of this project was to predict and visualize temperature change over a century using neural networks. Heatmaps suited the time-series nature of the data and made the visualization effective.
Team: 3 members (Niharika Mathur, Shreya Verma, Shivam Ghildiyal)
Individual Contribution: Neural Network Design, Visualization
Tools used: Python Libraries, Jupyter Notebook, RStudio
We completed this project as the class project for the course Data Visualization (CSE3020).
OVERVIEW
Global warming, also referred to as climate change, is the observed century-scale rise in the average temperature of the Earth's climate system and its related effects.
Multiple lines of scientific evidence show that the climate system is warming. Since it is difficult for humans to comprehend huge amounts of raw data, we reach for data visualization solutions.
In our approach, we apply time series analysis to temperature data and produce heatmaps for rainfall and temperature.
Data on the Earth's changing temperature can help predict both the rising temperature of a specific geolocation and future temperature changes.
The aim of this project is to visualize and examine trends in temperatures and rainfall across the world and in India from the twentieth to the twenty-first century.
Parameter Metrics
Temperatures recorded at a location over time can be used to predict future trends. The LSTM layer of the network expects its input as a three-dimensional array with the dimensions [samples, time steps, features].
- Samples: These are independent observations from the domain, typically rows of data.
- Time steps: These are separate time steps of a given variable for a given observation.
- Features: These are separate measures observed at the time of observation.
- Number of neurons: Also called the number of memory units or blocks. The network requires a single neuron in the output layer with a linear activation to predict the temperature at the next time step.
- Number of epochs: One epoch is a full pass over the training data. The batch size, along with the number of epochs, determines how often the weights are updated and therefore how quickly the network learns the data.
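The parameters above can be illustrated with a minimal sketch. It is not our original notebook code: the helper names (make_windows, weight_updates, build_model), the sample temperatures, and the choice of Keras with 4 memory units are all assumptions made for illustration. The first helper turns a temperature series into the [samples, time steps, features] layout, the second counts weight updates from the batch size and number of epochs, and the third outlines an LSTM layer followed by a single linear output neuron.

```python
import math


def make_windows(series, time_steps):
    """Slice a 1-D temperature series into LSTM-style inputs and targets.

    Returns X as nested lists laid out as [samples][time steps][features]
    (a single feature here) and y with the value that follows each window.
    """
    X, y = [], []
    for start in range(len(series) - time_steps):
        window = series[start:start + time_steps]
        X.append([[value] for value in window])  # one feature per time step
        y.append(series[start + time_steps])     # next-step target
    return X, y


def weight_updates(n_samples, batch_size, epochs):
    """Number of weight updates: batches per epoch times number of epochs."""
    return math.ceil(n_samples / batch_size) * epochs


def build_model(time_steps, n_units=4):
    """Assumed Keras model: one LSTM layer, then one linear output neuron."""
    from tensorflow import keras  # imported lazily; only needed for training

    model = keras.Sequential([
        keras.layers.Input(shape=(time_steps, 1)),
        keras.layers.LSTM(n_units),                  # n_units memory units
        keras.layers.Dense(1, activation="linear"),  # next-step temperature
    ])
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model


temps = [14.1, 14.3, 14.2, 14.6, 14.8]  # made-up yearly averages
X, y = make_windows(temps, time_steps=2)
print(len(X), len(X[0]), len(X[0][0]))  # 3 2 1
print(weight_updates(n_samples=len(X), batch_size=2, epochs=10))  # 20
```

Before training, the nested lists would typically be converted to an array, so that X has the shape (samples, time steps, 1).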
Root Mean Square Error: Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals are a measure of how far from the regression line data points are; RMSE is a measure of how spread out these residuals are. In other words, it tells you how concentrated the data is around the line of best fit. Root mean square error is commonly used in climatology, forecasting, and regression analysis to verify experimental results.
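As a small worked example, RMSE can be computed directly from observed and predicted temperatures. The function name rmse and the sample values are assumptions for illustration and are not results from the project.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


observed = [14.0, 14.5, 15.0]   # made-up temperatures
predicted = [14.0, 14.0, 16.0]  # made-up model output
print(round(rmse(observed, predicted), 3))  # 0.645
```

Note that RMSE coincides with the standard deviation of the residuals only when the residuals average to zero; otherwise it also reflects any systematic bias in the predictions.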
Procedure:
- Read data from CSV
- Create a pandas data frame
- Set the color scale for heat map
- Create a dictionary to specify the data parameters like color scale and values
- Create a dictionary to specify the layout parameters like title and projection
- Create a dictionary combining the data and layout dictionaries
- Plot the graph
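The steps above can be sketched as follows. This is a hedged outline rather than our exact notebook code: it assumes Plotly as the plotting library, and the column names (Country, AverageTemperature), the color stops, the title text, and the helper names build_figure and plot_from_csv are hypothetical.

```python
def build_figure(locations, values, title):
    """Build the color scale, data dict, layout dict and combined figure dict."""
    # Color scale for the heat map: (position, color) stops from cold to hot.
    colorscale = [
        [0.0, "rgb(49,54,149)"],
        [0.5, "rgb(255,255,191)"],
        [1.0, "rgb(165,0,38)"],
    ]
    # Data parameters: what to draw and how to color it.
    data = dict(
        type="choropleth",
        locations=list(locations),
        locationmode="country names",
        z=list(values),
        colorscale=colorscale,
        colorbar=dict(title="Avg. temp (deg C)"),
    )
    # Layout parameters: title and map projection.
    layout = dict(
        title=dict(text=title),
        geo=dict(projection=dict(type="natural earth")),
    )
    # Combined dictionary handed to the plotting call.
    return dict(data=[data], layout=layout)


def plot_from_csv(path):
    """Read the CSV into a pandas data frame, build the figure and plot it."""
    import pandas as pd                # imported lazily; only needed to plot
    import plotly.graph_objects as go

    df = pd.read_csv(path)             # assumed columns, see lead-in
    fig = build_figure(df["Country"], df["AverageTemperature"],
                       "Average temperature by country")
    go.Figure(fig).show()


# Made-up rows, used only to show the structure of the figure dictionary.
fig = build_figure(["India", "France"], [24.1, 12.3], "Example")
print(sorted(fig.keys()))  # ['data', 'layout']
```

Keeping the dictionary construction separate from the file reading and plotting makes it easy to reuse the same layout for different years or regions.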
Visualizations
Avg. Temperature change from 1900 - 2012
Global Temp. on 1st Jan, 1900
Temperature change in Europe between 1850 and 1900
Rainfall in India in 1901
Rainfall in India in 2012
Rainfall change in India between 1901 and 2012
Conclusions
Heatmaps of the world were visualized to analyze the temperature trends. After examining the heatmaps, we observed a significant increase in temperature from the twentieth to the twenty-first century, particularly in the Northern Hemisphere. Europe was a particular area of interest.
The heatmaps showed a considerable increase in Europe’s temperature from 1850 to 1900, presumably due to the effects of industrialization. This module was implemented in Jupyter Notebook (Python 3.6).
Heatmaps of India and its states were visualized to examine the rainfall patterns. We interpreted the maps and concluded that rainfall in India was higher over a century ago than in 2012. This module was implemented in RStudio.