Using Plotly in R for Panel Data Visualization

Holaaa, readers!

Right now, I want to share how to use plotly in R for panel data visualization. Stay tuned :)

Source : https://plot.ly/products/dash/

Before we talk about how to use plotly in R, let me explain what panel data is.

What is Panel Data?

Panel data is data formed from two data structures: time series and cross section.

A time series is a group of observations on a single entity over time. For example: daily rainfall in Indonesia over 10 years.

A cross section is a group of observations of multiple entities at a single point in time. For example: the population of each Indonesian province in 2018.

We call our data panel data if it is organized along both dimensions. For example: the population of each province from 2010 to 2018.
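To make the three structures concrete, here is a small sketch in base R. The province names and population figures are invented placeholder values for illustration only, not real statistics.

```r
# Toy values only: every number below is invented for illustration.

# Time series: one entity observed over several years
time_series <- data.frame(
  province   = "Province A",
  year       = 2016:2018,
  population = c(100, 105, 111)
)

# Cross section: several entities observed in a single year
cross_section <- data.frame(
  province   = c("Province A", "Province B", "Province C"),
  year       = 2018,
  population = c(111, 250, 80)
)

# Panel data: every entity observed in every year
# (expand.grid varies its first argument fastest, so years cycle within each province)
panel <- expand.grid(
  year     = 2016:2018,
  province = c("Province A", "Province B", "Province C"),
  stringsAsFactors = FALSE
)
panel$population <- c(100, 105, 111, 240, 245, 250, 76, 78, 80)

print(panel)
```

Filtering the panel to one province gives back a time series, and filtering it to one year gives back a cross section.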

The data I use is the Gapminder dataset. Just click the link below to get the data!

Dataset Gapminder

First of all, import the data from sheet 1, sheet 2, sheet 3, and sheet 4. The data is in .xlsx format, so we must install and load the “openxlsx” package before we can use read.xlsx() to read XLSX data into R.

library(openxlsx)  # run install.packages("openxlsx") once beforehand
#for sheet 1: GDP data
gapminder <- read.xlsx("E:\\file kuliah\\smt 6\\Data Visualization\\uk2\\gapminder.xlsx", sheet = 1, startRow = 1, colNames = TRUE)
View(gapminder)
#for sheet 2: population data
gapminder1 <- read.xlsx("E:\\file kuliah\\smt 6\\Data Visualization\\uk2\\gapminder.xlsx", sheet = 2, startRow = 1, colNames = TRUE)
View(gapminder1)
#for sheet 3: life expectancy data
gapminder2 <- read.xlsx("E:\\file kuliah\\smt 6\\Data Visualization\\uk2\\gapminder.xlsx", sheet = 3, startRow = 1, colNames = TRUE)
View(gapminder2)
#for sheet 4: region data
gapminder3 <- read.xlsx("E:\\file kuliah\\smt 6\\Data Visualization\\uk2\\gapminder.xlsx", sheet = 4, startRow = 1, colNames = TRUE)
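Since the four calls differ only in the sheet number, they could also be written as one loop. The sketch below is only an illustration, not the workflow used in this post: it first writes a tiny stand-in workbook to a temporary file (with invented values and hypothetical sheet names) so that it runs without the real gapminder.xlsx, then reads every sheet into a named list.

```r
library(openxlsx)

# Stand-in workbook with invented values, so the example runs on its own.
# Replace `path` with the location of your own gapminder.xlsx.
path <- tempfile(fileext = ".xlsx")
write.xlsx(
  list(
    gdp        = data.frame(country = c("A", "B"), `2000` = c(1, 2), check.names = FALSE),
    population = data.frame(country = c("A", "B"), `2000` = c(10, 20), check.names = FALSE)
  ),
  file = path
)

# Read every sheet into a named list of data frames
sheet_names <- getSheetNames(path)
sheets <- lapply(sheet_names, function(s) {
  read.xlsx(path, sheet = s, startRow = 1, colNames = TRUE)
})
names(sheets) <- sheet_names

str(sheets$gdp)
```

With the real file, sheets[[1]] to sheets[[4]] would play the roles of gapminder, gapminder1, gapminder2, and gapminder3.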

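Next, create a vector that contains only one column, namely the country. Repeat each country 47 times, once for each year from 1970 to 2016. Then repeat the sequence of years (1970 to 2016) 170 times, because there are 170 countries. Finally, take each country's row from the GDP sheet, drop its first three columns, and append the remaining values to gdp_panel.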

# take the country variable
country.vec <- gapminder[, 1]
country.vec
# build the repeated panel variable (replication)
country_panel <- c()
for (i in 1:170)
{
  x = rep(country.vec[i], 47)
  country_panel <- append(country_panel, x)
}
View(country_panel)
# years 1970-2016, repeated once per country
years_panel <- rep(1970:2016, 170)
years_panel
# GDP values: one row per country, first three columns dropped
gdp_panel <- c()
for (i in 1:170)
{
  x = gapminder[i,]
  x = x[-c(1:3)]
  x = t(x)
  gdp_panel <- append(gdp_panel, x)
}
gdp_panel
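The loops above can also be expressed without explicit iteration. The following is a minimal sketch on an invented three-country table, assuming (as in the code above) that the first three columns are identifiers and the remaining columns are years. The column names `code` and `indicator` and all values are hypothetical.

```r
# Invented wide table: 3 identifier columns followed by one column per year
wide <- data.frame(
  country   = c("A", "B", "C"),
  code      = c("a", "b", "c"),
  indicator = "gdp",
  `2000`    = c(1, 2, 3),
  `2001`    = c(4, 5, 6),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

year_cols <- -(1:3)                      # drop the identifier columns
n_years   <- ncol(wide[, year_cols])

# Each country repeated once per year: A A B B C C
country_panel <- rep(wide$country, each = n_years)

# The full run of years repeated once per country: 2000 2001 2000 2001 ...
years_panel <- rep(as.integer(names(wide)[year_cols]), times = nrow(wide))

# Transposing puts years in rows, so reading column by column
# yields all years of country A, then all years of country B, and so on
gdp_panel <- as.vector(t(as.matrix(wide[, year_cols])))

data.frame(country_panel, years_panel, gdp_panel)
```

The ordering matches what the loop produces, so the three vectors line up row by row.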

Do the same for the population variable in sheet 2 and the life expectancy variable in sheet 3.

# get the data from the population sheet (sheet 2)
pop_panel <- c()
for (i in 1:170)
{
  x = gapminder1[i,]
  x = x[-c(1:3)]
  x = t(x)
  pop_panel <- append(pop_panel, x)
}
pop_panel
# get the life expectancy data from sheet 3
life_panel <- c()
for (i in 1:170) {
  x = gapminder2[i,]
  x = x[-c(1:3)]
  x = t(x)
  life_panel <- append(life_panel, x)
}
life_panel

Next, create a vector from a single column (column 6) of sheet 4, namely the region. Repeat each region 47 times, once for each year from 1970 to 2016.

region.vec <- gapminder3 [,6]
region.vec
region_panel <- c()
for (i in 1:170) {
x = rep(region.vec[i], 47)
region_panel <- append(region_panel, x)
}
region_panel

After getting the six vectors, namely country_panel, years_panel, gdp_panel, pop_panel, life_panel, and region_panel, combine them into a data frame.

gapminder_frame <- data.frame(region_panel,country_panel, years_panel, gdp_panel, pop_panel, life_panel)
View(gapminder_frame)
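Before plotting, it can help to confirm that the combined frame really is a complete panel, with each country appearing exactly once per year. The helper below, `is_balanced_panel`, is a hypothetical name introduced for this sketch, and the toy frame uses invented values in place of gapminder_frame.

```r
# Returns TRUE when every id appears exactly once in every time period
is_balanced_panel <- function(df, id_col, time_col) {
  counts <- table(df[[id_col]], df[[time_col]])
  all(counts == 1)
}

# Toy stand-in for gapminder_frame: 2 countries x 3 years
toy <- expand.grid(
  years_panel   = 2000:2002,
  country_panel = c("A", "B"),
  stringsAsFactors = FALSE
)

is_balanced_panel(toy, "country_panel", "years_panel")        # TRUE
is_balanced_panel(toy[-1, ], "country_panel", "years_panel")  # FALSE, one row is missing
```

On the real data the call would be is_balanced_panel(gapminder_frame, "country_panel", "years_panel"), and nrow(gapminder_frame) should equal 170 * 47.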

Below is a snapshot of what the data looks like after it is loaded into a data frame.

Next, make a visualization of gapminder_frame using ggplot and name it gap1. The x axis is the log of gdp_panel and the y axis is life_panel.

# create the visualization with plotly
library(plotly)  # plotly depends on ggplot2, so ggplot() becomes available too
gap1 <- ggplot(gapminder_frame, aes(x = log(gdp_panel), y = life_panel)) + geom_point()
gap1
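Because the x axis uses log(gdp_panel), it may be worth checking whether the GDP column contains values the logarithm cannot handle. The sketch below uses invented numbers; whether the real dataset contains zeros or missing values is not something this post establishes.

```r
# Invented values, including a zero and a missing value
gdp <- c(1200, 0, NA, 35000)

log(gdp)  # the zero becomes -Inf and the NA stays NA

# Keep only values with a finite logarithm
ok        <- !is.na(gdp) & gdp > 0
gdp_clean <- gdp[ok]
log(gdp_clean)
```

ggplot2 also provides scale_x_log10(), which puts the axis on a log scale while keeping the tick labels in the original units.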

The picture above does not give us much information. So we add an animation frame based on the year and name the plot gap2.

gap2 <- ggplot(gapminder_frame, aes(x = log(gdp_panel), y = life_panel)) + geom_point(aes(frame = years_panel))
ggplotly(gap2)

The output above still doesn't give detailed information, because all the points are black and cannot be told apart by country. To distinguish the points by country, we add “color = country_panel” and name the plot gap3.

gap3 <- ggplot(gapminder_frame, aes(x = log(gdp_panel), y = life_panel, color = country_panel)) + geom_point(aes(frame = years_panel))
ggplotly(gap3)

For the final result, the visualization should map the variables as follows:

x axis = log of gdp_panel
y axis = life_panel
Color (point color) = country_panel
Size (point size) = pop_panel
Shape (point shape) = region_panel (by continent)

gap4 <- ggplot(gapminder_frame, aes(x = log(gdp_panel), y = life_panel, color = country_panel)) + geom_point(aes(shape = region_panel,size = pop_panel,frame = years_panel))
ggplotly(gap4)
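The animation can be tuned further with plotly's animation_opts() and animation_slider(). The following is a sketch on an invented six-row stand-in for gapminder_frame. The timing values, axis labels, and the choice to hide the legend are my own assumptions, not settings used in this post.

```r
library(ggplot2)
library(plotly)

# Invented toy data standing in for gapminder_frame
toy <- data.frame(
  country_panel = rep(c("A", "B"), each = 3),
  region_panel  = rep(c("Region 1", "Region 2"), each = 3),
  years_panel   = rep(2000:2002, times = 2),
  gdp_panel     = c(1000, 1500, 2200, 8000, 9000, 11000),
  pop_panel     = c(5, 6, 7, 20, 21, 22),
  life_panel    = c(55, 57, 60, 70, 71, 73)
)

p <- ggplot(toy, aes(x = log(gdp_panel), y = life_panel, color = country_panel)) +
  geom_point(aes(shape = region_panel, size = pop_panel, frame = years_panel)) +
  labs(x = "log(GDP)", y = "Life expectancy") +
  theme(legend.position = "none")  # with 170 countries the legend gets crowded

anim <- ggplotly(p)
# frame = milliseconds per frame, transition = milliseconds spent moving between frames
anim <- animation_opts(anim, frame = 1000, transition = 500)
# show the current year next to the slider
anim <- animation_slider(anim, currentvalue = list(prefix = "Year: "))
anim  # displays the widget in an interactive session
```

ggplot2 may warn that `frame` is an unknown aesthetic. ggplotly() still picks it up and uses it to build the animation frames.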

Yeay! This is what the animation looks like. So beautiful, right?

Now you can enjoy your well-deserved GIF animation!

See you on another topic!

Gifa Delyani Nursyafitri

I record it here, because I know full well that human memory is limited.
