Saturday, January 27, 2024

Newspaper Graph Series

Journalism is a profession that relies heavily on telling stories with graphs. In this series I re-create graphs published in newspapers. Today I have taken The Indian Express and tried to recreate some of its graphs. First, let’s see the graphs as published in the newspaper:

Recreation in R using ggplot2 and friends

I am going to try my hand at the left-hand graph first. Bar plots are popular, and with good reason: they are straightforward and very informative.

# Load libraries

library(tidyverse)

# Create datasets
States <- c("Gujarat", "Maharashtra", "Karnataka", "Rajasthan", "Kerala", 
            "Haryana", "Tamil Nadu", "Telangana", "Punjab", "MP")
rooftopSolar = c(2898.16, 1716.3, 1562.11, 1002.44, 512.67, 486.23, 
                 449.22, 343.78, 298.92, 296.02)

roofTopSolCap = tibble(States = States,
                       `Rooftop solar` = rooftopSolar)

# There are two kinds of numeric labels: one in white and the others in black,
# so let's create two variables to keep them separate.
roofTopSolCap <- roofTopSolCap |> 
  mutate(gujLabel = case_when(
    States == "Gujarat" ~ `Rooftop solar`,
    .default = NA
  )) |> 
  mutate(restLabel = case_when(
    States == "Gujarat" ~ NA,
    .default = `Rooftop solar`
  ))


# Make a bar plot

roofs <- roofTopSolCap |> 
  ggplot(aes(x = reorder(States, `Rooftop solar`), y = `Rooftop solar`))+
  geom_col(fill = '#83b315') +
  geom_text(aes(label = scales::comma(gujLabel, accuracy = 0.01)), 
            hjust = 1.2, # inside the bar
            colour = 'white',
            size = 4,
            na.rm = TRUE # rows with an NA label are dropped without a warning
  ) +
  geom_text(aes(label = scales::comma(restLabel, accuracy = 0.01)), 
            hjust = -0.08, # outside the bar
            colour = 'black',
            size = 4,
            na.rm = TRUE
  ) +
  labs(x = "",
       y = "", 
       title = "TOP 10 STATES WITH HIGHEST \nROOFTOP SOLAR CAPACITY",
       caption = 'As on 31.12.2023 \n Numerical values are in MW') +
  coord_flip() +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_rect(fill = 'white'),
    axis.ticks = element_blank()
  ) +
  # separator lines between the states (drawn horizontally after coord_flip())
  geom_vline(xintercept = seq(0.5, 10.5, by = 1), colour = 'black') +
  # baseline at zero for the bars, a style often seen in BBC graphics
  geom_hline(yintercept = 0, colour = 'black') +
  scale_y_continuous(labels = NULL)
roofs
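
The two label columns above exist only so that Gujarat's label can be white and inside the bar while the rest are black and outside. One alternative, shown here as a minimal sketch and not the approach used in this post, is to map `hjust` and `colour` inside `aes()` and draw a single text layer. The data frame `df`, its columns `state` and `mw`, and the helper column `inside` are hypothetical names made up for this example; the three values are a subset of the data above.

```r
library(ggplot2)

# Illustrative three-row subset of the rooftop solar values (MW)
df <- data.frame(
  state = c("Gujarat", "Maharashtra", "Karnataka"),
  mw    = c(2898.16, 1716.30, 1562.11)
)

# Hypothetical helper column: TRUE for the bar whose label goes inside
df$inside <- df$state == "Gujarat"

p <- ggplot(df, aes(x = reorder(state, mw), y = mw)) +
  geom_col(fill = "#83b315") +
  geom_text(
    aes(label  = scales::comma(mw, accuracy = 0.01),
        hjust  = ifelse(inside, 1.2, -0.08),          # inside vs outside the bar
        colour = ifelse(inside, "white", "black")),   # literal colour names
    size = 4
  ) +
  scale_colour_identity() +  # use the colour names as given, no legend
  coord_flip()
p
```

With this variant there are no `NA` labels, so no rows are dropped from the text layer. The trade-off is that the styling rules live inside `aes()`, which some readers find harder to scan than two explicit layers.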

OK, this is a fair attempt. The red-inked labelling of the original is still missing, but I don’t think it is worth the effort. Now, let’s deal with the other graph:

#Let's create the data first

Years <- c("Before Mar'14*", "2014-15", "2015-16", 
           "2016-17", "2017-18", "2018-19", "2019-20",
           "2020-21", "2021-22", "2022-23", "2023-24**")
solPowCap <- c(2821.91, 1171.62, 3130.36, 5658.63,
               9563.69, 6750, 6510.06, 5628.8, 12760.5,
               12783.8, 6539.12)
# Keep the years in the order given above instead of alphabetical order
Years <- factor(Years, levels = Years)
solPowerCap <- tibble(
  Years = Years,
  solPowCap = solPowCap
) # ok the data is now ready

# create bar chart

solPow <- solPowerCap |> 
  ggplot(aes(x = Years, y = solPowCap)) +
  geom_col(fill = "#86b2d6") +
  geom_text(aes(label = scales::comma(solPowCap, accuracy = 0.01)), 
            hjust = -0.08, # above the bar, because the label is rotated by 90 degrees
            colour = 'black',
            size = 4,
            angle = 90)+
  labs(x = "",
       y = "", 
       title = "YEAR WISE INSTALLED \nSOLAR POWER CAPACITY",
       caption = "*Total \n**Till Dec'2023 \nSource: Ministry of New and Renewable Energy") +
  theme(panel.grid.major.y = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.background = element_rect(fill = 'white'),
    axis.ticks = element_blank(),
    axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)
  ) +
  geom_hline(yintercept = 0) +
  ylim(c(0, 23000))
solPow
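
The `factor()` call on `Years` matters because ggplot2 places character values on a discrete axis in sorted order, which would push "Before Mar'14*" to the far end. Here is a small base R sketch of the effect, using a hypothetical vector `years` that holds three of the labels from above:

```r
# Hypothetical three-label subset of the x-axis labels
years <- c("Before Mar'14*", "2014-15", "2015-16")

# Default factor(): levels are sorted, so the "Before" label moves to the end
sorted_levels <- levels(factor(years))
print(sorted_levels)

# Explicit levels: the order of appearance is kept
kept_levels <- levels(factor(years, levels = years))
print(kept_levels)
```

Passing the vector itself as `levels` works here because every label is unique.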

I don’t know what software the newspaper uses, but a powerful library in free software such as R can produce good plots too.