Tidy Tuesday for 13-02-2024
Vikram Ranga
2024-02-17
library(tidyverse)
library(ggalt) #-> for smooth lines
The Context
The valentine’s day is a no brainer for anyone. I found a funny meme on social media for the occasion:
For English readers translation is: My neighbour aunty said that her daughter got flowers & chocolate for being topper in a coaching class and on 14th Feb they are taking her on tour :-p.
Tidy Tuesday this time put some data related to valentine’s day and one graph. So let’s check out the graph:
The Data
There are 3 data files:
#data1
giftAge <- read_csv('D:/R Script/Tidytuesday/13-02-2024/gifts_age.csv')
cat("Age wise Spending")
## Age wise Spending
giftAge |>
head(5)
## # A tibble: 5 × 9
## Age SpendingCelebrating Candy Flowers Jewelry GreetingCards EveningOut
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 18-24 51 70 50 33 33 41
## 2 25-34 40 62 44 34 33 37
## 3 35-44 31 58 41 29 42 30
## 4 45-54 19 60 37 20 42 31
## 5 55-64 18 50 32 13 43 29
## # ℹ 2 more variables: Clothing <dbl>, GiftCards <dbl>
#data2
giftGender <- read_csv('D:/R Script/Tidytuesday/13-02-2024/gifts_gender.csv')
cat("Gender wise Spending")
## Gender wise Spending
giftGender |>
head(5)
## # A tibble: 2 × 9
## Gender SpendingCelebrating Candy Flowers Jewelry GreetingCards EveningOut
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Men 27 52 56 30 37 33
## 2 Women 27 59 19 14 43 29
## # ℹ 2 more variables: Clothing <dbl>, GiftCards <dbl>
#data3
hisSpend <- read_csv('D:/R Script/Tidytuesday/13-02-2024/historical_spending.csv')
cat("Historical Spending")
## Historical Spending
hisSpend |>
head(5)
## # A tibble: 5 × 10
## Year PercentCelebrating PerPerson Candy Flowers Jewelry GreetingCards
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2010 60 103 8.6 12.3 21.5 5.91
## 2 2011 58 116. 10.8 12.6 26.2 8.09
## 3 2012 59 126. 10.8 13.5 29.6 6.93
## 4 2013 60 131. 11.6 13.5 30.9 8.32
## 5 2014 54 134. 10.8 15 30.6 7.97
## # ℹ 3 more variables: EveningOut <dbl>, Clothing <dbl>, GiftCards <dbl>
The Graph(s)
Now let’s try to make the graph(s):
hisSpend$Perc <- hisSpend$PercentCelebrating/100
spline_int <- as.data.frame(spline(hisSpend$Year,
hisSpend$PercentCelebrating))
p <- hisSpend |>
ggplot(aes(x = Year, y = Perc)) +
geom_hline(yintercept = seq(0.5, 0.64, 0.02),
colour = 'grey90',
linewidth = 0.4) +
geom_xspline(spline_shape = -0.4, size = 1, color = 'red')+
geom_point(shape = 20, colour = 'white', size = 4)+
geom_point(shape = 20, colour = 'red', size = 2)+
geom_text(aes(x = Year + 0.015,
y = Perc + 0.006,
label = scales::percent(Perc, accuracy = 1, trim = F)))+
scale_color_continuous(name = "",
label = "Percent Celebrating") +
scale_x_continuous(breaks = seq(2010, 2022, 1))+
scale_y_continuous(breaks = seq(0.50, 0.65, 1))+
labs(x = "",
y = "",
caption = "Source: NDR's 2024 Valentine's Day Spending Survey, conducted by Prosper Insights & Analytics")+
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text.y = element_blank(),
legend.position = 'bottom',
axis.ticks = element_blank(),
axis.text.x = element_text(angle = 30),
panel.border = element_blank(),
plot.caption = element_text(
colour = "grey50",
size = 6,
hjust = 0
)
)
p
It seems people used to celebrate it more than now. There is a sharp dip in 2014 and sharp rise in 2020 may be because of COVID, people needed a reason to celebrate. Overall, its declining! (The data is truncated, the above graph is from 2007 till 2024 and this one is subset of that)
Now, we are done with this graph lets explore rest two datasets.
The Others
# tidy the data using pivot_longer
giftAgeL <- giftAge |>
pivot_longer(
cols = !Age,
names_to = "Category",
values_to = "Number"
)
# turn numbers into percentages
giftAgeL <- giftAgeL |>
group_by(Age) |>
mutate(Perc = round((Number/sum(Number)), 2))
#plot
giftAgeL |>
ggplot(aes(x = Age, y = Perc, fill = Category, group = Category)) +
geom_col(position = 'dodge') +
# lets label the bars otherwise info. is not clear
geom_text(aes(label = scales::percent(Perc, accuracy = 1)),
position = position_dodge(width = 0.9), # this is important because dodge bars we have plotted
size = 3,
color = 'black',
angle = 90, # plot the text vertically
hjust = 1.2) +
scale_fill_manual(values = c('#66c2a5','#fc8d62','#8da0cb','#e78ac3','#a6d854','#ffd92f','#e5c494','#b3b3b3'), name = "")+
theme_bw()+
theme(legend.position = 'top',
axis.text.x = element_text(angle = 30), # x-axis label turn them to 30 degree
panel.grid = element_blank(), # remove grid lines
panel.border = element_blank(), # remove border
axis.ticks = element_blank(), # remove ticks
axis.text.y = element_blank())+ # remove y axis labels coz we have printed y axis value as %
labs(x = "",
y = "")+ # Remove axis labels
guides(fill = guide_legend(nrow = 1)) # put all the legends in one row
Now, see the graph -> it seems as we grow old importance of greeting cards increases. Candy remains popular at all age group - no wonders there :-)
No comments:
Post a Comment