Thursday, May 3, 2018

Visualization in R, Part 1: Style and Aesthetics

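ggplot2 is probably one of the most popular plotting packages in the R community. Besides its clear structure for building plots, ggplot2 offers many style templates (themes), such as the Wall Street Journal and The Economist. This post shows how to apply these styles with the theme package ggthemes and other packages. In the code, the theme calls are highlighted in blue and the plot's data structure in red; everything else is a built-in setting. The example data file is tips.csv, 200 records of tip amounts. The restaurant found that tips varied widely from bill to bill and wanted to know which factors affect the tip amount. The data look as follows.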


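tip = tip amount, in US dollars
total_bill = bill amount, in US dollars
sex = sex of the person paying the bill
smoker = whether the person paying smokes
day = day of the meal
time = time of the meal
size = number of diners on the bill (table size)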

library(ggplot2)
# Load the tips data and keep the three columns used in the plots
tips <- read.csv("tips.csv")
.df <- data.frame(y = tips$tip, x = tips$day, z = tips$sex)
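The plots call `factor(x)` on the day column, and by default `factor()` sorts its levels alphabetically, so the days would not appear in calendar order on the x-axis. A minimal sketch of setting the order explicitly, assuming the day column uses the labels Thur, Fri, Sat and Sun. The inline rows and the names `csv_text`, `sample_tips` and `day_order` are made up for illustration and are not taken from tips.csv:

```r
# Hypothetical sample rows standing in for tips.csv (values are illustrative only)
csv_text <- "total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sat,Dinner,3
27.20,4.00,Male,No,Thur,Lunch,4
12.03,1.50,Male,Yes,Fri,Dinner,2"
sample_tips <- read.csv(text = csv_text)

# Default factor(): levels are sorted alphabetically
default_levels <- levels(factor(sample_tips$day))

# Explicit levels: days in calendar order (assumed labels)
day_order <- c("Thur", "Fri", "Sat", "Sun")
ordered_levels <- levels(factor(sample_tips$day, levels = day_order))

print(default_levels)
print(ordered_levels)
```

If the day labels in the real file differ, `day_order` would need to match them exactly; any label missing from `levels` becomes NA.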

The first plot is a box plot using a theme built into ggplot2.
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) +  geom_boxplot(position = position_dodge(width = 0.9)) + 
  xlab("day") + ylab("tip") +  labs(fill = "sex") +
  theme_bw(base_size = 14, base_family = "sans")
print(.plot)
ggplot2 built-in theme

The second plot uses The Economist theme.
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) +  geom_boxplot(position = position_dodge(width = 0.9)) + 
  xlab("day") + ylab("tip") +  labs(fill = "sex") +
  ggthemes::theme_economist(base_size = 14, base_family = "sans")
print(.plot)


The Economist theme


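The third plot uses the Wall Street Journal theme. The theme_wsj2 function used here comes from the third-party package RcmdrPlugin.KMggplot2 (ggthemes also provides its own theme_wsj).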
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) +  geom_boxplot(position = position_dodge(width = 0.9)) + 
  xlab("day") + ylab("tip") +  labs(fill = "sex") +
  RcmdrPlugin.KMggplot2::theme_wsj2(base_size = 14, base_family = "sans")
print(.plot)
Wall Street Journal theme


The fourth plot uses the well-known Few theme, based on the design style of visualization expert Stephen Few, author of the excellent book Show Me the Numbers: Designing Tables and Graphs to Enlighten.
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) +  geom_boxplot(position = position_dodge(width = 0.9)) + 
  xlab("day") + ylab("tip") +  labs(fill = "sex") +
  ggthemes::theme_few(base_size = 14, base_family = "sans")
print(.plot)
Few theme

The fifth plot uses the style of the well-known website FiveThirtyEight (538), founded by forecaster Nate Silver, author of The Signal and the Noise (Chinese edition: 精準預測).
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) +  geom_boxplot(position = position_dodge(width = 0.9)) + 
  xlab("day") + ylab("tip") +  labs(fill = "sex") +
  ggthemes::theme_fivethirtyeight(base_size = 14, base_family = "sans")  
print(.plot)  
538 theme
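The plots above differ only in their final theme call, so the shared layers can be stored once and each theme added afterwards. A minimal sketch, assuming ggplot2 and ggthemes are installed. The names `demo_df`, `base_plot`, `themes` and `styled` are illustrative, and the data are randomly generated stand-ins rather than the tips data:

```r
library(ggplot2)

# Made-up stand-in data with the same shape as .df (y = tip, x = day, z = sex)
set.seed(1)
demo_df <- data.frame(
  y = runif(40, min = 1, max = 6),
  x = rep(c("Thur", "Fri", "Sat", "Sun"), times = 10),
  z = rep(c("Female", "Male"), each = 20)
)

# Shared layers: whisker caps, boxes, and axis/legend labels
base_plot <- ggplot(demo_df, aes(x = factor(x), y = y, fill = z)) +
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9),
               width = 0.5) +
  geom_boxplot(position = position_dodge(width = 0.9)) +
  xlab("day") + ylab("tip") + labs(fill = "sex")

# One theme object per style
themes <- list(
  bw              = theme_bw(base_size = 14, base_family = "sans"),
  economist       = ggthemes::theme_economist(base_size = 14, base_family = "sans"),
  few             = ggthemes::theme_few(base_size = 14, base_family = "sans"),
  fivethirtyeight = ggthemes::theme_fivethirtyeight(base_size = 14, base_family = "sans")
)

# Adding a theme returns a new plot; base_plot itself is left unchanged
styled <- lapply(themes, function(th) base_plot + th)
print(styled$economist)
```

Keeping the themes in a named list makes it easy to compare styles side by side, for example by printing `styled$few` and `styled$fivethirtyeight` in turn.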

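The last plot uses the igray theme to present a new data structure: the previous plot is split further by meal time and smoker status into a 2x2 = 4 panel grid.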
.df <- data.frame(y = tips$tip, x = tips$day,z = tips$sex, s = tips$time, t = tips$smoker)
.plot <- ggplot(data = .df, aes(x = factor(x), y = y, fill = z)) + 
  stat_boxplot(geom = "errorbar", position = position_dodge(width = 0.9), 
  width = 0.5) + 
  geom_boxplot(position = position_dodge(width = 0.9)) + 
  facet_grid(s ~ t) + 
  xlab("day") +  ylab("tip") + labs(fill = "sex") + 
  ggthemes::theme_igray(base_size = 14, base_family = "sans") + 
  theme(panel.spacing = unit(0.3, "lines"))
print(.plot)
igray theme
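In `facet_grid(s ~ t)`, the rows follow `s` (time) and the columns follow `t` (smoker), but the strip labels show only the values, so it is not obvious which variable each strip refers to. A small sketch of adding `labeller = label_both`, which prints the variable name next to each value. The data are made up and the names `grid_df`, `facet_df` and `facet_plot` are illustrative:

```r
library(ggplot2)

# Made-up data: 5 replicate tips for every day/sex/time/smoker combination
set.seed(2)
grid_df <- expand.grid(
  x = c("Thur", "Fri", "Sat", "Sun"),
  z = c("Female", "Male"),
  s = c("Lunch", "Dinner"),
  t = c("No", "Yes"),
  rep = 1:5
)
facet_df <- cbind(grid_df, y = runif(nrow(grid_df), min = 1, max = 6))

# Rows are split by s (time), columns by t (smoker);
# label_both writes strips such as "t: Yes" instead of just "Yes"
facet_plot <- ggplot(facet_df, aes(x = factor(x), y = y, fill = z)) +
  geom_boxplot(position = position_dodge(width = 0.9)) +
  facet_grid(s ~ t, labeller = label_both) +
  xlab("day") + ylab("tip") + labs(fill = "sex")
print(facet_plot)
```

Renaming the columns before plotting (for example `time` and `smoker` instead of `s` and `t`) would make the strips more readable still.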
