Draw CNS-quality figures with a single command: ggpubr

This article is reposted from "EasyChart" with permission. The editors of this platform have tested, revised, and supplemented the content.


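ggplot2, the visualization package created by Hadley Wickham, produces elegant graphics fluently. However, customizing a set of figures with ggplot2, especially figures intended for journals and other publications, can be difficult for anyone without a deep understanding of the package, and parts of its syntax are rather opaque. To address this, Alboukadel Kassambara created ggpubr, a ggplot2-based visualization package for drawing publication-ready figures.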

Installing and loading ggpubr

# Install the release version from CRAN
install.packages("ggpubr", repos = "http://cran.us.r-project.org")

# Or install the latest development version from GitHub
install.packages("devtools", repos = "http://cran.us.r-project.org")
library(devtools)
install_github("kassambara/ggpubr")

# Once installed, load the package
library(ggpubr)

What ggpubr can plot

ggpubr covers most of the commonly used plot types; each is introduced below.

Distribution plots

Density plot with mean lines and a marginal rug

# Build the example data set
set.seed(123)
df <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                 weight = c(rnorm(200, 55), rnorm(200, 58)))
# Preview the data
head(df)
# Draw the density plot
ggdensity(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
          palette = c("#00AFBB", "#E7B800"))

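Figure 1. Density plot of weight by sex. The x-axis is weight and the y-axis is the estimated density; the rug along the x-axis shows where the individual samples fall. Outline color and fill are both mapped to sex, and palette sets custom colors. Pleasant to look at, isn't it?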
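As a rough check on what add = "mean" is showing, the per-group means can be computed directly in base R. This is only a sketch: it rebuilds the example data with the group labels "F" and "M" (an assumption about the labels), and the name group_means is introduced here for illustration.

```r
# Rebuild the example data (group labels "F"/"M" are assumed here)
set.seed(123)
df <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                 weight = c(rnorm(200, 55), rnorm(200, 58)))

# Mean weight per sex: the positions where the mean lines should appear
group_means <- tapply(df$weight, df$sex, mean)
print(round(group_means, 2))
```

The two values should sit close to the means used to simulate the data (55 and 58), which is where the two mean lines fall in the density plot.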

Histogram with mean lines and a marginal rug

gghistogram(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
            palette = c("#00AFBB", "#E7B800"))

Figure 2. Histogram with mean lines and a marginal rug. It is the same view as the density plot, except that the y-axis shows raw counts instead of density.

Box plots and violin plots (boxplot/violinplot)

Box plot + shapes by group

# Load the ToothGrowth data set
data("ToothGrowth")
df1 <- ToothGrowth
head(df1)
p <- ggboxplot(df1, x = "dose", y = "len", color = "dose",
               palette = c("#00AFBB", "#E7B800", "#FC4E07"),
               add = "jitter", shape = "dose") # add jittered points; map point shape to dose
p

Figure 3. Box plot colored by group. Giving the sample points a different shape per group further distinguishes groups or batches.

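Box plot + shapes by group + statistics

# Add p-values between groups; the comparisons to annotate can be customized
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p + stat_compare_means(comparisons = my_comparisons) + # pairwise comparisons between groups
  stat_compare_means(label.y = 50)                     # global p-value, placed at y = 50

Figure 4. stat_compare_means adds brackets between the compared groups together with their p-values.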
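To see roughly where those p-values come from, the same comparisons can be run in base R. This sketch assumes the stat_compare_means defaults (a Wilcoxon rank-sum test for each pair and a Kruskal-Wallis test across all groups); the names global and pairwise_p are introduced here for illustration, and the printed values should be close to, though not necessarily formatted like, the labels on the plot.

```r
data("ToothGrowth")
df1 <- ToothGrowth
df1$dose <- factor(df1$dose)

# Global test across the three doses (Kruskal-Wallis)
global <- kruskal.test(len ~ dose, data = df1)

# Pairwise Wilcoxon rank-sum tests for the comparisons annotated on the plot
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
pairwise_p <- sapply(my_comparisons, function(pair) {
  sub <- droplevels(df1[df1$dose %in% pair, ])
  # exact = FALSE: the data contain ties, so use the normal approximation
  wilcox.test(len ~ dose, data = sub, exact = FALSE)$p.value
})
names(pairwise_p) <- sapply(my_comparisons, paste, collapse = " vs ")

print(global$p.value)
print(pairwise_p)
```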

Violin plot with an inner box plot + significance stars

ggviolin(df1, x = "dose", y = "len", fill = "dose",
         palette = c("#00AFBB", "#E7B800", "#FC4E07"),
         add = "boxplot", add.params = list(fill = "white")) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") + # show significance stars instead of p-values
  stat_compare_means(label.y = 50)

Figure 5. ggviolin draws the violin plot, and add = "boxplot" places a box plot inside each violin. Setting label = "p.signif" in stat_compare_means replaces the p-values on the comparison brackets with significance stars.
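The mapping from p-values to stars can be sketched in base R. The cutpoints below (0.0001, 0.001, 0.01, 0.05) are the ones ggpubr documents as its default symnum.args; treat them as an assumption and check them against your installed version. p_to_stars is a hypothetical helper written for this example, not a ggpubr function.

```r
# Hypothetical helper: convert p-values to significance symbols
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 1e-4, 1e-3, 1e-2, 0.05, Inf),
                   labels = c("****", "***", "**", "*", "ns"),
                   include.lowest = TRUE))
}

# One example p-value per significance band
print(p_to_stars(c(0.00005, 0.0005, 0.005, 0.03, 0.2)))
```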

Bar plots (barplot)

data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2) # add a name column
head(df2[, c("name", "wt", "mpg", "cyl")])
ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "npg",        # Nature Publishing Group color palette
          sort.val = "desc",      # sort values in descending order
          sort.by.groups = FALSE, # do not sort within groups
          x.text.angle = 60)

Figure 6. Bar plot of fuel economy (mpg, miles per gallon) for each car. Bars are filled by cyl using the "npg" (Nature Publishing Group) palette and sorted by value in descending order. The palette argument accepts the color schemes from the ggsci package, e.g. "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty".

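# Sort within each group
ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "aaas",  # AAAS (Science) color palette
          sort.val = "asc",  # sort in ascending order (compare with "desc" above)
          sort.by.groups = TRUE, x.text.angle = 60) # sort within each group

Figure 7. The same plot with the "aaas" (Science) palette (why do the Nature and Science palettes feel less pleasing here than they do in the papers?). Bars are sorted in ascending order within each cyl group, and the x-axis labels are rotated 60 degrees to avoid overlap.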

Deviation plots

A deviation plot shows how far each value deviates from a reference value.

df2$mpg_z <- (df2$mpg - mean(df2$mpg)) / sd(df2$mpg) # z-score: subtract the mean, divide by the standard deviation
df2$mpg_grp <- factor(ifelse(df2$mpg_z < 0, "low", "high"), levels = c("low", "high"))
head(df2[, c("name", "wt", "mpg", "mpg_grp", "cyl")])
ggbarplot(df2, x = "name", y = "mpg_z", fill = "mpg_grp", color = "white",
          palette = "jco", sort.val = "asc", sort.by.groups = FALSE,
          x.text.angle = 60, ylab = "MPG z-score", xlab = FALSE, legend.title = "MPG Group")

Figure 8. Bar plot of z-scores, i.e. the raw value minus the mean, divided by the standard deviation. It uses the "jco" (Journal of Clinical Oncology) palette, with bars sorted in ascending order and not grouped.
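As a quick sanity check on the z-score step, the manual formula can be compared with base R's scale(), which centers and scales a vector in the same way by default. A minimal sketch; z_manual and z_scale are names introduced here for illustration.

```r
data("mtcars")
mpg <- mtcars$mpg

# Manual z-score, as in the deviation plot code
z_manual <- (mpg - mean(mpg)) / sd(mpg)

# scale() returns a one-column matrix, so flatten it to a plain vector
z_scale <- as.numeric(scale(mpg))

# The two versions should agree up to floating-point error
print(all.equal(z_manual, z_scale))
print(round(c(mean = mean(z_manual), sd = sd(z_manual)), 10))
```

After standardization the values have mean 0 and standard deviation 1, so the sign of each bar simply tells you whether a car is above or below the average mpg.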

Swapping the axes

ggbarplot(df2, x = "name", y = "mpg_z", fill = "mpg_grp", color = "white",
          palette = "jco", sort.val = "desc", sort.by.groups = FALSE,
          x.text.angle = 90, ylab = "MPG z-score", xlab = FALSE,
          legend.title = "MPG Group", rotate = TRUE, ggtheme = theme_minimal()) # rotate = TRUE swaps the x and y axes

Figure 9. rotate = TRUE flips the axes, turning the vertical bar plot into a horizontal one in an instant.

Lollipop charts

A lollipop chart can replace a bar plot for displaying the same data.

ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "ascending",
           add = "segments", ggtheme = theme_pubr())

Figure 10. When too many bars become monotonous, a lollipop chart adds some variety.

The parameters can be customized freely

ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending", add = "segments", rotate = TRUE,
           group = "cyl", dot.size = 6,
           label = round(df2$mpg),
           font.label = list(color = "white", size = 9, vjust = 0.5),
           ggtheme = theme_pubr())

Figure 11. Lollipop chart with a few adjustments: rotate = TRUE swaps the axes, dot.size = 6 sets the size of the candy, label = round() prints the value inside each dot, and font.label controls the label font.

Lollipop deviation plot

ggdotchart(df2, x = "name", y = "mpg_z", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending", add = "segments",
           add.params = list(color = "lightgray", size = 2),
           group = "cyl", dot.size = 6, label = round(df2$mpg_z, 1),
           font.label = list(color = "white", size = 9, vjust = 0.5),
           ggtheme = theme_pubr()) +
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray") # dashed reference line at z = 0

Figure 12. As with the bar plot, the z-score replaces the raw value.

Cleveland dot plot

ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending",
           rotate = TRUE, dot.size = 2, y.text.col = TRUE,
           ggtheme = theme_pubr()) + theme_cleveland()

Figure 13. theme_cleveland() applies the Cleveland dot plot style.

My test environment

sessionInfo()

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2     ggpubr_0.1.6.999 magrittr_1.5     ggplot2_2.2.1    devtools_1.13.4 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14     bindr_0.1        munsell_0.4.3    colorspace_1.3-2 R6_2.2.2         rlang_0.1.4      httr_1.3.1      
 [8] plyr_1.8.4       dplyr_0.7.4      tools_3.4.1      grid_3.4.1       gtable_0.2.0     git2r_0.19.0     withr_2.1.0     
[15] lazyeval_0.2.1   digest_0.6.12    assertthat_0.2.0 tibble_1.3.4     ggsignif_0.4.0   ggsci_2.8        purrr_0.2.4     
[22] curl_3.0         memoise_1.1.0    glue_1.2.0       labeling_0.3     compiler_3.4.1   scales_0.5.0     pkgconfig_2.0.1


  • Published: 2017-12-27 12:36
  • Category: Other omics

Posted by 刘永鑫 (不写代码的码农)
