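In "UpSetR: a powerful tool for visualizing multiple datasets" we introduced the core concepts of UpSetR and its basic plotting parameters. Today we look at two advanced parameters: queries and attribute.plots.
Each list inside the queries parameter has four parts: query, params, color, and active. query is the query function, params is a list of the arguments passed to it, and color is the color used to draw the query on the plot. active controls how the query is drawn: if TRUE, the intersection size bar is overlaid with a bar representing the query; if FALSE, a jitter point is placed on the intersection size bar.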
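Example 1: Built-in intersection query
This example shows how to use the built-in intersection query, intersects, to find and display the elements in specific intersections. The third query does not set color, so its color is taken from UpSetR's default palette.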
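library(UpSetR)
# Load the movies dataset shipped with UpSetR (as in the package vignette)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")
upset(movies, queries = list(
  list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE),
  list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
  list(query = intersects, params = list("Action", "Drama"), active = TRUE)))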
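To make clear which rows an intersection query picks out, here is a minimal base R sketch on a made-up toy table. It assumes, as we understand UpSetR's intersection bars, that an intersection is exclusive: a row counts only if it belongs to every listed set and to no other set. The helper in_exact_intersection and the toy data are hypothetical and not part of UpSetR.

```r
# Toy data: one row per movie, one 0/1 column per genre (set).
toy <- data.frame(
  Name   = c("m1", "m2", "m3", "m4", "m5"),
  Drama  = c(1, 1, 1, 0, 1),
  Action = c(1, 0, 1, 1, 0),
  Comedy = c(0, 0, 1, 0, 1)
)
sets <- c("Drama", "Action", "Comedy")

# Hypothetical helper (not an UpSetR function): TRUE for rows that belong to
# every set in `wanted` and to none of the remaining sets.
in_exact_intersection <- function(data, all_sets, wanted) {
  others    <- setdiff(all_sets, wanted)
  in_wanted <- rowSums(as.matrix(data[, wanted, drop = FALSE]) == 1) == length(wanted)
  in_others <- rowSums(as.matrix(data[, others, drop = FALSE]) == 1) > 0
  in_wanted & !in_others
}

drama_action <- toy$Name[in_exact_intersection(toy, sets, c("Action", "Drama"))]
drama_only   <- toy$Name[in_exact_intersection(toy, sets, "Drama")]
print(drama_action)
print(drama_only)
```

Movie m3 belongs to all three genres, so it is excluded from both results; in the real plot it would sit in the Drama, Comedy, Action bar instead.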
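Example 2: Built-in element query
This example shows how to use the built-in element query function, elements, to visualize how specific elements are distributed across the intersections. The first entry in params is the attribute name; the remaining entries are the values to match.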
upset(movies, queries = list(
  list(query = elements, params = list("AvgRating", 3.5, 4.1), color = "blue", active = TRUE),
  list(query = elements, params = list("ReleaseDate", 1980, 1990, 2000), color = "red", active = FALSE)))
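As a rough illustration of what an element query counts, the base R sketch below matches rows by attribute value and then tallies the matches per set combination. The toy data and the combo column are made up for this example; this is not how UpSetR implements the query internally.

```r
# Toy data: two sets (0/1 columns) plus one attribute.
toy <- data.frame(
  Drama     = c(1, 1, 0, 1),
  Action    = c(0, 1, 1, 1),
  AvgRating = c(3.5, 4.1, 3.5, 2.0)
)

# Label each row with the combination of sets it belongs to.
toy$combo <- ifelse(toy$Drama == 1 & toy$Action == 1, "Drama&Action",
             ifelse(toy$Drama == 1, "Drama", "Action"))

# Rows matched by a query such as params = list("AvgRating", 3.5, 4.1).
hit <- toy$AvgRating %in% c(3.5, 4.1)

# Number of matched elements in each combination.
hits_per_combo <- table(toy$combo[hit])
print(hits_per_combo)
```

Each count corresponds to the share of an intersection bar that the query would highlight.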
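Example 3: Subsetting intersection and element queries with the expression parameter
This example shows how to use the expression parameter to subset the results of element and intersection queries. The expression is a string that refers to attribute columns of the data.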
upset(movies,
  queries = list(
    list(query = intersects, params = list("Action", "Drama"), active = TRUE),
    list(query = elements, params = list("ReleaseDate", 1980, 1990, 2000), color = "red", active = FALSE)),
  expression = "AvgRating > 3 & Watches > 100")
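To see which rows a filter string like the one above keeps, we can evaluate the same string against a small made-up data frame in base R. This is only a sketch of the filtering logic; the toy data is hypothetical, and evaluating the string with eval(parse()) is our assumption for illustration, not a description of UpSetR's internals.

```r
# Toy data with the two attributes used in the expression.
toy <- data.frame(
  Name      = c("a", "b", "c", "d"),
  AvgRating = c(2.5, 3.2, 4.0, 3.8),
  Watches   = c(500, 50, 150, 101)
)

expr <- "AvgRating > 3 & Watches > 100"

# Evaluate the string with the data frame's columns in scope.
keep <- eval(parse(text = expr), envir = toy)
kept <- toy$Name[keep]
print(kept)
```

Rows a and b fail one of the two conditions, so only c and d remain.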
Example 4: Custom query
First, create a custom query that operates on the rows of the data.
Myfunc <- function(row, release, rating) {
  # TRUE when the release year is one of `release` and the rating exceeds `rating`
  (row["ReleaseDate"] %in% release) & (row["AvgRating"] > rating)
}
Then pass the custom query to the queries parameter.
upset(movies, queries = list(
  list(query = Myfunc, params = list(c(1970, 1980, 1990, 1999, 2000), 2.5), color = "blue", active = TRUE)))
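Before handing a custom query to upset(), it can help to check it on a few rows by hand. The sketch below applies the same row-wise function to a made-up numeric table with apply(); the toy data is hypothetical, and calling the function through apply() is our way of testing it, not necessarily how UpSetR calls it.

```r
# Toy data with the two columns the query inspects.
toy <- data.frame(
  ReleaseDate = c(1970, 1985, 1990, 2000),
  AvgRating   = c(3.0, 4.5, 2.0, 2.6)
)

# Same logic as the custom query in this example.
Myfunc <- function(row, release, rating) {
  (row["ReleaseDate"] %in% release) & (row["AvgRating"] > rating)
}

# Apply the query to every row; each row arrives as a named numeric vector.
matched <- apply(as.matrix(toy), 1, Myfunc,
                 release = c(1970, 1980, 1990, 1999, 2000), rating = 2.5)
matched <- unname(matched)
print(matched)
```

The second row fails on the release year and the third on the rating, so only the first and last rows match.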
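Example 5: Query legend
UpSetR can add a legend for the queries through the query.legend parameter, which places the legend at the "top" or "bottom" of the plot. Each query can also be given a custom name with query.name inside its list; a query without query.name gets a generic name.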
upset(movies, query.legend = "top",
  queries = list(
    list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE, query.name = "Funny action"),
    list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
    list(query = intersects, params = list("Action", "Drama"), active = TRUE, query.name = "Emotional action")))
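Example 6: Putting the queries together
Combining Examples 1 through 5 (a custom query, an intersection query, an element query, the expression filter, and a query legend) gives the following plot: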
upset(movies, query.legend = "bottom",
  queries = list(
    list(query = Myfunc, params = list(c(1970, 1980, 1990, 1999, 2000), 2.5), color = "orange", active = TRUE),
    list(query = intersects, params = list("Action", "Drama"), active = FALSE),
    list(query = elements, params = list("ReleaseDate", 1980, 1990, 2000), color = "red", active = FALSE, query.name = "Decades")),
  expression = "AvgRating > 3 & Watches > 100")
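attribute.plots adds attribute plots below the main UpSetR plot. The built-in plot functions used in this article are histogram and scatter_plot, and custom plot functions are also supported. The parameter has three parts: gridrows, plots, and ncols. gridrows sets how far the plot window is extended to make room for the attribute plots; the UpSetR plot itself is drawn on a 100 by 100 grid, so gridrows = 50 produces a 150 by 100 layout and reserves one third of it for the attribute plots. plots is a list of plot specifications, each giving the plot function, the x attribute, the y attribute where applicable, and a queries flag. ncols sets how many columns the attribute plots are arranged in.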
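When choosing gridrows, it is handy to know what fraction of the final figure the attribute plots will occupy. The short sketch below does the arithmetic, assuming the main plot occupies 100 grid rows as described in the UpSetR vignette; attribute_share is a hypothetical helper, not an UpSetR function.

```r
# Rows of the main UpSetR plot grid (assumption taken from the package vignette).
base_rows <- 100

# Hypothetical helper: fraction of the total height given to attribute plots.
attribute_share <- function(gridrows) {
  gridrows / (base_rows + gridrows)
}

share_50 <- attribute_share(50)  # gridrows value used in the histogram example
share_45 <- attribute_share(45)  # gridrows value used in the scatter plot example
print(round(c(share_50, share_45), 3))
```

So the two gridrows values used in this article differ only slightly: about a third of the figure versus a little under a third.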
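Example 1: Histogram
This example shows how to add a built-in histogram as an attribute plot. If main.bar.color is not set to black, elements contained in black intersection size bars are drawn in gray in the attribute plots. Within each plot specification, queries = TRUE overlays the attribute plot with the data from the queries, while queries = FALSE leaves the queries out of that plot.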
upset(movies, main.bar.color = "black",
  queries = list(
    list(query = intersects, params = list("Drama"), active = TRUE)),
  attribute.plots = list(gridrows = 50,
    plots = list(
      list(plot = histogram, x = "ReleaseDate", queries = FALSE),
      list(plot = histogram, x = "AvgRating", queries = TRUE)),
    ncols = 2))
Example 2: Scatter plot
This example shows how to add a built-in scatter plot as an attribute plot. Note that this example also uses query.legend.
upset(movies, main.bar.color = "black",
  queries = list(
    list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
    list(query = intersects, params = list("Action", "Drama"), active = TRUE),
    list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)),
  attribute.plots = list(gridrows = 45,
    plots = list(
      list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating", queries = TRUE),
      list(plot = scatter_plot, x = "AvgRating", y = "Watches", queries = FALSE)),
    ncols = 2),
  query.legend = "bottom")
Example 3: Custom attribute plots
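library(ggplot2)
library(grid)  # unit()
library(plyr)  # round_any()

# Scatter plot colored by the "color" column that UpSetR adds for queries.
# aes_string() is deprecated in newer ggplot2 releases but still works.
myplot <- function(mydata, x, y) {
  ggplot(data = mydata, aes_string(x = x, y = y, colour = "color")) +
    geom_point() +
    scale_color_identity() +
    theme(plot.margin = unit(c(0, 0, 0, 0), "cm"))
}

# Density plot of x, grouped by the decade derived from y (from 1970 onward).
another.plot <- function(data, x, y) {
  data$decades <- round_any(as.integer(unlist(data[y])), 10, ceiling)
  data <- data[which(data$decades >= 1970), ]
  ggplot(data, aes_string(x = x)) +
    geom_density(aes(fill = factor(decades)), alpha = 0.4) +
    theme(plot.margin = unit(c(0, 0, 0, 0), "cm"), legend.key.size = unit(0.4, "cm"))
}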
Now use the myplot and another.plot functions defined above as attribute plots in UpSetR.
upset(movies, main.bar.color = "black",
  queries = list(
    list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
    list(query = intersects, params = list("Action", "Drama"), active = TRUE),
    list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)),
  attribute.plots = list(gridrows = 45,
    plots = list(
      list(plot = myplot, x = "ReleaseDate", y = "AvgRating", queries = TRUE),
      list(plot = another.plot, x = "AvgRating", y = "ReleaseDate", queries = FALSE)),
    ncols = 2))
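Example 4: Putting the attribute plots together
Combining the built-in histogram from Example 1, the built-in scatter plot from Example 2, and the custom attribute plot from Example 3 gives the following plot: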
upset(movies, main.bar.color = "black", mb.ratio = c(0.5, 0.5),
  queries = list(
    list(query = intersects, params = list("Drama"), color = "red", active = FALSE),
    list(query = intersects, params = list("Action", "Drama"), active = TRUE),
    list(query = intersects, params = list("Drama", "Comedy", "Action"), color = "orange", active = TRUE)),
  attribute.plots = list(gridrows = 50,
    plots = list(
      list(plot = histogram, x = "ReleaseDate", queries = FALSE),
      list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating", queries = TRUE),
      list(plot = myplot, x = "AvgRating", y = "Watches", queries = FALSE)),
    ncols = 3))
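Example 5: Box plots
Box plots show how an attribute is distributed within every intersection. At most two box plot summaries can be displayed at once. The boxplot.summary parameter takes a vector of one or two attribute names.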
upset(movies, boxplot.summary = c("AvgRating", "ReleaseDate"))
That wraps up our introduction to the UpSetR package. For other features, such as Incorporating Set Metadata, see the official documentation or the UpSetR source code on GitHub.
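▍Copyright of this article (images and text) belongs to "生信科技爱好者" (WeChat official account: BioInit). Reposting is prohibited. Some images come from the internet; please contact us for removal in case of infringement.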