【科研绘图系列】R语言绘制SCI论文图合集
- 开源代码
- 2025-08-28 10:27:02

禁止商业或二改转载,仅供自学使用,侵权必究,如需截取部分内容请后台联系作者!
文章目录 介绍加载R包数据下载Load dataFigure 1Fig 1B: functional assays adhensionFIG 1C: Functional assays OPK Figure 2Fig 2C: Settings and function for heatmapFig 2C (continued): automate comparisons for IgG (1 month)Fig 2C (continued): Automate comparisons for IgA (1week)Related to Fig 2C - Supp Figure 1: heatmaps for IgG 1wk and IgA 1 mo.Fig 2D and E: Column graphs of the #antigens each individual responded to for serum IgGFigure 2D and E (continued): Column graph of the #antigens each individual responded to for saliva IgAFig 2F: graphing significant antigens raw valuesRelated to Fig 2F - Supplementary Fig S3 and S4: Graph each antigen seperately - raw titresFig 2G : Heatmap of correlation coefficients between fold change in IgA and IgG. Fig 3: Luminex data over 3 monthsFig 4: luminex compare responses to pharyngitis childrenFigure 5: Luminex: Correlation of pre-challenge titre with 1 month fold change:Fig 6 and 7: multivariate analysis - data preparationFig 6: Multidimensional scalingFig 6 (continued) : compare MDS1Fig 7: Correlation with outcome variablesFigure 7 (continued) : sPLS-DA系统信息 介绍【科研绘图系列】R语言绘制SCI论文图合集
加载R包 library(ggplot2) library(tidyverse) library(reshape2) library(ggbeeswarm) library(pheatmap) library(RColorBrewer) library(viridis) library(vegan) library(ggpubr) library(gghalves) library(Polychrome) library(uwot) library(mixOmics) library(cowplot) 数据下载所需要的数据的下载链接:
百度网盘链接: 从百度网盘下载提取码: t2hs Load data load("RawData.RData") Figure 1The following code can be used to generate figures for the saliva adhesion assay and opsonic phagocytosis assay (OPK)
Fig 1B: functional assays adhension #split out timepoints Adhesion.pre <- data[,c("id","Pharyngitis","Adehesion_percent_Pre_saliva")] Adhesion.post <- data[,c("id","Pharyngitis","Adehesion_percent_1_week_saliva")] #define new timepoint variable Adhesion.pre$time_point<- "Pre.challenge" Adhesion.post$time_point<- "1week" #rename vars names(Adhesion.pre) <- c("id" ,"Pharyngitis","Adhension Percent", "time_point") names(Adhesion.post) <- c("id" ,"Pharyngitis","Adhension Percent", "time_point") #bind back together Adhesion <- rbind(Adhesion.pre, Adhesion.post) Adhesion$time_point <- factor(Adhesion$time_point, levels = c("Pre.challenge","1week")) #make new grouping variable for use with graphing. Adhesion$Group <- factor(paste(Adhesion$Pharyngitis,Adhesion$time_point, sep = "_"), levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1week", "pharyngitis_Pre.challenge", "pharyngitis_1week")) #plot the data ggplot(Adhesion, aes(Group,Adhesion$`Adhension Percent`,fill = Pharyngitis))+ geom_violin(aes(alpha = time_point), draw_quantiles = c(0.5), scale= "width", position = "dodge")+ geom_beeswarm(size = 3, alpha = 0.75, cex =1, shape = 21)+ geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+ scale_fill_manual(values = c("blue", "red"))+ scale_alpha_manual(values = c(0.15, 0.5))+ scale_color_manual(values = c("blue", "red"))+ ylab("Percent Adhesion")+ xlab("")+ scale_y_continuous(expand = c(0, 0))+ coord_cartesian(ylim = c(0,200))+ custom_theme() #Save the plot to a PDF file ggsave(file = "Adhesion.pdf", width = 4, height = 4 ) # calculate p-value for pharyngitis group, pre and post, paired Wilcoxon-signed rank test wilcox.test(Adhesion[Adhesion$Pharyngitis == "pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = T) # calculate p-value for no pharyngitis group, pre and post, paired Wilcoxon-signed rank test wilcox.test(Adhesion[Adhesion$Pharyngitis == "no pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "no pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = T) # calculate p-value for pre challenge, pharyngitis vs no pharyngitis, Mann-Whitney test wilcox.test(Adhesion[Adhesion$Pharyngitis == "pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "no pharyngitis"& Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, paired = F) # calculate p-value for 1 month, pharyngitis vs no pharyngitis, Mann-Whitney test wilcox.test(Adhesion[Adhesion$Pharyngitis == "no pharyngitis" & Adhesion$time_point == "1week",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = F) #remove dataframes that wont be used again rm(Adhesion, Adhesion.post, Adhesion.pre) FIG 1C: Functional assays OPK #Split out time-points into seperate dataframe OPK.pre <- data[,c("id","Pharyngitis","Opsonic_index_pre")] OPK.post <- data[,c("id","Pharyngitis","Opsonic_index_1mo")] # Add time pont variable OPK.pre$time_point<- "Pre.challenge" OPK.post$time_point<- "1month" #rename columns for consistency names(OPK.pre) <- c("id" ,"Pharyngitis","Opsonic_index", "time_point") names(OPK.post) <- c("id" ,"Pharyngitis","Opsonic_index", "time_point") #bind the the pre and post timepoints back together OPK <- rbind(OPK.pre, OPK.post) # Make time_point a factor with specified levels OPK$time_point <- factor(OPK$time_point, levels = c("Pre.challenge","1month")) # Create a binary variable to distinguish two individuals with OPK response OPK$responders <- ifelse(OPK$id %in% c("SN010", "SN013"), "Y", "N") # Create a new grouping variable that distinguishes based on responders and pharyngitis variables OPK$Group <- factor(paste(OPK$responders, OPK$Pharyngitis, OPK$time_point, sep = "_"), levels = c("Y_no pharyngitis_Pre.challenge", "Y_no pharyngitis_1month", "Y_pharyngitis_Pre.challenge", "Y_pharyngitis_1month", "N_no pharyngitis_Pre.challenge", "N_no pharyngitis_1month", "N_pharyngitis_Pre.challenge", "N_pharyngitis_1month")) #plot data ggplot(OPK, aes(Group, Opsonic_index, fill = Pharyngitis))+ geom_col(stat="identity",position = "dodge", aes(alpha = time_point))+ geom_beeswarm(size = 3, alpha = 0.75, cex =0.5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ scale_alpha_manual(values = c(0.15, 0.5))+ ylab("OPK index")+ xlab("")+ scale_y_continuous(expand = c(0, 0))+ coord_cartesian(ylim = c(0,100))+ custom_theme()+ theme( axis.text.x = element_text(colour = "black", size = 17, angle = 90)) # change orientation of X-axis text # Save the plot to a PDF file ggsave(file = "OPK.pdf", width = 8, height = 7 ) #remove datasets that dont get used again rm(OPK, OPK.pre, OPK.post) Figure 2The following code can be used to generate figures for the the analysis of ELISA data shown in figure 2 and linked supplementary figures
#Fig 2B: Heatmap Saliva IgA and Serum IgG and Saliva IgA Note: Column order in heatmap was manually altered in Adobe Illustrator
#use this for both pheatmap plotting objects breaksList <- seq(from =-2.5, to = 2.5, by = 0.1) #Generate new dataframe subsetting into pre-challenge and 1 month data for IgA plotting <- data_long[data_long$timepoint %in% c("Pre.challenge") & data_long$Type == "IgA",] plotting_1 <- data_long[data_long$timepoint %in% c("1week") & data_long$Type == "IgA",] #add other variables back in plotting <- left_join(plotting, outcome, by = "id") plotting_1 <- left_join(plotting_1, outcome, by = "id") #bind them back together plotting <- rbind(plotting, plotting_1) plotting <-(plotting[complete.cases(plotting[,colnames(plotting) %in% Antigen.Order]),]) #transpose plotting_t <- t(plotting[,c(colnames(plotting) %in% Antigen.Order)]) #add column names colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_") #generate annotation variables for heatmap mat_col <- data.frame(outcome=as.factor(ifelse(plotting$id %in% pharyngitis, "pharyngitis", "no pharyngitis"))) rownames(mat_col) <- colnames(plotting_t) mat_colors <- list(outcome = c("blue","red")) names(mat_colors$outcome) <- levels(mat_col$outcome) #Plot data pdf(file = "heatmap_raw_titres_IgA.pdf", height = 4, width = 6) pheatmap(plotting_t, breaks = breaksList, color = viridis(50), cellwidth = 2.5, cellheight =9, border_color = "black", scale = "row", cluster_rows = F, cluster_cols = F, clustering_distance_cols = "correlation", clustering_method = "ward.D2", annotation_col = mat_col, annotation_colors = mat_colors, annotation_legend = TRUE, show_rownames = T, show_colnames = F, fontsize = 8, display_numbers = F, number_format = "%.2f") dev.off() # Now lets plot IgG #Generate new dataframe subsetting into pre-challenge and 1 month data plotting <- data_long[data_long$timepoint %in% c("Pre.challenge") & data_long$Type == "IgG",] plotting_1 <- data_long[data_long$timepoint %in% c("1month") & data_long$Type == "IgG",] #add other variables back in plotting <- left_join(plotting, outcome, by = "id") plotting_1 <- left_join(plotting_1, outcome, by = "id") #bind then back together plotting <- rbind(plotting, plotting_1) #remove the two individuals for IgG without 1 month data plotting <-(plotting[complete.cases(plotting[,colnames(plotting) %in% Antigen.Order]),]) #transform the data plotting_t <- t(plotting[,c(colnames(plotting) %in% Antigen.Order)]) colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_") colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_") #generate annotation variables for heatmap mat_col <- data.frame(outcome=as.factor(ifelse(plotting$id %in% pharyngitis, "pharyngitis", "no pharyngitis"))) rownames(mat_col) <- colnames(plotting_t) mat_colors <- list(outcome = c("blue","red")) names(mat_colors$outcome) <- levels(mat_col$outcome) pdf(file = "heatmap_raw_titres_IgG.pdf", height = 4, width = 6) pheatmap(plotting_t, breaks = breaksList, color = viridis(length(breaksList)),, cellwidth = 2.5, cellheight =9, border_color = "black", scale = "row", cluster_rows = F, cluster_cols = F, clustering_distance_cols = "correlation", clustering_method = "ward.D2", annotation_col = mat_col, annotation_colors = mat_colors, annotation_legend = TRUE, show_rownames = T, show_colnames = F, fontsize = 8, display_numbers = F, number_format = "%.2f") dev.off() #remove dataframes/variables no longer needed rm(mat_col, mat_colors, plotting, plotting_1, plotting_t, breaksList) Fig 2C: Settings and function for heatmap # define colour scheme breaksList = seq(from = 0.5, to = 1.5, by = 0.005) heatpal <- c("seagreen","white", "mediumpurple3") # heatmap function: heatIT <- function(heated) { pheatmap(heated, kmeans_k = NA, breaks = breaksList, color = colorRampPalette(heatpal)(length(breaksList)), border_color = "black", cellwidth = 11, cellheight =11, # display_numbers = test_labels, number_format = "%", scale = "none", cluster_rows = F, cluster_cols = F, number_color = "grey", show_rownames = T, show_colnames = F, fontsize = 8, display_numbers = F, filename = NA, silent = FALSE) } Fig 2C (continued): automate comparisons for IgG (1 month) #Select IgG variable names Ig.Names <- names(data[,32:150]) #Select within Ig.names for these variables at the following time-points PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)] OneMonth <- Ig.Names[grepl("1month", Ig.Names)] OneMonth <- OneMonth[!grepl("1monthFC", OneMonth)] FoldChange <- Ig.Names[grepl("1monthFC", Ig.Names)] #subset data into each timepoint PreChallenge <- data[,PreChallenge] OneMonth <- data[,OneMonth] FoldChange <- data[,FoldChange] #tidy variable names names(PreChallenge) <- gsub("IgG_", "", names(PreChallenge)) names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge)) names(OneMonth) <- gsub("IgG_", "", names(OneMonth)) names(OneMonth) <- gsub("_1month", "", names(OneMonth)) names(FoldChange) <- gsub("IgG_", "", names(FoldChange)) names(FoldChange) <- gsub("_1monthFC", "", names(FoldChange)) #add outcome variable PreChallenge$Pharyngitis <- data$Pharyngitis OneMonth$Pharyngitis <- data$Pharyngitis FoldChange$Pharyngitis <- data$Pharyngitis Postchallenge = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix for(i in 1:17){ Postchallenge[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants Postchallenge[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants Postchallenge[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneMonth[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneMonth[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge[i,7] <- wilcox.test(OneMonth[c(1:19),i],OneMonth[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value } for(i in 1:17){ Postchallenge[i,c(8:9)] <- p.adjust(Postchallenge[i,c(4:5)], method = "fdr" ) Postchallenge[i,c(10:11)] <- p.adjust(Postchallenge[i,c(6:7)], method = "fdr" ) } #make into a data-frame Postchallenge<- as.data.frame(Postchallenge) #define column names names(Postchallenge) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj") #name rows with antigen names rownames(Postchallenge) <- names(FoldChange)[1:17] #name rows with antigen names Postchallenge <- Postchallenge[c(Antigen.Order,additional),]#reorder # export table of FC and p-values to use to annotate figures. write.csv(file = "Postchallenge.IgG.csv", Postchallenge) # Make heatmap - Main antigens heated <- (Postchallenge[Antigen.Order,c(2,1)]) pdf(file = "IgG_FcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() # Make heatmap - additional antigens heated <- (Postchallenge[additional,c(2,1)]) pdf(file = "IgG_FcHeatmap.additional.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() Fig 2C (continued): Automate comparisons for IgA (1week) # Select IgA variables Ig.Names <- names(data[,151:201]) #subset variable names for each timepoint PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)] # extract variable names OneWeek <- Ig.Names[grepl("1week", Ig.Names)] # extract variable names OneWeek <- OneWeek[!grepl("1weekFC", OneWeek)] # extract variable names FoldChange <- Ig.Names[grepl("1weekFC", Ig.Names)] # extract variable names #subset data by names PreChallenge <- data[,PreChallenge] #extract prechallenge data OneWeek <- data[,OneWeek] #extract fold change data FoldChange <- data[,FoldChange] #extract fold change data #tidy variable names names(PreChallenge) <- gsub("IgA_", "", names(PreChallenge)) names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge)) names(OneWeek) <- gsub("IgA_", "", names(OneWeek)) names(OneWeek) <- gsub("_1week", "", names(OneWeek)) names(FoldChange) <- gsub("IgA_", "", names(FoldChange)) names(FoldChange) <- gsub("_1weekFC", "", names(FoldChange)) #ADD pharyngitis variable PreChallenge$Pharyngitis <- data$Pharyngitis OneWeek$Pharyngitis <- data$Pharyngitis FoldChange$Pharyngitis <- data$Pharyngitis #double check that column names are identical names(FoldChange) == names(PreChallenge) names(PreChallenge) == names(OneWeek) #generate fold change and p-value matrix Postchallenge.IgA = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix for(i in 1:17){ Postchallenge.IgA[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants Postchallenge.IgA[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants Postchallenge.IgA[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge.IgA[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneWeek[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge.IgA[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneWeek[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge.IgA[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge.IgA[i,7] <- wilcox.test(OneWeek[c(1:19),i],OneWeek[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value } for(i in 1:17){ Postchallenge.IgA[i,c(8:9)] <- p.adjust(Postchallenge.IgA[i,c(4:5)], method = "fdr" ) Postchallenge.IgA[i,c(10:11)] <- p.adjust(Postchallenge.IgA[i,c(6:7)], method = "fdr" ) } #make into a dataframe Postchallenge.IgA<- as.data.frame(Postchallenge.IgA) #name columns names(Postchallenge.IgA) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj") #name rows rownames(Postchallenge.IgA) <- names(OneWeek)[1:17] #export p-values for annoating graph. write.csv(file = "Postchallenge.IgA.csv", Postchallenge.IgA) #generate heatmap for IgA main antigens heated <- (Postchallenge.IgA[Antigen.Order,c(2,1)]) pdf(file = "IgA_FcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() #generate heatmap for IgA additional antigens heated <- (Postchallenge.IgA[additional,c(2,1)]) pdf(file = "IgA_FcHeatmap.additional.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() Related to Fig 2C - Supp Figure 1: heatmaps for IgG 1wk and IgA 1 mo. #IgA 1 month data. used only for Supp fig x: FigS1_IgA$Mrp24 <- FigS1_IgA$Mrp24 +1 FigS1_IgA$P145 <-FigS1_IgA$P145 +1 pre <- FigS1_IgA[FigS1_IgA$X %in% "pre",] mo1 <- FigS1_IgA[FigS1_IgA$X %in% "1mo",] Foldchange_1mo <- mo1[,c(3:19)]/ pre[,c(3:19)] mo1$Saliva.IgA.AU <- pre$Saliva.IgA.AU # add participant ID Foldchange_1mo$Saliva.IgA.AU <- pre$Saliva.IgA.AU #keep only the 18 pharyngitis participants pre <- pre[pre$Saliva.IgA.AU %in% pharyngitis,] mo1 <- mo1[mo1$Saliva.IgA.AU %in% pharyngitis,] Foldchange_1mo <- Foldchange_1mo[pre$Saliva.IgA.AU %in% pharyngitis,] #generate fold change and p-value matrix Postchallenge.IgA.1month = matrix(data = 0, nrow = 17, ncol = 3) #initialize matrix for(i in 1:17){ Postchallenge.IgA.1month[i,1] <- median(Foldchange_1mo[,i],na.rm = T) #median FC pharyngitis Postchallenge.IgA.1month[i,2] <- wilcox.test(pre[,i+2],mo1[,i+2], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge.IgA.1month[i,3] <- p.adjust(Postchallenge.IgA.1month[i,2], method = "fdr" ) } #make into a dataframe Postchallenge.IgA.1month<- as.data.frame(Postchallenge.IgA.1month) #name columns names(Postchallenge.IgA.1month) <- c("FC_Pharyn", "p_0vs1_pharyn","p_0vs1_nopharyn.adj") #name rows rownames(Postchallenge.IgA.1month) <- names(Foldchange_1mo[,1:17]) #export p-values for annoating graph. write.csv(file = "Postchallenge.IgA.1month.csv", Postchallenge.IgA.1month) #generate heatmap for IgA main antigens heated <- (Postchallenge.IgA.1month[Antigen.Order,]) pdf(file = "IgA_FcHeatmap_1mo.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() #generate heatmap for IgA additional antigens heated <- (Postchallenge.IgA.1month[additional,]) pdf(file = "IgA_FcHeatmap.additional_1mo.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated[,]) dev.off() # Supp Fig 1 (continued) IgG 1 week. Ig.Names <- names(data[,32:150]) PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)] OneWeek <- Ig.Names[grepl("1week", Ig.Names)] OneWeek <- OneWeek[!grepl("1weekFC", OneWeek)] FoldChange <- Ig.Names[grepl("1weekFC", Ig.Names)] PreChallenge <- data[,PreChallenge] OneWeek <- data[,OneWeek] FoldChange <- data[,FoldChange] names(PreChallenge) <- gsub("IgG_", "", names(PreChallenge)) names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge)) PreChallenge$Pharyngitis <- data$Pharyngitis names(OneWeek) <- gsub("IgG_", "", names(OneWeek)) names(OneWeek) <- gsub("_1week", "", names(OneWeek)) OneWeek$Pharyngitis <- data$Pharyngitis names(FoldChange) <- gsub("IgG_", "", names(FoldChange)) names(FoldChange) <- gsub("_1weekFC", "", names(FoldChange)) FoldChange$Pharyngitis <- data$Pharyngitis names(FoldChange) == names(PreChallenge) # double check names(PreChallenge) == names(OneWeek) # double check Postchallenge_1wk = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix for(i in 1:17){ Postchallenge_1wk[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants Postchallenge_1wk[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants Postchallenge_1wk[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge_1wk[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneWeek[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge_1wk[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneWeek[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value Postchallenge_1wk[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value Postchallenge_1wk[i,7] <- wilcox.test(OneWeek[c(1:19),i],OneWeek[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value } for(i in 1:17){ Postchallenge_1wk[i,c(8:9)] <- p.adjust(Postchallenge_1wk[i,c(4:5)], method = "fdr" ) Postchallenge_1wk[i,c(10:11)] <- p.adjust(Postchallenge_1wk[i,c(6:7)], method = "fdr" ) } #make into a data-frame Postchallenge_1wk<- as.data.frame(Postchallenge_1wk) #define column names names(Postchallenge_1wk) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj") rownames(Postchallenge_1wk) <- names(FoldChange)[1:17] #name rows with antigen names Postchallenge_1wk <- Postchallenge_1wk[c(Antigen.Order,additional),]#reorder # export table of FC and p-values to use to annotate figures. write.csv(file = "Postchallenge.IgG_1wk.csv", Postchallenge_1wk) # Make heatmap - Main antigens heated <- (Postchallenge_1wk[Antigen.Order,c(2,1)]) pdf(file = "IgG_FcHeatmap.pdf_1wk.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() # Make heatmap - additional antigens heated <- (Postchallenge_1wk[additional,c(2,1)]) pdf(file = "IgG_FcHeatmap.additional_1wk.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() Fig 2D and E: Column graphs of the #antigens each individual responded to for serum IgG # Specifying the antigens of interest Filtered_Ags <- c("IgG_M75_1monthFC", "IgG_GAC_1monthFC", "IgG_SLO_1monthFC", "IgG_ScpA_1monthFC", "IgG_SPYAD_1monthFC", "IgG_SPYCEP_1monthFC", "IgG_ADI_1monthFC", "IgG_Enn336_1monthFC", "IgG_Mrp24_1monthFC", "IgG_TF_1monthFC", "IgG_T25_1monthFC") #remove the two individuals without 1 month data FC.vars <- data[!data$id %in% c("SN017","SN075"),] # Initializing a binary matrix to store response data binary <- matrix(nrow = 23, ncol = 11) # Populating the binary matrix based on a 25% increase threshold for(i in 1:11){ binary[,i] <- if_else(FC.vars[, Filtered_Ags[i]] > 1.25, 1, 0) } # Converting the binary matrix to a data frame binary <- as.data.frame(binary) names(binary) <- Filtered_Ags # Counting the number of positive responses (value == 1) for each row binary$count.1 <- apply(binary, 1, function(x) length(which(x == 1))) # Setting row names to match the IDs row.names(binary) <- FC.vars$id # Adding the Pharyngitis column from the original dataset binary$Pharyngitis <- FC.vars$Pharyngitis # Adding a new column to the original dataset to tally IgG responses data$Tally_IgG <- ifelse(data$id %in% rownames(binary), binary$count.1, NA) # Creating a subset for specific responders to use in MDS analysis (Fig 6) four.responders <- binary[,c("IgG_SPYCEP_1monthFC","IgG_GAC_1monthFC" ,"IgG_SLO_1monthFC","IgG_ScpA_1monthFC")] four.responders$count <- apply(four.responders, 1, function(x) length(which(x==1))) four.responders$count_2 <- ifelse(four.responders$count <2, "no", "yes") four.responders <- rownames(four.responders[four.responders$count >2,]) # plot results (Fig2E IgG) ggplot(binary, aes(Pharyngitis, count.1, fill = Pharyngitis))+ geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+ geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ ylab("Antigen response (#)")+ xlab("")+ stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F, aes(label = paste0("p = ", ..p.format..)))+ coord_cartesian( ylim = c(0, 11))+ scale_y_continuous(breaks = seq(0, 11, len = 12))+ custom_theme() #Saving the plot to a PDF file ggsave(file = "Antigen_response_25pc_IgG_Tally.pdf", width = 3.25, height = 4 ) # Calculate the count of positive responses for each antigen in the "pharyngitis" group Antigen.Tally <- apply(binary[binary$Pharyngitis == "pharyngitis",1:11], 2, function(x) length(which(x==1))) # Define and clean antigen names Antigen.Tally.Names <-Filtered_Ags Antigen.Tally.Names <- gsub("_1monthFC", "", Antigen.Tally.Names) Antigen.Tally.Names <- gsub("IgG_", "", Antigen.Tally.Names) # Create a data frame for antigen tally Antigen.Tally <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally) Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order) Antigen.Tally$Percent <- (Antigen.Tally$Count/nrow(binary[binary$Pharyngitis == "pharyngitis",]))*100 Antigen.Tally$Pharyngitis <- "pharyngitis" # Repeat the same process for the "no pharyngitis" group Antigen.Tally.np <- apply(binary[binary$Pharyngitis == "no pharyngitis",1:11], 2, function(x) length(which(x==1))) # Create a data frame for antigen tally in the "no pharyngitis" group Antigen.Tally.np <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally.np) Antigen.Tally.np$Antigen <- factor(Antigen.Tally.np$Antigen, levels = Antigen.Order) Antigen.Tally.np$Percent <- (Antigen.Tally.np$Count/nrow(binary[binary$Pharyngitis == "no pharyngitis",]))*100 Antigen.Tally.np$Pharyngitis <- "no pharyngitis" # Combine both groups into a single data frame Antigen.Tally<- rbind(Antigen.Tally, Antigen.Tally.np) Antigen.Tally$Pharyngitis <- factor(Antigen.Tally$Pharyngitis, levels = c("no pharyngitis","pharyngitis")) # Reorder antigens by count for visualization Antigen.Tally.Order<- Antigen.Tally.Names[rev(order(Antigen.Tally$Count))] Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order) # Graph % and faceted by outcome (Fig2D IgG) ggplot(Antigen.Tally, aes(x = Antigen, y = Percent, fill = Pharyngitis)) + geom_col(alpha = 0.75, color= "black") + geom_text(aes(label=Count), nudge_y = 3)+ scale_fill_manual(values = c("blue","red"))+ xlab("")+ ylab("Responders (%)")+ scale_y_continuous(expand = c(0, 0))+ coord_cartesian( ylim = c(0, 100))+ facet_grid(. ~ Pharyngitis)+ custom_theme()+ theme(axis.text.x=element_text(angle = 90, vjust = 0.5, hjust=1)) #Save the percentage plot to a PDF file ggsave(file = "Percent.responders.core.pdf", width = 5, height = 4) rm(binary) rm(FC.vars) Figure 2D and E (continued): Column graph of the #antigens each individual responded to for saliva IgA # Specifying the antigens of interest Filtered_Ags <- c("IgA_M75_1weekFC", "IgA_GAC_1weekFC", "IgA_SLO_1weekFC", "IgA_ScpA_1weekFC", "IgA_SPYAD_1weekFC", "IgA_SPYCEP_1weekFC", "IgA_ADI_1weekFC", "IgA_Enn336_1weekFC", "IgA_Mrp24_1weekFC", "IgA_TF_1weekFC", "IgA_T25_1weekFC") # build FC.vars dataframe FC.vars <- data # Initializing a binary matrix to store response data binary <- matrix(nrow = 25, ncol = 11) # Populating the binary matrix based on a 25% increase threshold for(i in 1:11){ binary[,i] <- if_else(FC.vars[, Filtered_Ags[i]] > 1.25, 1, 0) } # Converting the binary matrix to a data frame binary <- as.data.frame(binary) names(binary) <- Filtered_Ags # Counting the number of positive responses (value == 1) for each row binary$count.1 <- apply(binary, 1, function(x) length(which(x == 1))) # Setting row names to match the IDs row.names(binary) <- FC.vars$id # Adding the Pharyngitis column from the original dataset binary$Pharyngitis <- FC.vars$Pharyngitis # Adding a new column to the original dataset to tally IgA responses data$Tally_IgA <- ifelse(data$id %in% rownames(binary), binary$count.1, NA) # plot results (Fig2E IgA) ggplot(binary, aes(Pharyngitis, count.1, fill = Pharyngitis))+ geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+ geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ ylab("Antigen response (#)")+ xlab("")+ stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F, aes(label = paste0("p = ", ..p.format..)))+ coord_cartesian( ylim = c(0, 11))+ scale_y_continuous(breaks = seq(0, 11, len = 12))+ custom_theme() #Saving the plot to a PDF file ggsave(file = "Antigen_response_25pc_IgA_Tally.pdf", width = 3.25, height = 4 ) # Calculate the count of positive responses for each antigen in the "pharyngitis" group Antigen.Tally <- apply(binary[binary$Pharyngitis == "pharyngitis",1:11], 2, function(x) length(which(x==1))) # Define and clean antigen names Antigen.Tally.Names <-Filtered_Ags Antigen.Tally.Names <- gsub("_1weekFC", "", Antigen.Tally.Names) Antigen.Tally.Names <- gsub("IgA_", "", Antigen.Tally.Names) # Create a data frame for antigen tally Antigen.Tally <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally) Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order) Antigen.Tally$Percent <- (Antigen.Tally$Count/nrow(binary[binary$Pharyngitis == "pharyngitis",]))*100 Antigen.Tally$Pharyngitis <- "pharyngitis" # Repeat the same process for the "no pharyngitis" group Antigen.Tally.np <- apply(binary[binary$Pharyngitis == "no pharyngitis",1:11], 2, function(x) length(which(x==1))) # Create a data frame for antigen tally in the "no pharyngitis" group Antigen.Tally.np <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally.np) Antigen.Tally.np$Antigen <- factor(Antigen.Tally.np$Antigen, levels = Antigen.Order) Antigen.Tally.np$Percent <- (Antigen.Tally.np$Count/nrow(binary[binary$Pharyngitis == "no pharyngitis",]))*100 Antigen.Tally.np$Pharyngitis <- "no pharyngitis" # Combine both groups into a single data frame Antigen.Tally<- rbind(Antigen.Tally, Antigen.Tally.np) Antigen.Tally$Pharyngitis <- factor(Antigen.Tally$Pharyngitis, levels = c("no pharyngitis","pharyngitis")) # Reorder antigens by count for visualization Antigen.Tally.Order<- Antigen.Tally.Names[rev(order(Antigen.Tally$Count))] Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order) # Graph % and faceted by outcome (Fig2D IgA) ggplot(Antigen.Tally, aes(x = Antigen, y = Percent, fill = Pharyngitis)) + geom_col(alpha = 0.75, color= "black") + geom_text(aes(label=Count), nudge_y = 3)+ scale_fill_manual(values = c("blue","red"))+ xlab("")+ ylab("Responders (%)")+ scale_y_continuous(expand = c(0, 0))+ coord_cartesian( ylim = c(0, 100))+ facet_grid(. ~ Pharyngitis)+ custom_theme()+ theme(axis.text.x=element_text(angle = 90, vjust = 0.5, hjust=1)) #Save the percentage plot to a PDF file ggsave(file = "Percent.responders.core.pdf", width = 5, height = 4) #remove objects and dataframes that wont be used again rm(binary) rm(FC.vars) rm(Filtered_Ags) Fig 2F: graphing significant antigens raw values # Define a function for plotting plot_antigen <- function(data, antigen, antibody_type, timepoints, file_suffix) { # Filter data plotting <- data %>% filter(Antigen == antigen, timepoint %in% timepoints, Type == antibody_type) # Define group variable plotting$Group <- factor(paste(plotting$Pharyngitis, plotting$timepoint, sep= "_"), levels = c(paste("no pharyngitis", timepoints, sep="_"), paste("pharyngitis", timepoints, sep="_"))) plotting$id <- factor(plotting$id) # Plot the graph p <- ggplot(plotting, aes(Group, value, fill = Pharyngitis)) + geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge") + geom_beeswarm(size = 3, alpha = 0.75, cex =2, shape = 21) + geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8) + scale_fill_manual(values = c("blue", "red")) + scale_alpha_manual(values = c(0.15, 0.5)) + scale_color_manual(values = c("blue", "red")) + ylab(paste(antigen, "(ELISA AU)")) + xlab("") + coord_cartesian(ylim = c(0, max(plotting$value, na.rm = TRUE) * 1.25)) + custom_theme() # Save the plot ggsave(file = paste(antigen, file_suffix, ".pdf", sep = "_"), plot = p, width = 4, height = 4) } # Prepare data for IgG rep.plotting_igg <- data_melt %>% filter(timepoint %in% c("Pre.challenge", "1month"), Type == "IgG") # Antigens to plot for IgG antigens_igg <- c("SPYCEP", "SLO", "ScpA", "GAC", "TF") # Plot IgG antigens for (antigen in antigens_igg) { plot_antigen(rep.plotting_igg, antigen, "IgG", c("Pre.challenge", "1month"), "rawtitres") } # Prepare IgA data rep.plotting_iga <- data_melt %>% filter(timepoint %in% c("Pre.challenge", "1week"), Type == "IgA") # Antigens to plot for IgA antigens_iga <- c("GAC", "M75.HVR") # Plot IgA antigens for (antigen in antigens_iga) { plot_antigen(rep.plotting_iga, antigen, "IgA", c("Pre.challenge", "1week"), "IgA.rawtitres") } #remove variables and dataframes not used again rm(rep.plotting_iga, rep.plotting_igg, antigens_iga, antigens_igg) Related to Fig 2F - Supplementary Fig S3 and S4: Graph each antigen seperately - raw titres #make a group variable with short names for the graph rep.plotting <- data_melt[data_melt$timepoint %in% c("Pre.challenge","1month")& data_melt$Type == "IgG",] rep.plotting$Group <- factor(paste(rep.plotting$Pharyngitis, rep.plotting$timepoint, sep= "_"), levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1month", "pharyngitis_Pre.challenge", "pharyngitis_1month")) rep.plotting$Group <- ifelse(rep.plotting$Group == "no pharyngitis_Pre.challenge", "NP.Pre", ifelse(rep.plotting$Group == "no pharyngitis_1month", "NP.1mo",ifelse(rep.plotting$Group == "pharyngitis_Pre.challenge", "P.Pre", "P.1mo"))) rep.plotting$Group <- factor(rep.plotting$Group, levels = c("NP.Pre","NP.1mo","P.Pre", "P.1mo" )) rep.plotting$id <- factor(rep.plotting$id) #make the list Antigen.order.all <- c(Antigen.Order, additional) rep.plotting$Antigen <- factor(rep.plotting$Antigen , levels = Antigen.order.all ) antigen_levels <- unique(levels(rep.plotting$Antigen)) removeAgs <- c("GAC", "SLO", "ScpA", "SPYCEP") # These are already plotted is_in_removeAgs <- antigen_levels %in% removeAgs antigen_levels <- antigen_levels[!is_in_removeAgs] #IgG data first plot_list1 <- lapply(antigen_levels, function(antigen) { # Subset the data for the current level of 'Antigen' subset_df <- rep.plotting[rep.plotting$Antigen == antigen, ] ggplot(subset_df, aes(Group, value, fill = Pharyngitis))+ geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge")+ geom_point(size = 3, alpha = 0.75, shape = 21)+ geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+ scale_fill_manual(values = c("blue", "red"))+ scale_alpha_manual(values = c(0.15, 0.5))+ scale_color_manual(values = c("blue", "red"))+ ylab(paste(antigen, "(ELISA AU)"))+ xlab("")+ coord_cartesian( ylim = c(0, max(subset_df$value, na.rm = T)*1.25))+ theme_bw()+ theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 9), axis.text.y=element_text(colour="black", size = 9), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_text(size = 12)) }) #IgA data next rep.plotting <- data_melt[data_melt$timepoint %in% c("Pre.challenge","1week")& data_melt$Type == "IgA",] rep.plotting$Group <- factor(paste(rep.plotting$Pharyngitis, rep.plotting$timepoint, sep= "_"), levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1week", "pharyngitis_Pre.challenge", "pharyngitis_1week")) rep.plotting$Group <- ifelse(rep.plotting$Group == "no pharyngitis_Pre.challenge", "NP.Pre", ifelse(rep.plotting$Group == "no pharyngitis_1week", "NP.1wk",ifelse(rep.plotting$Group == "pharyngitis_Pre.challenge", "P.Pre", "P.1wk"))) rep.plotting$Group <- factor(rep.plotting$Group, levels = c("NP.Pre","NP.1wk","P.Pre", "P.1wk" )) rep.plotting$id <- factor(rep.plotting$id) rep.plotting$Antigen <- factor(rep.plotting$Antigen , levels = Antigen.order.all ) antigen_levels <- unique(levels(rep.plotting$Antigen)) removeAgs <- c("GAC", "M75.HVR") # These are already plotted is_in_removeAgs <- antigen_levels %in% removeAgs antigen_levels <- antigen_levels[!is_in_removeAgs] plot_list2 <- lapply(antigen_levels, function(antigen) { # Subset the data for the current level of 'Antigen' subset_df <- rep.plotting[rep.plotting$Antigen == antigen, ] ggplot(subset_df, aes(Group, value, fill = Pharyngitis))+ geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge")+ geom_point(size = 3, alpha = 0.75, shape = 21)+ geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+ scale_fill_manual(values = c("blue", "red"))+ scale_alpha_manual(values = c(0.15, 0.5))+ scale_color_manual(values = c("blue", "red"))+ ylab(paste(antigen, "(ELISA AU)"))+ xlab("")+ coord_cartesian( ylim = c(0, max(subset_df$value, na.rm = T)*1.25))+ theme_bw()+ theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 9), axis.text.y=element_text(colour="black", size = 9), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_text(size = 12)) }) # Step 2: Use lapply() to generate a list of ggplots for each 'Antigen' level combined_plot1<- plot_grid(plotlist= plot_list1[1:9], ncol = 3, align = 'h') combined_plot2<- plot_grid(plotlist = plot_list1[10:17], ncol = 3, align = 'h') combined_plot3<- plot_grid(plotlist = plot_list2[1:9], ncol = 3, align = 'h') combined_plot4<- plot_grid(plotlist = plot_list2[10:17], ncol = 3, align = 'h') # Now, use ggsave to save the combined plot to a file ggsave("combined_plots1_IgG.pdf", plot = combined_plot1, width = 8, height = 8, dpi = 300) ggsave("combined_plots2_IgG.pdf", plot = combined_plot2, width = 8, height = 8, dpi = 300) ggsave("combined_plots3_IgA.pdf", plot = combined_plot3, width = 8, height = 8, dpi = 300) ggsave("combined_plots4_IgA.pdf", plot = combined_plot4, width = 8, height = 8, dpi = 300) rm(plot_list1, plot_list2, combined_plot1, combined_plot2, combined_plot3, combined_plot4) Fig 2G : Heatmap of correlation coefficients between fold change in IgA and IgG. # heatmap function: breaksList = c(-1.0, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1) # a ramp of vales for log2 FC heatpal <- c("#2166AC" ,"#3F8DC0", "#80B9D8" ,"#BBDAEA" ,"#F7F7F7" ,"#F4A582", "#D6604D", "#B2182B", "#67001F") heatIT <- function(heated) {pheatmap(heated, kmeans_k = NA, breaks = breaksList, color = colorRampPalette(heatpal)(11), border_color = "black", cellwidth = 11, cellheight =11, # display_numbers = test_labels, number_format = "%", scale = "none", cluster_rows = F, cluster_cols = F, number_color = "grey", show_rownames = T, show_colnames = F, fontsize = 8, display_numbers = F, filename = NA, silent = FALSE) } #extract variables FC.Names <- names(data[,grepl("FC", names(data))]) #further subset variable names IgA <- FC.Names[grepl("IgA", FC.Names)] OneWeek <- FC.Names[grepl("week", FC.Names) ] OneWeek <- OneWeek[!grepl("IgA", OneWeek) ] OneMonth <- FC.Names[grepl("1month", FC.Names)] #subset data by timepoint IgA <- data[,IgA] OneWeek <- data[,OneWeek] OneMonth <- data[,OneMonth] #tidy variable names names(IgA) <- gsub("IgA_", "", names(IgA)) names(IgA) <- gsub("_1weekFC", "", names(IgA)) names(OneWeek) <- gsub("IgG_", "", names(OneWeek)) names(OneWeek) <- gsub("_1weekFC", "", names(OneWeek)) names(OneMonth) <- gsub("IgG_", "", names(OneMonth)) names(OneMonth) <- gsub("_1monthFC", "", names(OneMonth)) # keep main antigens only IgA <- IgA[,Antigen.Order] OneWeek <- OneWeek[,Antigen.Order] OneMonth <- OneMonth[,Antigen.Order] #initiate matrix Postchallenge = matrix(data = 0, nrow = 11, ncol = 4) #initialize matrix for(i in 1:11){ Postchallenge[i,1] <- cor.test(IgA[,i], OneWeek[,i], method = "spearman")$estimate #median FC pharyngitis participants Postchallenge[i,2] <- cor.test(IgA[,i], OneMonth[,i], method = "spearman")$estimate #median FC non pharyngitis participants Postchallenge[i,3] <- cor.test(IgA[,i], OneWeek[,i], method = "spearman")$p.value Postchallenge[i,4] <- cor.test(IgA[,i], OneMonth[,i], method = "spearman")$p.value } #make into a dataframe and name columns and rows Postchallenge<- as.data.frame(Postchallenge) names(Postchallenge) <- c("Rho.1week", "Rho.1month", "pvalOneweek", "pvalOneMonth") row.names(Postchallenge) <- names(IgA) #graph it heated <- (Postchallenge[Antigen.Order,c(1,2)]) pdf(file = "IgA.correlationFcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1)) heatIT(heated) dev.off() #remove dataframes and variables no longer needed rm(mo1, heated, Postchallenge, Postchallenge.IgA, Postchallenge_1wk, Postchallenge.IgA.1month, rep.plotting_iga, breaksList, is_in_removeAgs,heatpal) Fig 3: Luminex data over 3 months #define plotting function PlotIT <-function(d){ggplot(data=d, aes(y=y)) + geom_line(aes(x=xj, group=id), color= "black", alpha = .8) + geom_point(data = d %>% filter(x=="1"), aes(x=xj), shape = 21, fill = 'darkgrey', size = 2.5, alpha = .7) + geom_point(data = d %>% filter(x=="2"), aes(x=xj), shape = 21, fill= '#F0E442', size = 2.5, alpha = .7) + geom_point(data = d %>% filter(x=="3"), aes(x=xj), shape = 21, fill= '#1FAF9D', size = 2.5, alpha = .7) + geom_point(data = d %>% filter(x=="4"), aes(x=xj), shape = 21, fill = '#521ADA', size = 2.5, alpha = .7) + geom_half_violin( data = d %>% filter(x=="1"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 3.7), side = "l", fill = 'darkgrey') + geom_half_violin( data = d %>% filter(x=="2"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 2.9), side = "l", fill = "#F0E442") + geom_half_violin( data = d %>% filter(x=="3"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 2.1), side = "l", fill = '#1FAF9D') + geom_half_violin( data = d %>% filter(x=="4"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 1.3), side = "l", fill = '#521ADA')+ #Define additional settings scale_x_continuous(breaks=c(1,2,3,4), labels=c("Pre", "1wk","1mo", "3mo"), limits=c(0.75, 5.3)) + xlab("") + ylab(paste(var, "log10 ug/mL")) + coord_cartesian(ylim= ylimits)+ ggtitle(paste(group, "over.time")) + custom_theme()+ theme(axis.text.x=element_text(angle = 90)) } # Function to adjust p-values p_values <- function(d, e) { p <- c( wilcox.test(d[d$x == 1,]$y, d[d$x == 2,]$y, paired = TRUE)$p.value, wilcox.test(d[d$x == 1,]$y, d[d$x == 3,]$y, paired = TRUE)$p.value, wilcox.test(d[d$x == 1,]$y, d[d$x == 4,]$y, paired = TRUE)$p.value ) p.adj.paired <- p.adjust(p, method = "fdr", n = length(p)) p <- c( wilcox.test(d[d$x == 1,]$y, e[e$x == 1,]$y, paired = FALSE)$p.value, wilcox.test(d[d$x == 2,]$y, e[e$x == 2,]$y, paired = FALSE)$p.value, wilcox.test(d[d$x == 3,]$y, e[e$x == 3,]$y, paired = FALSE)$p.value, wilcox.test(d[d$x == 4,]$y, e[e$x == 4,]$y, paired = FALSE)$p.value ) p.adj.groups <- p.adjust(p, method = "fdr", n = length(p)) return(list(paired = p.adj.paired, groups = p.adj.groups)) } #luminex data split out timepoints OneMonth <-luminex[luminex$timepoint == "1month",] PreChallenge <-luminex[luminex$timepoint == "Pre.challenge",] #remove the 2 individuals with missing one month datapoints PreChallenge <-PreChallenge[PreChallenge$id %in% OneMonth$id,] #Create normalised dataframe OneMonthFC <-OneMonth # to get the other vars OneMonthFC[,c(6:11)] <- OneMonth[,c(6:11)] / PreChallenge[,c(6:11)] # Keep data for participants with samples present at all time-points keep.Ids <- luminex %>% group_by(id) %>% tally() %>% filter(n == 4) %>% pull(id) #SpyCEP responders var <- "SPYCEP" #Split into group based on a 'non-response' of < 1.25 fold change SpyCEP.non.responders <- OneMonthFC[which(OneMonthFC$SpyCEP < 1.25),]$id #use this to generate new variable luminex$SPYCEP.Resp <- if_else(luminex$id %in% SpyCEP.non.responders, "no","yes") #define graph ylimits ylimits <- c(-0.25, 2.25) #graph SpyCEPresponders group <- "responders" d <- luminex[luminex$SPYCEP.Resp == "yes" & luminex$id %in% keep.Ids ,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$SpyCEP,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #graph SpyCEP non-responders group <- "non-responders" e <- luminex[luminex$SPYCEP.Resp == "no"& luminex$id %in% keep.Ids ,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$SpyCEP,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) #SLO var <- "SLO" #Split into group based on a 'non-response' of < 1.25 fold change SLO.non.responders <- OneMonthFC[which(OneMonthFC$SLO < 1.25),]$id luminex$SLO.Resp <- if_else(luminex$id %in% SLO.non.responders, "no","yes") #set graph limits ylimits <- c(0.5, 2.5) #graph responders group <- "responders" d <- luminex[luminex$SLO.Resp == "yes"& luminex$id %in% keep.Ids,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$SLO,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) group <- "non-responders" e <- luminex[luminex$SLO.Resp == "no"& luminex$id %in% keep.Ids ,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$SLO,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) #SCPA var <- "SCPA" ylimits <- c(1, 2.5) SCPA.non.responders <- OneMonthFC[which(OneMonthFC$SCPA < 1.25),]$id luminex$SCPA.Resp <- if_else(luminex$id %in% SCPA.non.responders, "no","yes") luminex %>% filter(timepoint == "Pre.challenge" & id %in% keep.Ids) %>% count(SCPA.Resp) group <- "responders" d <- luminex[luminex$SCPA.Resp == "yes" & luminex$id %in% keep.Ids,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$SCPA,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) group <- "non-responders" e <- luminex[luminex$SCPA.Resp == "no"& luminex$id %in% keep.Ids ,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$SCPA,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) ##GAC var <- "GAC" ylimits <- c(-1.5,1) GAC.non.responders <- OneMonthFC[which(OneMonthFC$GAC < 1.25),]$id luminex$GAC.Resp <- if_else(luminex$id %in% GAC.non.responders, "no","yes") luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(GAC.Resp) group <- "responders" d <- luminex[luminex$GAC.Resp == "yes" & luminex$id %in% keep.Ids,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$GAC,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) group <- "non-responders" e <- luminex[luminex$GAC.Resp == "no" & luminex$id %in% keep.Ids,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$GAC,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) #DNAseB var <- "DnaseB" DnaseB.non.responders <- OneMonthFC[which(OneMonthFC$DnaseB < 1.25),]$id luminex$Dnase.Resp <- if_else(luminex$id %in% DnaseB.non.responders, "no","yes") luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(Dnase.Resp) ylimits <- c(-1,2.5) group <- "responders" d <- luminex[luminex$Dnase.Resp == "yes" & luminex$id %in% keep.Ids,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$DnaseB,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) group <- "non-responders" e <- luminex[luminex$Dnase.Resp == "no" & luminex$id %in% keep.Ids,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$DnaseB,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) ##SPNA var <- "SpnA" SpnA.non.responders <- OneMonthFC[which(OneMonthFC$SpnA < 1.25),]$id luminex$SpnA.Resp <- if_else(luminex$id %in% SpnA.non.responders, "no","yes") luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(SpnA.Resp) ylimits <- c(-1,2) group <- "responders" d <- luminex[luminex$SpnA.Resp == "yes" & luminex$id %in% keep.Ids,] d$x <-as.numeric(d$timepoint) d$xj <-jitter(d$x, amount = .09) d$y <-log(d$SpnA,10) PlotIT(d) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) group <- "non-responders" e <- luminex[luminex$SpnA.Resp == "no" & luminex$id %in% keep.Ids,] e$x <-as.numeric(e$timepoint) e$xj <-jitter(e$x, amount = .09) e$y <-log(e$SpnA,10) PlotIT(e) ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0) #write out p-values print(p_values(d,e)) rm(d, e, group, var, DnaseB.non.responders, GAC.non.responders, SCPA.non.responders, SLO.non.responders, SpnA.non.responders, SpyCEP.non.responders) Fig 4: luminex compare responses to pharyngitis children #join children dataset to pharyngitis participants from Challenge data luminex.keep <- luminex[ luminex$timepoint %in% c("Pre.challenge", "1month") & luminex$pharyngitis == "pharyngitis" & !luminex$id %in% c("SN017", "SN075"), c("id", "timepoint", "SCPA", "SpyCEP", "GAC", "SLO", "DnaseB", "SpnA") ] names(luminex.children) <- names(luminex.keep) luminex.keep <- rbind(luminex.keep, luminex.children) luminex.keep$timepoint <- factor(luminex.keep$timepoint, levels = c("Pre.challenge", "1month", "healthy_child", "pharyngitis_child")) # List of antigens antigens <- c("SpyCEP", "SLO", "SCPA", "GAC", "DnaseB", "SpnA") # Function to plot data for each antigen plot_antigen_children <- function(data, antigen) { d <- data d$x <- as.numeric(d$timepoint) d$xj <- jitter(d$x, amount = .09) d$y <- log(d[[antigen]], 10) p <- ggplot(d, aes(timepoint, y, fill = timepoint)) + geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale = "width", position = "dodge") + geom_beeswarm(size = 3, alpha = 0.65, cex = 2, shape = 21) + scale_fill_manual(values = c("red", "red", "#FFC107", "darkorange3")) + scale_alpha_manual(values = c(0.15, 0.5, 0.5, 0.5)) + scale_color_manual(values = c("red", "red", "#FFC107", "darkorange3")) + ylab(antigen) + xlab("") + coord_cartesian(ylim = c(min(d$y, na.rm = TRUE) * 1.25, max(d$y, na.rm = TRUE) * 1.25)) + custom_theme() ggsave(file = paste(antigen, "children.pdf", sep = "_"), plot = p, width = 4, height = 5) } # Iterate through each antigen and plot for (antigen in antigens) { plot_antigen_children(luminex.keep, antigen) } #annotated p-value luminex healthy luminex_pvals = matrix(data = 0, nrow = 6, ncol = 12) #initialize matrix for(i in 1:6){ luminex_pvals[i,1] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(18:34),c(i+2)], paired = T, alternative = "two.sided", na.rm = T)$p.value luminex_pvals[i,2] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(35:49),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value luminex_pvals[i,3] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value luminex_pvals[i,4] <- wilcox.test(luminex.keep[c(18:34),c(i+2)],luminex.keep[c(35:49),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value luminex_pvals[i,5] <- wilcox.test(luminex.keep[c(18:34),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value luminex_pvals[i,6] <- wilcox.test(luminex.keep[c(35:49),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value } for(i in 1:6){ luminex_pvals[i,c(7:12)] <- p.adjust(luminex_pvals[i,c(1:6)], method = "fdr" ) } luminex_pvals<- as.data.frame(luminex_pvals) names(luminex_pvals) <- c("prevspost","pre_vs_healthy","pre_vs_pharyn","post_vs_healthy","post_vs_pharyn", "healthy_vs_pharyn", "prevspost.adj","pre_vs_healthy.adj","pre_vs_pharyn.adj","post_vs_healthy.adj","post_vs_pharyn.adj", "healthy_vs_pharyn.adj") rownames(luminex_pvals) <- names(luminex.keep)[3:8] write.csv(file = "children.luminex.csv", luminex_pvals) rm(luminex_pvals, antigens, luminex.keep) Figure 5: Luminex: Correlation of pre-challenge titre with 1 month fold change: #define plotting function plot_and_save <- function(data, fc_data, antigen, file_name, xlim, ylim) { p <- ggplot(data, aes(y = log(data[[antigen]], 10), x = log(fc_data[[antigen]], 10))) + geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill = pharyngitis)) + scale_fill_manual(values = c("blue", "red")) + ylab("log10 Pre challenge") + xlab("log10 FC IgG 1month") + geom_smooth(method = 'lm', se = FALSE, colour = "black") + coord_cartesian(ylim = ylim, xlim = xlim) + custom_theme() ggsave(file = file_name, plot = p, width = 3, height = 3) # Spearman Correlation p-value cor_result <- cor.test(data[[antigen]], fc_data[[antigen]], method = "spearman") print(cor_result) return(cor_result) } antigens <- c("SpyCEP", "SLO", "SCPA", "GAC", "SpnA", "DnaseB") xlims <- list(c(-0.5, 2.5), c(-0.1, 0.8), c(-0.1, 0.8), c(-0.5, 1.5), c(-0.1, 1.5), c(-0.5, 2.5)) ylims <- list(c(-0.5, 2.5), c(0.5, 2.5), c(0.5, 2.5), c(-1.5, 1.5), c(-1.5, 1.65), c(-1, 2.5)) # Loop through each antigen and plot for (i in seq_along(antigens)) { antigen <- antigens[i] xlim <- xlims[[i]] ylim <- ylims[[i]] file_name <- paste("PrechallengevsFC_", antigen, "_luminex.pdf", sep = "") plot_and_save(PreChallenge, OneMonthFC, antigen, file_name, xlim, ylim) } # now repeat for M75 IgG and for ScpA IgA, bot ELISA data ggplot(data, aes(y = log(IgA_ScpA_Pre.challenge,10), x = log(data$IgA_ScpA_1weekFC,10)))+ geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill =Pharyngitis))+ scale_fill_manual(values = c("blue", "red"))+ ylab("log10 Pre challenge")+ xlab("log 10 FC IgG 1month")+ geom_smooth(method='lm', se = FALSE, colour = "black")+ coord_cartesian(ylim = c(0.5,2.5), xlim= c(-0.5, 0.5))+ custom_theme() ggsave(file = "PrechallengevsFC_IgA_Scpa.pdf", width = 3, height = 3 ) ggplot(data, aes(y = log(IgG_M75_Pre.challenge,10), x = log(data$IgG_M75_1weekFC,10)))+ geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill =Pharyngitis))+ scale_fill_manual(values = c("blue", "red"))+ ylab("log10 Pre challenge")+ xlab("log10 FC IgG 1month")+ geom_smooth(method='lm', se = FALSE, colour = "black")+ coord_cartesian(ylim = c(1,4), xlim= c(-0.25, 0.5))+ custom_theme() ggsave(file = "PrechallengevsFC_M75.pdf", width = 3, height = 3 ) rm(PreChallenge, OneMonth, OneMonthFC) Fig 6 and 7: multivariate analysis - data preparation #build a big dataframe of pre-challenge data variables, and substitute luminex for ELISA where antigens are in common. #keep baseline only. Substitute luminex for ELISA PreChallenge <- data[,grepl("Pre.challenge",names(data))] # add in clinical variables to include in Figure 7. PreChallenge <-cbind(PreChallenge, data[,c(204,1,2,208:ncol(data))]) #reorder variables PreChallenge <- PreChallenge[,c(36:ncol(PreChallenge), 1:35)] #create new dataframe for removing non-IgG variables PreChallenge.lumi <-luminex[luminex$timepoint == "Pre.challenge",c(1, 6:11)] PreChallenge <- PreChallenge[,!names(PreChallenge) %in% c("IgG_SPYCEP_Pre.challenge","IgG_GAC_Pre.challenge","IgG_ScpA_Pre.challenge","IgG_SLO_Pre.challenge")] PreChallenge <- left_join(PreChallenge, PreChallenge.lumi, by = "id") Fig 6: Multidimensional scaling set.seed(36) # ensures same plots each time for mds #remove anything thats not IgG PreChallenge_trim <- PreChallenge[,!grepl("IgA_", names(PreChallenge))] PreChallenge_trim <- PreChallenge_trim[,!grepl("Adehesion_percent_Pre_saliva", names(PreChallenge_trim))] # remove clinical variables and antigen response tally. Create the Prechallenge.z dataframe for use later PreChallenge.z <- PreChallenge_trim[,c(1,14,17:ncol(PreChallenge_trim))] #tidy names names(PreChallenge.z) <- gsub("IgG_", "", names(PreChallenge.z)) names(PreChallenge.z) <- gsub("_Pre.challenge", "", names(PreChallenge.z)) #Select one month data OneMonth <- data[,grepl("1month",names(data))] OneMonth <- OneMonth[,!grepl("FC",names(OneMonth))] OneMonth <-cbind(OneMonth, data[,c(1,219)]) #reorder OneMonth <- OneMonth[,c(18:ncol(OneMonth), 1:17)] OneMonth.lumi <-luminex[luminex$timepoint == "1month", c(1, 6:11)] #remove ELISA vars OneMonth <- OneMonth[,!names(OneMonth) %in% c("IgG_SPYCEP_1month","IgG_GAC_1month","IgG_ScpA_1month","IgG_SLO_1month")] OneMonth <- left_join(OneMonth, OneMonth.lumi, by = "id") #Create the OneMonth.z dataframe for use later One.Month.z <- OneMonth #Tidy names names(One.Month.z) <- gsub("IgG_", "", names(One.Month.z)) names(One.Month.z) <- gsub("_1month", "", names(One.Month.z)) #select 3 months data ThreeMonths <- data[,grepl("3months",names(data))] ThreeMonths <- ThreeMonths[,!grepl("FC",names(ThreeMonths))] ThreeMonths <-cbind(ThreeMonths, data[,c(1,219)]) #reorder ThreeMonths <- ThreeMonths[,c(18:ncol(ThreeMonths), 1:17)] ThreeMonths.lumi <-luminex[luminex$timepoint == "3months", c(1, 6:11)] #remove ELISA vars ThreeMonths <- ThreeMonths[,!names(ThreeMonths) %in% c("IgG_SPYCEP_3months","IgG_GAC_3months","IgG_ScpA_3months","IgG_SLO_3months")] ThreeMonths <- left_join(ThreeMonths, ThreeMonths.lumi, by = "id") #Create the Threemonths.z dataframe for use later ThreeMonths.z <- ThreeMonths names(ThreeMonths.z) <- gsub("IgG_", "", names(ThreeMonths.z)) names(ThreeMonths.z) <- gsub("_3months", "", names(ThreeMonths.z)) #double check names are consistent names(PreChallenge.z) == names(One.Month.z) names(ThreeMonths.z) == names(One.Month.z) #add timepoint variable PreChallenge.z$timepoint <- "Pre" One.Month.z$timepoint <- "1mo" ThreeMonths.z$timepoint <- "3mo" #bind together data.z <- rbind(PreChallenge.z, One.Month.z, ThreeMonths.z) data.z$timepoint <- factor(data.z$timepoint, levels =c("Pre","1mo", "3mo")) #reorder data.z <- data.z[,c(1, 2, 22, 3:21)] #Zscore transform data norm.study <- as.data.frame(data.z[,c(4:22)]) # define variables with which to normalise data using the 5-95% intervals of the data conf.Int <- matrix(nrow = ncol(norm.study), ncol =4) for(i in 1:19){ conf.Int[i,1] <- quantile(norm.study[,i], probs=c(0.05), na.rm= T) # 5th percentile conf.Int[i,2] <- quantile(norm.study[,i], probs=c(0.95), na.rm= T) # 95th Percentile conf.Int[i,3] <- mean(norm.study[,i], na.rm= T) # determine mean of data within 90th percentile conf.Int[i,4] <- sd(norm.study[,i], na.rm= T) # determine st dev of data within 90th percentile } #normalise the data for(i in 1:19){ norm.study[,i] <- (norm.study[,i] - conf.Int[i,3])/conf.Int[i,4] #normalise data using this mean and stdev } #replace transformed data data.z[,c(4:22)] <- norm.study #filter data for graphing plotting <- data.z plotting <- data.z[complete.cases(data.z[,c(4:22)]),] # 2 individuals with missing 1 month and 3 month samples are removed plotting <- plotting[plotting$timepoint %in% c("Pre", "1mo"),] Y <- plotting$Pharyngitis #outcome variable X <- plotting[,c(4:22)] #only numerical data Z <- plotting$timepoint #add in response variable (see line 562 ) plotting$four_responders <- if_else(plotting$id %in% four.responders, "Y", "N") #MDS data.cor <- cor(t(X), method = "spearman") # cluster d <- dist(data.cor) # determine distance matrix xcMDS <- monoMDS(d, k =2) # reduces dimensionality to 2 dimensions. #plot Multidimensional scaled data ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+ theme_classic() + geom_point(show.legend = F, cex = 4, alpha = 0.75, aes(color = Y, fill = Y, shape = Z))+ scale_fill_manual(values = c("blue", "red"))+ scale_color_manual(values = c("blue", "red"))+ scale_shape_manual(values = c(21, 22, 23, 24, 25))+ guides(fill = FALSE, color = FALSE)+ ylab("MDS dim 2")+ xlab("MDS dim 1") ggsave("MDS.color.by.outcome.seed36.pdf", width = 3, height = 2.5) #plot MDS plot coloured by ID with connecting lines #define colour palette P25 = createPalette(25, c("#ff0000", "#00ff00", "#0000ff")) names(P25) <- NULL ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+ theme_classic() + geom_line(aes(color = id), size =2)+ geom_point(show.legend = F, cex = 4, alpha = 0.75, aes( fill = id, shape = Z)) + scale_fill_manual(values = P25)+ scale_color_manual(values = P25)+ scale_shape_manual(values = c(21, 22, 23, 24, 25))+ guides(fill = FALSE, color = FALSE)+ ylab("MDS dim 2")+ xlab("MDS dim 1") ggsave("MDS.color.by.id.seed36.pdf", width = 3, height = 2.5) #plot MDS data coloured by four_responders ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+ theme_classic() + geom_point(show.legend = F, cex = 4, alpha = 0.75, aes(fill = plotting$four_responders, shape = timepoint)) + scale_fill_manual(values = c("white", "dark green"))+ scale_shape_manual(values = c(21, 22, 23))+ # guides(fill = F)+ ylab("MDS dim 2")+ xlab("MDS dim 1") ggsave("MDS.color.by.count.pdf", width = 3, height = 2.5) #remove datasets not used again rm(PreChallenge.lumi, PreChallenge_trim, OneMonth.lumi, ThreeMonths.lumi, ThreeMonths, OneMonth, PreChallenge.z, One.Month.z, ThreeMonths.z, norm.study, X, Y, Z, data.cor, d, P25) Fig 6 (continued) : compare MDS1 # add new variable to plotting dataframe plotting$mds1 <- xcMDS$points[,1] #subset plotting data into each timepoint mds1.pre <- plotting[plotting$timepoint %in% "Pre",] mds1.post <- plotting[plotting$timepoint %in% "1mo",] #create a post-pre MDS1 variable mds1.post$mds1 <- mds1.post$mds1-mds1.pre[mds1.pre$id %in% mds1.post$id,]$mds1 #plots for 7B ggplot(mds1.pre, aes(Pharyngitis, mds1, fill = Pharyngitis))+ geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+ geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ ylab("NMDS dim 1")+ xlab("")+ stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F, aes(label = paste0("p = ", ..p.format..)))+ coord_cartesian( ylim = c(-1.50, 1.5))+ theme_bw()+ theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14)) ggsave(file = "NMDS_pre.pdf", width = 2.25, height = 3 ) ggplot(mds1.post, aes(Pharyngitis, mds1, fill = Pharyngitis))+ geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+ geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ ylab("dim 1 change (1mo-Pre)")+ xlab("")+ stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F, aes(label = paste0("p = ", ..p.format..)))+ coord_cartesian( ylim = c(-0.50, 1.5))+ theme_bw()+ theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14)) ggsave(file = "NMDS_post.pdf", width = 2.25, height = 3) # Fig 6C: mds.Corr = matrix(data = 0, nrow =19 , ncol = 2) #initialize matrix for(i in 1:19){ mds.Corr[i,1] <- cor.test(plotting[,c(i+3)], plotting$mds1, method = "spearman")$estimate mds.Corr[i,2] <- cor.test(plotting[,c(i+3)], plotting$mds1, method = "spearman")$p.value } rownames(mds.Corr) <- names(plotting[,c(4:22)]) mds.Corr <- as.data.frame(mds.Corr) mds.Corr$adj <- p.adjust(mds.Corr[,2], method = "fdr") mds.Corr <- mds.Corr[mds.Corr$adj <0.05,] mds.Corr <- mds.Corr[order(mds.Corr$V1),] heatpal <- c("seagreen","white", "mediumpurple3") #plot a heatmap or Rho values, but ignore columns 2 and 3. pdf(file = "Correlation.with.mds.pdf") pheatmap(mds.Corr[,], kmeans_k = NA, # breaks = breaksList, color = colorRampPalette(heatpal)(1000), border_color = "black", cellwidth = 11, cellheight =11, # display_numbers = test_labels, number_format = "%", scale = "none", cluster_rows = F, cluster_cols = F, number_color = "grey", show_rownames = T, show_colnames = T, fontsize = 8, display_numbers = F, filename = NA, silent = FALSE) dev.off() #remove variables no longer needed rm(plotting, mds1.pre, mds1.post, mds.Corr) Fig 7: Correlation with outcome variables #automate comparisons between pre-challenge immune variables and outcome vars Outcome.Corr = matrix(data = 0, nrow =37 , ncol = 16) #initialize matrix for(i in 1:37){ Outcome.Corr[i,1] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Time_to_diagnosis, method = "spearman")$estimate Outcome.Corr[i,2] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Peak_Ct, method = "spearman")$estimate Outcome.Corr[i,3] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Tonsil, method = "spearman")$estimate Outcome.Corr[i,4] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Adenopathy, method = "spearman")$estimate Outcome.Corr[i,5] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Temp_max, method = "spearman")$estimate Outcome.Corr[i,6] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$CRP, method = "spearman")$estimate Outcome.Corr[i,7] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Pain, method = "spearman")$estimate Outcome.Corr[i,8] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Headache, method = "spearman")$estimate Outcome.Corr[i,9] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Time_to_diagnosis, method = "spearman")$p.value Outcome.Corr[i,10] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Peak_Ct, method = "spearman")$p.value Outcome.Corr[i,11] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Tonsil, method = "spearman")$p.value Outcome.Corr[i,12] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Adenopathy, method = "spearman")$p.value Outcome.Corr[i,13] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Temp_max, method = "spearman")$p.value Outcome.Corr[i,14] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$CRP, method = "spearman")$p.value Outcome.Corr[i,15] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Pain, method = "spearman")$p.value Outcome.Corr[i,16] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Headache, method = "spearman")$p.value } #make into a dataframe and rename columns and rows Outcome.Corr <- as.data.frame(Outcome.Corr) names(Outcome.Corr) <- c("Rho_time","Rho_CT","Rho_Tonsil","Rho_Adeno","Rho_Temp","Rho_CRP","Rho_Pain","Rho_Head", "p_time","p_CT","p_Tonsil", "p_Adeno", "p_Temp","p_CRP","p_Pain","p_Head") rownames(Outcome.Corr) <- names(PreChallenge[,c(17:ncol(PreChallenge))]) # save a csv of the p-values write.csv(Outcome.Corr, file = "outcomeCorr2.csv") #Keep any variables with a p<0.05 for any correlation plotting <- Outcome.Corr[c("IgG_ADI_Pre.challenge","IgG_TF_Pre.challenge","IgG_J8_Pre.challenge","IgG_K4S2_Pre.challenge","DnaseB","IgG_SPYAD_Pre.challenge","IgA_P145_Pre.challenge","SpyCEP","SpnA","IgG_T25_Pre.challenge"),1:8] #simplify names rownames(plotting) <-c("ADI","TF","J8","K4S2","DnaseB","SpyAD","P145_IgA","SpyCEP","SpnA","T25") #reorder by correlation coefficient for 'Time to diagnosis' names(plotting) <- gsub("Rho_", "",names(plotting) ) # graph it heatpal <- c("seagreen","white", "mediumpurple3") pdf(file = "Correlation.with.outcome.pdf") pheatmap(plotting, kmeans_k = NA, color = colorRampPalette(heatpal)(1000), border_color = "black", cellwidth = 11, cellheight =11, # display_numbers = test_labels, number_format = "%", scale = "none", cluster_rows = F, cluster_cols = F, number_color = "grey", show_rownames = T, show_colnames = T, fontsize = 8, display_numbers = F, filename = NA, silent = FALSE) dev.off() #remove variables not used elsewhere: rm(plotting, Outcome.Corr, heatpal) Figure 7 (continued) : sPLS-DA set.seed(36) Y <- PreChallenge$Pharyngitis #outcome variable X <- PreChallenge[,c(17:ncol(PreChallenge))] #only numerical data #initial model srbct.splsda <- splsda(X, Y, ncomp = 10) # set ncomp to 10 for performance assessment later # plot the samples projected onto the first two components of the PLS-DA subspace plotIndiv(srbct.splsda , comp = 1:2, group = PreChallenge$Pharyngitis, ind.names = FALSE, # colour points by class ellipse = TRUE, # include 95% confidence ellipse for each class legend = TRUE, title = 'sPLS-DA untuned') # undergo performance evaluation in order to tune the number of components to use perf.splsda.srbct <- perf(srbct.splsda, validation = "Mfold", folds = 3, progressBar = F, auc = TRUE, nrepeat = 100) perf.splsda.srbct$choice.ncomp # what is the optimal value of components according to perf() # grid of possible keepX values that will be tested for each component list.keepX <- c(1:37) # undergo the tuning process to determine the optimal number of variables tune.splsda.srbct <- tune.splsda(X, Y, ncomp = 2, validation = 'Mfold', folds = 3, progressBar = F, dist = 'mahalanobis.dist', measure = "BER", test.keepX = list.keepX, nrepeat = 200, cpus=3) # allow for parallelisation to decrease runtime #run final model final.splsda <- splsda(X, Y, ncomp = 2, keepX = 5) pdf(file = "sPLSDA.2dim.pdf", width = 3, height = 3) plotIndiv(final.splsda , comp = 1:2, group = PreChallenge$Pharyngitis, ind.names = FALSE, # colour points by class ellipse = TRUE, ellipse.level = 0.95, # include 95% confidence ellipse for each class legend = F, title = 'sPLS-DA') dev.off() pdf(file = "sPLSDA.loadings.pdf", width = 4.5, height = 3) ContribComp1 <- plotLoadings(final.splsda, comp = 1, contrib = 'max', method = 'mean', xlim = c(-0.4, 0.8)) dev.off() selectVar(final.splsda, comp=1)$name[which(ContribComp1$GroupContrib == "no pharyngitis")] selectVar(final.splsda, comp=1)$name[which(ContribComp1$GroupContrib == "pharyngitis")] # Graph up raw data of results Select.var <- "IgG_TF_Pre.challenge" # Select.var <- "IgG_ADI_Pre.challenge" # Select.var <- "IgA_P145_Pre.challenge" # Select.var <- "SpnA" # Select.var <- "IgG_T25_Pre.challenge" ggplot(PreChallenge, aes(Pharyngitis, log(PreChallenge[,Select.var],10), fill = Pharyngitis))+ geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+ geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+ scale_fill_manual(values = c("blue", "red"))+ ylab("")+ xlab("")+ stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F, aes(label = paste0("p = ", ..p.format..)))+ coord_cartesian( ylim = c(0, 3))+ theme_bw()+ theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14)) ggsave(file = paste(Select.var, "Pre.challenge.pdf", sep = ""), width = 2, height = 4) 系统信息 R version 4.3.3 (2024-02-29) Platform: aarch64-apple-darwin20 (64-bit) Running under: macOS Sonoma 14.2 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 time zone: Asia/Shanghai tzcode source: internal attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] cowplot_1.1.3 mixOmics_6.26.0 MASS_7.3-60.0.1 uwot_0.1.16 Matrix_1.6-5 Polychrome_1.5.1 gghalves_0.1.4 [8] ggpubr_0.6.0 vegan_2.6-4 lattice_0.22-6 permute_0.9-7 viridis_0.6.5 viridisLite_0.4.2 RColorBrewer_1.1-3 [15] pheatmap_1.0.12 ggbeeswarm_0.7.2 reshape2_1.4.4 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 [22] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 tidyverse_2.0.0 ggplot2_3.5.1 loaded via a namespace (and not attached): [1] tidyselect_1.2.1 vipor_0.4.7 fastmap_1.1.1 digest_0.6.35 timechange_0.3.0 lifecycle_1.0.4 [7] cluster_2.1.6 magrittr_2.0.3 compiler_4.3.3 rlang_1.1.4 tools_4.3.3 igraph_2.0.3 [13] utf8_1.2.4 yaml_2.3.8 knitr_1.49 ggsignif_0.6.4 rARPACK_0.11-0 scatterplot3d_0.3-44 [19] plyr_1.8.9 abind_1.4-5 BiocParallel_1.36.0 withr_3.0.2 grid_4.3.3 fansi_1.0.6 [25] colorspace_2.1-0 scales_1.3.0 cli_3.6.3 ellipse_0.5.0 rmarkdown_2.26 generics_0.1.3 [31] RSpectra_0.16-1 rstudioapi_0.16.0 tzdb_0.4.0 splines_4.3.3 parallel_4.3.3 matrixStats_1.2.0 [37] vctrs_0.6.5 carData_3.0-5 car_3.1-2 hms_1.1.3 rstatix_0.7.2 ggrepel_0.9.6 [43] beeswarm_0.4.0 glue_1.8.0 codetools_0.2-19 stringi_1.8.3 gtable_0.3.4 munsell_0.5.0 [49] pillar_1.9.0 htmltools_0.5.8 R6_2.5.1 evaluate_0.23 backports_1.4.1 broom_1.0.5 [55] corpcor_1.6.10 Rcpp_1.0.12 gridExtra_2.3 nlme_3.1-164 mgcv_1.9-1 xfun_0.50 [61] pkgconfig_2.0.3【科研绘图系列】R语言绘制SCI论文图合集由讯客互联开源代码栏目发布,感谢您对讯客互联的认可,以及对我们原创作品以及文章的青睐,非常欢迎各位朋友分享到个人网站或者朋友圈,但转载请说明文章出处“【科研绘图系列】R语言绘制SCI论文图合集”
下一篇
Spring接入DeepSeek