The raw data files were analyzed using MaxQuant to obtain protein identifications and their respective label-free quantification (LFQ) values using in-house standard parameters. Of note, the data were normalized based on the assumption that the majority of proteins do not change between the different conditions. Statistical analysis was performed using an in-house generated R script based on the proteinGroups.txt file. First, contaminant proteins, reverse sequences and proteins identified “only by site” were filtered out. In addition, proteins identified by only a single peptide and proteins not identified/quantified consistently within the same condition were removed. The LFQ data were converted to log2 scale, samples were grouped by condition and missing values were imputed using the ‘Missing Not At Random’ (MNAR) method, which uses random draws from a Gaussian distribution shifted left by 1.8 standard deviations (StDev) with a width of 0.3. Protein-wise linear models combined with empirical Bayes statistics were used for the differential expression analyses. The limma package from R Bioconductor was used to generate a list of differentially expressed proteins for each pairwise comparison. A cutoff of 0.05 for the adjusted p-value (Benjamini-Hochberg method) along with a |log2 fold change| of 1 was applied to determine significantly regulated proteins in each pairwise comparison.
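As an illustration of the imputation step, the following minimal Python sketch draws replacements for missing values from a down-shifted Gaussian. It is not the in-house R script: the function name `impute_mnar`, the example intensities, the per-sample estimation of mean and standard deviation, and the interpretation of the width as a fraction of the observed standard deviation are all assumptions made for this example.

```python
import math
import random
import statistics


def impute_mnar(log2_values, shift=1.8, scale=0.3, seed=0):
    """Replace None entries with draws from a down-shifted, narrowed Gaussian.

    Assumption: mean and SD are estimated from the observed values of one
    sample, and missing values are drawn from
    N(mean - shift * sd, (scale * sd)^2).
    """
    observed = [v for v in log2_values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        v if v is not None else rng.gauss(mean - shift * sd, scale * sd)
        for v in log2_values
    ]


# Hypothetical LFQ intensities for one sample; 0 is treated here as "missing".
lfq = [2.1e7, 0, 8.4e6, 3.3e7, 0, 1.2e7, 5.6e6]

# Convert to log2 scale, keeping missing values as None.
log2_lfq = [math.log2(x) if x > 0 else None for x in lfq]

imputed = impute_mnar(log2_lfq)
```

Observed values pass through unchanged, while imputed values fall in the low-intensity tail, which reflects the MNAR assumption that values are missing because the protein abundance was near or below the detection limit.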
A principal component analysis (PCA) plot emphasizing the variation among samples.
The correlation plot is similar to a heatmap and visualises the relationship among the different samples. The darker the colour, the stronger the correlation between a pair of samples.
The coefficient of variation (CV), also called relative standard deviation (RSD), is presented as a histogram and illustrates the degree of variation relative to the overall mean.
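For reference, the relative standard deviation is the standard deviation divided by the mean, usually expressed as a percentage. A minimal Python sketch follows; the replicate intensities are hypothetical, and computing the value per protein on linear-scale intensities is an assumption of this example.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation relative to the mean, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical linear-scale intensities of one protein across three replicates.
replicates = [90.0, 100.0, 110.0]
cv = coefficient_of_variation(replicates)  # SD = 10, mean = 100, so 10 percent
```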
Number of proteins quantified per sample (after pre-processing).
Protein overlap in all samples.
A heatmap of proteins with missing values in each dataset. Each row represents a protein with a missing value in one or more replicates. Replicates are clustered based on the presence of missing values in the sample.
Protein expression distribution before and after imputation. The plot shows the effect of imputation on the protein expression distribution.
A plot giving an overview of the expression of all significant (differentially expressed) proteins (rows) in all samples (columns).
Plots illustrating statistical significance (P value) versus magnitude of change (fold change) for the different contrasts.
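The significance call behind these plots combines a Benjamini-Hochberg adjusted p-value with a log2 fold change threshold. The Python sketch below illustrates that logic only and is not the limma implementation used in the analysis. The function names and example values are hypothetical, and treating both cutoffs as inclusive (adjusted p ≤ 0.05, |log2 fold change| ≥ 1) is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_significant(adj_p, log2_fc, alpha=0.05, lfc=1.0):
    """Apply both cutoffs; inclusive comparisons are an assumption here."""
    return adj_p <= alpha and abs(log2_fc) >= lfc


# Hypothetical raw p-values and log2 fold changes for four proteins.
p_values = [0.001, 0.02, 0.03, 0.5]
log2_fold_changes = [2.3, -1.4, 0.4, 1.8]

adjusted = benjamini_hochberg(p_values)
calls = [is_significant(p, fc) for p, fc in zip(adjusted, log2_fold_changes)]
```

In this example the third protein passes the adjusted p-value cutoff but not the fold change cutoff, and the fourth shows the opposite pattern, so neither is called significant.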