The aim of my analysis is to run Limma to generate p-values and fold changes for each protein between alive and dead patients. I have created 3 CSV files with information on these proteins:
1) ExpressionMatrix: contains the protein abundance values, with proteins as rows (1552) and samples (alive and dead patients) as columns (81).
2) FeatureData: contains each protein as a row (1552), plus one other column, Description, which describes each protein.
3) PhenotypeData: contains each sample as a row (81), plus two other columns headed Patient ID and State.
I understand that I need to create a class (an ExpressionSet) to hold these different objects in order to make the Limma analysis more straightforward (I know this from the Limma tutorial on DataCamp). I ran the boxplot function, which uses all 3 objects, to confirm that they are in the correct data format, and the boxplot produced the expected result. But when I try to create a class from all 3 objects, I keep getting the error:
Error in validObject(.Object) :
invalid class “ExpressionSet” object: 1: sampleNames differ between assayData and phenoData
invalid class “ExpressionSet” object: 2: sampleNames differ between phenoData and protocolData
I figure the issue is with the way I have structured the objects "e" and "p" (ExpressionMatrix and PhenotypeData), but I am not sure how to fix it. Below is the code I have used from start to finish. Thank you for your help in advance!
> setwd("D:/sa825/Using Limma")
> x <- read.csv("ExpressionMatrix.csv", stringsAsFactors = FALSE)
> f <- read.csv("FeatureData.csv")
> p <- read.csv("PhenotypeData.csv")
> e <- as.matrix(x)
> typeof(e)
[1] "double"
> class(e)
[1] "matrix"
> boxplot(e[1, ] ~ p$State, main = f[1, "Description"])
> source("https://bioconductor.org/biocLite.R")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/User/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/BiocInstaller_1.30.0.zip'
Content type 'application/zip' length 102191 bytes (99 KB)
downloaded 99 KB
package ‘BiocInstaller’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
    C:\Users\User\AppData\Local\Temp\RtmpkLuJuJ\downloaded_packages
Bioconductor version 3.7 (BiocInstaller 1.30.0), ?biocLite for help
A newer version of Bioconductor is available for this version of R, ?BiocUpgrade for help
> biocLite("Biobase")
BioC_mirror: https://bioconductor.org
Using Bioconductor 3.7 (BiocInstaller 1.30.0), R 3.5.3 (2019-03-11).
Installing package(s) ‘Biobase’
also installing the dependency ‘BiocGenerics’
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/BiocGenerics_0.26.0.zip'
Content type 'application/zip' length 745077 bytes (727 KB)
downloaded 727 KB
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/Biobase_2.40.0.zip'
Content type 'application/zip' length 2413751 bytes (2.3 MB)
downloaded 2.3 MB
package ‘BiocGenerics’ successfully unpacked and MD5 sums checked
package ‘Biobase’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
    C:\Users\User\AppData\Local\Temp\RtmpkLuJuJ\downloaded_packages
installation path not writeable, unable to update packages: boot, class, cluster, KernSmooth, lattice, MASS, Matrix, mgcv, nlme, nnet, rpart, spatial, survival
> library(Biobase)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
    IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
    anyDuplicated, append, as.data.frame, basename, cbind, colMeans, colnames, colSums, dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which, which.max, which.min
Welcome to Bioconductor
    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.
> eset <- ExpressionSet(assayData = e,
+                       phenoData = AnnotatedDataFrame(p),
+                       featureData = AnnotatedDataFrame(f))
Error in validObject(.Object) :
  invalid class “ExpressionSet” object: 1: sampleNames differ between assayData and phenoData
invalid class “ExpressionSet” object: 2: sampleNames differ between phenoData and protocolData
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets
[7] methods   base
other attached packages:
[1] Biobase_2.40.0      BiocGenerics_0.26.0
loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3
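
For reference, here is a minimal sketch of what the error message appears to be checking, using made-up toy data rather than my real files. It assumes that "sampleNames" means the column names of the expression matrix on the assayData side and the row names of the phenotype data frame on the phenoData side. The objects e_toy and p_toy and all of their values are hypothetical, and the column is called PatientID here only for illustration.

```r
# Toy stand-ins for ExpressionMatrix.csv and PhenotypeData.csv (values are made up).
e_toy <- matrix(c(1.2, 3.4, 5.6, 7.8, 9.0, 2.1), nrow = 2,
                dimnames = list(c("protA", "protB"), c("S1", "S2", "S3")))
p_toy <- data.frame(PatientID = c("S1", "S2", "S3"),
                    State = c("alive", "dead", "alive"),
                    stringsAsFactors = FALSE)

# A data frame built without explicit row names gets "1", "2", "3",
# so its row names do not match the matrix column names.
names_match_before <- identical(colnames(e_toy), rownames(p_toy))

# Use the ID column as the row names, then compare again.
rownames(p_toy) <- p_toy$PatientID
names_match_after <- identical(colnames(e_toy), rownames(p_toy))

print(names_match_before)
print(names_match_after)
```

If the same comparison on my real "e" and "p" comes back FALSE, that would be consistent with the sampleNames error above, since read.csv() also assigns default row names when no row.names argument is given.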