Good afternoon,
The aim of my analysis is to use limma to generate p-values and fold changes for each protein between alive and dead patients. I have created three CSV files with information on these proteins:
1) ExpressionMatrix: protein abundance values, with proteins as rows (1552) and samples (alive and dead patients) as columns (81).
2) FeatureData: one row per protein (1552) and one other column, Description, which describes each protein.
3) PhenotypeData: one row per sample (81) and two other columns with the headings Patient ID and State.
I understand that I need to create a class to hold these different objects in order to make the limma analysis more straightforward (I know this from the limma tutorial on DataCamp). I ran the boxplot function, which uses all three objects, to confirm that they are in the correct data format, and the boxplot produced the expected result. But when I try to create a class from all three objects, I keep getting the error: Error in validObject(.Object) :
invalid class “ExpressionSet” object: 1: sampleNames differ between assayData and phenoData
invalid class “ExpressionSet” object: 2: sampleNames differ between phenoData and protocolData
I think the issue is with the way I have structured the objects "e" and "p" (ExpressionMatrix and PhenotypeData), but I am not sure how to fix it. Below is the code I have used from start to finish. Thank you for your help in advance!
Shimon
> setwd("D:/sa825/Using Limma")
> x<-read.csv("ExpressionMatrix.csv", stringsAsFactors = FALSE)
> f<-read.csv("FeatureData.csv")
> p<-read.csv("PhenotypeData.csv")
> e<-as.matrix(x)
> typeof(e)
[1] "double"
> class(e)
[1] "matrix"
> boxplot(e[1 , ] ~ p$State, main = f[1 , "Description"])
> source("https://bioconductor.org/biocLite.R")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of
Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/User/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/BiocInstaller_1.30.0.zip'
Content type 'application/zip' length 102191 bytes (99 KB)
downloaded 99 KB
package ‘BiocInstaller’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\User\AppData\Local\Temp\RtmpkLuJuJ\downloaded_packages
Bioconductor version 3.7 (BiocInstaller 1.30.0),
?biocLite for help
A newer version of Bioconductor is available for this
version of R, ?BiocUpgrade for help
> biocLite("Biobase")
BioC_mirror: https://bioconductor.org
Using Bioconductor 3.7 (BiocInstaller 1.30.0), R 3.5.3
(2019-03-11).
Installing package(s) ‘Biobase’
also installing the dependency ‘BiocGenerics’
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/BiocGenerics_0.26.0.zip'
Content type 'application/zip' length 745077 bytes (727 KB)
downloaded 727 KB
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/Biobase_2.40.0.zip'
Content type 'application/zip' length 2413751 bytes (2.3 MB)
downloaded 2.3 MB
package ‘BiocGenerics’ successfully unpacked and MD5 sums checked
package ‘Biobase’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\User\AppData\Local\Temp\RtmpkLuJuJ\downloaded_packages
installation path not writeable, unable to update
packages: boot, class, cluster, KernSmooth, lattice,
MASS, Matrix, mgcv, nlme, nnet, rpart, spatial,
survival
> library(Biobase)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind,
colMeans, colnames, colSums, dirname, do.call, duplicated,
eval, evalq, Filter, Find, get, grep, grepl, intersect,
is.unsorted, lapply, lengths, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position,
rank, rbind, Reduce, rowMeans, rownames, rowSums, sapply,
setdiff, sort, table, tapply, union, unique, unsplit,
which, which.max, which.min
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages
'citation("pkgname")'.
> eset <- ExpressionSet(assayData = e,
+ phenoData = AnnotatedDataFrame(p),
+ featureData = AnnotatedDataFrame(f))
Error in validObject(.Object) :
invalid class “ExpressionSet” object: 1: sampleNames differ between assayData and phenoData
invalid class “ExpressionSet” object: 2: sampleNames differ between phenoData and protocolData
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets
[7] methods base
other attached packages:
[1] Biobase_2.40.0 BiocGenerics_0.26.0
loaded via a namespace (and not attached):
[1] compiler_3.5.3 tools_3.5.3
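
The validity error above says that the column names of the expression matrix differ from the row names of the phenotype table. A minimal sketch of how that mismatch can be seen, using made-up toy data in place of the real files; the column name Patient.ID is an assumption (it is what read.csv would produce from a "Patient ID" heading), and the sample IDs S1 to S3 are hypothetical:

```r
# Toy stand-ins for the real files (all names and values are made up).
e <- matrix(seq(0.1, 1.2, by = 0.1), nrow = 4,
            dimnames = list(NULL, c("S1", "S2", "S3")))
p <- data.frame(Patient.ID = c("S1", "S2", "S3"),
                State = c("alive", "dead", "alive"),
                stringsAsFactors = FALSE)

# A data frame read without row.names gets default row names "1", "2", ...,
# so they cannot equal the sample names stored in the matrix columns.
print(colnames(e))
print(rownames(p))
names_match <- identical(colnames(e), rownames(p))
print(names_match)

# Sample IDs that appear in the matrix but not in the ID column.
# An empty result suggests the IDs only need to be moved into the row names.
missing_ids <- setdiff(colnames(e), p$Patient.ID)
print(missing_ids)
```

If the real sample names start with a digit or contain spaces, read.csv will also have altered them in the matrix header (check.names = TRUE), which is worth ruling out at this step.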

Firstly:
Since you are using R > 3.5.0, could you try using BiocManager for updating packages instead of biocLite()? biocLite() was deprecated in favor of BiocManager::install(). Once this is installed, please check that all your packages are up to date and belong to a valid version of Bioconductor.
Since we don't have access to your files, you could check the rownames and colnames of the objects e, AnnotatedDataFrame(p) and AnnotatedDataFrame(f). As the error indicates, the sampleNames need to be consistent, so you can make sure none are missing or excluded, or check whether a transpose might be necessary.

Hi Shepherl,
Thank you so much for your response. I have used BiocManager as you instructed:
Then I re-tried the ExpressionSet function but it didn't work:
I used colnames and rownames as you suggested and I can see that the colnames and rownames are different. So I transposed the CSV files behind AnnotatedDataFrame(p) and AnnotatedDataFrame(f) to match "e". But with these new transposed files (p3 and f3), the boxplot function does not work and neither does the ExpressionSet function:
I'm not really sure what to do. Can I send you access to the files I am working on, or screenshots of how the data is structured?
Thanks, Shimon
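
On the transposing step: ExpressionSet expects samples as rows in the phenotype table and features as rows in the feature table, which is how p and f were originally described, so transposing them is not needed. What has to agree is that the column names of the matrix equal the row names of p, and the row names of the matrix equal the row names of f. A minimal sketch with made-up toy data, assuming the sample IDs sit in a column called Patient.ID (a guess at what read.csv makes of the "Patient ID" heading) and that every matrix column has exactly one matching ID:

```r
# Toy data (hypothetical values): 4 proteins x 3 samples.
e <- matrix(seq(0.1, 1.2, by = 0.1), nrow = 4,
            dimnames = list(paste0("prot", 1:4), c("S1", "S2", "S3")))
# Phenotype rows deliberately in a different order from the matrix columns.
p <- data.frame(Patient.ID = c("S3", "S1", "S2"),
                State = c("alive", "alive", "dead"),
                stringsAsFactors = FALSE)
f <- data.frame(Description = paste("protein", 1:4),
                stringsAsFactors = FALSE)

# Use the sample IDs as row names, then reorder rows to follow the matrix columns.
rownames(p) <- p$Patient.ID
p <- p[colnames(e), , drop = FALSE]
# Feature rows are assumed to be in the same order as the matrix rows.
rownames(f) <- rownames(e)

stopifnot(identical(colnames(e), rownames(p)),
          identical(rownames(e), rownames(f)))

# Build the ExpressionSet only if Biobase is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  eset <- Biobase::ExpressionSet(
    assayData   = e,
    phenoData   = Biobase::AnnotatedDataFrame(p),
    featureData = Biobase::AnnotatedDataFrame(f)
  )
  print(dim(eset))
}
```

With the real files, the same effect can usually be had at import time by passing row.names = 1 to read.csv, provided the first column of each file holds the IDs; that layout is an assumption about files not shown here.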