Search
Question: Really weird results when I filter this new data that I have, script has worked on two previous data sets.
0
2.5 years ago by
hakimelakhrass0 wrote:

So this is my script and I have used it in the past. The filtering and then getting my list of DEG. The list contains only pos_controls from affymetrix but no main probes. Which I do not understand because I thought I removed them with the filter step which works on my other 2 datasets. Thanks!

mydir <- "C:\\Users\\hakim\\Documents\\hanane_data"

#listing the files from directory using special CEL file read function
celList <- list.celfiles(mydir, full.names=TRUE)
#reading data from cellist and setting annotation package to approiate one for this microarray
#normalizing the data using RMA algorithm
normData <- rma(rawData, target="core")
#retreaving feature data
featureData(normData) <- getNetAffx(normData, "transcript")

#the respective experimental groups of your data
group <- factor(c((rep.int(0,4)),rep.int(1,4),rep.int(2,4)
,rep.int(3,4),rep.int(4,4),(rep.int(5,4)),rep.int(6,4),rep.int(7,4)))
#design and contrast matrix of the data
design <- model.matrix(~ 0 + group)
colnames(design) <- c("ID01","I01","ID005","I005", "ID3","I3","CtlID","CtlI")
contrast <- makeContrasts( "ID01-ID005","ID01-ID3","ID005-ID3","ID01-CtlID","ID005-CtlID","ID3-CtlID",
"I01-I005","I01-I3","I005-I3","I01-CtlI","I005-CtlI","I3-CtlI",
"ID01-I01","ID005-I005","ID3-I3","CtlI-CtlID",
levels= design )
eset <- getMainProbes(normData)
normData.filtered <- nsFilter(eset, require.entrez = FALSE,
remove.dupEntrez = FALSE)

normfit <-eBayes( contrasts.fit( lmFit(normData.filtered$eset, design), contrast) ) #getting the list of probes probeset.list <-topTable(normfit,coef="ID01-ID3",number=100000, adjust="BH", lfc=1) R version 3.2.3 (2015-12-10) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] hugenedeg_1.0 affycoretools_1.42.0 hugene20sttranscriptcluster.db_8.4.0 [4] org.Hs.eg.db_3.2.3 annotate_1.48.0 XML_3.98-1.3 [7] AnnotationDbi_1.32.3 genefilter_1.52.1 pd.hugene.2.0.st_3.14.1 [10] oligo_1.34.2 Biostrings_2.38.4 XVector_0.10.0 [13] IRanges_2.4.7 S4Vectors_0.8.11 oligoClasses_1.32.0 [16] limma_3.26.8 Biobase_2.30.0 BiocGenerics_0.16.1 [19] RSQLite_1.0.0 DBI_0.3.1 loaded via a namespace (and not attached): [1] Category_2.36.0 bitops_1.0-6 RColorBrewer_1.1-2 GenomeInfoDb_1.6.3 [5] gcrma_2.42.0 tools_3.2.3 affyio_1.40.0 KernSmooth_2.23-15 [9] rpart_4.1-10 Hmisc_3.17-2 colorspace_1.2-6 nnet_7.3-12 [13] gridExtra_2.0.0 GGally_1.0.1 DESeq2_1.10.1 bit_1.1-12 [17] preprocessCore_1.32.0 graph_1.48.0 rtracklayer_1.30.2 ggbio_1.18.5 [21] caTools_1.17.1 scales_0.3.0 affy_1.48.0 RBGL_1.46.0 [25] stringr_1.0.0 Rsamtools_1.22.0 foreign_0.8-66 R.utils_2.2.0 [29] AnnotationForge_1.12.2 dichromat_2.0-0 BSgenome_1.38.0 PFAM.db_3.2.2 [33] BiocInstaller_1.20.1 GOstats_2.36.0 hwriter_1.3.2 gtools_3.5.0 [37] BiocParallel_1.4.3 R.oo_1.20.0 acepack_1.3-3.3 VariantAnnotation_1.16.4 [41] RCurl_1.95-4.7 magrittr_1.5 GO.db_3.2.2 Formula_1.2-1 [45] futile.logger_1.4.1 Matrix_1.2-3 Rcpp_0.12.3 munsell_0.4.3 [49] R.methodsS3_1.7.1 stringi_1.0-1 edgeR_3.12.0 SummarizedExperiment_1.0.2 [53] zlibbioc_1.16.0 gplots_2.17.0 plyr_1.8.3 grid_3.2.3 [57] affxparser_1.42.0 gdata_2.17.0 ReportingTools_2.10.0 lattice_0.20-33 [61] splines_3.2.3 GenomicFeatures_1.22.13 locfit_1.5-9.1 knitr_1.12.3 [65] GenomicRanges_1.22.4 geneplotter_1.48.0 reshape2_1.4.1 codetools_0.2-14 [69] biomaRt_2.26.1 futile.options_1.0.0 RcppArmadillo_0.6.500.4.0 biovizBase_1.18.0 [73] latticeExtra_0.6-28 lambda.r_1.1.7 foreach_1.4.3 gtable_0.1.2 [77] reshape_0.8.5 ggplot2_2.0.0 xtable_1.8-2 ff_2.2-13 [81] survival_2.38-3 OrganismDbi_1.12.1 iterators_1.0.8 GenomicAlignments_1.6.3 [85] cluster_2.0.3 GSEABase_1.32.0  ADD COMMENTlink modified 2.5 years ago by James W. MacDonald47k • written 2.5 years ago by hakimelakhrass0 0 2.5 years ago by United States James W. MacDonald47k wrote: Debugging your own scripts is really up to you as an analyst. Things look OK to me however: > eset2 <- rma(read.celfiles(list.celfiles()) > annotation(eset2) [1] "pd.mogene.2.0.st" > featureData(eset2) <- getNetAffx(eset2, "transcript") > library(affycoretools) > eset2 <- getMainProbes(eset2) > table(pData(featureData(eset2))$category)
main normgene->exon
33793           1352             

And you should expect some normgene->exon probesets in this table, as they do double-duty as normgene->exon and main probesets on this array, and the Affy transcript csv labels them as normgene rather than main probesets.