Really weird results when I filter this new data that I have, script has worked on two previous data sets.
1
0
Entering edit mode
@hakimelakhrass-9661
Last seen 7.1 years ago

So this is my script and I have used it in the past. The filtering and then getting my list of DEG. The list contains only pos_controls from affymetrix but no main probes. Which I do not understand because I thought I removed them with the filter step which works on my other 2 datasets. Thanks!

mydir <- "C:\\Users\\hakim\\Documents\\hanane_data"

#listing the files from directory using special CEL file read function
celList <- list.celfiles(mydir, full.names=TRUE)
#reading data from cellist and setting annotation package to approiate one for this microarray
rawData <- read.celfiles(celList, pkgname='pd.hugene.2.0.st')
#normalizing the data using RMA algorithm
normData <- rma(rawData, target="core")
#retreaving feature data
featureData(normData) <- getNetAffx(normData, "transcript")

#the respective experimental groups of your data
group <- factor(c((rep.int(0,4)),rep.int(1,4),rep.int(2,4)
                  ,rep.int(3,4),rep.int(4,4),(rep.int(5,4)),rep.int(6,4),rep.int(7,4)))
#design and contrast matrix of the data
design <- model.matrix(~ 0 + group)
colnames(design) <- c("ID01","I01","ID005","I005", "ID3","I3","CtlID","CtlI")
contrast <- makeContrasts( "ID01-ID005","ID01-ID3","ID005-ID3","ID01-CtlID","ID005-CtlID","ID3-CtlID",
                           "I01-I005","I01-I3","I005-I3","I01-CtlI","I005-CtlI","I3-CtlI",
                           "ID01-I01","ID005-I005","ID3-I3","CtlI-CtlID",
                           levels= design )
eset <- getMainProbes(normData)
normData.filtered <- nsFilter(eset, require.entrez = FALSE,
                              remove.dupEntrez = FALSE)

normfit <-eBayes( contrasts.fit( lmFit(normData.filtered$eset, design), contrast) )

#getting the list of probes
probeset.list <-topTable(normfit,coef="ID01-ID3",number=100000, adjust="BH", lfc=1)

R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] hugenedeg_1.0                        affycoretools_1.42.0                 hugene20sttranscriptcluster.db_8.4.0
 [4] org.Hs.eg.db_3.2.3                   annotate_1.48.0                      XML_3.98-1.3                        
 [7] AnnotationDbi_1.32.3                 genefilter_1.52.1                    pd.hugene.2.0.st_3.14.1             
[10] oligo_1.34.2                         Biostrings_2.38.4                    XVector_0.10.0                      
[13] IRanges_2.4.7                        S4Vectors_0.8.11                     oligoClasses_1.32.0                 
[16] limma_3.26.8                         Biobase_2.30.0                       BiocGenerics_0.16.1                 
[19] RSQLite_1.0.0                        DBI_0.3.1                           

loaded via a namespace (and not attached):
 [1] Category_2.36.0            bitops_1.0-6               RColorBrewer_1.1-2         GenomeInfoDb_1.6.3        
 [5] gcrma_2.42.0               tools_3.2.3                affyio_1.40.0              KernSmooth_2.23-15        
 [9] rpart_4.1-10               Hmisc_3.17-2               colorspace_1.2-6           nnet_7.3-12               
[13] gridExtra_2.0.0            GGally_1.0.1               DESeq2_1.10.1              bit_1.1-12                
[17] preprocessCore_1.32.0      graph_1.48.0               rtracklayer_1.30.2         ggbio_1.18.5              
[21] caTools_1.17.1             scales_0.3.0               affy_1.48.0                RBGL_1.46.0               
[25] stringr_1.0.0              Rsamtools_1.22.0           foreign_0.8-66             R.utils_2.2.0             
[29] AnnotationForge_1.12.2     dichromat_2.0-0            BSgenome_1.38.0            PFAM.db_3.2.2             
[33] BiocInstaller_1.20.1       GOstats_2.36.0             hwriter_1.3.2              gtools_3.5.0              
[37] BiocParallel_1.4.3         R.oo_1.20.0                acepack_1.3-3.3            VariantAnnotation_1.16.4  
[41] RCurl_1.95-4.7             magrittr_1.5               GO.db_3.2.2                Formula_1.2-1             
[45] futile.logger_1.4.1        Matrix_1.2-3               Rcpp_0.12.3                munsell_0.4.3             
[49] R.methodsS3_1.7.1          stringi_1.0-1              edgeR_3.12.0               SummarizedExperiment_1.0.2
[53] zlibbioc_1.16.0            gplots_2.17.0              plyr_1.8.3                 grid_3.2.3                
[57] affxparser_1.42.0          gdata_2.17.0               ReportingTools_2.10.0      lattice_0.20-33           
[61] splines_3.2.3              GenomicFeatures_1.22.13    locfit_1.5-9.1             knitr_1.12.3              
[65] GenomicRanges_1.22.4       geneplotter_1.48.0         reshape2_1.4.1             codetools_0.2-14          
[69] biomaRt_2.26.1             futile.options_1.0.0       RcppArmadillo_0.6.500.4.0  biovizBase_1.18.0         
[73] latticeExtra_0.6-28        lambda.r_1.1.7             foreach_1.4.3              gtable_0.1.2              
[77] reshape_0.8.5              ggplot2_2.0.0              xtable_1.8-2               ff_2.2-13                 
[81] survival_2.38-3            OrganismDbi_1.12.1         iterators_1.0.8            GenomicAlignments_1.6.3   
[85] cluster_2.0.3              GSEABase_1.32.0 

 

microarray limma genefilter • 953 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 4 hours ago
United States

Debugging your own scripts is really up to you as an analyst. Things look OK to me however:

> eset2 <- rma(read.celfiles(list.celfiles())
> annotation(eset2)
 [1] "pd.mogene.2.0.st" 
> featureData(eset2) <- getNetAffx(eset2, "transcript")
> library(affycoretools)    
> eset2 <- getMainProbes(eset2)
> table(pData(featureData(eset2))$category)
           main normgene->exon  
          33793           1352             

And you should expect some normgene->exon probesets in this table, as they do double-duty as normgene->exon and main probesets on this array, and the Affy transcript csv labels them as normgene rather than main probesets.

ADD COMMENT

Login before adding your answer.

Traffic: 792 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6