Guest User
@guest-user-4897
Dear all,
I am analyzing a set of Affymetrix Human Gene 2.0 ST arrays. This is
my first time working with this type of array, so I have a few general
questions. I would very much appreciate any advice you could give.
(1) I have obtained different lists of differentially expressed genes
(using eBayes() from limma). Some control transcripts are showing up
in those lists (e.g. the normgene -> intron category, among others).
I was not expecting this type of transcript at this point. In theory,
no control transcripts should appear after normalization, am I right?
Have you experienced this?
I have read that one possibility is to use getMainProbes before
selecting genes with topTable, but I wonder whether something could
be wrong with my normalization process from the beginning (I used
rma() from oligo at the transcript level). What is your opinion?
(2) This type of array also includes lincRNA transcripts, and I am
interested in considering them in my analysis. The problem is that I
am using hugene20sttranscriptcluster.db for annotation and these
lincRNAs are not included. Would this package be able to handle them?
(3) I tried to make my own annotation package with makeDBPackage,
based on the .csv annotation file from Affy, but I got an error:
Error in `[.data.frame`(csvFile, , GenBank IDName) : undefined columns selected
I have already read on this mailing list that makeDBPackage may expect
an HGU133plus2 annotation "style". Would the AnnotationForge package
be able to handle this?
Many thanks in advance for any help!
María Maqueda
Biomedical Engineering Research Centre (CREB)
Universitat Politècnica de Catalunya (UPC)
-- output of sessionInfo():
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] human.db0_2.9.0 AnnotationForge_1.2.2
[3] hugene20sttranscriptcluster.db_2.12.1 org.Hs.eg.db_2.9.0
[5] AnnotationDbi_1.22.6 BiocInstaller_1.12.0
[7] limma_3.16.8 pd.hugene.2.0.st_3.8.0
[9] oligo_1.24.2 Biobase_2.20.1
[11] oligoClasses_1.22.0 BiocGenerics_0.6.0
[13] RSQLite_0.11.4 DBI_0.2-7
loaded via a namespace (and not attached):
[1] affxparser_1.32.3 affyio_1.28.0 annotate_1.38.0
[4] Biostrings_2.28.0 bit_1.1-10 codetools_0.2-8
[7] ff_2.2-12 foreach_1.4.1 genefilter_1.42.0
[10] GenomicRanges_1.12.5 IRanges_1.18.4 iterators_1.0.6
[13] preprocessCore_1.22.0 splines_3.0.1 stats4_3.0.1
[16] survival_2.37-4 tools_3.0.1 XML_3.98-1.1
[19] xtable_1.7-1 zlibbioc_1.6.0
Hi Jim, I've just seen that there are many non-coding genes, such as miRNAs, in the latest CSV annotation files on the Affy website that are not present (or that I'm not able to find) when using the annotation package from BioC.
As an example, the transcript ID TC01002905.hg.1 corresponds to microRNA 137 according to the HTA-2_0.na36.hg19.transcript.csv annotation file, but I am not able to find it with the lookUp function if I use the hta20sttranscriptcluster.db package from BioC. Could you tell me any way to circumvent this issue?
Thanks a lot