Rohmatul Fajriyah
Dear All,
I am using the Illumina spike-in BeadArray data set, and
I have read the re-annotation for the non-spike, non-control probes from here:
http://compbio.sysbiol.cam.ac.uk/Resources/spike/data/NewAnnotationM1.txt
It contains 46005 probes.
On the other hand, the BeadStudio output from the same website,
SampleProbeProfile.txt, contains 47456 probes.
Currently I use the SampleProbeProfile.txt file, and this causes a
problem when I want to use the probe sequences for the data set:
for some probes I cannot find the sequence.
(To get the probe sequences, I mapped each probeID to its nuID and then
retrieved the probe sequence from the nuID. If this is not the right
thing to do, please let me know.)
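For illustration, the probes that appear in the BeadStudio output but not in the re-annotation (the two counts differ by 1451) could be listed along these lines. This is only a minimal base R sketch on toy data: the data frames profile and annot and the column name ProbeID are assumptions, not the actual layout of SampleProbeProfile.txt or NewAnnotationM1.txt.

```r
# Toy stand-ins for the two tables. The column name ProbeID and the
# values are hypothetical; they are not read from the real files.
profile <- data.frame(ProbeID = c(2470008L, 107000471L, 5690687L, 770541L))
annot   <- data.frame(ProbeID = c(2470008L, 5690687L))

# Compare as character so integer and string IDs are treated alike.
profile_ids <- as.character(profile$ProbeID)
annot_ids   <- as.character(annot$ProbeID)

# Probes present in the profile but absent from the re-annotation.
missing_ids <- setdiff(profile_ids, annot_ids)

length(missing_ids)  # how many probes lack an annotation entry
missing_ids          # which ones
```

With the real files, the same setdiff() call on the relevant ID columns would show whether the unmapped probes are exactly those missing from the re-annotation.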
My questions:
a. How can I obtain the sequences of those probes in the Illumina
spike-in data set, please?
Maybe there is another source, besides the link above, that I could
visit.
b. If there is no way to obtain their probe sequences, is it
permissible to delete those probes from further analysis, please? Or
should I use only the probes in the new annotation?
c. The Affymetrix spike-in experiment is available in the SpikeIn
package. Is there a similar package for the Illumina spike-in data, please?
(I have not checked yet.)
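Regarding question b, if the probes without a sequence were to be dropped, one possible way is sketched below in base R, keeping a record of what was removed. The data frame probes and its column names are hypothetical; the three IDs and nuIDs are copied from the output further down, with NA standing for a failed lookup. Whether dropping them is appropriate is exactly what I am asking.

```r
# Hypothetical lookup result: NA marks a probe for which no nuID
# (and hence no sequence) could be found.
probes <- data.frame(
  ProbeID = c("2470008", "107000471", "770541"),
  nuID    = c("cXghVQQPiHAVLKfdqE", "NqCSCZmBIJiCSnj68M", NA),
  stringsAsFactors = FALSE
)

has_seq <- !is.na(probes$nuID)

# Keep a record of the probes that are removed.
dropped <- probes$ProbeID[!has_seq]

# Retain only the probes with a usable nuID.
kept <- probes[has_seq, , drop = FALSE]

dropped
nrow(kept)
```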
Below, I explain what I have done. It is quite long; I apologise for
that.
Please let me know if I have made any mistakes or missed anything
in the steps.
Thank you very much in advance for any help.
With kind regards,
R Fajriyah
======================================================================
library(lumi)
library(lumiMouseIDMapping)
library(lumiMouseAll.db)
rbs <- read.table("~/.../spike_beadstudio_output/SampleProbeProfile.txt", header=TRUE, sep="\t") #### regular probes from the website above
cns <- read.table("~/.../spike_beadstudio_output/ControlProbeProfile.txt", header=TRUE, sep="\t") #### control probes
narbs <- read.table("~/.../NewAnnotationM1.txt", header=TRUE, sep="\t") #### new annotation of the regular probes
> c(dim(rbs), dim(cns), dim(narbs)) #### dimensions of the regular probes, control probes and new annotation
[1] 47456   194  1721   194 46005    21
> csp <- cns[17:49, 1:3] #### retrieve the spike-in probes only
> rbs1 <- rbs[, 1:3] #### regular probes (non-spikes and non-controls) only
> rbss <- rbind(rbs1, csp)
> pidr <- rbss[, 2] #### retrieve the probeID
> nuidr <- probeID2nuID(pidr, lib='lumiMouseIDMapping') #### I realized that I could not find the nuID for some of the probeIDs
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species, :
  Some input IDs can not be matched!
> nuidr1h <- probeID2nuID(pidr[1:100], lib='lumiMouseIDMapping') #### I already checked that the first 100 probes are okay
> nuidr1h15 <- probeID2nuID(pidr[101:115], lib='lumiMouseIDMapping') #### some probes after the 100th don't have a nuID
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species, :
  Some input IDs can not be matched!
> nuidr1h15 #### I checked which ones ...
          Search_key       Target             ProbeId     Accession Symbol nuID
2470008   NA               NA                 NA          NA        NA     NA
107000471 "9626096_327_rc" "9626096_327_rc-S" "107000471" ""        ""     "NqCSCZmBIJiCSnj68M"
104120113 "9626096_327"    "9626096_327-S"    "104120113" ""        ""     "64PBQ0l592fe9mZ958"
101850133 "9626096_331_rc" "9626096_331_rc-S" "101850133" ""        ""     "E8_jEasVqYdLc2ih6Y"
5690687   NA               NA                 NA          NA        NA     NA
106450575 "9626096_5_rc"   "9626096_5_rc-S"   "106450575" ""        ""     "BI1aGJVQOAhVYhqy0M"
104280750 "9626096_5"      "9626096_5-S"      "104280750" ""        ""     "9j4cVt2qt.T_qnbWo0"
101240091 "9626096_7"      "9626096_7-S"      "101240091" ""        ""     "xjeiFXlSoUWFRRaisI"
100840706 "9626100_15"     "9626100_15-S"     "100840706" ""        ""     "oUNL2fdn3vZmfWd1ic"
101170112 "9626100_20_rc"  "9626100_20_rc-S"  "101170112" ""        ""     "uoEmIUY9o5ASQin8_o"
770541    NA               NA                 NA          NA        NA     NA
103710377 "9626100_224_rc" "9626100_224_rc-S" "103710377" ""        ""     "NmZgSCYgmB4_vL0DCs"
103870242 "9626100_224"    "9626100_224-S"    "103870242" ""        ""     "ZXz_BwUNL2fdn3vZmc"
104540048 "9626100_230_rc" "9626100_230_rc-S" "104540048" ""        ""     "6dUKBJiFGPaOQEkIp8"
2190154   NA               NA                 NA          NA        NA     NA
> nuidr101 <- probeID2nuID(pidr[101], lib='lumiMouseIDMapping') #### I tried a single mapping and it worked
> nuidr101
        Search_Key    ILMN_Gene Accession     Symbol  Probe_Id       Array_Address_Id nuID
2470008 "ILMN_194227" "BMP8B"   "NM_007559.4" "Bmp8b" "ILMN_1229350" "002470008"      "cXghVQQPiHAVLKfdqE"
> nuidr105 <- probeID2nuID(pidr[105], lib='lumiMouseIDMapping') #### it worked for this probeID too
> nuidr105
        Search_Key    ILMN_Gene Accession     Symbol   Probe_Id       Array_Address_Id nuID
5690687 "ILMN_220120" "INPP5A"  "NM_183144.1" "Inpp5a" "ILMN_2717861" "005690687"      "icS9ejqmimYvDtr9XI"
> nuidr1011 <- probeID2nuID(pidr[111], lib='lumiMouseIDMapping') #### it did not work for this probeID
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species, :
  No matches were found!
> nuidr1011
770541
NA
> nuidr1015 <- probeID2nuID(pidr[115], lib='lumiMouseIDMapping') #### it worked for this probeID too
> nuidr1015
        Search_Key    ILMN_Gene Accession     Symbol   Probe_Id       Array_Address_Id nuID
2190154 "ILMN_217818" "DNAJA2"  "NM_019794.1" "Dnaja2" "ILMN_3003324" "002190154"      "x7NMU3eJcMkcixP3Ik"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] lumiMouseAll.db_1.22.0    org.Mm.eg.db_2.10.1       lumiMouseIDMapping_1.10.0
[4] RSQLite_0.11.4            DBI_0.2-7                 AnnotationDbi_1.24.0
[7] lumi_2.14.0               Biobase_2.22.0            BiocGenerics_0.8.0
loaded via a namespace (and not attached):
 [1] affy_1.40.0            affyio_1.30.0          annotate_1.40.0      base64_1.1
 [5] beanplot_1.1           BiocInstaller_1.12.0   biomaRt_2.18.0       Biostrings_2.30.0
 [9] bitops_1.0-6           BSgenome_1.30.0        bumphunter_1.2.0     codetools_0.2-8
[13] colorspace_1.2-4       digest_0.6.3           doRNG_1.5.5          foreach_1.4.1
[17] genefilter_1.44.0      GenomicFeatures_1.14.0 GenomicRanges_1.14.1 grid_3.0.2
[21] illuminaio_0.4.0       IRanges_1.20.0         iterators_1.0.6      itertools_0.1-1
[25] KernSmooth_2.23-10     lattice_0.20-24        limma_3.18.0         locfit_1.5-9.1
[29] MASS_7.3-29            Matrix_1.0-14          matrixStats_0.8.12   mclust_4.2
[33] methylumi_2.8.0        mgcv_1.7-27            minfi_1.8.0          multtest_2.18.0
[37] nleqslv_2.0            nlme_3.1-111           nor1mix_1.1-4        pkgmaker_0.17.4
[41] preprocessCore_1.24.0  R.methodsS3_1.5.2      RColorBrewer_1.0-5   RCurl_1.95-4.1
[45] registry_0.2           reshape_0.8.4          rngtools_1.2.3       Rsamtools_1.14.1
[49] rtracklayer_1.22.0     siggenes_1.36.0        splines_3.0.2        stats4_3.0.2
[53] stringr_0.6.2          survival_2.37-4        tools_3.0.2          XML_3.95-0.2
[57] xtable_1.7-1           XVector_0.2.0          zlibbioc_1.8.0