                    Rohmatul Fajriyah
        
    
Dear All,
I am using the Illumina spike-in BeadArray data set, and
I have read the re-annotation for the non-spike, non-control probes from
here: http://compbio.sysbiol.cam.ac.uk/Resources/spike/data/NewAnnotationM1.txt
It contains 46005 probes.
On the other hand, the BeadStudio output (from the website
above), in the SampleProbeProfile.txt file, contains 47456 probes.
Currently I use the SampleProbeProfile.txt file, and it gives me a
problem when I want to use the probe sequences for the data set:
for some probes I do not know how to find their sequences.
(To get the probe sequences, I mapped each probeID to its nuID and then
retrieved the probe sequence from the nuID. If this is not the right
thing to do, please let me know.)
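The nuID-to-sequence step described above can be sketched as follows. This is only an illustration: `nuid_to_sequence` is a hypothetical helper name, the example nuID is copied from the session output further down, and the sketch assumes that `id2seq()` from the lumi package is what decodes a single nuID. If lumi is not installed, or a nuID cannot be decoded, the helper simply returns NA for that entry.

```r
# Sketch only: turn a vector of nuIDs into probe sequences.
# `nuid_to_sequence` is a hypothetical helper, not part of lumi.
nuid_to_sequence <- function(nuids) {
  nuids <- as.character(nuids)
  out <- rep(NA_character_, length(nuids))
  # Without lumi nothing can be decoded, so every entry stays NA.
  if (!requireNamespace("lumi", quietly = TRUE)) return(out)
  ok <- !is.na(nuids) & nzchar(nuids)
  # id2seq() is called once per nuID; a decoding error yields NA.
  out[ok] <- vapply(nuids[ok], function(id) {
    tryCatch(as.character(lumi::id2seq(id))[1],
             error = function(e) NA_character_)
  }, character(1), USE.NAMES = FALSE)
  out
}

# One nuID from the session below plus a probe whose nuID is missing.
seqs <- nuid_to_sequence(c("cXghVQQPiHAVLKfdqE", NA))
print(seqs)
```

Probes whose nuID is NA stay NA in the result, so they are easy to count or filter afterwards.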
My questions:
a. What should I do to find the sequences of those probes in the Illumina
spike-in data set, please?
    Maybe there is another source, besides the link above, that I could
visit.
b. If there is no way to find their probe sequences, is it
permissible to delete those probes from further analysis, please? Or
should I use only the probes in the new annotation?
c. The Affymetrix spike-in experiment is available in the SpikeIn
package. Is there a similar package for the Illumina spike-in data, please?
(I have not checked yet.)
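For question b, a minimal sketch of how the probes in the profile could be split into those covered by the new annotation and those that are not. The two toy data frames and the column name `ProbeID` are assumptions for illustration only; the real files may use different column names. The last line just works out the difference between the two probe counts quoted above, which equals the number of unannotated probes only if every annotated probe also occurs in the profile.

```r
# Toy stand-ins for SampleProbeProfile.txt and NewAnnotationM1.txt.
# The column name `ProbeID` is an assumption, not taken from the files.
profile    <- data.frame(ProbeID = c(2470008, 107000471, 770541, 5690687))
annotation <- data.frame(ProbeID = c(2470008, 5690687))

# Hypothetical helper: split profile IDs by presence in the annotation.
split_by_annotation <- function(profile_ids, annotated_ids) {
  in_annotation <- profile_ids %in% annotated_ids
  list(annotated   = profile_ids[in_annotation],
       unannotated = profile_ids[!in_annotation])
}

parts <- split_by_annotation(profile$ProbeID, annotation$ProbeID)
print(parts$unannotated)

# Difference between the two probe counts reported in the post.
n_difference <- 47456 - 46005
print(n_difference)
```

Keeping the unannotated IDs in their own vector makes it easy either to drop them or to look them up elsewhere later.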
Below, I explain what I have done. It is quite long; I apologise for
that.
Please let me know if I have made any mistakes or missed anything
in the steps.
Thank you very much in advance for any help.
With kind regards,
R Fajriyah
======================================================================
library(lumi)
library(lumiMouseIDMapping)
library(lumiMouseAll.db)
#### regular probes from the website above
rbs <- read.table("~/.../spike_beadstudio_output/SampleProbeProfile.txt",
                  header=TRUE, sep="\t")
#### control probes
cns <- read.table("~/.../spike_beadstudio_output/ControlProbeProfile.txt",
                  header=TRUE, sep="\t")
#### new annotation of the regular probes
narbs <- read.table("~/.../NewAnnotationM1.txt", header=TRUE, sep="\t")
> c(dim(rbs), dim(cns), dim(narbs)) #### dim of regular probes, control probes and new annotation
[1] 47456   194  1721   194 46005    21
> csp <- cns[17:49, 1:3] #### retrieve the spike-in probes only
> rbs1 <- rbs[, 1:3] #### regular probes (non-spikes and non-controls) only
> rbss <- rbind(rbs1, csp)
> pidr <- rbss[, 2] #### retrieve the probeID
> nuidr <- probeID2nuID(pidr, lib='lumiMouseIDMapping') #### I realized that I could not find the nuID for some of the probe IDs
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species,  :
  Some input IDs cannot be matched!
> nuidr1h <- probeID2nuID(pidr[1:100], lib='lumiMouseIDMapping') #### I already checked that the first 100 probes are okay
> nuidr1h15 <- probeID2nuID(pidr[101:115], lib='lumiMouseIDMapping') #### some probes after the 100th don't have a nuID
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species,  :
  Some input IDs cannot be matched!
> nuidr1h15 #### I checked which ones ...
          Search_key       Target             ProbeId     Accession Symbol nuID
2470008   NA               NA                 NA          NA        NA     NA
107000471 "9626096_327_rc" "9626096_327_rc-S" "107000471" ""        ""     "NqCSCZmBIJiCSnj68M"
104120113 "9626096_327"    "9626096_327-S"    "104120113" ""        ""     "64PBQ0l592fe9mZ958"
101850133 "9626096_331_rc" "9626096_331_rc-S" "101850133" ""        ""     "E8_jEasVqYdLc2ih6Y"
5690687   NA               NA                 NA          NA        NA     NA
106450575 "9626096_5_rc"   "9626096_5_rc-S"   "106450575" ""        ""     "BI1aGJVQOAhVYhqy0M"
104280750 "9626096_5"      "9626096_5-S"      "104280750" ""        ""     "9j4cVt2qt.T_qnbWo0"
101240091 "9626096_7"      "9626096_7-S"      "101240091" ""        ""     "xjeiFXlSoUWFRRaisI"
100840706 "9626100_15"     "9626100_15-S"     "100840706" ""        ""     "oUNL2fdn3vZmfWd1ic"
101170112 "9626100_20_rc"  "9626100_20_rc-S"  "101170112" ""        ""     "uoEmIUY9o5ASQin8_o"
770541    NA               NA                 NA          NA        NA     NA
103710377 "9626100_224_rc" "9626100_224_rc-S" "103710377" ""        ""     "NmZgSCYgmB4_vL0DCs"
103870242 "9626100_224"    "9626100_224-S"    "103870242" ""        ""     "ZXz_BwUNL2fdn3vZmc"
104540048 "9626100_230_rc" "9626100_230_rc-S" "104540048" ""        ""     "6dUKBJiFGPaOQEkIp8"
2190154   NA               NA                 NA          NA        NA     NA
> nuidr101 <- probeID2nuID(pidr[101], lib='lumiMouseIDMapping') #### I tried a single mapping and it worked
> nuidr101
        Search_Key    ILMN_Gene Accession     Symbol  Probe_Id       Array_Address_Id nuID
2470008 "ILMN_194227" "BMP8B"   "NM_007559.4" "Bmp8b" "ILMN_1229350" "002470008"      "cXghVQQPiHAVLKfdqE"
> nuidr105 <- probeID2nuID(pidr[105], lib='lumiMouseIDMapping') #### it worked too for this probeID
> nuidr105
        Search_Key    ILMN_Gene Accession     Symbol   Probe_Id       Array_Address_Id nuID
5690687 "ILMN_220120" "INPP5A"  "NM_183144.1" "Inpp5a" "ILMN_2717861" "005690687"      "icS9ejqmimYvDtr9XI"
> nuidr1011 <- probeID2nuID(pidr[111], lib='lumiMouseIDMapping') #### it did not work for this probeID
Warning message:
In getChipInfo(IlluminaID, lib.mapping = lib.mapping, species = species,  :
  No matches were found!
> nuidr1011
770541
    NA
> nuidr1015 <- probeID2nuID(pidr[115], lib='lumiMouseIDMapping') #### it worked too for this probeID
> nuidr1015
        Search_Key    ILMN_Gene Accession     Symbol   Probe_Id       Array_Address_Id nuID
2190154 "ILMN_217818" "DNAJA2"  "NM_019794.1" "Dnaja2" "ILMN_3003324" "002190154"      "x7NMU3eJcMkcixP3Ik"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lumiMouseAll.db_1.22.0    org.Mm.eg.db_2.10.1       lumiMouseIDMapping_1.10.0
[4] RSQLite_0.11.4            DBI_0.2-7                 AnnotationDbi_1.24.0
[7] lumi_2.14.0               Biobase_2.22.0            BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] affy_1.40.0            affyio_1.30.0          annotate_1.40.0        base64_1.1
 [5] beanplot_1.1           BiocInstaller_1.12.0   biomaRt_2.18.0         Biostrings_2.30.0
 [9] bitops_1.0-6           BSgenome_1.30.0        bumphunter_1.2.0       codetools_0.2-8
[13] colorspace_1.2-4       digest_0.6.3           doRNG_1.5.5            foreach_1.4.1
[17] genefilter_1.44.0      GenomicFeatures_1.14.0 GenomicRanges_1.14.1   grid_3.0.2
[21] illuminaio_0.4.0       IRanges_1.20.0         iterators_1.0.6        itertools_0.1-1
[25] KernSmooth_2.23-10     lattice_0.20-24        limma_3.18.0           locfit_1.5-9.1
[29] MASS_7.3-29            Matrix_1.0-14          matrixStats_0.8.12     mclust_4.2
[33] methylumi_2.8.0        mgcv_1.7-27            minfi_1.8.0            multtest_2.18.0
[37] nleqslv_2.0            nlme_3.1-111           nor1mix_1.1-4          pkgmaker_0.17.4
[41] preprocessCore_1.24.0  R.methodsS3_1.5.2      RColorBrewer_1.0-5     RCurl_1.95-4.1
[45] registry_0.2           reshape_0.8.4          rngtools_1.2.3         Rsamtools_1.14.1
[49] rtracklayer_1.22.0     siggenes_1.36.0        splines_3.0.2          stats4_3.0.2
[53] stringr_0.6.2          survival_2.37-4        tools_3.0.2            XML_3.95-0.2
[57] xtable_1.7-1           XVector_0.2.0          zlibbioc_1.8.0
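The one-at-a-time checks in the session above (probes 101, 105, 111 and 115) could be automated along the following lines. This is a sketch under stated assumptions: `map_individually` and `toy_mapper` are hypothetical names, and the toy lookup table (with two nuIDs copied from the output above) stands in for the real mapping so that the example runs without any annotation package. With the real data, `mapper` would have to wrap the `probeID2nuID()` call and extract the nuID column from its result.

```r
# Hypothetical helper: map IDs one at a time and record which ones fail.
# `mapper` is any function that takes one ID and returns a nuID or NA.
map_individually <- function(ids, mapper) {
  nuid <- vapply(ids, function(id) {
    res <- suppressWarnings(mapper(id))
    if (length(res) == 0 || is.na(res[1])) NA_character_ else as.character(res[1])
  }, character(1), USE.NAMES = FALSE)
  data.frame(id = ids, nuID = nuid, matched = !is.na(nuid),
             stringsAsFactors = FALSE)
}

# Toy lookup standing in for the real mapping (an assumption for this
# sketch); the two nuIDs are copied from the session output above.
toy_table <- c("2470008" = "cXghVQQPiHAVLKfdqE",
               "5690687" = "icS9ejqmimYvDtr9XI")
toy_mapper <- function(id) unname(toy_table[as.character(id)])

res <- map_individually(c(2470008, 770541, 5690687), toy_mapper)
unmatched <- res$id[!res$matched]
print(res)
print(unmatched)
```

Mapping every ID separately is slow for tens of thousands of probes, but it separates the IDs that have no match at all from those that only come back as NA when they are mapped in a batch.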
                    
                
                