tilingArray: normalizeByReference
@anjan-purkayastha-3096
Hi all,

A rookie tilingArray question: I'm having trouble creating the perfect-match index vector. For my probe set (n = 6490), each probe is a unique, perfect match to its target. I have created a probe index file that contains the integers 1, 2, ..., 6490 (one index per line). I scanned this file as:

    pm_vector <- scan("PerfectMatchIndex", 0)

I then used pm_vector in the following command:

    vac_normalized_exp_data <- normalizeByReference(vac_exp_set, vac_hyb_set, pm = pm_vector, background = bg_vector, cutoffQuantile = 0)

This gives me an error:

    'pm' must be an integer vector with values between 1 and 6490.

Any idea what the problem could be?

TIA,
Anjan

--
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142
purkayas [at] wi [dot] mit [dot] edu
703.740.6939
@joern-toedling-1244
Hello Anjan,

Well, the error message indicates that your vector pm_vector is not usable by the function. Is it really an integer vector? What is the output of

    str(pm_vector)

?

It would probably be better to read in the file with something like

    pm_vector <- read.delim("PerfectMatchIndex", skip=<put number of header lines here!>)$V1

Again, check the result using

    str(pm_vector)

If you want to use other functionality of the tilingArray package, you should consider creating a 'probeAnno' environment for your array, which holds the mapping between probes and genomic match positions. Please see the script "makeProbeAnno.R" in the package's "scripts" directory for details. Once you have such a probeAnno object, the function "PMindex" can be used to obtain a numeric vector that you can use as "pm" in the normalization function. Please see the vignette "assessNorm" of package tilingArray for more details.

Best regards,
Joern

--
Joern Toedling
EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom
Phone +44(0)1223 492566
Email toedling at ebi.ac.uk

@anjan-purkayastha-3096
Joern,

Thanks. I've solved the problem by wrapping as.integer() around pm_vector. So here is my command:

    vac_normalized_data = normalizeByReference(vac_exp_set, vac_hyb_set, pm = as.integer(pm_vector), background = as.integer(bg_vector))

However, my story does not end here. I get a further set of errors after this command:

    Error in validObject(.Object) :
      invalid class "vsnInput" object: 'subsample' must be a numeric vector of length 1 with values between 0 and 6488.
    In addition: Warning message:
    In normalizeByReference(vac_exp_set, vac_hyb_set, pm = as.integer(pm_vector), :
      'some strata of background contain fewer than 5000 features, are you sure this is alright?
    Error in exprs(vsnMatrix(xn, lts.quantile = 0.95, subsample = as.integer(2e+05), :
      error in evaluating the argument 'object' in selecting a method for function 'exprs'

The first part of the error message ('subsample' must be a numeric vector of length 1 with values between 0 and 6488) does not make much sense to me. If I understand the warning about background strata correctly, some background strata contain fewer than 5000 features. That is understandable, as I have only 1275 probes in the background set. There are a total of 6488 probes in my set.

Any ideas on how to troubleshoot this?

Thanks,
Anjan
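The integer-type issue behind the first error can be reproduced with a small sketch. It assumes a stand-in index file written to a temporary location (the file name and the ten indices are illustrative, not the real PerfectMatchIndex file). Passing 0 as the second argument of scan() asks for doubles, so the result fails an integer check even though every value is a whole number.

```r
# Write a small stand-in for the index file (one integer per line).
# The temporary file and its contents are illustrative assumptions.
idx_file <- tempfile()
writeLines(as.character(1:10), idx_file)

# what = 0 tells scan() to read doubles, so this is not an integer vector.
pm_double <- scan(idx_file, what = 0, quiet = TRUE)
is.integer(pm_double)

# Either read integers directly ...
pm_int <- scan(idx_file, what = integer(), quiet = TRUE)

# ... or convert afterwards, as was done in this thread.
pm_conv <- as.integer(pm_double)

# Inspect the type, as suggested above.
str(pm_int)
```

Reading with what = integer() avoids the later conversion. If read.delim() is used instead on a file without a header line, header = FALSE is needed for a column named V1 to exist.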
@joern-toedling-1244
Anjan,

Looking at the source code of the function normalizeByReference, there is indeed the line

    vsnMatrix(xn, lts.quantile = 0.95, subsample = as.integer(2e+05))

and since your matrix has only 6488 rows, this cannot work: the fixed subsample size of 200000 exceeds the number of rows, which is what the validity check on the "vsnInput" object rejects.

Without knowing more details about your array platform and study, I cannot say whether or not the function "normalizeByReference" is really applicable to your array platform. However, we should improve the function anyway by changing this line to something like

    vsnMatrix(xn, lts.quantile = 0.95, subsample = min(nrow(xn), as.integer(2e+05)))

We are going to make this change, and it will be included in the development version of tilingArray within a few days. For the moment, you will have to work with a local copy of the function modified in this way. Sorry for the inconvenience.

Regards,
Joern
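The capping idea in the proposed fix can be sketched without calling vsn at all. The helper name cap_subsample and the random stand-in matrix are hypothetical, introduced only for illustration; they are not part of tilingArray or vsn.

```r
# Hypothetical helper (not part of tilingArray or vsn): cap the requested
# subsample size at the number of rows actually available in the matrix.
cap_subsample <- function(x, requested = 200000L) {
  min(nrow(x), as.integer(requested))
}

# Stand-in matrix with the same number of rows as the array in this thread.
xn <- matrix(rnorm(6488 * 2), nrow = 6488, ncol = 2)

# With fewer rows than the default request, the cap is the row count.
small_cap <- cap_subsample(xn)

# With more rows than the request, the requested size is kept.
large_cap <- cap_subsample(matrix(0, nrow = 10, ncol = 1), requested = 5L)
```

For a 6488-row matrix the capped value stays within the range named in the error message, which is why a call using min(nrow(xn), as.integer(2e+05)) should pass the validity check.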
@anjan-purkayastha-3096
Joern,

Thanks. Let me modify a local version and see the results. Is it OK if I work on this problem with you? I think Dr. Huber's genomic DNA hybridization-based approach suits our purpose best.

To give you a brief idea about our platform: we have built a whole-genome tiling array for the vaccinia virus (dsDNA virus, genome size ca. 190 kb); there are about 19000 overlapping probes, 60 bp in length, for each strand. For the purposes of normalization I have removed overlapping probes and am left with 6488 non-overlapping, contiguous members.

Anyway, I'll rerun the procedure with the revised script and see what I come up with. Thank you for your help.

Warm regards,
Anjan