On 11/07/2011 02:53 AM, Stefanie Ververs wrote:
> Hi Martin,
>
> I just ran my script with another dataset but got the same error (even
> if on a different line).
> I read the description of readAligned and ScanBamParam, but I still
> don't get how to provide these parameters to readAligned. (I see that I
> can create an object/instance of ScanBamParam, but then? Is it something
> like "alignedReads <- readAligned(dir, pattern=filename, type="BAM",
> param=myScanBamParamObject)"? There is no example and I am *really* new

yes, for example

  dir <- system.file("extdata", package="Rsamtools")
  param <- ScanBamParam(simpleCigar=TRUE, reverseComplement=TRUE)
  aln <- readAligned(dir, "ex1.bam", type="BAM", param=param)

> to R, sorry ;) Then I would try parsing the file with more specified
> arguments..
>
> I got some more information about my file/dataset:
>
> 17709538 + 0 in total (QC-passed reads + QC-failed reads)
> 0 + 0 duplicates
> 120774 + 0 mapped (0.68%:-nan%)
> 17709538 + 0 paired in sequencing
> 8854769 + 0 read1
> 8854769 + 0 read2
> 120774 + 0 properly paired (0.68%:-nan%)
> 120774 + 0 with itself and mate mapped
> 0 + 0 singletons (0.00%:-nan%)
> 0 + 0 with mate mapped to a different chr
> 0 + 0 with mate mapped to a different chr (mapQ>=5)
>
> And the error is (this time) in row 120775:
>
> Error in solveUserSEW0(start = start, end = end, width = width) :
>   solving row 120775: range cannot be determined from the supplied
>   arguments (too many NAs)
> Calls: RangedData -> is -> IRanges -> solveUserSEW0 -> .Call

Again it appears to be the last record. Can you provide a simpler
example? For instance, from your original message you said that you
input the data as

  alignedReads <- readAligned(dir, pattern=filename, type="BAM")

but that the error occurred at

  reads_pair = processReads(nucleosome_htseq, type="paired",
                            fragmentLen=fragment_len)

but there is no obvious connection between 'alignedReads' and the
arguments to 'processReads'. Also, what is the output of

  idx = is.na(position(alignedReads)) & is.na(width(alignedReads))
  sum(idx)

? If sum(idx) != 0, then try using alignedReads[!idx]. And finally, after

  library(ShortRead)
  library(nucleR)

what is the result of

  sessionInfo()

?

> On 03.11.2011 14:39, Martin Morgan wrote:
>> On 11/03/2011 06:14 AM, Oscar Flores wrote:
>>> So this error happens here, no?
>>>
>>>   res = RangedData(IRanges(start=position(ar),width=width(ar)),
>>>                    strand=strand(ar),space=ar@chromosome)
>>
>> better to use the accessor chromosome(ar). The error
>>
>> > Error in solveUserSEW0(start = start, end = end, width = width) :
>> >   solving row 16512893: range cannot be determined from the supplied
>> >   arguments (too many NAs)
>>
>> suggests that position(ar)[16512893] and / or width(ar)[16512893] is
>> NA. You could filter these out, e.g., ar[!is.na(position(ar)) &
>> !is.na(width(ar))] or identify why these are read in in the first
>> place using the 'param' argument as described on ?readAligned.
>>
>> Martin
>>
>>> If this is the case the problem is not in nucleR; maybe there are some
>>> rows in a strange format in the AlignedRead (could be due to the
>>> multiple format changes) that may prevent the conversion to
>>> RangedData. Maybe I can detect them and skip those cases, but I would
>>> need to see what's happening in that odd case.
>>>
>>> Let me know if there's something I can do.
>>>
>>> Regards,
>>>
>>> Oscar
>>>
>>> On 03/11/2011 13:58, Stefanie Ververs wrote:
>>>> Hi Oscar,
>>>>
>>>> thanks for your quick answer - I think I would have contacted you, if
>>>> there were no answers on the bioconductor-mailinglist.
>>>>
>>>> I just tried the workaround as you suggested, but I got the same error
>>>> again:
>>>>
>>>> Error in solveUserSEW0(start = start, end = end, width = width) :
>>>>   solving row 16512893: range cannot be determined from the supplied
>>>>   arguments (too many NAs)
>>>> Calls: RangedData -> is -> IRanges -> solveUserSEW0 -> .Call
>>>>
>>>> I'll think about how to show you the data (it's hosted and processed
>>>> with Galaxy, so it might be possible to share it.)
>>>>
>>>> Regards,
>>>>
>>>> Steffi
>>>>
>>>> On 03.11.2011 13:16, Oscar Flores wrote:
>>>>> Hi Stefanie,
>>>>>
>>>>> I'm the developer of nucleR, so let's see if I can help you ;)
>>>>>
>>>>> After the processing, processReads converts the input data to a
>>>>> RangedData object for easier manipulation later, so this error
>>>>> occurs at the last step of the call, but the data can be messed up
>>>>> in previous steps. It's hard to tell what is happening without
>>>>> having a look at the input data, which I guess is huge...
>>>>>
>>>>> I would like to have a look at the raw data, but I know it is
>>>>> difficult to send it if it's not in a public repository. Maybe you
>>>>> can contact me directly about that (oflores@mmb.pcb.ub.es)...
>>>>>
>>>>> Meanwhile, if you want to try a workaround, you can directly
>>>>> convert the imported reads to RangedData format (which is the other
>>>>> format supported by processReads):
>>>>>
>>>>> (with "ar" being your imported AlignedReads object)
>>>>>
>>>>>   res = RangedData(IRanges(start=position(ar),width=width(ar)),
>>>>>                    strand=strand(ar),space=ar@chromosome)
>>>>>
>>>>>   reads_pair = processReads(res, type="paired",
>>>>>                             fragmentLen=fragment_len)
>>>>>
>>>>> This should work, but it would be nice to have a look at your data
>>>>> to fix a possible problem in the AlignedReads method.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Oscar
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor@r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


On 07.11.2011 19:52, Martin Morgan wrote:
> On 11/07/2011 02:53 AM, Stefanie Ververs wrote:
>> Hi Martin,
>>
>> I just ran my script with another dataset but got the same error (even
>> if on a different line).
>> I read the description of readAligned and ScanBamParam, but I still
>> don't get how to provide these parameters to readAligned. (I see that I
>> can create an object/instance of ScanBamParam, but then? Is it something
>> like "alignedReads <- readAligned(dir, pattern=filename, type="BAM",
>> param=myScanBamParamObject)"? There is no example and I am *really* new
>
> yes, for example
>
>   dir <- system.file("extdata", package="Rsamtools")
>   param <- ScanBamParam(simpleCigar=TRUE, reverseComplement=TRUE)
>   aln <- readAligned(dir, "ex1.bam", type="BAM", param=param)

I tried this:

  param <- ScanBamParam(simpleCigar=TRUE, reverseComplement=TRUE)
  alignedReads <- readAligned(dir, pattern=filename, type="BAM",
                              param=param)

and got a warning:

  Warning message:
  UserArgumentMismatch
    using 'qname' 'flag' 'rname' 'strand' 'pos' 'mapq' 'seq' 'qual'
    for 'bamWhat(param)'

(I have not tested with more arguments yet.)

>> to R, sorry ;) Then I would try parsing the file with more specified
>> arguments..
>>
>> I got some more information about my file/dataset:
>>
>> 17709538 + 0 in total (QC-passed reads + QC-failed reads)
>> 0 + 0 duplicates
>> 120774 + 0 mapped (0.68%:-nan%)
>> 17709538 + 0 paired in sequencing
>> 8854769 + 0 read1
>> 8854769 + 0 read2
>> 120774 + 0 properly paired (0.68%:-nan%)
>> 120774 + 0 with itself and mate mapped
>> 0 + 0 singletons (0.00%:-nan%)
>> 0 + 0 with mate mapped to a different chr
>> 0 + 0 with mate mapped to a different chr (mapQ>=5)
>>
>> And the error is (this time) in row 120775:
>>
>> Error in solveUserSEW0(start = start, end = end, width = width) :
>>   solving row 120775: range cannot be determined from the supplied
>>   arguments (too many NAs)
>> Calls: RangedData -> is -> IRanges -> solveUserSEW0 -> .Call
>
> Again it appears to be the last record. Can you provide a simpler
> example? For instance, from your original message you said that you
> input the data as

I got some smaller input file and will try to use it for NucleR, too.
By now:

>   alignedReads <- readAligned(dir, pattern=filename, type="BAM")
>
> but that the error occurred at
>
>   reads_pair = processReads(nucleosome_htseq, type="paired",
>                             fragmentLen=fragment_len)
>
> but there is no obvious connection between 'alignedReads' and the
> arguments to 'processReads'. Also, what is the output of

This was because of copying and changing some names; alignedReads and
nucleosome_htseq are the same object. (I checked and ran it again, still
the same error.)

>   idx = is.na(position(alignedReads)) & is.na(width(alignedReads))
>   sum(idx)
>
> ? If sum(idx) != 0, then try using alignedReads[!idx]. And finally, after

sum(idx) is 0.

>   library(ShortRead)
>   library(nucleR)
>
> what is the result of
>
>   sessionInfo()
>
> ?

The sessionInfo() output:

R version 2.13.2 (2011-09-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=de_DE.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] nucleR_0.99.4       Biobase_2.12.2      ShortRead_1.10.4
[4] Rsamtools_1.4.3     lattice_0.19-33     Biostrings_2.20.4
[7] GenomicRanges_1.4.8 IRanges_1.10.6

loaded via a namespace (and not attached):
[1] grid_2.13.2 hwriter_1.3
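
The idx check discussed in this message joins the two is.na() tests with "&", so it only flags reads where both position and width are missing. A read with a missing position but a known width would pass that check and could still break the range construction. The sketch below illustrates the difference using plain vectors as hypothetical stand-ins for position(alignedReads) and width(alignedReads); the values are made up, and this is an illustration under those assumptions rather than a diagnosis of the actual file.

```r
# Hypothetical stand-ins for position(alignedReads) and width(alignedReads):
# three mapped reads followed by two reads without a position.
pos <- c(101L, 250L, 975L, NA, NA)
wid <- c(36L, 36L, 36L, 36L, 36L)

# Check as written in the thread: flags a read only if BOTH values are NA.
idx_and <- is.na(pos) & is.na(wid)

# Stricter variant: flags a read if EITHER value is NA.
idx_or <- is.na(pos) | is.na(wid)

sum(idx_and)        # number of reads flagged by the "&" check
sum(idx_or)         # number of reads flagged by the "|" check
which(idx_or)[1]    # index of the first problematic read

# Keep only reads that have both a position and a width.
keep   <- !idx_or
pos_ok <- pos[keep]
wid_ok <- wid[keep]
```

With the real object, the same mask could be applied as alignedReads[!idx_or], following the subsetting form suggested earlier in the thread. Reading the file with a ScanBamParam whose flag is scanBamFlag(isUnmappedQuery=FALSE) might avoid importing such records in the first place, though the thread does not confirm that readAligned honors that setting.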

On 09/11/2011 13:38, Stefanie Ververs wrote:
> I got some smaller input file and will try to use it for NucleR, too.
> By now:

The problem you have with nucleR is in the last part of processReads,
but it does not seem to be a malfunction of nucleR; it comes from
creating a RangedData object with <NA> values. See:

  > RangedData(IRanges(NA))
  Error in solveUserSEW0(start = start, end = end, width = width) :
    solving row 1: range cannot be determined from the supplied
    arguments (too many NAs)

I don't have experience working with BAM files, but could it be that you
have an extra blank line at the end of your file and this creates an NA
record? What is the output of:

  > alignedReads <- readAligned(dir, pattern=filename, type="BAM")
  > tail(alignedReads)

Also, FYI, if you type "options(error=recover)" you will be able to see
the call stack of the different functions when an error occurs. You can
then list the variables in the selected frame (with "ls") and inspect
them; maybe this could help...
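
Since options(error=recover) needs an interactive session, a non-interactive way to capture the call stack may be useful when the script runs in batch mode. The sketch below swaps in a different base R technique, withCallingHandlers() combined with sys.calls(), and uses two hypothetical functions (make_ranges and convert, both invented here) that fail in a way similar to the range construction in the thread. It is a minimal sketch, not what nucleR or IRanges do internally.

```r
# Hypothetical stand-in for the range construction: it refuses any row
# whose start or width is NA, similar in spirit to the error in the thread.
make_ranges <- function(start, width) {
  bad <- which(is.na(start) | is.na(width))
  if (length(bad) > 0) {
    stop("solving row ", bad[1], ": range cannot be determined")
  }
  data.frame(start = start, end = start + width - 1L)
}

# Hypothetical wrapper, standing in for the conversion step.
convert <- function(pos, wid) make_ranges(pos, wid)

calls <- NULL
result <- tryCatch(
  withCallingHandlers(
    convert(c(10L, NA), c(5L, 5L)),
    # The calling handler runs before the stack unwinds,
    # so sys.calls() still contains the failing calls.
    error = function(e) calls <<- sys.calls()
  ),
  # The exiting handler turns the error into its message string.
  error = function(e) conditionMessage(e)
)

print(result)   # the error message
print(calls)    # the recorded call stack
```

In an interactive session, traceback() right after the error gives a similar view of the stack without any extra code.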