Question: parameter "refSeqName" in function "getReadCountsFromBAM"
0
gravatar for Mohammad Alkhamis
3.3 years ago by
Canada/ Victoria, BC / University of Victoria
Mohammad Alkhamis0 wrote:

Hello everyone,

I think that the argument "refSeqName" in the function "getReadCountsFromBAM" is a bit misleading. While looking at the source code in "getReadCountsFromBAM.R", I noticed the following code line:

for (i in 1:length(refSeqName)) 

It turned out that I am able to use the function with multiple reference sequence names like this, for example:

bamDataRanges <- getReadCountsFromBAM(BAMFiles,
                                      sampleNames=paste("Sample",1:2),
                                      refSeqName=c("1", "2"),
                                      WL=100,
                                      mode="paired", 
                                      parallel=12)

which will give an output in the console as following:

Identified the following reference sequences:  1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT,GL000207.1,GL000226.1 [...]
Using 1, 2 as reference.
Using indexed BAM files.

 

Though, the documentation (quoted below) suggests that the function is only capable of handing a single reference sequence name. This is also the case with the examples that come with the cn.mops package.

"refSeqName:    Name (Names?) of the reference sequence (sequences?) that should be analyzed. The name (names?) must appear in the header of the BAM file. If it is not given the function will select the first reference sequence that appears in the header of the BAM files."

 

Thanks for accepting my comment.

cn.mops • 665 views
ADD COMMENTlink modified 2.8 years ago by Günter Klambauer540 • written 3.3 years ago by Mohammad Alkhamis0
Answer: parameter "refSeqName" in function "getReadCountsFromBAM"
2
gravatar for Günter Klambauer
3.3 years ago by
Austria
Günter Klambauer540 wrote:
Hello Malkhamis, Thanks for bringing this to my attention. Indeed users are often struggling with that parameter. I will make changes to the documentation as suggested. Regards, Günter On 2016-08-26 21:31, malkhamis [bioc] wrote: > Activity on a post you are following on support.bioconductor.org > <https: support.bioconductor.org=""> > > User malkhamis <https: support.bioconductor.org="" u="" 10862=""/> wrote > Question: parameter "refSeqName" in function "getReadCountsFromBAM" > <https: support.bioconductor.org="" p="" 86536=""/>: > > Hello everyone, > > I think that the argument "|refSeqName|" in the function > "|getReadCountsFromBAM|" is a bit misleading. While looking at the > source code in "getReadCountsFromBAM.R", I noticed the following code > line: > > for (i in 1:length(refSeqName)) > > It turned out that I am able to use the function with multiple > reference sequence names like this, for example: > > bamDataRanges <- getReadCountsFromBAM(BAMFiles, > sampleNames=paste("Sample",1:2), > *refSeqName=c("1", "2"),* > WL=100, > mode="paired", > parallel=12) > > which will give an output in the console as following: > > /Identified the following reference sequences: > 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT,GL000207.1,GL000226.1 > [...] Using 1, 2 as reference. Using indexed BAM files./ > > Though, the documentation (quoted below) suggests that the function is > only capable of handing a single reference sequence name. This is also > the case with the examples that come with the cn.mops package. > > "refSeqName: Name *(Names?)* of the reference sequence > *(sequences?)* that should be analyzed. The name *(names?)* must > appear in the header of the BAM file. If it is not given the function > will select the first reference sequence that appears in the header of > the BAM files." > > Thanks for accepting my comment. > > ------------------------------------------------------------------------ > > Post tags: cn.mops > > You may reply via email or visit > parameter "refSeqName" in function "getReadCountsFromBAM" > -- Günter Klambauer, PhD Institute of Bioinformatics Johannes Kepler University, Linz, Austria http://www.bioinf.jku.at/people/klambauer/
ADD COMMENTlink written 3.3 years ago by Günter Klambauer540
Answer: parameter "refSeqName" in function "getReadCountsFromBAM"
2
gravatar for Günter Klambauer
2.8 years ago by
Austria
Günter Klambauer540 wrote:

Just to make it visible also here... Something like the following typically works for human genomes and runs the function for chr1-22:

X <- getReadCountsFromBAM(BAMFiles,refSeqName=as.character(1:22),WL=1000, mode="paired")

If your sequence names are "chr1", "chr2" rather than "1", "2", then:

X <- getReadCountsFromBAM(BAMFiles,refSeqName=paste0("chr",1:22),WL=1000, mode="paired")

should work.

ADD COMMENTlink written 2.8 years ago by Günter Klambauer540

Thanks a lot Gunter, that helps tremendously ! especially when running cnMOPS on our computer cluster ;)

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by Bogdan580
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 188 users visited in the last hour