Question

refSeqName issue in cn.MOPS

0

Entering edit mode

ashkot09 • 0

@ashkot09-9452

Last seen 8.3 years ago

I was wondering if someone could help.

I recently came across cn.MOPS and it's very good. While I was trying to use it I noticed that it asks for refSeqName and initially it led me to believe it is looking for the name of the reference file. But I don't think that is the case so then what should be used there:

I downloaded 3 bam files from 1000 genomes

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/alignment/HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/alignment/HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam.bai

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00099/alignment/HG00099.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00099/alignment/HG00099.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam.bai

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00100/alignment/HG00100.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00100/alignment/HG00100.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.bam.bai

and used to following code:

startMops <- function() {

library(cn.mops)

BAMFiles <- list.files(pattern=".bam$")

bamDataRanges <- getReadCountsFromBAM(BAMFiles,refSeqName="c:/users/ashwin/documents/human_g1k_v37",mode="unpaired",WL=5000)

res <- cn.mops(bamDataRanges)

plot(res,which=1)

data(cn.mops)

result <- calcIntegerCopyNumbers(cn.mops(XRanges))

segm <- as.data.frame(segmentation(result))

cnvs <- as.data.frame(cnvs(result))

cnvregions <- as.data.frame(cnvr(result))

write.csv(segm,file="segmentation.csv")

write.csv(cnvs,file="cnvs.csv")

write.csv(cnvregions,file="cnvr.csv")

return(1)

}

but i get error about the refSeqName.

Can you please let me know what I could be using.

Thanks in advance.

CNV cn.mops • 1.7k views

ADD COMMENT • link updated 8.3 years ago by Günter Klambauer ▴ 540 • written 8.3 years ago by ashkot09 • 0

score 2 · Accepted Answer · 2016-01-04

2

Entering edit mode

Günter Klambauer ▴ 540

@gunter-klambauer-5426

Last seen 3.3 years ago

Austria

Dear Ashwin,

The parameter "refSeqName" relates to the names of the chromosomes as they are named in the BAM file header. Typically, these are "chr1", "chr2",... or "1", "2",... depending on the version of the reference genome.

If you want to analyze chromosome 1, then use:

bamDataRanges <- getReadCountsFromBAM(BAMFiles,refSeqName="chr1",mode="unpaired",WL=5000)

If you want to analyze chromosome 1-22, X and Y, then use:

bamDataRanges <- getReadCountsFromBAM(BAMFiles,refSeqName=c(paste0("chr",1:22),"chrX","chrY"),mode="unpaired",WL=5000)

Alternatively, you can try:

bamDataRanges <- getReadCountsFromBAM(BAMFiles,refSeqName="1",mode="unpaired",WL=5000)

bamDataRanges <- getReadCountsFromBAM(BAMFiles,refSeqName=c(paste0("",1:22),"X","Y"),mode="unpaired",WL=5000)

Otherwise, your code seems to be okay. I would recommend to use more samples than just 3.

Regards,

Günter

ADD COMMENT • link 8.3 years ago Günter Klambauer ▴ 540

0

Entering edit mode

And for what it's worth the sequence names ('levels') in the bam files can be discovered with seqlevels(BamFileList(BAMFiles)).

ADD REPLY • link 8.3 years ago Martin Morgan 25k

0

Entering edit mode

where is the method seqlevels located?

ADD REPLY • link 8.3 years ago ashkot09 • 0

0

Entering edit mode

The function "seqlevels" is a function of "GenomeInfoDb" and should be automatically available when you load cn.mops by "library(cn.mops)".

ADD REPLY • link 8.3 years ago Günter Klambauer ▴ 540

0

Entering edit mode

Thanks for your help, Martin!

ADD REPLY • link 8.3 years ago Günter Klambauer ▴ 540

0

Entering edit mode

Thanks Gunter, this has worked.

ADD REPLY • link 8.3 years ago ashkot09 • 0