SimonNoël
Hi,
You said that one of the warnings was because I only had SNPs from chr1.
I decided to add the first 2 SNPs I have from chr2, but I get a new error:
> myids <- c("rs7547453", "rs2840542", "rs1999527", "rs4648545",
             "rs10915459", "rs16838750", "rs12128230", "rs4637157",
             "rs11900053", "rs999999999")
> mysnps <- makeGRangesFromRefSNPids(myids)
Error in solveUserSEW0(start = start, end = end, width = width) :
  solving row 8: range cannot be determined from the supplied arguments (too many NAs)
In addition: Warning messages:
1: In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
  number of items to replace is not a multiple of replacement length
2: In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
  number of items to replace is not a multiple of replacement length
But if I try with only the first 2 SNPs on chr2, I get:
> myids <- c("rs4637157", "rs11900053", "rs999999999")
> mysnps <- makeGRangesFromRefSNPids(myids)
Warning message:
In ans_locs[!is.na(myrows)] <- locs$loc[myrows] :
  number of items to replace is not a multiple of replacement length
> mysnps
GRanges with 3 ranges and 1 elementMetadata value
    seqnames         ranges strand |   RefSNP_id
       <rle>      <iranges>  <rle> | <character>
[1]      ch2 [29443, 29443]      * |   rs4637157
[2]      ch2 [36787, 36787]      * |  rs11900053
[3]  unknown [    0,     0]      * | rs999999999

seqlengths
     ch2 unknown
      NA      NA
So does that mean I will have to go chromosome by chromosome and split my big file?
Now for the problem of changing ch1 to chr1:
> seqnames(mysnps)
'factor' Rle of length 8 with 2 runs
Lengths: 7 1
Values : ch1 unknown
Levels(2): ch1 unknown
> seqnames(mysnps) <- sub("ch", "chr", seqnames(mysnps))
> seqnames(mysnps)
'factor' Rle of length 8 with 2 runs
Lengths: 7 1
Values : chr1 unknown
Levels(2): chr1 unknown
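To make the renaming step easier to check in isolation, here is a minimal sketch of the same substitution on a plain character vector. The vector `old_names` is a made-up stand-in for `as.character(seqnames(mysnps))`, not my real data, and the anchored pattern is only my assumption about a safer variant of the `sub()` call above: it leaves names that already start with "chr" alone.

```r
# Hypothetical stand-in for as.character(seqnames(mysnps)); not real data.
old_names <- c("ch1", "ch1", "ch2", "chr3", "unknown")

# Rewrite only a leading "ch" that is directly followed by a chromosome
# label (digit, X, Y or M), so "chr3" and "unknown" are left untouched.
new_names <- sub("^ch([0-9XYM])", "chr\\1", old_names)

print(new_names)
```

With the unanchored pattern "ch", a name that is already "chr3" would become "chrr3", which is why the sketch anchors the match.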
> map <- as.matrix(findOverlaps(mysnps, tx))
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
Only some seqnames from 'query' and 'subject' were not identical
> mapExon <- as.matrix(findOverlaps(mysnps, txExon))
Warning message:
In .local(query, subject, maxgap, minoverlap, type, select, ...) :
Only some seqnames from 'query' and 'subject' were not identical
>
> mapped_genes <- values(tx)$gene_id[map[, 2]]
> mapped_snps <- rep.int(values(mysnps)$RefSNP_id[map[, 1]],
                         elementLengths(mapped_genes))
> snp2gene <- unique(data.frame(snp_id=mapped_snps,
                                gene_id=unlist(mapped_genes)))
> rownames(snp2gene) <- NULL
> snp2gene[1:4, ]
snp_id gene_id
1 rs7547453 6497
2 rs2840542 79906
3 rs1999527 63976
4 rs4648545 7161
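As I understand it, the `rep.int()` step above repeats each SNP id once per gene in its overlapping transcript, so that the SNP column lines up with `unlist(mapped_genes)`. Here is a small base R sketch of that expansion. The ids in `hit_snps` and `hit_genes` are invented for illustration, and I use base `lengths()` on a plain list in place of `elementLengths()`, which is an assumption on my part and not what the package does internally.

```r
# Hypothetical stand-ins: one entry per overlap hit. Each element of
# hit_genes holds the gene ids attached to the transcript that was hit.
hit_snps  <- c("rsA", "rsB", "rsC")
hit_genes <- list("101", c("202", "203"), character(0))

# Repeat each SNP id once per gene in its hit so it aligns with unlist().
mapped_snps <- rep.int(hit_snps, lengths(hit_genes))

snp2gene <- unique(data.frame(snp_id = mapped_snps,
                              gene_id = unlist(hit_genes),
                              stringsAsFactors = FALSE))
rownames(snp2gene) <- NULL
print(snp2gene)
```

A hit whose transcript has no gene id (like "rsC" here) gets a repeat count of zero and drops out of the table.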
So now it's working on my computer :) but, as I said, I am only able to do SNPs
from one chromosome.
On the supercomputer it still doesn't work, and on my computer it still takes
a lot of time. What isn't working is:
> txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="refGene")
Download the refGene table ... OK
Download the refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TranscriptDb object ... Error in .writeMetadataTable(conn, metadata) :
  subscript out of bounds
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Simon Noël
CdeC