Hi all,
I am using the R package VariantAnnotation which I like very much.
I hope you can help me with an issue, though: summarizeVariants does not behave as I expect when I use a multi-sample VCF.
I have 6 samples in my vcf file.
# reading vcf (fl is the path to the VCF file)
library(VariantAnnotation)
chromInfo <- seqinfo(scanVcfHeader(fl))
vcf <- readVcf(fl, chromInfo)
The samples were read in fine; see the example below:
info(vcf)
DataFrame with 2 rows and 18 columns
AC AF AN BaseQRankSum ClippingRankSum DP DS FS HaplotypeScore InbreedingCoeff MLEAC MLEAF MQ MQ0 MQRankSum QD
<IntegerList> <NumericList> <integer> <numeric> <numeric> <integer> <logical> <numeric> <numeric> <numeric> <IntegerList> <NumericList> <numeric> <integer> <numeric> <numeric>
1 5 0.833 6 -0.354 -0.825 40 FALSE NA NA NA NA NA NA 0 2.003 NA
2 4 1 4 NA NA 26 FALSE 0 NA NA 2 1 NA 0 NA NA
ReadPosRankSum set
<numeric> <character>
1 1.061 variant4-variant5-variant6
2 NA variant5-variant6
# transcripts
tx <- transcripts(ann2)
txlst <- splitAsList(tx, seq_along(tx))
# summarize
promoterVar.sum <- summarizeVariants(txlst, vcf, PromoterVariants())
head(assays(promoterVar.sum)$counts)
colSums(assays(promoterVar.sum)$counts)
# example output
Sample1 Sample2 Sample3 Sample4 Sample5 Sample6
25 51 51 51 51 51 51
26 7 7 7 7 7 7
2 44 44 44 44 44 44
3 97 97 97 97 97 97
27 97 97 97 97 97 97
28 97 97 97 97 97 97
As you can see, the counts are identical for all samples, which does not match the data: the genotypes differ between samples.
The same thing happens when I use findOverlaps or any other mode.
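One way to confirm that the samples really differ is to count non-reference genotype calls per sample. Below is a minimal sketch of that check in base R. The toy matrix, its sample names and its genotypes are made up for illustration; with a real object it would be replaced by the GT matrix from geno(vcf)$GT. Treating "0/0" and missing calls as "no variant" is an assumption of this sketch, not necessarily what summarizeVariants does internally.

```r
# Toy genotype matrix standing in for geno(vcf)$GT (values are made up).
gt <- matrix(
  c("0/1", "0/0", "1/1",
    "0/0", "0/0", "0/1",
    "./.", "1/1", "0/0"),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("var", 1:3), paste0("Sample", 1:3))
)

# Assumption: homozygous-reference and missing calls count as "no variant".
non_variant <- c("0/0", "0|0", "./.", ".|.", ".")

# %in% drops the matrix shape, so restore dimensions and names afterwards.
has_variant <- !(gt %in% non_variant)
dim(has_variant) <- dim(gt)
dimnames(has_variant) <- dimnames(gt)

# Number of variant calls per sample.
per_sample <- colSums(has_variant)
print(per_sample)
```

If these per-sample totals differ while the summarizeVariants counts stay identical across columns, that would suggest the genotype information is present in the object but is not being used in the counting step.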
Am I doing something wrong? I very much appreciate your help!
Kind Regards,
Julia
Not sure about that issue, but it would be easier to get the transcript annotations like this: