Guest User
@guest-user-4897
Dear list,
I have a question about the arguments of the align function in the
Rsubread package. I have mapped my RNA-seq SOLiD data (single-end, 16
samples, 50bp long reads, human) with Rsubread using the align()
function in 3 versions:
-default parameters (1)
-unique=TRUE and tieBreakQS=TRUE (2)
-unique=TRUE (3)
To my surprise, the percentage of mapped reads is ordered like this:
(1)>(2)>(3), for all samples.
Why do I get more mapped reads when unique=TRUE and tieBreakQS=TRUE (2)
are used together than with unique=TRUE alone (3)? As I understand it,
the tieBreakQS argument should only decide which alignment is kept when
two alignments are equally good. I expected something like this:
(1)>(2)=(3) approximately.
Where does my reasoning go wrong?
In addition, after counting with the featureCounts() function (GTF
with genes only), for some samples I obtained more genes with
alignment (2) than with (1). I thought that the set of mapped reads
of (2) should be contained in (1). Is this also wrong?
It does not happen in many samples, and the difference is not that
big, but it is unexpected to me.
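To make the containment question concrete, here is a minimal base R sketch of the check I have in mind. All the values are made up for illustration: the read IDs and the gene counts are hypothetical toy data, not output from my samples, and the variable names are my own rather than anything from Rsubread.

```r
# Hypothetical read IDs reported as mapped in two of the runs.
# In practice these would be extracted from the alignment output files.
mapped_default   <- c("r1", "r2", "r3", "r4", "r5")  # run (1), default parameters
mapped_unique_tb <- c("r1", "r2", "r4", "r6")        # run (2), unique + tieBreakQS

# Reads mapped in (2) but not in (1); empty if (2) is a subset of (1).
only_in_2 <- setdiff(mapped_unique_tb, mapped_default)
is_subset <- length(only_in_2) == 0

# Hypothetical per-gene counts for the same sample under each run.
counts_1 <- c(geneA = 10, geneB = 0, geneC = 3)
counts_2 <- c(geneA = 7,  geneB = 2, geneC = 0)

# Genes with at least one read in each run, and those seen only in run (2).
genes_1 <- names(counts_1)[counts_1 > 0]
genes_2 <- names(counts_2)[counts_2 > 0]
genes_only_in_2 <- setdiff(genes_2, genes_1)

print(only_in_2)
print(genes_only_in_2)
```

If only_in_2 is non-empty on real data, then the reads of (2) are not simply a subset of those of (1), which could explain genes appearing only in (2).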
If anyone can see where my reasoning goes wrong, I would be very
thankful!
Cheers,
Lucía
-- output of sessionInfo():
sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsubread_1.6.3
loaded via a namespace (and not attached):
[1] tools_2.15.0
--
Sent via the guest posting facility at bioconductor.org.