Comparing transcript expression between conditions
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 15 days ago
Germany

hello,

I'm analyzing an RNA-seq data set with three different outcomes (`favorable`, `intermediate` and `poor`; no control). I am specifically interested in the expression of certain transcripts (e.g. the stat3 transcripts).

I would like to show that the expression (read counts) of stat3a relative to stat3b differs significantly between two outcomes.

After trying DEXSeq and cuffdiff, which only compare a given transcript with itself between two conditions, I decided to run a t-test on the results of the `salmon` quantification.

I used `salmon` in quasi-mapping mode to quantify my data and extracted the results for my two transcripts.

I then read them into R and ran a t-test on the per-sample ratio of the two transcripts to see whether the difference is significant.

    library(readr)  # provides read_tsv()
    salmon.counts <- read_tsv("stat3.samples.Counts.txt")
    # per-sample ratio of the two transcripts
    salmon.counts$ratio <- salmon.counts$ENST00000264657 / salmon.counts$ENST00000585517
    t.test(subset(salmon.counts, condition == "Favorable")$ratio,
           subset(salmon.counts, condition == "Poor")$ratio)

 

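The output I get for such a test is shown below (this run compares `Poor` with `Intermediate`; note that p = 0.43 is not significant at the 0.05 level):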

    Welch Two Sample t-test

    data:  subset(tst.pilot, outcome == "Poor")$ratio and subset(tst.pilot, outcome == "Intermediate")$ratio
    t = -0.85552, df = 5.1434, p-value = 0.4303
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.2329766  0.1158939
    sample estimates:
    mean of x mean of y 
    0.1984490 0.2569904

I was wondering whether this approach is statistically robust. If not, is there a better way of analyzing the data?

Thanks in advance for any comments or suggestions.

Assa

 

The quantified table from the `salmon` output:

sampleID    condition    ENST00000264657    ENST00000585517
1    Favorable    2505.73    373.75
2    Favorable    2687.13    324.901
3    Favorable    3026.95    533.415
4    Favorable    2381.98    325.676
5    Favorable    2967.1    547.158
6    Favorable    2354.14    443.844
7    Favorable    2836.7    575.74
8    Favorable    2995.65    331.224
9    Favorable    2821    477.53
10    Favorable    3155.98    443.947
11    Intermediate    1779.66    267.906
12    Intermediate    2071.64    190.962
13    Intermediate    2107.06    574.362
14    Intermediate    4554.63    76.4624
15    Intermediate    2885.07    236.034
16    Intermediate    4400.48    69.2131
17    Intermediate    3128.83    421.91
18    Intermediate    2117.58    494.947
19    Intermediate    2197.06    623.131
20    Intermediate    2214.11    681.548
21    Poor    4064.86    231.687
22    Poor    3089.12    309.805
23    Poor    2309.83    553.167
24    Poor    3132.55    238.842
25    Poor    2804    282.656
26    Poor    2719.42    714.62
27    Poor    4029.91    277.442
28    Poor    3562.57    238.041
29    Poor    3688.88    256.918
30    Poor    3881.81    379.808
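
One common variation on the ratio test described above is to compare log2 ratios instead of raw ratios, since log ratios are symmetric around zero and tend to be less skewed. The following is only a minimal sketch under that assumption, not a validated method for isoform comparison: the data frame `stat3` and the column `log2ratio` are names introduced here for illustration, and the values are copied from the `Favorable` and `Poor` rows of the table above.

```r
# Values copied from the salmon table above (Favorable and Poor samples only).
# `stat3` is a hypothetical name used for this illustration.
stat3 <- data.frame(
  condition = rep(c("Favorable", "Poor"), each = 10),
  ENST00000264657 = c(2505.73, 2687.13, 3026.95, 2381.98, 2967.1,
                      2354.14, 2836.7, 2995.65, 2821, 3155.98,
                      4064.86, 3089.12, 2309.83, 3132.55, 2804,
                      2719.42, 4029.91, 3562.57, 3688.88, 3881.81),
  ENST00000585517 = c(373.75, 324.901, 533.415, 325.676, 547.158,
                      443.844, 575.74, 331.224, 477.53, 443.947,
                      231.687, 309.805, 553.167, 238.842, 282.656,
                      714.62, 277.442, 238.041, 256.918, 379.808)
)

# Per-sample log2 ratio of the two transcripts.
# Assumption: both transcripts have non-zero values in every sample.
stat3$log2ratio <- log2(stat3$ENST00000264657 / stat3$ENST00000585517)

# Welch two-sample t-test (the default of t.test) on the log2 ratios.
res <- t.test(log2ratio ~ condition, data = stat3)
print(res)
```

Because both transcripts are measured in the same sample, a per-sample scaling factor such as sequencing depth cancels out of the ratio, which is one reason to test the ratio rather than the two transcripts separately.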

 


What are the values in your table, and how were they calculated?


The values are the results of the `salmon` analysis for each of the two transcripts (=counts, TPM).


I would use TPM values from RSEM or eXpress and just do a t-test. I haven't seen many comparisons of two different genes before; there may be better methods out there, so I would search for those just in case.

Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 15 days ago
Germany

This is what I was thinking about. I have the TPMs from `salmon` and/or kallisto, both inspired by eXpress (AFAIK). I did a t-test, but I was wondering whether there is a better, more robust way of analyzing the data.
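
If the normality of the ratios is in doubt, a rank-based test is one possible alternative to the t-test. The sketch below is an assumption about what a reasonable choice could look like, not something established in this thread: it runs a Kruskal-Wallis test across all three outcomes and then pairwise Wilcoxon rank-sum tests with Holm correction. The vector names `tx_a`, `tx_b` and `log2ratio` are introduced here for illustration, and the values are copied from the table in the question.

```r
# Values copied from the table in the question, in sample order 1..30.
# tx_a = ENST00000264657, tx_b = ENST00000585517 (hypothetical vector names).
tx_a <- c(2505.73, 2687.13, 3026.95, 2381.98, 2967.1,
          2354.14, 2836.7, 2995.65, 2821, 3155.98,
          1779.66, 2071.64, 2107.06, 4554.63, 2885.07,
          4400.48, 3128.83, 2117.58, 2197.06, 2214.11,
          4064.86, 3089.12, 2309.83, 3132.55, 2804,
          2719.42, 4029.91, 3562.57, 3688.88, 3881.81)
tx_b <- c(373.75, 324.901, 533.415, 325.676, 547.158,
          443.844, 575.74, 331.224, 477.53, 443.947,
          267.906, 190.962, 574.362, 76.4624, 236.034,
          69.2131, 421.91, 494.947, 623.131, 681.548,
          231.687, 309.805, 553.167, 238.842, 282.656,
          714.62, 277.442, 238.041, 256.918, 379.808)
condition <- factor(rep(c("Favorable", "Intermediate", "Poor"), each = 10))

# Per-sample log2 ratio of the two transcripts.
log2ratio <- log2(tx_a / tx_b)

# Global test: does the ratio differ between any of the three outcomes?
kw <- kruskal.test(log2ratio ~ condition)
print(kw)

# Pairwise Wilcoxon rank-sum tests, Holm-adjusted for multiple comparisons.
pw <- pairwise.wilcox.test(log2ratio, condition, p.adjust.method = "holm")
print(pw)
```

Rank-based tests give the same result for the raw ratio and the log2 ratio, so the log transform here only matters for plotting or for comparing with a t-test.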
