Question

Comparing transcript expression between conditions

Assa Yeroslaviz ★ 1.5k

@assa-yeroslaviz-1597

Last seen 15 days ago

Germany

Hello,

I'm analyzing an RNA-seq data set with three different outcomes (`favorable`, `intermediate` and `poor`; no control). I am specifically interested in the expression of certain transcripts (e.g. stat3 transcripts).

I would like to show that stat3a is expressed significantly more strongly (in read counts) relative to stat3b in one outcome than in another.

After trying DEXSeq and cuffdiff, which only give me the comparison of a specific transcript with itself between two conditions, I decided to try and do a t-test on the results from the `salmon` quantification run.

I used `salmon` to quantify my data with the quasi-mapping method and extracted the results for my two transcripts.

I then read them into R and did a t-test to see whether the difference is significant.

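    library(readr)  # provides read_tsv()
    # per-sample salmon quantification for the two transcripts
    salmon.counts <- read_tsv("stat3.samples.Counts.txt")
    # within-sample ratio of the two transcripts
    salmon.counts$ratio <- salmon.counts$ENST00000264657/salmon.counts$ENST00000585517
    t.test(subset(salmon.counts, condition=='Favorable')$ratio, subset(salmon.counts, condition=='Poor')$ratio)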

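The results I get for this test show significance. The output pasted below is from a different run (`Poor` vs. `Intermediate`), which is not significant (p = 0.43):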

        Welch Two Sample t-test

    data:  subset(tst.pilot, outcome == "Poor")$ratio and subset(tst.pilot, outcome == "Intermediate")$ratio
    t = -0.85552, df = 5.1434, p-value = 0.4303
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.2329766  0.1158939
    sample estimates:
    mean of x mean of y 
    0.1984490 0.2569904

I was wondering whether this approach is statistically robust. If not, is there a better way of analyzing the data?

Thanks in advance for any comment or suggestion.

Assa

The quantified table from the salmon output:

sampleID    condition    ENST00000264657    ENST00000585517
1    Favorable    2505.73    373.75
2    Favorable    2687.13    324.901
3    Favorable    3026.95    533.415
4    Favorable    2381.98    325.676
5    Favorable    2967.1    547.158
6    Favorable    2354.14    443.844
7    Favorable    2836.7    575.74
8    Favorable    2995.65    331.224
9    Favorable    2821    477.53
10    Favorable    3155.98    443.947
11    Intermediate    1779.66    267.906
12    Intermediate    2071.64    190.962
13    Intermediate    2107.06    574.362
14    Intermediate    4554.63    76.4624
15    Intermediate    2885.07    236.034
16    Intermediate    4400.48    69.2131
17    Intermediate    3128.83    421.91
18    Intermediate    2117.58    494.947
19    Intermediate    2197.06    623.131
20    Intermediate    2214.11    681.548
21    Poor    4064.86    231.687
22    Poor    3089.12    309.805
23    Poor    2309.83    553.167
24    Poor    3132.55    238.842
25    Poor    2804    282.656
26    Poor    2719.42    714.62
27    Poor    4029.91    277.442
28    Poor    3562.57    238.041
29    Poor    3688.88    256.918
30    Poor    3881.81    379.808
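
For reference, the ratio test described above can be reproduced directly from this table in base R. The sketch below is not the original analysis. It assumes the values are the salmon estimates as posted, and it tests the log2 ratio rather than the raw ratio. That is a common choice for ratios because it treats a two-fold increase and a two-fold decrease symmetrically. It also means a constant factor such as transcript length shifts both groups equally and cancels out of the difference in means. The column name `log2_ratio` and the variables `fav`, `poor` and `res` are illustrative.

```r
# Quantification table copied from the post (one row per sample).
salmon.counts <- read.table(header = TRUE, text = "sampleID condition ENST00000264657 ENST00000585517
1 Favorable 2505.73 373.75
2 Favorable 2687.13 324.901
3 Favorable 3026.95 533.415
4 Favorable 2381.98 325.676
5 Favorable 2967.1 547.158
6 Favorable 2354.14 443.844
7 Favorable 2836.7 575.74
8 Favorable 2995.65 331.224
9 Favorable 2821 477.53
10 Favorable 3155.98 443.947
11 Intermediate 1779.66 267.906
12 Intermediate 2071.64 190.962
13 Intermediate 2107.06 574.362
14 Intermediate 4554.63 76.4624
15 Intermediate 2885.07 236.034
16 Intermediate 4400.48 69.2131
17 Intermediate 3128.83 421.91
18 Intermediate 2117.58 494.947
19 Intermediate 2197.06 623.131
20 Intermediate 2214.11 681.548
21 Poor 4064.86 231.687
22 Poor 3089.12 309.805
23 Poor 2309.83 553.167
24 Poor 3132.55 238.842
25 Poor 2804 282.656
26 Poor 2719.42 714.62
27 Poor 4029.91 277.442
28 Poor 3562.57 238.041
29 Poor 3688.88 256.918
30 Poor 3881.81 379.808")

# Within-sample log2 ratio of the two transcripts (assumption: log scale,
# not the raw ratio used in the post).
salmon.counts$log2_ratio <- log2(salmon.counts$ENST00000264657 /
                                 salmon.counts$ENST00000585517)

fav  <- salmon.counts$log2_ratio[salmon.counts$condition == "Favorable"]
poor <- salmon.counts$log2_ratio[salmon.counts$condition == "Poor"]

# t.test() runs the Welch (unequal variance) version by default.
res <- t.test(fav, poor)
print(res)
```

Replacing `"Poor"` with `"Intermediate"` gives the other pairwise comparison. Running several pairwise tests would call for a multiple-testing correction, for example with `p.adjust()`.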

salmon transcripts significance counts • 1.5k views

ADD COMMENT • link 8.4 years ago Assa Yeroslaviz ★ 1.5k

What are the values in your table, and how were they calculated?

ADD REPLY • link 8.4 years ago chris86 ▴ 420

The values are the results of the salmon analysis for each of the two transcripts (=counts, TPM).

ADD REPLY • link 8.4 years ago Assa Yeroslaviz ★ 1.5k

I would use TPM values from RSEM or eXpress and just do a t-test. I haven't seen many comparisons of two different genes before; there may be better methods out there, and I would google for that just in case.

ADD REPLY • link 8.4 years ago chris86 ▴ 420

score 0 · Answer 1 · 2016-07-14

This is what I was thinking about. I have the TPMs from salmon and/or kallisto, both inspired by eXpress (AFAIK). I did a t-test, but I was wondering whether there is a better, more robust way of analyzing the data.

ADD COMMENT • link 8.4 years ago Assa Yeroslaviz ★ 1.5k
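
One possible reading of "more robust" is a rank-based test, which does not assume normally distributed ratios and is less sensitive to single extreme samples. The sketch below is only an illustration of that idea, not a method proposed in the thread. It reuses the values from the table in the question, and the variable names (`enst264657`, `enst585517`, `rank_test`, `all_groups`) are made up for the example. Rank tests give the same result for the raw ratio and the log ratio, so no transformation is needed here.

```r
# Values copied from the table in the question, in sample order 1..30.
enst264657 <- c(2505.73, 2687.13, 3026.95, 2381.98, 2967.1,
                2354.14, 2836.7, 2995.65, 2821, 3155.98,
                1779.66, 2071.64, 2107.06, 4554.63, 2885.07,
                4400.48, 3128.83, 2117.58, 2197.06, 2214.11,
                4064.86, 3089.12, 2309.83, 3132.55, 2804,
                2719.42, 4029.91, 3562.57, 3688.88, 3881.81)
enst585517 <- c(373.75, 324.901, 533.415, 325.676, 547.158,
                443.844, 575.74, 331.224, 477.53, 443.947,
                267.906, 190.962, 574.362, 76.4624, 236.034,
                69.2131, 421.91, 494.947, 623.131, 681.548,
                231.687, 309.805, 553.167, 238.842, 282.656,
                714.62, 277.442, 238.041, 256.918, 379.808)
condition <- rep(c("Favorable", "Intermediate", "Poor"), each = 10)

# Within-sample ratio of the two transcripts, as in the original post.
ratio <- enst264657 / enst585517

# Wilcoxon rank-sum test for one pair of outcomes.
rank_test <- wilcox.test(ratio[condition == "Favorable"],
                         ratio[condition == "Poor"])

# Kruskal-Wallis test across all three outcomes at once.
all_groups <- kruskal.test(x = ratio, g = factor(condition))

print(rank_test)
print(all_groups)
```

Both tests treat the salmon estimates as exact values, so they do not reflect the uncertainty of the quantification itself.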