Question: comparing transcripts expression between conditions
2.4 years ago by
Assa Yeroslaviz1.4k
Munich, Germany
Assa Yeroslaviz1.4k wrote:

hello,

I'm analyzing an RNA-seq data set with three different outcomes (favorable, intermediate and poor; no control). I am specifically interested in the expression of certain transcripts (e.g. stat3 transcripts).

I would like to show that the expression (= read counts) of stat3a relative to stat3b differs significantly between two outcomes.

After trying DEXSeq and cuffdiff, which only give me the comparison of a specific transcript with itself between two conditions, I decided to try and do a t-test on the results from the salmon quantification run.

I have used salmon to quantify my data using the quasi-alignment method and extracted the results for my two transcripts.

I then read them into R and did a t-test to see if the difference is significant.

library(readr)
salmon.counts <- read_tsv("stat3.samples.Counts.txt")
salmon.counts$ratio <- salmon.counts$ENST00000264657 / salmon.counts$ENST00000585517
t.test(subset(salmon.counts, condition == 'Favorable')$ratio, subset(salmon.counts, condition == 'Poor')$ratio)

The results I get for this test show significance:

Welch Two Sample t-test

data: subset(tst.pilot, outcome == "Poor")$ratio and subset(tst.pilot, outcome == "Intermediate")$ratio
t = -0.85552, df = 5.1434, p-value = 0.4303
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.2329766  0.1158939
sample estimates:
mean of x mean of y
0.1984490 0.2569904

I was wondering whether this approach is statistically robust. If not, is there a better way of analyzing the data?

thanks in advance for any comment or suggestion.

Assa

the quantified table from the salmon output:

sampleID    condition    ENST00000264657    ENST00000585517
1    Favorable    2505.73    373.75
2    Favorable    2687.13    324.901
3    Favorable    3026.95    533.415
4    Favorable    2381.98    325.676
5    Favorable    2967.1    547.158
6    Favorable    2354.14    443.844
7    Favorable    2836.7    575.74
8    Favorable    2995.65    331.224
9    Favorable    2821    477.53
10    Favorable    3155.98    443.947
11    Intermediate    1779.66    267.906
12    Intermediate    2071.64    190.962
13    Intermediate    2107.06    574.362
14    Intermediate    4554.63    76.4624
15    Intermediate    2885.07    236.034
16    Intermediate    4400.48    69.2131
17    Intermediate    3128.83    421.91
18    Intermediate    2117.58    494.947
19    Intermediate    2197.06    623.131
20    Intermediate    2214.11    681.548
21    Poor    4064.86    231.687
22    Poor    3089.12    309.805
23    Poor    2309.83    553.167
24    Poor    3132.55    238.842
25    Poor    2804    282.656
26    Poor    2719.42    714.62
27    Poor    4029.91    277.442
28    Poor    3562.57    238.041
29    Poor    3688.88    256.918
30    Poor    3881.81    379.808
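
The ratio test described in the question can be reproduced directly from the table above. What follows is only a minimal sketch, not the poster's original script: the Favorable and Poor rows are typed in by hand, the data frame name `stat3` is made up for illustration, and the test is run on the log2 ratio instead of the raw ratio. Using the log ratio is an assumption added here (log ratios treat "twice as much" and "half as much" symmetrically); the thread itself uses the raw ratio.

```r
# Salmon values copied from the table above (Favorable and Poor rows only).
# `stat3` is an illustrative name, not an object from the original post.
stat3 <- data.frame(
  condition = rep(c("Favorable", "Poor"), each = 10),
  ENST00000264657 = c(2505.73, 2687.13, 3026.95, 2381.98, 2967.1,
                      2354.14, 2836.7, 2995.65, 2821, 3155.98,
                      4064.86, 3089.12, 2309.83, 3132.55, 2804,
                      2719.42, 4029.91, 3562.57, 3688.88, 3881.81),
  ENST00000585517 = c(373.75, 324.901, 533.415, 325.676, 547.158,
                      443.844, 575.74, 331.224, 477.53, 443.947,
                      231.687, 309.805, 553.167, 238.842, 282.656,
                      714.62, 277.442, 238.041, 256.918, 379.808)
)

# Per-sample log2 ratio of the two transcripts (assumption: log scale)
stat3$log_ratio <- log2(stat3$ENST00000264657 / stat3$ENST00000585517)

# Welch two-sample t-test; t.test() does not assume equal variances by default
res <- t.test(log_ratio ~ condition, data = stat3)
print(res)
```

Because each ratio is computed within a sample, sample-level scaling differences (such as sequencing depth) cancel out of the comparison, which is part of why a per-sample ratio is a reasonable quantity to test here.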


What are the values in your table, and how were they calculated?

the values are the results of the salmon analysis for each of the two transcripts (=counts, TPM)


I would use TPM values from RSEM or eXpress and just do a t-test. I haven't seen many comparisons of two different genes before; there may be better methods out there, and I would google for that just in case.

Assa Yeroslaviz1.4k wrote:

This is what I was thinking about. I have the TPMs from salmon and/or kallisto, both inspired by eXpress (AFAIK). I did a t-test, but I was wondering whether there is a better, more robust way of analyzing the data.
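
One commonly used option when the normality assumption behind a t-test is in doubt is a rank-based test. The sketch below is offered only as a possibility, not as something proposed in this thread: it applies the Wilcoxon rank-sum test (`wilcox.test` in base R) to the per-sample log2 ratios of the Favorable and Poor samples, using the values from the table in the question. The vector names are made up for illustration.

```r
# Salmon values for the two transcripts, copied from the table in the question.
# Suffix _a = ENST00000264657, suffix _b = ENST00000585517 (illustrative names).
fav_a  <- c(2505.73, 2687.13, 3026.95, 2381.98, 2967.1,
            2354.14, 2836.7, 2995.65, 2821, 3155.98)
fav_b  <- c(373.75, 324.901, 533.415, 325.676, 547.158,
            443.844, 575.74, 331.224, 477.53, 443.947)
poor_a <- c(4064.86, 3089.12, 2309.83, 3132.55, 2804,
            2719.42, 4029.91, 3562.57, 3688.88, 3881.81)
poor_b <- c(231.687, 309.805, 553.167, 238.842, 282.656,
            714.62, 277.442, 238.041, 256.918, 379.808)

# Per-sample log2 ratios for each outcome group
fav_lr  <- log2(fav_a / fav_b)
poor_lr <- log2(poor_a / poor_b)

# Wilcoxon rank-sum (Mann-Whitney) test: compares the two groups using ranks only
wt <- wilcox.test(fav_lr, poor_lr)
print(wt)
```

Since the test depends only on ranks, it gives the same result on raw ratios as on log ratios, and a few extreme samples have less influence than they would in a t-test. The trade-off is lower power when the data really are close to normal.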