Question

Ballgown on StringTie output reports transcripts with zero expression in both conditions as differentially expressed

alva.james • 0

@alvajames-6967

Last seen 6.6 years ago

Germany

Hello All,

I ran Ballgown on the results of a STAR -> StringTie pipeline and used stattest to get differentially expressed transcripts. However, when I take the median within each condition, several transcripts have a median of 0 in both conditions and still have a p-value below 0.05.

These were therefore classified as DE transcripts. I would like to obtain differentially expressed transcripts from the StringTie results, and from the GitHub explanation I understood that stattest does this. I am wondering how it works: does it take fold change into account, as DESeq, edgeR, etc. do?

ballgown stringtie • 2.1k views

ADD COMMENT • link updated 8.6 years ago by Jeff Leek ▴ 650 • written 8.6 years ago by alva.james • 0

score 0 · Answer 1 · 2016-07-18

Jeff Leek ▴ 650

@jeff-leek-5015

Last seen 4.0 years ago

United States

This seems a little unusual - but we have seen it happen when a transcript has zero expression in one group and moderate expression in another, resulting in a relatively strong differential expression signal but a median expression of zero. I wonder if you could look and see what the expression values were for that transcript across samples?

Jeff
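
One way to carry out the check suggested above is to tabulate the per-sample values, the per-group medians, and a flag for transcripts that are zero everywhere. The following is a minimal base-R sketch, not code from the thread: the matrix `fpkm`, the transcript names `tx_a` to `tx_c`, and all numeric values are made up for illustration (only the sample names are taken from the thread). With a real ballgown object, the matrix could presumably be obtained with `texpr(bg, "FPKM")`.

```r
# Hypothetical FPKM matrix: rows are transcripts, columns are samples.
# All values are invented for illustration.
fpkm <- rbind(
  tx_a = c(0, 0, 0, 0, 0, 0, 0, 0),                 # zero in every sample
  tx_b = c(0, 5.2, 0, 4.8, 0, 6.1, 0, 5.5),         # zero in ID, moderate in REL
  tx_c = c(3.1, 2.9, 3.4, 3.0, 2.8, 3.3, 3.2, 3.1)  # similar in both groups
)
colnames(fpkm) <- c("AE02_ID", "AE02_REL", "AE04_ID", "AE04_REL",
                    "AE05_ID", "AE05_REL", "AE10_ID", "AE10_REL")

# Derive the group label from the part of the sample name after "_".
group <- sub("^.*_", "", colnames(fpkm))

# Median FPKM per transcript within each group (columns: ID, REL).
group_medians <- t(apply(fpkm, 1, function(x) tapply(x, group, median)))

# Flag transcripts that are zero in every sample; a small p-value for
# these would be unexpected and worth investigating.
all_zero <- rowSums(fpkm != 0) == 0

print(group_medians)
print(all_zero)
```

Looking at the per-group medians next to the `all_zero` flag separates the case described above (zero in one group, expressed in the other) from transcripts that are truly zero in all samples.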

ADD COMMENT • link 8.6 years ago Jeff Leek ▴ 650

"I wonder if you could look and see what the expression values were for that transcript across samples?"

The expression values are also zero across all samples for the transcripts that are identified as significantly DE.

ADD REPLY • link 8.6 years ago alva.james • 0

It seems very strange that they have entirely zero values but a small p-value. This is just a simple linear model in Ballgown. Can you please post your data/code so I can try to assist?

Jeff
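
To illustrate what a "simple linear model" test looks like for a single transcript, here is a minimal base-R sketch. It assumes the test compares two nested linear models on log2(FPKM + 1) with an F-test; it leaves out any library-size adjustment and is not Ballgown's actual implementation. The FPKM values are hypothetical.

```r
# Hypothetical FPKM values for one transcript across eight samples.
fpkm  <- c(0, 5.2, 0, 4.8, 0, 6.1, 0, 5.5)
group <- factor(rep(c("ID", "REL"), times = 4))

# Work on the log2(FPKM + 1) scale (an assumption of this sketch).
y <- log2(fpkm + 1)

# Null model: intercept only. Alternative model: intercept plus group.
fit_null <- lm(y ~ 1)
fit_alt  <- lm(y ~ group)

# F-test comparing the two nested models.
f_test  <- anova(fit_null, fit_alt)
p_value <- f_test[["Pr(>F)"]][2]
print(p_value)

# For a transcript that is zero in every sample, the model fits exactly,
# so there is no residual variation to test against.
y_zero   <- rep(0, 8)
rss_zero <- sum(residuals(lm(y_zero ~ group))^2)
print(rss_zero)
```

In this sketch a transcript that is zero in one group and moderately expressed in the other gives a small p-value, while an all-zero transcript leaves a residual sum of squares of zero, so the F statistic is not well defined. That is why a small p-value for all-zero transcripts looks suspicious.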

ADD REPLY • link 8.6 years ago Jeff Leek ▴ 650

The code is just what I followed from GitHub:

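library(ballgown)
library(stringr)

# Start from the sorted sample names
# Note: the rows of pData should be in the same order as sampleNames(bg)
pData(bg) <- data.frame(id = sort(sampleNames(bg)), group = sort(sampleNames(bg)))
# Split each name at "_" and keep the second part (ID or REL) as the group
pData(bg) <- cbind(pData(bg), as.data.frame(str_split_fixed(pData(bg)$group, "_", 2)))
pData(bg)$group <- NULL
pData(bg)$V1 <- NULL
colnames(pData(bg)) <- c("id", "group")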

head(pData(bg))

         id group
1   AE02_ID    ID
2  AE02_REL   REL
3   AE04_ID    ID
4  AE04_REL   REL
5   AE05_ID    ID
6  AE05_REL   REL
7   AE10_ID    ID
8  AE10_REL   REL
# Here I just replaced ID and REL with 1 and 0 to match the GitHub explanation

pData(bg)$group <- str_replace_all(pData(bg)$group, "ID", "1")
pData(bg)$group <- str_replace_all(pData(bg)$group, "REL", "0")

stat_results <- stattest(bg, feature = 'transcript', meas = 'FPKM', covariate = 'group')
head(stat_results)
      feature id      pval      qval
6  transcript  6 0.3325078 0.8064427
11 transcript 11 0.8350343 0.9564246
17 transcript 17 0.2149321 0.8064427
19 transcript 19 0.3622309 0.8064427
20 transcript 20 0.8265769 0.9538413
21 transcript 21 0.1647989 0.8064427

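# Then I filtered and annotated the resulting data frame

stat_results_filtered <- stat_results[stat_results[["pval"]] <= 0.05, ]

# stat_results names the transcript column 'id'; t_id is assumed to be the key in transcript_data_frame
results_withFPKM <- merge(stat_results_filtered, transcript_data_frame, by.x = 'id', by.y = 't_id')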

ADD REPLY • link 8.6 years ago alva.james • 0