DESeq analysis


Guest User ★ 13k

@guest-user-4897

Last seen 9.6 years ago

Hi all,

I am doing some RNA-seq analysis with DESeq. I have applied nbinomTest to my dataset, which I know has many differentially expressed genes, but the first problem is that the values in the "padj" column are almost all NA, and sometimes 1. Also, when I take a slice of my data frame, the result is not meaningful to me.

-- output of sessionInfo():

    > res <- nbinomTest(cds, "Male", "Female")
    > head(res)
                   id   baseMean baseMeanA  baseMeanB foldChange log2FoldChange      pval padj
    1 ENSG00000000003  0.1130534  0.000000  0.2261067        Inf            Inf 0.9959638    1
    2 ENSG00000000005  0.0000000  0.000000  0.0000000        NaN            NaN        NA   NA
    3 ENSG00000000419 14.3767155 17.162610 11.5908205  0.6753530     -0.5662863 0.3208560    1
    4 ENSG00000000457 17.0174761 15.342800 18.6921526  1.2183013      0.2848710 0.5942512    1
    5 ENSG00000000460  3.9414822  2.855099  5.0278659  1.7610131      0.8164056 0.4840607    1
    6 ENSG00000000938 16.0894945 18.350117 13.8288718  0.7536122     -0.4081058 0.5409953    1

    > res1 <- res[res$padj<0.1,]
    > head(res1)
           id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj
    NA   <NA>       NA        NA        NA         NA             NA   NA   NA
    NA.1 <NA>       NA        NA        NA         NA             NA   NA   NA
    NA.2 <NA>       NA        NA        NA         NA             NA   NA   NA
    NA.3 <NA>       NA        NA        NA         NA             NA   NA   NA
    NA.4 <NA>       NA        NA        NA         NA             NA   NA   NA
    NA.5 <NA>       NA        NA        NA         NA             NA   NA   NA

My first question: although I know there are some differentially expressed genes in my data, why are all the padj values NA or 1? My second question is about "NA.1", "NA.2", ..., which appear as the row names of the object "res1" instead of the gene names.

Thank you so much
Regards

--
Sent via the guest posting facility at bioconductor.org.

DESeq DESeq • 1.7k views

updated 11.8 years ago by Wolfgang Huber ★ 13k • written 11.8 years ago by Guest User ★ 13k


Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 18 days ago

EMBL European Molecular Biology Laborat…

Dear Narges,

thank you for the feedback.

Your second question is easy: use the idiom res1 <- subset(res, padj<0.1) instead; this avoids the creation of rows full of NA whenever res$padj is NA. Alternatively, res[order(res$padj)[1:n], ], with 'n' your favourite lucky number, might be useful. Have a look at the R-intro manual for more on subsetting of arrays and data frames in R.

Your first question: can you show us the data for the genes that you know are differentially expressed? Then it might become more apparent why DESeq / nbinomTest did not agree. Also, what does the dispersion plot for cds look like? (This is the plot produced by plotDispEsts in the vignette.)

Best wishes
Wolfgang

narges [guest] wrote on 06/26/2012 06:17 PM:
> Hi all
>
> I am doing some RNA seq analysis with DESeq. I have applied the nbinomTest to my dataset which I know have many differentially expressed genes but the first problem is that the result values for "padj" column is almost NA and sometimes 1.
> [...]

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
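The subsetting behaviour described in the answer above can be reproduced without DESeq. The following is a minimal sketch in base R; the data frame is a made-up stand-in for the nbinomTest result (the gene names and p-values are invented for illustration, not taken from the poster's data).

```r
# Toy stand-in for an nbinomTest() result. All values are hypothetical.
res <- data.frame(
  id   = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  pval = c(0.001, NA, 0.20, 0.004, NA),
  padj = c(0.005, NA, 0.40, 0.010, NA),
  stringsAsFactors = FALSE
)

# Logical indexing: every NA in the condition yields a row filled with NA,
# and R labels those rows "NA", "NA.1", ...
bad <- res[res$padj < 0.1, ]
print(bad)

# subset() treats NA in the condition as FALSE, so those rows are dropped.
good <- subset(res, padj < 0.1)
print(good)

# which() returns only the positions where the condition is TRUE,
# so bracket indexing then gives the same rows as subset().
same <- res[which(res$padj < 0.1), ]

# Top n genes by adjusted p-value; order() sorts NA values last by default.
n <- 2
top <- res[order(res$padj)[1:n], ]
print(top)
```

If n is larger than the number of non-NA padj values, the order() variant will include rows whose padj is NA, so it may be worth checking sum(!is.na(res$padj)) first.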

11.8 years ago • Wolfgang Huber ★ 13k
