Question: GFOLD file as input for DESeq2
4.6 years ago by
United States
alantb_cederj0 wrote:

I am trying to do some analysis with DESeq2. I can run the analysis and get my results. However, I am concerned about the table I have been using to input the read count data into DESeq2. The read counting was done by another person using GFOLD. The GFOLD output is a table with Gene Symbol, Gene Name, Read Count, Exon Length, and RPKM. I deleted the unwanted columns and created a .txt file with Gene Symbol and Read Count to use as input to DESeq2. Is this a proper way to run the DESeq2 analysis, or should I count the reads again using one of the packages suggested in the "Beginner's guide to using the DESeq2 package"?

Thanks a lot.

Answer: GFOLD file as input for DESeq2
4.6 years ago by
Michael Love
United States
Michael Love wrote:

These files should be fine if you want to hack it together in R. Make sure that the gene symbols are in the same order in each file, and that the colData sample information matches the columns of your count matrix.
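The two checks above can be sketched in R roughly as follows. This is only an illustration under assumptions: the per-sample files are taken to be tab-separated with two columns and no header, and the file names, gene symbols, and counts are invented toy values. The final step uses DESeqDataSetFromMatrix and is skipped if DESeq2 is not installed.

```r
# Toy per-sample count files (hypothetical names and values): two
# tab-separated columns with no header, gene symbol and read count.
tmp <- tempfile("counts_")
dir.create(tmp)
toy <- list(
  sample1 = c(GeneA = 10L, GeneB = 0L, GeneC = 25L),
  sample2 = c(GeneA = 12L, GeneB = 1L, GeneC = 30L),
  sample3 = c(GeneA = 50L, GeneB = 2L, GeneC = 40L),
  sample4 = c(GeneA = 45L, GeneB = 0L, GeneC = 38L)
)
for (s in names(toy)) {
  write.table(data.frame(gene = names(toy[[s]]), count = unname(toy[[s]])),
              file.path(tmp, paste0(s, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
}

# Sample table: one row per file, in the order the columns should appear.
sample_table <- data.frame(
  sampleName = names(toy),
  fileName   = paste0(names(toy), ".txt"),
  condition  = factor(c("untreated", "untreated", "treated", "treated"),
                      levels = c("untreated", "treated")),
  stringsAsFactors = FALSE
)

# Read every file into a two-column data frame.
per_sample <- lapply(file.path(tmp, sample_table$fileName), read.delim,
                     header = FALSE, col.names = c("gene", "count"),
                     stringsAsFactors = FALSE)

# Check 1: every file lists the same gene symbols in the same order.
ref_genes <- per_sample[[1]]$gene
same_order <- vapply(per_sample,
                     function(d) identical(d$gene, ref_genes), logical(1))
stopifnot(all(same_order))

# Bind the count columns into a genes x samples matrix.
counts <- do.call(cbind, lapply(per_sample, `[[`, "count"))
rownames(counts) <- ref_genes
colnames(counts) <- sample_table$sampleName

# Check 2: colData rows match the count matrix columns, in the same order.
coldata <- data.frame(condition = sample_table$condition,
                      row.names = sample_table$sampleName)
stopifnot(identical(rownames(coldata), colnames(counts)))

# Build the DESeq2 object only if the package is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = coldata,
                                        design = ~ condition)
}
```

If either stopifnot() call fails, the files or the sample table are out of step and should be fixed before running the analysis.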

If you want to use the R packages mentioned in the beginner's guide, they are quite easy as well. I recommend summarizeOverlaps from the GenomicAlignments package or featureCounts from the Rsubread package.

Thanks a lot Michael

Answer: GFOLD file as input for DESeq2
4.6 years ago by
Dario Strbenac
Australia
Dario Strbenac wrote:

If you have no biological replicates, then it is not worth the effort of reformatting the data and using DESeq2 to analyse it. Simply report the results of GFOLD.

Answer: GFOLD file as input for DESeq2
4.6 years ago by
United States
alantb_cederj0 wrote:

Hi Dario, I have 2 replicates for each treatment.

Answer: GFOLD file as input for DESeq2
4.6 years ago by
United States
alantb_cederj0 wrote:

I really do not have any experience with R. I am basically going through some tutorials and reading the manuals and papers on this subject, and it has been hard for me to understand all the steps of the analysis.

I have a simple experiment with control and exposure. For each treatment I have 2 replicates. As I already mentioned, I got the GFOLD output, which is a table with 5 columns: Gene Symbol, Gene Name, Read Count, Exon Length, and RPKM. I deleted the unwanted columns and created a new .txt file with only Gene Symbol and Read Count. This table has NO headers.
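For reference, the column-trimming step described above could also be done in R rather than by hand, which makes it repeatable for each sample. The sketch below assumes the GFOLD table is tab-separated with no header and has the five columns in the order listed; the toy rows, the column labels, and the temporary file names are made up for illustration.

```r
# Toy stand-in for one GFOLD output table (invented values), with the
# columns: gene symbol, gene name, read count, exon length, RPKM.
gfold_file <- tempfile(fileext = ".txt")
writeLines(c(
  "GeneA\tGeneA\t10\t1500\t1.2",
  "GeneB\tGeneB\t0\t900\t0",
  "GeneC\tGeneC\t25\t2100\t2.4"
), gfold_file)

# Read the table; the column labels are only for use inside R.
gfold <- read.delim(gfold_file, header = FALSE,
                    col.names = c("symbol", "name", "count",
                                  "exon_length", "rpkm"),
                    stringsAsFactors = FALSE)

# Keep gene symbol and read count only.
two_col <- gfold[, c("symbol", "count")]

# Write a headerless, tab-separated, two-column file.
out_file <- tempfile(fileext = ".txt")
write.table(two_col, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Running the same few lines on each sample's file avoids copy-and-paste slips between samples.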

I also created a sample table, which is a .txt file with 3 columns: sample name, file name, and condition.

Following this tutorial "http://dwheelerau.com/2014/02/17/how-to-use-deseq2-to-analyse-rnaseq-data/"  I got this script:

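    library("DESeq2")

    # Folder holding the per-sample count files
    directory <- "/Users/Alan/Documents/NGS_Data"

    # One count file per sample, listed in the same order as the conditions
    sampleFiles <- c("sample1.csv", "sample2.csv", "sample3.csv", "sample4.csv")
    sampleCondition <- c("untreated", "untreated", "treated", "treated")
    sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)

    # Build the data set from the count files
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)
    ddsHTSeq

    # Make "untreated" the reference level of the condition factor
    colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition, levels = c("untreated", "treated"))

    # Fit the model and extract the results table
    dds <- DESeq(ddsHTSeq)
    res <- results(dds)

    # Draw the MA plot, then copy it from the screen device to a PNG file
    plotMA(dds, ylim = c(-2, 2), main = "DESeq2")
    dev.copy(png, "deseq2_MAplot.png")
    dev.off()

    # Show what each results column means, then save the table
    mcols(res, use.names = TRUE)
    write.csv(as.data.frame(res), file = "results_deseq2.csv")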

It runs fine and I can get the table. The big question is: is this a proper way to analyze my data? Because I have no experience with DESeq2 and very little knowledge of R, I am concerned about making errors, being unable to detect them, and generating a false result. Another thing that bugs me is the fact that there are a few different ways to generate count tables for input into R, and I do not know if "DESeqDataSetFromHTSeqCount" is the best call for me.

Alan

Hi Alan,

Nevertheless, DESeqDataSetFromHTSeqCount works on the output from htseq-count, which is a set of files, one for each sample, each with two columns: gene ID and count. So if that's what you've created, then it should work for you.
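One way to confirm that a hand-made file really has this two-column shape is a small sanity check like the one below. This is a rough sketch, not part of DESeq2: check_count_file is a hypothetical helper, the toy files are invented, and the three rules it applies (exactly two columns, non-negative whole-number counts, no repeated gene IDs) are an assumed reading of what "gene ID and count" should look like.

```r
# Hypothetical helper: returns a character vector describing problems
# found in one per-sample count file (empty if none were found).
check_count_file <- function(path) {
  d <- read.table(path, header = FALSE, stringsAsFactors = FALSE, fill = TRUE)
  problems <- character(0)
  if (ncol(d) != 2) {
    problems <- c(problems,
                  sprintf("expected 2 columns, found %d", ncol(d)))
  } else {
    cnt <- d[[2]]
    # A stray header row or text in the count column makes it non-numeric.
    if (!is.numeric(cnt) || any(is.na(cnt)) ||
        any(cnt < 0) || any(cnt != round(cnt))) {
      problems <- c(problems, "counts are not non-negative whole numbers")
    }
    # Repeated gene symbols would give ambiguous rows.
    if (anyDuplicated(d[[1]]) > 0) {
      problems <- c(problems, "duplicated gene IDs")
    }
  }
  problems
}

# Toy files (invented values): one well-formed, one with two problems.
good_file <- tempfile(fileext = ".txt")
writeLines(c("GeneA\t10", "GeneB\t0", "GeneC\t25"), good_file)

bad_file <- tempfile(fileext = ".txt")
writeLines(c("GeneA\t10", "GeneA\t3", "GeneB\t2.5"), bad_file)

good_result <- check_count_file(good_file)
bad_result  <- check_count_file(bad_file)
print(bad_result)
```

Running the helper over all four sample files before building the data set is a cheap guard against editing mistakes made while trimming the GFOLD tables.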

It seems I am good to go... Wonderful..

Thanks a lot.

Alan