Question: gene symbols in a RangedSummarizedExperiment object and adding a GRangesList to a DESeqDataSet object in DESeq2
2.7 years ago, layal8785 wrote:

I hope I can get a clear answer; I only came here to ask after struggling with this for a few days.

1) I wanted to get FPKM values from the DESeqDataSet object I created with DESeqDataSetFromMatrix, but the column basepairs is not present in mcols(dds) (the mcols of the DESeqDataSet object after running DESeq()), so the feature lengths have to be calculated from the rowRanges of the dds object. However, the object has no GRangesList information.

So the question is how to get the GRangesList.

2) I obtained the GRangesList using the GenomicFeatures package:

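library(GenomicFeatures)    # makeTxDbFromGFF(), exonsBy()
library(GenomicAlignments)  # summarizeOverlaps()

gtffile <- file.path("PATH/Mus_musculus.GRCm38.86.gtf")
# Build a transcript database from the Ensembl GTF
txdb <- makeTxDbFromGFF(gtffile, format="gtf", circ_seqs=character())
# Exons grouped by gene: a GRangesList named by Ensembl gene ID
ebg <- exonsBy(txdb, by="gene")
# bamfiles (the BAM files to count) is defined elsewhere and not shown here
se <- summarizeOverlaps(features=ebg, reads=bamfiles,
                        mode="Union",
                        singleEnd=FALSE,
                        ignore.strand=TRUE,
                        fragments=TRUE)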

How can I get gene symbols instead of Ensembl IDs?

I tried setting use.names = TRUE in exonsBy(), but use.names cannot be set to TRUE when grouping by "gene". I also tried grouping by transcript to get the transcript names, but that didn't work.

summarizeOverlaps() has no argument that could be used to get the gene symbol.

Now that I have the GRangesList (ebg), how do I add it to the dds object I already have, as recommended in the reference manual?

https://bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf

" If a user wants to store metadata columns about the rows of the countData, but does not have GRanges or GRangesList information, first construct the DESeqDataSet without rowRanges and then add the DataFrame with mcols(dds)"

Modified 2.6 years ago; written 2.7 years ago by layal8785

I created the count matrix with featureCounts:

FC_countreads <- featureCounts(files="pathes to 12bam files",
                               isPairedEnd=TRUE, annot.ext= "pathsToGeneAnnotation/Mus_musculus.GRCm38.86.gtf",
                               isGTFAnnotationFile=TRUE, GTF.featureType="exon", GTF.attrType="gene_name")

countdata <- FC_countreads$counts

#Set new names to countdata matrix

New_names <- c("DayZero_R1","DayZero_R2","DayZero_R3" , "DayThree_R1","DayThree_R2","DayThree_R3","DaySix_R1","DaySix_R2","DaySix_R3", "DayTwelve_R1", "DayTwelve_R2", "DayTwelve_R3")
colnames(countdata) <- New_names


condition <- factor(c(rep("DayZero", 3),rep("DayThree",3),rep("DaySix",3),rep("DayTwelve",3)))
coldata <- data.frame(row.names = colnames(countdata), condition)
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ condition)

Reply written 2.6 years ago by layal8785
Answer: genes symbol in RangedSummarizedExperiment object and add GRangesList to DESeqDa
2.7 years ago, Michael Love wrote:

Let's take a step back: how did you generate the count matrix you used to create the dds object?

Written 2.7 years ago by Michael Love

The output from featureCounts should have a list element called annotation with information about each gene's length. See the help page: ?featureCounts.
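
A minimal sketch of how that could look, using a small toy list in place of the real featureCounts() result. The counts, gene names, sample names and lengths below are made up for illustration, and the toy annotation table carries only the GeneID and Length columns; with the objects from this thread, fc would be FC_countreads. The sketch relies on fpkm() taking the feature lengths from mcols(dds)$basepairs when that column is present, as described in ?fpkm.

```r
library(DESeq2)

# Toy stand-in for a featureCounts() result (hypothetical values);
# in this thread the equivalent object would be FC_countreads.
fc <- list(
  counts = matrix(c(10L, 200L, 35L, 12L, 180L, 40L,
                    30L, 90L, 5L, 28L, 95L, 8L),
                  nrow = 3,
                  dimnames = list(c("GeneA", "GeneB", "GeneC"),
                                  c("s1", "s2", "s3", "s4"))),
  annotation = data.frame(GeneID = c("GeneA", "GeneB", "GeneC"),
                          Length = c(1500L, 3000L, 800L))
)

coldata <- data.frame(row.names = colnames(fc$counts),
                      condition = factor(c("a", "a", "b", "b")))
dds <- DESeqDataSetFromMatrix(countData = fc$counts,
                              colData = coldata,
                              design = ~ condition)

# Align the annotation rows to the rows of dds before copying the lengths.
idx <- match(rownames(dds), fc$annotation$GeneID)
stopifnot(!anyNA(idx))

# Store the gene lengths where fpkm() looks for them.
mcols(dds)$basepairs <- fc$annotation$Length[idx]

# FPKM matrix: one row per gene, one column per sample.
fpkm_mat <- fpkm(dds)
```

The match() step is there because nothing guarantees that the annotation table and the count matrix stay in the same row order once either has been filtered or reordered.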

Reply written 2.6 years ago by Michael Love