Question

GOSeq analysis in rice

0

Entering edit mode

Jitendra ▴ 10

@nabiyogesh-11718

Last seen 8 months ago

United Kingdom

I got the differentially expressed genes and I can also download the mapping file from biomart for all the rice gene ID like below:

Gene stableID   Transcript stable ID           GO term accession
BGIOSGA013239  BGIOSGA013239-TA                   GO:0009098
BGIOSGA013239  BGIOSGA013239-TA                   GO:0003862
BGIOSGA013239  BGIOSGA013239-TA                   GO:0009082
BGIOSGA013239  BGIOSGA013239-TA                   GO:0016616
BGIOSGA013239  BGIOSGA013239-TA                   GO:0051287

.....................

Goseq code:
d <- read.csv("deseq2res.csv", header=T, row.names=1)
all_genes <- row.names(d)
 DE_genes <- all_genes[d$padj<0.05]

I am not sure how should I proceed further after this? I am not able to understand how should I get the genes.vector and length.vector.names for the below code and then GO_data.frame?

pwf <- nullp(genes.vector,bias.data=length.vector.names)
head(pwf)

# calculate GO enrichment using default method
GO.WALL <- goseq(pwf, gene2cat=GO_data.frame)

I will be thankful for your valuable time and help.

Many thanks nabiyogesh

goseq • 1.2k views

ADD COMMENT • link 4.8 years ago • updated 4.6 years ago Jitendra ▴ 10

score 1 · Answer 1 · 2020-06-22

1

Entering edit mode

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 hours ago

United States

You don't show enough code or data for anybody to be helpful. Anyway, the goseq vignette has several worked examples that use data that come with the package. These examples are intended for you to do yourself, so that you can see what the data for each argument look like. This includes a KeGG analysis that uses a gene2go data.frame. I would recommend you work through those examples, and if you have questions about particular steps, come back with those questions.

ADD COMMENT • link 4.8 years ago James W. MacDonald 68k

0

Entering edit mode

Thank you so much James,

After going through the goseq vignette, I am trying below code but still getting error?

> deseq2_result <- read.table("deseq2.res.root.txt",sep="\t",head=T,row.names=1)
> DEG <- rownames(subset(deseq2_result, padj <0.05 & log2FoldChange>0))                                                                                                 > ALL <- rownames(subset(deseq2_result))
> DEG.vector <- c(t(DEG))
> ALL.vector<-c(t(ALL))
> gene.vector=as.integer(ALL.vector%in%DEG.vector)
> names(gene.vector)=ALL.vector
> head(gene.vector)
BGIOSGA001272 BGIOSGA001400 BGIOSGA002386 BGIOSGA002510 BGIOSGA002743
            0             0             0             1             1
BGIOSGA002784
            1
> txdb = makeTxDbFromGFF("Oryza_indica.ASM465v1.44.chr.gff3")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .extract_exons_from_GRanges(exon_IDX, gr, mcols0, tx_IDX, feature = "exon",  :
  98 exons couldn't be linked to a transcript so were dropped (showing
  only the first 6):
  seqid   start     end strand   ID               Name
1     1 1479179 1479368      + <NA> ENSRNA049495102-E1
2     1 3871367 3871431      - <NA> ENSRNA049495083-E1
3     1 7021193 7021391      - <NA> ENSRNA049495099-E1
4     1 7023628 7023818      - <NA> ENSRNA049495093-E1
5     1 7127068 7127257      + <NA> ENSRNA049495097-E1
6     1 7155225 7155422      - <NA> ENSRNA049495091-E1
                         Parent Parent_type
1 transcript:ENSRNA049495102-T1        <NA>
2 transcript:ENSRNA049495083-T1        <NA>
3 transcript:ENSRNA049495099-T1        <NA>
4 transcript:ENSRNA049495093-T1        <NA>
5 transcript:ENSRNA049495097-T1        <NA>
6 transcript:ENSRNA049495091-T1        <NA>

> txsByGene=transcriptsBy(txdb,"gene")
> length.vector.names=median(width(txsByGene))

> head(length.vector.names)
ABCC13 ABCG36   ACO1   ACT1   ACT3   ADH1
  5655   6943   1132   1547   1725   2894

    > pwf <- nullp(gene.vector,bias.data=length.vector.names)
Error in nullp(gene.vector, bias.data = length.vector.names) :
  bias.data vector must have the same length as DEgenes vector!

Many thanks

ADD REPLY • link 4.6 years ago Jitendra ▴ 10