I am using goseq on some bovine data and I am using Ensembl IDs for bovine genome UMD3.1, which is not supported by goseq, so I had to input lenght and GO information manually. I tried obtaining length information 2 ways: 1) from featCounts files results; 2) from biomaRt.
Both gave me weirdly shaped plots.
Not sure if I'm doing something wrong. The graphs are in totally opposite direction then I expected them to be ( instead of going up and stabilizing as I expected, they are going down).
Below are the codes of how I obtained length info and plots, as well as the plots
1) Using featureCounts results:
featCountsData <- read.table("charolais_145A_TGACCA.featCounts.txt", header=TRUE, comment.char = "#", sep="\t")
featCountsLengthData <- as.numeric(featCountsData[,"Length"])
names(featCountsLengthData) <- featCountsData$Geneid
LenData <- names(geneLengthData) %in% names(genes)
LenData <- geneLengthData[LenData]
pwf <- nullp(DEgenes = genes, bias.data = LenData, plot.fit = TRUE)
2) Using Biomart:
txsData <- makeTxDbFromBiomart(biomart ="ensembl", dataset = "btaurus_gene_ensembl")
txsByGene <- transcriptsBy(txsData,"gene")
geneLengthData <- median(width(txsByGene))
BiomartLenData <- names(geneLengthData) %in% names(genes)
BiomartLenData <- geneLengthData[testLenData]
pwf <- nullp(DEgenes = genes, bias.data = testLenData, plot.fit = TRUE)
PLOT 1: Length info obtained from featureCounts:
PLOT 2: Length info obtained from biomaRt
Any help is appreciated!