countToFPKM and featureLength
Morris • 0
@morris-13226
Last seen 3 days ago
Italy

Hello everyone, I am struggling to get an FPKM matrix using the countToFPKM library. Everything seems to run, but the last step fails with an error. Here is my script:

library("devtools")
library("biomaRt")
library("dplyr")
library(countToFPKM)

ens_build = "sep2015"
dataset="hsapiens_gene_ensembl"
mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL", dataset = dataset, version = 80)

gene.annotations <- biomaRt::getBM(mart = mart, attributes=c("ensembl_gene_id", "external_gene_name",
"start_position", "end_position"))
gene.annotations <- dplyr::transmute(gene.annotations, external_gene_name,  ensembl_gene_id, length = end_position - start_position)
# Convert the column order in gene.annotations and make the first column the row names
# Filter and re-order gene.annotations to match the order in feature counts matrix
gene.annotations <- gene.annotations %>% dplyr::filter(gene.annotations$ensembl_gene_id %in% row.names(file.readcounts))
gene.annotations <- gene.annotations[order(match(gene.annotations$ensembl_gene_id, rownames(file.readcounts))),]
# Assign feature lengths into a numeric vector.
featureLength <- gene.annotations$length


featureLength seems to be fine as a vector of integer values, but at the end I get this error:

fpkm_matrix <- fpkm (file.readcounts, featureLength=featureLength, meanFragmentLength=NULL)

Error in fpkm(file.readcounts, featureLength = featureLength, meanFragmentLength = NULL) :
length(featureLength) == nrow(counts) is not TRUE


I can't understand the error; the match function seems to be working fine...

I would appreciate any help, thank you.
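
For readers hitting the same error: the failing check, `length(featureLength) == nrow(counts)`, only says that the length vector and the count matrix have different sizes. In the script above the annotation table is filtered down to the IDs present in `file.readcounts`, but the count matrix itself is never filtered, so any count row without an annotation (or any duplicated annotation row) leaves the two out of step. Below is a minimal base R sketch of how one might check and align them. The toy `file.readcounts` and `gene.annotations` objects, their IDs and their lengths are made up for illustration and are not the poster's data.

```r
# Toy stand-ins (hypothetical data) for the objects used in the question.
file.readcounts <- matrix(
  c(10, 20, 30, 40, 50, 60), nrow = 3,
  dimnames = list(c("ENSG01", "ENSG02", "ENSG03"), c("s1", "s2"))
)
gene.annotations <- data.frame(
  ensembl_gene_id = c("ENSG03", "ENSG01", "ENSG01"),
  length = c(1500L, 2000L, 2000L),
  stringsAsFactors = FALSE
)

# Count rows that have no annotation at all.
missing_ids <- setdiff(rownames(file.readcounts),
                       gene.annotations$ensembl_gene_id)

# Annotation IDs that occur more than once.
dup_ids <- unique(
  gene.annotations$ensembl_gene_id[duplicated(gene.annotations$ensembl_gene_id)]
)

# Align in one step: match() returns the first annotation row for each
# count row, or NA when there is none.
idx <- match(rownames(file.readcounts), gene.annotations$ensembl_gene_id)
keep <- !is.na(idx)

counts_aligned <- file.readcounts[keep, , drop = FALSE]
featureLength <- gene.annotations$length[idx[keep]]

# This is the condition that fpkm() complained about in the question.
stopifnot(length(featureLength) == nrow(counts_aligned))
```

This only makes the two sizes agree. Whether the lengths themselves are the right ones to use is a separate question, which the replies discuss.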

countToFPKM • 423 views
swbarnes2 ★ 1.2k
@swbarnes2-14086
Last seen 10 hours ago
San Diego

You are pulling gene length out of biomart? Why would you do that? Why do you think intron length matters?


Hi swbarnes2, I was following the script suggested by the author of the countToFPKM library to do that.

Here it is:

https://github.com/AAlhendi1707/countToFPKM/issues/2

He uses BioMart to extract the gene length information.

Thanks


Just because someone posts code doesn't mean it makes sense. Are you really sure that you want gene lengths, and not transcript lengths?


You're right. Transcript length still has introns, though, so maybe it would be better to use exon lengths.

Thanks


Are you sure that just adding up every single exon of a gene is correct?
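
One reason to be cautious about a plain sum: exons shared between isoforms are counted once per isoform. The sketch below contrasts the naive sum with a "union exon" length, where overlapping intervals are merged first. The exon coordinates are hypothetical, and `union_length` is a helper written for this illustration only. In practice a function such as `GenomicRanges::reduce` would typically do the merging. Even the union length is only one convention, not a definitive answer to the question raised here.

```r
# Hypothetical exon coordinates (1-based, inclusive) pooled from two
# isoforms of the same gene; the first exon is shared by both isoforms.
exons <- data.frame(
  start = c(100, 300, 100, 350, 600),
  end   = c(200, 400, 200, 450, 700)
)

# Naive sum: every exon record counts in full, so shared bases are
# counted more than once.
naive_length <- sum(exons$end - exons$start + 1)

# Illustrative helper: merge overlapping (or directly adjacent)
# intervals, then add up the merged widths.
union_length <- function(start, end) {
  o <- order(start)
  start <- start[o]
  end <- end[o]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # Overlaps or touches the current block: extend it.
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current block and start a new one.
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

merged_length <- union_length(exons$start, exons$end)
```

With these toy coordinates the naive sum is larger than the merged length, because the shared first exon and the overlapping middle exons are counted twice.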


What else should I take into consideration? As far as I know, expression analysis is generally based on the processed transcript.


Genes in eukaryotes do not have one single processed transcript per gene. They have many, and they can be different lengths.

What I'm trying to get across to you is that you cannot generate FPKM from gene counts alone.
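
To make that concrete, here is a small sketch using the common definition FPKM = fragments × 10^9 / (feature length in bp × total mapped fragments). It applies the formula to a single gene with three candidate lengths. Every number is invented for illustration. This is the textbook formula, not necessarily the exact computation inside countToFPKM, which may work with an effective length instead.

```r
# Hypothetical inputs: one gene, one sample.
fragments <- 500          # fragments assigned to the gene
total_fragments <- 2e7    # total mapped fragments in the sample

# Three lengths one could plausibly plug in for the same gene (bp).
lengths_bp <- c(
  genomic_span  = 50000,  # end_position - start_position, introns included
  union_exons   = 4000,   # all exons merged
  short_isoform = 1500    # one particular processed transcript
)

# Same count, same library size; only the length changes.
fpkm <- fragments * 1e9 / (lengths_bp * total_fragments)
print(fpkm)
```

The count never changes, yet the resulting FPKM differs by more than an order of magnitude between the genomic span and the short isoform. Which length is appropriate depends on which transcripts were actually expressed, and gene-level counts alone do not carry that information.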


What tools do you suggest to account for all these variables and to get a better analysis? I'm using what the "market" offers.


Since no one knows what your end analysis goal is, no one can help you. All I can tell you is you can't correct for transcript lengths using gene counts.