countToFPKM and featureLength
Morris ▴ 10
@morris-13226
Italy

Hello everyone, I am struggling to get an FPKM matrix using the countToFPKM library. Everything seems to run, but the last step gives an error. Here is my script:

    library("devtools")
    library("biomaRt")
    library("dplyr")
    library(countToFPKM)

    file.readcounts<- as.matrix(read.csv("Matrix.csv", header = TRUE, row.names = 1))
    nrow(file.readcounts)
    ens_build = "sep2015"
    dataset="hsapiens_gene_ensembl"
    mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL", dataset = dataset, version = 80)

    gene.annotations <- biomaRt::getBM(mart = mart, attributes = c("ensembl_gene_id", "external_gene_name",
                                                                   "start_position", "end_position"))
    gene.annotations <- dplyr::transmute(gene.annotations, external_gene_name, ensembl_gene_id, length = end_position - start_position)
    # Convert the column order in gene.annotations and make the first column the row names
    # Filter and re-order gene.annotations to match the order in the feature counts matrix
    gene.annotations <- gene.annotations %>% dplyr::filter(gene.annotations$ensembl_gene_id %in% row.names(file.readcounts))
    gene.annotations <- gene.annotations[order(match(gene.annotations$ensembl_gene_id, rownames(file.readcounts))),]
    # Assign feature lengths into a numeric vector.
    featureLength <- gene.annotations$length

The feature lengths look fine (integer values), but the final step fails with this error:

    fpkm_matrix <- fpkm(file.readcounts, featureLength = featureLength, meanFragmentLength = NULL)

    Error in fpkm(file.readcounts, featureLength = featureLength, meanFragmentLength = NULL) : 
    length(featureLength) == nrow(counts) is not TRUE

I can't understand the error; the match function seems to be working fine...

Any help is appreciated, thank you.
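
The failing check compares `length(featureLength)` with `nrow(counts)`. One possible cause, which the thread itself does not confirm, is that some row names of the count matrix have no entry in the annotation table: the `%in%` filter keeps only annotated IDs, so the length vector ends up shorter than the matrix. A minimal sketch of how this could be checked, using small made-up stand-ins for `file.readcounts` and `gene.annotations` (the IDs and values are hypothetical):

```r
# Toy stand-ins for the objects in the script (hypothetical IDs and values).
file.readcounts <- matrix(
  c(10, 20, 30, 40, 50, 60), nrow = 3,
  dimnames = list(c("ENSG_A", "ENSG_B", "ENSG_C"), c("s1", "s2"))
)
gene.annotations <- data.frame(
  ensembl_gene_id = c("ENSG_C", "ENSG_A"),  # ENSG_B has no annotation
  length = c(3000L, 1000L)
)

# IDs present in the count matrix but absent from the annotation.
missing.ids <- setdiff(rownames(file.readcounts), gene.annotations$ensembl_gene_id)

# Look up lengths in count-matrix order; unannotated genes become NA.
idx <- match(rownames(file.readcounts), gene.annotations$ensembl_gene_id)
featureLength <- gene.annotations$length[idx]

# Keep only the rows with a known length so both inputs line up.
keep <- !is.na(featureLength)
counts.matched <- file.readcounts[keep, , drop = FALSE]
length.matched <- featureLength[keep]
```

If `missing.ids` is non-empty on the real data, the two inputs cannot have the same size until those rows are either dropped or annotated from another source. This only addresses the size mismatch, not whether the lengths themselves are meaningful.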

swbarnes2 ★ 1.3k
@swbarnes2-14086
San Diego

You are pulling gene length out of biomart? Why would you do that? Why do you think intron length matters?

Hi swbarnes2, I was following the script suggested by the author of the countToFPKM library. Here it is:

https://github.com/AAlhendi1707/countToFPKM/issues/2

He uses BioMart to extract the gene length information.

thanks

Just because someone posts code doesn't mean it makes sense. Are you really sure that you want gene lengths, and not transcript lengths?

You're right. Transcript length still has introns, though, so maybe it would be better to use exon lengths.

thanks

Are you sure that just adding up every single exon of a gene is correct?
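
To illustrate the concern: when exons from several isoforms are pooled, overlapping exons get counted more than once if their lengths are simply added. The sketch below uses made-up coordinates and a hypothetical helper, `union_length`, to compare the naive sum with the length of the merged (union) exon set. Neither number is claimed here to be the correct length for FPKM; the point is only that they differ.

```r
# Hypothetical exons of one gene, pooled across two isoforms (1-based, inclusive).
exons <- data.frame(
  start = c(100, 100, 400, 450),
  end   = c(200, 250, 500, 600)
)

# Naive approach: add up the length of every exon record.
naive.sum <- sum(exons$end - exons$start + 1)

# Union approach: merge overlapping intervals first, then add up the blocks.
union_length <- function(start, end) {
  o <- order(start)
  start <- start[o]
  end <- end[o]
  total <- 0
  cur.start <- start[1]
  cur.end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur.end) {
      cur.end <- max(cur.end, end[i])  # overlap: extend the current block
    } else {
      total <- total + (cur.end - cur.start + 1)  # close the finished block
      cur.start <- start[i]
      cur.end <- end[i]
    }
  }
  total + (cur.end - cur.start + 1)
}

union.len <- union_length(exons$start, exons$end)
```

With these toy coordinates the naive sum is larger than the union length, because the overlapping bases are counted twice.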

What else should I take into consideration? As far as I know, expression analysis generally considers the processed transcript.

Genes in eukaryotes do not have one single processed transcript per gene. They have many, and they can be different lengths.

What I'm trying to get across to you is that you cannot generate FPKM from gene counts alone.
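
A small numeric sketch of why this matters, using the textbook FPKM definition (fragments per kilobase of feature per million mapped fragments). The function `fpkm_simple` and all numbers are hypothetical and are not the countToFPKM implementation, which also accounts for mean fragment length.

```r
# Textbook FPKM: fragments per kilobase of feature per million mapped fragments.
fpkm_simple <- function(count, length.bp, total.fragments) {
  count / (length.bp / 1e3) / (total.fragments / 1e6)
}

gene.count <- 500       # hypothetical gene-level count
total.fragments <- 2e7  # hypothetical library size

# The same gene could be expressed mostly as a short or as a long isoform.
isoform.lengths <- c(short = 1000, long = 4000)

fpkm.by.length <- fpkm_simple(gene.count, isoform.lengths, total.fragments)
```

The same gene-level count yields a fourfold difference in FPKM depending on which isoform length is assumed, and the gene count alone does not say which isoforms produced the reads.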

What tools do you suggest to account for all these variables and get a better analysis? I'm using what the "market" offers.

Since no one knows what your end analysis goal is, no one can help you. All I can tell you is you can't correct for transcript lengths using gene counts.
