Search
Question: featureCounts using 3'UTR as genome
0
11 months ago by
joangibert140 wrote:

Hi! I'm new in the bioinformatic world so, maybe, this is commonly known in the comunity but I couldn't find anything in the RSubread package instructions...
I know that featureCounts have an in-built hg38 genome. For most of my purposes it works great but I have some Immunoprecipitation and  RNAseq of a protein that bind to the 3'UTR of the mRNA. I would like to try to compare the hg38 results with a counting only for the 3'UTR. I get the 3'UTR genome from Table Borwser at UCSC (in fasta format) and I specify in featureCounts the option "genome" for my 3'UTR file.
Although fC does not report any error, I couldn't find any differences in the countings when I compared the 3'UTR results with hg38 results. What I am doing wrong?

Here is the command line for fC:
fc <- featureCounts(files=targets\$InputFile,annot.inbuilt="hg38",genome = "hg38_3UTR.fa")

Thanks!!
Joan

modified 11 months ago by James W. MacDonald46k • written 11 months ago by joangibert140
1
11 months ago by
United States
James W. MacDonald46k wrote:

The genome argument doesn't do what you appear to think it does. From ?featureCounts:

genome: a character string giving the name of a FASTA-format file
that includes the reference sequences used in read mapping
that produced the provided SAM/BAM files. NULL by default.
This argument should only be used when juncCounts is
TRUE. Note that providing reference sequences is optional
when juncCounts is set to TRUE.

You could probably define a SAF file of just the 3' UTRs and use that, or alternatively you could use summarizeOverlaps in the GenomicAligments package, which would be an arguably simpler thing to do. I generally use the Rsamtools package to define a BamFileList of my bam files, which you can directly feed to summarizedOverlaps. Something like

library(Rsamtools)
library(GenomicAlignments)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
utrs <- threeUTRsByTranscript(TxDb.Hsapiens.UCSC.hg38.knownGene)
utrs <- reduce(utrs)
bfl <- BamFileList(<character vector of bamfile locations goes here>)
olaps <- summarizeOverlaps(utrs, bfl)

Depending on the type of reads, you may need to modify (particularly if you have paired-end reads). See the help page for summarizeOverlaps.