Question: Problem with importing text file in featureCounts
7 months ago by
estevemp0

Hello dear Bioconductor support group,

I am having problems importing results from featureCounts, whose output has this format:

Program:featureCounts v1.6.2; Command:

featureCounts -L file.sam -g Parent -a file_exon.gff -G file.fasta -T 20 -o counts.txt

Geneid  Chr Start   End Strand  Length  M54.sam
unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;...unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver  568;2068334;2069275;2073626;2075076;2075787;2076821;2079662;2081007;2082574;2086372;2086736;2088493;2090226;2099578;2102000;2103077;2104046;2104689;2106631;2109282;2109929;2122067;2124736;2125261;2126211;2126765;2128874;2133886;2135222;2137655...


The problem here is that I was using the following command:

counts <- read.table("counts.txt", comment.char = "#", header=TRUE, sep = ";")


However, with this command I cannot import my file: the first lines contain only the row names ("unitig_*"), and the counts and the rest of the information appear only after more than 1000 lines. I thought the featureCounts format would give me each result in the order of the column headers.

Could you tell me how I can solve this problem and import my results into R?

modified 7 months ago by Gordon Smyth38k • written 7 months ago by estevemp0

I've reformatted your question a bit to make the code and output more readable.

Answer: Problem with importing text file in featureCounts
7 months ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia

Let me make a few comments that might make things easier for you.

First, you're using the Unix command version of featureCounts, which is not part of Bioconductor. You could consider using the R version of featureCounts in the Rsubread package, and then all the output would be automatically an R object.
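As a rough sketch of what the R route could look like, the wrapper below calls Rsubread::featureCounts with settings meant to mirror the command line in the question (GFF annotation, "Parent" as the grouping attribute, long-read mode). The wrapper name count_long_reads is hypothetical, the file names are the placeholders from the question, and the mapping of command-line flags to arguments is my assumption, so please check it against ?featureCounts before relying on it.

```r
# Sketch only: wraps Rsubread::featureCounts with settings that are assumed
# to mirror the command-line call in the question (-L, -g Parent, -a).
# count_long_reads is a hypothetical helper name, not part of Rsubread.
count_long_reads <- function(sam_file, gff_file) {
  if (!requireNamespace("Rsubread", quietly = TRUE)) {
    stop("Please install the Rsubread package from Bioconductor first.")
  }
  Rsubread::featureCounts(
    files = sam_file,
    annot.ext = gff_file,          # external annotation file
    isGTFAnnotationFile = TRUE,    # annotation is GTF/GFF, not SAF
    GTF.featureType = "exon",      # count reads over exon features
    GTF.attrType = "Parent",       # group exons by their Parent attribute
    isLongRead = TRUE              # long-read counting mode
  )
}

# Usage (not run here; needs Rsubread and the input files):
# fc <- count_long_reads("file.sam", "file_exon.gff")
# head(fc$counts)       # count matrix, one row per gene
# head(fc$annotation)   # Geneid, Chr, Start, End, Strand, Length
```

The returned list already holds the counts and the annotation as R objects, so no text file needs to be parsed afterwards.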

Second, files from featureCounts are always tab-delimited, so to read them into R you should use sep = "\t" rather than sep = ";". The semicolons only separate the per-exon values inside the Chr, Start, End and Strand fields. The results are returned in the order that you would expect.
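Here is a minimal sketch of the tab-delimited import. It writes a tiny, made-up file in the featureCounts layout to a temporary path so that it runs on its own; the gene names and numbers are invented for illustration, and in practice you would pass "counts.txt" to read.table instead.

```r
# Hypothetical miniature of a featureCounts output file (values are made up).
example_lines <- c(
  "# Program:featureCounts v1.6.2; Command:...",
  paste("Geneid", "Chr", "Start", "End", "Strand", "Length", "M54.sam", sep = "\t"),
  paste("geneA", "unitig_0;unitig_0", "100;400", "300;600", "+;+", "402", "12", sep = "\t"),
  paste("geneB", "unitig_1", "50", "250", "-", "201", "0", sep = "\t")
)
path <- tempfile(fileext = ".txt")
writeLines(example_lines, path)

# Split on tabs only; the leading "#" line is skipped as a comment.
counts <- read.table(path, header = TRUE, sep = "\t", comment.char = "#",
                     quote = "", stringsAsFactors = FALSE)

# Columns 1-6 are annotation; the counts start at column 7.
count_matrix <- as.matrix(counts[, 7, drop = FALSE])
rownames(count_matrix) <- counts$Geneid
```

With sep = "\t" each gene stays on one row, and the semicolon-joined exon coordinates remain intact inside their own columns.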

Third, the output you show doesn't seem to contain any Geneids, as the second line of text that you show goes straight into chromosome names (unitig_0_quiver_quiver etc). Is there a problem with your GFF file? You can't do any analysis without Geneids.
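One possible way to check the GFF file, sketched under the assumption that it is a standard nine-column GFF3 file and that the Geneids are supposed to come from the Parent attribute (as the -g Parent option in the question suggests). The GFF lines below are invented so that the example runs on its own; for the real file you would use readLines("file_exon.gff") instead.

```r
# Hypothetical GFF3 lines (made up). For the real file one would use:
# gff_lines <- readLines("file_exon.gff")
gff_lines <- c(
  "##gff-version 3",
  paste("unitig_0", "src", "exon", "100", "300", ".", "+", ".",
        "ID=exon1;Parent=geneA", sep = "\t"),
  paste("unitig_0", "src", "exon", "400", "600", ".", "+", ".",
        "ID=exon2", sep = "\t")
)

# Parse the nine standard GFF columns; "#" lines are treated as comments.
gff <- read.table(text = gff_lines, sep = "\t", quote = "", comment.char = "#",
                  stringsAsFactors = FALSE,
                  col.names = c("seqid", "source", "type", "start", "end",
                                "score", "strand", "phase", "attributes"))

# Flag exon rows whose attribute column has no Parent= entry.
exons <- gff[gff$type == "exon", ]
has_parent <- grepl("(^|;)Parent=", exons$attributes)
missing_parent <- sum(!has_parent)
```

If missing_parent is large, or the Parent values are empty, that would be consistent with featureCounts having no usable identifiers to put in the Geneid column.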