Problem with importing text file in featureCounts
estevemp • 0

Hello dear Bioconductor support group,

I am having problems importing the results from featureCounts. The output has this format:

Program:featureCounts v1.6.2; Command:

featureCounts -L file.sam -g Parent -a file_exon.gff -G file.fasta -T 20 -o counts.txt
Geneid  Chr Start   End Strand  Length  M54.sam
unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;unitig_0_quiver_quiver;...unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver;unitig_1_quiver_quiver  568;2068334;2069275;2073626;2075076;2075787;2076821;2079662;2081007;2082574;2086372;2086736;2088493;2090226;2099578;2102000;2103077;2104046;2104689;2106631;2109282;2109929;2122067;2124736;2125261;2126211;2126765;2128874;2133886;2135222;2137655...

The problem here is that I was using the following command:

counts <- read.table("counts.txt", comment.char = "#", header=TRUE, sep = ";")

However, this command does not import the file correctly: the first lines contain all the row names ("unitig_*"), and the counts and the rest of the information only appear more than 1000 lines later. I expected featureCounts to write each result in the order of the column headers.

Could you tell me how I can solve this problem and load my results into R?

Thank you in advance.

Rsubread featureCounts • 352 views

I've reformatted your question a bit to make the code and output more readable.

WEHI, Melbourne, Australia

Let me make a few comments that might make things easier for you.

First, you're using the Unix command-line version of featureCounts, which is not part of Bioconductor. You could consider using the R version of featureCounts in the Rsubread package instead; all the output is then returned directly as an R object.
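As a rough sketch of what the R route could look like: the wrapper name `run_feature_counts` is hypothetical, the file names are taken from the command in the question, and the argument mapping (`-L` to `isLongRead`, `-g Parent` to `GTF.attrType`, `-T 20` to `nthreads`) is my reading of that command rather than something verified against the poster's data. The `-G file.fasta` option is left out here.

```r
# Hypothetical wrapper around Rsubread::featureCounts(); a sketch, not a
# verified reproduction of the command-line call in the question.
run_feature_counts <- function(sam = "file.sam",
                               gff = "file_exon.gff",
                               threads = 20) {
  if (!requireNamespace("Rsubread", quietly = TRUE)) {
    stop("Rsubread is not installed; try BiocManager::install('Rsubread')")
  }
  Rsubread::featureCounts(
    files = sam,
    annot.ext = gff,
    isGTFAnnotationFile = TRUE,   # the annotation is a GFF/GTF file
    GTF.featureType = "exon",     # count reads over exon features
    GTF.attrType = "Parent",      # group exons by the Parent attribute (-g Parent)
    isLongRead = TRUE,            # long-read mode (-L)
    nthreads = threads            # number of threads (-T 20)
  )
}
```

Calling `fc <- run_feature_counts()` would return a list; `fc$counts` is the count matrix and `fc$annotation` holds the Geneid, Chr, Start, End, Strand and Length columns, so no text file needs to be parsed.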

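Second, files from featureCounts are always tab-delimited, so you should always use sep = "\t" rather than sep = ";" when reading them into R. The semicolons only separate multiple values inside a single field, such as the Chr and Start entries of a multi-exon gene. With the correct separator, the results are returned in the order that you would expect.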
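A minimal sketch of the tab-delimited import, using a small stand-in file so that it runs on its own. The gene names and numbers in the toy file are made up for illustration; with real data you would pass "counts.txt" instead of `tmp`.

```r
# Write a tiny stand-in for counts.txt (all values are invented for illustration).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "# comment line written by featureCounts",
  paste("Geneid", "Chr", "Start", "End", "Strand", "Length", "M54.sam", sep = "\t"),
  paste("geneA", "unitig_0;unitig_0", "568;900", "800;1200", "+;+", "534", "12", sep = "\t"),
  paste("geneB", "unitig_1", "100", "400", "-", "301", "0", sep = "\t")
), tmp)

# read.delim() defaults to header = TRUE and sep = "\t"; lines starting
# with "#" are skipped via comment.char.
counts <- read.delim(tmp, comment.char = "#", stringsAsFactors = FALSE)

# Sanity check: every row should have a non-empty Geneid.
stopifnot(!anyNA(counts$Geneid), all(nzchar(counts$Geneid)))

# The first six columns are annotation; the remaining columns are counts.
count_matrix <- as.matrix(counts[, 7:ncol(counts), drop = FALSE])
rownames(count_matrix) <- counts$Geneid
```

The semicolon-separated Chr, Start, End and Strand fields are read as single character strings, one per gene, which is why splitting the file on ";" scatters the columns.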

Third, the output you show doesn't seem to contain any Geneids, as the second line of text that you show goes straight into chromosome names (unitig_0_quiver_quiver etc). Is there a problem with your GFF file? You can't do any analysis without Geneids.

