Using paired-end and single-end data in featureCounts
@nikellepetrillo-13687
Last seen 7.4 years ago

Hi all, 

I have 4 .sam files after alignment. 3 of them were created from paired-end (PE) data and 1 from single-end (SE) data. I want to use featureCounts to create a count matrix. Should I run it in paired-end mode or single-end mode?

Thanks for the help, 

Nikelle 

featurecounts rsubread • 3.6k views

Hi, I'm going to remove the DESeq2 tag, as this is a featureCounts question.

@james-w-macdonald-5106
Last seen 16 hours ago
United States

Presumably you aligned the PE data by telling your aligner that it was PE, and the SE was aligned in SE mode, correct? In that case you should probably use featureCounts in PE mode for the PE data and in SE mode for the SE data. I don't know what featureCounts will do if you give it PE aligned data and say it is SE (double count the paired reads?), but pretending the PE data are SE is probably not the way to go.
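As a rough sketch of what running the two modes separately could look like in Rsubread: the wrapper name `count_by_layout` and its arguments are made up for illustration, and it assumes the annotation is a GTF file. The only real API used is `featureCounts()` with its `files`, `annot.ext`, `isGTFAnnotationFile` and `isPairedEnd` arguments.

```r
# Sketch only: count PE and SE alignments in separate featureCounts() runs.
# `pe_files`, `se_files` and `gtf` are placeholders supplied by the caller.
count_by_layout <- function(pe_files, se_files, gtf) {
  if (!requireNamespace("Rsubread", quietly = TRUE)) {
    stop("The Rsubread package is required for this sketch.")
  }

  # Paired-end libraries: tell featureCounts the input is paired-end.
  fc_pe <- Rsubread::featureCounts(
    files = pe_files,
    annot.ext = gtf,
    isGTFAnnotationFile = TRUE,
    isPairedEnd = TRUE
  )

  # Single-end library: counted in single-end mode.
  fc_se <- Rsubread::featureCounts(
    files = se_files,
    annot.ext = gtf,
    isGTFAnnotationFile = TRUE,
    isPairedEnd = FALSE
  )

  list(pe = fc_pe, se = fc_se)
}

# Hypothetical usage (file names are assumptions, not from the question):
# fc <- count_by_layout(c("pe1.sam", "pe2.sam", "pe3.sam"), "se1.sam", "genes.gtf")
```

Each element of the returned list is an ordinary featureCounts result, so the count matrices are available as `fc$pe$counts` and `fc$se$counts`.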


Thanks James. Yes, the PE data was aligned as PE and the SE data was aligned as SE. I would like to create 1 count matrix from all 4 .sam files using featureCounts. Are you saying I should create one count matrix for the 3 PE .sam files (using PE mode in featureCounts) and a separate count matrix for the 1 SE .sam file (using SE mode in featureCounts)?

If so, is there then a way to combine these 2 count matrices into 1? 

 


Of course! That's just a basic R data manipulation step. From ?featureCounts

Value:

     A list with the following components:

  counts: a data matrix containing read counts for each feature or
          meta-feature for each library.

So the output will be a list, the first item being a matrix with the read counts. You can then just cbind the two counts matrices (ensuring of course that the rows line up correctly) and go from there.
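To make the "rows line up" check concrete, here is a small sketch with toy matrices standing in for the two `$counts` components. The gene names, sample names and values are invented for illustration.

```r
# Toy stand-ins for the PE and SE count matrices (hypothetical values).
counts_pe <- matrix(
  c(10L, 0L, 5L,
    12L, 1L, 7L,
     9L, 2L, 4L),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("pe1.sam", "pe2.sam", "pe3.sam"))
)
counts_se <- matrix(
  c(3L, 8L, 1L),
  nrow = 3,
  dimnames = list(c("geneC", "geneA", "geneB"), "se1.sam")
)

# Both matrices must cover the same set of features.
stopifnot(setequal(rownames(counts_pe), rownames(counts_se)))

# Reorder the SE rows to match the PE rows, then bind the columns.
counts_se <- counts_se[rownames(counts_pe), , drop = FALSE]
counts_all <- cbind(counts_pe, counts_se)
```

If both runs used the same annotation, the rows should already be in the same order, so the reordering step is just a safeguard.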


Or merge in R would work as well. But I would first remove the Chr, Start, End, Strand and Length columns from the two data frames.

new_df <- merge(table1_df, table2_df, by = 'Geneid')
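A minimal sketch of that merge, assuming the two tables have the column layout of the featureCounts text output (Geneid, Chr, Start, End, Strand, Length, then one column per file). The data frames here are toy examples with invented values.

```r
# Toy tables mimicking featureCounts text output (hypothetical values).
table1_df <- data.frame(
  Geneid = c("geneA", "geneB"),
  Chr = c("chr1", "chr2"),
  Start = c(100L, 500L),
  End = c(200L, 900L),
  Strand = c("+", "-"),
  Length = c(101L, 401L),
  pe1.sam = c(10L, 0L),
  stringsAsFactors = FALSE
)
table2_df <- data.frame(
  Geneid = c("geneB", "geneA"),
  Chr = c("chr2", "chr1"),
  Start = c(500L, 100L),
  End = c(900L, 200L),
  Strand = c("-", "+"),
  Length = c(401L, 101L),
  se1.sam = c(1L, 8L),
  stringsAsFactors = FALSE
)

# Drop the annotation columns so they are not duplicated in the merge.
annotation_cols <- c("Chr", "Start", "End", "Strand", "Length")
table1_counts <- table1_df[, setdiff(names(table1_df), annotation_cols)]
table2_counts <- table2_df[, setdiff(names(table2_df), annotation_cols)]

# Join on the gene identifier; row order in the inputs does not matter.
new_df <- merge(table1_counts, table2_counts, by = "Geneid")
```

By default `merge` keeps only genes present in both tables, which is the desired behaviour when both runs used the same annotation.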