I recently posted an Rsubread featureCounts question, but I have another one, based on a quality-control analysis I would like to do:
I am trying to detect a fluorescent reporter that is driven off a specific promoter in our cells of interest, which we then sort (based on fluorescence) and sequence. The fluorescent reporter is ZsG. I cannot find a dedicated FASTA file for it on Ensembl, or anywhere else for that matter, apart from the EBI. The EBI entry contains two sequences (presumably two isoforms of the protein), which I used as the reference FASTA file. I used the following code, which ran successfully:
ref <- "AAF03372.fa"
The reference index builds successfully with no errors.
I then use the code as per the vignette for paired-end data:
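The exact calls are not reproduced in this post, so the following is only a minimal sketch of the kind of build-and-align step described here, using `buildindex()` and `align()` from Rsubread. The wrapper name `map_to_reporter`, the index basename, the FASTQ file names and the output BAM name are all placeholders chosen for illustration, not the values actually used.

```r
# Sketch only: build an index from a small reference FASTA and align
# paired-end reads to it. Requires the Rsubread package to be installed.
# All file names and the index basename are hypothetical placeholders.
map_to_reporter <- function(ref_fasta, fastq_r1, fastq_r2,
                            index_base = "reporter_index",
                            out_bam = "sample_vs_reporter.bam",
                            max_mismatches = 3) {
  # Fail early if any input file is missing.
  stopifnot(file.exists(ref_fasta),
            file.exists(fastq_r1),
            file.exists(fastq_r2))

  # Build the index; the reference FASTA is expected to hold nucleotide sequence.
  Rsubread::buildindex(basename = index_base, reference = ref_fasta)

  # Paired-end alignment: readfile1 and readfile2 hold the two mates.
  stats <- Rsubread::align(index = index_base,
                           readfile1 = fastq_r1,
                           readfile2 = fastq_r2,
                           type = "rna",
                           output_file = out_bam,
                           maxMismatches = max_mismatches)
  invisible(stats)
}

# Example call (placeholder FASTQ names):
# map_to_reporter("AAF03372.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
#                 max_mismatches = 6)
```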
This also runs successfully; however, I get 0 mapped reads! I am a bit concerned, because I have no reason to believe that cell sorting failed across ~300 samples, especially as some organs express a lot of this reporter owing to a larger population of cells of interest, and the gating is stringent and based on negative control samples. So I have no reason to doubt that these cells are green.
Are there any tips you could give on why this may not be working? Could it be the sequence in the FASTA file? Or are there any parameters I should be tweaking in the align function? (I increased the maximum number of mismatches to 6 to see if that helped, but it didn't.)
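On the FASTA side, one thing that can be checked quickly is whether the file holds nucleotide or amino-acid sequence, since the EBI entries are described above as protein isoforms and the reads are nucleotide. Below is a minimal base R sketch; the function name `fasta_nucleotide_fraction` is made up for illustration. A value of 1 suggests nucleotide sequence, while a protein file scores well below 1 because most amino-acid letters fall outside A/C/G/T/U/N.

```r
# Sketch only: estimate what fraction of residues in a FASTA file are
# nucleotide letters. Assumes a plain-text (uncompressed) FASTA file.
fasta_nucleotide_fraction <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Drop header lines (starting with ">") and empty lines.
  seq_lines <- lines[!startsWith(lines, ">") & nzchar(lines)]
  residues <- toupper(unlist(strsplit(paste(seq_lines, collapse = ""), "")))
  if (length(residues) == 0) return(NA_real_)
  mean(residues %in% c("A", "C", "G", "T", "U", "N"))
}

# Only run the check if the reference file is present in the working directory.
if (file.exists("AAF03372.fa")) {
  print(fasta_nucleotide_fraction("AAF03372.fa"))
}
```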
|| Total fragments : 16,782,941 ||
|| Mapped : 0 (0.0%) ||
|| Uniquely mapped : 0 ||
|| Multi-mapping : 0 ||
|| Unmapped : 16,782,941 ||
etc.
Is there anything I am clearly doing wrong? What potential problems could cause this? Is building an index from a single gene sufficient for this type of analysis, or should the reporter sequence be added to a FASTA file of the mouse genome? Any advice or tips would be greatly appreciated. Many thanks!
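To make the second option concrete, this is roughly what adding the reporter to a genome FASTA would look like in base R. It is a sketch under assumptions: the function name `append_reporter` and the file names in the example call are hypothetical, both files are plain-text (uncompressed) nucleotide FASTA, and the genome file ends with a newline.

```r
# Sketch only: write a combined FASTA with the reporter appended as an
# extra sequence after the genome. The input files are left untouched.
append_reporter <- function(genome_fasta, reporter_fasta, combined_fasta) {
  stopifnot(file.exists(genome_fasta), file.exists(reporter_fasta))

  # Copy the genome first, then append the reporter record(s) to the copy.
  copied <- file.copy(genome_fasta, combined_fasta, overwrite = TRUE)
  appended <- file.append(combined_fasta, reporter_fasta)
  stopifnot(copied, appended)

  invisible(combined_fasta)
}

# Example call (placeholder file names):
# append_reporter("mouse_genome.fa", "ZsG_nucleotide.fa", "mouse_plus_ZsG.fa")
```

The combined file would then be the reference used to build the index, so that reporter reads compete with genomic reads for placement instead of being aligned against a single short sequence.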