Hello!
I have results from targeted sequencing: several CSV files with information about amplicons (a unique amplicon ID and the number of reads). The data look like this (*image below). I have to compare the coverage profiles of two samples. There are several methods to do that (Pearson correlation, Euclidean distance, chi-square, t-test, clustering analysis and so on...), but I am not quite sure which would be statistically correct.
The goal is to validate the result. I have two samples (control and new), and I want to know whether the coverage profile of the new sample is the same as that of the control sample.
*Important note: within a sample there is a kind of competition for reagents, so if one amplicon gets high coverage, the others get less.
image: https://drive.google.com/file/d/1o78wnJgVKs3QEUEZbSD4oTpdr7k5_zCw/view?usp=sharing
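Because amplicons compete for reagents within a sample, raw read counts are only meaningful relative to the sample total. A minimal sketch of that normalisation step follows, assuming each CSV has two columns (amplicon ID, read count) and no header row. The amplicon IDs and counts are made up for illustration, and `read_counts` and `to_fractions` are hypothetical helper names.

```python
import csv
import io


def read_counts(handle):
    """Read rows of (amplicon_id, read_count) into a dict.

    Assumes two columns and no header row; adjust for the real file layout.
    """
    reader = csv.reader(handle)
    return {row[0]: int(row[1]) for row in reader if row}


def to_fractions(counts):
    """Convert raw read counts to each amplicon's share of the sample total."""
    total = sum(counts.values())
    return {amplicon: n / total for amplicon, n in counts.items()}


# Made-up example data standing in for the two CSV files.
control_csv = io.StringIO("amp1,1200\namp2,300\namp3,500\n")
new_csv = io.StringIO("amp1,2500\namp2,550\namp3,950\n")

control = to_fractions(read_counts(control_csv))
new = to_fractions(read_counts(new_csv))

# Print both profiles side by side, one amplicon per line.
for amplicon in sorted(control):
    print(amplicon, round(control[amplicon], 3), round(new[amplicon], 3))
```

With real files, `open(path, newline="")` would replace the `io.StringIO` objects.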
Not an answer but a suggestion: it sounds like you could fit a localised regression to the data and then compare coverage profiles that way. By doing this, you could also introduce covariates into the model. I have never done this before, so it remains a suggestion.
Thanks, Kevin! I will try this! And what do you think about correlation or chi-square? Is it statistically correct to use them in this case?
I think that the localised regression idea will stand up better in a publication or report. You could fit the model and then check the coefficient to show that there are no differences between the 2 samples.
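As a rough illustration of "fit the model and check the coefficient", here is a sketch that uses a plain least-squares line instead of a localised regression, so it is a simpler stand-in and not the method suggested above. It assumes the reads are first turned into per-sample fractions and log-transformed, and that every amplicon has at least one read. The counts are made up and `fit_line` is a hypothetical helper. A slope near 1 and an intercept near 0 would be consistent with similar profiles, but they are not a formal test of equivalence.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def log_fractions(reads):
    """Log of each amplicon's share of total reads (requires counts > 0)."""
    total = sum(reads)
    return [math.log(r / total) for r in reads]


# Made-up read counts, listed in the same amplicon order for both samples.
control_reads = [1200, 300, 500, 800, 150]
new_reads = [2500, 550, 950, 1700, 300]

slope, intercept = fit_line(log_fractions(control_reads),
                            log_fractions(new_reads))
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
```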
A correlation with a highly statistically significant p-value could assist, too. I am not sure about the use of a chi-square test.
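For the correlation idea, a minimal sketch: Pearson's r on log-transformed read fractions, with a p-value from a permutation test. The permutation test, the log transform, the made-up counts and the helper names are all assumptions for illustration. Note that a high r only shows the two profiles rise and fall together; it does not by itself show they are the same.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def log_fractions(reads):
    """Log of each amplicon's share of total reads (requires counts > 0)."""
    total = sum(reads)
    return [math.log(r / total) for r in reads]


# Made-up read counts, listed in the same amplicon order for both samples.
control_reads = [1200, 300, 500, 800, 150, 950, 420, 640]
new_reads = [2500, 550, 950, 1700, 300, 1850, 900, 1250]

x = log_fractions(control_reads)
y = log_fractions(new_reads)
r = pearson(x, y)
p = permutation_p_value(x, y)
print("r:", round(r, 3), "permutation p:", round(p, 4))
```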