DESeq2 or edgeR for Presence and Absence of Proteins
Hello everyone,

I am doing my Master's, where I am tasked with comparing the metabolic potential of 57 archaeal species. I want to compare the presence and absence of a list of proteins (which I have annotated using DRAM + RAST) across these species.

One option I have come across is DESeq2 or edgeR. While these packages were originally designed for transcriptomics, I was wondering whether it is possible to feed them a matrix with my species as column names and the proteins as row names, where a cell is 1 if the species has that protein and 0 if it does not.

I should stress that I don't have any transcriptomics or expression data, just the information on whether each protein is present or absent in each species.

Is this possible, or can another package do this better?

Thank you very much.

ATpoint ★ 3.4k
No, both tools expect count data and make certain distributional assumptions, which binary data do not satisfy.
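
To make the distributional point concrete, here is a minimal sketch in plain Python (not R, and not part of either package). Both tools model counts with a negative binomial distribution, where the variance is at least as large as the mean. For 0/1 data the variance is always mean * (1 - mean), so it can never exceed the mean. The protein names and values below are made up purely for illustration.

```python
# Toy presence/absence matrix: rows are proteins, columns are species.
# Protein names and values are hypothetical, for illustration only.
presence = {
    "protein_A": [1, 1, 1, 0, 1, 1],
    "protein_B": [0, 0, 1, 0, 0, 0],
    "protein_C": [1, 0, 1, 0, 1, 0],
}


def mean_and_variance(values):
    """Return the mean and population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


summary = {name: mean_and_variance(row) for name, row in presence.items()}

for name, (mean, variance) in summary.items():
    # For 0/1 data the variance equals mean * (1 - mean), so it stays at or
    # below the mean; a negative binomial model assumes variance >= mean.
    print(
        f"{name}: mean={mean:.3f} variance={variance:.3f} "
        f"variance<=mean: {variance <= mean}"
    )
```

Every row comes out with variance at or below its mean, which is the opposite of the overdispersion these count models are built to estimate.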

Since you're a Master's student, I strongly suggest sitting down with your PI and supervisor to develop an analysis strategy that can answer the underlying scientific question. If they do not have a strategy, consider asking for an experienced collaborator.


