I'm incredibly new to processing data in R but I'll try to explain to the best of my ability.
I have a proteinGroups file from MaxQuant. I have filtered it for sparse signals as well as decoys, contaminants, and reverse hits. On top of this, I have made a separate set that contains just my intensities. In this case, we are comparing a mutant vs. wt organism for differential protein expression. Each group has 4 replicates.
Once my data was filtered, I made a df (1014x18) containing just my intensities. In this df, I have 4 columns of mutant and 4 columns of wt intensities. I converted this df into a matrix and ran justvsn() on it for a log2 transformation and normalization.
mat <- as.matrix(prot2)
matvsn <- justvsn(mat)
rownames(matvsn) <- prot2$Protein.IDs
matvsn <- matvsn[, -1]  # Just removing the NA column associated with protein IDs.
sessionInfo()
After this, I tend to struggle with finding out what to do next.
I was told to use limma and was given a handful of resources including the limma vignette, this video (https://www.youtube.com/watch?v=Hg1abiNlPE4), and some online guides.
Where I struggle is finding out what to use for my model. Right now, I have:
I can almost guarantee this is wrong but I don't know what to do. If somebody could please explain what I need to do to create a linear model and get p-values for differential expression analysis, I would be extremely grateful.
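To make the question concrete, here is a minimal sketch of the kind of two-group setup I think the limma vignette describes. It uses simulated numbers, not my real data. The object names (`expr`, `group`, `design`), the column order (4 mutant, then 4 wt), and the choice of wt as the reference level are all assumptions on my part, so please correct whatever is wrong with it.

```r
# Simulated stand-in for the normalized matrix (NOT my real data):
# 100 proteins x 8 samples, assumed column order: 4 mutant, then 4 wt.
set.seed(1)
expr <- matrix(rnorm(100 * 8, mean = 25, sd = 1), nrow = 100,
               dimnames = list(paste0("prot", 1:100),
                               c(paste0("mut", 1:4), paste0("wt", 1:4))))

# One group label per column; wt is set as the reference level.
group <- factor(rep(c("mut", "wt"), each = 4), levels = c("wt", "mut"))

# Design with an intercept (wt mean) and a "groupmut" column (mut minus wt).
design <- model.matrix(~ group)

# The limma steps run only if the package is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(expr, design)   # per-protein linear model
  fit <- limma::eBayes(fit)           # moderated t-statistics
  res <- limma::topTable(fit, coef = "groupmut", number = Inf)
  head(res[, c("logFC", "P.Value", "adj.P.Val")])
}
```

If this is roughly right, I assume I would pass my own normalized matrix in place of `expr`, with `group` matching the actual order of my columns.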