Question: I get nothing in my up and down regulated genes
Nemo80 wrote, 2.3 years ago:

Here is my clean data; I could not post the dput output here.

Here I try to find up- and down-regulated genes based on LFQ intensities using limma:

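library(limma)
# Two-group design: the first two columns of data are coded 1, the last two 0
design <- model.matrix(~c(rep(1,2),rep(0,2)))
fit <- lmFit(data, design)
fit2 <- eBayes(fit)
# Full results table for the group coefficient (column 2 of the design)
myt <- topTable(fit2, coef=2, n=Inf)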

The up- and down-regulated lists are empty because I don't have any adj.P.Val smaller than 0.05, but I don't know what criteria to use for the selection.

Where am I making a mistake?



Laurent Gatto (United Kingdom) wrote, 2.3 years ago:

Your data contains a lot of 0 values (about 25%), which is arguably a bit suspicious.

Then, you need to log your data, or probably better data <- log2(data + 1), to get the logFC right. Without the transformation, the logFC column is a difference of mean raw intensities per group rather than a log ratio. If you do this, you will identify proteins that have a presence/absence pattern, relating back to my first point. With 25% of the values missing, it is not unexpected to get such a pattern by chance.
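
To make the point about the scale concrete, here is a minimal base R sketch. The numbers are invented for illustration and are not the poster's data; it only shows what a between-group difference of means represents before and after log2(x + 1), and how a presence/absence pattern shows up.

```r
# Toy LFQ-like intensities for one protein (hypothetical values)
raw <- c(treated1 = 4000, treated2 = 4000, control1 = 1000, control2 = 1000)
is_treated <- c(TRUE, TRUE, FALSE, FALSE)

# On the raw scale, a difference of group means is an absolute intensity gap
raw_diff <- mean(raw[is_treated]) - mean(raw[!is_treated])

# After log2(x + 1), the same difference is approximately a log2 ratio
logged <- log2(raw + 1)
log_fc <- mean(logged[is_treated]) - mean(logged[!is_treated])

# A presence/absence protein: zeros become log2(0 + 1) = 0,
# so the difference equals the log2 intensity of the group where it was seen
absent <- log2(c(4000, 4000, 0, 0) + 1)
log_fc_absent <- mean(absent[is_treated]) - mean(absent[!is_treated])

raw_diff       # 3000
log_fc         # close to 2, i.e. roughly a 4-fold change
log_fc_absent  # close to 12, driven entirely by the zeros
```

The last value illustrates why many zeros can dominate a ranked list: the apparent fold change reflects detection, not a measured ratio.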


@Laurent Gatto that is right. I guess the zeros for some proteins come from the fact that I analysed a few groups of samples together, so a protein might not have been found in all samples of one group but could still have intensities in another group.

I read somewhere that someone discarded proteins with 50% or more zero values. I have 4 samples here, so if there are no intensities for 2 or more samples, I would discard the protein. However, I am not sure how well this rule holds, because we have 4 samples, 2 control and 2 treated, which means that a single intensity value out of 4, in a treated sample, might still be OK. No?

That is why I only removed the genes that had no intensities in any sample.

Do you have any suggestions?


Reply written 2.3 years ago by Nemo80
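
For readers following the filtering question, a group-aware version of the rule can be sketched in base R as below. The matrix, the sample names and the threshold min_valid are all hypothetical; the sketch only shows how such a filter could be written and does not address why the zeros occur in the first place.

```r
# Toy intensity matrix: rows are proteins, columns are samples (hypothetical values)
intens <- matrix(
  c(10, 12,  0,  0,   # seen only in control
     0,  0,  0,  7,   # seen once in treated
     5,  6,  8,  9,   # seen in every sample
     0,  0,  0,  0),  # never seen
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("prot", 1:4),
                  c("control1", "control2", "treated1", "treated2"))
)
group <- factor(c("control", "control", "treated", "treated"))

# Count non-zero values per protein within each group
n_valid <- sapply(levels(group), function(g) {
  rowSums(intens[, group == g, drop = FALSE] > 0)
})

# Keep a protein if it is quantified in at least min_valid samples
# of at least one group (min_valid is an assumed threshold)
min_valid <- 2
keep <- apply(n_valid >= min_valid, 1, any)
filtered <- intens[keep, , drop = FALSE]

rownames(filtered)  # "prot1" "prot3"
```

With only two replicates per group, any such threshold is a judgment call; counting per group rather than across all four samples at least keeps proteins that are consistently present in one condition.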

The number of zeros in your data is concerning. Debating the number of allowed 0s is not going to help, because filtering is not going to fix your issue. You should probably reassess your data processing strategy in the light of this problem.

Reply written 2.3 years ago by Laurent Gatto

@Laurent Gatto I accepted your answer and I appreciate your help. I found where those zeros were coming from and I solved the issue.

However, I have two questions that are off topic here, but you seem to know proteomics, so I wanted to ask. In a label-free quantification experiment I used MaxQuant and identified many proteins. However, the gene names are missing for some of the proteins. How do you handle this when you want to do pathway analysis using IPA?

The other question: when you do pathway analysis using IPA, do you use the LFQ intensities of all individual samples (biological replicates) for control and for treated, or do you take the average per group and then perform the pathway analysis?

Reply written 2.3 years ago by Nemo80

I am not familiar with IPA, so can't comment on that aspect.

I am not sure what leads to the absence of gene names. Where do the other ones come from? An online query, the protein fasta file, ...? I guess that tracking the provenance of that information will give a clue about the absence of some gene names.

Reply written 2.3 years ago by Laurent Gatto
Steve Lianoglou wrote, 2.3 years ago:

What type of data is this? Why so many 0s?

It's also (obviously) not log transformed. Whatever data you've got, you'll most likely need to normalize it somehow (the how depends on the type of data), and this data will have to be passed into the lmFit function on the log2 scale.
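
As a rough illustration of passing log2-scale data into lmFit, here is a small sketch using the limma package on simulated values. The simulated matrix, the sample names and the group labels are assumptions made for the example, and normalization is deliberately left out because the appropriate method depends on the data type.

```r
library(limma)

set.seed(1)
# Simulated intensities standing in for the real table (hypothetical data)
data <- matrix(rlnorm(200 * 4, meanlog = 10, sdlog = 1), ncol = 4,
               dimnames = list(paste0("prot", 1:200),
                               c("control1", "control2", "treated1", "treated2")))

# Move to the log2 scale before modelling; the +1 guards against log2(0)
log_data <- log2(data + 1)

# Explicit group factor so the coefficient reads as treated minus control
group <- factor(c("control", "control", "treated", "treated"),
                levels = c("control", "treated"))
design <- model.matrix(~ group)

# Fit the linear model, moderate the variances, and extract the full table
fit <- eBayes(lmFit(log_data, design))
res <- topTable(fit, coef = "grouptreated", number = Inf)
head(res)
```

Naming the coefficient through a factor, instead of a bare 0/1 vector, makes the direction of logFC explicit in the output.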


@Steve Lianoglou they are LFQ intensities from proteomics. If you look at the column names, you will see that I have two control and two treated samples (biological replicates).

Why so many zeros? To be honest, I don't know. When a protein is found for a sample based on mass, an intensity is calculated. However, if it does not show up in another sample, there is no intensity and the value is zero. What I can do is remove those genes that have intensities in less than 50% of the samples. I don't know if it is a good idea to remove all genes that have at least 1 zero; I really don't know (from a scientific point of view), because, as I explained above, a protein might be detected in one sample of one group but not in another!

Reply written 2.3 years ago by Nemo80