If you use any of the negative binomial packages (like DESeq2 or edgeR) then you'll simply have to omit the samples without a pair. Actually it makes no difference whether you explicitly omit them or not, because the linear modelling approach implemented in those packages will have the effect of removing the unpaired samples automatically.
If you use limma to analyse your RNA-seq data (either voom or limma-trend), then you can use duplicateCorrelation() to estimate the correlation between samples from the same pair. This approach allows you to include all your samples, whether they have a pair or not. That's the way I approach problems of this type because I don't like to throw data out.
I usually do recommend duplicateCorrelation() in cases where there are sample relationships that cannot be included in a design matrix. Here I would argue that the variance of interest is, within a pair, how far is the treated sample from the value predicted by a shared treatment effect. Kind of analogous to testing on differences in a paired t-test. If the correlation between pairs is high, I don't think adding unpaired samples helps much to estimate this variance. If the correlation between pairs is low, and the number of unpaired samples is high, I can imagine how it would help. So I can see why you would recommend the duplicateCorrelation route either way.
Including the samples without pair doesn't help you estimate the treatment effect, or the biological variability, if you have a single point per individual. I wouldn't include them.
Michael, thanks a lot for an explanation. However DE genes set slightly differs from running DESeq on complete data set, where not paired samples should be discarded automatically (as Gordon Smyth said below), from the DE genes set I obtained when discard not-paired samples manually from experimental design. The difference is minor though with 1-2 genes less in second case.
I usually do recommend duplicateCorrelation() in cases where there are sample relationships that cannot be included in a design matrix. Here I would argue that the variance of interest is, within a pair, how far is the treated sample from the value predicted by a shared treatment effect. Kind of analogous to testing on differences in a paired t-test. If the correlation between pairs is high, I don't think adding unpaired samples helps much to estimate this variance. If the correlation between pairs is low, and the number of unpaired samples is high, I can imagine how it would help. So I can see why you would recommend the duplicateCorrelation route either way.
Thanks a lot Gordon for clarification of this point.