DESeq2 for TCGA tumor/normal paired samples from multiple patients
luren • 0

I want to run DESeq2 on paired TCGA data. Say that for a certain cancer type there are 107 patients and 214 samples (one pair per patient). I used the following code (pretty much the defaults). With this setup, I expect DESeq2 to use the pairing information and to treat the 107 patients as biological replicates. Is this appropriate? I am not confident because the patient factor has 107 different levels.

library('DESeq2')
directory<-"tBRCA"
sampleFiles <-c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107" ....to "214")

# Above, "1" through "214" are the file names of the 214 samples

samplePatient=rep(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107"),2)

# Above, "1" through "107" are the patient IDs of the 107 patients


# Samples 1-107 are "untreated"; samples 108-214 are "treated"
sampleCondition <- rep(c("untreated", "treated"), each = 107)

sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition, patient = samplePatient)

# patient is the blocking factor; condition is the effect of interest
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ patient + condition)

# Make "untreated" the reference level
colData(ddsHTSeq)$condition <- factor(colData(ddsHTSeq)$condition, levels = c("untreated", "treated"))

dds <- DESeq(ddsHTSeq)

res <- results(dds)
res <- res[order(res$padj), ]
write.table(as.data.frame(res), file = "myfile.txt", sep = "\t")
plotMA(dds, ylim = c(-2, 2), main = "DESeq2")

 


Just a tip to make your life easier. You could declare your sampleFiles and samplePatient variables like so:

sampleFiles <- as.character(1:214)
samplePatient <- as.character(rep(1:107, 2))
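
As a rough sketch (base R only, no DESeq2 needed), those compact declarations can be dropped into the sample table from the question, and the pairing can be checked before any count files are read. The name `pairing` is just for illustration; the other names follow the original post.

```r
# Compact declarations from the tip above
sampleFiles     <- as.character(1:214)
samplePatient   <- as.character(rep(1:107, 2))
sampleCondition <- rep(c("untreated", "treated"), each = 107)

sampleTable <- data.frame(
  sampleName = sampleFiles,
  fileName   = sampleFiles,
  condition  = factor(sampleCondition, levels = c("untreated", "treated")),
  patient    = factor(samplePatient)
)

# Cross-tabulate patient by condition: in a correctly paired table,
# every patient has exactly one sample in each condition
pairing <- table(sampleTable$patient, sampleTable$condition)
all(pairing == 1)
```

If the last line is not `TRUE`, the patient and condition vectors are out of step and the paired design would be wrong.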

 


Many thanks for your tip!!!!

@mikelove

Yes that looks correct.
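
One way to see why 107 patient levels are not a problem, sketched in base R under the assumption that the sample layout matches the question (samples 1-107 untreated, 108-214 treated, same patient order in both halves): the design with patient plus condition gives each patient its own baseline and leaves a single condition coefficient, which is estimated from the within-patient differences. The variable names here are illustrative only.

```r
# Rebuild the two design factors as laid out in the question
patient   <- factor(rep(1:107, 2))
condition <- factor(rep(c("untreated", "treated"), each = 107),
                    levels = c("untreated", "treated"))

# Same formula as the DESeq2 design
mm <- model.matrix(~ patient + condition)

dim(mm)                    # 214 rows, 108 columns (intercept + 106 patient terms + 1 condition term)
qr(mm)$rank == ncol(mm)    # full rank, so every coefficient is estimable
colnames(mm)[ncol(mm)]     # "conditiontreated": the single coefficient of interest
nrow(mm) - ncol(mm)        # 106 residual degrees of freedom
```

By default `results(dds)` reports the last variable in the design, here `condition` (treated vs untreated), so the patient terms act only as blocking factors.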


Thanks a lot for the fast response! But this gives me more than 10k out of 20k genes with an adjusted p-value of 10^-6 or lower. I am using TCGA BRCA data (214 samples in total, from 107 patients, paired, raw counts). Is this normal? Do you have any suggestions to shrink it?

 

Neha • 0

I am new to cancer datasets. I have read-count data from HTSeq for each patient, in both the normal and the cancer stage, and I have generated separate matrices for normal and cancer. How should I proceed to obtain expression data for further eQTL analysis?

  1. In DESeq2, should I use the read counts for normal and cancer samples separately, or combine them in one folder? Combining does not seem possible because the same patient numbers appear in both sets.
  2. Should I use separate matrices for normal and cancer, or one combined matrix?

Kindly help me with how I should proceed.


This is out of scope for support for DESeq2 software specifically. It sounds like you need general bioinformatic advice for pre-processing expression data for eQTL analysis. You might try posting a detailed question about your dataset and goals to https://www.biostars.org.
