Question: need help for the study design of a RNA-Seq project
6.3 years ago, shirley zhang wrote:
Dear list,

I am not sure whether this list is the right place to ask a study design question, but I have gotten a lot of information here about how to analyze RNA-Seq data, so I would like to give it a try.

We are going to do RNA sequencing on an Illumina HiSeq for 200 samples. Given that both the sample size and the budget are fixed, the following three options were proposed:

1. 50bp paired-end reads, one sample per lane --> ~100 million reads per sample
2. 75bp paired-end reads, two samples per lane --> ~50-60 million reads per sample
3. 100bp paired-end reads, four samples per lane --> ~30-40 million reads per sample

Based on your experience, which option is best, or do you have other suggestions? We would like to do several kinds of analysis on these data, e.g., novel transcripts, lncRNA, splicing, SNPs, etc. You name it. If we have to sort them by priority (from high to low), I would say "novel transcripts, long non-coding RNAs, splicing, and differential expression".

Currently, the majority of labs sequence 100bp paired-end, right? But I was told that even if you sequence 100bp, the sequencing quality after 75bp is very bad because of the sequencer itself, that is, it has nothing to do with the RNA quality of the samples. If this is true, why is 100bp read length becoming more popular now?

Many thanks,
Shirley
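The trade-off between the three options can be put in numbers. The following is a minimal Python sketch, not part of the original post. It assumes that the quoted "reads per sample" counts individual reads and that every read has the full nominal length. If the counts refer to read pairs instead, all yields double but the ranking stays the same. The names `OPTIONS`, `gigabases`, `lanes_needed` and `summarize` are made up for illustration.

```python
import math

TOTAL_SAMPLES = 200  # number of samples stated in the question

# name -> (read length in bp, samples per lane, (low, high) million reads per sample)
OPTIONS = {
    "option 1": (50, 1, (100, 100)),
    "option 2": (75, 2, (50, 60)),
    "option 3": (100, 4, (30, 40)),
}


def gigabases(read_length_bp, million_reads):
    """Sequenced bases in Gb, assuming every counted read has the full length."""
    return read_length_bp * million_reads / 1000.0


def lanes_needed(total_samples, samples_per_lane):
    """Number of lanes required to run all samples at the given multiplexing."""
    return math.ceil(total_samples / samples_per_lane)


def summarize(options, total_samples=TOTAL_SAMPLES):
    """Return {name: (lanes, low Gb per sample, high Gb per sample)}."""
    summary = {}
    for name, (length, per_lane, (low, high)) in options.items():
        summary[name] = (
            lanes_needed(total_samples, per_lane),
            gigabases(length, low),
            gigabases(length, high),
        )
    return summary


if __name__ == "__main__":
    for name, (lanes, low, high) in summarize(OPTIONS).items():
        print(f"{name}: {lanes} lanes, {low:.2f}-{high:.2f} Gb per sample")
```

Under these assumptions, option 1 gives 5 Gb per sample on 200 lanes, option 2 gives 3.75-4.5 Gb on 100 lanes, and option 3 gives 3-4 Gb on 50 lanes. The longer reads are therefore bought with fewer reads and somewhat fewer total bases per sample.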
Answer: need help for the study design of a RNA-Seq project
6.3 years ago, Sean Davis (United States) wrote:
On Thu, Apr 11, 2013 at 4:32 PM, shirley zhang wrote:
> [...]
> Based on your experience, which option is the best or you have other
> suggestions?
> [...]

Hi, Shirley.

I don't mean this as MY answer to your question, but this blog post has a few statements that might be interesting to you:

http://core-genomics.blogspot.com/2013/04/encodes-rna-seq-recommendations-need.html

You'll note that it refers to the ENCODE RNA-seq guidelines, which might also be instructive.

Sean
Answer: need help for the study design of a RNA-Seq project
6.3 years ago, Dario Strbenac (Australia) wrote:
It is a question to ask at Biostars which has the address http://www.biostars.org/
Dear Wei, Sean and Dario,

Many thanks for all of your replies and suggestions. I really appreciate them.

I will check the ENCODE RNA-seq guidelines. I also posted my question at Biostars.

Thanks again,
Shirley

On Thu, Apr 11, 2013 at 9:59 PM, Dario Strbenac wrote:
> It is a question to ask at Biostars which has the address
> http://www.biostars.org/
written 6.3 years ago by shirley zhang
Hi Shirley,

I would say it depends. If you are investigating an organism with no good transcriptomic or genomic sequence resources, I would also recommend 100bp PE, because this is for sure beneficial for a subsequent assembly of the reads. If you have plenty of genomic resources available, I would not generally discard the 50bp option. I attached a paper that partially covers the length / precision debate. Maybe it is helpful for you.

Best regards,
Moritz

--
Moritz Heß
PhD Candidate, Research associate
Forest Research Institute of Baden Württemberg (FVA)

Attachment: Li, Dewey - 2011 - RSEM accurate transcript quantification from RNA-Seq data with or without a reference genome.pdf
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20130412/bbb05fb1/attachment.pdf
written 6.3 years ago by Moritz Hess
Answer: need help for the study design of a RNA-Seq project
6.3 years ago, Wei Shi (Australia) wrote:
Hi Shirley,

You should use 100bp PE reads. Longer reads will help reduce mapping ambiguity, and thus give you more power in detecting new transcripts and SNPs. They will enable you to quantify gene expression levels more accurately as well.

Cheers,
Wei

On Apr 12, 2013, at 6:32 AM, shirley zhang wrote:
> [...]
> Based on your experience, which option is the best or you have other
> suggestions?
> [...]
Answer: need help for the study design of a RNA-Seq project
6.3 years ago, shirley zhang wrote:
Dear Moritz,

Thanks a lot for your suggestions and for circulating the paper. I will read it.

Sorry that I forgot to mention this in my original question: we are working on human samples.

During the design, the RNA fragment size has to be taken into account as well. If the fragments are 150-200bp, then 100bp paired-end is a waste, as the reads will frequently overlap. Am I right? So I might go with option 2 (76bp PE), which is also recommended by the ENCODE guidelines.

Hope to hear more comments and suggestions.

Many thanks,

On Fri, Apr 12, 2013 at 3:50 AM, Moritz Hess wrote:
> Hi Shirley,
>
> I would say it depends.
> [...]
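The overlap concern raised above can be checked with simple arithmetic: two mates of length r read inward from both ends of a fragment of length f, so they share 2r - f bases whenever that is positive. The following is a minimal Python sketch of that calculation. It assumes a single fixed fragment length that excludes adapters, whereas a real library has a distribution of fragment sizes. The function names `mate_overlap_bp` and `covered_bp` are made up for illustration.

```python
def mate_overlap_bp(fragment_bp, read_bp):
    """Bases sequenced by both mates of one pair (0 if the mates do not meet).

    The overlap can never exceed the fragment itself, which matters when the
    fragment is shorter than a single read.
    """
    return max(0, min(fragment_bp, 2 * read_bp - fragment_bp))


def covered_bp(fragment_bp, read_bp):
    """Distinct fragment bases covered by at least one mate of the pair."""
    return min(fragment_bp, 2 * read_bp)


if __name__ == "__main__":
    for read_bp in (50, 76, 100):
        for fragment_bp in (150, 200):
            overlap = mate_overlap_bp(fragment_bp, read_bp)
            covered = covered_bp(fragment_bp, read_bp)
            print(
                f"2x{read_bp}bp on a {fragment_bp}bp fragment: "
                f"{overlap}bp overlap, {covered}bp covered"
            )
```

Under this simplification, 2x100bp mates share 50bp on a 150bp fragment but only just meet, without overlapping, on a 200bp fragment. 2x76bp mates share 2bp at 150bp and none at 200bp. How often the reads overlap in practice therefore depends on where the bulk of the fragment size distribution actually lies within the stated 150-200bp range.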