How to count reads for edgeR input
wang peter ★ 2.0k
@wang-peter-4647
Dear all,

I have non-strand-specific RNA-seq samples for edgeR analysis. First, I used bowtie to map my reads to the assembled contigs in order to count reads for each sample. For each contig I then got two numbers: one is the forward count, the other is the reverse count. Is it right to sum these two numbers into a single count for edgeR input? Should I sum them or average them?

shan gao
Dr. Fei lab, Boyce Thompson Institute, Cornell University
@alessandroguffantigenomniacom-4436
Hi,

If it is human data, you can assume that about 20% of the transcripts you have in your hands will have antisense overlaps at the 3'UTR or, to a lesser extent, at internal exons or the 5'UTR. Since you cannot distinguish this occurrence, and you cannot introduce a correction factor for it in your datasets, in my opinion you are left only with the option of summing all your tag (read) counts.

A good, simple check would be to sample the distribution of - versus + strand matches: they should be around 50% each.

Hope this helps,
Alessandro

Alessandro Guffanti, Bioinformatics, Genomnia srl
(in reply to wang peter, "[BioC] how to count for edgeR input", Mon, 23 Jul 2012 09:59:54 -0400)
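As a rough illustration of the advice to sum the two strands and then check the strand balance, here is a minimal Python sketch. The contig names, the numbers, and the helper names `sum_strands` and `forward_fraction` are all made up for this example; they do not come from bowtie or edgeR. Summing keeps the values as raw integer read counts, which is what edgeR is designed to take, whereas averaging would halve them and could give fractions.

```python
from typing import Dict, Tuple

# Hypothetical input: contig -> (forward count, reverse count) for one sample.
StrandCounts = Dict[str, Tuple[int, int]]


def sum_strands(counts: StrandCounts) -> Dict[str, int]:
    """Collapse forward and reverse counts into one raw count per contig."""
    return {contig: fwd + rev for contig, (fwd, rev) in counts.items()}


def forward_fraction(counts: StrandCounts) -> float:
    """Fraction of all counted reads that matched the forward (+) strand."""
    fwd_total = sum(fwd for fwd, _ in counts.values())
    total = sum(fwd + rev for fwd, rev in counts.values())
    if total == 0:
        raise ValueError("no mapped reads to compute a strand fraction from")
    return fwd_total / total


# Made-up numbers, for illustration only.
sample_a: StrandCounts = {
    "contig1": (12, 9),
    "contig2": (40, 44),
    "contig3": (0, 3),
}

totals = sum_strands(sample_a)    # one integer count per contig
frac = forward_fraction(sample_a)  # expected to be near 0.5 for unstranded data
print(totals)
print(f"forward fraction: {frac:.3f}")
```

Repeating this for each sample and placing the per-contig totals side by side gives a contigs-by-samples table of integer counts. A forward fraction far from 0.5 would suggest that the library or the counting step is not behaving as unstranded.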
Hi,

On Monday, July 23, 2012, Guffanti Alessandro wrote:
> You can assume that 20% of the transcripts you have in your hands, if it is human data, will have antisense overlaps at 3'UTR or, to a lesser extent, to internal exons or 5'UTR

Interesting ... do you have some references for that one? I'd be curious to see how they figured that one out.

Thanks,
-Steve
There are many on this, including one in which I participated :-) The paper below is a good starting point. Actually, this estimate is rather conservative if you reason in terms of short antisense transcripts at 5' and 3' transcription bubbles. It is a Genome Biology paper, and the authors have a long history on this topic.

Hope this helps,
Alessandro

Galante PAF, Vidal DO, de Souza JE, Camargo AA, de Souza SJ. Sense-antisense pairs in mammals: functional and evolutionary considerations. Genome Biology 2007, 8:R40. http://genomebiology.com/2007/8/3/R40/

Alessandro Guffanti, Bioinformatics, Genomnia srl
(in reply to Steve Lianoglou, Mon, 23 Jul 2012 13:12:39 -0400)
