Question: good news and bad news
7.2 years ago by
Hervé Pagès ♦♦ 14k
United States
Hervé Pagès ♦♦ 14k wrote:
Hi,

People generally ask "do you want me to start with the good or with the bad news?" Here they are in no particular order (both are changes in BioC devel):

- I've added the mcols() accessor as the preferred way (over elementMetadata() and values()) to access the metadata columns. Also the term "metadata columns" is now used consistently everywhere in the IRanges/GenomicRanges/Rsamtools documentation (instead of "element metadata", "elementMetadata columns", "values", "columns of values", "metadata values", "elementMetadata values", etc.).

  Having 3 synonyms for accessing the same thing is of course not ideal but I hope we can remedy this in the future.

- I've added "$" and "$<-" methods for GRanges *only*. Provided as a convenience and as the result of strong popular demand. Note that those methods are not consistent with the other "$" and "$<-" methods defined in the IRanges/GenomicRanges infrastructure, and might confuse some users by making them believe that a GRanges object can be manipulated as a data.frame-like object. It is therefore recommended to use them only interactively, and their use in scripts or packages is discouraged (in that case, 'mcols(x)$name' should be used instead of 'x$name').

Those changes are in IRanges 1.15.35 and GenomicRanges 1.9.48. Please let me know if you have any questions or concerns about this.

Thanks,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
modified 7.2 years ago by Steve Lianoglou 12k • written 7.2 years ago by Hervé Pagès ♦♦ 14k
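For readers who want to try the two access styles described in the announcement, here is a minimal sketch. It assumes a reasonably recent GenomicRanges installation; the toy object `gr` and the column names `score`, `gene`, and `flag` are made up for illustration and do not come from the post.

```r
# Minimal sketch; requires the Bioconductor package GenomicRanges.
library(GenomicRanges)

# A toy GRanges with two metadata columns
# (the names 'score' and 'gene' are illustrative assumptions).
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(1, 50, 100), width = 10),
  strand   = c("+", "-", "*"),
  score    = c(0.5, 1.2, 3.4),
  gene     = c("a", "b", "c")
)

# Preferred accessor: returns all metadata columns as a DataFrame.
mcols(gr)

# Form recommended for scripts and packages.
mcols(gr)$score

# Interactive shortcut provided by the "$" method for GRanges.
gr$score

# The replacement forms work the same way.
mcols(gr)$score <- mcols(gr)$score * 2   # script-friendly form
gr$flag <- gr$score > 2                  # interactive shortcut ("$<-")
```

Both `mcols(gr)$score` and `gr$score` return the same vector; the difference is only in how explicit the code is about reaching into the metadata columns rather than the ranges themselves.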
7.2 years ago by
United States
Michael Lawrence 11k wrote:
On Thu, Aug 16, 2012 at 10:16 AM, Hervé Pagès wrote:

> - I've added the mcols() accessor as the preferred way (over elementMetadata() and values()) to access the metadata columns.
> [snip]

I am wondering about the name choice. What about "metaframe"? That way it is similar to the analogous but differently shaped "metadata" and conveys the nature of the return value.

> - I've added "$" and "$<-" methods for GRanges *only*. Provided as a convenience and as the result of strong popular demand.
> [snip]

Awesome. So it is now time to begin the campaign for adding this to GRangesList, since people will be confused about the inconsistency between the very similar (often interchangeable) data structures.

Thanks for the update (where was the bad news?),
Michael
Hi Michael,

On 08/16/2012 10:25 AM, Michael Lawrence wrote:

> I am wondering about the name choice. What about "metaframe"? That way it is similar to the analogous but differently shaped "metadata" and conveys the nature of the return value.

The name of the accessor should be consistent with the English term we use in emails or in the documentation. I don't remember anybody referring to the metadata columns as the "meta frame" or the "metadata frame". Besides, "mcols" is shorter than "metaframe".

> Awesome. So it is now time to begin the campaign for adding this to GRangesList, since people will be confused about the inconsistency between the very similar (often interchangeable) data structures.

"$" is already implemented for GRangesList and does the right thing.

Cheers,
H.

written 7.2 years ago by Hervé Pagès ♦♦ 14k

Tim Triche, Jr. wrote:

Right around the time I get used to using values(GR) all over the place... and, an excellent opportunity to improve R's Unicode support, squandered! ;-)

As long as things work right for SummarizedExperiments and their coercions to/from other common data structures, I'm certainly not going to complain. I'm not sure how I feel about having urged Hervé and Martin to compromise their principles about this, but it will make some people's lives a lot easier. Not least mine, as I try to get examples and documentation completed for the 'regulatoR' package I've been working on lately.

Having seen the logic behind your original design decisions, I understand why you were reluctant to do this. If it turns out that we have demanded "enough rope to shoot ourselves in the foot", well, we asked for it! So, thank you, quite a lot, for including something that could well be as dangerous as it is handy.

Thanks,
--t
Thanks Herve, this change is much appreciated.

Kasper

written 7.2 years ago by Kasper Daniel Hansen 6.4k

Dear listers,

Apologies for asking a general statistics question here, but maybe someone will be willing to help me.

I'm analysing an Illumina dataset that comes from 4 biological groups. For simplicity let's call them:

Control
Drug A
Drug B - Concentration 1
Drug B - Concentration 2

Each group has 4 biological replicates and they were hybridised across 2 chips so that each chip had 2 samples from each group. In terms of biological questions asked, Drug A is being compared to Control. And two concentrations of Drug B are compared to Control as well as to each other. So Drug A is never compared to Drug B.

As far as I understand, for comparing Drug A to Control I have two options:

1) Extract data for Drug A and Control from the dataset and run a linear model on those;
2) Run a linear model on samples from all groups and set up contrasts to compare Drug A to Control.

Naturally, the second option has a higher number of experimental units, which brings variation down and results in more differentially expressed genes being detected between Drug A and Control.

Now my question is, is there anything wrong (ethically, statistically, etc.) with the second option?

Many thanks for your help!

Aliaksei.

written 7.2 years ago by Aliaksei Holik 350

On Fri, Aug 17, 2012 at 6:46 AM, Salvador wrote:

> Now my question is, is there anything wrong (ethically, statistically, etc.) with the second option?

Hi, Aliaksei.

Option 2 is the preferred option.

Sean
written 7.2 years ago by Sean Davis 21k

Hi,

On Thu, Aug 16, 2012 at 1:25 PM, Michael Lawrence wrote:

> I am wondering about the name choice. What about "metaframe"? That way it is similar to the analogous but differently shaped "metadata" and conveys the nature of the return value.

I guess the motivation is the length of the name, since most people will be writing it often. mframe perhaps, even though that's just as long as values, but more informative of what it actually is.

I'm just as happy w/ mcols though, to be honest.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

written 7.2 years ago by Steve Lianoglou 12k

Answer: good news and bad news
7.2 years ago by
Denali
Steve Lianoglou 12k wrote:

Hi,

On Thu, Aug 16, 2012 at 1:16 PM, Hervé Pagès wrote:

[snip]
> - I've added "$" and "$<-" methods for GRanges *only*. Provided as a convenience and as the result of strong popular demand. Note that those methods are not consistent with the other "$" and "$<-" methods defined in the IRanges/GenomicRanges infrastructure, and might confuse some users by making them believe that a GRanges object can be manipulated as a data.frame-like object. It is therefore recommended to use them only interactively, and their use in scripts or packages is discouraged (in that case, 'mcols(x)$name' should be used instead of 'x$name').
[/snip]

Nice!

There's a joke lying in here somewhere about the cause of the serious heatwaves in the US being due to the heat being chased out from Hades now that GRanges$ has landed, but comedy isn't my day job, so I'll leave it at that.

Thanks!
--steve
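On the side question earlier in the thread about the four-group Illumina design: option 2 (fit one linear model to all samples, then extract only the contrasts of interest) might look like the following limma sketch. Everything in it is an assumption for illustration: the expression data are simulated, the object names (`expr`, `group`, `chip`, `cm`) are hypothetical, and including the chip as an additive term in the model is one reasonable choice rather than something stated in the thread.

```r
# Minimal sketch with simulated data; requires the Bioconductor package limma.
library(limma)

set.seed(1)

# Four groups with four biological replicates each (labels are illustrative).
group <- factor(rep(c("Control", "DrugA", "DrugB1", "DrugB2"), each = 4),
                levels = c("Control", "DrugA", "DrugB1", "DrugB2"))

# Two samples from each group on each of the two chips.
chip <- factor(rep(c("chip1", "chip1", "chip2", "chip2"), times = 4))

# Simulated log2 expression matrix: 100 probes x 16 samples.
# This is a placeholder for the real, normalised data.
expr <- matrix(rnorm(100 * 16), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("s", 1:16)))

# One coefficient per group, plus an additive chip effect (an assumption).
design <- model.matrix(~ 0 + group + chip)
colnames(design) <- c(levels(group), "chip2")

# Fit the model once, using all 16 samples.
fit <- lmFit(expr, design)

# Only the comparisons of interest; Drug A is never compared with Drug B.
cm <- makeContrasts(
  DrugA_vs_Control  = DrugA - Control,
  DrugB1_vs_Control = DrugB1 - Control,
  DrugB2_vs_Control = DrugB2 - Control,
  DrugB2_vs_DrugB1  = DrugB2 - DrugB1,
  levels = design
)

fit2 <- eBayes(contrasts.fit(fit, cm))

# Top probes for the Drug A versus Control comparison.
topTable(fit2, coef = "DrugA_vs_Control", number = 5)
```

With all 16 samples and 5 coefficients, this model leaves 11 residual degrees of freedom per probe, compared with 5 if only the 8 Control and Drug A samples were fitted with the same chip term. That larger pool for estimating the per-probe variance is where the extra sensitivity of option 2 comes from, on the assumption that the variance is similar across the four groups.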