Finding coding SNPs with predictCoding

0

Entering edit mode

Valerie Obenchain ★ 6.8k

@valerie-obenchain-4275

Last seen 4.0 years ago

United States

Hi Thomas, Alex, Changes below have been made in release 1.2.4 and devel 1.3.4. predictCoding : - Now returns a proteinID that is the triplet number in the protein - The refAA and varAA are the 1-letter amino acid code but not the 3-letter code (i.e., 'G' but not 'Gly'). We don't currently have a function that returns the 3-letter code. Is this of interest/use? - Over the next dev cycle I will implement methods for annotations (subject argument) to be gff/gtf. Currently genomes (seqSource argument) can be a BSgenome or fasta file. locateVariants: To add flexibility for finding variants in a particular region the function now dispatches on a 'region' argument (e.g., CodingVariants(), IntronVariants(), etc.). I've included a SpliceSiteVariants() that counts ranges that overlap with any portion of the first 2 or last 2 nucleotides of an intron. Called as, locateVariants(query, subject, range=SpliceSiteVariants()) Thanks again for the suggestions. Valerie On 03/05/2012 01:25 AM, Alex Gutteridge wrote: > Hi Valerie, > > I'd exactly echo what Thomas wrote (thanks Thomas!). My specific use > case is mapping coding SNPs to PDB protein structures, so something > that encodes 'His88 (CAT) -> Gln88 (CAA)' is all I need. > > Alex Gutteridge. > > On 03.03.2012 00:58, Thomas Girke wrote: >> Hi Valerie, >> >> Based on my experience the position in the complete protein (rather than >> CDS) sequence would be the most important piece of information to record >> here. For instance, if a SNP changes the 88th triplet CAT (His) in the >> ORF of a transcript to CAA (Gln) then you want to record it like this: >> His88 (CAT) -> Gln88 (CAA). This way the user can map this change to a >> protein structure or inject it into the corresponding protein sequence >> without any additional remapping or coding. >> >> Another feature to consider are SNPs affecting splice sites (commonly >> first and last two nucleotides of an intron). >> >> If possible it would be also very useful to support users who want to >> work with their custom genomes and annotations files provided as FASTA >> and GFF/GTF files, respectively. >> >> Best, >> >> Thomas >> >> >> On Fri, Mar 02, 2012 at 11:31:57PM +0000, Valerie Obenchain wrote: >>> Slight change to this - >>> >>> I'm now returning the following new columns, >>> >>> \item{\code{seqTxLoc}}{ >>> Location in transcript-based coordinates of the first >>> nucleotide in >>> the codon sequence to be translated. This position >>> corresponds to the >>> first nucleotide in both the \code{refSeq} and \code{varSeq} >>> columns. >>> } >>> \item{\code{varTxLoc}}{ >>> Location in transcript-based coordinates of the first >>> nucleotide in >>> the variant. This value will be the same as \code{seqTxLoc} >>> when the >>> variant starts exactly at the beginning of a codon. >>> } >>> \item{\code{varCdsLoc}}{ >>> Location in cds-based coordinates of the first nucleotide in >>> the variant. This position is relative to the start of the >>> cds region >>> defined in the \code{subject} annotation. >>> } >>> \item{\code{subjStrand}}{ >>> The strand of the \code{subject} the variant matched. >>> \code{predictCoding} >>> determines which variants fall in a coding region by finding the >>> overlaps >>> between the \code{query} and \code{subject}. The \code{query} >>> may be >>> un-stranded \sQuote{*} but the \code{subject} annotation will >>> have a strand. >>> } >>> >>> >>> You are interested in 'protein coordinates'. Does 'varCdsLoc' described >>> above meet the need or are you looking for the actual codon number in >>> the coding sequence? I am interested in hearing more about what you are >>> doing with the protein coordinates, how you are using them. It would >>> help us better design future functions. >>> >>> Thanks, >>> Valerie >>> >>> On 03/02/2012 01:11 AM, Alex Gutteridge wrote: >>> > Thanks Valerie - much appreciated! >>> > >>> > On 01.03.2012 21:30, Valerie Obenchain wrote: >>> >> A 'txLoc' column has been added to the output of predictCoding. >>> >> Available in devel version 1.1.57. >>> >> >>> >> Valerie >>> >> >>> >> >>> >> On 02/28/2012 08:20 AM, Valerie Obenchain wrote: >>> >>> Good suggestion. Yes, predictCoding is does this internally. I'll >>> >>> post back here when this has been added. >>> >>> >>> >>> Valerie >>> >>> >>> >>> >>> >>> >>> >>> On 02/28/2012 01:49 AM, Alex Gutteridge wrote: >>> >>>> Hi Valerie, >>> >>>> >>> >>>> Thanks everything works great now. One small feature request - >>> >>>> would it be hard to output the protein sequence position of the >>> >>>> coding SNPs? At the moment once I've run predictCoding I'm >>> >>>> re-extracting the cds and working out the position of each coding >>> >>>> SNP so I can see where in the protein sequence it is, but it seems >>> >>>> like this is probably just replicating what predictCoding must be >>> >>>> doing internally anyway? >>> >>>> >>> >>>> Alex Gutteridge >>> >>> >>> >>> >>> >>> On 02/24/2012 10:39 AM, Valerie Obenchain wrote: >>> >>>> Hi Alex, >>> >>>> >>> >>>> Thanks for the bug report. The cdsID was taken from an overlap >>> >>>> between the query and GRangesList of cds by transcripts. This gave >>> >>>> the correct transcript number but (incorrectly) took the first cds >>> >>>> number in the list by default. Now fixed in devel 1.1.55. >>> >>>> >>> >>>> I've also updated the man page. >>> >>>> >>> >>>> Valerie >>> >>>> >>> >>>> >>> >>>> >>> >>>> On 02/24/2012 02:08 AM, Alex Gutteridge wrote: >>> >>>>> On 22.02.2012 18:58, Hervé Pagès wrote: >>> >>>>>> Hi Alex, >>> >>>>>> >>> >>>>>> On 02/22/2012 03:56 AM, Alex Gutteridge wrote: >>> >>>>> >>> >>>>> [...] >>> >>>>> >>> >>>>>>> But the predictCoding call gives this error: >>> >>>>>>> >>> >>>>>>> Error in .setSeqNames(x, value) : >>> >>>>>>> The replacement value for isActiveSeq must be a logical vector, >>> >>>>>>> with >>> >>>>>>> names that match the seqlevels of the object >>> >>>>>> >>> >>>>>> The error message doesn't help much but I think the pb is >>> that you >>> >>>>>> didn't rename chMT properly. Try to do this: >>> >>>>>> >>> >>>>>> seqlevels(snps) <- gsub("chrMT", "chrM", seqlevels(snps)) >>> >>>>>> >>> >>>>>> before you start the for(eg in entrez.ids){..} loop again. >>> >>>>>> >>> >>>>>> Cheers, >>> >>>>>> H. >>> >>>>> >>> >>>>> Thanks Herv? that nailed it. I'm having some difficulty >>> joining up >>> >>>>> the output of predictCoding() with the query SNPs though. If >>> >>>>> someone could point out where the disconnect in my thinking is I >>> >>>>> would appreciate it! >>> >>>>> >>> >>>>> Here's my (now edited down) script: >>> >>>>> >>> >>>>> library(BSgenome.Hsapiens.UCSC.hg19) >>> >>>>> library(VariantAnnotation) >>> >>>>> library(SNPlocs.Hsapiens.dbSNP.20110815) >>> >>>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene) >>> >>>>> >>> >>>>> entrez.ids = c('6335') >>> >>>>> txdb19 = TxDb.Hsapiens.UCSC.hg19.knownGene >>> >>>>> >>> >>>>> snps = getSNPlocs(c("ch1","ch2"),as.GRanges=T) >>> >>>>> seqlevels(snps) <- gsub("ch", "chr", seqlevels(snps)) >>> >>>>> seqlevels(snps) <- gsub("chrMT", "chrM", seqlevels(snps)) >>> >>>>> >>> >>>>> gene.list = cdsBy(txdb19, by="gene") >>> >>>>> vsd.list = gene.list[entrez.ids] >>> >>>>> cds.list = cdsBy(txdb19,by="tx") >>> >>>>> >>> >>>>> eg = entrez.ids[1] >>> >>>>> >>> >>>>> snp.idx = unique(queryHits(findOverlaps(snps, vsd.list[[eg]]))) >>> >>>>> eg.snps = snps[snp.idx] >>> >>>>> iupac = values(eg.snps)[,"alleles_as_ambig"] >>> >>>>> eg.snps.exp = rep(eg.snps, nchar(IUPAC_CODE_MAP[iupac])) >>> >>>>> variant.alleles = >>> >>>>> >>> DNAStringSet(strsplit(paste(IUPAC_CODE_MAP[iupac],collapse=""),"") [[1]]) >>> >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> aa = >>> >>>>> >>> predictCoding(eg.snps.exp,txdb19,seqSource=Hsapiens,varAllele=vari ant.alleles) >>> >>>>> >>> >>>>> ##### >>> >>>>> >>> >>>>> Then if I query the predictCoding results in aa in an interactive >>> >>>>> session I get the following (see inline comments for what I think >>> >>>>> should be happening, but I must be misinterpreting what queryID >>> >>>>> means) >>> >>>>> >>> >>>>> The docs for predictCoding() contain a small typo >>> >>>>> (s/queryHits/queryID), but otherwise seem clear? >>> >>>>> >>> >>>>> Columns include ?queryID?, ?consequence?, ?refSeq?, ?varSeq?, >>> >>>>> ?refAA?, ?varAA?, ?txID?, ?geneID?, and ?cdsID?. >>> >>>>> >>> >>>>> ?queryHits? The ?queryHits? column provides a map back to >>> the >>> >>>>> variants in the original ?query?. If the ?query? was a >>> >>>>> ?VCF? >>> >>>>> object this index corresponds to the row in the >>> >>>>> ?GRanges? in >>> >>>>> the ?rowData? slot. If ?query? was an expanded >>> ?GRanges?, >>> >>>>> ?RangedData? or ?RangesList? the index corresponds to >>> >>>>> the row >>> >>>>> in the expanded object. >>> >>>>> >>> >>>>> ##### >>> >>>>> >>> >>>>>> aa[1,] >>> >>>>> DataFrame with 1 row and 9 columns >>> >>>>> queryID consequence refSeq varSeq >>> refAA >>> >>>>> <integer> <factor> <dnastringset> <dnastringset> <aastringset> >>> >>>>> 1 1 nonsynonymous CTC ATC >>> L >>> >>>>> varAA txID geneID cdsID >>> >>>>> <aastringset> <character> <factor> <integer> >>> >>>>> 1 I 10921 6335 33668 >>> >>>>>> #So the first SNP (queryID: 1) is nonsynonymous and maps to tx >>> >>>>>> '10921' and cds '33668'. >>> >>>>>> #If I look at the first query SNP I get this: >>> >>>>>> eg.snps.exp[aa[1,'queryID'],] >>> >>>>> GRanges with 1 range and 2 elementMetadata values: >>> >>>>> seqnames ranges strand | RefSNP_id >>> >>>>> alleles_as_ambig >>> >>>>> <rle> <iranges> <rle> | <character> <character> >>> >>>>> [1] chr2 [167055370, 167055370] * | 111558968 >>> >>>>> R >>> >>>>> --- >>> >>>>> seqlengths: >>> >>>>> chr1 chr2 chr3 chr4 chr5 chr6 ... chr20 chr21 chr22 >>> chrX >>> >>>>> chrY chrM >>> >>>>> NA NA NA NA NA NA ... NA NA NA NA >>> >>>>> NA NA >>> >>>>>> #So SNP 1 is at 167055370 on chr2 >>> >>>>>> #But if I check tx '10921' I see that the cds overlapping >>> >>>>>> 167055370 is actually '33651' >>> >>>>>> #And cds '33668' is at the other end of the tx: >>> >>>>>> cds.list[[aa[1,'txID']]] >>> >>>>> GRanges with 26 ranges and 3 elementMetadata values: >>> >>>>> seqnames ranges strand | cds_id >>> >>>>> cds_name >>> >>>>> <rle> <iranges> <rle> | <integer> <character> >>> >>>>> [1] chr2 [167168009, 167168266] - | 33668 <na> >>> >>>>> [2] chr2 [167163466, 167163584] - | 33667 <na> >>> >>>>> [3] chr2 [167163020, 167163109] - | 33666 <na> >>> >>>>> [4] chr2 [167162302, 167162430] - | 33647 <na> >>> >>>>> [5] chr2 [167160748, 167160839] - | 33646 <na> >>> >>>>> [6] chr2 [167159600, 167159812] - | 33645 <na> >>> >>>>> [7] chr2 [167151109, 167151172] - | 33644 <na> >>> >>>>> [8] chr2 [167149741, 167149882] - | 33643 <na> >>> >>>>> [9] chr2 [167144947, 167145153] - | 33642 <na> >>> >>>>> ... ... ... ... ... ... >>> >>>>> ... >>> >>>>> [18] chr2 [167099012, 167099166] - | 33659 <na> >>> >>>>> [19] chr2 [167094604, 167094777] - | 33658 <na> >>> >>>>> [20] chr2 [167089850, 167089972] - | 33657 <na> >>> >>>>> [21] chr2 [167085201, 167085482] - | 33656 <na> >>> >>>>> [22] chr2 [167084180, 167084233] - | 33655 <na> >>> >>>>> [23] chr2 [167083077, 167083214] - | 33654 <na> >>> >>>>> [24] chr2 [167060870, 167060974] - | 33653 <na> >>> >>>>> [25] chr2 [167060465, 167060735] - | 33652 <na> >>> >>>>> [26] chr2 [167055182, 167056374] - | 33651 <na> >>> >>>>> exon_rank >>> >>>>> <integer> >>> >>>>> [1] 2 >>> >>>>> [2] 3 >>> >>>>> [3] 4 >>> >>>>> [4] 5 >>> >>>>> [5] 6 >>> >>>>> [6] 7 >>> >>>>> [7] 8 >>> >>>>> [8] 9 >>> >>>>> [9] 10 >>> >>>>> ... ... >>> >>>>> [18] 19 >>> >>>>> [19] 20 >>> >>>>> [20] 21 >>> >>>>> [21] 22 >>> >>>>> [22] 23 >>> >>>>> [23] 24 >>> >>>>> [24] 25 >>> >>>>> [25] 26 >>> >>>>> [26] 27 >>> >>>>> --- >>> >>>>> seqlengths: >>> >>>>> chr1 chr2 ... >>> >>>>> chr18_gl000207_random >>> >>>>> 249250621 243199373 ... >>> >>>>> 4262 >>> >>>>> >>> >>>>> >>> >>>> >>> >>>> _______________________________________________ >>> >>>> Bioconductor mailing list >>> >>>> Bioconductor at r-project.org >>> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> >>>> Search the archives: >>> >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >>> >>> _______________________________________________ >>> >>> Bioconductor mailing list >>> >>> Bioconductor at r-project.org >>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> >>> Search the archives: >>> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> > >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >

SNP Annotation BSgenome ctc BSgenome cycle genomes SNP Annotation BSgenome ctc BSgenome • 1.6k views

ADD COMMENT • link updated 13.8 years ago by Thomas Girke ★ 1.7k • written 13.8 years ago by Valerie Obenchain ★ 6.8k

0

Entering edit mode

Thomas Girke ★ 1.7k

@thomas-girke-993

Last seen 3 days ago

United States

Hi Valerie, Thanks for the update and for implementing these functionalities. >From my perspective, reporting in 1-letter amino acid code is totally sufficient. Thanks, Thomas On Thu, Apr 05, 2012 at 08:47:37PM +0000, Valerie Obenchain wrote: > Hi Thomas, Alex, > > Changes below have been made in release 1.2.4 and devel 1.3.4. > > predictCoding : > > - Now returns a proteinID that is the triplet number in the protein > > - The refAA and varAA are the 1-letter amino acid code but not the > 3-letter code (i.e., 'G' but not 'Gly'). We don't currently have a > function that returns the 3-letter code. Is this of interest/use? > > - Over the next dev cycle I will implement methods for annotations > (subject argument) to be gff/gtf. Currently genomes (seqSource argument) > can be a BSgenome or fasta file. > > > locateVariants: > > To add flexibility for finding variants in a particular region the > function now dispatches on a 'region' argument (e.g., CodingVariants(), > IntronVariants(), etc.). I've included a SpliceSiteVariants() that > counts ranges that overlap with any portion of the first 2 or last 2 > nucleotides of an intron. Called as, > > locateVariants(query, subject, range=SpliceSiteVariants()) > > Thanks again for the suggestions. > > Valerie > > > On 03/05/2012 01:25 AM, Alex Gutteridge wrote: > > Hi Valerie, > > > > I'd exactly echo what Thomas wrote (thanks Thomas!). My specific use > > case is mapping coding SNPs to PDB protein structures, so something > > that encodes 'His88 (CAT) -> Gln88 (CAA)' is all I need. > > > > Alex Gutteridge. > > > > On 03.03.2012 00:58, Thomas Girke wrote: > >> Hi Valerie, > >> > >> Based on my experience the position in the complete protein (rather than > >> CDS) sequence would be the most important piece of information to record > >> here. For instance, if a SNP changes the 88th triplet CAT (His) in the > >> ORF of a transcript to CAA (Gln) then you want to record it like this: > >> His88 (CAT) -> Gln88 (CAA). This way the user can map this change to a > >> protein structure or inject it into the corresponding protein sequence > >> without any additional remapping or coding. > >> > >> Another feature to consider are SNPs affecting splice sites (commonly > >> first and last two nucleotides of an intron). > >> > >> If possible it would be also very useful to support users who want to > >> work with their custom genomes and annotations files provided as FASTA > >> and GFF/GTF files, respectively. > >> > >> Best, > >> > >> Thomas > >> > >> > >> On Fri, Mar 02, 2012 at 11:31:57PM +0000, Valerie Obenchain wrote: > >>> Slight change to this - > >>> > >>> I'm now returning the following new columns, > >>> > >>> \item{\code{seqTxLoc}}{ > >>> Location in transcript-based coordinates of the first > >>> nucleotide in > >>> the codon sequence to be translated. This position > >>> corresponds to the > >>> first nucleotide in both the \code{refSeq} and \code{varSeq} > >>> columns. > >>> } > >>> \item{\code{varTxLoc}}{ > >>> Location in transcript-based coordinates of the first > >>> nucleotide in > >>> the variant. This value will be the same as \code{seqTxLoc} > >>> when the > >>> variant starts exactly at the beginning of a codon. > >>> } > >>> \item{\code{varCdsLoc}}{ > >>> Location in cds-based coordinates of the first nucleotide in > >>> the variant. This position is relative to the start of the > >>> cds region > >>> defined in the \code{subject} annotation. > >>> } > >>> \item{\code{subjStrand}}{ > >>> The strand of the \code{subject} the variant matched. > >>> \code{predictCoding} > >>> determines which variants fall in a coding region by finding the > >>> overlaps > >>> between the \code{query} and \code{subject}. The \code{query} > >>> may be > >>> un-stranded \sQuote{*} but the \code{subject} annotation will > >>> have a strand. > >>> } > >>> > >>> > >>> You are interested in 'protein coordinates'. Does 'varCdsLoc' described > >>> above meet the need or are you looking for the actual codon number in > >>> the coding sequence? I am interested in hearing more about what you are > >>> doing with the protein coordinates, how you are using them. It would > >>> help us better design future functions. > >>> > >>> Thanks, > >>> Valerie > >>> > >>> On 03/02/2012 01:11 AM, Alex Gutteridge wrote: > >>> > Thanks Valerie - much appreciated! > >>> > > >>> > On 01.03.2012 21:30, Valerie Obenchain wrote: > >>> >> A 'txLoc' column has been added to the output of predictCoding. > >>> >> Available in devel version 1.1.57. > >>> >> > >>> >> Valerie > >>> >> > >>> >> > >>> >> On 02/28/2012 08:20 AM, Valerie Obenchain wrote: > >>> >>> Good suggestion. Yes, predictCoding is does this internally. I'll > >>> >>> post back here when this has been added. > >>> >>> > >>> >>> Valerie > >>> >>> > >>> >>> > >>> >>> > >>> >>> On 02/28/2012 01:49 AM, Alex Gutteridge wrote: > >>> >>>> Hi Valerie, > >>> >>>> > >>> >>>> Thanks everything works great now. One small feature request - > >>> >>>> would it be hard to output the protein sequence position of the > >>> >>>> coding SNPs? At the moment once I've run predictCoding I'm > >>> >>>> re-extracting the cds and working out the position of each coding > >>> >>>> SNP so I can see where in the protein sequence it is, but it seems > >>> >>>> like this is probably just replicating what predictCoding must be > >>> >>>> doing internally anyway? > >>> >>>> > >>> >>>> Alex Gutteridge > >>> >>> > >>> >>> > >>> >>> On 02/24/2012 10:39 AM, Valerie Obenchain wrote: > >>> >>>> Hi Alex, > >>> >>>> > >>> >>>> Thanks for the bug report. The cdsID was taken from an overlap > >>> >>>> between the query and GRangesList of cds by transcripts. This gave > >>> >>>> the correct transcript number but (incorrectly) took the first cds > >>> >>>> number in the list by default. Now fixed in devel 1.1.55. > >>> >>>> > >>> >>>> I've also updated the man page. > >>> >>>> > >>> >>>> Valerie > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> On 02/24/2012 02:08 AM, Alex Gutteridge wrote: > >>> >>>>> On 22.02.2012 18:58, Hervé Pagès wrote: > >>> >>>>>> Hi Alex, > >>> >>>>>> > >>> >>>>>> On 02/22/2012 03:56 AM, Alex Gutteridge wrote: > >>> >>>>> > >>> >>>>> [...] > >>> >>>>> > >>> >>>>>>> But the predictCoding call gives this error: > >>> >>>>>>> > >>> >>>>>>> Error in .setSeqNames(x, value) : > >>> >>>>>>> The replacement value for isActiveSeq must be a logical vector, > >>> >>>>>>> with > >>> >>>>>>> names that match the seqlevels of the object > >>> >>>>>> > >>> >>>>>> The error message doesn't help much but I think the pb is > >>> that you > >>> >>>>>> didn't rename chMT properly. Try to do this: > >>> >>>>>> > >>> >>>>>> seqlevels(snps) <- gsub("chrMT", "chrM", seqlevels(snps)) > >>> >>>>>> > >>> >>>>>> before you start the for(eg in entrez.ids){..} loop again. > >>> >>>>>> > >>> >>>>>> Cheers, > >>> >>>>>> H. > >>> >>>>> > >>> >>>>> Thanks Herv? that nailed it. I'm having some difficulty > >>> joining up > >>> >>>>> the output of predictCoding() with the query SNPs though. If > >>> >>>>> someone could point out where the disconnect in my thinking is I > >>> >>>>> would appreciate it! > >>> >>>>> > >>> >>>>> Here's my (now edited down) script: > >>> >>>>> > >>> >>>>> library(BSgenome.Hsapiens.UCSC.hg19) > >>> >>>>> library(VariantAnnotation) > >>> >>>>> library(SNPlocs.Hsapiens.dbSNP.20110815) > >>> >>>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene) > >>> >>>>> > >>> >>>>> entrez.ids = c('6335') > >>> >>>>> txdb19 = TxDb.Hsapiens.UCSC.hg19.knownGene > >>> >>>>> > >>> >>>>> snps = getSNPlocs(c("ch1","ch2"),as.GRanges=T) > >>> >>>>> seqlevels(snps) <- gsub("ch", "chr", seqlevels(snps)) > >>> >>>>> seqlevels(snps) <- gsub("chrMT", "chrM", seqlevels(snps)) > >>> >>>>> > >>> >>>>> gene.list = cdsBy(txdb19, by="gene") > >>> >>>>> vsd.list = gene.list[entrez.ids] > >>> >>>>> cds.list = cdsBy(txdb19,by="tx") > >>> >>>>> > >>> >>>>> eg = entrez.ids[1] > >>> >>>>> > >>> >>>>> snp.idx = unique(queryHits(findOverlaps(snps, vsd.list[[eg]]))) > >>> >>>>> eg.snps = snps[snp.idx] > >>> >>>>> iupac = values(eg.snps)[,"alleles_as_ambig"] > >>> >>>>> eg.snps.exp = rep(eg.snps, nchar(IUPAC_CODE_MAP[iupac])) > >>> >>>>> variant.alleles = > >>> >>>>> > >>> DNAStringSet(strsplit(paste(IUPAC_CODE_MAP[iupac],collapse="")," ")[[1]]) > >>> > >>> >>>>> > >>> >>>>> > >>> >>>>> > >>> >>>>> aa = > >>> >>>>> > >>> predictCoding(eg.snps.exp,txdb19,seqSource=Hsapiens,varAllele=va riant.alleles) > >>> >>>>> > >>> >>>>> ##### > >>> >>>>> > >>> >>>>> Then if I query the predictCoding results in aa in an interactive > >>> >>>>> session I get the following (see inline comments for what I think > >>> >>>>> should be happening, but I must be misinterpreting what queryID > >>> >>>>> means) > >>> >>>>> > >>> >>>>> The docs for predictCoding() contain a small typo > >>> >>>>> (s/queryHits/queryID), but otherwise seem clear? > >>> >>>>> > >>> >>>>> Columns include ?queryID?, ?consequence?, ?refSeq?, ?varSeq?, > >>> >>>>> ?refAA?, ?varAA?, ?txID?, ?geneID?, and ?cdsID?. > >>> >>>>> > >>> >>>>> ?queryHits? The ?queryHits? column provides a map back to > >>> the > >>> >>>>> variants in the original ?query?. If the ?query? was a > >>> >>>>> ?VCF? > >>> >>>>> object this index corresponds to the row in the > >>> >>>>> ?GRanges? in > >>> >>>>> the ?rowData? slot. If ?query? was an expanded > >>> ?GRanges?, > >>> >>>>> ?RangedData? or ?RangesList? the index corresponds to > >>> >>>>> the row > >>> >>>>> in the expanded object. > >>> >>>>> > >>> >>>>> ##### > >>> >>>>> > >>> >>>>>> aa[1,] > >>> >>>>> DataFrame with 1 row and 9 columns > >>> >>>>> queryID consequence refSeq varSeq > >>> refAA > >>> >>>>> <integer> <factor> <dnastringset> <dnastringset> <aastringset> > >>> >>>>> 1 1 nonsynonymous CTC ATC > >>> L > >>> >>>>> varAA txID geneID cdsID > >>> >>>>> <aastringset> <character> <factor> <integer> > >>> >>>>> 1 I 10921 6335 33668 > >>> >>>>>> #So the first SNP (queryID: 1) is nonsynonymous and maps to tx > >>> >>>>>> '10921' and cds '33668'. > >>> >>>>>> #If I look at the first query SNP I get this: > >>> >>>>>> eg.snps.exp[aa[1,'queryID'],] > >>> >>>>> GRanges with 1 range and 2 elementMetadata values: > >>> >>>>> seqnames ranges strand | RefSNP_id > >>> >>>>> alleles_as_ambig > >>> >>>>> <rle> <iranges> <rle> | <character> <character> > >>> >>>>> [1] chr2 [167055370, 167055370] * | 111558968 > >>> >>>>> R > >>> >>>>> --- > >>> >>>>> seqlengths: > >>> >>>>> chr1 chr2 chr3 chr4 chr5 chr6 ... chr20 chr21 chr22 > >>> chrX > >>> >>>>> chrY chrM > >>> >>>>> NA NA NA NA NA NA ... NA NA NA NA > >>> >>>>> NA NA > >>> >>>>>> #So SNP 1 is at 167055370 on chr2 > >>> >>>>>> #But if I check tx '10921' I see that the cds overlapping > >>> >>>>>> 167055370 is actually '33651' > >>> >>>>>> #And cds '33668' is at the other end of the tx: > >>> >>>>>> cds.list[[aa[1,'txID']]] > >>> >>>>> GRanges with 26 ranges and 3 elementMetadata values: > >>> >>>>> seqnames ranges strand | cds_id > >>> >>>>> cds_name > >>> >>>>> <rle> <iranges> <rle> | <integer> <character> > >>> >>>>> [1] chr2 [167168009, 167168266] - | 33668 <na> > >>> >>>>> [2] chr2 [167163466, 167163584] - | 33667 <na> > >>> >>>>> [3] chr2 [167163020, 167163109] - | 33666 <na> > >>> >>>>> [4] chr2 [167162302, 167162430] - | 33647 <na> > >>> >>>>> [5] chr2 [167160748, 167160839] - | 33646 <na> > >>> >>>>> [6] chr2 [167159600, 167159812] - | 33645 <na> > >>> >>>>> [7] chr2 [167151109, 167151172] - | 33644 <na> > >>> >>>>> [8] chr2 [167149741, 167149882] - | 33643 <na> > >>> >>>>> [9] chr2 [167144947, 167145153] - | 33642 <na> > >>> >>>>> ... ... ... ... ... ... > >>> >>>>> ... > >>> >>>>> [18] chr2 [167099012, 167099166] - | 33659 <na> > >>> >>>>> [19] chr2 [167094604, 167094777] - | 33658 <na> > >>> >>>>> [20] chr2 [167089850, 167089972] - | 33657 <na> > >>> >>>>> [21] chr2 [167085201, 167085482] - | 33656 <na> > >>> >>>>> [22] chr2 [167084180, 167084233] - | 33655 <na> > >>> >>>>> [23] chr2 [167083077, 167083214] - | 33654 <na> > >>> >>>>> [24] chr2 [167060870, 167060974] - | 33653 <na> > >>> >>>>> [25] chr2 [167060465, 167060735] - | 33652 <na> > >>> >>>>> [26] chr2 [167055182, 167056374] - | 33651 <na> > >>> >>>>> exon_rank > >>> >>>>> <integer> > >>> >>>>> [1] 2 > >>> >>>>> [2] 3 > >>> >>>>> [3] 4 > >>> >>>>> [4] 5 > >>> >>>>> [5] 6 > >>> >>>>> [6] 7 > >>> >>>>> [7] 8 > >>> >>>>> [8] 9 > >>> >>>>> [9] 10 > >>> >>>>> ... ... > >>> >>>>> [18] 19 > >>> >>>>> [19] 20 > >>> >>>>> [20] 21 > >>> >>>>> [21] 22 > >>> >>>>> [22] 23 > >>> >>>>> [23] 24 > >>> >>>>> [24] 25 > >>> >>>>> [25] 26 > >>> >>>>> [26] 27 > >>> >>>>> --- > >>> >>>>> seqlengths: > >>> >>>>> chr1 chr2 ... > >>> >>>>> chr18_gl000207_random > >>> >>>>> 249250621 243199373 ... > >>> >>>>> 4262 > >>> >>>>> > >>> >>>>> > >>> >>>> > >>> >>>> _______________________________________________ > >>> >>>> Bioconductor mailing list > >>> >>>> Bioconductor at r-project.org > >>> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> >>>> Search the archives: > >>> >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> >>> > >>> >>> _______________________________________________ > >>> >>> Bioconductor mailing list > >>> >>> Bioconductor at r-project.org > >>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> >>> Search the archives: > >>> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > > >>> > >>> _______________________________________________ > >>> Bioconductor mailing list > >>> Bioconductor at r-project.org > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> Search the archives: > >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > > >

ADD COMMENT • link 13.8 years ago Thomas Girke ★ 1.7k

0

Entering edit mode

Yup, same from me. On 06.04.2012 00:25, Thomas Girke wrote: > Hi Valerie, > > Thanks for the update and for implementing these functionalities. > From my perspective, reporting in 1-letter amino acid code is totally > sufficient. > > Thanks, > > Thomas > > > > On Thu, Apr 05, 2012 at 08:47:37PM +0000, Valerie Obenchain wrote: >> Hi Thomas, Alex, >> >> Changes below have been made in release 1.2.4 and devel 1.3.4. >> >> predictCoding : >> >> - Now returns a proteinID that is the triplet number in the protein >> >> - The refAA and varAA are the 1-letter amino acid code but not the >> 3-letter code (i.e., 'G' but not 'Gly'). We don't currently have a >> function that returns the 3-letter code. Is this of interest/use? >> >> - Over the next dev cycle I will implement methods for annotations >> (subject argument) to be gff/gtf. Currently genomes (seqSource >> argument) >> can be a BSgenome or fasta file. >> >> >> locateVariants: >> >> To add flexibility for finding variants in a particular region the >> function now dispatches on a 'region' argument (e.g., >> CodingVariants(), >> IntronVariants(), etc.). I've included a SpliceSiteVariants() that >> counts ranges that overlap with any portion of the first 2 or last 2 >> nucleotides of an intron. Called as, >> >> locateVariants(query, subject, range=SpliceSiteVariants()) >> >> Thanks again for the suggestions. >> >> Valerie >> >> >> On 03/05/2012 01:25 AM, Alex Gutteridge wrote: >> > Hi Valerie, >> > >> > I'd exactly echo what Thomas wrote (thanks Thomas!). My specific >> use >> > case is mapping coding SNPs to PDB protein structures, so >> something >> > that encodes 'His88 (CAT) -> Gln88 (CAA)' is all I need. >> > >> > Alex Gutteridge. >> > >> > On 03.03.2012 00:58, Thomas Girke wrote: >> >> Hi Valerie, >> >> >> >> Based on my experience the position in the complete protein >> (rather than >> >> CDS) sequence would be the most important piece of information to >> record >> >> here. For instance, if a SNP changes the 88th triplet CAT (His) >> in the >> >> ORF of a transcript to CAA (Gln) then you want to record it like >> this: >> >> His88 (CAT) -> Gln88 (CAA). This way the user can map this change >> to a >> >> protein structure or inject it into the corresponding protein >> sequence >> >> without any additional remapping or coding. >> >> >> >> Another feature to consider are SNPs affecting splice sites >> (commonly >> >> first and last two nucleotides of an intron). >> >> >> >> If possible it would be also very useful to support users who >> want to >> >> work with their custom genomes and annotations files provided as >> FASTA >> >> and GFF/GTF files, respectively. >> >> >> >> Best, >> >> >> >> Thomas >> >> >> >> >> >> On Fri, Mar 02, 2012 at 11:31:57PM +0000, Valerie Obenchain >> wrote: >> >>> Slight change to this - >> >>> >> >>> I'm now returning the following new columns, >> >>> >> >>> \item{\code{seqTxLoc}}{ >> >>> Location in transcript-based coordinates of the first >> >>> nucleotide in >> >>> the codon sequence to be translated. This position >> >>> corresponds to the >> >>> first nucleotide in both the \code{refSeq} and >> \code{varSeq} >> >>> columns. >> >>> } >> >>> \item{\code{varTxLoc}}{ >> >>> Location in transcript-based coordinates of the first >> >>> nucleotide in >> >>> the variant. This value will be the same as >> \code{seqTxLoc} >> >>> when the >> >>> variant starts exactly at the beginning of a codon. >> >>> } >> >>> \item{\code{varCdsLoc}}{ >> >>> Location in cds-based coordinates of the first nucleotide >> in >> >>> the variant. This position is relative to the start of >> the >> >>> cds region >> >>> defined in the \code{subject} annotation. >> >>> } >> >>> \item{\code{subjStrand}}{ >> >>> The strand of the \code{subject} the variant matched. >> >>> \code{predictCoding} >> >>> determines which variants fall in a coding region by >> finding the >> >>> overlaps >> >>> between the \code{query} and \code{subject}. The >> \code{query} >> >>> may be >> >>> un-stranded \sQuote{*} but the \code{subject} annotation >> will >> >>> have a strand. >> >>> } >> >>> >> >>> >> >>> You are interested in 'protein coordinates'. Does 'varCdsLoc' >> described >> >>> above meet the need or are you looking for the actual codon >> number in >> >>> the coding sequence? I am interested in hearing more about what >> you are >> >>> doing with the protein coordinates, how you are using them. It >> would >> >>> help us better design future functions. >> >>> >> >>> Thanks, >> >>> Valerie >> >>> >> >>> On 03/02/2012 01:11 AM, Alex Gutteridge wrote: >> >>> > Thanks Valerie - much appreciated! >> >>> > >> >>> > On 01.03.2012 21:30, Valerie Obenchain wrote: >> >>> >> A 'txLoc' column has been added to the output of >> predictCoding. >> >>> >> Available in devel version 1.1.57. >> >>> >> >> >>> >> Valerie >> >>> >> >> >>> >> >> >>> >> On 02/28/2012 08:20 AM, Valerie Obenchain wrote: >> >>> >>> Good suggestion. Yes, predictCoding is does this internally. >> I'll >> >>> >>> post back here when this has been added. >> >>> >>> >> >>> >>> Valerie >> >>> >>> >> >>> >>> >> >>> >>> >> >>> >>> On 02/28/2012 01:49 AM, Alex Gutteridge wrote: >> >>> >>>> Hi Valerie, >> >>> >>>> >> >>> >>>> Thanks everything works great now. One small feature >> request - >> >>> >>>> would it be hard to output the protein sequence position of >> the >> >>> >>>> coding SNPs? At the moment once I've run predictCoding I'm >> >>> >>>> re-extracting the cds and working out the position of each >> coding >> >>> >>>> SNP so I can see where in the protein sequence it is, but >> it seems >> >>> >>>> like this is probably just replicating what predictCoding >> must be >> >>> >>>> doing internally anyway? >> >>> >>>> >> >>> >>>> Alex Gutteridge >> >>> >>> >> >>> >>> >> >>> >>> On 02/24/2012 10:39 AM, Valerie Obenchain wrote: >> >>> >>>> Hi Alex, >> >>> >>>> >> >>> >>>> Thanks for the bug report. The cdsID was taken from an >> overlap >> >>> >>>> between the query and GRangesList of cds by transcripts. >> This gave >> >>> >>>> the correct transcript number but (incorrectly) took the >> first cds >> >>> >>>> number in the list by default. Now fixed in devel 1.1.55. >> >>> >>>> >> >>> >>>> I've also updated the man page. >> >>> >>>> >> >>> >>>> Valerie >> >>> >>>> >> >>> >>>> >> >>> >>>> >> >>> >>>> On 02/24/2012 02:08 AM, Alex Gutteridge wrote: >> >>> >>>>> On 22.02.2012 18:58, Hervé Pagès wrote: >> >>> >>>>>> Hi Alex, >> >>> >>>>>> >> >>> >>>>>> On 02/22/2012 03:56 AM, Alex Gutteridge wrote: >> >>> >>>>> >> >>> >>>>> [...] >> >>> >>>>> >> >>> >>>>>>> But the predictCoding call gives this error: >> >>> >>>>>>> >> >>> >>>>>>> Error in .setSeqNames(x, value) : >> >>> >>>>>>> The replacement value for isActiveSeq must be a logical >> vector, >> >>> >>>>>>> with >> >>> >>>>>>> names that match the seqlevels of the object >> >>> >>>>>> >> >>> >>>>>> The error message doesn't help much but I think the pb is >> >>> that you >> >>> >>>>>> didn't rename chMT properly. Try to do this: >> >>> >>>>>> >> >>> >>>>>> seqlevels(snps) <- gsub("chrMT", "chrM", >> seqlevels(snps)) >> >>> >>>>>> >> >>> >>>>>> before you start the for(eg in entrez.ids){..} loop >> again. >> >>> >>>>>> >> >>> >>>>>> Cheers, >> >>> >>>>>> H. >> >>> >>>>> >> >>> >>>>> Thanks Herv? that nailed it. I'm having some difficulty >> >>> joining up >> >>> >>>>> the output of predictCoding() with the query SNPs though. >> If >> >>> >>>>> someone could point out where the disconnect in my >> thinking is I >> >>> >>>>> would appreciate it! >> >>> >>>>> >> >>> >>>>> Here's my (now edited down) script: >> >>> >>>>> >> >>> >>>>> library(BSgenome.Hsapiens.UCSC.hg19) >> >>> >>>>> library(VariantAnnotation) >> >>> >>>>> library(SNPlocs.Hsapiens.dbSNP.20110815) >> >>> >>>>> library(TxDb.Hsapiens.UCSC.hg19.knownGene) >> >>> >>>>> >> >>> >>>>> entrez.ids = c('6335') >> >>> >>>>> txdb19 = TxDb.Hsapiens.UCSC.hg19.knownGene >> >>> >>>>> >> >>> >>>>> snps = getSNPlocs(c("ch1","ch2"),as.GRanges=T) >> >>> >>>>> seqlevels(snps) <- gsub("ch", "chr", seqlevels(snps)) >> >>> >>>>> seqlevels(snps) <- gsub("chrMT", "chrM", seqlevels(snps)) >> >>> >>>>> >> >>> >>>>> gene.list = cdsBy(txdb19, by="gene") >> >>> >>>>> vsd.list = gene.list[entrez.ids] >> >>> >>>>> cds.list = cdsBy(txdb19,by="tx") >> >>> >>>>> >> >>> >>>>> eg = entrez.ids[1] >> >>> >>>>> >> >>> >>>>> snp.idx = unique(queryHits(findOverlaps(snps, >> vsd.list[[eg]]))) >> >>> >>>>> eg.snps = snps[snp.idx] >> >>> >>>>> iupac = values(eg.snps)[,"alleles_as_ambig"] >> >>> >>>>> eg.snps.exp = rep(eg.snps, nchar(IUPAC_CODE_MAP[iupac])) >> >>> >>>>> variant.alleles = >> >>> >>>>> >> >>> >> DNAStringSet(strsplit(paste(IUPAC_CODE_MAP[iupac],collapse=""),"")[ [1]]) >> >>> >> >>> >>>>> >> >>> >>>>> >> >>> >>>>> >> >>> >>>>> aa = >> >>> >>>>> >> >>> >> predictCoding(eg.snps.exp,txdb19,seqSource=Hsapiens,varAllele=varia nt.alleles) >> >>> >>>>> >> >>> >>>>> ##### >> >>> >>>>> >> >>> >>>>> Then if I query the predictCoding results in aa in an >> interactive >> >>> >>>>> session I get the following (see inline comments for what >> I think >> >>> >>>>> should be happening, but I must be misinterpreting what >> queryID >> >>> >>>>> means) >> >>> >>>>> >> >>> >>>>> The docs for predictCoding() contain a small typo >> >>> >>>>> (s/queryHits/queryID), but otherwise seem clear? >> >>> >>>>> >> >>> >>>>> Columns include ?queryID?, ?consequence?, ?refSeq?, >> ?varSeq?, >> >>> >>>>> ?refAA?, ?varAA?, ?txID?, ?geneID?, and ?cdsID?. >> >>> >>>>> >> >>> >>>>> ?queryHits? The ?queryHits? column provides a map >> back to >> >>> the >> >>> >>>>> variants in the original ?query?. If the ?query? >> was a >> >>> >>>>> ?VCF? >> >>> >>>>> object this index corresponds to the row in the >> >>> >>>>> ?GRanges? in >> >>> >>>>> the ?rowData? slot. If ?query? was an expanded >> >>> ?GRanges?, >> >>> >>>>> ?RangedData? or ?RangesList? the index >> corresponds to >> >>> >>>>> the row >> >>> >>>>> in the expanded object. >> >>> >>>>> >> >>> >>>>> ##### >> >>> >>>>> >> >>> >>>>>> aa[1,] >> >>> >>>>> DataFrame with 1 row and 9 columns >> >>> >>>>> queryID consequence refSeq varSeq >> >>> refAA >> >>> >>>>> <integer> <factor> <dnastringset> <dnastringset> >> <aastringset> >> >>> >>>>> 1 1 nonsynonymous CTC ATC >> >>> L >> >>> >>>>> varAA txID geneID cdsID >> >>> >>>>> <aastringset> <character> <factor> <integer> >> >>> >>>>> 1 I 10921 6335 33668 >> >>> >>>>>> #So the first SNP (queryID: 1) is nonsynonymous and maps >> to tx >> >>> >>>>>> '10921' and cds '33668'. >> >>> >>>>>> #If I look at the first query SNP I get this: >> >>> >>>>>> eg.snps.exp[aa[1,'queryID'],] >> >>> >>>>> GRanges with 1 range and 2 elementMetadata values: >> >>> >>>>> seqnames ranges strand | RefSNP_id >> >>> >>>>> alleles_as_ambig >> >>> >>>>> <rle> <iranges> <rle> | <character> <character> >> >>> >>>>> [1] chr2 [167055370, 167055370] * | 111558968 >> >>> >>>>> R >> >>> >>>>> --- >> >>> >>>>> seqlengths: >> >>> >>>>> chr1 chr2 chr3 chr4 chr5 chr6 ... chr20 chr21 >> chr22 >> >>> chrX >> >>> >>>>> chrY chrM >> >>> >>>>> NA NA NA NA NA NA ... NA NA >> NA NA >> >>> >>>>> NA NA >> >>> >>>>>> #So SNP 1 is at 167055370 on chr2 >> >>> >>>>>> #But if I check tx '10921' I see that the cds overlapping >> >>> >>>>>> 167055370 is actually '33651' >> >>> >>>>>> #And cds '33668' is at the other end of the tx: >> >>> >>>>>> cds.list[[aa[1,'txID']]] >> >>> >>>>> GRanges with 26 ranges and 3 elementMetadata values: >> >>> >>>>> seqnames ranges strand | >> cds_id >> >>> >>>>> cds_name >> >>> >>>>> <rle> <iranges> <rle> | <integer> <character> >> >>> >>>>> [1] chr2 [167168009, 167168266] - | >> 33668 <na> >> >>> >>>>> [2] chr2 [167163466, 167163584] - | >> 33667 <na> >> >>> >>>>> [3] chr2 [167163020, 167163109] - | >> 33666 <na> >> >>> >>>>> [4] chr2 [167162302, 167162430] - | >> 33647 <na> >> >>> >>>>> [5] chr2 [167160748, 167160839] - | >> 33646 <na> >> >>> >>>>> [6] chr2 [167159600, 167159812] - | >> 33645 <na> >> >>> >>>>> [7] chr2 [167151109, 167151172] - | >> 33644 <na> >> >>> >>>>> [8] chr2 [167149741, 167149882] - | >> 33643 <na> >> >>> >>>>> [9] chr2 [167144947, 167145153] - | >> 33642 <na> >> >>> >>>>> ... ... ... ... ... >> ... >> >>> >>>>> ... >> >>> >>>>> [18] chr2 [167099012, 167099166] - | >> 33659 <na> >> >>> >>>>> [19] chr2 [167094604, 167094777] - | >> 33658 <na> >> >>> >>>>> [20] chr2 [167089850, 167089972] - | >> 33657 <na> >> >>> >>>>> [21] chr2 [167085201, 167085482] - | >> 33656 <na> >> >>> >>>>> [22] chr2 [167084180, 167084233] - | >> 33655 <na> >> >>> >>>>> [23] chr2 [167083077, 167083214] - | >> 33654 <na> >> >>> >>>>> [24] chr2 [167060870, 167060974] - | >> 33653 <na> >> >>> >>>>> [25] chr2 [167060465, 167060735] - | >> 33652 <na> >> >>> >>>>> [26] chr2 [167055182, 167056374] - | >> 33651 <na> >> >>> >>>>> exon_rank >> >>> >>>>> <integer> >> >>> >>>>> [1] 2 >> >>> >>>>> [2] 3 >> >>> >>>>> [3] 4 >> >>> >>>>> [4] 5 >> >>> >>>>> [5] 6 >> >>> >>>>> [6] 7 >> >>> >>>>> [7] 8 >> >>> >>>>> [8] 9 >> >>> >>>>> [9] 10 >> >>> >>>>> ... ... >> >>> >>>>> [18] 19 >> >>> >>>>> [19] 20 >> >>> >>>>> [20] 21 >> >>> >>>>> [21] 22 >> >>> >>>>> [22] 23 >> >>> >>>>> [23] 24 >> >>> >>>>> [24] 25 >> >>> >>>>> [25] 26 >> >>> >>>>> [26] 27 >> >>> >>>>> --- >> >>> >>>>> seqlengths: >> >>> >>>>> chr1 chr2 ... >> >>> >>>>> chr18_gl000207_random >> >>> >>>>> 249250621 243199373 ... >> >>> >>>>> 4262 >> >>> >>>>> >> >>> >>>>> >> >>> >>>> >> >>> >>>> _______________________________________________ >> >>> >>>> Bioconductor mailing list >> >>> >>>> Bioconductor at r-project.org >> >>> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >> >>> >>>> Search the archives: >> >>> >>>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >>> >>> >> >>> >>> _______________________________________________ >> >>> >>> Bioconductor mailing list >> >>> >>> Bioconductor at r-project.org >> >>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >> >>> >>> Search the archives: >> >>> >>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >>> > >> >>> >> >>> _______________________________________________ >> >>> Bioconductor mailing list >> >>> Bioconductor at r-project.org >> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >> >>> Search the archives: >> >>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > >> -- Alex Gutteridge

ADD REPLY • link 13.8 years ago Alex Gutteridge ▴ 650

Login before adding your answer.