Question

Ringo/Starr getProfiles function

0

Entering edit mode

Noah Dowell ▴ 410

@noah-dowell-3791

Last seen 9.6 years ago

Dear Wolfgang, Thank you for your response. Sorry about the "sessionInfo()" mistake... I think I included it earlier but I will include each time (I included it below). Benedikt Zacher caught my NA error earlier since I had my probeAnnot object pointing to one name of chromosomes (Affymetrix nomenclature) and the tssAnno pointing to another (ensembl nomenclature). I am still struggling somewhat with other issues if you should chose to read on. Thank you much for your time and assistance. Sincerely, Noah Dear Benedikt, Maybe it will come to me sending the R object to you, but I think I am close to understanding the workflow a little better. I think I just missed some of the essential early steps necessary to using Starr (sorry I am new to R and trying to get up to speed quickly). I think the following workflow would be helpful to include in the vignette (or maybe I am getting this wrong). Here is what I am going to try (I excluded some of the commands to load or read files): ### fire up R and create a raw Annotation file using Biomart library(biomaRt) ensembl = useMart("fungal_mart_3") ensembl = useDataset("scerevisiae_eg_gene", mart = ensembl) chrom <- c("I") transcriptAnno <- getBM(attributes=c("ensembl_gene_id", "chromosome_name", + "strand", "transcript_start", "transcript_end"), + filters = "chromosome_name", values = chrom, mart = ensembl) transAnnoChr = transcriptAnno transAnnoChr[,2][transAnnoChr[,2]=="I"] <- "Sc:Oct_2003;chr1" ###change the column names to match your file in the Starr vignette and then put them in the right order: names(transAnnoChr) = c("name", "chr", "strand", "start", "end") chrOnly = transAnnoChr[,2] startOnly = transAnnoChr[,4] endOnly = transAnnoChr[,5] strandOnly= transAnnoChr[,3] nameOnly = transAnnoChr[,1] tssAnno = cbind(chrOnly,startOnly, endOnly, strandOnly, nameOnly) ### make a minimal Expression set so that the writeGFF function works library(Biobase) exprs = as.matrix(tssAnno, header = TRUE, sep = "\t", row.names = 1, as.is = TRUE) minimalSet = new("ExpressionSet", exprs = exprs) ### Now we can use the Starr package to write the gffAnno file library(Starr) bpmap = readBpmap("Sc03b_MR_v04.bpmap") probeAnno = bpmapToProbeAnno(bpmap) writeGFF(minimalSet, probeAnno, "tssAnno.gff") transcriptAnno = read.gffAnno("tssAnno.gff", feature = ???) ### I am not sure what to use as a feature for the read.gffAnno function. ### I think this should get me ready to use the getProfiles function, your thoughts?? Thank you so much for your help and for creating this package. I have not run through these commands but wanted to see if the workflow of 1. Annotation file creation (biomaRt) 2. ExpressionSet Creation (BioBase) 3. gffAnno Creation (Starr) Is this the way to go? Best, Noah >> sessionInfo() > R version 2.10.0 (2009-10-26) > i386-apple-darwin9.8.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] grid stats graphics grDevices utils datasets methods base > > other attached packages: > [1] Starr_1.2.0 affxparser_1.18.0 affy_1.24.0 Ringo_1.10.0 Matrix_0.999375-31 lattice_0.17-26 > [7] limma_3.2.1 RColorBrewer_1.0-2 Biobase_2.6.0 > > loaded via a namespace (and not attached): > [1] affyio_1.14.0 annotate_1.24.0 AnnotationDbi_1.8.0 DBI_0.2-4 genefilter_1.28.0 > [6] MASS_7.3-3 preprocessCore_1.8.0 pspline_1.0-13 RSQLite_0.7-3 splines_2.10.0 > [11] survival_2.35-7 xtable_1.5-5

Annotation Starr Annotation Starr • 889 views

ADD COMMENT • link updated 14.4 years ago by zacher@lmb.uni-muenchen.de ▴ 50 • written 14.4 years ago by Noah Dowell ▴ 410

score 0 · Answer 1 · 2009-12-14

Dear Noah, the writeGFF function does not create a gff annotation, but writes all coloumns of the assayData (intensities in the ExpressionSet) to a gff file. If you do this, you can view the data in one of the established genome Browsers (which take this format). I modified your script (my changes are commented), such that the transciprAnno object is compatible with the probeAnno and the ExpressionSet from the vignette. It should work now with your data. The first part below shows the code from the vignette for normalization: library(Starr) dataPath <- system.file("extdata", package="Starr") bpmapChr1 <- readBpmap(file.path(dataPath, "Scerevisiae_tlg_chr1.bpmap")) cels <- c(file.path(dataPath,"Rpb3_IP_chr1.cel"), file.path(dataPath,"wt_IP_chr1.cel"), file.path(dataPath,"Rpb3_IP2_chr1.cel")) names <- c("rpb3_1", "wt_1","rpb3_2") type <- c("IP", "CONTROL", "IP") rpb3Chr1 <- readCelFile(bpmapChr1, cels, names, type, featureData=T, log.it=T) rpb3_loess <- normalize.Probes(rpb3Chr1, method="loess") ips <- rpb3Chr1$type == "IP" controls <- rpb3Chr1$type == "CONTROL" description <- c("Rpb3vsWT") rpb3_loess_ratio <- getRatio(rpb3_loess, ips, controls, description, fkt=median, featureData=F) probeAnnoChr1 <- bpmapToProbeAnno(bpmapChr1) ############################## ### Here is your biomaRt code: ############################## library(biomaRt) ensembl = useMart("fungal_mart_3") ensembl = useDataset("scerevisiae_eg_gene", mart = ensembl) chrom <- c("I") transcriptAnno <- getBM(attributes=c("ensembl_gene_id", "chromosome_name", "strand", "transcript_start", "transcript_end"), filters = "chromosome_name", values = chrom, mart = ensembl) transcriptAnno[,2][transcriptAnno[,2]=="I"] <- "Sc:Oct_2003;chr1" names(transcriptAnno) = c("name", "chr", "strand", "start", "end") ########################### ### Rename $chr to "chr1", as in probeAnnoChr1 (you can se names with ls(probeAnnoChr1)) ### See what the names are in your probeAnno object. ########################### transcriptAnno$chr = sapply(strsplit(transcriptAnno$chr, ";"), function(x) {x[2]}) ########################## ### For plotting along the TSS, the $start and $end are both set to the TSS position in the gff annotation, ########################## watson <- which(transcriptAnno$strand == 1) crick <- which(transcriptAnno$strand == -1) transcriptAnno[watson, ]$end = transcriptAnno[watson, ]$start transcriptAnno[crick, ]$start = transcriptAnno[crick, ]$end ############################ ### Getting and plotting the profiles ############################ profile <- getProfiles(rpb3_loess_ratio, probeAnnoChr1, transcriptAnno, 500, 500, feature="TSS", borderNames="TSS", method="basewise") clust <- rep(1, dim(transcriptAnno)[1]) names(clust) <- transcriptAnno$name x11() plotProfiles(profile, cluster=clust, type="l") ## End I hope you don't get errors now. I think I will include the biomaRt example in the vignette to make the usage of these functions more clearly. Best regards Benedikt Noah Dowell <noahd at="" ucla.edu=""> schrieb : > Dear Wolfgang, > > Thank you for your response. Sorry about the "sessionInfo()" > mistake... I think I included it earlier but I will include each time (I > included it below). Benedikt Zacher caught my NA error earlier since I had my > probeAnnot object pointing to one name of chromosomes (Affymetrix nomenclature) > and the tssAnno pointing to another (ensembl nomenclature). I am still > struggling somewhat with other issues if you should chose to read on. Thank you > much for your time and assistance. > > Sincerely, > > Noah > > > Dear Benedikt, > > Maybe it will come to me sending the R object to you, but I think I am close to > understanding the workflow a little better. I think I just missed some of the > essential early steps necessary to using Starr (sorry I am new to R and trying > to get up to speed quickly). I think the following workflow would be helpful to > include in the vignette (or maybe I am getting this wrong). Here is what I am > going to try (I excluded some of the commands to load or read files): > > ### fire up R and create a raw Annotation file using Biomart > > library(biomaRt) > > ensembl = useMart("fungal_mart_3") > > ensembl = useDataset("scerevisiae_eg_gene", mart = ensembl) > > chrom <- c("I") > > transcriptAnno <- getBM(attributes=c("ensembl_gene_id", > "chromosome_name", > + "strand", "transcript_start", > "transcript_end"), > + filters = "chromosome_name", values = chrom, mart = > ensembl) > > transAnnoChr = transcriptAnno > > transAnnoChr[,2][transAnnoChr[,2]=="I"] <- > "Sc:Oct_2003;chr1" > > ###change the column names to match your file in the Starr vignette and then > put them in the right order: > > names(transAnnoChr) = c("name", "chr", "strand", > "start", "end") > > chrOnly = transAnnoChr[,2] > startOnly = transAnnoChr[,4] > endOnly = transAnnoChr[,5] > strandOnly= transAnnoChr[,3] > nameOnly = transAnnoChr[,1] > > tssAnno = cbind(chrOnly,startOnly, endOnly, strandOnly, nameOnly) > > ### make a minimal Expression set so that the writeGFF function works > > library(Biobase) > > exprs = as.matrix(tssAnno, header = TRUE, sep = "\t", row.names = 1, > as.is = TRUE) > > minimalSet = new("ExpressionSet", exprs = exprs) > > ### Now we can use the Starr package to write the gffAnno file > > library(Starr) > > bpmap = readBpmap("Sc03b_MR_v04.bpmap") > > probeAnno = bpmapToProbeAnno(bpmap) > > writeGFF(minimalSet, probeAnno, "tssAnno.gff") > > > transcriptAnno = read.gffAnno("tssAnno.gff", feature = ???) > > ### I am not sure what to use as a feature for the read.gffAnno function. > ### I think this should get me ready to use the getProfiles function, your > thoughts?? > > > Thank you so much for your help and for creating this package. I have not run > through these commands but wanted to see if the workflow of > 1. Annotation file creation (biomaRt) > 2. ExpressionSet Creation (BioBase) > 3. gffAnno Creation (Starr) > > Is this the way to go? > > Best, > > Noah > > > >> sessionInfo() > > R version 2.10.0 (2009-10-26) > > i386-apple-darwin9.8.0 > > > > locale: > > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > > > attached base packages: > > [1] grid stats graphics grDevices utils datasets methods > base > > > > other attached packages: > > [1] Starr_1.2.0 affxparser_1.18.0 affy_1.24.0 Ringo_1.10.0 > Matrix_0.999375-31 lattice_0.17-26 > > [7] limma_3.2.1 RColorBrewer_1.0-2 Biobase_2.6.0 > > > > loaded via a namespace (and not attached): > > [1] affyio_1.14.0 annotate_1.24.0 AnnotationDbi_1.8.0 > DBI_0.2-4 genefilter_1.28.0 > > [6] MASS_7.3-3 preprocessCore_1.8.0 pspline_1.0-13 > RSQLite_0.7-3 splines_2.10.0 > > [11] survival_2.35-7 xtable_1.5-5