A couple VSN related questions
0
0
Entering edit mode
@wolfgang-huber-3550
Last seen 17 days ago
EMBL European Molecular Biology Laborat…
Dear Leo > Thank you very much for your thorough and clear answer! I have an > additional question: is the "sigsq" slot of a vsn object the estimated > variance (which the name and actual value seem to indicate) or the standard > deviation (as documented on the man page)? Also is it before or after the > log(2) transformation (I guess it is before, but want to make sure)? > > Thanks again for taking your time to answer my questions! thank you, you are right. It is the variance squared; the man page is fixed in vsn >= 3.4.6. sigma^2 is defined in the vignette "Likelihood calculations for vsn". It is computed before that affine scaling and shifting that we talked about in the previous mail. Best wishes Wolfgang ------------------------------------------------------------------ Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber > -----Original Message----- > From: Wolfgang Huber [mailto:huber at ebi.ac.uk] > Sent: Wednesday, January 30, 2008 7:55 PM > To: Leo J. LEE > Cc: bioconductor at stat.math.ethz.ch > Subject: Re: [BioC] A couple VSN related questions > > Dear Leo, > > answers please see below. > >> I am new to Bioconductor and R (having the latest version, Bioconductor > 2.1 >> and R 2.6.1, installed) so please bear with me if I ask something >> naive/stupid. Anyway, the most important reason why I come to > Bioconductor >> is to use the vsn package, and after trying it a bit, I have the following >> questions: >> >> 1. It is clear from the documentation (and a previous email from Wolfgang) >> that the real computation of vsn2 is written in C. Is it possible for me > to >> access the C source code (and compile and run it myself)? If yes, this > will >> be truly great since I have to do other computational analysis outside of > R >> anyway. > > Yes, it is all open source. The most obvious way is to download the > .tar.gz file from > http://www.bioconductor.org/packages/devel/bioc/html/vsn.html > and unpack it. See the file vsn2.c. There is also public, anonymous svn > access, which is described on the bioc website. > >> 2. I am trying to follow the computation performed by VSN as documented in >> the "Introduction to VSN" vignette. Loading the kidney dataset, I have > the >> following data before and after VSN: >>> exprs(kidney)[1:10,] >> channels >> spots green red >> 1 815.32 937.02 >> 2 671.66 765.03 >> 3 713.93 713.59 >> 4 703.97 656.70 >> 5 493.59 472.10 >> 6 477.92 453.13 >> 7 346.55 385.47 >> 8 377.34 421.86 >> 9 132.80 121.77 >> 10 148.65 130.41 >>> fit = vsn2(kidney) >> vsn: 8704 x 2 matrix (1 stratum). 100% done. >>> fit at hx[1:10,] >> channels >> spots green red >> 1 9.711781 9.870624 >> 2 9.453765 9.597638 >> 3 9.533994 9.506128 >> 4 9.515436 9.398666 >> 5 9.067212 8.995503 >> 6 9.028838 8.948561 >> 7 8.673814 8.771463 >> 8 8.762664 8.868634 >> 9 7.957576 7.926483 >> 10 8.016070 7.957696 >> >> Now I am trying to reproduce the result from parameters estimated by VSN >> using >>> coef(fit)[1,,] >> [,1] [,2] >> [1,] -0.08444849 -5.927967 >> [2,] -0.06787093 -5.957545 >>> offset <- coef(fit)[1,,1]; >>> scale <- exp(coef(fit)[1,,2]); >>> ynorm_my = asinh(exprs(kidney)*scale + offset); >>> ynorm_my[1:10,] >> channels >> spots green red >> 1 1.4820851 1.6139176 >> 2 1.2851048 1.4029668 >> 3 1.3588520 1.3584152 >> 4 1.3272734 1.2650499 >> 5 1.0353039 0.9986861 >> 6 0.9954236 0.9530607 >> 7 0.7626212 0.8400535 >> 8 0.8148209 0.8976601 >> 9 0.2661626 0.2376892 >> 10 0.3115130 0.2662458 >> >> which is clearly different from the result produced by VSN. Could anybody >> tell me what I have done wrong? Thank you so much! > > Try this: > > library("vsn") > data("kidney") > fit = vsn2(kidney) > shift = coef(fit)[1,,1] > scale = exp(coef(fit)[1,,2]) > > nk = t(asinh(t(exprs(kidney))*scale + shift)) > nk = nk/log(2) - fit at hoffset > > identical(nk , exprs(fit)) > [1] TRUE > > > This is a subtle point. It is mentioned (admittedly too briefly) in the > vsn2 man page, section "Note on overall scale and location of the glog > transformation": > > The data are returned on a glog scale to base 2. More precisely, > the transformed data are subject to the transformation > glog2(f(b)*x+a) + c, where glog2(u) = log2(u+sqrt(u*u+1)) = > asinh(u)/log(2) is called the generalised logarithm, a and b are > the fitted model parameters (see references), f is a parameter > transformation [4], and the overall constant offset c is computed > from b such that for large x the transformation approximately > corresponds to the log2 function. The offset c is inconsequential > for all differential expression calculations, but many users like > to see the data in a range that they are familiar with. > > > So the difference to what you did is (i) to go from glog to glog_2 scale > (the division by log(2)), and (ii) to subtract fit at hoffset. The > computation of fit at hoffset is done towards the end of the "vsnMatrix" > function. > > Best wishes > Wolfgang > >
GO Kidney vsn GO Kidney vsn • 642 views
ADD COMMENT

Login before adding your answer.

Traffic: 549 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6