Question

A couple VSN related questions

0

Entering edit mode

Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 17 days ago

EMBL European Molecular Biology Laborat…

Dear Leo > Thank you very much for your thorough and clear answer! I have an > additional question: is the "sigsq" slot of a vsn object the estimated > variance (which the name and actual value seem to indicate) or the standard > deviation (as documented on the man page)? Also is it before or after the > log(2) transformation (I guess it is before, but want to make sure)? > > Thanks again for taking your time to answer my questions! thank you, you are right. It is the variance squared; the man page is fixed in vsn >= 3.4.6. sigma^2 is defined in the vignette "Likelihood calculations for vsn". It is computed before that affine scaling and shifting that we talked about in the previous mail. Best wishes Wolfgang ------------------------------------------------------------------ Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber > -----Original Message----- > From: Wolfgang Huber [mailto:huber at ebi.ac.uk] > Sent: Wednesday, January 30, 2008 7:55 PM > To: Leo J. LEE > Cc: bioconductor at stat.math.ethz.ch > Subject: Re: [BioC] A couple VSN related questions > > Dear Leo, > > answers please see below. > >> I am new to Bioconductor and R (having the latest version, Bioconductor > 2.1 >> and R 2.6.1, installed) so please bear with me if I ask something >> naive/stupid. Anyway, the most important reason why I come to > Bioconductor >> is to use the vsn package, and after trying it a bit, I have the following >> questions: >> >> 1. It is clear from the documentation (and a previous email from Wolfgang) >> that the real computation of vsn2 is written in C. Is it possible for me > to >> access the C source code (and compile and run it myself)? If yes, this > will >> be truly great since I have to do other computational analysis outside of > R >> anyway. > > Yes, it is all open source. The most obvious way is to download the > .tar.gz file from > http://www.bioconductor.org/packages/devel/bioc/html/vsn.html > and unpack it. See the file vsn2.c. There is also public, anonymous svn > access, which is described on the bioc website. > >> 2. I am trying to follow the computation performed by VSN as documented in >> the "Introduction to VSN" vignette. Loading the kidney dataset, I have > the >> following data before and after VSN: >>> exprs(kidney)[1:10,] >> channels >> spots green red >> 1 815.32 937.02 >> 2 671.66 765.03 >> 3 713.93 713.59 >> 4 703.97 656.70 >> 5 493.59 472.10 >> 6 477.92 453.13 >> 7 346.55 385.47 >> 8 377.34 421.86 >> 9 132.80 121.77 >> 10 148.65 130.41 >>> fit = vsn2(kidney) >> vsn: 8704 x 2 matrix (1 stratum). 100% done. >>> fit at hx[1:10,] >> channels >> spots green red >> 1 9.711781 9.870624 >> 2 9.453765 9.597638 >> 3 9.533994 9.506128 >> 4 9.515436 9.398666 >> 5 9.067212 8.995503 >> 6 9.028838 8.948561 >> 7 8.673814 8.771463 >> 8 8.762664 8.868634 >> 9 7.957576 7.926483 >> 10 8.016070 7.957696 >> >> Now I am trying to reproduce the result from parameters estimated by VSN >> using >>> coef(fit)[1,,] >> [,1] [,2] >> [1,] -0.08444849 -5.927967 >> [2,] -0.06787093 -5.957545 >>> offset <- coef(fit)[1,,1]; >>> scale <- exp(coef(fit)[1,,2]); >>> ynorm_my = asinh(exprs(kidney)*scale + offset); >>> ynorm_my[1:10,] >> channels >> spots green red >> 1 1.4820851 1.6139176 >> 2 1.2851048 1.4029668 >> 3 1.3588520 1.3584152 >> 4 1.3272734 1.2650499 >> 5 1.0353039 0.9986861 >> 6 0.9954236 0.9530607 >> 7 0.7626212 0.8400535 >> 8 0.8148209 0.8976601 >> 9 0.2661626 0.2376892 >> 10 0.3115130 0.2662458 >> >> which is clearly different from the result produced by VSN. Could anybody >> tell me what I have done wrong? Thank you so much! > > Try this: > > library("vsn") > data("kidney") > fit = vsn2(kidney) > shift = coef(fit)[1,,1] > scale = exp(coef(fit)[1,,2]) > > nk = t(asinh(t(exprs(kidney))*scale + shift)) > nk = nk/log(2) - fit at hoffset > > identical(nk , exprs(fit)) > [1] TRUE > > > This is a subtle point. It is mentioned (admittedly too briefly) in the > vsn2 man page, section "Note on overall scale and location of the glog > transformation": > > The data are returned on a glog scale to base 2. More precisely, > the transformed data are subject to the transformation > glog2(f(b)*x+a) + c, where glog2(u) = log2(u+sqrt(u*u+1)) = > asinh(u)/log(2) is called the generalised logarithm, a and b are > the fitted model parameters (see references), f is a parameter > transformation [4], and the overall constant offset c is computed > from b such that for large x the transformation approximately > corresponds to the log2 function. The offset c is inconsequential > for all differential expression calculations, but many users like > to see the data in a range that they are familiar with. > > > So the difference to what you did is (i) to go from glog to glog_2 scale > (the division by log(2)), and (ii) to subtract fit at hoffset. The > computation of fit at hoffset is done towards the end of the "vsnMatrix" > function. > > Best wishes > Wolfgang > >

GO Kidney vsn GO Kidney vsn • 642 views

ADD COMMENT • link 16.2 years ago Wolfgang Huber ★ 13k