Question: How to access normalized data in the NanoStringDiff package?
12 months ago by
casey.rimland110
University of Cambridge, National Institutes of Health, Chapel Hill School of Medicine
casey.rimland110 wrote:

I am trying to use NanoStringDiff for differential expression analysis of a NanoString dataset with 506 endogenous genes. Is there a way to output the normalized data that NanoStringDiff uses to run the differential expression LRT tests? I have been able to run the differential expression analyses correctly (I hope!), but now I would like to access the normalized data to use for PCA plots, heatmaps, etc. I tried assay(exprs) but it just gave me the raw counts. Thanks!

#Load data
library(NanoStringDiff)

# dir must already hold the folder that contains the CSV file
path <- file.path(dir, "nanostring_R.csv")
designs <- data.frame(group = c("WT_IL13", "WT_IL13", "WT_IL13", "WT_CTRL", "WT_CTRL", "WT_CTRL", "RA1_IL13", "RA1_IL13", "RA1_IL13", "RA1_CTRL", "RA1_CTRL", "RA1_CTRL"))

#Create a Nanostring dataset
nanostringdata <- createNanoStringSetFromCsv(path = path, header = TRUE, designs = designs)

#Run DE analysis
pheno <- pData(nanostringdata)
group <- pheno$group
design.full <- model.matrix(~0 + group)
design.full

NanoStringData_Norm <- estNormalizationFactors(nanostringdata)

#Get Results for pairwise contrasts
# Assuming the default alphabetical column order (RA1_CTRL, RA1_IL13, WT_CTRL, WT_IL13),
# this contrast is WT_IL13 minus WT_CTRL
result_WT <- glm.LRT(NanoStringData_Norm, design.full, contrast = c(0, 0, -1, 1))
Answer: How to access normalized data in the NanoStringDiff package?
12 months ago by
United States
James W. MacDonald50k wrote:

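I don't think there is a direct accessor, but this is what is done to the data prior to fitting any model (NanoStringData here has to be the object returned by estNormalizationFactors, otherwise the factors are not available yet):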

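    c = positiveFactor(NanoStringData)        # positive control factor, one per sample
    d = housekeepingFactor(NanoStringData)    # housekeeping factor, one per sample
    k = c * d                                 # combined per-sample scaling factor
    lamda_i = negativeFactor(NanoStringData)  # background level, one per sample
    Y = exprs(NanoStringData)                 # raw counts, genes x samples
    Y_n = sweep(Y, 2, lamda_i, FUN = "-")     # subtract background from each column
    Y_nph = sweep(Y_n, 2, k, FUN = "/")       # divide each column by its scaling factor
    Y_nph[Y_nph <= 0] = 0.1                   # floor non-positive values so log() is defined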

And then 

     Y_nph <- log(Y_nph)

will give you data that you can plot.
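
The steps above can be collected into a small helper for reuse. This is only a minimal sketch on plain matrices and vectors, so it runs without the package: `normalize_counts` and its argument names are hypothetical (not part of NanoStringDiff), and the toy numbers are made up. With real data you would pass in the values of exprs, positiveFactor, housekeepingFactor and negativeFactor taken from the object returned by estNormalizationFactors.

```r
# Hypothetical helper (not a NanoStringDiff function): applies the same
# background subtraction, scaling, flooring and log steps as the code above.
normalize_counts <- function(Y, pos, hk, neg, floor_value = 0.1) {
  # Each factor must supply exactly one value per sample (column)
  stopifnot(length(pos) == ncol(Y), length(hk) == ncol(Y), length(neg) == ncol(Y))
  Y_n <- sweep(Y, 2, neg, FUN = "-")
  Y_nph <- sweep(Y_n, 2, pos * hk, FUN = "/")
  Y_nph[Y_nph <= 0] <- floor_value
  log(Y_nph)
}

# Made-up toy data: 3 genes x 3 samples (the matrix is filled column by column)
Y <- matrix(c(100, 5, 250,
              220, 30, 480,
              90, 2, 300),
            nrow = 3,
            dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3")))
pos <- c(1.0, 2.0, 1.0)   # stand-in for positiveFactor(...)
hk  <- c(1.0, 1.0, 0.5)   # stand-in for housekeepingFactor(...)
neg <- c(10, 20, 10)      # stand-in for negativeFactor(...)

log_norm <- normalize_counts(Y, pos, hk, neg)

# PCA on samples: prcomp expects samples in rows, so transpose
pca <- prcomp(t(log_norm))
```

The length check at the top of the helper stops early if a factor vector has the wrong length, for example if it is empty.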


Thank you!

I just gave the code a try and I got stuck on this step with a warning message:

Y_n = sweep(Y, 2, lamda_i, FUN = "-")

Warning message:
In max(cumDim[cumDim <= lstats]) :
  no non-missing arguments to max; returning -Inf

Anything I might be doing wrong? The code runs through but there are just NA in the final log(Y_nph)

written 12 months ago by casey.rimland

That warning comes from a check in sweep that the length of lamda_i is reasonable for the dimensions of the matrix you are sweeping over. So there appears to be a problem with either your Y matrix or whatever you are getting for lamda_i. You need to take a look at those data and see what's up.
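
One way to see how this kind of warning and an all-NA result can arise is to sweep with an empty vector. This is a sketch under the assumption that the factor vector is empty, which is what you would expect if the normalization factors have not been estimated yet. The matrix and the variable names are made up for illustration.

```r
# Toy count matrix: 3 genes x 2 samples (made-up numbers)
Y <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3)

# Assumption: a factor that has not been estimated yet is an empty vector
empty_factor <- numeric(0)

# sweep() warns about the zero-length statistic and the result is filled with NA;
# the warning is suppressed here only to keep the example quiet
swept_bad <- suppressWarnings(sweep(Y, 2, empty_factor, FUN = "-"))
all_missing <- all(is.na(swept_bad))

# With one value per column the sweep behaves as intended
ok_factor <- c(2, 3)
stopifnot(length(ok_factor) == ncol(Y))
swept_ok <- sweep(Y, 2, ok_factor, FUN = "-")
```

Checking `length(lamda_i)` and `dim(Y)` before sweeping is a quick way to catch this.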

written 12 months ago by James W. MacDonald

I was trying to run it before calling estNormalizationFactors. Fixed it now and have the output. Thank you bunches!

written 12 months ago by casey.rimland

Hello,

I am in a similar situation to the one above.

To get normalized data for plotting, I tried NanoStringDataNormalization, but the normalized data do not look consistent with the logFC reported by glm.LRT.

I found this comment and computed the normalized matrix with the code above (without the final log transformation), after running estNormalizationFactors on the raw data. I then compared it with the output of NanoStringDataNormalization on the same raw data, and the two are quite different.

Which one should I use?

P.S. I really appreciate your package, though.

written 3 months ago by ysksuh

You cannot generate log fold changes you get from a generalized linear model 'by hand'. In other words, there is no formula that you can plug data into, in order to get the results the GLM will provide. The parameters for the GLM are estimated using an iterative procedure that you won't be able to replicate, and the 'normalized' data we are talking about are just gross estimates that are useful for plotting.
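
A small illustration of why log values computed for plotting and model-based log fold changes need not agree. This sketch uses a Poisson GLM from base R purely as a stand-in. NanoStringDiff fits its own, more involved model, and the counts below are made up.

```r
# Made-up counts for one gene in two groups of three samples
counts <- c(10, 12, 50, 100, 20, 400)
group <- factor(c("ctrl", "ctrl", "ctrl", "trt", "trt", "trt"))

# Log fold change estimated by a GLM (stand-in model, fitted iteratively)
fit <- glm(counts ~ group, family = poisson())
glm_logfc <- unname(coef(fit)["grouptrt"])

# "By hand" log fold change: difference of mean log-transformed values
by_hand_logfc <- mean(log(counts[group == "trt"])) -
  mean(log(counts[group == "ctrl"]))

# The two numbers differ noticeably for these data
c(glm = glm_logfc, by_hand = by_hand_logfc)
```

The log-transformed values are still fine for PCA plots and heatmaps. They just should not be expected to reproduce the fitted coefficients.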

written 3 months ago by James W. MacDonald