Question

Clariom-D array - subsetting probes to known genes prior to normalization

Lauro Sumoy ▴ 10

@lauro-sumoy-3287

Spain

Dear forum members,

I am analyzing Clariom-D array data using the oligo package in R.

RMA normalization using all probesets works reasonably well: the boxplots of normalized intensities are centered and evenly distributed. However, when I filter the results to known genes (those with symbols), the normalized intensity distributions look strongly biased: the boxplots are no longer centered, and their spreads are no longer even. I think this could be explained by differences in RNA biotype composition, either real or an artefact of partial RNA degradation or of differences in the RNA fragmentation steps that lead to uneven representation of signal from different RNA biotypes. I would like to try subsetting to the probesets that will be assessed in the end (genes with a known symbol) prior to normalization, to get more comparable data for the statistical analysis of differential gene expression, that is, more homogeneous normalized intensity distributions on boxplots. Is this possible? If so, could anyone give some code examples?

Thank you in advance.

Lauro

oligo • 640 views

updated 2.7 years ago by James W. MacDonald 65k • written 2.7 years ago by Lauro Sumoy ▴ 10

score 0 · Answer 1 · 2021-08-25

The normalization takes place at the probe level and relies on the assumption that most probes are measuring things that aren't changing expression between samples. Making that same assumption about a subset of the probes is a stronger assumption (put another way, one that is less likely to hold), so all else being equal I would caution you against doing that.
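To make the point about the assumption concrete, here is a toy illustration in plain Python. It is not oligo code and does not reflect how the package is implemented internally. It assumes a simple quantile normalization (the kind of probe-level normalization RMA applies) and two simulated arrays; the probe counts, means, and names (`array_a`, `array_b`, `N_GENE`, `N_OTHER`, `simulate`) are all made up for the example.

```python
import random
import statistics


def quantile_normalize(arrays):
    """Toy quantile normalization: give every array the same empirical distribution."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays of the k-th smallest intensity.
    reference = [statistics.fmean(s[k] for s in sorted_arrays) for k in range(n)]
    result = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity in this array.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        result.append(out)
    return result


rng = random.Random(0)
N_GENE, N_OTHER = 2000, 2000  # hypothetical probe counts, not Clariom-D numbers


def simulate(gene_mean, other_mean):
    """Simulated log-intensities: gene probes first, then all other probes."""
    genes = [rng.gauss(gene_mean, 1.0) for _ in range(N_GENE)]
    others = [rng.gauss(other_mean, 1.0) for _ in range(N_OTHER)]
    return genes + others


# Assumed scenario: in array B the gene probes sit much closer to the other probes.
array_a = simulate(gene_mean=8.0, other_mean=5.0)
array_b = simulate(gene_mean=6.0, other_mean=5.5)

# 1) Normalize on all probes, then look only at the gene probes.
norm_all = quantile_normalize([array_a, array_b])
gene_medians_all = [statistics.median(col[:N_GENE]) for col in norm_all]

# 2) Subset to the gene probes first, then normalize.
norm_sub = quantile_normalize([array_a[:N_GENE], array_b[:N_GENE]])
gene_medians_sub = [statistics.median(col) for col in norm_sub]

print("gene medians, normalize then subset:", gene_medians_all)
print("gene medians, subset then normalize:", gene_medians_sub)
```

In this sketch, normalizing on all probes equalizes the overall distributions but leaves the gene-probe medians apart, which is the kind of boxplot shift described in the question. Normalizing the subset alone makes the gene-probe distributions identical by construction, whether or not the underlying difference is real, which is why the assumption matters more for a subset.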

That said, I don't think it would be super difficult to do what you propose. But what you are asking for is custom code that changes the existing functionality of the package to do this one special thing, and that is not what this support site is for. The site is intended to help people do the things that the code is already meant to do.

Put another way, you are on your own in this quest. You will have to learn the inner workings of the pd.clariom.d.human package, which encapsulates a SQLite database, as well as the oligo package, which interacts with that package to do the normalization and summarization. For someone who doesn't already know those two packages, it could be quite a bit of work just to learn how it all fits together well enough to make the changes you desire, so you will have to decide whether the work required is likely to be worth it in the end.