Question

How to create hthgu133pluspm.db

pbachali ▴ 50

@pbachali-9651

Last seen 4.3 years ago

Hi,

I am working with the HT_HG-U133_PLUS_PM chip definition file. I found hthgu133pluspmcdf and hthgu133pluspmprobe, but I could not find hthgu133pluspm.db. From a literature search, it looks like the package can be built, but I am not sure how to build the annotation database. I have looked at the AnnotationForge vignettes but could not get it to work.

Could somebody help me get the annotations for this chip?

Thanks,

Prat

hthgu133pluspm annotation • 1.6k views

ADD COMMENT • link 7.3 years ago pbachali ▴ 50

score 0 · Answer 1 · 2017-01-12

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 6 hours ago

United States

There are two ways to approach this problem. First, you can go the easy way, which is to note that the probeset IDs on the HT_HG-U133_PLUS_PM arrays are identical to those on the conventional U133_Plus2 array, except for the addition of an extra _PM in the probeset name. Because obviously Affy would want to do something like that. There are some small differences, in that there are fewer probes for some probesets (and obviously no MM probes), but other than 40 extra QC probes, the arrays are very similar and are intended to measure the same thing.

So you could just process the data as normal, then remove all the '_PM's from the probeset IDs and use the hgu133plus2.db package to annotate. That's probably what I would do.
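As a rough sketch of this easy route (not code from the thread): the helper names `strip_pm` and `annotate_pm_ids` are hypothetical, the example IDs are only illustrative, and the lookup assumes the hgu133plus2.db and AnnotationDbi packages are installed. If they are not, the function simply returns the cleaned IDs.

```r
# Strip the first literal "_PM" from each probeset ID (fixed = TRUE: no regex).
strip_pm <- function(ids) sub("_PM", "", ids, fixed = TRUE)

# Look up gene-level annotation for PM-array probeset IDs via hgu133plus2.db.
# Hypothetical helper; falls back to the cleaned IDs if the packages are missing.
annotate_pm_ids <- function(ids) {
  probe_ids <- strip_pm(ids)
  have_pkgs <- requireNamespace("hgu133plus2.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)
  if (!have_pkgs) {
    return(data.frame(PROBEID = probe_ids, stringsAsFactors = FALSE))
  }
  AnnotationDbi::select(hgu133plus2.db::hgu133plus2.db,
                        keys = probe_ids,
                        columns = c("SYMBOL", "ENTREZID", "GENENAME"),
                        keytype = "PROBEID")
}

# Illustrative IDs only; check them against your own featureNames.
example_ids <- c("1007_PM_s_at", "1053_PM_at", "AFFX-BioB-3_at")
cleaned <- strip_pm(example_ids)
print(cleaned)
```

IDs without `_PM` (such as control probesets) pass through unchanged, so the same call can be applied to the full set of feature names.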

An alternative would be to get the annotation csv file and then build your own annotation package. We can certainly help you with that task, but would need more information than 'I could not do it', which could mean any number of things.
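For orientation, a build along these lines might look roughly like the sketch below. It is an assumption-laden outline, not a tested recipe: `build_pm_annotation` and its arguments are hypothetical names, the csv path is whatever Affymetrix annotation file you downloaded, the `baseMapType` and version values are guesses that may need adjusting, and `makeDBPackage()` additionally needs the human.db0 package installed.

```r
# Hypothetical wrapper around AnnotationForge::makeDBPackage().
# annot_csv: path to the Affymetrix annotation csv for the array (you supply this).
# out_dir:   directory where the package source tree is written.
build_pm_annotation <- function(annot_csv, out_dir = tempdir()) {
  if (!requireNamespace("AnnotationForge", quietly = TRUE)) {
    stop("AnnotationForge is not installed; try BiocManager::install('AnnotationForge')")
  }
  AnnotationForge::makeDBPackage(
    "HUMANCHIP_DB",                     # schema for a human chip package
    affy = TRUE,                        # input is an Affymetrix annotation csv
    prefix = "hthgu133pluspm",          # package will be named <prefix>.db
    fileName = annot_csv,
    baseMapType = "gbNRef",             # assumption: GenBank/RefSeq base IDs
    outputDir = out_dir,
    version = "0.0.1",                  # arbitrary placeholder version
    manufacturer = "Affymetrix",
    chipName = "HT_HG-U133_Plus_PM",
    manufacturerUrl = "http://www.affymetrix.com"
  )
}
```

If the build succeeds, the resulting source directory can typically be installed with `install.packages(<path>, repos = NULL, type = "source")`.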

In other words, this is an old array that you can't really get any more, so it's not in anybody's interest to make the annotation package, especially since you could make do with an existing one. So you will have to do the legwork to make it. If you show what code you used to try to make the package on your own, then we can give pointers as to where you went wrong. But like I said, it's way easier to just use the existing annotation package.

ADD COMMENT • link 7.3 years ago James W. MacDonald 65k

Thanks very much, James, for the help. It is extremely helpful. One final question: if I remove the probeset IDs with "_PM", would I lose any important probeset IDs? I am just curious about this.

I really appreciate all your help. I am extremely happy with all your suggestions/advice.

Best,

Prat

ADD REPLY • link 7.3 years ago pbachali ▴ 50

You misunderstand. I didn't say to remove probesets with _PM in their name; instead, I said you should remove all the _PMs from the probeset names. As an example, say you processed your data and now have it in an object called 'eset'. You could then do

featureNames(eset) <- sub("_PM", "", featureNames(eset), fixed = TRUE)
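To make the distinction concrete, here is a small base-R sketch on a toy matrix (the IDs and values are illustrative, not real data). It contrasts renaming with dropping rows, and also shows that an end-anchored pattern leaves IDs unchanged when `_PM` sits in the middle of the name, so it is worth checking where `_PM` appears in your own feature names.

```r
# Toy expression matrix; row names are illustrative probeset IDs.
ids <- c("1007_PM_s_at", "1053_PM_at", "117_PM_at", "AFFX-BioB-3_at")
expr <- matrix(seq_len(8), nrow = 4,
               dimnames = list(ids, c("sample1", "sample2")))

# Renaming: every row is kept, only the labels change.
renamed <- expr
rownames(renamed) <- sub("_PM", "", rownames(expr), fixed = TRUE)

# The misreading: dropping rows whose name contains "_PM" throws the data away.
dropped <- expr[!grepl("_PM", rownames(expr), fixed = TRUE), , drop = FALSE]

# An end-anchored regex only matches a trailing "_PM", so these IDs are untouched.
anchored <- sub("_PM$", "", rownames(expr))

print(rownames(renamed))
print(nrow(dropped))
```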

ADD REPLY • link 7.3 years ago James W. MacDonald 65k

Oh got you. Thanks much again for the help.

ADD REPLY • link 7.3 years ago pbachali ▴ 50

score 0 · Answer 2 · 2017-01-26

pbachali ▴ 50

@pbachali-9651

Last seen 4.3 years ago

When I tried to process the CEL files for this chip using the "gcrma" normalization method, it threw the following error:

Error in matrix(NA, nrow = max(cbind(pmIndex, mmIndex)), ncol = 1) :
invalid 'nrow' value (too large or NA)

But if I use "rma", I can normalize the data successfully. I was also able to change the featureNames and map all the annotations.

Thanks,

Prat

ADD COMMENT • link 7.3 years ago pbachali ▴ 50

If you want to make a comment, please use the ADD COMMENT button. The 'Add your answer' box below is for adding answers to questions, not posting comments.

You cannot use gcrma directly on a PM-only chip because gcrma relies on the MM probes to estimate the background contribution due to the GC content of the probes. You can specify some other background probes to use as a stand-in for the MM probes, but in general that's a lot of work for (probably) a minimal gain.
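The error quoted earlier in the thread fits this explanation. The base-R sketch below reproduces the same kind of failure under one assumption that is not confirmed in the thread: that a PM-only chip reports its MM indices as NA. Here `pmIndex` and `mmIndex` are made-up stand-ins for the variables named in the error message.

```r
# Made-up probe indices: PM positions exist, MM positions are all NA
# (assumed behaviour for a PM-only chip).
pmIndex <- c(10L, 20L, 30L)
mmIndex <- c(NA_integer_, NA_integer_, NA_integer_)

# max() propagates NA, so the requested number of rows is NA.
n_rows <- max(cbind(pmIndex, mmIndex))

# Same call shape as in the error message; wrapped in tryCatch so the
# script keeps running and the message can be inspected.
result <- tryCatch(
  matrix(NA, nrow = n_rows, ncol = 1),
  error = function(e) conditionMessage(e)
)
print(result)
```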

ADD REPLY • link 7.3 years ago James W. MacDonald 65k

Yes, I see that gcrma normalization requires MM probes, and this chip does not have any. I did rma normalization instead, but I am unsure which probe filtering method to use with it. How should I filter low-intensity probes here? That is, what are the criteria for deciding which probes are low-intensity? Is genefilter a reasonable choice for probe filtering? Are there other approaches to filtering low-intensity probes?

Thanks,

Prat

ADD REPLY • link 7.3 years ago pbachali ▴ 50

We are now well off the original goal of this thread. If you want to ask a new question, start a new thread. But before you do so, please note that this support forum is intended primarily to help people with questions about how to use Bioconductor software, rather than general questions about how best to do an analysis. The latter is a much larger and more complex question that is not so easily answered on an internet forum. In addition, a simple Google search of the form

filter genes site:support.bioconductor.org

or something similar will bring up lots of existing answers to your question. This can be said of most questions, and it's preferable that you try that first rather than asking a question that has been answered many times already.

ADD REPLY • link 7.3 years ago James W. MacDonald 65k