3.1 years ago by
The pd.mogene.2.0.st package is used by oligo when you process your arrays. It tells oligo where each probe is located on the array, as well as which probes to combine into a probeset when summarizing.
The mogene20sttranscriptcluster.db package maps the 'core' probesets (the default summarization level for oligo) to the genes interrogated by each probeset, along with other information about each gene. Note that you can also summarize this array at the 'probeset' level, i.e. the 'PSR' or probe set region, which roughly corresponds to exons. If you do that, then you want to annotate using the mogene20stprobeset.db package.
The annotation packages differ in a couple of ways from the csv files you can get from Affy, but do note that they are based on those files. They differ in ease of use (parsing the Affy csv files is a non-trivial exercise) and in the mappings themselves. To generate the annotation packages we get the RefSeq and GenBank IDs for each Affy probeset ID, map those to Entrez Gene, and then map from Entrez Gene to all the other annotation databases. So if NCBI has different information for a given Entrez Gene ID, the annotation package may differ from the Affy csv.
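To make that mapping chain concrete, here is a minimal toy sketch in Python. It is not how the annotation packages are actually built; every ID, symbol, and table below is made up for illustration, and the function names are hypothetical. It only shows the idea: probeset ID to accessions, accessions to Entrez Gene, Entrez Gene to other annotation, and how the result can then disagree with a symbol recorded in a vendor csv.

```python
# Toy illustration only: all IDs and symbols below are invented,
# not real Affymetrix, RefSeq, GenBank, or Entrez Gene data.

# Step 1: accessions listed for each probeset (as a vendor csv might list them).
probeset_to_accessions = {
    "PS_0001": ["NM_TOY_1", "GB_TOY_1"],
    "PS_0002": ["NM_TOY_2"],
    "PS_0003": ["GB_TOY_9"],  # accession with no current gene record
}

# Step 2: accession -> Entrez Gene ID, as the gene database currently has it.
accession_to_entrez = {
    "NM_TOY_1": "1001",
    "GB_TOY_1": "1001",
    "NM_TOY_2": "1002",
}

# Step 3: Entrez Gene ID -> other annotation (here just a gene symbol).
entrez_to_symbol = {"1001": "GeneA", "1002": "GeneB"}

# Symbols as recorded in a (hypothetical) older vendor csv.
vendor_csv_symbol = {"PS_0001": "GeneA", "PS_0002": "GeneB_old", "PS_0003": "GeneC"}


def annotate(probeset_id):
    """Return a sorted list of (entrez_id, symbol) pairs for one probeset."""
    accessions = probeset_to_accessions.get(probeset_id, [])
    # Several accessions can point at the same gene, so collapse to a set.
    entrez_ids = {accession_to_entrez[a] for a in accessions if a in accession_to_entrez}
    return sorted((e, entrez_to_symbol.get(e)) for e in entrez_ids)


def disagreements():
    """Return probeset IDs whose derived symbols do not match the vendor csv."""
    out = []
    for probeset_id, vendor_symbol in vendor_csv_symbol.items():
        derived = [symbol for _, symbol in annotate(probeset_id)]
        if derived != [vendor_symbol]:
            out.append(probeset_id)
    return sorted(out)


if __name__ == "__main__":
    for ps in sorted(probeset_to_accessions):
        print(ps, annotate(ps))
    print("differs from vendor csv:", disagreements())
```

In this toy setup the second probeset picks up an updated symbol and the third loses its gene mapping entirely, which is the kind of difference you can see between a package and the csv.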
You could also annotate using biomaRt, and you will get reasonably similar results. Any differences would come down to the data housed at NCBI versus the data housed at EBI.