Question regarding Bioconductor package "affy" and custom-commercial platform
Maximilian ▴ 10
@maximilian-11742
Last seen 6.9 years ago

Dear Ladies and Gentlemen,

I am trying to use a custom-commercial Affymetrix platform called 'Affymetrix Escherichia coli full sequence array (EcFS) [EcFS_1, EcFS_2, and EcFS_3]'. The platform data can be found under the GEO accession number GPL13336:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL13336

Because it is custom-commercial, there is no commercial probe table from Affymetrix, but the authors published the respective _lq.txt and cdf.txt files for this platform (link). I want to use the Bioconductor package "affy" to get the presence/absence calls of the genes from the corresponding .CEL files.

Now my question: Is it possible to derive the probe table for this custom-commercial Affymetrix platform from the published files? If so, could someone please explain how to do it, or recommend a manual where I can read up on it?

Thank you for your help.

Kind regards,

Maximilian

affy • 1.1k views

What exactly do you mean by 'probe table'? If you mean that you want to generate a cdfenv or cdf package, then see the vignettes and help files for the makecdfenv package.


Dear Mr. MacDonald,

Thank you for your response. By 'probe table' I mean the Probe Set Data in Tabular Format for the respective Affymetrix .CEL files. I assumed you need a specific Probe Set Data in Tabular Format for each type of Affymetrix microarray.

For example for the commercial Affymetrix Human Genome U133 Plus 2.0 Array you have a specific Probe Set Data in Tabular Format to link the genes to their respective expression.

Did I understand this correctly? I ask because the data I used (link in first post) came from a custom-commercial Affymetrix microarray, and there is no commercial Probe Set Data in Tabular Format for it.

I want to get the present/absent calls for the genes in the respective .CEL files. Are the cdfenv files you suggested the way to do this?

Thank you for your help.

Kind regards,

Maximilian


You are saying things here that don't make sense, or more correctly I should say are meaningless. When you say

"For example for the commercial Affymetrix Human Genome U133 Plus 2.0 Array you have a specific Probe Set Data in Tabular Format to link the genes to their respective expression."

That's basically gibberish because while there is a 'Probe Set Data in Tabular Format', it is not used to link genes to their respective expression. The CDF file links probes (and their location on the array) to probesets, and that mapping is then used to generate expression values. The makecdfenv package processes the CDF file so the affy package can then use that information to process the CEL files.
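To make the role of the CDF concrete, here is a toy base-R sketch. It is not what makecdfenv or affy actually do internally: the two data frames are made-up stand-ins for a CDF and a CEL file, the probeset names are hypothetical, and the median-of-log2 summary is only a placeholder for a real method such as RMA or MAS5. It just shows the idea of joining probe positions to probesets and then summarizing per probeset.

```r
# Toy stand-in for a CDF: each probe has an (x, y) position on the array
# and belongs to exactly one probeset. All names and values are made up.
cdf <- data.frame(
  x        = c(0, 1, 2, 0, 1, 2),
  y        = c(0, 0, 0, 1, 1, 1),
  probeset = c("geneA_at", "geneA_at", "geneA_at",
               "geneB_at", "geneB_at", "geneB_at"),
  stringsAsFactors = FALSE
)

# Toy stand-in for one CEL file: an intensity for every (x, y) cell.
cel <- data.frame(
  x         = c(0, 1, 2, 0, 1, 2),
  y         = c(0, 0, 0, 1, 1, 1),
  intensity = c(64, 128, 256, 16, 16, 32)
)

# Join intensities to probesets via the array coordinates.
probes <- merge(cdf, cel, by = c("x", "y"))

# Summarize each probeset as the median of its log2 probe intensities.
# This is a deliberately simple placeholder, not RMA or MAS5.
expr <- tapply(log2(probes$intensity), probes$probeset, median)
print(expr)
```

With these toy numbers, geneA_at is summarized as 7 and geneB_at as 4. Without the probe-to-probeset mapping, the CEL intensities are just values at array coordinates, which is why the CDF (or an environment built from it) is required.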

There is no need to assume how things work, when you can actually read some documentation and find out how they work. I have already pointed you to the makecdfenv vignette and help pages. If you had read that, I doubt you would be asking me if the 'cdfenv files are the way to do this'. There are also vignettes and the help pages for the affy package that should help clear things up.

If you are planning on using Open Source tools like R and Bioconductor, you will do yourself a service by learning how to figure things out by yourself. There is no paid support staff, and those who would help you will soon stop helping if it appears you are unwilling to help yourself.

The best way to learn is

  1. Read the documentation that comes with the software you want to use.
  2. Use Google searches to find other relevant documentation.
  3. If 1 and 2 don't answer your question, try asking here.

If it appears that you are unwilling to do 1 and 2, then 3 will soon become unavailable to you as well, because people will just see your questions and say 'ugh, that dude again...' and just ignore your post. Don't be that dude.


Dear Maximilian,
It seems you are rather new to the analysis of microarray data. Although you stated your problem rather cryptically in this and your other post, I understand what you would ultimately like to achieve: normalized expression data from the E. coli experiments you linked to, which you would like to analyze using the COBRA toolbox. For this, information on whether a probeset/gene is expressed is also handy.
Nevertheless, you will first need to know which basic(!) steps are required to do this given the public datasets you would like to use. Don't take this the wrong way, but you cannot and should not expect someone on a forum like this to take you by the hand and explain step by step what to do after you asked a rather unclear question... just like James said. It would be best to discuss this with your PI or an experienced colleague in your group or at your university. If, after reading the relevant documents (vignettes) and help pages, you still face specific problems, you will find many knowledgeable people here who will try to help you with them... but be clear and specific!


Coming back to your question: the Affymetrix dataset you linked to is indeed from a custom-made Affymetrix array. To reanalyse it, you will need the raw data (CEL) and chip definition (CDF) files. You can download the CEL files from GEO, but the CDF GEO provides for this array is a weird one: it is in TXT format, and although there are TXT-ASCII based CDFs, the format and content of the available CDF don't adhere to the expected standards. In other words, you cannot convert it into a CDF environment (which BioC/R needs to normalize the data) using the makecdfenv package [that is what James was referring to in his first post...]. I would suggest you get in touch with the submitter of that dataset and ask them directly for the CDF file. Alternatively, you could perhaps use the normalized data, which can also be downloaded from GEO.
Please also have a look at the limma user's guide, which provides an entry on how to normalize one- or two-colour Agilent data. For Affymetrix datasets, this, this and this site are also useful to get started.


Dear Mr. Hooiveld,

I understand what you wrote. I am sorry that my question was too unspecific and that I didn't really understand the material. I didn't mean to appear so demanding. Thank you, Mr. Hooiveld, for your useful links. I will come back with more knowledge and with more specific and understandable questions.

Kind regards,

Maximilian   
