Question

Tigre Package question 3


Antti Honkela ▴ 50


Dear Anisha,

(FYI: taking the discussion back to the list.)

mmgmos is only meant for processing Affymetrix microarray data so it cannot be used with RNA-seq data.

It is possible to obtain similar error bars for expression estimates as mmgmos provides from RNA-seq using suitable analysis tools such as BitSeq, but unfortunately using those in tigre does not currently work out of the box. If you need this, e.g. if you have data with very few or no replicates of your time series, please let me know so we can try to work it out.

If you have sufficiently many replicate time series, you should be OK without any pre-specified variances - the model will fit a simple variance model in this case. To trigger this behaviour, you should use processRawData() to process your expression data matrix and pass the resulting ExpressionTimeSeries object to GP...() functions.

Antti

On 2014-02-11 00:16, Solanki, Anisha wrote:
> Dear Antti,
>
> Thanks for your reply. The information you have given me has been very
> useful. I had another quick question regarding the mmgmos command. I
> understand that the command accepts data as an AffyObject. However, I have
> data from RNA-Seq and not from affymetrix microarrays. Hence I cannot
> create an Affyobject from my data as the object requires CEL files to
> convert the data into an AffyObject. Is there any other alternative other
> than using an Affyobject. I have tried to run a matrix with the expression
> values of every sample from my data. However the command mmgmos doesn't
> seem to accept this as a valid object.
>
> Please advise.
>
> Thanks
>
> Anisha
>
>
> On 10/02/2014 09:38, "Antti Honkela" <antti.honkela at hiit.fi> wrote:
>
>> On 2014-02-09 18:49, Solanki, Anisha wrote:
>>
>> Dear Anisha,
>>
>>> I have now solved the previous error by adding variances independently
>>> to the expression Dataset.
>>
>> The error variances are critical to the accuracy of the method, so you
>> should never just impute any values there without careful consideration.
>> More about how you could fix this better below.
>>
>>> I just had another quick question. The targets are
>>> ranked by the log-likelihood. Does this mean that the higher the
>>> log-likelihood the greater the probability of the gene being a target or
>>> vice versa? Also what does null log likelihood stand for?
>>
>> Our method is based on comparing log-likelihoods over different data
>> sets (time series for different genes), which is slightly trickier than
>> usual comparison of log-likelihoods over the same data.
>>
>> The log-likelihood measures how well the data fit a model assuming
>> regulation, therefore higher log-likelihood should be counted as
>> evidence for being a target.
>>
>> That said, some time series are easy to fit, and get a high likelihood
>> over practically any model. To catch these, we fit the baseline or null
>> model (which is just a time-independent Gaussian). We can then filter
>> out genes that fit the null model equally well or better than the true
>> model.
>>
>> Finally, even though one might consider the likelihood ratio of real vs.
>> null a useful statistic, it is actually not good for ranking. This is
>> because the range of null model likelihoods is much larger, and
>> therefore the ranking will be determined by how badly the null model
>> fits instead of how well the real model fits, and tell nothing about the
>> regulation.
>>
>> In summary, you should:
>> 1. *Filter* by likelihood ratio real/null: only keep genes where
>>    log-likelihood > null-log-likelihood
>> 2. *Rank* remaining genes by log-likelihood
>>
>>>> I think this means that my Data lacks calculated variances. As I
>>>> understand from your User guide you process affymetrix Datasets using
>>>> the mmgmos command from the PUMA package which automatically calculates
>>>> the variances for you. However, when I try to run my expression value
>>>> matrix through this mmgmos command it doesn't work and gives me this
>>>> error "unable to find an inherited method for function 'probeNames' for
>>>> signature '"ExpressionTimeSeries"'"
>>
>> You should run mmgmos on the original AffyBatch object, not on an
>> ExpressionTimeSeries object.
>>
>>
>> Hope this helps,
>>
>> Antti
>>
>> --
>> Antti Honkela
>> antti.honkela at hiit.fi - http://www.hiit.fi/u/ahonkela/

--
Antti Honkela
antti.honkela at hiit.fi - http://www.hiit.fi/u/ahonkela/
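The filter-then-rank advice in the quoted reply above can be illustrated with a small, language-agnostic sketch. It is written in Python rather than R and does not call tigre at all: the `GeneScore` record, the `filter_and_rank` helper, the gene names, and every number are made up for illustration, standing in for whatever log-likelihood and null log-likelihood values your own analysis produces.

```python
from typing import List, NamedTuple


class GeneScore(NamedTuple):
    """Hypothetical record holding the two scores for one candidate target."""
    gene: str
    log_likelihood: float       # fit of the model that assumes regulation
    null_log_likelihood: float  # fit of the time-independent baseline model


def filter_and_rank(scores: List[GeneScore]) -> List[GeneScore]:
    """Drop genes the null model fits equally well or better, then rank."""
    # Step 1: filter by likelihood ratio (real must beat null).
    kept = [s for s in scores if s.log_likelihood > s.null_log_likelihood]
    # Step 2: rank the survivors by log-likelihood, highest first.
    return sorted(kept, key=lambda s: s.log_likelihood, reverse=True)


# Made-up values, for illustration only.
scores = [
    GeneScore("geneA", -3.2, -10.5),
    GeneScore("geneB", 4.1, 5.0),    # null model fits better: filtered out
    GeneScore("geneC", -1.0, -40.0),
    GeneScore("geneD", 2.5, -2.0),
]

ranked = filter_and_rank(scores)

# For contrast only: ordering the same survivors by the ratio instead.
by_ratio = sorted(
    ranked,
    key=lambda s: s.log_likelihood - s.null_log_likelihood,
    reverse=True,
)

print([s.gene for s in ranked])
print([s.gene for s in by_ratio])
```

In this toy data the two orderings differ: the ratio-based order is driven by how badly the null model fits geneC, which is the effect the reply warns about when it recommends ranking by log-likelihood alone.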

Microarray impute PROcess convert mmgmos puma tigre trigger

updated 10.2 years ago by Solanki, Anisha ▴ 60 • written 10.2 years ago by Antti Honkela ▴ 50

score 0 · Answer 1 · 2014-02-11

Dear Antti,

Thanks for your reply. The data I have does not have any replicates; it is just a time series with 7 samples. Is there a different method of calculating the variances?

Please advise.

Thanks,

Anisha