Question: Error while using champ.SVD() function on a subset of data
7 weeks ago, karthikrpad wrote:

Hi,

I am trying to analyze Illumina EPIC methylation array data for a project. Since the project design was not done correctly, I am trying to subdivide the samples into three groups and do the differential comparisons within those groups. When I try to subset the data after the myLoad object has been generated, I get an error asking me to check the dimensions of the subset, even though the dimensions of the matrices seem to be correct. When I instead change my SampleSheet file to list only the samples that I want to keep, and run champ.SVD(), I get the following error:

[===========================]
[<<<<< ChAMP.SVD START >>>>>]
-----------------------------
champ.SVD Results will be saved in ./CHAMP_SVDimages/ .

[SVD analysis will be proceed with 741930 probes and 16 samples.]

[ champ.SVD() will only check the dimensions between data and pd, instead if checking if Sample_Names are correctly matched (because some user may have no Sample_Names in their pd file),thus please make sure your pd file is in accord with your data sets (beta) and (rgSet).]

<< Following Factors in your pd(sample_sheet.csv) will be analysised: >>
<Sample_ID>(character):TCC1, TCC2, TCC3, TCC4, TCC5, TCP1, TCP2, TCP3, TCP4, TCP5, TCP6, TCP7, TCP8, TCP9, TCP10, TCP11
<Sample_Well>(character):A1, B1, C1, D1, E1, F1, G1, H1, A2, B2, C2, D2, E2, F2, G2, H2
<Sample_Group>(character):TCC, TCP
<Slide>(numeric):201496710011, 201496710034
<Array>(character):R01C01, R02C01, R03C01, R04C01, R05C01, R06C01, R07C01, R08C01
<X>(factor):, .
[champ.SVD have automatically select ALL factors contain at least two different values from your pd(sample_sheet.csv), if you don't want to analysis some of them, please remove them manually from your pd variable then retry champ.SVD().]

<< Following Factors in your pd(sample_sheet.csv) will not be analysis: >>
<Sample_Name>
<Sample_Plate>
<Pool_ID>
[Factors are ignored because they only indicate Name or Project, or they contain ONLY ONE value across all Samples.]

<< PhenoTypes.lv generated successfully. >>
Error in summary(lm(svd.o$v[, c] ~ PhenoTypes.lv[[f]]))$coeff[2, 4] :
  subscript out of bounds

 

I am not sure what is going wrong, so any suggestions or help is appreciated. Thanks!

7 weeks ago, rcavalca (United States) wrote:

Hello,

I'm a colleague of the poster, but I figured I'd post my findings here in case anyone else has this issue. The problem turned out to be that one of the columns (Slide) in our phenotype data could be construed as numeric, and so in the block

    for(c in 1:topPCA)
        for(f in 1:ncol(PhenoTypes.lv))
            if(class(PhenoTypes.lv[,f])!="numeric")
                svdPV.m[c,f] <- kruskal.test(svd.o$v[,c] ~ as.factor(PhenoTypes.lv[[f]]))$p.value
            else
                svdPV.m[c,f] <- summary(lm(svd.o$v[,c] ~ PhenoTypes.lv[[f]]))$coeff[2,4];

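we were falling into the else branch, where indexing the lm() coefficient table with [2,4] raised the "subscript out of bounds" error. We resolved the issue by calling as.factor() on the Slide column, so that it goes through the kruskal.test() branch instead.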
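
For anyone who wants to check this outside of ChAMP, here is a minimal base-R sketch on made-up data. The data frame `pd`, the vector `pc`, and the random values are illustrative assumptions; only the two slide IDs and the group sizes are taken from the log above. One plausible reading, not confirmed in this thread, is that lm() treats a numeric Slide column with such large, nearly identical values as collinear with the intercept, so the coefficient table has no second row to index.

```r
# Synthetic stand-ins (assumptions): 16 samples, two slide IDs as in the log above.
set.seed(1)
pd <- data.frame(
  Sample_Group = rep(c("TCC", "TCP"), times = c(5, 11)),
  Slide        = rep(c(201496710011, 201496710034), each = 8)
)
pc <- rnorm(nrow(pd))  # stands in for one column of svd.o$v

# Numeric branch: inspect the coefficient table before indexing it with [2, 4].
coef_numeric <- summary(lm(pc ~ pd$Slide))$coefficients
print(dim(coef_numeric))

# Workaround from this answer: treat Slide as a categorical batch variable,
# so the non-numeric (kruskal.test) branch is used instead.
pd$Slide <- as.factor(pd$Slide)
p_slide <- kruskal.test(pc ~ pd$Slide)$p.value
print(p_slide)
```

With real data, the same conversion would be applied to the Slide column of the pd object (for example `myLoad$pd$Slide`, assuming the usual champ.load() output) before calling champ.SVD().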

Thanks, Raymond Cavalcante
