Question: Error while using champ.SVD() function on a subset of data
9 months ago by
karthikrpad10 wrote:


I am trying to analyze Illumina EPIC methylation array data for a project. Because the project design was flawed, I am trying to divide the samples into three groups and do the differential comparisons within those groups. When I try to subset the data after the myLoad object has been generated, I get an error asking me to check the dimensions of the subset, even though the dimensions of the matrices seem to be correct. When I instead change my SampleSheet file to list only the samples I want in the subset and run champ.SVD(), I get the following error:

[<<<<< ChAMP.SVD START >>>>>]
champ.SVD Results will be saved in ./CHAMP_SVDimages/ .

[SVD analysis will be proceed with 741930 probes and 16 samples.]

[ champ.SVD() will only check the dimensions between data and pd, instead if checking if Sample_Names are correctly matched (because some user may have no Sample_Names in their pd file),thus please make sure your pd file is in accord with your data sets (beta) and (rgSet).]

<< Following Factors in your pd(sample_sheet.csv) will be analysised: >>
<Sample_ID>(character):TCC1, TCC2, TCC3, TCC4, TCC5, TCP1, TCP2, TCP3, TCP4, TCP5, TCP6, TCP7, TCP8, TCP9, TCP10, TCP11
<Sample_Well>(character):A1, B1, C1, D1, E1, F1, G1, H1, A2, B2, C2, D2, E2, F2, G2, H2
<Sample_Group>(character):TCC, TCP
<Slide>(numeric):201496710011, 201496710034
<Array>(character):R01C01, R02C01, R03C01, R04C01, R05C01, R06C01, R07C01, R08C01
<X>(factor):, .
[champ.SVD have automatically select ALL factors contain at least two different values from your pd(sample_sheet.csv), if you don't want to analysis some of them, please remove them manually from your pd variable then retry champ.SVD().]

<< Following Factors in your pd(sample_sheet.csv) will not be analysis: >>
[Factors are ignored because they only indicate Name or Project, or they contain ONLY ONE value across all Samples.]

<< generated successfully. >>
Error in summary(lm(svd.o$v[, c] ~[[f]]))$coeff[2, 4] :
  subscript out of bounds


I am not sure what is going wrong, so any suggestions or help is appreciated. Thanks!

9 months ago by
United States
rcavalca120 wrote:


I'm a colleague of the poster, and I figured I'd post my findings here in case anyone else has this issue. The problem turned out to be that one of the columns (Slide) in our phenotype data was being treated as numeric, and so in the block

    for(c in 1:topPCA)
        for(f in
                svdPV.m[c,f] <- kruskal.test(svd.o$v[,c] ~[[f]]))$p.value
                svdPV.m[c,f] <- summary(lm(svd.o$v[,c] ~[[f]]))$coeff[2,4];

we were falling into the else branch (the lm() line rather than the kruskal.test() line), which is where the error was raised. We resolved the issue by calling as.factor() on the Slide column.
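For anyone who wants to check this outside of ChAMP, here is a minimal sketch of the situation and the fix. It uses a toy phenotype table, not the real myLoad object: the names pd, pc, coef_tab and p_slide are made up for illustration, the Slide values are copied from the log above, and the 8-per-slide split is an assumption based on the eight Array positions. One plausible reason the lm() branch fails is that the two slide IDs are huge and nearly identical, so lm() may treat the column as collinear with the intercept and drop its coefficient, leaving no second row for coeff[2, 4]. I have not confirmed that this is exactly what happens inside champ.SVD().

```r
# Toy stand-in for the phenotype table (myLoad$pd in a typical ChAMP session).
pd <- data.frame(
  Sample_Group = rep(c("TCC", "TCP"), times = c(5, 11)),
  Slide = rep(c(201496710011, 201496710034), each = 8),  # assumed 8 per slide
  stringsAsFactors = FALSE
)

class(pd$Slide)  # "numeric", so this column would take the lm() branch

# Made-up scores standing in for one column of svd.o$v.
set.seed(1)
pc <- rnorm(nrow(pd))

# Inspect the coefficient table for the numeric Slide; if it has fewer than
# two rows, indexing [2, 4] fails with "subscript out of bounds".
coef_tab <- summary(lm(pc ~ pd$Slide))$coefficients
nrow(coef_tab)

# The fix from the answer: treat Slide as categorical.
pd$Slide <- as.factor(pd$Slide)
p_slide <- kruskal.test(pc ~ pd$Slide)$p.value
p_slide
```

In a real session the equivalent step would be something like `myLoad$pd$Slide <- as.factor(myLoad$pd$Slide)` before calling champ.SVD() again with that pd.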

Thanks, Raymond Cavalcante


This helped a lot! Thanks!

written 4 days ago by weinhold0