Question: Merge column to SummarizedExperiment from other dataframe
9 months ago, s.w.vanderlaan wrote:

Hi,

I have a RangedSummarizedExperiment which looks like this:

class: RangedSummarizedExperiment 

dim: 483731 485 

metadata(4): creationDate author BBMRIomicsVersion note

assays(1): data

rownames(483731): cg01707559 cg02004872 ... ch.22.47579720R ch.22.48274842R

rowData names(10): addressA addressB ... probeEnd probeTarget

colnames(485): 200397860027_R01C01 200397860027_R02C02 ... 200556930046_R03C01 200556930046_R06C02

colData names(946): STUDY_NUMBER SampleID ... Basename ID

And I have a dataframe which looks like this: 

STUDY_NUMBER    UPID    Testosterone    Estradiol    SHBG    Gender
1    1    NA    NA    NA    male
2    2    NA    NA    NA    male
3    3    10.02    62    49.6    male
4    4    NA    NA    NA    male
5    5    NA    NA    NA    female

I would like to merge this table (3662 rows) into the colData, based on STUDY_NUMBER. So I used the following code:

colData(aems450k1.MvaluesQCIMPplaqueSE) <- merge(colData(aems450k1.MvaluesQCIMPplaqueSE), AEDB_Q1_20180223_sex, by = "STUDY_NUMBER", all.x = TRUE)

Which results in the following RangedSummarizedExperiment object:

class: RangedSummarizedExperiment

dim: 483731 485

metadata(4): creationDate author BBMRIomicsVersion note

assays(1): data

rownames(483731): cg01707559 cg02004872 ... ch.22.47579720R ch.22.48274842R

rowData names(10): addressA addressB ... probeEnd probeTarget

colnames: NULL

colData names(952): STUDY_NUMBER SampleID ... Sex T_E2

You'll note that colnames is now NULL. My question therefore: 

How can I prevent this from happening?

My second question: 

Could this be happening because the order (based on STUDY_NUMBER) of the two dataframes is not the same?

In fact, could this result in the colData being 'uncoupled' from the assay data? The reason I am thinking this is that an analysis of a variable X in the dataset (not in the merged data) gives a significant result. After merging (variable X itself has not changed!), the exact same analysis is no longer significant...

Many thanks,

Sander

Answer: Merge column to SummarizedExperiment from other dataframe
9 months ago, James W. MacDonald (United States) wrote:

When you merge the colData and your data.frame you end up changing the rownames of the resulting DataFrame, which is where the colnames come from. You could just do

cn <- colnames(aems450k1.MvaluesQCIMPplaqueSE)

colData(aems450k1.MvaluesQCIMPplaqueSE) <- merge(colData(aems450k1.MvaluesQCIMPplaqueSE), AEDB_Q1_20180223_sex, by.x = "STUDY_NUMBER", by.y = "STUDY_NUMBER", all.x = TRUE)

colnames(aems450k1.MvaluesQCIMPplaqueSE) <- cn
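One caveat with restoring the saved colnames: merge() sorts the result on the key by default, so the saved names only line up if the colData was already sorted by STUDY_NUMBER. A minimal base R sketch of the problem and of an order-preserving lookup with match() follows. The toy tables `pheno` and `extra` are made-up stand-ins for the colData and the external data frame, not the poster's data, and the assumption is that the same indexing idea carries over to a DataFrame.

```r
# Toy stand-in for colData: row names are array IDs, and the rows are
# deliberately NOT sorted by STUDY_NUMBER (hypothetical data).
pheno <- data.frame(
  STUDY_NUMBER = c(3, 1, 2),
  row.names = c("arrayA", "arrayB", "arrayC")
)

# Toy stand-in for the external table; it contains an extra subject (4).
extra <- data.frame(
  STUDY_NUMBER = c(1, 2, 3, 4),
  Gender = c("male", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# merge() sorts on the key by default and replaces the row names,
# so the rows no longer follow the original sample order.
merged <- merge(pheno, extra, by = "STUDY_NUMBER", all.x = TRUE)
rownames(merged)
merged$STUDY_NUMBER

# Order-preserving alternative: for each sample, look up its row in 'extra'.
# Samples without a match get NA, which mirrors all.x = TRUE.
idx <- match(pheno$STUDY_NUMBER, extra$STUDY_NUMBER)
new_cols <- setdiff(names(extra), "STUDY_NUMBER")

aligned <- pheno
aligned[new_cols] <- extra[idx, new_cols, drop = FALSE]
aligned
```

Because `aligned` keeps the rows and row names of `pheno` untouched, nothing needs to be restored afterwards. This assumes STUDY_NUMBER is unique in the external table; with duplicates, match() silently takes the first hit.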


I think the issue is that the colData ends up in a different order than the assay data, which should not happen. But if I add sort = FALSE to the merge command everything is just fine, and I can add the colnames afterwards. So:

dim(aems450k1.MvaluesQCIMPplaqueSE) 

aems450k1.MvaluesQCIMPplaqueSE 

colData(aems450k1.MvaluesQCIMPplaqueSE) <- merge(colData(aems450k1.MvaluesQCIMPplaqueSE), AEDB_Q1_20180223_sex, by = "STUDY_NUMBER", sort = FALSE) 

colnames(aems450k1.MvaluesQCIMPplaqueSE) <- aems450k1.MvaluesQCIMPplaqueSE$ID 

dim(aems450k1.MvaluesQCIMPplaqueSE)

Which results in :

class: RangedSummarizedExperiment 
dim: 483731 485 
metadata(4): creationDate author BBMRIomicsVersion note
assays(1): data
rownames(483731): cg01707559 cg02004872 ... ch.22.47579720R ch.22.48274842R
rowData names(10): addressA addressB ... probeEnd probeTarget
colnames(485): 8918692001_R01C01 8918692001_R02C01 ... 9221198166_R06C01 9221198166_R06C02
colData names(946): STUDY_NUMBER SampleID ... Basename ID

This is the correct order of the colnames, whereas without sort = FALSE the order of colnames would be: colnames(485): 9221198166_R06C02 9221198166_R06C01 ... 8918692001_R02C01 8918692001_R01C01.

Does this make sense?
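Since the base R documentation for merge() describes the row order as unspecified when sort = FALSE, it may be safer to verify the alignment explicitly before assigning colnames. The sketch below uses made-up toy tables (`pheno`, `extra`) rather than the real object, and assumes an ID column that uniquely identifies each sample, as in the colData above.

```r
# Hypothetical stand-in for colData, with a unique per-sample ID column.
pheno <- data.frame(
  STUDY_NUMBER = c(3, 1, 2),
  ID = c("arrayA", "arrayB", "arrayC"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the external table.
extra <- data.frame(
  STUDY_NUMBER = c(1, 2, 3, 4),
  Gender = c("male", "female", "male", "female"),
  stringsAsFactors = FALSE
)

merged <- merge(pheno, extra, by = "STUDY_NUMBER", sort = FALSE)

# Guard 1: no samples were dropped or duplicated by the merge.
stopifnot(nrow(merged) == nrow(pheno))

# Guard 2: if the sample order changed, restore it using the ID column.
if (!identical(merged$ID, pheno$ID)) {
  merged <- merged[match(pheno$ID, merged$ID), , drop = FALSE]
}

# Only now is it safe to use the IDs as row names.
rownames(merged) <- merged$ID
merged
```

Note that without all.x = TRUE the merge drops samples that have no match in the external table; the first guard catches that case before it can cause a mismatch with the assay columns.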

Reply written 9 months ago by s.w.vanderlaan