Creating a subsetted MRexperiment based on sample name?
rleonzay • 0
@rleonzay-8504

Hi,

I'm working with 53 samples of OTU sequences and I was hoping to create a new MRexperiment containing a subset of the samples. I would like to select the samples by sample name.

I can use the command samplesToKeep = which(pData(obj)$sampleName == "XX") and, together with "featuresToKeep", create a new MRexperiment. But I can only get one sample at a time using the sample name, so I go from:

MRexperiment (storageMode: environment)
assayData: 1319558 features, 53 samples 
  element names: counts 
protocolData: none
phenoData
  sampleNames: Yf 717 ... P4 (53 total)
  varLabels: ProjectName LinkerPrimerSequence ... BarcodeName (5 total)
  varMetadata: labelDescription
featureData
  featureNames: denovo0 denovo1 ... denovo1319557 (1319558 total)
  fvarLabels: taxonomy1 taxonomy2 ... taxonomy8 (8 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'

to

MRexperiment (storageMode: environment)
assayData: 1319558 features, 1 samples 
  element names: counts 
protocolData: none
phenoData
  sampleNames: Yc
  varLabels: ProjectName LinkerPrimerSequence ... BarcodeName (5 total)
  varMetadata: labelDescription
featureData
  featureNames: denovo0 denovo1 ... denovo1319557 (1319558 total)
  fvarLabels: taxonomy1 taxonomy2 ... taxonomy8 (8 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'

Is there a way to add more than one sample to the subset? Or, if I separate the samples individually, is there a way to merge those MRexperiment objects?

Thank you for your time.

 

@joseph-nathaniel-paulson-6442

msd16s is just an example dataset; another is mouseData, which I use in the example further down. For your specific case, though, it looks like you'll have to specify the samples you're interested in, either by index or by sample name.

samplesToKeep = c("Yf","717","718")

obj2 = obj[,samplesToKeep]

To add metadata you can use pData(obj)$variableName = vector for samples, or fData(obj)$variableName = vector for features, but be careful that the values are consistent and in the same order as the samples (or features) in the object.
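As a rough sketch of the metadata route, assuming the metagenomeSeq and Biobase packages are installed: the toy counts, the four sample names borrowed from this thread, and the column name keepGroup are all made up for illustration and are not part of the original object.

```r
library(Biobase)        # pData, sampleNames, AnnotatedDataFrame
library(metagenomeSeq)  # newMRexperiment

# Toy stand-in for the real object: 3 features x 4 samples (illustrative data).
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("denovo", 0:2),
                                 c("Yf", "717", "B", "718")))
pd <- data.frame(Description = colnames(counts),
                 row.names = colnames(counts),
                 stringsAsFactors = FALSE)
obj <- newMRexperiment(counts, phenoData = AnnotatedDataFrame(pd))

# Build the new column by matching on sample names, so the values stay
# aligned with the samples whatever their order in the object.
wanted <- c("Yf", "717", "718")
pData(obj)$keepGroup <- ifelse(sampleNames(obj) %in% wanted, "subset", "other")

# Subset on the new (hypothetical) metadata column.
obj2 <- obj[, pData(obj)$keepGroup == "subset"]
sampleNames(obj2)
```

Deriving the column from sampleNames(obj) rather than typing the values by hand is one way to avoid the ordering problem mentioned above.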

 

> data(mouseData)
> mouseData
MRexperiment (storageMode: environment)
assayData: 10172 features, 139 samples 
  element names: counts 
protocolData: none
phenoData
  sampleNames: PM1:20080107 PM1:20080108 ... PM9:20080303 (139 total)
  varLabels: mouseID date ... status (5 total)
  varMetadata: labelDescription
featureData
  featureNames: Prevotellaceae:1 Lachnospiraceae:1 ...
    Parabacteroides:956 (10172 total)
  fvarLabels: superkingdom phylum ... OTU (7 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
> mouseData[,pData(mouseData)$mouseID=="PM1"]
MRexperiment (storageMode: environment)
assayData: 10172 features, 12 samples 
  element names: counts 
protocolData: none
phenoData
  sampleNames: PM1:20080107 PM1:20080108 ... PM1:20080303 (12 total)
  varLabels: mouseID date ... status (5 total)
  varMetadata: labelDescription
featureData
  featureNames: Prevotellaceae:1 Lachnospiraceae:1 ...
    Parabacteroides:956 (10172 total)
  fvarLabels: superkingdom phylum ... OTU (7 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
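The same pattern can probably be extended to several groups at once with %in%. This is only a sketch, assuming metagenomeSeq is installed; picking the first two mouse IDs found in the data is an arbitrary choice made for illustration.

```r
library(Biobase)
library(metagenomeSeq)
data(mouseData)

# Take the first two mouse IDs present in the data as the groups to keep
# (an arbitrary, illustrative choice).
ids <- as.character(pData(mouseData)$mouseID)
wanted <- unique(ids)[1:2]

# %in% returns one TRUE/FALSE per sample, so several groups are kept at once.
sub <- mouseData[, ids %in% wanted]
table(as.character(pData(sub)$mouseID))
```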

That worked great!! Thank you so much!

@joseph-nathaniel-paulson-6442

Hi rleonzay,

If each sample has a unique sample name, then `which(pData(obj)$sampleName=="XX")` returns only the column index for sample "XX". However, you can build a vector covering all the samples of interest and use it to subset obj.
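To illustrate the difference in plain R, here is a small sketch. The sampleName vector is a hypothetical stand-in for pData(obj)$sampleName, filled with four sample names mentioned in this thread.

```r
# Hypothetical stand-in for pData(obj)$sampleName.
sampleName <- c("Yf", "717", "B", "718")

# == with a single value matches one sample only.
which(sampleName == "Yf")

# %in% with a vector of names matches all of them in one step.
wanted <- c("Yf", "717", "718")
samplesToKeep <- which(sampleName %in% wanted)
samplesToKeep
```

The resulting index vector could then be used as obj[, samplesToKeep].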

As an example the data vignette for msd16s [http://bioconductor.org/packages/release/data/experiment/vignettes/msd16s/inst/doc/msd16s.pdf] shows a few subsetting examples. If you're still unable to subset the MRexperiment object, feel free to send me an MRE and I'll show you specifically how.

 

Best,


 

 

Thank you for your response. It seems like the msd16s package will make subsetting the MRexperiment object much easier, but if I'm reading the msd16s vignette correctly, I would still have the same problem getting my subset. I guess the issue is that the samples I want to subset out don't have any metadata or feature in common. To use the sample dataset from the msd16s vignette as an example, it would be like trying to subset together samples from "Bangladesh", "Gambia" and "Mexico". The pData of my object looks like this:

> pData(obj)
    ProjectName LinkerPrimerSequence BarcodeSequence Description   BarcodeName
Yf  040914JB27F AGRGTTTGATCMTGGCTCAG        CGTAGATA          Yf  Ill.27F.bar4
717 040914JB27F AGRGTTTGATCMTGGCTCAG        CTAATCGC         717 Ill.27F.bar23
B   040914JB27F AGRGTTTGATCMTGGCTCAG        CGTAGGCT           B  Ill.27F.bar5
718 040914JB27F AGRGTTTGATCMTGGCTCAG        CTAATGCA         718 Ill.27F.bar24 

So, for example, I want to have samples Yf, 717 and 718 all together in the same MRexperiment object. I'm not very experienced with R just yet, so my apologies if this is very simple and I just don't get it yet.

I guess another option would be to add metadata to the MRexperiment object?
