Easiest way to convert read10xMolInfo data into a dataframe in R with gene labels as row names
Sitapriya • 0
@4f700f46
Last seen 12 months ago
United States

Hi, I have been playing around with read10xMolInfo() from DropletUtils.

I am looking for a simple way to convert the .h5 file into an R data frame or data table, with the gene names added either as a separate column or as row names.

I came up with the solution below, but I am not sure whether it is correct.

I am assuming that the genes are indexed by position.

That is, the gene ID at position 1 of the gene list corresponds to the rows with gene = 1 in the mol.info table.

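library(DropletUtils)
library(dplyr)

# Read the molecule information file.
mol.info <- read10xMolInfo(mol.info.file.h5)

# Convert the DataFrame to a base data.frame.
mol.info.df <- as.data.frame(mol.info$data)

# Look up each gene name by its integer index.
molecule.df.gene.name <- mol.info.df %>%
    mutate(Gene = mol.info$genes[gene])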
tidyverse R SingleCellExperiment SingleCellData • 1.8k views

Initially posted to https://github.com/MarioniLab/DropletUtils/issues/104

Peter Hickey ▴ 740
@petehaitch
Last seen 7 weeks ago
WEHI, Melbourne, Australia

An example usually helps. Here's a way to do it using base R.

library(DropletUtils)

# Mocking up some 10X HDF5-formatted data.
out <- DropletUtils:::simBasicMolInfo(tempfile())

# Reading the resulting file.
mol_info <- read10xMolInfo(out)

# Converting to data.frame.
mol_info_df <- as.data.frame(mol_info$data)
# Replace the integer index in the gene column with the gene name.
mol_info_df$gene <- mol_info$genes[mol_info_df$gene]

# View the first few rows of the result.
head(mol_info_df)
#>   cell    umi gem_group   gene reads
#> 1 CCGA 366381         1  ENSG5    13
#> 2 CACG 673390         1  ENSG4     3
#> 3 ACAC 894928         1 ENSG11    10
#> 4 TACG 456777         1 ENSG10    12
#> 5 GATG 757426         1 ENSG14    10
#> 6 GTCT 833839         1  ENSG9    12
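
The lookup itself can be tried without DropletUtils or a real file. The sketch below uses a hand-built list as a stand-in for the output of read10xMolInfo(); the values and the three gene names are made up for illustration. It also assumes, based on my reading of the read10xMolInfo() documentation, that molecules not assigned to any gene carry the index length(genes) + 1, which is worth checking against your own file.

```r
# Toy stand-in for the list returned by read10xMolInfo() (assumed structure:
# a `data` table with an integer `gene` column and a `genes` character vector).
mol_info <- list(
    data = data.frame(
        cell = c("CCGA", "CACG", "ACAC", "TACG"),
        umi = c(366381L, 673390L, 894928L, 456777L),
        gem_group = 1L,
        gene = c(2L, 1L, 2L, 4L),  # 4 is length(genes) + 1 (assumed: unassigned)
        reads = c(13L, 3L, 10L, 12L)
    ),
    genes = c("ENSG1", "ENSG2", "ENSG3")  # hypothetical gene IDs
)

mol_info_df <- as.data.frame(mol_info$data)

# Indexing past the end of `genes` yields NA, so unassigned molecules
# end up with a missing gene name rather than a wrong one.
mol_info_df$gene <- mol_info$genes[mol_info_df$gene]

# Each row is one molecule, so gene names repeat across rows.
has_duplicates <- anyDuplicated(mol_info_df$gene) > 0

# Optionally drop molecules without a gene assignment.
assigned_df <- mol_info_df[!is.na(mol_info_df$gene), ]
```

Because data.frame row names must be unique and the same gene appears in many rows, the gene names have to stay in a column rather than become row names.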