Deleting a column from data frame and then running DESeq2
amv112 • 0

Forgive me if this post is messy, I'm new to this! I'm analyzing RNA-seq data and found that one of my samples (AV17) is an outlier. I'm trying to exclude it from my analysis with `dds = subset(countData, select = -c(AV17) )`. That line runs without error, but the next line, `dds$condition <- relevel(dds$condition, ref = "CNTRL_WD")`, fails with: "dds$condition <- relevel(dds$condition, ref = "CNTRL_LFD") Error in relevel.default(dds$condition, ref = "CNTRL_LFD") : 'relevel' only for (unordered) factors". The relevel line worked when sample AV17 was included. How can I fix this?

```
dir="/Users/Desktop/Trf2 Data"
setwd=(dir)
directory<-getwd()

library("DESeq2")
library("dplyr")
library("tidyverse")

countData=read.table("/Users/Desktop/Trf2 Data/Trf2_Liver_RawCounts.txt")
dim(countData)

# DONT NEED THIS ANYMORE. UPDATE DOESN'T REQUIRE IT.
rownames(countData)=countData[,1]

# Get rid of rownames column now that rownames are set
countData<-countData[,2:dim(countData)[2]]
dim(countData)

# Metadata ----------------------------------------------------------------

# The col data is the file-group assignment file
coldata = read.delim("/Users/Desktop/Trf2 Data/Trf2_metadata1.txt")
rownames(coldata)<-coldata[,1]
coldata$condition <-as.factor(coldata$condition)
dim(coldata)

# make sure all columns match with sample info
all(colnames(countData) %in% rownames(coldata))
all(colnames(countData)==rownames(coldata))

dds<-DESeqDataSetFromMatrix(countData = countData, colData = coldata, design = ~ condition)
dim(dds)

# Remove any genes without at least 10 counts
keep <- rowSums(counts(dds)) >=10
dds <- dds[keep,]
dds

dds = subset(countData, select = -c(AV17) )

# Set reference factor
mods= ~ relevel(factor(condition), ref="CNTRL_WD")

dds$condition <- relevel(dds$condition, ref = "CNTRL_WD")
dds <- DESeq(dds)
```

sessionInfo()

```
R version 4.3.2 (2023-10-31)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] biomaRt_2.58.0 EnhancedVolcano_1.20.0 ggrepel_0.9.5
[4] BiocManager_1.30.22 lubridate_1.9.3 forcats_1.0.0
[7] stringr_1.5.1 purrr_1.0.2 readr_2.1.5
[10] tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.4
[13] tidyverse_2.0.0 dplyr_1.1.4 DESeq2_1.42.0
[16] SummarizedExperiment_1.32.0 Biobase_2.62.0 MatrixGenerics_1.14.0
[19] matrixStats_1.2.0 GenomicRanges_1.54.1 GenomeInfoDb_1.38.5
[22] IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1

loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 farver_2.1.1 blob_1.2.4 filelock_1.0.3
[5] Biostrings_2.70.1 bitops_1.0-7 fastmap_1.1.1 RCurl_1.98-1.14
[9] BiocFileCache_2.10.1 XML_3.99-0.16 digest_0.6.34 timechange_0.3.0
[13] lifecycle_1.0.4 KEGGREST_1.42.0 RSQLite_2.3.4 magrittr_2.0.3
[17] compiler_4.3.2 progress_1.2.3 rlang_1.1.3 tools_4.3.2
[21] utf8_1.2.4 yaml_2.3.8 knitr_1.45 prettyunits_1.2.0
[25] S4Arrays_1.2.0 labeling_0.4.3 curl_5.2.0 bit_4.0.5
[29] DelayedArray_0.28.0 xml2_1.3.6 abind_1.4-5 BiocParallel_1.36.0
[33] withr_3.0.0 grid_4.3.2 fansi_1.0.6 colorspace_2.1-0
[37] scales_1.3.0 cli_3.6.2 rmarkdown_2.25 crayon_1.5.2
[41] generics_0.1.3 rstudioapi_0.15.0 httr_1.4.7 tzdb_0.4.0
[45] cachem_1.0.8 DBI_1.2.1 zlibbioc_1.48.0 parallel_4.3.2
[49] AnnotationDbi_1.64.1 XVector_0.42.0 vctrs_0.6.5 Matrix_1.6-1.1
[53] hms_1.1.3 bit64_4.0.5 locfit_1.5-9.8 glue_1.7.0
[57] codetools_0.2-19 stringi_1.8.3 gtable_0.3.4 munsell_0.5.0
[61] pillar_1.9.0 rappdirs_0.3.3 htmltools_0.5.7 GenomeInfoDbData_1.2.11
[65] dbplyr_2.4.0 R6_2.5.1 evaluate_0.23 lattice_0.21-9
[69] png_0.1-8 memoise_2.0.1 Rcpp_1.0.12 SparseArray_1.2.3
[73] xfun_0.41 pkgconfig_2.0.3

```


Cleaning up:

I'm analyzing RNA Seq data and found that one of my samples is an outlier (sample AV17). I'm trying to exclude it from my analysis, but whenever I do, using this code: dds = subset(countData, select = -c(AV17) ), it works, but then when I run the next line dds$condition <- relevel(dds$condition, ref = "CNTRL_WD") it fails and says

dds$condition <- relevel(dds$condition, ref = "CNTRL_LFD")
Error in relevel.default(dds$condition, ref = "CNTRL_LFD") : 
  'relevel' only for (unordered) factors

This line would previously work when sample AV17 was included. How can I fix this?

@mikelove

It looks like you have declared something as an ordered factor. DESeq2 doesn't work with those.
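
For reference, here is a small base-R sketch of when `relevel()` raises this message. The toy vectors and the helper `relevel_msg()` are made up for illustration and are not the poster's data. Note that the same error also appears for anything that is not a factor at all, including `NULL`, which is what `$condition` returns on a plain data frame with no such column. So it may be worth checking `class(dds)` and `class(dds$condition)` right before the failing line.

```r
# Toy condition labels (hypothetical, not the poster's data)
cond <- c("CNTRL_WD", "TRT_WD", "CNTRL_WD", "TRT_WD")

# Hypothetical helper: return "ok" or the error message from relevel()
relevel_msg <- function(x, ref) {
  tryCatch({
    relevel(x, ref = ref)
    "ok"
  }, error = function(e) conditionMessage(e))
}

msg_unordered <- relevel_msg(factor(cond), "CNTRL_WD")                  # works
msg_ordered   <- relevel_msg(factor(cond, ordered = TRUE), "CNTRL_WD")  # fails
msg_null      <- relevel_msg(NULL, "CNTRL_WD")                          # fails too

# A plain data frame of counts has no 'condition' column, so `$` gives NULL
counts_df <- data.frame(AV16 = c(5, 0), AV17 = c(9, 2), AV18 = c(3, 1))
no_av17   <- subset(counts_df, select = -c(AV17))
is.null(no_av17$condition)
```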

If you have a character vector, just use the following to define a factor:

dds$condition <- factor(dds$condition, levels=c("these","are","the","levels"))

This will set the order of the levels with the first being the reference.
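
As a minimal sketch of how this could fit together with dropping a sample, the example below removes one sample from both the count table and the sample sheet, then defines the factor with the reference level first. All sample names, counts, and the level `TRT_WD` are invented for illustration; only `AV17` and `CNTRL_WD` come from the question.

```r
# Hypothetical toy inputs standing in for countData and coldata
samples <- c("AV15", "AV16", "AV17", "AV18")
coldata <- data.frame(
  condition = c("TRT_WD", "CNTRL_WD", "TRT_WD", "CNTRL_WD"),
  row.names = samples
)
count_data <- data.frame(
  AV15 = c(10, 0), AV16 = c(12, 3), AV17 = c(90, 40), AV18 = c(11, 2),
  row.names = c("geneA", "geneB")
)

# Drop the outlier from BOTH tables so columns and rows stay aligned
keep_samples <- setdiff(colnames(count_data), "AV17")
count_data   <- count_data[, keep_samples, drop = FALSE]
coldata      <- coldata[keep_samples, , drop = FALSE]

# Unordered factor; the first level listed becomes the reference
coldata$condition <- factor(coldata$condition,
                            levels = c("CNTRL_WD", "TRT_WD"))

# Same sanity check as in the original script
stopifnot(all(colnames(count_data) == rownames(coldata)))
levels(coldata$condition)[1]
```

The filtered tables would then go into `DESeqDataSetFromMatrix()` as in the original script. If the dataset object already exists, indexing its columns, for example `dds[, colnames(dds) != "AV17"]`, should keep it a DESeqDataSet rather than replacing it with a data frame.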
