I am new to R and RStudio but have been trying to work through different examples using Rsubread for my data. I have tried reading vignettes and manuals prior to posting here but I am stuck and could really use some advice.
I have 7 paired-end, fastq files from Illumina sequencing of human samples. I have downloaded RStudio version 2022.12.0+353. I have installed the packages BiocManager and Rsubread using the packages function available in RStudio
I downloaded GRCh38.13.genome.fa.gz (release 43) from GENECODE and built the human genome reference into my working directory. Next I aligned my data one at a time and then placed my 7 aligned files in "BAM436files". I then ran feature counts on all files and placed them in "counts436". Everything seems to work well with most of the reads mapped during the alignment and featurecounts indicated 79-81% assigned alignments in all 7 samples.
The problem comes when I try to just look at the data. I tried to view the first 3 rows using commands like... this just lists all the data, not the first 3 rows. I can export the data to excel and the excel folder has 4 tabs; "counts", "annotation", "targets", "stat". The counts data has the names of my 7 samples across the top and then count data in each column...so I thought I was on the right track. So I guess my question is, why can't I limit the data display to the first "n" rows. Also, can I use this file for DESeq? I been trying to figure out how to generate files like coldata but thought I should make sure my count file would work first.
The codes I used are presented below.
>buildindex(basename = "human_index", reference = "GRCh38.p13.genome.fa.gz")
>align("human_index", readfile1 = "A436EV_1.fq.gz", readfile2 = "A436EV_2.fq.gz", output_file = "A436EV.aligned", output_format = ".", annot.inbuilt = "hg38", nthreads = 8)
>BAM436files <- c("A436EV.aligned", "C436EV.aligned", "E436EV.aligned", "A436200f.aligned", "B436200f.aligned", "C436200f.aligned", "D436200f.aligned")
>counts436 <- featurecounts(BAM436files, annotated.ext="directory for GRCh38.GTF", isGTFannotationFile=TRUE, nthreads=4, isPairedEnd=TRUE, countMulitMappingReads=FALSE, reportReads="CORE"
>head(counts436, 3)
>install.packages("openxlsx")
>library(openxlsx)
>write.xlsx(counts436, "destination folder"
# include your problematic code here with any corresponding output
# please also include the results of running the following in an R session
sessionInfo( )
Thanks for the help. I was able to work through it using the bioconductor link. For some reason, I thought I had to create a new tab delimited table using excel or a text editor for coldata.