Hi all,
I've been trying to analyse an RNA-seq dataset, and I decided to try the newer HISAT2 > StringTie > Ballgown approach instead of TopHat2 > Cufflinks > CummeRbund etc.
I'm having real trouble working out how to handle my biological replicates, as there doesn't seem to be much documentation or discussion on these newer tools. It seems like most people would use Cuffnorm, and it's easy to see why, as you can very easily specify which replicates belong to each sample. I'm sure there's a way to do this in Ballgown, but I'm far too inexperienced to spot it, so any help would be fantastic.
Thanks in advance.
Hi Alyssa,
I have a similar question to the one posted here, except I have 6 biological replicates (2 samples, 3 replicates each) and 4 technical replicates per biological replicate (for a total of 24). I have denoted the replicates in pData as you described. How do I go about combining the expression values and getting the average expression? And at what step of the analysis should I do that?
Thanks.
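On the averaging part of this question, the arithmetic can be sketched as below. This is only an illustration in plain Python, assuming the expression values have already been extracted into a table keyed by feature and sample; it does not settle at which step of the pipeline technical replicates are best combined. The sample names, the transcript ID, the values, and the function name are all hypothetical, and Ballgown itself is an R package, so the same grouping would have to be done on its expression matrix there.

```python
from collections import defaultdict
from statistics import mean


def average_technical_replicates(expression, sample_to_bio_rep):
    """Collapse technical replicates to one mean value per biological replicate.

    expression:        {feature_id: {sample_id: value}}
    sample_to_bio_rep: {sample_id: biological_replicate_id}
    returns:           {feature_id: {biological_replicate_id: mean_value}}
    """
    averaged = {}
    for feature_id, per_sample in expression.items():
        # Group the values of each feature by the biological replicate
        # that every technical replicate belongs to.
        grouped = defaultdict(list)
        for sample_id, value in per_sample.items():
            grouped[sample_to_bio_rep[sample_id]].append(value)
        averaged[feature_id] = {bio: mean(vals) for bio, vals in grouped.items()}
    return averaged


# Hypothetical example: two biological replicates (A1, B1), each with
# four technical replicates (tech1..tech4), and a single made-up transcript.
sample_to_bio_rep = {
    f"{bio}_tech{i}": bio for bio in ("A1", "B1") for i in range(1, 5)
}
expression = {
    "tx1": {
        "A1_tech1": 1.0, "A1_tech2": 2.0, "A1_tech3": 3.0, "A1_tech4": 4.0,
        "B1_tech1": 10.0, "B1_tech2": 10.0, "B1_tech3": 10.0, "B1_tech4": 10.0,
    }
}

if __name__ == "__main__":
    print(average_technical_replicates(expression, sample_to_bio_rep))
```

With all 24 samples mapped this way, the result has one column per biological replicate (6 in total), which is the table that would then go into the group comparison.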
Hi Alyssa, I'm new to R and Ballgown. I have 16 samples run through HISAT2 with the --dta option and StringTie with the -B option. I made pheno_data and a ballgown directory containing the 16 sample directories, each with its .ctab and .gtf files. Ballgown ran OK, and I made .csv files for genes and transcripts. What I need to do now is tell Ballgown how to handle the 16 samples: there are two treatment groups (ctr and bmp2), 4 time points, and 2 biological replicates per sample. Could you give me some help on how to make the pheno_data CSV file? I need to deal with the variance between the biological replicates first, then test the difference between the ctr and bmp2 treatments, then test the changes across time points and treatment. Thanks so much, enjoying the program. steveharris
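For the design described above (2 treatments x 4 time points x 2 biological replicates = 16 samples), one possible layout for the pheno_data CSV can be sketched as follows. This is a minimal sketch under stated assumptions: only the treatment labels ctr and bmp2 come from the post, while the time point labels (t1 to t4), the sample naming scheme, and the column names are hypothetical placeholders. Python is used here only to generate the file; the table itself is what matters.

```python
import csv
import io
from itertools import product

# "ctr" and "bmp2" are taken from the post; the time point labels and
# replicate numbers are hypothetical placeholders.
TREATMENTS = ["ctr", "bmp2"]
TIME_POINTS = ["t1", "t2", "t3", "t4"]
REPLICATES = [1, 2]
COLUMNS = ["ids", "treatment", "time", "replicate"]


def build_pheno_rows():
    """Return one row per sample, with treatment and time as separate columns."""
    rows = []
    for treatment, time_point, rep in product(TREATMENTS, TIME_POINTS, REPLICATES):
        rows.append({
            # Hypothetical naming scheme; use your real sample directory names.
            "ids": f"{treatment}_{time_point}_rep{rep}",
            "treatment": treatment,
            "time": time_point,
            "replicate": rep,
        })
    # Sorted only to make the order predictable; reorder the rows to match
    # the order in which your sample directories are loaded.
    return sorted(rows, key=lambda row: row["ids"])


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(build_pheno_rows()), end="")
```

The two biological replicates of each condition share the same treatment and time values, which is how they are marked as replicates of one group. As far as I know, Ballgown expects the first column of pData to hold the sample IDs, in the same order as the samples are loaded, and the treatment and time columns could then be passed to stattest as the covariate and adjustment variable; treat those details as assumptions to check against the Ballgown documentation.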