Question: Quality control on RNA-Seq (level 3) data from TCGA
0
4.0 years ago by
NS60
United States
NS60 wrote:

I have downloaded level 3 RNA-Seq files for mRNAs from the TCGA database. These files contain "read_counts".

Now my supervisor insists that I perform quality control steps, e.g. checking for GC-content bias and length bias, but I know nothing about these procedures or how to carry them out. My major is not biology, and this is the first time I am working with RNA-Seq files.

I would appreciate it if anyone could help me and tell me how to do such pre-processing steps.

modified 4.0 years ago by Matthew McCormack180 • written 4.0 years ago by NS60
Answer: Quality control on RNA-Seq (level 3) data from TCGA
3
4.0 years ago by
United States
Matthew McCormack180 wrote:

If you have the files in .fastq format, you can use fastqc, available here: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Fastqc works on zipped files, so you do not have to unzip them first. However, I think .fastq is what TCGA calls level 1 data. Level 3 data seems to me to be files in which expression values (such as the read counts you mention) have already been calculated; in other words, much of the analysis has already been done. You would not be able to assess the quality of the sequencing from this type of file. For that you would need the level 1 data (specifically, the .fastq files).

If you download fastqc and the .fastq files, preferably zipped, you can simply open the files in fastqc. Each file takes a few minutes to load, after which fastqc displays a series of graphics. The fastqc author has made a 12-minute video on how to interpret the results: https://www.youtube.com/watch?v=bz93ReOv87Y

(The names can be a little confusing for a beginner, so remember that .fastq is a file holding the reads produced by the sequencing machine, and fastqc is a program you can download to assess the quality of the sequencing from the .fastq files.)
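
To make the .fastq / fastqc distinction concrete, here is a minimal Python sketch of the kind of per-read summary that fastqc reports graphically. It is not how fastqc itself is implemented and is no substitute for it. It assumes standard 4-line .fastq records and Phred+33 quality encoding; the function names (`open_fastq`, `read_fastq`, `gc_fraction`, `mean_phred`, `summarize_reads`) are made up for this illustration.

```python
import gzip
import io
from statistics import mean


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz compression by file extension."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "r")


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from 4-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # skip stray blank lines
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header.strip()[1:], seq, qual


def gc_fraction(seq):
    """Fraction of bases in the read that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def mean_phred(qual, offset=33):
    """Mean base quality of a read, assuming Phred+33 encoding (an assumption)."""
    return mean(ord(c) - offset for c in qual) if qual else 0.0


def summarize_reads(handle):
    """Return read count, mean GC fraction and mean base quality over all reads."""
    gc, quality = [], []
    for _, seq, qual in read_fastq(handle):
        gc.append(gc_fraction(seq))
        quality.append(mean_phred(qual))
    if not gc:
        return {"n_reads": 0, "mean_gc": None, "mean_quality": None}
    return {"n_reads": len(gc), "mean_gc": mean(gc), "mean_quality": mean(quality)}


if __name__ == "__main__":
    # Two toy reads held in memory, standing in for a real file.
    demo = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\n!!!!\n")
    print(summarize_reads(demo))
```

For a real file you would pass an open handle instead, e.g. `with open_fastq("sample.fastq.gz") as fh: summarize_reads(fh)`, where the file name is only a placeholder. Note that this needs level 1 .fastq data; it cannot be run on level 3 read-count files.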

written 4.0 years ago by Matthew McCormack180
Answer: Quality control on RNA-Seq (level 3) data from TCGA
1
4.0 years ago by
Denali
Steve Lianoglou12k wrote:

I think you should consider asking your supervisor for some pointers ... it sounds like you're still engaged in some type of training (graduate school, perhaps?) and this is what advisors/supervisors are for.

In any event, googling for the topics you mention along with "RNAseq" will surely provide many places you can get started, as well.

written 4.0 years ago by Steve Lianoglou12k