Hello Everyone,
I have a question about TCGAbiolinks that I haven't seen addressed elsewhere (please let me know if I'm wrong). My goal is to identify a specific mutation of interest within TCGA-BRCA, find the barcodes of all breast cancer patients harboring this mutation, and place them in a list (I'll do the same for the WT patients). The next step would be to take that list and extract the processed RNAseq data for those patients.
My question is really about the first part (once I have the list, filtering the RNAseq data down to it should be relatively easy): how can I specify in the GDCquery command that I'm only interested in patients with this mutation? I was looking for a place to enter the SNP ID or the position of the mutation, but I can't find a way to do so.
Using the basic structure outlined in the vignettes, my current query code looks something like this:
query.maf.hg19_breastcancer <- GDCquery(project = "TCGA-BRCA", data.category = "Simple nucleotide variation", data.type = "Simple somatic mutation", access = "open", legacy = TRUE)
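To make the goal concrete, here is a minimal sketch of the filtering I have in mind, done after download rather than inside the query. It runs on a made-up data frame, not real TCGA data. I'm assuming the mutation table uses the MAF column names Hugo_Symbol, HGVSp_Short and Tumor_Sample_Barcode (the legacy hg19 files may name the protein-change column differently), and the gene, amino-acid change and barcodes below are only placeholders.

```r
# Toy stand-in for a MAF table. The column names follow the MAF format;
# the rows, barcodes and the mutation of interest are invented for illustration.
maf <- data.frame(
  Hugo_Symbol = c("PIK3CA", "PIK3CA", "TP53", "PIK3CA"),
  HGVSp_Short = c("p.H1047R", "p.E545K", "p.R175H", "p.H1047R"),
  Tumor_Sample_Barcode = c(
    "TCGA-AA-0001-01A-11D-0000-09",
    "TCGA-AA-0002-01A-11D-0000-09",
    "TCGA-AA-0003-01A-11D-0000-09",
    "TCGA-AA-0004-01A-11D-0000-09"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical mutation of interest (placeholder values).
gene_of_interest   <- "PIK3CA"
change_of_interest <- "p.H1047R"

# Rows that carry the mutation of interest.
has_mutation <- maf$Hugo_Symbol == gene_of_interest &
  maf$HGVSp_Short == change_of_interest

# The first 12 characters of a TCGA barcode identify the patient.
mutant_patients <- unique(substr(maf$Tumor_Sample_Barcode[has_mutation], 1, 12))
all_patients    <- unique(substr(maf$Tumor_Sample_Barcode, 1, 12))

# "WT" here means: present in the table but without the mutation of interest.
wt_patients <- setdiff(all_patients, mutant_patients)
```

One caveat with this approach: a patient with no called somatic mutations at all would not appear in the table, so the WT list would only cover patients that show up in the MAF.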
Any advice would be greatly appreciated!
Thank you,
Yonatan