Multiple problems with TCGAbiolinks 2.2.6
lmorrill • 0

I am trying to use TCGAbiolinks for the first time with R version 3.3.2 and Bioconductor version 3.4. So far I have encountered three apparently unrelated problems, which I show here (the code snippets are examples I found online):

clin.gbm <- GDCquery_clinic("TCGA-LIHC", "clinical")
                     main = "TCGA Set\n GBM",height = 10, width=10)

The data is downloaded fine (there are plenty of NAs but I imagine this is just how it is). However, the plot isn't shown and I get the error

Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.

I have also tried the following code

days_to_death <- floor(runif(200, 1, 1000))
vital_status <- c(rep("Dead",200))
groups <- c(rep(c("G1","G2"),c(100,100)))
df <- data.frame(days_to_death,vital_status,groups)
TCGAanalyze_survival(df, clusterCol = "groups")

Error in TCGAanalyze_survival(df, clusterCol = "groups") :
  Columns vital_status, days_to_death and  days_to_last_follow_up should be in data frame

(which, of course, seem to me like they are!) and, finally,

mut <- GDCquery_Maf("ACC", pipelines = "muse")
clin <- GDCquery_clinic("TCGA-ACC","clinical")
clin <- clin[,c("bcr_patient_barcode","disease","gender","tumor_stage","race","vital_status")]
TCGAvisualize_oncoprint(mut = mut, genes = mut$Hugo_Symbol[1:20],
                        filename = "oncoprint.pdf",
                        annotation = clin,
                        width = 5,
                        heatmap.legend.side = "right",
                        dist.col = 0,
                        label.font.size = 10)

The last function call gives the message "Aggregate function missing, defaulting to 'length'".


I would be grateful for any suggestions on how to solve these problems. It seems to me that the functions TCGAanalyze_survival() and TCGAvisualize_oncoprint() are the problem, whereas the rest works fine.

Update: I found it strange that it was asking for the column days_to_last_follow_up, which I had never specified, so I checked the function TCGAanalyze_survival and it does require it (perhaps there has been an important update of which I am unaware?). Once this new column was added to df I get, again,

Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.


1) By default, TCGAanalyze_survival will produce a pdf called survival.pdf, and "Scale for 'colour'..." is just a warning. I have just removed it from the package and added a message saying that the file was created.


2) The second code does not work because we use another column (days_to_last_follow_up) in the code. When the patient is not dead, the days_to_death column (which is NA in that case) is updated with this value for the analysis. If your data does not have this case, you can copy days_to_death to a column called days_to_last_follow_up. The code will ignore this column, as there will be no NA in your first column.

Solution: df <- data.frame(days_to_death,vital_status,groups, days_to_last_follow_up= days_to_death)

As you said, we were not requesting this column before. We decided to require it in order to follow GDC/TCGA standards. That might not be the best solution, but it ensures there will be no NA in the days_to_death column.
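
To make the rule in point 2 concrete, here is a minimal base R sketch of the idea. It is only an illustration under the assumption that the substitution works as described above; survival_time is a hypothetical helper, not a TCGAbiolinks function, the data values are made up, and the actual code inside TCGAanalyze_survival may differ.

```r
# Toy data (made-up values): two dead patients, two alive patients.
# For alive patients days_to_death is NA, as in GDC/TCGA clinical tables.
df <- data.frame(
  days_to_death          = c(120, 450, NA, NA),
  days_to_last_follow_up = c(120, 450, 800, 300),
  vital_status           = c("Dead", "Dead", "Alive", "Alive"),
  stringsAsFactors       = FALSE
)

# Hypothetical helper (NOT part of TCGAbiolinks): take days_to_death
# where it is available, otherwise fall back to days_to_last_follow_up.
survival_time <- function(d) {
  ifelse(is.na(d$days_to_death), d$days_to_last_follow_up, d$days_to_death)
}

df$time  <- survival_time(df)          # follow-up time used for the analysis
df$event <- df$vital_status == "Dead"  # TRUE when death was observed

print(df)
```

When every patient is dead, as in the simulated df from the question, days_to_death contains no NA, so the fallback column is never read. That is why copying days_to_death into days_to_last_follow_up is harmless there.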

3) The message "Aggregate function ..." is not a problem either, only a warning. I have suppressed it in the code. There should be a file called oncoprint.pdf in your working directory.
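
Since both plots are written to files, a quick check like the one below can confirm that they were created. It is only a sketch: it assumes the file names mentioned in this thread (survival.pdf as the default, oncoprint.pdf from the filename argument) and that nothing else changed the working directory.

```r
# File names taken from this thread; adjust them if you passed a
# different filename argument to the plotting functions.
expected <- c("survival.pdf", "oncoprint.pdf")

# file.exists() returns one TRUE/FALSE per path, relative to getwd().
found <- file.exists(expected)
names(found) <- expected

print(getwd())  # directory where the files are expected
print(found)    # TRUE for each file that was actually written
```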


Thank you - all solved now!

