pcaGoPromoter PCA plot in R: issue with EntrezID conversion
@shermanleung-12516

I want to make a PCA plot with pcaGoPromoter in R. The plot itself is produced successfully.

However, the error message printed alongside it is concerning: it lists 1358 NAs and is printed 8 times, for 10864 NAs in total.

The script that triggers this converts the gene symbol input into Entrez IDs. When some of the gene symbols do not match the AnnotationDbi keys, "keys(ALIAS2EG)" (lines 73-74 of that script), the error message is printed (lines 75-77).

Yet neither 1358 nor 10864 matches the number of gene symbols or samples, so I cannot tell where the error comes from. sessionInfo() and traceback() output "Error: could not find function "sessioninfo"" and "No traceback available", respectively. My R code is below.

Txt <- read.table("Troubleshooting.txt", sep="\t", stringsAsFactors = FALSE)
Txt2 <- Txt[-1,-1]
TxtRowNames <- Txt[-1,]
TxtColNames <- Txt[,-1]
rownames(Txt2) <- TxtRowNames[,1]
colnames(Txt2) <- TxtColNames[1,]

TxtMatrix <- as.matrix(sapply(Txt2, as.numeric))
rownames(TxtMatrix) <- TxtRowNames[,1]

groups <- as.factor(c(rep("group0", 18), rep("group1", 12), rep("group2", 20), rep("group3", 8), rep("group4", 4)))

org = "Hs"
pcaInfoPlot(eData=TxtMatrix, inputType = "geneSymbol", org = "Hs", groups = groups, printNames = FALSE)

Input file named "Troubleshooting.txt" below:

	9313	1224	2548	1675	7812	0199	1140	1262	1468	1327	8529	7565	0102	2229	7251	8010	6079	6819	5580	4323	2858	1767	6271	4644	2701	8754	6956	0581	1097	1447	8270	6584	6188	4680	4524	4443	4728	5809	6938	6946	1069	0895	3552	4668	7765	8996	8211	2813	254	0535	0688	0614	2349	3775	3873	9891	9559	9210	2278	1681	2517	6426
STAT4	23.19915	27.94375	38.65654	13.88639	50.04449	39.01492	51.21341	50.94292	66.839	34.08679	35.40753	44.93988	46.50882	41.44522	28.50687	29.45479	44.64111	35.88891	51.49246	52.76216	49.18337	68.379	41.07954	36.467	61.02523	38.70053	41.15611	32.96739	31.94683	31.65725	14.76364	9.902103	33.08009	77.56579	42.31062	43.98261	37.50708	20.13991	49.69822	40.82711	45.92907	67.79272	82.9397	32.18494	34.98264	77.36794	45.48687	37.30666	10.83206	99.5925	64.34677	50.58935	57.25155	53.88837	51.4903	56.38808	35.06069	73.62757	63.53493	84.61014	104.3644	53.8075
STAT5A	32.55475	20.91405	13.96418	16.13962	19.30982	28.21588	38.01206	18.64529	17.81376	27.49474	29.53615	23.20675	17.82752	15.77255	37.84343	20.99944	22.47888	18.09823	18.24303	28.93667	24.2842	19.9979	22.22169	20.11972	27.94957	29.28711	26.51355	26.03291	30.4816	36.45321	28.0367	12.13881	23.21871	34.71006	40.68179	32.63882	31.66189	21.80509	23.3295	41.15236	32.09903	22.31749	23.76337	44.11281	21.60571	26.86802	15.43765	19.06542	22.67733	30.78636	20.26558	53.74836	25.76205	22.81123	22.86731	21.0981	25.55881	16.10629	25.49158	25.71282	17.51019	20.17859
STAT5B	29.34803	39.23871	35.59641	62.53675	65.17721	58.5163	58.64405	50.16732	69.25548	70.63443	61.16002	67.58993	77.55661	67.53477	65.38542	60.3465	42.89635	40.88997	109.3396	79.40294	65.54056	67.19398	65.07484	52.03857	67.34473	54.69317	39.93113	74.59403	85.05795	59.75441	43.66469	78.50502	77.10388	56.38369	65.09477	58.78929	48.92863	46.71517	46.74517	49.46949	58.21569	88.36438	49.14809	57.82982	81.51627	54.67383	61.93891	59.93243	43.81245	76.27173	67.46222	72.9911	85.69537	69.1424	71.86661	88.80033	43.42941	34.65601	65.92624	55.03899	85.54913	19.89631
STAT6	38.89492	55.2769	44.71809	66.82571	36.44907	39.39149	48.92197	39.78284	47.91942	37.11774	43.07106	45.93265	43.08953	43.76677	55.66043	47.85568	39.77233	39.44257	43.43257	35.94394	46.16983	40.93459	40.69253	45.66117	43.18754	39.78093	50.48138	32.91945	36.54533	32.30114	49.42401	57.97247	65.0062	28.98422	57.21943	47.63041	69.05553	62.26572	65.93487	62.35591	40.18366	39.47462	47.62658	64.77912	68.15086	40.57777	40.56697	37.11168	64.89862	46.33305	48.63342	45.59366	45.63237	31.11938	40.26489	48.99543	50.56485	47.11536	49.70892	39.20923	53.32842	37.56121
JAK1	439.9206	325.6098	326.6815	297.4987	341.6362	507.0686	463.7957	385.6686	365.959	365.766	437.5612	405.6547	670.5164	465.8846	389.8235	374.3773	379.6636	589.6162	800.0623	342.9356	470.5382	341.2559	353.6279	441.3368	443.1618	381.2422	476.4778	394.7726	423.2596	384.7598	205.8688	277.4601	427.5641	335.5606	474.6544	394.1585	419.0314	382.1477	534.8743	423.4712	564.0081	592.7668	396.1822	435.5475	371.6714	294.684	450.6384	98.86637	132.5037	319.4258	481.735	527.7332	434.489	426.8031	416.6871	507.3962	413.1067	338.7463	430.8003	337.133	328.4037	224.9426
JAK2	30.76121	21.35065	11.46488	47.18938	49.01455	43.05171	52.77835	48.75338	60.03957	51.1185	70.42561	61.67632	75.5535	70.134	53.26757	52.00068	48.60649	64.49801	20.26325	60.43185	55.08773	44.45929	52.87514	48.84322	52.6522	41.00381	32.45596	42.67103	52.70628	59.20725	46.84264	36.01573	187.9149	92.60603	54.42977	74.86516	64.69067	39.71875	64.00294	36.16427	61.96661	77.03024	57.23373	52.25129	58.42437	33.93612	65.46022	20.09151	15.64649	42.31321	74.71553	78.18569	54.77353	55.54325	49.11412	69.13605	17.51631	46.97427	57.88049	43.21951	84.30801	53.96008
JAK3	25.90615	25.82413	22.00422	22.74725	25.77733	15.1311	22.50289	21.53977	20.67226	19.11759	18.42759	19.17412	23.98672	13.82716	17.26181	18.92593	23.94274	26.80977	18.40559	19.03874	26.59609	21.17547	30.09708	22.21938	20.65632	17.3466	26.95856	15.96742	13.23172	28.34152	31.8622	26.33862	19.55962	17.6229	26.31377	19.80722	24.05486	30.71412	15.39993	15.89843	20.89341	25.35745	25.92268	21.12884	25.24908	33.3415	24.73122	22.46892	33.01179	27.96555	18.5295	28.45135	25.14318	21.29793	24.63968	15.916	15.02648	30.91107	20.42242	18.78574	33.33047	17.97448
@james-w-macdonald-5106

You are not getting an error message! An error message says 'error' in the message, like this:

> log("trytodoit")
Error in log("trytodoit") : non-numeric argument to mathematical function

You aren't even getting a warning. It's just printing out an informative message saying that a bunch of your gene symbols can't be mapped to Entrez Gene IDs. And do note that the code does something like this:

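# flag which input symbols are present among the annotation keys
index <- inputvector %in% keys
# report the symbols that were not found
message("I can't find this stuff: ", paste(inputvector[!index], collapse = " "))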

What you get back is a vector of NA values. Since the code simply subsets out the elements of your input vector that can't be found, you must be passing in a bunch of NA values, which are then returned to you! As an example:

> library(org.Hs.eg.db)
> test <- c("BRCA1", "P53", NA, NA, NA)
> index <- test %in% keys(org.Hs.egALIAS2EG)
> test[!index]
[1] NA NA NA

So you need to look at your gene symbols and figure out why you have so many NA values.
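
One way to start looking, sketched under assumptions: `symbols` stands in for whatever vector of gene symbols reaches the conversion step, and `known_keys` is a small stand-in for `keys(org.Hs.egALIAS2EG)`. Neither name comes from pcaGoPromoter.

```r
# Hypothetical input: symbols as they might arrive at the conversion step.
symbols <- c("STAT4", "STAT5A", NA, "JAK1", NA, "NOTAGENE")

# Stand-in for keys(org.Hs.egALIAS2EG); swap in the real keys once
# org.Hs.eg.db is loaded.
known_keys <- c("STAT4", "STAT5A", "STAT5B", "STAT6", "JAK1", "JAK2", "JAK3")

# Count the missing values and record their positions.
n_missing <- sum(is.na(symbols))
where_missing <- which(is.na(symbols))

# Non-missing symbols that still fail to match any key.
unmapped <- symbols[!is.na(symbols) & !(symbols %in% known_keys)]

print(n_missing)
print(where_missing)
print(unmapped)
```

Separating true NA values from symbols that are present but unmapped shows whether the problem lies in how the input was built or in the annotation lookup.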

You don't get anything returned from traceback because there isn't an error (traceback only works if an error was returned immediately before calling it). And as you note, if you call a function that doesn't exist, you get an error that helpfully tells you that the function can't be found. That is where the apropos function is helpful:

> apropos("sessioninfo")
[1] "sessionInfo"

Note the capital I.
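
The distinction between a message, a warning, and an error can also be checked directly in R. The sketch below uses a hypothetical helper, `classify_condition` (not part of base R or pcaGoPromoter), that reports which kind of condition an expression signals.

```r
# Hypothetical helper: returns "message", "warning", or "error" depending on
# the first condition the expression signals, or "none" if it runs silently.
classify_condition <- function(expr) {
  tryCatch(
    {
      expr  # the argument is evaluated lazily, here, inside tryCatch
      "none"
    },
    message = function(m) "message",
    warning = function(w) "warning",
    error   = function(e) "error"
  )
}

res_message <- classify_condition(message("some symbols could not be mapped"))
res_warning <- classify_condition(as.numeric("abc"))
res_error   <- classify_condition(log("trytodoit"))
res_none    <- classify_condition(log(10))

print(c(res_message, res_warning, res_error, res_none))
```

Only a condition of the third kind, when left uncaught, gives traceback() anything to report.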


Thank you James, that was very helpful. I've focused my troubleshooting on why the gene symbols are not converting to Entrez IDs. No luck yet, but every gene symbol in my input above maps to an Entrez ID in org.Hs.egALIAS2EG when I look them up one by one with the code below. This suggests that the gene symbols are correct and that the mapping itself works, yet the process fails when I use the code and input from my original question.

I would be open to ideas as to what to test next, to work out what is causing the mapping issue :). In the meantime, I'll keep working on it.

> xx <- as.list(org.Hs.egALIAS2EG)
> xx["P53"] #the geneSymbol input
$P53
[1] "7157" #the entrezID output
> xx["STAT4"]
$STAT4
[1] "6775"
> xx["STAT5A"]
$STAT5A
[1] "6776"
> xx["STAT5B"]
$STAT5B
[1] "6777"
> xx["STAT6"]
$STAT6
[1] "6778"
> xx["JAK1"]
$JAK1
[1] "3716"
> xx["JAK2"]
$JAK2
[1] "3717"
> xx["JAK3"]
$JAK3
[1] "3718"
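
The same lookups can be done in one step rather than one by one. This is only a sketch: `xx` here is a small stand-in list built from the outputs printed above, not the full `as.list(org.Hs.egALIAS2EG)`, and `symbols`, `found`, and `entrez` are illustrative names.

```r
# Stand-in for as.list(org.Hs.egALIAS2EG), limited to the entries shown above.
xx <- list(P53 = "7157", STAT4 = "6775", STAT5A = "6776", STAT5B = "6777",
           STAT6 = "6778", JAK1 = "3716", JAK2 = "3717", JAK3 = "3718")

# Gene symbols as they appear in the first column of the input file.
symbols <- c("STAT4", "STAT5A", "STAT5B", "STAT6", "JAK1", "JAK2", "JAK3")

# TRUE for every symbol that has an entry in the map.
found <- symbols %in% names(xx)

# Named character vector of Entrez IDs for the symbols that were found.
# An alias can map to several IDs in the real map, so keep the first one.
entrez <- vapply(xx[symbols[found]], function(ids) ids[1], character(1))

print(found)
print(entrez)
```

Running the same check on the row names of the matrix that is actually passed to pcaInfoPlot would show whether those names differ from the symbols tested by hand.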

 
