Guest User
@guest-user-4897
I have an RMA-normalized gene expression dataset with 22810 rows and 9
columns (types of promoters). A subset of the data is as follows:
ID_REF GSM362180 GSM362181 GSM362188 GSM362189 GSM362192
244901 5.094871713 4.626623079 4.554272515 4.748604391 4.759221647
244902 5.194528083 4.985930299 4.817426064 5.151654407 4.838741605
244903 5.412329253 5.352970877 5.06250609 5.305709079 8.365082403
244904 5.529220594 5.28134657 5.467445095 5.62968933 5.458388909
244905 5.024052699 4.714631878 4.792865831 4.843975286 4.657188246
244906 5.786557533 5.242403911 5.060605782 5.458148567 5.890061836
I want to cluster the above data and tried hierarchical
clustering:
d <- dist(as.matrix(deg), method = "euclidean")
where deg is a matrix of the differentially expressed genes (4300
in number). I get the following warning:
Warning message:
In dist(as.matrix(deg), method = "euclidean") :
  NAs introduced by coercion
Is it all right to proceed with the clustering in spite of the warning?
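A minimal sketch of how the source of this warning could be checked. It assumes, without confirmation from the post, that deg is a data frame that still carries the probe IDs in a non-numeric column, so that as.matrix() produces a character matrix. The toy deg below reuses a few values from the subset above; the "_at" suffix on the IDs and the name mat are hypothetical.

```r
# Toy stand-in for `deg`. Assumption: the real object is a data frame that
# still holds the probe IDs (hypothetical "_at" suffix) as a character column.
deg <- data.frame(
  ID_REF    = c("244901_at", "244902_at", "244903_at"),
  GSM362180 = c(5.094871713, 5.194528083, 5.412329253),
  GSM362181 = c(4.626623079, 4.985930299, 5.352970877),
  stringsAsFactors = FALSE
)

# Which columns are not numeric? These are the candidates for NA coercion.
is_num <- sapply(deg, is.numeric)
non_numeric <- names(deg)[!is_num]
print(non_numeric)

# Keep only the numeric columns and carry the IDs as row names instead.
mat <- as.matrix(deg[, is_num, drop = FALSE])
rownames(mat) <- deg$ID_REF

# The matrix is already numeric, so dist() has nothing to coerce.
d <- dist(mat, method = "euclidean")
```

If the check lists any column, the distances computed with the warning present may be based on NA values for that column, so it seems safer to drop or convert it before clustering.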
hc <- hclust(d)
plot(hc, hang = -0.01, cex = 0.7)
I get a dendrogram which is very dense, and the labels are not clear.
Also, I do not know which of the 9 promoters are classified in the tree
for the several genes. How would it be possible to label the tree with
the promoters, and how can I visualize the genes in a clearer
dendrogram? There are around 4300 genes, and I would like a dendrogram
that is easier to read.
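One possible approach, sketched under assumptions: dist() computes distances between rows, so clustering the transposed matrix gives a tree whose leaves are the columns (the promoters), while the gene tree can be drawn without leaf labels and summarized with cutree(). The matrix below is simulated and only stands in for the real data; the gene names, the matrix size and k = 4 are arbitrary illustrative choices.

```r
set.seed(1)

# Simulated stand-in for the numeric expression matrix (not real data):
# 200 genes x 5 samples, column names taken from the subset shown above.
mat <- matrix(
  rnorm(200 * 5, mean = 5, sd = 0.5), nrow = 200,
  dimnames = list(paste0("gene", 1:200),
                  c("GSM362180", "GSM362181", "GSM362188",
                    "GSM362189", "GSM362192"))
)

# Cluster the samples (columns) by transposing: leaves are the column names.
hc_samples <- hclust(dist(t(mat)))
plot(hc_samples, main = "Samples")

# Cluster the genes, but suppress the unreadable leaf labels.
hc_genes <- hclust(dist(mat))
plot(hc_genes, labels = FALSE, hang = -1, main = "Genes")

# Cut the gene tree into k groups (k = 4 is an arbitrary choice here).
groups <- cutree(hc_genes, k = 4)
print(table(groups))
```

With thousands of genes, inspecting the group membership from cutree() is usually more readable than a fully labeled gene dendrogram.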
--
Sent via the guest posting facility at bioconductor.org.