I am trying to create a wordcloud with 'Thank You' in different languages. However, some characters don't show up on the plot.
library(ggplot2)
library(ggwordcloud)
data("thankyou_words_small")
set.seed(42)
ggplot(
thankyou_words_small,
aes(label = word, size = speakers, color = speakers)) +
geom_text_wordcloud(area_corr = TRUE, rm_outside = TRUE) +
scale_size_area(max_size = 24) +
theme_minimal() +
scale_color_gradient(low = "darkblue", high = "lightblue")
I tried using ggwordcloud instead of ggplot, which didn't work either.
Please keep your answers simple as I am just a beginner using R. Thank you :)
Alright, since you're using a Mac, I think I can give you the answer you're looking for!
This will require a bit of finesse on your end because I've installed a lot of fonts. So what I see is unlikely to be exactly what you see. (You won't need to install fonts for this to work.)
First, I'm using the library showtext. If you don't have that installed, you'll need it. This first function call is where you may see something different than what I see.
As you can see, this returned two rows for me. I'm going to call for only the first row since these two rows are identical.
If your call returns only one row, then drop the brackets. If your call returned many rows, just make sure that the row you select is "Regular" for "face".
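In case it helps to see the lookup written out, here is a minimal sketch of what that step could look like. The family name "Arial Unicode MS" is only my assumption of a font with wide script coverage; use whichever family your own system lists. The variable names are just for illustration.

```r
library(showtext)  # also attaches sysfonts, which provides font_files() and font_add()

# ASSUMPTION: "Arial Unicode MS" is only an example family name;
# swap in a family that actually shows up in your own font list.
all_fonts <- font_files()
keep <- which(all_fonts$family == "Arial Unicode MS" &
              all_fonts$face == "Regular")
where <- all_fonts[keep, ]
where  # look at what came back; it may be 0, 1, or several rows

# Register the first match so plots can refer to it by its family name
if (nrow(where) > 0) {
  font_add(family  = where[1, ]$family,
           regular = file.path(where[1, ]$path, where[1, ]$file))
}
```

If `where` comes back empty, the font is not installed under that name, and printing `unique(all_fonts$family)` shows the names you can choose from.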
To use this font, you can call either showtext_begin() or showtext_auto(). I really have not seen any difference between the two. Then call the plot. When you call the plot, you need to include the family in geom_text_wordcloud. I used where[1, ]$family, but you can just copy the string, too.
It does say that you're supposed to end or close showtext with either showtext_end() or showtext_auto(F). However, I've never had an issue if I forgot or left it out intentionally.
There are some other errors in this data. For example, 'shukran', or thanks in MS Arabic, is شكرا. However, this is plotting the text backward. In Pashtoon, usually 'manana' is used for thank you (literally it means acceptance), which is written مننه. That's DEFINITELY not what's in this dataset. It's probably backward. (It's not a word in Pashtoon as it is.)
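Going back to the showtext steps for a moment, a rough sketch of the whole plot call might look like the following. It assumes a font has already been registered with font_add(); the family string "Arial Unicode MS" is a placeholder for whatever family you registered, not something the data requires.

```r
library(ggplot2)
library(ggwordcloud)  # provides geom_text_wordcloud() and the data set
library(showtext)

# ASSUMPTION: a font was registered earlier under this family name;
# replace the string with the family you actually registered.
font_family <- "Arial Unicode MS"

data("thankyou_words_small")
set.seed(42)

showtext_auto()  # let showtext draw the text in the plots that follow

p <- ggplot(
  thankyou_words_small,
  aes(label = word, size = speakers, color = speakers)) +
  geom_text_wordcloud(family = font_family,
                      area_corr = TRUE, rm_outside = TRUE) +
  scale_size_area(max_size = 24) +
  theme_minimal() +
  scale_color_gradient(low = "darkblue", high = "lightblue")

print(p)

showtext_auto(FALSE)  # switch showtext off again when you are done
```

The only change from the plot in the question is the family argument inside geom_text_wordcloud, plus the showtext_auto() calls around it.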
I thought the reversal of these was due to the left-to-right versus right-to-left thing, but Hindi is correct, Japanese is correct, Gujarati is correct... Urdu is incorrect. Sigh. I couldn't find a font that did this any better. I found a way to flip the words, but Farsi is still incorrect. For example, مننه, if written one letter at a time, is م ن ن ه. That's what's happening with Farsi. (Farsi == Persian)
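To give an idea of what flipping a word means in code, here is a small base R sketch. The helper name flip_word is made up for this illustration and is not necessarily the approach I used. It only reverses the order of the characters, so it does nothing about letters being drawn one at a time instead of joined.

```r
# Hypothetical helper: reverse the order of the characters in each string
flip_word <- function(x) {
  vapply(
    strsplit(x, ""),                                    # split into single characters
    function(chars) paste(rev(chars), collapse = ""),   # reverse and glue back together
    character(1),
    USE.NAMES = FALSE
  )
}

flip_word("abc")            # returns "cba"
flip_word(c("ab", "xyz"))   # returns "ba" "zyx"
```

You could apply it to just the rows that plot backward, for example by overwriting those entries of the word column before building the plot.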
Here's the manual correction for these words:
Here's the difference: