I'm trying to scrape a few sites that require Unicode support. For example, I'm trying to get the title of this book, but it comes back as jumbled characters:
(-> "http://www.brill.nl/publications/evliya-celebis-book-travels"
    java.net.URL. enlive/html-resource
    (enlive/select [:h1#page-title]) first :content)
And scraping an Arabic site returns ?????? all over the place.
(enlive/html-resource (java.net.URL. "http://www.aljazeera.net/portal"))
I'm not sure how I'm supposed to enable Unicode support.
Enlive does support Unicode, since it works with Java strings. I ran your first example on my computer and got this result:
Perhaps the font that you are using doesn't have glyphs for the code points that you are trying to show?
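One way to tell a display problem from a data problem is to print the numeric code units of the string instead of its glyphs. The sketch below is only an illustration: `code-units` is a hypothetical helper, not part of Enlive, and the sample string is made up rather than taken from either site. It lists UTF-16 code units, so characters outside the Basic Multilingual Plane would show up as two surrogate values.

```clojure
;; Hypothetical helper (not part of Enlive): list the UTF-16 code units
;; of a string so they can be inspected independently of the font or
;; terminal encoding.
(defn code-units
  "Returns a seq of strings such as \"U+00C7\", one per UTF-16 code unit of s."
  [s]
  (map #(format "U+%04X" (int %)) s))

;; Made-up sample text; \u00C7 is written as an escape so the source
;; file's own encoding cannot interfere with the demonstration.
(println (code-units "\u00C7elebi"))
```

If the scraped string contains the expected values (for example U+00C7 for Ç), the data is intact and the font or console is the likely culprit. If it contains U+003F where a non-ASCII letter should be, the characters were probably lost earlier, when the bytes were decoded or re-encoded.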