Change charset of a response body from UTF-8 to CP1251


I evaluate the following code:

(org.httpkit.client/get "http://localhost:81"
                    #(clojure.pprint/pprint (.getBytes (:body %))))

It prints

[-17, -65, -67, -17, -65, -67]

if index.html is in CP1251, and

[-48, -80, -48, -79, -48, -78]

if the same document is in UTF-8.

The contents of index.html, in Russian, are:

абв

http-kit returns the response body as a String decoded as UTF-8, without regard to the actual charset of the HTML document. This results in garbage in the body, such as:

"<html>�����</html>"
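The byte values above can be reproduced without any HTTP request. The following is a minimal sketch using only JVM interop; the var names are illustrative, and "windows-1251" is the JVM's canonical name for CP1251. It shows that decoding CP1251 bytes as UTF-8 is lossy: the invalid sequences become U+FFFD replacement characters, and U+FFFD itself encodes in UTF-8 as -17, -65, -67.

```clojure
;; "абв" written with escapes so the snippet does not depend on the
;; encoding of the source file itself.
(def text "\u0430\u0431\u0432")

;; The same text encoded two ways.
(def cp1251-bytes (.getBytes text "windows-1251"))
(def utf8-bytes   (.getBytes text "UTF-8"))

;; Decoding CP1251 bytes as UTF-8 replaces invalid sequences with U+FFFD,
;; so the original characters cannot be recovered from this string.
(def misdecoded (String. cp1251-bytes "UTF-8"))

;; Decoding with the charset the bytes were written in restores the text.
(def redecoded (String. cp1251-bytes "windows-1251"))

(println (vec utf8-bytes))                     ; UTF-8 bytes of the text
(println (vec (.getBytes "\uFFFD" "UTF-8")))   ; bytes of one replacement char
(println (= text redecoded))
(println (= text misdecoded))
```

Once the body has been turned into a String with the wrong charset, calling .getBytes on it only re-encodes the replacement characters, which is why the fix has to happen before decoding.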

How can I make org.httpkit.client/get respect the charset of the document?

There is 1 best solution below

Sergey Filkin On

You can get the raw bytes of the body by calling org.httpkit.client/request with the :as :byte-array option, and then decode them yourself.

The following code prints correct body contents if the document is in CP1251 encoding.

(org.httpkit.client/request {:url "http://localhost:81" :as :byte-array}
                            #(println (String. (:body %) "cp1251")))
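If the charset is not always CP1251, one option is to read it from the Content-Type value and fall back to a default. This is only a sketch: charset-from-content-type and decode-body are hypothetical helper names, not part of http-kit, and the code works on a plain byte array and a header string so it does not depend on how the response map is shaped.

```clojure
(import 'java.nio.charset.Charset)

(defn charset-from-content-type
  "Hypothetical helper: returns the charset name declared in a
  Content-Type value such as \"text/html; charset=windows-1251\",
  or nil when none is declared."
  [content-type]
  (when content-type
    (second (re-find #"(?i)charset\s*=\s*\"?([^\s;\"]+)" content-type))))

(defn decode-body
  "Hypothetical helper: decodes raw body bytes using the charset from
  content-type, or fallback when the header declares none.
  Charset/forName throws if the name is not supported by the JVM."
  [body content-type fallback]
  (let [cs-name (or (charset-from-content-type content-type) fallback)]
    (String. body (Charset/forName cs-name))))

;; Example with bytes built locally instead of fetched over HTTP.
(def sample-bytes (.getBytes "\u0430\u0431\u0432" "windows-1251"))

(println (decode-body sample-bytes "text/html; charset=windows-1251" "UTF-8"))
(println (decode-body sample-bytes "text/html" "windows-1251"))
```

With the request above, the callback could pass (:body %) as body together with the response's Content-Type header value; the exact key under which http-kit exposes that header is an assumption to verify against its documentation.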