Change charset of a response body from UTF-8 to CP1251


I evaluate the following code:

(org.httpkit.client/get "http://localhost:81"
                    #(clojure.pprint/pprint (.getBytes (:body %))))

It prints

[-17, -65, -67, -17, -65, -67]

if index.html is in CP1251, and

[-48, -80, -48, -79, -48, -78]

if the same document is in UTF-8.

The contents of index.html, in Russian, are:

абв

http-kit decodes the response body as UTF-8 and returns it as a String, without regard to the actual charset of the HTML document. This produces garbage in the body, such as:

"<html>�����</html>"
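The byte values printed above can be reproduced without any HTTP call. The following is a minimal sketch using only JVM interop; the names cp1251-bytes and misread are illustrative and not part of http-kit. It assumes the JVM ships the windows-1251 charset (standard JDK builds do). The Cyrillic letters are written as \u escapes so the result does not depend on the source file encoding.

```clojure
;; "абв" written with escapes: U+0430 U+0431 U+0432
(def cp1251-bytes (.getBytes "\u0430\u0431\u0432" "windows-1251"))

;; CP1251 uses one byte per Cyrillic letter
(vec cp1251-bytes)
;; => [-32 -31 -30]

;; Decoding those bytes as UTF-8 fails, so each malformed sequence
;; is replaced with U+FFFD (the replacement character)
(def misread (String. ^bytes cp1251-bytes "UTF-8"))

(every? #(= % \uFFFD) misread)
;; => true

;; Re-encoding U+FFFD as UTF-8 gives the triple seen in the question
(vec (.getBytes "\uFFFD" "UTF-8"))
;; => [-17 -65 -67]
```

Once the body has been decoded with the wrong charset, the original bytes are lost, so the fix has to happen before the String is built.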

How can I make org.httpkit.client/get respect the charset of the document?


You can get the raw bytes of the body by calling org.httpkit.client/request with the :as :byte-array option.

The following code prints the body correctly when the document is encoded in CP1251.

(org.httpkit.client/request {:url "http://localhost:81" :as :byte-array} 
                            #(println (String. (:body %) "cp1251")))
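If the charset should not be hard-coded, one option is to read it from the Content-Type header when the server declares one. The sketch below is an illustration, not part of http-kit: charset-from-content-type and decode-body are hypothetical helpers, and the usage comment assumes that the response headers map exposes the header under the :content-type key and that the server actually sends a charset parameter.

```clojure
(defn charset-from-content-type
  "Returns the charset named in a Content-Type value such as
  \"text/html; charset=windows-1251\", or nil when none is declared."
  [content-type]
  (when content-type
    (second (re-find #"(?i)charset\s*=\s*\"?([^\s;\"]+)" content-type))))

(defn decode-body
  "Decodes raw body bytes with the declared charset. Falls back to
  `fallback` when the header is missing or names an unsupported charset."
  [^bytes body content-type fallback]
  (let [declared (charset-from-content-type content-type)
        usable?  (and declared
                      (try
                        (java.nio.charset.Charset/isSupported declared)
                        ;; illegal charset names throw instead of returning false
                        (catch IllegalArgumentException _ false)))
        charset  (if usable? declared fallback)]
    (String. body ^String charset)))

;; Usage with http-kit (hypothetical wiring, see the assumptions above):
;; (org.httpkit.client/request
;;   {:url "http://localhost:81" :as :byte-array}
;;   (fn [{:keys [body headers]}]
;;     (println (decode-body body (:content-type headers) "windows-1251"))))
```

The fallback argument covers servers that omit the charset; for a document known to be in CP1251, passing "windows-1251" there behaves like the hard-coded version above.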