How to enable Drakma to handle non-Latin-1 characters in a URL


I encountered an error caused by non-Latin-1 characters in a given URL when using SBCL, e.g.:

(drakma:http-request "http://www.youtube.com/„weird-url")

debugger invoked on a FLEXI-STREAMS:EXTERNAL-FORMAT-ENCODING-ERROR in thread
#<THREAD "initial thread" RUNNING {1002998D23}>:
  #\DOUBLE_LOW-9_QUOTATION_MARK (code 8222) is not a LATIN-1 character.

Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

(FLEXI-STREAMS::SIGNAL-ENCODING-ERROR
 #<FLEXI-STREAMS::FLEXI-LATIN-1-FORMAT (:ISO-8859-1 :EOL-STYLE :LF)
   {1002F196E3}>
 "~S (code ~A) is not a LATIN-1 character."
 #\DOUBLE_LOW-9_QUOTATION_MARK
 8222)

Apparently headers are defined to be sent in Latin-1 by RFC 2616 (this is the ticket I opened on GitHub after encountering this error), and therefore the URL has to be properly encoded before being passed to Drakma. But I have no clue how to do so, since the offending character is not a Latin-1 character in the first place.

What would be the working call for my example (leaving aside the fact that the URL is bogus and could be shortened to http://www.youtube.com)?

(drakma:http-request (magic-encoding-function "http://www.youtube.com/„weird-url"))

There are 2 best solutions below

1
On

This problem does not concern DRAKMA; the fault lies with PURI. I use my own fork of PURI: https://github.com/archimag/puri-unicode

0
On

I just figured out that if the flaw lies in the post-processing of the newly instantiated URI object, a workaround might be to split the process into two parts:

  1. Construct the URI with only the Latin-1 part.
  2. Set the percent-encoded path on the resulting object afterwards.

It would look like this:

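(let ((uri (puri:uri "https://wikimedia.org")))
  ;; drakma:url-encode also escapes "/" (as %2F), so the leading
  ;; slash is added separately and only the segment is encoded.
  (setf (puri:uri-path uri)
        (concatenate 'string "/" (drakma:url-encode "кадабра" :utf-8)))
  uri)

Produces: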

#<PURI:URI https://wikimedia.org/%D0%BA%D0%B0%D0%B4%D0%B0%D0%B1%D1%80%D0%B0>

Drakma then accepts this URI without any additional processing.
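
For the magic-encoding-function asked about in the question, a minimal sketch could percent-encode the path as UTF-8 by hand, so that only ASCII ever reaches PURI and Drakma. The names utf-8-octets, unreserved-char-p and percent-encode-path are hypothetical helpers, not part of Drakma or PURI. The sketch assumes that char-code returns Unicode code points (as it does in SBCL) and that the source file is read as UTF-8. It leaves "/" untouched and encodes everything else outside the RFC 3986 unreserved set, which is not necessarily how any particular library does it.

```lisp
;; Hypothetical helpers for illustration; not part of Drakma or PURI.

(defun utf-8-octets (code)
  "Return the UTF-8 octets of the code point CODE as a list of integers."
  (cond ((< code #x80)
         (list code))
        ((< code #x800)
         (list (logior #xC0 (ash code -6))
               (logior #x80 (logand code #x3F))))
        ((< code #x10000)
         (list (logior #xE0 (ash code -12))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))
        (t
         (list (logior #xF0 (ash code -18))
               (logior #x80 (logand (ash code -12) #x3F))
               (logior #x80 (logand (ash code -6) #x3F))
               (logior #x80 (logand code #x3F))))))

(defun unreserved-char-p (char)
  "True if CHAR is an RFC 3986 unreserved character."
  (find char (concatenate 'string
                          "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                          "abcdefghijklmnopqrstuvwxyz"
                          "0123456789-._~")))

(defun percent-encode-path (path)
  "Percent-encode PATH as UTF-8, keeping unreserved characters and \"/\"."
  (with-output-to-string (out)
    (loop for char across path
          do (if (or (unreserved-char-p char) (char= char #\/))
                 (write-char char out)
                 ;; Every other character becomes one %XX per UTF-8 octet.
                 (dolist (octet (utf-8-octets (char-code char)))
                   (format out "%~2,'0X" octet))))))

;; Usage with the path from the answer above:
;; (percent-encode-path "/кадабра")
;; => "/%D0%BA%D0%B0%D0%B4%D0%B0%D0%B1%D1%80%D0%B0"
```

The encoded path can then be appended to the ASCII part of the URL, e.g. (concatenate 'string "http://www.youtube.com" (percent-encode-path "/„weird-url")), and the resulting all-ASCII string passed to drakma:http-request. Whether the server understands a UTF-8 encoded path is a separate question, but it is the most common convention.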