I have a little Perl script that I updated to download images from tvrage, but one line gives me trouble. This is the line in question:
system "wget -P '/home/user/script/cache/posters' $imgurl";
It usually works just fine, but from time to time it fails with the same error:
HTTP request sent, awaiting response... 200 OK
Length: 16758 (16K) [image/jpeg]
Saving to: â/home/user/script/cache/posters/28386.jpgâ
ERROR! Wide character in syswrite at IO/Handle.pm line 207.
ERROR! Wide character in syswrite at IO/Handle.pm line 207.
Compilation failed in require.
Wide character in syswrite at IO/
I have narrowed the problem down to wget changing ‘ and ’ to â:
â/home/user/script/cache/posters/28386.jpgâ
All successful downloads have the ‘ and ’:
HTTP request sent, awaiting response... 200 OK
Length: 28218 (28K) [image/jpeg]
Saving to: ‘/home/user/script/cache/posters/6597.jpg’
I just tried adding this
system "wget --restrict-file-names=nocontrol -P '/home/tup/tuper4/cache/posters' $imgurl";
I hoped it would work better, and so far it has not failed, but I suspect this is not the real fix and would like some guidance if possible.
Should I maybe try
system "cd /location/ && wget $imgurl";
Would it make any difference?
I guess my real question here is: What could cause wget to change from ‘ and ’ to â ?
Thank you in advance for any help!
Output of locale is:
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
And the images are also UTF-8
I did suspect that it had to do with the encoding, hence I added --restrict-file-names=nocontrol. It remains to be seen whether it will work.
Edit: Several days later and I have not seen the error again so it looks like "nocontrol" helped.
It's not wget changing the character. The character encoding seems to be set to something wrong. When the real encoding is UTF-8, as it probably is, but it is set to something else, showing the quote as the character â is a typical symptom. Sometimes it's followed by more characters. So it should work if you set the encoding to UTF-8.
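To illustrate the symptom, here is a minimal Perl sketch, not taken from the original script. It assumes the quote wget prints is the left single quotation mark U+2018 and that the "something else" is ISO-8859-1, and it uses only the core Encode module. The same three UTF-8 bytes come back as one quote character when decoded as UTF-8, and as three characters starting with â when decoded as ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The left single quote U+2018 is three bytes in UTF-8: E2 80 98.
my $bytes = encode('UTF-8', "\x{2018}");

# Right encoding: the three bytes decode back to a single character.
my $right = decode('UTF-8', $bytes);

# Wrong encoding (assumed here to be ISO-8859-1): each byte becomes its own
# character. The first is U+00E2 (â); the other two are control characters
# that most terminals do not display.
my $wrong = decode('ISO-8859-1', $bytes);

# Give STDOUT an encoding layer so printing decoded text does not trigger
# a "Wide character" warning.
binmode(STDOUT, ':encoding(UTF-8)');

printf "bytes:         %s\n", join(' ', map { sprintf('%02X', ord($_)) } split(//, $bytes));
printf "as UTF-8:      %d character(s), first is U+%04X\n", length($right), ord($right);
printf "as ISO-8859-1: %d character(s), first is U+%04X\n", length($wrong), ord($wrong);
```

If the log shows a lone â where a quote should be, this kind of mismatch between the bytes written and the encoding used to read them is the likely reason.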
What is the output of the command locale?

Background info:
http://askleo.com/why_do_i_get_odd_characters_instead_of_quotes_in_my_documents/
Googling "â quote" gives some good results.
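If the failures only happen when the script is started from an environment with a different locale, one option is to set the locale explicitly for the wget call. The following is a rough sketch under that assumption, not a confirmed fix. The helper names wget_args and fetch_poster are made up for illustration, the locale en_US.UTF-8 is taken from the locale output in the question, and the list form of system is used so that no shell has to parse the URL.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for one download.
sub wget_args {
    my ($dir, $url) = @_;
    return ('wget', '-P', $dir, $url);
}

# Hypothetical helper: run wget with an explicit UTF-8 locale.
# Returns true when wget exits with status 0.
sub fetch_poster {
    my ($dir, $url) = @_;

    # Assumption: en_US.UTF-8 is installed, as the locale output suggests.
    # "local" restores the previous value when the sub returns.
    local $ENV{LC_ALL} = 'en_US.UTF-8';

    # A multi-element list makes system() run wget directly, without a shell,
    # so special characters in the URL are passed through untouched.
    my $status = system(wget_args($dir, $url));
    return $status == 0;
}

# Example call (commented out so the sketch has no side effects):
# fetch_poster('/home/user/script/cache/posters', $imgurl)
#     or warn "wget failed for $imgurl\n";
```

Checking the return value also lets the script report or retry a failed download instead of carrying on silently.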