I'm trying to upload files containing special characters on our platform via the exec
command but the characters are always interpreted and it fails.
For example if I try to upload a mémo.txt file I get the following error:
/bin/cp: cannot create regular file `/path/to/dir/m\351mo.txt': No such file or directory
The UTF8 is correctly configured on the system and if I run the command on the shell it works fine.
Here is the TCL code:
exec /bin/cp $tmp_filename $dest_path
How can I make it work?
The core of the problem is what encoding is being used to communicate with the operating system. For
exec
and filenames, that encoding is whatever is returned by theencoding system
command (Tcl has a pretty good guess at what the correct value for that is when the Tcl library starts up, but very occasionally gets it wrong). On my computer, that command returnsutf-8
which says (correctly!) that strings passed to (and received from) the OS are UTF-8.You should be able to use the
file copy
command instead of doingexec /bin/cp
, which will be helpful here as that's got less layers of trickiness (it avoids going through an external program which can impose its own problems). We'll assume that that's being done:If that fails, we need to work out why. The most likely problems relate to the encoding though, and can go wrong in multiple ways that interact horribly. Alas, the details matter. In particular, the encoding for a path depends on the actual filesystem (it's formally a parameter when the filesystem is created) and can vary on Unix between parts of a path when you have a mount within another mount.
If the worst comes to the worst, you can put Tcl into ISO 8859-1 mode and then do all the encoding yourself (as ISO 8859-1 is the “just use the bytes I tell you” encoding);
encoding convertto
is also useful in this case. Be aware that this can generate filenames that cause trouble for other programs, but it's at least able to let you get at it.Care might be needed to convert different parts of the path correctly in this case: you're taking full responsibility for what's going on.
If you're on Windows, please just let Tcl handle the details. Tcl uses the Wide (Unicode) Windows API directly so you can pretend that none of these problems exist. (There are other problems instead.)
On macOS, please leave
encoding system
alone as it is correct. Macs have a very opinionated approach to encodings.