I have a file that contains lines like this:
StatsLearning_Lect1_2a_111213_v2_%5B2wLfFB_6SKI%5D_%5Btag22%5D.mp4
For each of these lines I have a corresponding file on disk, but saved under the decoded name:
StatsLearning_Lect1_2a_111213_v2_[2wLfFB_6SKI]_[tag22].mp4
I need to match each name in the list with the corresponding file on disk and rename the file accordingly. To do that, I need to decode the HTML entities in the file names, so I tried something like this:
import os
from html.parser import HTMLParser
fpListDwn = open('listDwn', 'r')
for lineNumberOnList, fileName in enumerate(fpListDwn):
    print(HTMLParser().unescape(fileName))
But this has no effect when I run it. The output of a run is:
meysampg@freedom:~/Downloads/Practical Machine Learning$ python3 changeName.py
StatsLearning_Lect1_2a_111213_v2_%5B2wLfFB_6SKI%5D_%5Btag22%5D.mp4
StatsLearning_Lect1_2b_111213_v2_%5BLvaTokhYnDw%5D_%5Btag22%5D.mp4
StatsLearning_Lect3_4a_110613_%5BWjyuiK5taS8%5D_%5Btag22%5D.mp4
StatsLearning_Lect3_4b_110613_%5BUvxHOkYQl8g%5D_%5Btag22%5D.mp4
StatsLearning_Lect3_4c_110613_%5BVusKAosxxyk%5D_%5Btag22%5D.mp4
How can I fix this?
This is actually "percent-encoding" (URL encoding), not HTML entity encoding, which is why unescaping HTML entities leaves the names unchanged. See this question:
How to percent-encode URL parameters in Python?
Basically, you want to use urllib.parse.unquote instead:
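A minimal sketch of that approach, assuming the list file holds one encoded name per line. The helper name decode_names is made up for illustration; only urllib.parse.unquote comes from the standard library.

```python
from urllib.parse import unquote


def decode_names(lines):
    """Return the percent-decoded form of each non-empty line.

    Hypothetical helper: strips the trailing newline from each line,
    skips blank lines, and decodes sequences such as %5B and %5D.
    """
    return [unquote(line.strip()) for line in lines if line.strip()]


# Sample line taken from the question's list file.
encoded = "StatsLearning_Lect1_2a_111213_v2_%5B2wLfFB_6SKI%5D_%5Btag22%5D.mp4\n"
print(decode_names([encoded])[0])
# prints: StatsLearning_Lect1_2a_111213_v2_[2wLfFB_6SKI]_[tag22].mp4
```

Since a file object iterates over its lines, the same helper should work on the real list, for example `with open('listDwn') as fp: names = decode_names(fp)`. The decoded names can then be compared against the files on disk before renaming anything.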