threads and function 'print'


I'm trying to parallelize a script that prints how many documents, pictures and videos a directory contains, along with some other information. The serial script is at the end of this message. Here is an example of its output for a given directory:

7 documents use 110.4 kb (      1.55 % of total size)
2 pictures use 6.8 Mb (     98.07 % of total size)
0 videos use 0.0 bytes (      0.00 % of total size)
9 others use 26.8 kb (      0.38 % of total size)

Now I would like to use threads to reduce the execution time. I tried this:

import threading
import tools
import time
import os
import os.path

directory_path="Users/usersos/Desktop/j"
cv=threading.Lock()

type_=["documents","pictures","videos"]
e={}
e["documents"]=[".pdf",".html",".rtf",".txt"]
e["pictures"]=[".png",".jpg",".jpeg"]
e["videos"]=[".mpg",".avi",".mp4",".mov"]


class type_thread(threading.Thread): 
    def __init__(self,n,e_):
        super().__init__()
        self.extensions=e_
        self.name=n
    def __run__(self):
        files=tools.g(directory_path,self.extensions)
        n=len(files)
        s=tools.size1(files)
        p=s*100/tools.size2(directory_path)
        cv.acquire()
        print("{} {} use {} ({:10.2f} % of total size)".format(n,self.name,tools.compact(s),p))
        cv.release()


types=[type_thread(t,e[t]) for t in type_]
for t in types:
    t.start()
for t in types:
    t.join()

When I run that, nothing is printed! When I then type t and press Return in the interpreter, I get <type_thread(videos, stopped 4367323136)>. What's more, the interpreter sometimes prints the correct statistics after these same keystrokes.

Why is that?


Initial script (serial):

import tools
import time
import os
import os.path

type_=["documents","pictures","videos"]
all_=type_+["others"]
e={}
e["documents"]=[".pdf",".html",".rtf",".txt"]
e["pictures"]=[".png",".jpg",".jpeg"]
e["videos"]=[".mpg",".avi",".mp4",".mov"]

def statistic(directory_path):

    #----------------------------- Computing ---------------------------------

    d={t:tools.g(directory_path,e[t]) for t in type_}
    d["others"]=[os.path.join(root,f) for root, _, files_names in os.walk(directory_path) for f in files_names if os.path.splitext(f)[1].lower() not in e["documents"]+e["pictures"]+e["videos"]]
    n={t:len(d[t]) for t in type_}
    n["others"]=len(d["others"])
    s={t:tools.size1(d[t]) for t in type_}
    s["others"]=tools.size1(d["others"])
    s_dir=tools.size2(directory_path)
    p={t:s[t]*100/s_dir for t in type_}
    p["others"]=s["others"]*100/s_dir

    #----------------------------- Printing ---------------------------------

    for t in all_: 
        print("{} {} use {} ({:10.2f} % of total size)".format(n[t],t,tools.compact(s[t]),p[t]))
    return s_dir  

There is 1 solution below


The start() method does not seem to do anything here. When I replace

for t in types:
    t.start()
for t in types:
    t.join()

with

for t in types:
    t.__run__()

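it works fine (at least for now; I don't know whether it still will once I add other commands). Note, however, that calling t.__run__() directly executes the method in the calling thread, so the three computations run one after another rather than in parallel.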
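The likely cause is the method name rather than start() itself: threading.Thread.start() invokes the thread's run() method in the new thread, and the class in the question defines __run__ instead, so the inherited run() is executed and does nothing. Below is a minimal sketch of the same pattern with the method renamed to run(). The tools module is not shown in the question, so the body of run() uses a hypothetical stand-in (it only counts the extensions it was given); the names TypeThread and run_all are also made up for this illustration.

```python
import threading

# Serializes access to stdout so lines from different threads do not interleave.
print_lock = threading.Lock()


class TypeThread(threading.Thread):
    def __init__(self, name, extensions):
        super().__init__(name=name)
        self.extensions = extensions
        self.count = None  # filled in by run()

    # Thread.start() calls run() in the new thread. A method named
    # __run__ is never called by the threading machinery.
    def run(self):
        # Hypothetical stand-in for tools.g / tools.size1 from the question:
        # here we only count the extensions handed to the thread.
        self.count = len(self.extensions)
        with print_lock:  # released automatically, even if print() raises
            print("{} extensions for {}".format(self.count, self.name))


def run_all(extensions_by_type):
    """Start one thread per type, wait for all of them, return the counts."""
    threads = [TypeThread(t, exts) for t, exts in extensions_by_type.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {t.name: t.count for t in threads}


if __name__ == "__main__":
    run_all({"documents": [".pdf", ".html", ".rtf", ".txt"],
             "pictures": [".png", ".jpg", ".jpeg"]})
```

With run() defined, the start()/join() loops from the question can stay as they are, and each thread prints its own line. Using the lock as a context manager is a small design choice that replaces the explicit acquire()/release() pair.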