Odd behavior of yield inside an if/then block


I have a function that returns a generator or a list depending on a flag.

Yet even when I set the flag to 'list', the function still returns a generator, and it doesn't print the flag either.

I expect the print statement before the yield to be evaluated first. For that matter, if the flag is set to 'list', I do not expect the generator block to be evaluated at all.

import os

def get_iterator_all_files_name(dir_path, flag):
    if flag == 'generator':
        print(flag)
        for (dirpath, dirnames, filenames) in os.walk(dir_path):
            for f in filenames:
                yield os.path.join(dirpath, f)
    elif flag == 'list':
        print(flag)
        paths = list()
        for (dirpath, dirnames, filenames) in os.walk(dir_path):
            for f in filenames:
                paths.append(os.path.join(dirpath, f))
        return paths

Using the function...

file_path = 'path/to/files'
flag = 'list'
foo = get_iterator_all_files_name(file_path, flag)
type(foo)

Which produces the result...

generator

Which is not what I expect; I was expecting list.

Answer by iz_ (best answer):

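If a function contains yield anywhere in its body, it is a generator function, whether or not that branch is ever reached. No exceptions. Calling it returns a generator object without running any of the body; the code is not evaluated until iteration is attempted, which is why print(flag) never fires. In the 'list' branch, return paths only ends the iteration instead of handing the list back to the caller.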
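A minimal sketch of that behavior may help. The function pick below is a hypothetical stand-in for the question's function (the os.walk part is dropped so it runs anywhere); it is only meant to show when the body runs and what happens to a returned value inside a generator.

```python
def pick(flag):
    # Hypothetical stand-in for the question's function, without os.walk.
    if flag == 'generator':
        print('generator branch')
        yield 1
    elif flag == 'list':
        print('list branch')
        return [1, 2]

foo = pick('list')           # nothing is printed here: the body has not started
print(type(foo).__name__)    # generator

try:
    next(foo)                # now the body runs and prints 'list branch'
except StopIteration as stop:
    # In a generator, `return value` ends iteration; the value rides on StopIteration.
    returned = stop.value
print(returned)              # [1, 2]
```

So the list is not lost, but it is never the return value of the call itself, which is why type(foo) reports generator.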

Just call list on the result instead:

import os

def get_iterator_all_files_name(dir_path):
    for (dirpath, dirnames, filenames) in os.walk(dir_path):
        for f in filenames:
            yield os.path.join(dirpath, f)

file_path = 'path/to/files'
foo = list(get_iterator_all_files_name(file_path))

If you really want to preserve the flag functionality, you can modify your function to return a generator expression instead of using yield, so that it stays an ordinary function. You can also build paths with a list comprehension, which simplifies the function:

def get_iterator_all_files_name(dir_path, flag):
    if flag == 'generator':
        return (os.path.join(dirpath, f) for (dirpath, dirnames, filenames) in os.walk(dir_path) for f in filenames)
    elif flag == 'list':
        return [os.path.join(dirpath, f) for (dirpath, dirnames, filenames) in os.walk(dir_path) for f in filenames]

Answer by fountainhead:

Any function that contains a yield statement becomes a generator function, so its return statement means something different from a return in a normal function: it ends the iteration rather than handing a value back to the caller. If you really need get_iterator_all_files_name to return a generator sometimes and a list at other times, one way of doing it is this:

  1. Define one more function (say my_gen_func) that does what you're doing now inside your if clause. This new function is a generator function, since it contains a yield statement.
  2. Inside get_iterator_all_files_name, change the if clause so that it just calls my_gen_func and returns the result. get_iterator_all_files_name is then no longer a generator function but a normal function, since it no longer contains a yield statement.
  3. Your elif clause can remain the same.
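The steps above could be sketched roughly as follows. The name my_gen_func comes from the answer; two details are assumptions added for illustration: the elif branch builds its list by calling list on the helper rather than repeating the loop, and an unknown flag raises ValueError.

```python
import os

def my_gen_func(dir_path):
    # Generator function: contains yield, so calling it returns a generator.
    for dirpath, _dirnames, filenames in os.walk(dir_path):
        for f in filenames:
            yield os.path.join(dirpath, f)

def get_iterator_all_files_name(dir_path, flag):
    # Ordinary function: no yield here, so the body runs as soon as it is called.
    if flag == 'generator':
        print(flag)
        return my_gen_func(dir_path)
    elif flag == 'list':
        print(flag)
        # Shortcut (assumption): reuse the helper instead of a second os.walk loop.
        return list(my_gen_func(dir_path))
    # Assumption: rejecting unknown flags is an addition, not part of the original.
    raise ValueError(f"unknown flag: {flag!r}")
```

With this split, print(flag) runs immediately in both branches, and type(foo) is list when flag == 'list'.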