Class MRJob Python TypeError: super(type, obj): obj must be an instance or subtype of type

Problem

I am running custom MRJob code in Python 3.x. The MRJob class is defined, and its unit tests ran fine in a Jupyter Notebook. I then saved the notebook as a Python .py file to run the final unit test from the console.

I am seeking help with the TypeError raised by super() in configure_args, and with what I did wrong such that the instance is not an instance or subtype of JoinJob.

Main Process for running instance of class

Created a new instance of the JoinJob() class:

    instance = JoinJob()
    instance.testJob()

The class JoinJob defines a method, configure_args:

    def configure_args(self):

Area of line 122 for TypeError

This class, MapReduceJoinJob, subclasses MRJob and contains line 122 (marked below), where the TypeError is raised. As far as I can tell, configure_args() is set up to call super() on the main class, JoinJob, and its configure_args() in order to override it. No custom overriding code has been added in this example, however.

class MapReduceJoinJob(MRJob):
    OUTPUT_PROTOCOL = RawValueProtocol
    # line 122
    def configure_args(self):
        super(JoinJob, self).configure_args()

In the main Python routine, I call the driver class by creating a new instance of MapReduceJoinJob():

if __name__ == "__main__":
    driver_reduce_join()
    instance = MapReduceJoinJob()  # the TypeError is raised here

Error

TypeError: super(type, obj): obj must be an instance or subtype of type
line 122, in configure_args
    super(JoinJob, self).configure_args()
TypeError: super(type, obj): obj must be an instance or subtype of type

MRJob class definition

# MRJob MapReduce using MRJob (MRUnit in Python)
# MRJob / MRUnit Test unit class

class JoinJob():
    # line 122
    def configure_args(self):

        JoinJob.configure_args()

    def showResults(self, df_h_data, df_v_data, combine_row):

        print('MRJob Test Case : Columns from Dataset Homocides Non-Fatal')
        print('\n')
        df_h_data['HOMICIDE'].iloc[:9]
        print(df_h_data)

        print('MRJob Test Case : Columns from Dataset Victim Demographics')
        print('\n')
        df_v_data['HOMICIDE'].iloc[:8]

        print('MRJob Test Case : Combined Map Reduced Dataset Homocides Non-Fatal & Victim Demographics')
        print('\n')
        combine_row

        print(combine_row)


    # test class function
    def testJob(self):

        h_data, v_data = MapReduceJoin()
        combined_row = CombineDatasets(h_data, v_data)
        df_combine_row = pd.DataFrame(combined_row)
        by='BATTERY'
        combine_row = SortVictimData(df_combine_row, sort_order=False, column=by)
        df_h_data = pd.DataFrame(h_data)
        df_v_data = pd.DataFrame(v_data)

        self.showResults(df_h_data, df_v_data, combine_row)


#%%
# MRJob MapReduce using MRJob (MRUnit in Python)

# Test: test different row filter, test sort

class MapReduceJoinJob(MRJob):
    OUTPUT_PROTOCOL = RawValueProtocol

    def configure_args(self):
        super(JoinJob, self).configure_args()

    instance = JoinJob()
    instance.testJob()


There is 1 solution below


General

In Python, super(type, obj) returns a proxy object that delegates method calls to the next class after type in the method resolution order (MRO) of obj. It is commonly used inside a subclass to call a parent-class method that the subclass overrides or extends, so that the parent's implementation still runs.

Here's how to understand super(type, obj):

type: the class after which the method search starts in the MRO. It is usually the subclass (child class) whose method you are currently writing.

obj: the current instance, normally self. It must be an instance of type or of a subclass of type; otherwise Python raises the TypeError shown in the question.
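As a minimal sketch of the two arguments, assuming two hypothetical classes Base and Child that are not part of the question's code, the explicit form looks like this:

```python
class Base:
    def greet(self):
        return "Base.greet"


class Child(Base):
    def greet(self):
        # Start the lookup at the class that follows Child in the MRO of type(self).
        parent_result = super(Child, self).greet()
        return "Child.greet -> " + parent_result


child = Child()
result = child.greet()
print(result)  # Child.greet -> Base.greet

# The MRO that super() walks for this instance.
mro_names = [cls.__name__ for cls in type(child).__mro__]
print(mro_names)  # ['Child', 'Base', 'object']
```

The call works here because child is an instance of Child, the class passed as the first argument.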


Status Quo

With that in mind, let's look at the current state:

TypeError: super(type, obj): obj must be an instance or subtype of type

This is logical, because self is not related in any way to JoinJob: MapReduceJoinJob inherits from MRJob, not from JoinJob. That is why I asked about the implementation of MRJob, to see whether there is any relation between the MRJob and JoinJob classes.
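The situation can be reproduced without mrjob installed. The sketch below assumes a hypothetical FrameworkJob class as a stand-in for MRJob; only the inheritance relationship matters, not the real library behavior.

```python
class FrameworkJob:  # hypothetical stand-in for MRJob
    def configure_args(self):
        return "FrameworkJob.configure_args"


class JoinJob:  # plain class, unrelated to FrameworkJob
    pass


class MapReduceJoinJob(FrameworkJob):
    def configure_args(self):
        # JoinJob is not in the MRO of MapReduceJoinJob, so super() rejects self.
        return super(JoinJob, self).configure_args()


job = MapReduceJoinJob()
is_join_job = isinstance(job, JoinJob)
print(is_join_job)  # False

try:
    job.configure_args()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The exact wording of the message can differ between Python versions.
print(error_message)
```

Because isinstance(job, JoinJob) is False, super(JoinJob, self) cannot build a proxy and raises the TypeError.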


Further points to clarify

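What is the point of this definition? The method calls itself. As written, the call omits self, so it fails with a TypeError about the missing argument; if self were passed, it would recurse until Python raises RecursionError.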

class JoinJob():
    def configure_args(self):
        JoinJob.configure_args()  # very dangerous: the method calls itself
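To see what that method does when it is actually called, here is a small sketch. JoinJob is copied from the question; RecursiveJoinJob and the raised_by helper are hypothetical additions for illustration only.

```python
class JoinJob:
    def configure_args(self):
        JoinJob.configure_args()  # called on the class, no instance passed


class RecursiveJoinJob:  # hypothetical variant that does pass self
    def configure_args(self):
        RecursiveJoinJob.configure_args(self)  # calls itself without end


def raised_by(func):
    """Call func and return the name of the exception type it raises, if any."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


missing_self = raised_by(JoinJob().configure_args)
endless = raised_by(RecursiveJoinJob().configure_args)
print(missing_self)  # TypeError
print(endless)       # RecursionError
```

Either way the method never does useful work, so it is safer to remove it or give it a real body.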

What is the purpose of the call super(JoinJob, self).configure_args() in your MapReduceJoinJob? If you want to call MRJob's implementation, super().configure_args() is all you need, provided the parent class defines that method.
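A minimal sketch of the zero-argument form follows. FrameworkJob is again a hypothetical stand-in for MRJob, and the assumption that its constructor calls configure_args() is made only so that the example mirrors the error appearing at instantiation in the question.

```python
class FrameworkJob:  # hypothetical stand-in for MRJob
    def __init__(self):
        self.configured = []
        # Assumption: the base constructor triggers configure_args().
        self.configure_args()

    def configure_args(self):
        self.configured.append("FrameworkJob")


class MapReduceJoinJob(FrameworkJob):
    def configure_args(self):
        # Zero-argument super() uses the enclosing class and self automatically.
        super().configure_args()
        self.configured.append("MapReduceJoinJob")


job = MapReduceJoinJob()
print(job.configured)  # ['FrameworkJob', 'MapReduceJoinJob']
```

The explicit equivalent would be super(MapReduceJoinJob, self).configure_args(), naming the class the method is defined in rather than JoinJob.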

I hope this is somewhat useful. Good luck :)