Problem
Running custom MRJob class code in Python 3.x. The MRJob class is defined. Unit testing in a Jupyter Notebook ran fine. I saved the Jupyter Notebook as a Python .py file to run on the console for a final unit test.
Seeking help on the TypeError raised by super() in configure_args, and on what I did wrong so that the instance object is not a correct instance or subtype of JoinJob.
Main Process for running instance of class
Created a new object instance of the JoinJob() class:
instance = JoinJob()
instance.testJob()
The class JoinJob() defines a function, configure_args, as one of its methods.
def configure_args(self):
Area of line 122 for TypeError
This class, MapReduceJoinJob, subclasses MRJob and contains /// line 122 ///, where the TypeError is raised. As far as I can tell, the definition of configure_args() is properly set up to call super on the main class, JoinJob, and its configure_args(), in order to override it. However, this example adds no custom overriding code.
class MapReduceJoinJob(MRJob):
    OUTPUT_PROTOCOL = RawValueProtocol
    /// line 122 ///
    def configure_args(self):
        super(JoinJob, self).configure_args()
In the main Python routine, I call the driver class, MapReduceJoinJob(), by creating a new instance of MapReduceJoinJob(), to then process and invoke the
if __name__ == "__main__":
    driver_reduce_join()
    instance = MapReduceJoinJob()  # This is the area of the TypeError
Error
TypeError: super(type, obj): obj must be an instance or subtype of type
line 122, in configure_args
super(JoinJob, self).configure_args()
TypeError: super(type, obj): obj must be an instance or subtype of type
MRJob class definition
# MRJob MapReduce using MRJob (MRUnit in Python)
# MRJob / MRUnit Test unit class
class JoinJob():
    ///line 122///
    def configure_args(self):
        JoinJob.configure_args()
    def showResults(self, df_h_data, df_v_data, combine_row):
        print('MRJob Test Case : Columns from Dataset Homocides Non-Fatal')
        print('\n')
        df_h_data['HOMICIDE'].iloc[:9]
        print(df_h_data)
        print('MRJob Test Case : Columns from Dataset Victim Demographics')
        print('\n')
        df_v_data['HOMICIDE'].iloc[:8]
        print('MRJob Test Case : Combined Map Reduced Dataset Homocides Non-Fatal & Victim Demographics')
        print('\n')
        combine_row
        print(combine_row)
    # test class function
    def testJob(self):
        h_data, v_data = MapReduceJoin()
        combined_row = CombineDatasets(h_data, v_data)
        df_combine_row = pd.DataFrame(combined_row)
        by='BATTERY'
        combine_row = SortVictimData(df_combine_row, sort_order=False, column=by)
        df_h_data = pd.DataFrame(h_data)
        df_v_data = pd.DataFrame(v_data)
        self.showResults(df_h_data, df_v_data, combine_row)
#%%
# MRJob MapReduce using MRJob (MRUnit in Python)
# Test: test different row filter, test sort
class MapReduceJoinJob(MRJob):
    OUTPUT_PROTOCOL = RawValueProtocol
    def configure_args(self):
        super(JoinJob, self).configure_args()
instance = JoinJob()
instance.testJob()
General
The super(type, obj) function in Python is used to call a method from a parent class (superclass) of a subclass. It is commonly used when you want to invoke a parent-class method that has been overridden or extended in the subclass, and it ensures that the appropriate superclass method is called. Here's how to understand super(type, obj):
type: The class after which the method search starts in the method resolution order (MRO) of obj. It is usually the subclass (child class) that you are currently in.
obj: The instance of the current object, that is, the instance of the subclass from which you want to call the superclass method. It must be an instance of type or of a subclass of type.
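To make the two arguments concrete, here is a minimal sketch. The classes Base, Child, Unrelated and Broken are hypothetical names invented for this illustration and are not part of the question's code. Child shows the normal case, while Broken passes a type that self does not inherit from, which is the same situation as in the question.

```python
class Base:
    def configure_args(self):
        return "Base.configure_args"


class Child(Base):
    def configure_args(self):
        # The lookup starts after Child in the MRO of self, so Base's method runs.
        return "Child -> " + super(Child, self).configure_args()


class Unrelated:
    """Hypothetical class that Broken does not inherit from."""


class Broken(Base):
    def configure_args(self):
        # self is a Broken instance, not an Unrelated one, so super() rejects it.
        return super(Unrelated, self).configure_args()


print(Child().configure_args())

try:
    Broken().configure_args()
except TypeError as exc:
    # The exact wording of the message differs between Python versions.
    print("TypeError:", exc)
```

The first call succeeds because a Child instance is an instance of Child. The second fails before any method lookup happens, because super() first checks that obj is an instance or subtype of type.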
Status Quo
So after this explanation, let's look at the status quo: the call super(JoinJob, self).configure_args() inside MapReduceJoinJob raises the TypeError, which is logical, because self is not related in any way to JoinJob (MapReduceJoinJob doesn't inherit from JoinJob). This is why I asked you about the implementation of MRJob, just to see whether there is a relation between the MRJob and JoinJob classes.
Further points to clarify
What is the point of the definition of configure_args in JoinJob, where the method calls JoinJob.configure_args() again? You are creating a circular dependency, which, once self is passed along, could explode in a RecursionError.
What is the main point of the call super(JoinJob, self).configure_args() in your MapReduceJoinJob? If you want to call MRJob's implementation, you only need super().configure_args(), which assumes that the parent class also defines it.
I hope this was somewhat useful, and good luck :)
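P.S. The last point could be sketched as follows. To keep the example runnable without mrjob installed, FakeMRJob is a hypothetical stand-in for MRJob. It assumes only that the constructor calls configure_args(), which is consistent with the traceback in the question being triggered when the instance is created. The options list is likewise invented for the illustration.

```python
class FakeMRJob:
    """Hypothetical stand-in for MRJob, used only for this sketch."""

    def __init__(self):
        self.options = []
        # Assumption: the base constructor calls configure_args().
        self.configure_args()

    def configure_args(self):
        self.options.append("base options")


class MapReduceJoinJob(FakeMRJob):
    def configure_args(self):
        # Zero-argument super() resolves to the actual parent class,
        # so no unrelated class such as JoinJob is involved.
        super().configure_args()
        self.options.append("join options")


job = MapReduceJoinJob()
print(job.options)
```

In the real code, the same method body would sit in the class that subclasses MRJob. If a job needs no extra arguments, the configure_args override could also be dropped entirely, since the inherited method is used by default.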