Creating subsets of data using bash and measuring comparisons in Java programs


I am stumped on a question I have to do. It's part of some extra exercises for my Computer Science course.

We have to conduct an experiment to count the number of comparisons that a binary search tree performs versus a traditional array. I have written both programs in Java; they read data from a large data set (containing dam information such as names, levels, locations, etc.), extract the necessary information, and store it in objects in the array/binary tree.

I am stuck on this question:

Conduct an experiment with DamArrayApp (array) and DamBSTApp (binary search tree) to demonstrate the speed difference for searching between a BST and a traditional array.

You want to vary the size of the dataset (n) and measure the number of comparison operations in the best/average/worst case for every value of n (from 1-211). For each value of n:

Create a subset of the sample data (hint: use the Unix head command).

Run both instrumented applications for every dam name in the subset of the data file. Store all operation count values.

Determine the minimum (best case), maximum (worst case) and average of these count values.

It is recommended that you use Unix or Python scripts to automate this process.

I do not know how to even start this. I know I should use Bash, but I have not been taught it. The data set is a CSV file with 211 rows of data, so I need to create shorter subsets of it and count the operations for each case. Any help would be greatly appreciated; even help with the bash script would suffice.
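For the subset step specifically, the assignment's hint boils down to one command: `head -n k` keeps the first k lines of a file. A minimal sketch, assuming the data file is called `dams.csv` (a stand-in name; a tiny sample file is created here so the example runs on its own):

```shell
# Create a small stand-in for the real 211-row CSV so this runs anywhere.
printf 'row%d\n' 1 2 3 > dams.csv

# head -n 2 prints the first 2 lines; redirecting builds the subset file.
head -n 2 dams.csv > subset_2.csv
```

After this, `subset_2.csv` contains only `row1` and `row2`, which is exactly the shape of subset described below.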

I need the bash script to turn the data file from this:

row1
row2
.
.
.
row211

To something like:

row1
row2

And then another subset like:

row1
row2
row3
row4

(Basically a subset for every n up to 211)
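The whole process (one subset per n, then min/max/average of the recorded counts) can be sketched as a single loop. This is a sketch under some assumptions: the full data set is named `dams.csv`, the first CSV field is the dam name, and each instrumented app prints one comparison count when given a data file and a dam name; the file name and the `java` invocations are hypothetical, so adjust them to the real programs. A stand-in data file and stand-in counts are created here so the sketch runs on its own:

```shell
#!/usr/bin/env bash
# Stand-in data so the sketch is self-contained (replace with the real CSV).
printf 'dam%d,100,somewhere\n' 1 2 3 4 5 > dams.csv

TOTAL=$(wc -l < dams.csv)

for n in $(seq 1 "$TOTAL"); do
    # One subset file per n, built from the first n rows.
    head -n "$n" dams.csv > "subset_$n.csv"

    # For each dam name in the subset, run both instrumented apps and
    # append the printed comparison count to a per-n results file
    # (hypothetical invocations -- uncomment and adapt to your apps):
    # while IFS=, read -r name _; do
    #     java DamArrayApp "subset_$n.csv" "$name" >> "array_$n.txt"
    #     java DamBSTApp   "subset_$n.csv" "$name" >> "bst_$n.txt"
    # done < "subset_$n.csv"
done

# Once a results file holds one count per line, awk can report the
# best (min), worst (max) and average case in a single pass.
printf '%s\n' 3 1 2 > array_3.txt   # stand-in counts for demonstration
awk 'NR==1 {min=max=$1}
     {if ($1<min) min=$1; if ($1>max) max=$1; sum+=$1}
     END {printf "min=%d max=%d avg=%.2f\n", min, max, sum/NR}' array_3.txt
# prints: min=1 max=3 avg=2.00
```

The `while IFS=, read -r name _` line splits each CSV row on commas and keeps only the first field (the dam name); everything else falls into the throwaway variable `_`.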
