WEKA pattern investigation with Apriori: I don't get any results

I want to extract knowledge from the KDD Cup 1999 competition data. This dataset contains records of connections to the Internet, where each connection is either an attack or normal usage. It has 41 attributes and 4,898,431 records. I want to find correlations in the data using the Apriori algorithm, investigate the extracted patterns, and evaluate the results, all of it in WEKA.

A sample of the ARFF file before any preprocessing looks like this: (screenshot of the original ARFF file)

As I understand it, all attributes must be nominal in order to run Apriori in WEKA, so I used WEKA's "NumericToNominal" filter to convert the numeric fields to nominal. The new file looks like this: (screenshot of the converted ARFF file)
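To make the conversion step concrete, here is a minimal sketch in plain Python (not WEKA code) of what the filter does to a column, assuming it simply turns every distinct numeric value into its own nominal label. The helper `numeric_to_nominal` is hypothetical, and the sample values are made up for illustration rather than taken from the real dataset; only the attribute names `duration` and `src_bytes` come from KDD Cup 1999.

```python
def numeric_to_nominal(values):
    """Return the nominal labels a numeric column would get:
    one label per distinct value, ordered numerically."""
    return sorted({str(v) for v in values}, key=float)


# Hypothetical sample values, NOT taken from the real dataset.
duration = [0, 0, 0, 2, 0, 1]
src_bytes = [181, 239, 235, 219, 217, 212]

duration_labels = numeric_to_nominal(duration)
src_bytes_labels = numeric_to_nominal(src_bytes)

# A column with few distinct values stays small after conversion,
# while a near-continuous column gets almost one label per record.
print(len(duration_labels), duration_labels)
print(len(src_bytes_labels), src_bytes_labels)
```

If that assumption holds, a byte-count attribute over millions of records could end up with a very large number of labels, which may or may not be related to the behaviour I describe next.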

So I took the new file and ran Apriori, but I get no results. First I tried running it on the whole file, but it kept loading until it crashed with an out-of-memory error. I increased the memory, but it kept crashing, so I took my original file and kept only 20 records, just to test. Even then the program just keeps loading without giving me any results.

After a lot of searching, I assume it has something to do with the data format. I thought that after applying the built-in filter my data would be in the correct form, but it seems more preprocessing is needed. So how am I supposed to restructure my data, and in what format exactly? It is a really complex file for me, so I am a bit lost. Any help is really appreciated.
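For anyone who wants to reproduce the small test, here is a minimal sketch of how a 20-record file can be cut from a full ARFF file. It is plain Python, not WEKA code, and it assumes the usual ARFF layout: a header ending with an `@data` line, followed by one record per line, with `%` marking comments. The function name `head_of_arff` and the tiny sample file are hypothetical.

```python
def head_of_arff(lines, n_rows):
    """Keep the ARFF header (up to and including @data)
    and only the first n_rows data rows after it."""
    kept = []
    in_data = False
    rows = 0
    for line in lines:
        stripped = line.strip()
        if not in_data:
            # Still in the header: copy everything through @data.
            kept.append(line)
            if stripped.lower() == "@data":
                in_data = True
        elif stripped and not stripped.startswith("%"):
            # A data row (blank lines and comments are skipped).
            if rows == n_rows:
                break
            kept.append(line)
            rows += 1
    return kept


# Hypothetical miniature file, only to show the shape of the input.
sample = [
    "@relation kdd_sample",
    "@attribute duration numeric",
    "@attribute label {normal,attack}",
    "@data",
    "0,normal",
    "2,attack",
    "0,normal",
]
small = head_of_arff(sample, 2)
print("\n".join(small))
```

With a real file, the same function could be applied to the lines read from disk with `n_rows=20`.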

Thanks a lot in advance for your time!
