I have a large data set in which some of the columns are dates and others are categorical data such as Status, Department Name, and Country Name.
How is this data treated in GraphLab when I call the graphlab.linear_regression.create method? Do I have to pre-process the data and convert it into numbers, or can I provide it to GraphLab directly?
 
                        
GraphLab is mostly used for computing on tabular and graph-based datasets, and it offers high scalability and performance. graphlab.linear_regression.create has a built-in ability to inspect the type of the data and pick the most suitable solver for the linear regression. For example, when both the target and the features are numeric, GraphLab most of the time uses Newton's method. Similarly, for other datasets it works out what is needed and chooses a solver accordingly.

Now, about preprocessing: GraphLab only accepts an SFrame for learning, and the data has to be parsed correctly before any learning can happen. When you create an SFrame, unprocessed or malformed data shows up as an error, so you need clean data before any learning. If the SFrame accepts the data, including the target and features you have chosen, you are good to go, but pre-processing and cleaning the data is always recommended. It is also good practice to do feature engineering before running any learning algorithm, and redefining the column data types beforehand is recommended for accuracy.

About your point on how data is treated in GraphLab, I would say it depends. Some datasets are tabular and are treated accordingly, and some have a graph structure. GraphLab performs very well on regression trees and boosted classifiers, which follow the decision tree concept and are quite time- and resource-consuming in libraries other than GraphLab.

For me, GraphLab performed very well when I built a recommendation engine from a dataset of nodes and edges, and a boosted tree classifier with 18 iterations also ran flawlessly in reasonable time. I must say that even for tree-structured data GraphLab performs very well. I hope this answer helps.
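
If you decide to convert the date and categorical columns into numbers yourself before building the SFrame, here is a minimal sketch in plain Python. It uses no GraphLab calls. The sample rows, the column names (joined, status, country, salary) and the helper functions are hypothetical and only illustrate the idea: a date is split into numeric parts, and each categorical column becomes a set of 0/1 indicator features.

```python
from datetime import datetime

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"joined": "2015-03-14", "status": "active", "country": "India", "salary": 52000.0},
    {"joined": "2016-11-02", "status": "inactive", "country": "Nepal", "salary": 48000.0},
    {"joined": "2014-07-30", "status": "active", "country": "Nepal", "salary": 61000.0},
]


def expand_date(value, fmt="%Y-%m-%d"):
    """Turn a date string into numeric year/month/day/weekday parts."""
    d = datetime.strptime(value, fmt)
    return {"year": d.year, "month": d.month, "day": d.day, "weekday": d.weekday()}


def one_hot(rows, column):
    """Return one dict of 0/1 indicator features per row for a categorical column."""
    categories = sorted({r[column] for r in rows})
    return [
        {"%s=%s" % (column, c): int(r[column] == c) for c in categories}
        for r in rows
    ]


def preprocess(rows, date_cols, cat_cols):
    """Replace date and categorical columns with numeric features."""
    encoded = {c: one_hot(rows, c) for c in cat_cols}
    out = []
    for i, r in enumerate(rows):
        # Keep the columns that are already numeric.
        new = {k: v for k, v in r.items() if k not in date_cols and k not in cat_cols}
        for c in date_cols:
            for part, val in expand_date(r[c]).items():
                new["%s_%s" % (c, part)] = val
        for c in cat_cols:
            new.update(encoded[c][i])
        out.append(new)
    return out


numeric_rows = preprocess(rows, date_cols=["joined"], cat_cols=["status", "country"])
```

Every dict in numeric_rows now holds only numbers, so it can be loaded into whatever tabular structure you train on. This sketch keeps one indicator per category; for a linear model with an intercept you may want to drop one indicator per column, since the full set is redundant with the intercept.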