SociocaRtography R · Machine Learning · GIS

Decision Trees 101

What is supervised vs. unsupervised analysis? What is the difference between parametric and non-parametric models? How do you decide between RandomForest and GBM? A rudimentary, jargon-free, intuitive explanation of how the recursive partitioning algorithm works. Tutorial: Decision Trees (using R)

Data Musings: Much ado about Big Data

The recent brouhaha over Big Data and Pattern (read Data) Sciences distracts us from tracing its genesis, which emanates from our subconscious. Try to recall the common thread between the last Sherlock Holmes book you read, the last alien invasion movie you watched, or a National Geographic piece on the gladiatorial struggle between a snake and a mongoose. It is only after a string of high-profile murders that Doyle's seemingly invulnerable suspect leaves a clue that helps Holmes piece together a pattern. It is only after the nerd protagonist has tried permutations of warheads on different parts of the alien exoskeleton that he or she finds a literal chink in the armor. Similarly, it takes the mongoose a couple of hits before it accurately gauges the speed and mobility of the snake. All these calculated assessments, or patterns, are not much different from the estimated probabilities of lab experiments.

A repeated event creates a pattern, and minute, iterated observation of patterns is what inculcates learning. We have evolved through this subliminal process of observing, learning and taking action. In prehistoric ages we feared natural catastrophes (since we did not know what caused them) and papered over our ignorance of their causes by assigning a distinct pagan god to each type of natural disaster. Now there exists a specialized insurance product with enumerable exposure for each natural catastrophe. Across this entire span, we have always perceived and acted upon patterns; what has changed is the form of interpretation. Even the most intricate algorithms applied to unearth patterns are not completely new. The base logic of many machine learning algorithms (neural networks, decision trees) is predicated on a common foundation of fine-tuned recursive learning.

So if perceptual Data Science has always existed as a part of our psyche, then what is new about the data? In the absence of a universal definition, a simple deconstruction suggests that Big Data just means more rows and columns. And that is precisely what is different about Data Science as it is evolving today. I think it is only now that we can layer and condense multiple sources of information on a single event, transaction or individual. It is like having multiple perspectives on the same event, except that in the case of data they are harmoniously aligned to serve an objective.

The intent of this page is not very different from that of other blogs/io pages, which I believe all subscribe to an overarching idea: creating information symmetry, reducing the time needed to learn new skills and techniques, and producing more works of scholarship beneficial to society.