Data constipation. Data liberation. Data science.

I’m writing this post during a week of workshops focused on liberating data, getting disparate petabytes of the stuff flowing and combining to inform decision making, inspire innovation, and enhance responsiveness. Braintribe redefines how we wield enterprise data to competitive advantage.

For any reader familiar with the advantages of Gartner’s bimodal approach, we’re effectively developing the membrane between modes 1 and 2, between the safe, monolithic, legacy systems, and the agile, explorative, and dynamic business reality. In our enthusiasm, a colleague commented:

I love data science!

… just the perfect provocation for a short blog post.

In their book, Data Science for Business, Foster Provost and Tom Fawcett refer to data science as:

… the extraction of useful information and knowledge from large volumes of data, in order to improve business decision-making.

They’re not slow in distinguishing a few fundamental concepts underpinning data science from the plethora of data mining techniques – a data miner or data analyst is not the same thing as a data scientist.

Amongst the concepts, Provost and Fawcett highlight four in particular:

  1. Extracting useful knowledge from data to solve business problems can be treated systematically by following a process with reasonably well-defined stages
  2. From a large mass of data, information technology can be used to find informative descriptive attributes of entities of interest
  3. If you look too hard at a set of data, you will find something – but it might not generalize beyond the data you’re looking at
  4. Formulating data mining solutions and evaluating the results involves thinking carefully about the context in which they will be used
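
Concept 3 – look too hard at a set of data and you’ll find something that doesn’t generalize – is the classic overfitting trap, and it’s easy to demonstrate. The sketch below is purely illustrative (synthetic data, NumPy only, parameters chosen by me, not drawn from Provost and Fawcett): a flexible model fits the noise in the training data almost perfectly, yet does worse on fresh data than a simple one.

```python
# A minimal, illustrative sketch of overfitting: the underlying truth is
# linear, but a high-degree polynomial chases the noise in the sample.
import numpy as np

rng = np.random.default_rng(0)

# Ground truth y = 2x + 1, observed with noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + 1 + rng.normal(0, 0.2, size=10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = 2 * x_test + 1 + rng.normal(0, 0.2, size=10)

def fit_and_score(degree):
    # Least-squares polynomial fit; report mean squared error on both sets.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

train_lo, test_lo = fit_and_score(1)   # simple model: close to the truth
train_hi, test_hi = fit_and_score(7)   # flexible model: memorises the noise
print(f"degree 1: train MSE {train_lo:.4f}, test MSE {test_lo:.4f}")
print(f"degree 7: train MSE {train_hi:.4f}, test MSE {test_hi:.4f}")
```

The degree-7 fit always achieves the lower training error – more parameters can only shrink the least-squares residual – which is exactly why training error alone is such a seductive, and misleading, measure of what you’ve “found” in the data.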

Drew Conway offers up a Venn diagram by way of explanation, with data science at the intersection of hacking skills, mathematical knowledge, and substantive (ie, deep domain) expertise:

Braintribe helps us hack. Its connectors and accesses and models and ‘datapedia’ will tame your organisation’s myriad data, transforming it to meet the needs of your business right now, this moment. I particularly enjoy the direct participation of non-techies in this process.

This is truly liberating when contrasted with today’s norm – ie, living with legacy guesses about what those needs might be, guesses that came a distant second to the technological and systemic constraints of the time, and that took so long to realise that the world had moved on by the time they were delivered.

I’ll finish here with this polemic from an email I just received from Mark Anderson, Future In Review:

Instead of worrying about Big Data because of its size, thought leaders in the design of advanced computer systems need to be focused on the difference this shift brings. Instead of “What size is that database?” – a typical IT question today – the new question will be “What is the data flow rate inside that stream?”

Data is nothing without flow, context, and domain expertise, and its value is suffocated or liberated by org design. We love it, although it has to be said that even our excitement stops just shy of the exclamation in Harvard Business Review – Data Scientist: The Sexiest Job of the 21st Century!
