Tuesday, February 18, 2014

STRUCTURING BIG DATA: Approach to realize the value of Big Data

Big Data defined:

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, storage, search, sharing, transfer, analysis and visualization. Industry analysts articulated the now mainstream definition of big data as the four Vs and a C: Volume, Velocity, Variety, Variability and Complexity.

Big Data Analytics:

Big Data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue.
The primary goal of big data analytics is to help companies make better business decisions by enabling data scientists and other users to analyze huge volumes of transaction data as well as other data sources that may be left untapped by conventional business intelligence programs. These other data sources may include Web server logs and Internet clickstream data, social media activity reports, mobile-phone call detail records and information captured by sensors. 
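As a small illustration of mining one of these sources, the sketch below counts successful requests per page from a few web server log lines. The simplified log format, the sample lines and the function name `count_page_hits` are all assumptions made for this example. A real deployment would run the same kind of aggregation across a cluster rather than in a single process.

```python
from collections import Counter

# Hypothetical sample lines in a simplified "client method path status" format.
SAMPLE_LOG = [
    "10.0.0.1 GET /products/42 200",
    "10.0.0.2 GET /products/42 200",
    "10.0.0.1 GET /cart 200",
    "10.0.0.3 GET /products/7 404",
    "10.0.0.2 GET /cart 200",
    "10.0.0.4 GET /products/42 200",
]


def count_page_hits(lines):
    """Count successful (status 200) requests per path."""
    hits = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed format
        _client, _method, path, status = parts
        if status == "200":
            hits[path] += 1
    return hits


if __name__ == "__main__":
    # Print pages from most to least requested.
    for path, n in count_page_hits(SAMPLE_LOG).most_common():
        print(path, n)
```

Even a count this simple starts to answer a question such as "which products are our customers looking at most?"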

Big Data Management:

Big data management is the organization, administration and governance of large volumes of both structured and unstructured data. The goal of big data management is to ensure a high level of data quality and accessibility for business intelligence and big data analytics applications. Corporations, government agencies and other organizations employ big data management strategies to help them contend with fast-growing pools of data, typically involving many terabytes or even petabytes of information saved in a variety of file formats. Effective big data management helps companies locate valuable information in large sets of unstructured and semi-structured data from a variety of sources, including call detail records, system logs and social media sites. Tools commonly used for big data management include Hadoop (with its MapReduce processing model), Hive, and NoSQL databases such as Cassandra.
Important business questions include:
  • Why are our customers leaving us?
  • What is the value of a 'tweet' or a 'like'?
  • What products are our customers most likely to buy?
  • What is the best way to communicate with our customers?
  • Are our investments in customer service paying off?
  • What is the optimal price for my product right now?
The value of data is only realized through insight. And insight is useless until it’s turned into action. To strike upon insight, you first need to know where to dig. Finding the right questions will lead you to the well.

Big Data as a Service (BDaaS):

Big data as a service (BDaaS) is the delivery of statistical analysis tools or information by an outside provider that helps organizations understand and use insights gained from large information sets in order to gain a competitive advantage. Given the immense amount of unstructured data generated on a regular basis, big data as a service is intended to free up organizational resources by taking advantage of the predictive analytics skills of an outside provider to manage and assess large data sets, rather than hiring in-house staff for those functions.
Big data as a service can take the form of software that assists with data processing or a contract for the services of a team of data scientists. BDaaS is a form of managed services, similar to Software as a Service or Infrastructure as a Service. Big data as a service often relies upon cloud storage to preserve continual data access for the organization that owns the information as well as the provider working with it.

The Challenges:

Many organizations are concerned that the amount of amassed data is becoming so large that it is difficult to find the most valuable pieces of information.
  • What if your data volume gets so large and varied you don't know how to deal with it?
  • Do you store all your data?
  • Do you analyze it all?
  • How can you find out which data points are really important?
  • How can you use it to your best advantage?
Until recently, organizations have been limited to using subsets of their data, or constrained to simplistic analyses, because the sheer volume of data overwhelmed their processing platforms. But what is the point of collecting and storing terabytes of data if you can't analyze it in full context, or if you have to wait hours or days to get results? On the other hand, not all business questions are better answered by bigger data. You now have two choices: incorporate massive data volumes in the analysis, or determine upfront which data is relevant.
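The two choices can be sketched in a few lines. The example below is a minimal illustration, not a recommendation of any particular tool: the record layout (`region`, `amount`), the sample values and the helper names `running_mean` and `relevant` are assumptions invented for this sketch. The first choice streams over every record without holding the data in memory. The second applies a filter upfront and analyzes only what passes it.

```python
def running_mean(values):
    """Compute the mean in one pass, without storing the values."""
    count = 0
    mean = 0.0
    for v in values:
        count += 1
        mean += (v - mean) / count  # incremental update of the mean
    return mean, count


def relevant(records, predicate):
    """Lazily yield only the records judged relevant upfront."""
    return (r for r in records if predicate(r))


# Hypothetical order records; in practice these would be read from storage.
ORDERS = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 20},
    {"region": "EU", "amount": 30},
    {"region": "US", "amount": 40},
]

if __name__ == "__main__":
    # Choice 1: include every record in the analysis.
    all_mean, all_count = running_mean(o["amount"] for o in ORDERS)
    print("all orders:", all_mean, "from", all_count, "records")

    # Choice 2: decide upfront that only EU orders matter for this question.
    eu_orders = relevant(ORDERS, lambda o: o["region"] == "EU")
    eu_mean, eu_count = running_mean(o["amount"] for o in eu_orders)
    print("EU orders:", eu_mean, "from", eu_count, "records")
```

Both paths use the same one-pass computation. The difference is only in how much data reaches it, which is the trade-off described above.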

Final words:

Big data transforms the data management landscape by changing fundamental notions of data governance and IT delivery. Though big data is still in its early stages, its advantages will feed the development of new capabilities in sensing, understanding, and playing an active role in the world over the next 20 years, and will change all walks of life. However, the underlying analytics and the interpretation of results will still require human cognition to connect the dots and see the big picture.
An organization needs a strategic plan to adopt big data technologies. The ability to collect and analyze massive amounts of data will be a key competitive advantage across all industries, including government. Such analytics projects can be complicated, idiosyncratic, and disruptive, so they require a strategic plan to be successful.
It takes time to change a culture of depending only on traditional data analytics. There will be cases of unethical use, abuse or misuse of big data applications as big data analytics and technologies are implemented. Therefore, it is better to be cautious and start small and simple.
It takes the whole of society to implement big data technologies. Big data will affect all of our lives, and collaboration and partnership are essential to making it successful, so we are all responsible for working together on the issues raised by the application of these technologies over the next decade or two.

"Now you can run hundreds and thousands of models at the product level - at the SKU level - because you have the big data and analytics to support those models at that level."
