Friday, May 3, 2013

Big Data for Ecommerce





We live in a world of superabundant data. The retail giant Wal-Mart handles more than one million customer transactions each hour and feeds them into databases estimated at about 2.5 petabytes. Decoding the human genome requires working with 3 billion base pairs. Facebook hosts 40 billion photos, and the number is rising fast [1]. American households were bombarded with 3.6 zettabytes of data in 2008, mainly in the form of video games and television [2]. Data at this scale is given the name “big data”, which is defined as a collection of structured or unstructured data sets so large and complex that traditional data processing applications and database management tools cannot handle it.

Challenges and Solutions of Big Data
The challenges of big data can be summarized by the “4Vs”: volume, velocity, variety and value.
Four "Vs" of Big Data
Volume refers to the size of the data. Most businesses generate far more data than their systems are able to handle [3].

Velocity is the speed at which data flows in and out. This challenge arises when an organization's data analysis or data storage runs slower than the data is generated. For example, a velocity problem can occur when millions of customers click on the organization’s website at the same time or thousands of sales transactions take place every second [3].

Variety refers to the different forms that data takes. This challenge arises from the need to process structured, unstructured and semi-structured data together to produce the desired insights. For example, a company may analyze data from social networks, databases and customer service call records at the same time [3].
 
Value refers to the challenge of deriving useful insights from data, and it is considered the most important of the four Vs. It is easy for a company to collect data, but it is difficult to ask the right questions that get the most value out of it [3].
Big data brings huge headaches to business analytics in terms of storage, processing and management because standard tools and procedures are not designed to handle it [4]. IDC reports that the amount of information has outgrown available storage and projected that the digital universe in 2012 would be ten times its size five years earlier. According to Forrester Research, data at the average organization will increase by 50 percent this year and overall corporate data will grow by 94 percent. The good news is that scale-out architectures have been developed to meet the large storage need, and purpose-built applications have been enhanced to process big data [5]. Organizations collect data from many different instruments, so the same data often appears in more than one data set. To manage big data efficiently, organizations must first get rid of duplicated and synthesized data so that the amount of data to be managed is reduced. Next, organizations should take advantage of virtualization technology so that multiple applications can access and reuse the same data set and the smaller data set can be stored on a storage device [5].
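The deduplication step mentioned above can be illustrated with a minimal Python sketch. It assumes records arrive as small dictionaries and that two records count as duplicates only when all of their fields match exactly. The field names and sample orders are hypothetical, and real systems usually need fuzzier matching than an exact content hash.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Return a stable hash of a record's contents, ignoring key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first occurrence of each distinct record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical records collected from two different systems.
web_orders = [{"order_id": 1, "sku": "A-100", "qty": 2}]
store_orders = [
    {"sku": "A-100", "qty": 2, "order_id": 1},  # same order, keys in another order
    {"order_id": 2, "sku": "B-200", "qty": 1},
]

unique_orders = deduplicate(web_orders + store_orders)
print(len(unique_orders))  # 2
```

Storing only the fingerprints in the `seen` set keeps memory use proportional to the number of distinct records rather than to their size.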
           
Big Data Toolbox
Hadoop Big Data Tool
Apache Hadoop is an open-source framework designed to handle massive amounts of structured, unstructured and semi-structured data. It is built on the MapReduce programming model and the distributed file system design published by Google, through which it spreads data across many machines and lets users run complicated computations over it. Hadoop is known for the following features.
High Scalability: New nodes can be added when needed without changing data formats, data loading methods, or the applications on top [6].
Affordable Cost: Hadoop enables commodity servers to do parallel computing, which results in a sizeable reduction in the cost of storage per terabyte [6].
Flexible: Hadoop can absorb both structured and unstructured data from any number of sources. It can join and aggregate data from multiple sources in arbitrary ways to allow deeper analyses [6].
Fault Tolerant: When a node is lost, the system redirects work to another location of the data and continues processing [6].
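
To give a feel for the MapReduce model that Hadoop builds on, here is a small single-process Python sketch. It does not use Hadoop or any of its APIs. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the "product,amount" line format are assumptions made for illustration only. On a real cluster the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one raw sales line 'product,amount' into a (key, value) pair."""
    product, amount = line.split(",")
    yield product.strip(), float(amount)


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sales log: one "product,amount" entry per line.
sales_lines = ["shoes, 40.0", "hat, 15.5", "shoes, 60.0"]
totals = run_job(sales_lines)
print(totals)  # {'shoes': 100.0, 'hat': 15.5}
```

Because each line is mapped independently and each key is reduced independently, the work can be split across as many machines as are available, which is what makes the scalability described above possible.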
 

Use of Big Data for Ecommerce
Personalization. Data from customers’ purchasing history should be processed in real time to offer each customer an individualized experience, including the items shown for browsing and the deals offered. For example, online retailers may want to treat loyal customers and new customers differently. They can reward existing customers for their loyalty and hold special campaigns to attract new customers [3].
Dynamic pricing. Online retailers can use dynamic pricing to compete on price with other websites. This requires incorporating data from multiple sources, including competitor pricing, regional preferences, product sales, and customer actions, to determine the best price to close the sale. Ecommerce giants like Amazon already have this functionality in place, which gives them a huge competitive advantage [3].
Customer service. Outstanding customer service is a critical contributor to the success of an ecommerce site. Zappos and Netflix are excellent examples of companies that succeeded through good customer service. However, big data has made customer service more demanding, because every interaction with a customer is now expected to inform how that same customer is served. To continue to excel at customer service, ecommerce websites need to overcome this challenge. For example, if a customer complains via the online chat on the website and also tweets about it, the company should be aware of both complaints when the customer calls customer service. This helps the customer feel listened to and valued [3].
Predictive analytics. Analytics is crucial for all online retailers, regardless of size. Without analytics it is difficult to sustain a business. Big data has helped businesses anticipate events before they occur, which is called "predictive analytics." Predictive analytics is becoming an important tool for many businesses. A good example is predicting the revenue from a certain product in the next quarter. Knowing this, a merchant can better manage its inventory costs and avoid running out of stock on key products [3].
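
As a rough illustration of the revenue prediction example, the Python sketch below fits a straight-line trend to past quarterly revenue with ordinary least squares and extends it one quarter ahead. The revenue figures are made up, and a linear trend is only one simple assumption. Real predictive analytics would normally account for seasonality, promotions and many other signals.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * t by ordinary least squares, t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    variance = sum((t - mean_t) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = mean_y - slope * mean_t
    return intercept, slope


def forecast_next(values):
    """Extend the fitted trend one step past the last observation."""
    intercept, slope = fit_linear_trend(values)
    return intercept + slope * len(values)


# Hypothetical revenue for one product over the last four quarters.
quarterly_revenue = [100.0, 110.0, 120.0, 130.0]
next_quarter = forecast_next(quarterly_revenue)
print(round(next_quarter, 2))  # 140.0
```

A merchant could compare such a forecast against current stock levels to decide how much inventory to order for the coming quarter.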

Conclusion
To conclude, although big data sounds overwhelming and mysterious, it is manageable with the right tools. Organizations that make good use of big data will benefit enormously, because big data is an unavoidable trend.

References
  1. http://www.economist.com/node/15557443
  2. http://www.economist.com/node/15557421
  3. http://www.practicalecommerce.com/articles/3960-6-Uses-of-Big-Data-for-Online-Retailers
  4. http://dauofu.blogspot.com/2013/01/big-data-why-students-need-to-know.html
  5. http://www.forbes.com/sites/ciocentral/2012/07/05/best-practices-for-managing-big-data/
  6. http://www-01.ibm.com/software/data/infosphere/hadoop/
  7. http://www.icc-usa.com/insights/hadoop-continued-emergence/