Saturday 29 April 2017

5 ways to improve the model accuracy of Machine Learning!!

Today we are in the digital age. Many businesses use big data and machine learning to target users with messaging in a language they really understand, and to push offers, deals and ads that appeal to them across a range of channels.

With exponential growth in data from people and the Internet of Things, a key to survival is to use machine learning to make that data more meaningful and more relevant, and so enrich the customer experience.

Machine learning can also wreak havoc on a business if improperly implemented. Before embracing this technology, enterprises should be aware of the ways machine learning can fall flat. Data scientists have to take extreme care while developing these machine learning models so that they generate the right insights for the business to consume.

Here are 5 ways to improve the accuracy and predictive ability of a machine learning model and ensure it produces better results.

·       Ensure that you have a variety of data that covers almost all scenarios and is not biased towards any one situation. There were news reports in the early Pokémon Go days that the game favored white neighborhoods. Reportedly this was because the creators of the algorithms failed to provide a diverse training set and didn't spend time in the neighborhoods that were left out. Instead of working with limited data, ask for more data. That usually improves the accuracy of the model.

·       The data received often has missing values. Data scientists have to treat outliers and missing values properly to increase accuracy. There are multiple methods to do that: impute the mean, median or mode for continuous variables, and for categorical variables use a separate class. For outliers, either delete them or apply a transformation (for example, a log transform or capping).

·       Finding the right variables or features, the ones with maximum impact on the outcome, is one of the key aspects. This comes from better domain knowledge and from visualizations. It’s imperative to consider as many relevant variables and potential outcomes as possible before deploying a machine learning algorithm.

·       Ensembling combines multiple models to improve accuracy, using techniques such as bagging and boosting. An ensemble can often predict better than any of its individual models. Random forests are a commonly used ensemble.

·       Re-validate the model at a regular frequency. Score the model with new data every day, week or month, depending on how quickly the data changes. If required, rebuild the models periodically with different techniques to challenge the model in production.
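The missing-value and outlier treatment from the second point above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a recommended pipeline: the `ages` data, the function names, the choice of median imputation and the 1.5 x IQR capping rule are all assumptions made for the example.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def cap_outliers(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to those bounds."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical customer ages: two missing values and one suspicious entry (95)
ages = [23, 25, None, 27, 24, None, 26, 95]

filled = impute_median(ages)   # the None entries become the median of the rest
capped = cap_outliers(filled)  # 95 is pulled back to the upper bound

print(filled)
print(capped)
```

Capping keeps the row in the data set, which matters when data is scarce; deleting the row is the simpler alternative mentioned above.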

There are some more ways but the ones mentioned above are foundational steps to ensure model accuracy.

Machine learning puts a superpower in the hands of an organization, but as the Spider-Man movie says, “With great power comes great responsibility”, so use it properly.


Saturday 22 April 2017

Beyond SMAC – Digital twister of disruption!!

Have you seen the 1996 movie Twister, about tornadoes disrupting neighborhoods? It shows a group of people trying to perfect a device called Dorothy, which holds hundreds of sensors to be released into the center of a twister so that proper data can be collected to create a more advanced warning system and save people.

Apply the same analogy today: digital is disrupting every business, and if you stand still and don’t adapt you will become a digital dinosaur. Everyone wants that advance warning of what is coming.

Even if your business is doing well right now, you never know who will disrupt you tomorrow.

We have seen these waves of disruption and innovation in technology: the mainframe era, the minicomputer era, the personal computer and client-server era, and the internet era. Then came the 5th wave, the SMAC era, comprising Social, Mobile, Analytics and Cloud technologies.

Gone are the days when we used to wait for vacations to meet our families and friends by travelling to our native place or abroad. Today all of us interact with each other on social media such as Facebook, WhatsApp, Instagram and Snapchat rather than in person.

Mobile enablement has given us anytime, anywhere, any-device interaction with consumers. We look at our smartphone screens more than 200 times a day.

Analytics came in to power hyper-personalization in each interaction and to send relevant offers and communications to customers. Descriptive analytics gave the power to know what is happening in the business right now, while predictive analytics gave insight into what may happen. Going further, prescriptive analytics gave the foresight of what actions to take to make things happen.

Cloud gave organizations the ability to scale up quickly and at lower cost as computing requirements grow, including on secure private clouds.

Today we are in the 6th wave of disruption, beyond the SMAC era and into Digital Transformation, which brings Big Data, the Internet of Things, APIs, Microservices, Robotics, 3D printing, augmented reality/virtual reality, wearables, drones, beacons and blockchain.

Big Data platforms make it possible to store the huge volumes of data being generated and to use them for a competitive edge.

The Internet of Things allows machines, computers and smart devices to communicate with each other and helps us carry out various tasks remotely.

APIs are getting a lot of attention because they are easy to use, lightweight, pluggable into virtually any system and highly customizable, which keeps data flowing between disparate systems.

Microservices are small, modular services that are developed and deployed independently.

Robotics is bringing a wave of intelligent automation with the help of cognitive computing.

3D printing, or additive manufacturing, is taking several industries such as medical, military, engineering and manufacturing by storm.

Augmented reality / virtual reality is changing travel, real estate and education.

Wearables such as smart watches, health trackers and Google Glass can deliver real-time updates, support better health and enable hands-free process optimization in areas like item picking in a warehouse.

Drones have come out of the military zone and are available for common use. Amazon and Domino's are using them for delivery, while insurance and agriculture companies are using them for aerial surveys.

Beacons are revolutionizing the customer experience with in-store analytics, proximity marketing, indoor navigation and contactless payments.

The new kid on the block is blockchain, and the finance industry is all set to take advantage of this technology.

As products and services get more digitized, traditional business processes, business models and even businesses themselves are getting disrupted.

The only way to survive this twister is to get closer to your customers by offering a radically different way of doing business that’s faster, simpler and cheaper.

Saturday 15 April 2017

A to Z of Analytics

Analytics has taken the world by storm, and it is the powerhouse behind the digital transformation happening in every industry.

Today everybody is generating tons of data: we as consumers leave digital footprints on social media, IoT generates millions of records from sensors, and mobile phones are in use from morning until we sleep. All these varieties of data formats are stored in Big Data platforms. But only storing this data is not going to take us anywhere unless analytics is applied to it. Hence it is extremely important to close the loop with analytics insights.

Here is my version of A to Z for Analytics:

Artificial Intelligence: AI is the capability of a machine to imitate intelligent human behavior. BMW, Tesla and Google are using AI for self-driving cars. AI should be used to solve tough real-world problems, from climate modeling to disease analysis, for the betterment of humanity.

Boosting and Bagging: these are techniques used to generate more accurate models by ensembling multiple models together.

CRISP-DM: the Cross-Industry Standard Process for Data Mining. It was developed by a consortium of companies including SPSS, Teradata, Daimler and NCR Corporation in 1997 to bring order to the development of analytics models. The 6 major steps are business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

Data preparation: In analytics deployments, more than 60% of the time is spent on data preparation. The normal rule is garbage in, garbage out. Hence it is important to cleanse and normalize the data and make it available for consumption by the model.

Ensembling: is the technique of combining two or more algorithms to get more robust predictions. It is like combining all the marks we obtain in exams to arrive at the final overall score. Random Forest is one such example combining multiple decision trees.
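A toy illustration of the idea, assuming the simplest form of ensembling, a majority vote. The three rule functions and the applicant record are hypothetical stand-ins for real trained models; a Random Forest does something similar with many decision trees.

```python
from collections import Counter

def majority_vote(models, x):
    """Ask every model for a prediction and return the most common answer."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Three hypothetical weak rules for flagging a loan application
def by_income(a):
    return "risky" if a["income"] < 30_000 else "safe"

def by_debt(a):
    return "risky" if a["debt_ratio"] > 0.4 else "safe"

def by_history(a):
    return "risky" if a["late_payments"] >= 2 else "safe"

applicant = {"income": 45_000, "debt_ratio": 0.55, "late_payments": 3}
decision = majority_vote([by_income, by_debt, by_history], applicant)
print(decision)  # two of the three rules say "risky"
```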

Feature selection: Simply put, this means selecting only those features or variables from the data which really make sense and removing irrelevant variables. This usually lifts the model accuracy.

Gini Coefficient: it measures the predictive power of a model and is typically used in credit scoring tools to separate those who will repay a loan from those who will default.
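In credit scoring the Gini coefficient is commonly computed from the area under the ROC curve as Gini = 2 x AUC - 1. Here is a small sketch under that definition; the labels and risk scores are made-up data, and the pairwise AUC calculation is chosen for readability rather than speed.

```python
def gini_coefficient(labels, scores):
    """Gini = 2 * AUC - 1.

    AUC is the share of (defaulter, repayer) pairs in which the defaulter
    received the higher risk score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]  # defaulters
    neg = [s for y, s in zip(labels, scores) if y == 0]  # repayers
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1

# Hypothetical data: 1 = defaulted, 0 = repaid
defaulted = [1, 1, 0, 1, 0, 0]
risk_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

gini = gini_coefficient(defaulted, risk_score)
print(round(gini, 3))
```

A Gini of 1 means the scores separate defaulters from repayers perfectly, while 0 means the model is no better than random.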

Histogram: This is a graphical representation of the distribution of a set of numeric data, usually a vertical bar graph, used in the exploratory analytics and data preparation steps.

Independent Variable: is the variable that is changed or controlled in a scientific experiment to test the effects on the dependent variable, such as the effect of increasing the price on sales.

Jubatus: This is an online Machine Learning library covering classification, regression, recommendation (nearest neighbor search), graph mining, anomaly detection and clustering.

KNN: the k-nearest neighbors algorithm in Machine Learning, used for classification problems, which labels a data point based on its distance or similarity to other data points.
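A minimal sketch of KNN classification, assuming Euclidean distance and a simple majority vote among the k closest training points. The training data and the "low"/"high" labels are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs.

    Returns the most common label among the k training points
    closest to `query` by Euclidean distance.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (visits per month, items per visit) -> spend segment
train = [
    ((1, 1), "low"), ((1, 2), "low"), ((2, 1), "low"),
    ((8, 8), "high"), ((8, 9), "high"), ((9, 8), "high"),
]

print(knn_predict(train, (2, 2)))
print(knn_predict(train, (7, 9)))
```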

Lift Chart: These are widely used in campaign targeting problems to determine which deciles of customers to target for a specific campaign. A lift chart also tells you how much response you can expect from the new target base.
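The numbers behind a lift chart can be sketched as follows: rank customers by model score, cut them into ten equal groups, and divide each group's response rate by the overall response rate. The 20 customers and their responses are made up, and the function assumes the customer count divides evenly by the number of groups.

```python
def decile_lift(scores, responded, bins=10):
    """Return the lift of each score-ranked group, best-scored group first.

    Lift = response rate inside the group / overall response rate.
    Assumes len(scores) is a multiple of `bins`.
    """
    ranked = [
        r for _, r in sorted(zip(scores, responded),
                             key=lambda pair: pair[0], reverse=True)
    ]
    overall = sum(ranked) / len(ranked)
    size = len(ranked) // bins
    lifts = []
    for i in range(bins):
        group = ranked[i * size:(i + 1) * size]
        lifts.append((sum(group) / len(group)) / overall)
    return lifts

# Hypothetical campaign: 20 customers, 4 of whom responded (1 = responded)
scores = [i / 20 for i in range(20, 0, -1)]
responded = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

lifts = decile_lift(scores, responded)
print([round(x, 2) for x in lifts])
```

A lift well above 1 in the top deciles means that targeting only those customers should give a much better response rate than mailing everyone.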

Model: There are 50+ modeling techniques, such as regression, decision trees, SVM, GLM and neural networks, available in technology platforms like SAS Enterprise Miner, IBM SPSS or R. They are broadly categorized as supervised or unsupervised methods, covering classification, clustering and association rules.

Neural Networks: These are typically organized in layers made up of nodes and mimic the way the brain learns. Today Deep Learning is an emerging field based on deep neural networks.
 
Optimization: It is the use of simulation techniques to identify the scenarios that will produce the best results within the available constraints, e.g. sale price optimization, or identifying the optimal inventory for maximum fulfillment while avoiding stockouts.

PMML: it stands for Predictive Model Markup Language, an XML-based file format developed by the Data Mining Group to transfer models between various technology platforms.

Quartile: dividing the sorted output of the model into 4 equal-sized groups for further action.

R: Today universities and even corporates are using R for statistical model building. It is freely available, and there are licensed versions like Microsoft R. More than 7000 packages are now at the disposal of data scientists.

Sentiment Analytics: the process of determining whether information or a service provided by a business leads to positive, negative or neutral human feelings or opinions. Many consumer product companies measure sentiment 24/7 and adjust their marketing strategies accordingly.

Text Analytics: It is used to discover and extract meaningful patterns and relationships from collections of text such as social media sites (Facebook, Twitter, LinkedIn), blogs and call center scripts.

Unsupervised Learning: These are algorithms that are given only input data, with no labeled outcome, and are expected to find patterns in it. Clustering and association algorithms like k-means and Apriori are the best-known examples.

Visualization: It is the method of enhanced exploratory data analysis & showing the output of modeling results with highly interactive statistical graphics. Any model output has to be presented to senior management in the most compelling way. Tableau, QlikView, Spotfire are leading visualization tools.

What-If analysis: It is the method of simulating various business scenarios to answer questions like: what if we increased our marketing budget by 20%, what would the impact on sales be? Monte Carlo simulation is a very popular way to do this.
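The marketing budget question can be sketched as a tiny Monte Carlo simulation. Everything numeric here is an assumption made for illustration: the base sales figure, the budget values, and above all the guess that each unit of budget returns somewhere between 1.5 and 2.5 units of sales.

```python
import random
import statistics

def simulate_mean_sales(budget, trials=10_000, seed=42):
    """Average simulated sales for a given marketing budget."""
    rng = random.Random(seed)  # fixed seed so both scenarios see the same draws
    base_sales = 500.0         # hypothetical sales with no marketing at all
    outcomes = []
    for _ in range(trials):
        # Assumed uncertainty: sales returned per unit of marketing budget
        return_per_unit = rng.uniform(1.5, 2.5)
        outcomes.append(base_sales + return_per_unit * budget)
    return statistics.mean(outcomes)

current = simulate_mean_sales(budget=100.0)
what_if = simulate_mean_sales(budget=120.0)  # budget increased by 20%
uplift = what_if - current

print(f"Expected extra sales from a 20% higher budget: {uplift:.1f}")
```

Using the same seed for both scenarios means the difference between them comes from the budget change alone and not from random noise.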

What do you think should come for X, Y, Z?

Saturday 8 April 2017

From Bullock Cart to Hyperloop – Digital Transformation of Travel

Remember when you were a teenager and wanted to go on vacation with your parents? You were asked to go to the travel agent and get all the printed brochures of exotic locations.

Then came the dot-com wave, and online booking sites like Expedia, Travelocity and MakeMyTrip took travel agencies out of the equation.

We used to send holiday postcards to our friends and families back home; these have all but disappeared thanks to social media posts on Facebook and Instagram.

Lonely Planet used to be the traveler’s bible, but now we go to tons of websites like TripAdvisor and Priceline, which provide us with advice and reviews on hotels, tours and restaurants.

Now I am able to book my flight online, have my boarding pass on my phone, check in with machines, go through automated clearance gates and even validate my boarding pass to board the plane.

The travel industry, like many others, is being disrupted by great ideas powered by digital technology and innovation.

Some of the digital innovations the travel industry has adopted so far:
·     Online booking sites like Expedia, Travelocity, MakeMyTrip, Trivago
·     Mobile optimization with Wi-Fi enablement
·     Targeting and hyper-personalization with Big Data Analytics
·     Digital discounts on travel by Kayak, TripAdvisor
·     Smartphones for researching vacations, deals and feedback
·     Wearables like the Disney MagicBand for payments, room keys
·     Bluetooth beacons to guide travelers in the vicinity at airports
·     Virtual reality – see the places without even getting out of home

All these digital footprints of customers are collected and then analyzed with big data analytics to hyper-personalize the experience.

With extensively networked digital properties and deep hooks into customer data collected via travel booking sites and social media channels, travel companies are delivering customized dream vacations according to the likes and preferences of today’s travelers.

Today’s trend is towards spending money on memories & experiences instead of material possessions.

Accordingly, travel companies are investing in their digital storefronts and omni-channels to keep today’s hyper-connected travelers snapping, sharing, researching and reviewing on the fly – leaving immense data footprints for marketers to leverage.

Bluesmart is a high-quality carry-on suitcase that you can control from your phone. From the app you can lock and unlock it, weigh it, track its location, be notified if you are leaving it behind and find out more about your travel habits.

Thomas Cook have introduced virtual reality experiences across select stores.

Digital disrupters like Airbnb have already put tremendous pressure on hotels.

Starwood Hotels has launched “Let’s chat”, enabling guests to communicate with its front desk associates via WhatsApp, BlackBerry Messenger or iPhone before or during their stay.

The world has gone from the bullock cart to the Hyperloop. The future will belong to those using data-based intelligence to offer better experiences, encourage longer, more exotic and more frequent stays, and build long-term loyalty.

Do you want to get on the digital bandwagon?



Saturday 1 April 2017

Digital Transformation in Manufacturing

Manufacturing companies have traditionally been slow to react to the advent of digital technologies like intelligent robots, drones, sensor technology, artificial intelligence, nanotechnology and 3D printing.

Industry 4.0 has changed manufacturing. At a high-level, Industry 4.0 represents the vision of the interconnected factory where all equipment is online, and in some way, is also intelligent and capable of making its own decisions.

The explosion in connected devices and platforms, the abundance of data from field devices and the rapidly changing technology landscape have made it imperative for companies to quickly adapt their products and services and move from the physical world to a digital world.

Today, manufacturing is transforming from mass production to mass customization. Not only must the right products be delivered to the right person at the right price, the process by which products are designed and delivered must now reach a new level of sophistication.

The first step in digitization is to analyze the current state of all systems, starting from R&D through procurement, production, warehousing, logistics, marketing, sales and service.

The digitization of manufacturing impacts every aspect of operations and the supply chain. It starts with equipment design, and continues through product design, production process improvement and, ultimately, monitoring and improving the end-user experience.

Digital transformation revolutionizes the way manufacturers share and manage product and engineering designs and specs on the cloud, collaborating across geographies.

Downtime and reliability are critical when it comes to operating equipment and machines on a shop floor. With Big Data analytics, quick and easy access to operational data, production information, inventory and quality data gives the ability to adjust quickly to machine status across the enterprise.

Quality and yield are directly related to manufacturing processes: how raw materials are used, inspected and manufactured, and how everything comes together. This really determines the quality level of the products. Cognitive computing enables earlier identification of nascent quality problems, increased production yield, and a reduction in the problems that lead to service and warranty costs.

Implementing smarter resource and supply chain optimization strategies helps to improve cost efficiency in areas such as energy consumption, worker safety and employee resource efficiency.

Service excellence is also an important part of the strategy that companies are using to achieve digital transformation in the manufacturing space. Connected devices (IoT) are changing the paradigm of delivering after-sales service. Some of the advantages, most prevalent in selected industries such as industrial equipment, power generation and HVAC, are:
·       Push Service Notifications
      o   How is your asset health?
      o   How is your asset usage?
·       Predictive / Preventive Maintenance
·       Break-Down Assistance
·       Usage-based Billing
·       Spares Fulfillment

General Electric’s jet engines combine cloud-based services, analytics and on-line sensors to report usage and status and help predict potential failures. The result is improved uptime and lower cost of ownership.

Additive manufacturing (3D printing) for prototyping helps shorten the iteration cycles in the design process and helps to turn innovation into value. 3D printing is also quickly gaining ground in the commercial manufacturing of customized products in low volumes.

Smart machines are being integrated with forklifts, storage shelves and production equipment. These machines are able to make autonomous decisions and communicate with each other to drive material replenishment, trigger manufacturing and much more.

Industry 4.0 allows manufacturers to run more flexible manufacturing processes that can react better to customer demands.

