Machine Learning: For Beginners
Introduction to Machine Learning
Machine Learning works in a way similar to human learning. For example, a child who is shown images of specific objects can learn to identify and differentiate between them. Machine Learning works the same way: through data input and certain commands, the computer is enabled to “learn” to identify certain things (people, objects, etc.) and to distinguish between them.
Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to do so.
When Should You Use Machine Learning
Consider using machine learning when you have a complex task or problem involving a large amount of data and lots of variables, but no existing formula or equation.
There are many examples of ML in use:
- Prediction: ML can be used in prediction systems. In a loan application, for example, the system needs to classify the available data into groups in order to compute the probability of a default.
- Image recognition: ML can be used for face detection in an image. In a database of several people, there is a separate category for each person.
- Speech recognition: This is the translation of spoken words into text. It is used in voice search and more. Voice user interfaces include voice dialing, call routing, and appliance control. It can also be used for simple data entry and the preparation of structured documents.
- Medical diagnoses: ML models can be trained to recognize cancerous tissues.
- Financial industry and trading: Companies use ML in fraud investigations and credit checks.
According to Arthur Samuel, “Machine Learning algorithms enable the computers to learn from data, and even improve themselves, without being explicitly programmed.”
Types of Machine Learning
Machine learning uses three types of techniques: supervised learning, which trains a model on known input and output data so that it can predict future outputs; unsupervised learning, which finds hidden patterns or intrinsic structures in input data; and reinforcement learning (RL), which is concerned with how software agents ought to take actions in an environment in order to maximize a notion of cumulative reward.
Supervised Learning
In machine learning and artificial intelligence, supervised learning refers to a class of systems and algorithms that determine a predictive model using data points with known outcomes. The model is learned by training through an appropriate learning algorithm (such as linear regression, random forests, or neural networks) that typically works through some optimization routine to minimize a loss or error function.
Supervised machine learning builds a model that makes predictions based on evidence in the presence of uncertainty. A supervised learning algorithm takes a known set of input data and known responses to the data (output) and trains a model to generate reasonable predictions for the response to new data. Use supervised learning if you have known data for the output you are trying to predict.
Put another way, supervised learning is the process of teaching a model by feeding it input data as well as the correct output data. Such input/output pairs are usually referred to as “labeled data.” Think of a teacher who, knowing the correct answer, will either award marks to or take marks from a student based on the correctness of the student’s response to a question. Supervised learning is often used to create machine learning models for two types of problems: classification and regression.
In the spam filtering example, we first take some emails and mark each one as ‘Spam’ or ‘Not Spam’. This labeled data is then used to train the supervised model.
Once the model is trained, we can test it on new emails and check whether it predicts the right output.
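The spam example can be sketched in a few lines of Python. This is only a minimal illustration, assuming a tiny naive Bayes word-count classifier; the messages in TRAINING_DATA and the function names train and predict are made up for this sketch and are not how a production spam filter works.

```python
import math
from collections import Counter

# Hypothetical labeled training data: (message, label) pairs.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch with the project team", "not spam"),
    ("please review the project report", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}          # label -> Counter of words
    label_counts = Counter()  # label -> number of messages
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing avoids log(0) for words unseen under this label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
# Test the trained model on new, unseen messages.
print(predict(model, "claim your free prize"))
print(predict(model, "project meeting on monday"))
```

The training step only sees labeled messages, and the test step checks the model on messages it has never seen, which mirrors the train-then-test flow described above.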
Types of Supervised learning
- Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” or “no disease”. Classification techniques predict discrete responses, for example whether an email is genuine or spam, or whether a tumor is cancerous or benign. Classification models classify input data into categories. Typical applications include medical imaging, speech recognition, and credit scoring. Use classification if your data can be tagged, categorized, or separated into specific groups or classes.
- Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”. Regression techniques predict continuous responses, for example changes in temperature or fluctuations in power demand. Typical applications include electricity load forecasting and algorithmic trading. Use regression techniques if you are working with a data range or if the nature of your response is a real number, such as temperature or the time until failure for a piece of equipment.
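To make the regression case concrete, here is a minimal sketch that fits a straight line to a handful of points with ordinary least squares and then predicts a continuous value. The temperature readings and the helper name fit_line are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: hour of the morning vs. temperature.
hours = [1, 2, 3, 4, 5]
temps = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(hours, temps)

# Predict a continuous response for an input the model has not seen.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

Unlike the classifier, the output here is a real number rather than a category, which is the distinction the two bullets above draw.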
Unsupervised Learning
Unsupervised machine learning algorithms infer patterns from a dataset without reference to known, or labeled, outcomes. Unlike supervised machine learning, unsupervised machine learning methods cannot be directly applied to a regression or a classification problem because you have no idea what the values for the output data might be, making it impossible for you to train the algorithm the way you normally would. Unsupervised learning can instead be used to discover the underlying structure of the data.
Furthermore, unsupervised machine learning aims to uncover previously unknown patterns in data, but these patterns are often poor approximations of what supervised machine learning can achieve. Additionally, since you do not know what the outcomes should be, it is hard to determine how accurate the results are, which makes supervised machine learning more applicable to many real-world problems.
In unsupervised learning, an AI system is presented with unlabeled, uncategorized data, and the system’s algorithms act on the data without prior training. The output depends on the coded algorithms. Subjecting a system to unsupervised learning is one way of testing AI.
In the ducks example, the data given to our model contains two kinds of characters, ‘Ducks’ and ‘Not Ducks’, but the training data carries no labels. The unsupervised model separates the two kinds by looking at the data itself, modeling its underlying structure or distribution in order to learn more about it.
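The idea of separating unlabeled data into groups can be sketched with a very small k-means loop. This is an illustrative sketch under simplifying assumptions: the data is one-dimensional, there are exactly two clusters, and the body weights and the function name k_means_1d are invented for the example.

```python
def k_means_1d(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop."""
    # Deterministic start: use the smallest and largest value as centers.
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to its nearest center.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Move each center to the mean of the values assigned to it.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data: body weights in kg of two kinds of animals.
weights = [1.0, 1.2, 0.9, 4.8, 5.1, 5.0]

centers, clusters = k_means_1d(weights)
print(centers)
print(clusters)
```

No labels are passed in at any point; the two groups emerge only from how close the values are to each other.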
Types of Unsupervised learning
- Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior. Clustering allows you to automatically split the dataset into groups according to similarity. However, cluster analysis can overestimate the similarity between members of a group and does not treat data points as individuals. For this reason, it can be a poor choice when applications like customer segmentation and targeting depend on individual differences.
- Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as “people who buy X also tend to buy Y”. Association rules let you establish associations amongst data objects inside large databases. This unsupervised technique is about discovering interesting relationships between variables in large databases. For example, people who buy a new home are likely to buy new furniture.
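A rule such as “people who buy X also tend to buy Y” is usually judged by two simple numbers, support and confidence. The sketch below computes both on a made-up list of shopping baskets; the basket contents and the function names support and confidence are assumptions for illustration only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate how often the consequent appears when the antecedent does."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets, one set of items per purchase.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here the rule “bread implies butter” covers half of all baskets, and butter appears in two of the three baskets that contain bread.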
Reinforcement Learning
A reinforcement learning algorithm, or agent, learns by interacting with its environment. The agent receives rewards for performing correctly and penalties for performing incorrectly. The agent learns without intervention from a human by maximizing its reward and minimizing its penalty. It is closely related to dynamic programming and trains algorithms using a system of reward and punishment.
In the fire and water example, the agent is given two options: a path with water or a path with fire. A reinforcement algorithm works on a reward system: if the agent takes the fire path, points are subtracted from its reward, and the agent learns that it should avoid the fire path. If it chooses the water path, the safe path, points are added to its reward. Over time, the agent learns which path is safe and which is not.
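The fire and water scenario can be sketched as a tiny agent that keeps a running value estimate for each path and mostly picks the one that looks best. This is a simplified illustration, assuming fixed rewards of +1 and -1 and an epsilon-greedy choice rule; the names REWARDS and train_agent and all parameter values are invented for the sketch.

```python
import random

# Hypothetical rewards: the safe path adds a point, the fire path removes one.
REWARDS = {"water": 1.0, "fire": -1.0}


def train_agent(episodes=200, epsilon=0.2, learning_rate=0.1, seed=0):
    """Learn a value estimate for each path from repeated trial and error."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random path.
            action = rng.choice(list(REWARDS))
        else:
            # Exploit: take the path with the highest estimated value.
            action = max(values, key=values.get)
        reward = REWARDS[action]
        # Nudge the estimate for the chosen path toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


values = train_agent()
best_path = max(values, key=values.get)
print(best_path, values)
```

Nobody tells the agent which path is safe; it works that out from the rewards and penalties it collects.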
How Do You Decide Which Machine Learning Algorithm to Use?
Choosing the right algorithm can seem overwhelming. There are dozens of supervised and unsupervised machine learning algorithms, and each takes a different approach to learning.
There is no best method and no one-size-fits-all solution. Finding the right algorithm is partly just trial and error; even highly experienced data scientists can’t tell whether an algorithm will work without trying it out. But algorithm selection also depends on the size and type of data you’re working with, the insights you want to get from the data, and how those insights will be used.
- Choose supervised learning if you need to train a model to make a prediction (for example, the future value of a continuous variable such as temperature or a stock price) or a classification (for example, identifying makes of cars from webcam video footage).
- Choose unsupervised learning if you need to explore your data and want to train a model to find a good internal representation, such as splitting data up into clusters.
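The trial-and-error part of algorithm selection often comes down to scoring each candidate on data it was not trained on and keeping the better one. The sketch below compares two very simple candidates this way; the data, the two candidate models, and the function names are assumptions chosen only to illustrate the idea.

```python
def mean_squared_error(predict, xs, ys):
    """Average squared gap between predictions and true values."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, split into a training set and a held-out test set.
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
test_x, test_y = [5, 6], [10.1, 11.9]

# Candidate 1: always predict the mean of the training outputs.
mean_y = sum(train_y) / len(train_y)


def baseline(x):
    return mean_y


# Candidate 2: a straight line fitted by ordinary least squares.
mean_x = sum(train_x) / len(train_x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x


def line(x):
    return slope * x + intercept


# Score both candidates on data neither has seen, then keep the better one.
scores = {
    "baseline": mean_squared_error(baseline, test_x, test_y),
    "line": mean_squared_error(line, test_x, test_y),
}
best = min(scores, key=scores.get)
print(best, scores)
```

The same loop extends to any number of candidates, which is why trying several algorithms on held-out data is a common first step.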
Advantages of Machine Learning
Continuous Improvement
Machine Learning algorithms are capable of learning from the data we provide. As new data is provided, the model’s accuracy and efficiency in making decisions improve with subsequent training. Giants like Amazon and Walmart collect a huge volume of new data every day. The accuracy of their recommendation engines, and of finding associated products, improves with this huge amount of available training data.
Automation for everything
A very powerful use of Machine Learning is its ability to automate various decision-making tasks. This frees up a lot of developers’ time for more productive work. Common examples from daily life are social media sentiment analysis and chatbots. The moment a negative tweet is posted about a company’s product or service, a chatbot can instantly reply as first-level customer support. Machine Learning is changing the world by automating almost everything we can think of.
Trends and patterns identification
This advantage is a no-brainer. Anyone interested in Machine Learning knows how the various supervised, unsupervised and reinforcement learning algorithms can be applied to problems such as classification, regression and clustering. With this technology, we can identify trends and patterns in huge amounts of data. For example, Amazon analyzes the buying patterns and search trends of its customers and predicts products for them using Machine Learning algorithms.
Wide range of applications
Machine Learning is used in almost every industry these days, from defence to education. Companies use it to generate profits, cut costs, automate tasks, make predictions, analyze trends and patterns in past data, and much more. Applications such as GPS tracking for traffic, email spam filtering, text prediction, and spell checking and correction are a few that are widely used today.
Machine Learning: The technology leaders
In addition to Microsoft, Google, Facebook, IBM and Amazon, Apple also spends enormous financial resources on the use and further development of Machine Learning. IBM’s Watson supercomputer is still one of the best-known applications of Machine Learning and is mainly used in the medical and financial sectors. Facebook uses Machine Learning for image recognition, Microsoft for its speech recognition system Cortana, and Apple for Siri. Of course, Machine Learning is also used at Google, both in image services and in search engine ranking.
Cloud providers such as Google, Microsoft, Amazon Web Services and IBM have now created services for Machine Learning. With their help, developers who do not have specific Machine Learning knowledge can also develop applications. These applications are able to learn from a freely definable set of data. Depending on the provider, these platforms have different names:
- IBM: Watson
- Amazon: Amazon Machine Learning
- Microsoft: Azure ML Studio
- Google: TensorFlow
Summary
In this blog, I have presented the basic concepts of Machine Learning. I hope it was helpful and has motivated you to explore the topic further.