One of the simplest and most effective methods for classification problems is Naive Bayes. Despite its simplicity, it is also a powerful classifier. So, in this blog let's discover the Naive Bayes algorithm for machine learning.
Why Learn Naive Bayes?
- It is very fast and easy to implement.
- It needs less training data.
- It can make probabilistic predictions.
- It can handle continuous and discrete data.
- It is a highly scalable algorithm.
- It is light to train as no complicated optimization is required.
- It also has a small memory footprint.
- It is easily updateable if new training data is received.
What is the Naive Bayes algorithm?
Naive Bayes is a probabilistic classification method based on Bayes' Theorem. Bayes' Theorem defines the relationship between the probabilities of two events and their conditional probabilities.
Bayes' Theorem: P(H|D) = [P(D|H) * P(H)] / P(D)
Note: H is the hypothesis and D is the data.
Here in this theorem,
- P(H|D) = Probability of hypothesis H given the data D is known as the posterior probability.
- P(D|H) = Probability of data D given that the hypothesis H was true.
- P(H) = Probability of hypothesis H being true.
- P(D) = Probability of the data.
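To make the theorem concrete, here is a small worked example in Python. The numbers (spam rate, word frequencies) are invented for illustration: we compute the posterior probability that an e-mail is spam given that it contains a particular word.

```python
# Hypothetical numbers for a spam filter, used only to illustrate Bayes' Theorem.
# Assume 20% of all e-mails are spam, the word "offer" appears in 60% of spam
# e-mails and in 10% of non-spam e-mails.

p_h = 0.20              # P(H): prior probability the e-mail is spam
p_d_given_h = 0.60      # P(D|H): probability of seeing "offer" in a spam e-mail
p_d_given_not_h = 0.10  # probability of seeing "offer" in a non-spam e-mail

# P(D): total probability of the data, by the law of total probability
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Posterior P(H|D) = P(D|H) * P(H) / P(D)
p_h_given_d = p_d_given_h * p_h / p_d
print(round(p_h_given_d, 2))  # 0.6
```

So seeing the word "offer" raises the probability of spam from the 20% prior to a 60% posterior.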
Applications of Naive Bayes Algorithm
- Real-time Prediction: Since Naive Bayes is super fast, it can be used for making predictions in real time.
- Multi-Class Prediction: This algorithm is well-known for its multi-class prediction feature as it can predict the posterior probability of multiple classes of the target variable.
- Text Classification / Sentiment Analysis / Spam Filtering: Naive Bayes classifiers are mostly used in:
  - Text Classification: Thanks to the independence assumption, it attains a higher success rate in multi-class text problems compared to other algorithms.
  - Sentiment Analysis: Also known as social-media analysis, it is used to identify positive and negative customer sentiment.
  - Spam Filtering: It is widely used to identify spam e-mails.
- Recommendation Systems: Using machine learning and data mining techniques, Naive Bayes helps filter unseen information and predict whether a user would like a given resource or not.
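The spam-filtering application above can be sketched from scratch in a few lines. This is a minimal multinomial-style Naive Bayes with Laplace (add-one) smoothing; the tiny training set is invented for illustration and real filters would train on thousands of e-mails.

```python
from collections import Counter
import math

# Toy training data: (text, label) pairs, invented for illustration only.
train = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]

# Count words per class and documents per class.
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Return the class with the highest log-posterior score."""
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            # Laplace-smoothed log likelihood of each word given the class.
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict("free money"))       # 'spam' on this toy data
print(predict("project meeting"))  # 'ham' on this toy data
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on long documents.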
What are the advantages and disadvantages of Naive Bayes?
Advantages of using Naive Bayes for Classification
- Naive Bayes is simple and fast at predicting the class of a test data set, and it also performs well in multi-class prediction.
- It is a highly scalable algorithm.
- It can be used for both binary and multi-class classification.
- It comes in several variants, such as GaussianNB, BernoulliNB and MultinomialNB.
- It is a simple algorithm that mostly depends on doing a bunch of counts.
- It is a good choice for text classification problems, such as spam e-mail classification.
- It can be trained effectively even on small data sets.
Disadvantages of using Naive Bayes for Classification
- It considers all the features to be unrelated, so it cannot learn the relationships between features.
- It makes a very strong assumption about the data distribution: that the features are independent given the output class.
- This independence assumption can lead to a loss of accuracy.
- In real life, dependencies usually exist between features, and Naive Bayes cannot model them.
Types of Naive Bayes Classifier
- Multinomial Naive Bayes: It is used for discrete counts, such as word counts in text.
- Bernoulli Naive Bayes: This Naive Bayes classifier is used for binary feature vectors, where each feature is either present or absent.
- Gaussian Naive Bayes: It is used for continuous values and assumes the features follow a normal distribution.
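As a minimal sketch of the Gaussian variant, the snippet below classifies a single continuous feature by fitting a normal distribution per class. The toy height data and class names are invented for illustration, and equal class priors are assumed.

```python
import math

# Toy heights (cm) per class, invented for illustration only.
data = {
    "adult": [170.0, 175.0, 168.0, 180.0],
    "child": [110.0, 120.0, 115.0, 125.0],
}

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def gaussian_pdf(x, m, var):
    # Normal density, the distribution Gaussian Naive Bayes assumes per feature.
    return math.exp(-((x - m) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def predict(x):
    # Equal priors assumed, so pick the class with the highest likelihood.
    return max(data, key=lambda c: gaussian_pdf(x, mean(data[c]), variance(data[c])))

print(predict(172.0))  # 'adult'
print(predict(118.0))  # 'child'
```

With several continuous features, the per-feature densities would simply be multiplied (or their logs summed), which is exactly the "naive" independence assumption.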