Naive Bayes for Machine Learning

One of the simplest and most effective methods for classification problems is Naive Bayes. Despite its simplicity, it is also a powerful classification algorithm. So, in this blog let's discover the Naive Bayes algorithm for machine learning.

Why Learn Naive Bayes?

  • It is very fast and easy to implement.
  • It needs less training data.
  • It can make probabilistic predictions.
  • It can handle continuous and discrete data.
  • It is a highly scalable algorithm.
  • It is lightweight to train, as no complicated optimization is required.
  • It also has a small memory footprint.
  • It is easily updated when new training data is received (see the sketch after this list).
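
For instance, scikit-learn's Naive Bayes classifiers expose a partial_fit method for exactly this kind of incremental update. Below is a minimal sketch assuming scikit-learn and NumPy are installed; the tiny count matrices are made up purely for illustration.

```python
# Incremental training with scikit-learn's MultinomialNB (illustrative toy data).
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Initial batch: rows are documents, columns are word counts.
X_first = np.array([[2, 1, 0], [0, 1, 3]])
y_first = np.array([0, 1])

clf = MultinomialNB()
# All classes must be declared on the first partial_fit call.
clf.partial_fit(X_first, y_first, classes=np.array([0, 1]))

# Later, new training data arrives; the model is updated without retraining from scratch.
X_new = np.array([[1, 2, 0]])
y_new = np.array([0])
clf.partial_fit(X_new, y_new)

print(clf.predict(np.array([[3, 0, 0]])))
```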

What is the Naive Bayes algorithm?

Naive Bayes is a probabilistic classification method based on Bayes' Theorem. Bayes' Theorem defines the relationship between the probabilities of two events and their conditional probabilities.


Bayes' Theorem: P(H|D) = [P(D|H) * P(H)] / P(D)

Note: H is the hypothesis and D is the data.
In this theorem (a small worked example follows the definitions below),

  • P(H|D) = Probability of hypothesis H given the data D, known as the posterior probability.
  • P(D|H) = Probability of data D given that hypothesis H is true, known as the likelihood.
  • P(H) = Probability of hypothesis H being true, known as the prior probability.
  • P(D) = Probability of the data D, known as the evidence.
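
As a quick illustration of the formula (all numbers here are invented), suppose H is "an e-mail is spam" and D is "the e-mail contains the word offer". A minimal Python sketch of the calculation:

```python
# Worked Bayes' theorem example with made-up probabilities.
p_h = 0.2               # P(H): prior probability that an e-mail is spam
p_d_given_h = 0.6       # P(D|H): probability of seeing "offer" in a spam e-mail
p_d_given_not_h = 0.05  # probability of seeing "offer" in a non-spam e-mail

# P(D) expanded with the law of total probability.
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Posterior: P(H|D) = P(D|H) * P(H) / P(D)
p_h_given_d = p_d_given_h * p_h / p_d
print(round(p_h_given_d, 3))  # 0.75
```

So seeing the word "offer" raises the probability that the e-mail is spam from 20% to 75%.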

Applications of Naive Bayes Algorithm

  • Real-time Prediction: It can be used for making predictions in real time, as Naive Bayes is very fast.
  • Multi-Class Prediction: This algorithm is well known for multi-class prediction, as it can estimate the posterior probability of multiple classes of the target variable.
  • Text Classification/ Sentiment Analysis/ Spam Filtering: Naive Bayes classifiers are widely used in the following areas (a short sketch follows this list):
    • Text Classification: Its independence assumption makes it effective on multi-class text problems, where it often attains a higher success rate than other algorithms.
    • Sentiment Analysis: Used, for example in social-media analysis, to identify positive and negative customer sentiment.
    • Spam Filtering: It is widely used to identify spam e-mails.
  • Recommendation Systems: Used together with data mining techniques to filter unseen information and predict whether a user would like a given resource.
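
To make the text classification and spam filtering use cases concrete, here is a minimal sketch using scikit-learn's CountVectorizer and MultinomialNB; the tiny corpus and labels are invented for illustration only.

```python
# Toy spam filter: bag-of-words counts fed into Multinomial Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "win a free prize now",
    "limited offer click here",
    "meeting agenda for tomorrow",
    "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["free prize offer"]))        # expected: ['spam']
print(model.predict(["report for the meeting"]))  # expected: ['ham']
```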

What are the advantages and disadvantages of Naive Bayes?

Advantages of using Naive Bayes for Classification

  • Naive Bayes is simple and fast at predicting the class of a test data set, and it also performs well in multi-class prediction.
  • It is a highly scalable algorithm.
  • It can be used for binary and multi-class classification.
  • It comes in several variants, such as GaussianNB, BernoulliNB and MultinomialNB.
  • It is a simple algorithm that mainly relies on counting.
  • It is one of the best choices for text classification problems, such as spam e-mail classification.
  • It can be trained easily even on small datasets.

Disadvantages of using Naive Bayes for Classification

  • It treats all features as unrelated, so it cannot learn the relationships between features.
  • It makes a very strong assumption about the data distribution: that the features are independent of each other given the output class.
  • This can lead to a loss of accuracy.
  • It cannot model dependencies between variables, even though such dependencies usually exist in real data.

Types of Naive Bayes Classifier

  • Multinomial Naive Bayes: It is used for discrete counts, such as word frequencies in text.
  • Bernoulli Naive Bayes: It is used for binary feature vectors, where each feature records whether something occurs or not.
  • Gaussian Naive Bayes: It is used for continuous values and assumes the features follow a normal distribution (see the sketch after this list).
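
The descriptions above map directly onto scikit-learn's implementations. Here is a minimal sketch of all three variants; the toy data is invented and only meant to match the kind of input each variant expects.

```python
# The three scikit-learn Naive Bayes variants on matching toy data.
import numpy as np
from sklearn.naive_bayes import MultinomialNB, BernoulliNB, GaussianNB

y = np.array([0, 1, 0, 1])

# MultinomialNB: non-negative discrete counts (e.g. word frequencies).
X_counts = np.array([[3, 0, 1], [0, 2, 4], [2, 1, 0], [0, 3, 5]])
print(MultinomialNB().fit(X_counts, y).predict([[1, 0, 0]]))

# BernoulliNB: binary present/absent features.
X_binary = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]])
print(BernoulliNB().fit(X_binary, y).predict([[1, 0, 0]]))

# GaussianNB: continuous features assumed normally distributed within each class.
X_cont = np.array([[5.1, 3.5], [6.7, 3.0], [5.0, 3.4], [6.5, 2.8]])
print(GaussianNB().fit(X_cont, y).predict([[5.2, 3.6]]))
```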