Saturday, 7 September 2024

Naive Bayes

 

Suppose you are a product manager, you want to classify customer reviews in positive and negative classes. Or As a loan manager, you want to identify which loan applicants are safe or risky? As a healthcare analyst, you want to predict which patients can suffer from diabetes disease. All the examples have the same kind of problem to classify reviews, loan applicants, and patients.

Naive Bayes is the most straightforward and fast classification algorithm, which is suitable for a large chunk of data. Naive Bayes classifier is successfully used in various applications such as spam filtering, text classification, sentiment analysis, and recommender systems. It uses Bayes theorem of probability for prediction of unknown class.


Naive Bayes is a statistical classification technique based on Bayes Theorem. It is one of the simplest supervised learning algorithms. Naive Bayes classifier is the fast, accurate and reliable algorithm. Naive Bayes classifiers have high accuracy and speed on large datasets.

Naive Bayes classifier assumes that the effect of a particular feature in a class is independent of other features. For example, a loan applicant is desirable or not depending on his/her income, previous loan and transaction history, age, and location. Even if these features are interdependent, these features are still considered independently. This assumption simplifies computation, and that's why it is considered as naive. This assumption is called class conditional independence.

What is Naive Bayes Classifier?
  • P(h): the probability of hypothesis h being true (regardless of the data). This is known as the prior probability of h.
  • P(D): the probability of the data (regardless of the hypothesis). This is known as the prior probability.
  • P(h|D): the probability of hypothesis h given the data D. This is known as posterior probability.
  • P(D|h): the probability of data d given that the hypothesis h was true. This is known as posterior probability.
Example: To get the probability of Diamond Queen, the above formulae could be substituted as below:


No comments:

Post a Comment

Building a ReAct Agent with LangGraph & LangSmith

In this post, I walk through building a ReAct (Reasoning + Acting) agent using LangGraph and Groq's openai/gpt-oss-120b model, where the...