Artificial Intelligence by Leela Prasad: August 2024

Tuesday, 20 August 2024

Logistic Regression

Logistic regression estimates the probability of an event occurring, such as voted or didn’t vote, based on a given data set of independent variables.

There are three types of logistic regression models, distinguished by the kind of categorical response they predict.

Binary logistic regression: In this approach, the response or dependent variable is dichotomous in nature—i.e. it has only two possible outcomes (e.g. 0 or 1). Some popular examples of its use include predicting if an e-mail is spam or not spam or if a tumor is malignant or not malignant. Within logistic regression, this is the most commonly used approach, and more generally, it is one of the most common classifiers for binary classification.
Multinomial logistic regression: In this type of logistic regression model, the dependent variable has three or more possible outcomes; however, these values have no specified order. For example, movie studios want to predict what genre of film a moviegoer is likely to see to market films more effectively. A multinomial logistic regression model can help the studio to determine the strength of influence a person's age, gender, and dating status may have on the type of film that they prefer. The studio can then orient an advertising campaign of a specific movie toward a group of people likely to go see it.
Ordinal logistic regression: This type of logistic regression model is used when the response variable has three or more possible outcomes, but in this case, these values do have a defined order. Examples of ordinal responses include grading scales from A to F or rating scales from 1 to 5.
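
As a rough illustration of the binary case, the sketch below shows how a logistic regression model turns a weighted sum of features into a probability using the sigmoid function. The weights, bias, and feature values are made-up numbers for illustration only; they are not fitted to any data set.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Probability of the positive class (label 1) for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical coefficients, chosen only for illustration.
weights = [0.8, -0.5]
bias = -0.2

p = predict_probability([2.0, 1.0], weights, bias)

# A common convention is to predict class 1 when the probability is at least 0.5.
label = 1 if p >= 0.5 else 0
print(f"probability={p:.3f}, predicted label={label}")
```

In practice the weights and bias are learned from training data rather than written by hand.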

Logistic regression is commonly used for prediction and classification problems. Some of these use cases include:

Fraud detection: Logistic regression models can help teams identify data anomalies, which are predictive of fraud. Certain behaviors or characteristics may have a higher association with fraudulent activities, which is particularly helpful to banking and other financial institutions in protecting their clients. SaaS-based companies have also started to adopt these practices to eliminate fake user accounts from their datasets when conducting data analysis around business performance.
Disease prediction: In medicine, this analytics approach can be used to predict the likelihood of disease or illness for a given population. Healthcare organizations can set up preventative care for individuals that show higher propensity for specific illnesses.
Churn prediction: Specific behaviors may be indicative of churn in different functions of an organization. For example, human resources and management teams may want to know if there are high performers within the company who are at risk of leaving the organization; this type of insight can prompt conversations to understand problem areas within the company, such as culture or compensation. Alternatively, the sales organization may want to learn which of their clients are at risk of taking their business elsewhere. This can prompt teams to set up a retention strategy to avoid lost revenue.

References: https://www.ibm.com/topics/logistic-regression#:~:text=Logistic%20regression%20estimates%20the%20probability,data%20set%20of%20independent%20variables.

https://www.youtube.com/watch?v=zM4VZR0px8E&list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV&index=47

Monday, 19 August 2024

Dummy Variables & One Hot Encoding

Dummy variables (one-hot encoding) are used for categorical variables that are nominal in nature.

Nominal variables are those whose categories have no inherent order, for example colours or town names.
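
A minimal sketch of the idea in plain Python, assuming a small made-up list of colour values. The helper name `one_hot_encode` and its `drop_first` option are illustrative, not taken from any library, although pandas offers a similar `drop_first` option in `get_dummies`.

```python
def one_hot_encode(values, drop_first=False):
    """Return (column_names, rows), one 0/1 column per distinct category."""
    categories = sorted(set(values))
    if drop_first:
        # Dropping one column avoids the "dummy variable trap":
        # the dropped category is implied when all other columns are 0.
        categories = categories[1:]
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Hypothetical nominal data for illustration.
colours = ["red", "green", "blue", "green"]

columns, encoded = one_hot_encode(colours)
print(columns)
for row in encoded:
    print(row)
```

Calling `one_hot_encode(colours, drop_first=True)` keeps one column fewer, which is the usual choice when the encoded columns feed a linear model.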

Source: https://www.youtube.com/watch?v=9yl6-HEY7_s&list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV&index=45

Save Model and Reuse

In Python, a trained model can be saved to a file and reused later without retraining. Two libraries are commonly used for this: pickle, which is part of the standard library, and joblib, which is installed separately.

1. Pickle

2. Joblib

The notebook below demonstrates saving a model with both libraries and loading it back from the file:

https://github.com/LeelaPrasadG/AILearning/blob/main/ML/3_SaveModel/Savemodel_PickleandJoblib.ipynb
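
For a quick picture of the save-and-load round trip, here is a small sketch using only pickle. The `model` dictionary is a hypothetical stand-in for a trained model, and the file name is arbitrary. A fitted estimator object would be saved the same way, and `joblib.dump(model, path)` with `joblib.load(path)` follows the same pattern.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: parameters of y = slope * x + intercept.
model = {"slope": 2.0, "intercept": 1.0}

# Write the object to a file in binary mode.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Later (or in another program), load it back from the file.
with open(path, "rb") as f:
    restored = pickle.load(f)

prediction = restored["slope"] * 3.0 + restored["intercept"]
print(prediction)
```

Only load pickle files from sources you trust, because unpickling can execute arbitrary code.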

Sunday, 18 August 2024

Gradient Descent

Gradient Descent (GD) is a widely used optimization algorithm in machine learning and deep learning that minimises a model's cost (loss) function during training. It works by iteratively adjusting the model's weights or parameters in the direction of the negative gradient of the cost function, which is the direction of steepest decrease, until the updates converge to a minimum.

The algorithm calculates gradients, which are the partial derivatives of the cost function with respect to each parameter.

These gradients guide the updates, moving the parameters towards values that yield the lowest cost. For a convex cost function, such as mean squared error in linear regression, this is the global minimum; for non-convex models such as neural networks, the algorithm may settle in a local minimum.

Gradient Descent is versatile and applicable to various machine learning models, including linear regression and neural networks. Each step needs only the gradient at the current parameter values, which keeps it practical for models with many parameters. Choosing the learning rate is crucial: if it is too small, convergence is slow; if it is too large, the updates overshoot the optimal solution.
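
The update rule described above can be sketched for a straight-line fit with a mean squared error cost. The data, learning rate, and iteration count below are assumptions chosen so that this small example converges; they are not recommended defaults.

```python
def gradient_descent(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y = m * x + b by minimising mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Prediction error for every data point with the current parameters.
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE cost with respect to m and b.
        grad_m = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient, scaled by the learning rate.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Made-up data that lies exactly on the line y = 2x + 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 7.0, 9.0, 11.0, 13.0]

m, b = gradient_descent(xs, ys)
print(f"m={m:.4f}, b={b:.4f}")
```

With this data a learning rate much above 0.08 makes the updates diverge, which is the overshooting problem mentioned above.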

https://www.geeksforgeeks.org/gradient-descent-algorithm-and-its-variants/#gradient-descent-in-machine-learning

Linear Regression

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.
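
As a small worked sketch of the single-variable case, the code below fits a line with the ordinary least squares formulas and uses it to predict a new value. The area and price figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable (area) and dependent variable (price).
areas = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200.0, 290.0, 410.0, 500.0]

slope, intercept = fit_line(areas, prices)

# Predict the dependent variable for a new value of the independent variable.
predicted_price = slope * 3000.0 + intercept
print(f"slope={slope:.3f}, intercept={intercept:.1f}, prediction={predicted_price:.1f}")
```

With several independent variables the same idea applies, but the fit is usually done with a library rather than by hand.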

Linear Regression Single Variable: https://www.youtube.com/watch?v=8jazNUpO3lQ&list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV&index=41

Linear Regression Multiple Variable: https://www.youtube.com/watch?v=J_LnPL3Qg70&list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV&index=42&pp=iAQB

GitHub links for exercises:

https://github.com/codebasics/py/tree/master/ML

https://github.com/LeelaPrasadG/AILearning/tree/main/ML/1.LinearRegression