AI Arcade — 2
- Aastha Thakker
- Oct 28, 2025
- 5 min read
Hey guys! Welcome back to AI Arcade, Part 2. In Part 1 we learned the fundamentals of AI: how it differs from traditional programming, the basic difference between machine learning and deep learning, and applications of AI in real life and cyber security, along with some ethical considerations.
Now, in this blog, our focus will be Machine Learning. We'll start with the basics and go deeper later on.

What is Machine Learning?
From the previous blog, you already know that machine learning is a branch of AI that focuses on algorithms' ability to learn from data. These algorithms can identify patterns and relationships within the data, enabling them to make predictions or decisions without being explicitly programmed for every situation. In simpler terms, the machine learns. Yes, we give it the ability to learn from different types of data coming from different sources.
Applications of Machine Learning
Well, this part deserves a list of its own. If one starts writing out the applications of ML, the list will go on and on. AI & ML are not limited to any single field or domain: it can be health, finance, drones, the military, marketing, personalization, sales and of course many more.
The trending one, or I should say my personal interest, is TinyML. Yes, I have recently started learning more about it, but what is TinyML? TinyML, as the name suggests, is about fitting large ML algorithms into tiny little machines that are barely noticeable. It has a long list of applications, like measuring a person's glucose in real time, Neuralink, wearables and consumer electronics like "Ok Google" or "Hey Siri!" TinyML has tiny hardware but huge applications.
Well-known ML applications are:
Personalized treatments
Malware detection
Stock price prediction
Automatic music composition
Predictive maintenance
Self-driving cars
Twist your neck, look at your surroundings, and you can list many more.
Data In Machine Learning
Data means raw facts and figures. E.g., 19,9,22,5 (random meaningless figures).
Information is data that has been processed and organized into a meaningful format. E.g., the age is 19 or the date is 23rd May.
There are two broad types of data: structured (organized into rows and columns, like a spreadsheet) and unstructured (unorganized, like images, audio files or an email inbox).
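To make the distinction concrete, here is a tiny sketch in Python; the values and column names are made up purely for illustration.

```python
# Structured data: rows and columns with a clear schema, like a spreadsheet.
import pandas as pd

structured = pd.DataFrame({"age": [19, 22], "birthday": ["23 May", "9 Sep"]})
print(structured)

# Unstructured data: no fixed rows or columns, e.g. free text, images or audio.
unstructured = "Hey, happy birthday! Turning 19 already, time flies..."
```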


In the applications of machine learning, we saw that we need lots of data, often in real time. If we have random datasets with no meaning, then how can we use that data for our benefit?
So, we need to understand the steps of data processing (a small code sketch follows the list):
Collection: Collecting huge amounts of data from various required sources.
Cleaning and normalization: Removing duplicates, missing values, errors and redundancy and converting it to standardized data formats.
Model Building: Creating an algorithm for a specific task so that it can learn from the data.
Model testing: Checking the model's performance by comparing its results against the expected results.
Cross Validation/Prediction: Testing the model's accuracy by giving it unseen data or testing it on different data subsets.
Evaluation and Improvement: Continuous monitoring and refining of the model is necessary for better performance and scalability.
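Here is a minimal sketch of these steps in Python with pandas and scikit-learn. The file name patients.csv and its label column are hypothetical; treat this as a rough template, not the one way to do it.

```python
# A minimal sketch of the data-processing steps above.
# The file "patients.csv" and its columns are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

# 1. Collection: load raw data from a source (here, a local CSV file).
df = pd.read_csv("patients.csv")

# 2. Cleaning and normalization: drop duplicates and missing values;
#    feature scaling is handled inside the pipeline below.
df = df.drop_duplicates().dropna()

X = df.drop(columns=["label"])   # input features
y = df["label"]                  # target column

# 3. Model building: standardize the features, then fit a model.
model = make_pipeline(StandardScaler(), LogisticRegression())

# 4. Model testing: hold out part of the data and check performance on it.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model.fit(X_train, y_train)
print("Hold-out accuracy:", model.score(X_test, y_test))

# 5. Cross-validation: test the model on several different data subsets.
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())

# 6. Evaluation and improvement: keep monitoring these scores and refine
#    the data, features or model as needed.
```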
Types of Learning

1. Supervised:
This model is trained on "labelled data sets". A labelled data set means that both the input and the expected output are given to the model to learn from.

Classification: This is used when we need categorical output like 'yes or no', 'spam or not spam' or 'tomato or carrot'. Algorithms used here include Random Forest, Logistic Regression and Decision Trees.
Regression: This is used when the output is a continuous numerical value, such as a price or a temperature. Algorithms used include Linear Regression, Decision Trees and Lasso Regression.
Example: A labelled dataset of emails classified as "spam" or "not spam." You train a model on this labelled data to predict the label of new, unseen emails. [Fraud detection, image segmentation, NLP]
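As a rough idea of what this looks like in code, here is a tiny sketch with scikit-learn; the example emails and labels are invented just for illustration.

```python
# A toy supervised-classification sketch: labelled emails -> spam / not spam.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = [
    "Win a free prize now",        # spam
    "Meeting at 10 am tomorrow",   # not spam
    "Claim your lottery reward",   # spam
    "Lunch with the team today",   # not spam
]
labels = ["spam", "not spam", "spam", "not spam"]

# Turn text into word counts, then fit a classifier on the labelled data.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(emails, labels)

# Predict the label of a new, unseen email.
print(model.predict(["Free reward, claim now"]))   # likely ['spam']
```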
Advantages
Working with labelled data sets gives us the proper insight into the classes of objects.
It can predict the outcome based on prior knowledge.
Disadvantages
Time-consuming & costly
May struggle with unseen data or patterns.
2. Unsupervised:
This model is not given any "labelled data". Instead, it is given raw data, and the algorithm discovers hidden patterns & relationships within the dataset. It can be used for various purposes like data exploration, visualization, dimensionality reduction etc.

Clustering: This is used when we want to group data points based on their similarities. Algorithms like K-means clustering and the Mean-shift algorithm use this technique.
Association learning: This is used to derive meaningful relationships & find dependencies between data points, as in market basket analysis. Algorithms like Apriori and Eclat use this technique.
Example: You have a collection of customer purchase data with no labels. You use clustering algorithms to group customers with similar buying behaviors, discovering patterns without predefined categories. [Dimensionality Reduction, Anomaly Detection, Exploratory data analysis]
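Here is a minimal sketch of that idea using k-means; the purchase figures (annual spend, number of orders) are made up for illustration.

```python
# A toy unsupervised-clustering sketch: group customers by buying behaviour.
import numpy as np
from sklearn.cluster import KMeans

purchases = np.array([
    [200,  2], [220,  3], [250,  2],    # low spend, few orders
    [900, 20], [950, 25], [1000, 22],   # high spend, many orders
])

# No labels are given; k-means discovers the groups on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
print(kmeans.labels_)           # cluster assignment for each customer
print(kmeans.cluster_centers_)  # the "typical" customer of each cluster
```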
Advantages
Used to discover hidden trends and relationships.
Reduces the effort of data labelling, as it doesn’t require labelled data.
Disadvantages
Can be less accurate, as data is not labelled.
Working with it is more difficult than working with labelled data sets.
3. Semi-Supervised:
It is the middle ground between supervised & unsupervised learning. It uses a few labelled examples and a much larger amount of unlabeled data for model training. This is especially useful where labelling data is expensive, such as large image data sets.

Example: You have a small set of labelled images of apples and pineapples and a larger set of unlabeled images. You train a model using both the labelled and unlabeled data to improve its accuracy in distinguishing between apples and pineapples.
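A small sketch of this idea using scikit-learn's self-training wrapper; the feature values are invented, and by scikit-learn convention unlabeled samples are marked with -1.

```python
# A toy semi-supervised sketch: a few labelled points plus many unlabeled ones.
# Features here are made up (e.g. roundness, colour score); -1 means "no label".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.array([
    [0.9, 0.1], [0.8, 0.2],          # labelled apples
    [0.2, 0.9], [0.1, 0.8],          # labelled pineapples
    [0.85, 0.15], [0.15, 0.85],      # unlabeled images
    [0.7, 0.3], [0.3, 0.7],          # unlabeled images
])
y = np.array([0, 0, 1, 1, -1, -1, -1, -1])   # 0 = apple, 1 = pineapple, -1 = unlabeled

# The wrapper first trains on the labelled data, then pseudo-labels the
# confident unlabeled points and retrains, using both kinds of data.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y)
print(model.predict([[0.75, 0.25]]))   # likely class 0 (apple)
```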
Advantages
Simple & easy to understand along with good efficiency.
Solves the drawbacks of supervised and unsupervised learning.
Disadvantages
Iteration (loops) results may not be stable.
Can’t apply these algorithms to network-level data.
4. Reinforcement Learning:
This model resembles psychological reinforcement in animals. It starts with a trial-and-error approach. While training the model, if the algorithm produces the correct output, it gets a reward for good performance; otherwise it gets a punishment (such as losing points) for bad performance. In this way the model keeps learning, as its goal is to collect as many rewards as possible.

Examples: NLP (Natural Language Processing), Autonomous Vehicle, Robotics etc.
Positive Reinforcement — It's like motivating a person (here, the algorithm) to perform the desired behaviour again and again by rewarding it. It strengthens that behaviour in the model.
Negative Reinforcement — Strictly speaking, this strengthens a behaviour by removing or avoiding an unpleasant condition (like a kid doing homework to avoid being scolded). The algorithm learns to act so that the negative condition is avoided.
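To give a feel for the reward loop, here is a tiny sketch of tabular Q-learning on a made-up one-dimensional world; the states, rewards and parameters are all invented for illustration and are not tied to any particular library.

```python
# A toy Q-learning sketch: an agent on a line of 5 cells tries to reach the
# rightmost cell. Reaching the goal gives +1 (reward); every other step
# gives a small penalty.
import random

n_states, actions = 5, [-1, +1]          # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for episode in range(200):
    state = 0
    while state != n_states - 1:
        # Trial and error: sometimes explore, otherwise pick the best-known action.
        action = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else -0.1
        # Update the value estimate toward reward + discounted future value.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the learned policy should be "always move right" (+1).
print([max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)])
```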
Advantages
It is similar to human reinforcement learning. (Operant Conditioning)
Better suited for achieving long-term results.
Disadvantages
Not suitable for simpler problems
Too much reinforcement can lead to an overload of states which decreases the accuracy of results.
In upcoming blogs, we'll do hands-on tutorials using Python. Python is a popular choice for machine learning because of its powerful, well-known libraries like NumPy, TensorFlow, Pandas, Matplotlib, PyTorch, Scikit-Learn, SciPy etc. Having a basic understanding of their functions will be helpful.
Thanks for reading! See you next Thursday!


