In what way do AI models operate

#ai #deeplearning

The Rise of AI
In the nearly two years since 2025, artificial intelligence has developed rapidly and become a major trend. AI, intelligent agents, and related technologies are now visible everywhere in daily life. AI is being applied in countless scenarios, seemingly without rival and capable of almost anything. Nearly everyone has used a large language model such as GPT, DeepSeek, Doubao, or Yuanbao.
Back in 2025, we believed AI programs were merely chatbots, built only for tasks related to the internet and information technology. Today we realize they are far more sophisticated.
In the traditional internet industry, we have long seen bubble economics and profit models that are decoupled from real industry and depend heavily on venture capital. For this reason, many people once thought AI could never be integrated into traditional trades such as hairdressing, plumbing, and maintenance work, and that these occupations would remain largely unaffected. Yet with advances in robotics and smart hardware, these possibilities have become tangible. From a short-term economic perspective, the only current limitation is the high cost of applying AI to simple problems.
To understand AI programs at their core, we need to learn some basic terminology and see how they actually operate.

The Evolution of AI
In the early days of this wave, AI existed mainly in the form of chatbots, that is, conversational programs. The early version of ChatGPT is a typical example. Back then, people interacted with AI by entering text on a website and receiving an automated response. Today, AI agent models can handle tasks in a wide range of specialized scenarios, such as generating reports and creating videos.
Accordingly, AI can be divided into the following major categories:

Demystifying AI
We can elaborate on these categories as follows:
ANI (Artificial Narrow Intelligence)
Also known as narrow AI. It is designed for specific scenarios, with autonomous driving models as a typical example.
Generative AI
AI that generates new content such as text, images, or video. Tools like ChatGPT fall into this category and can be applied to numerous scenarios.
AGI (Artificial General Intelligence)
This represents the ultimate goal of AI. An AGI would be able to accomplish any task that humans are capable of, and possibly handle work beyond human ability.
Machine Learning (ML)
Machine learning is a core technical pillar driving the development of AI. A key branch is supervised learning, whose fundamental objective is to learn a mapping from given inputs to the desired outputs.

Here are some simple examples. Suppose we build an email program that uses AI to detect spam: the input is an email and the output is whether it is spam. This function is called spam filtering. If we feed in audio as input and get a text transcript as output, the technology is speech recognition. When you input one piece of text, the source text, and expect another piece of text as the result, the target text, you have the way a chatbot works: it generates text by predicting what should come next.
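As a minimal sketch of this input-to-output idea, the spam example could look like the following. The training messages, the labels, and the word-counting rule are all hypothetical and chosen only for illustration; real spam filters rely on far larger datasets and more robust models.

```python
from collections import Counter

# Hypothetical labeled examples: (input text, desired output label).
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training messages used these words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    # Ties fall back to "ham" so that unknown messages are not blocked.
    return "spam" if spam_score > ham_score else "ham"


model = train(TRAINING_DATA)
print(predict(model, "free prize inside"))          # spam
print(predict(model, "project meeting on monday"))  # ham
```

The point is only the shape of the problem: labeled inputs go in during training, and the trained model maps a new input to an output.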
LLM (Large Language Models)
Large language models are trained with machine learning, specifically a form of supervised learning in which the labels come from the text itself, to predict the next word in a sequence. As a loose analogy, think of the way Elastic splits text into words. Here is a straightforward illustration:
Take the input sentence: My most commonly used database is Elastic.

Input | Output
My most commonly used | database
My most commonly used database is | Elastic
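
Input/output pairs like these can be produced mechanically from any sentence. The sketch below is only an illustration: `next_word_pairs` is a hypothetical helper name, and it splits on spaces, whereas real language models work on subword tokens rather than whole words.

```python
def next_word_pairs(sentence):
    """Return (prefix, next word) training pairs for one sentence."""
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]


pairs = next_word_pairs("My most commonly used database is Elastic")
for prefix, target in pairs:
    print(f"{prefix!r} -> {target!r}")
```

Each prefix is an input and the word that follows it is the desired output, so a single sentence of seven words yields six training examples.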

When such a model is trained on massive datasets, the result is a system like ChatGPT: given an initial prompt, it generates a relevant response. In practice, GPT models do more than predict the next single word. They are further tuned to filter and refine their output so that replies are more accurate.
Over the past two years, many companies have been engaged in data annotation for model training. They hire staff to label large volumes of images and texts. Such work is essential to build up the recognition capabilities of models under supervised learning. Thanks to the advancement of computers and the internet, vast and diverse data resources are readily available, which has led to the remarkable leap in AI model performance in recent years.
Lastly, let's talk about a hugely popular concept in modern AI: neural networks.

As data volume keeps increasing, model performance will first improve and then level off. Traditional AI models cannot grow smarter endlessly with more data. Nevertheless, high performance relies heavily on big data. Meanwhile, the advancement of GPUs has greatly boosted large-scale computing power, providing stronger resources for model training.
The core concepts of artificial intelligence are machine learning and supervised learning, which essentially map inputs to corresponding outputs.

Datasets
Datasets are fundamental to AI systems. Here is a simple real-life example.
We can create a basic dataset consisting of delivery distance (kilometers) and estimated delivery time (minutes):

Delivery Distance (km) | Estimated Time (min)
1 | 15
2 | 20
3 | 25
4 | 30
5 | 35

In this case, the delivery distance serves as the input, and the estimated delivery time is the output. We can also add more input features, such as the number of traffic lights, to build an extended dataset:

Number of Traffic Lights | Delivery Distance (km) | Estimated Time (min)
1 | 1 | 15
1 | 2 | 20
2 | 3 | 30
2 | 4 | 35
2 | 5 | 34

Likewise, we can set delivery time as the input to predict whether a delivery route can meet the time requirement.
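To make the mapping concrete, here is a minimal sketch that fits a straight line to the first table (distance in, time out) with ordinary least squares and then checks a delivery against a time limit. The function names `fit_line`, `predict_minutes`, and `meets_deadline`, and the 45-minute limit, are hypothetical; the post does not prescribe a particular method.

```python
# Data from the first table: delivery distance (km) -> estimated time (min).
distances = [1, 2, 3, 4, 5]
minutes = [15, 20, 25, 30, 35]


def fit_line(xs, ys):
    """Ordinary least squares for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(distances, minutes)


def predict_minutes(distance_km):
    """Estimated delivery time for a distance not in the table."""
    return slope * distance_km + intercept


def meets_deadline(distance_km, limit_minutes):
    """True if the predicted time fits within the limit."""
    return predict_minutes(distance_km) <= limit_minutes


print(slope, intercept)        # 5.0 10.0
print(predict_minutes(6))      # 40.0
print(meets_deadline(6, 45))   # True
```

Because the first table is perfectly linear, the fit recovers 5 minutes per kilometer plus a fixed 10 minutes. The extended table with traffic lights would need a second input feature and would no longer fit exactly.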
Apart from numerical data, there are many other application scenarios. For instance, security verification is commonly required when accessing certain websites, such as the widely discussed:

Messy Data
Many enterprises consider leveraging their existing operational data to conduct AI predictive analysis. However, there is a harsh reality.
A great number of CEOs assume that with abundant user and production data and an AI team, they can easily carry out industrial predictive analysis and generate tangible value. This idea is actually questionable, for the reasons listed below:
First, the data lacks continuity. When building big data systems, most internet companies store data in data warehouses via stream messages and other approaches. Such data reflects various business metrics but often has little practical value.
Second, the data contains massive junk content. A large portion of the data is useless and unfit for AI model training.
Third, the data is incomplete. Missing values, unknown fields, and even manually altered business records make it unsuitable for regression training.
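A quick way to see how much of a dataset is actually usable is to filter out incomplete or implausible records before any training. The sketch below uses made-up delivery records; the field names and the rule "a value must be a positive number" are assumptions for illustration, not a general data-cleaning recipe.

```python
# Hypothetical raw records, including the kinds of problems listed above.
RAW_RECORDS = [
    {"distance_km": 1, "minutes": 15},
    {"distance_km": 2, "minutes": None},    # missing value
    {"distance_km": None, "minutes": 25},   # missing value
    {"distance_km": 4, "minutes": 30},
    {"distance_km": 5, "minutes": -1},      # implausible placeholder value
]

REQUIRED_FIELDS = ("distance_km", "minutes")


def is_usable(record):
    """A record is usable only if every required field is a positive number."""
    return all(
        isinstance(record.get(field), (int, float)) and record[field] > 0
        for field in REQUIRED_FIELDS
    )


clean_records = [r for r in RAW_RECORDS if is_usable(r)]
usable_ratio = len(clean_records) / len(RAW_RECORDS)

print(len(clean_records), usable_ratio)  # 2 0.4
```

In this toy case only two of five records survive, which is the situation described above: having a lot of data is not the same as having a lot of data that can be used for training.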
Neural Net
The discussion above points to the performance gap between traditional AI and neural networks. Do not be misled by the terminology: artificial neural networks draw inspiration from the human nervous system and are composed of many artificial neurons. We can explain their characteristics using the earlier delivery distance example:

This scenario reminds me of the regression process used in measurement and verification (M&V) for energy consumption forecasting. It uses linear regression to make phased predictions. Specifically, it fits a regression on data from Phase A to Phase B to produce a linear formula, which is then applied to forecast outcomes for Phase C. This method is used only for subsequent predictive analysis, and the resulting regression formula is displayed in charts. Its implementation requires calculations involving many factors. We can regard these factors as neurons, each acting as a channel between input and output with its own computing logic.
Simply put, a neural network is a set of computations made up of many neurons. The more neurons there are, the larger the network and the more complex the relationships it can represent, which, given enough training data, usually leads to more accurate output.
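To show what "each neuron has its own computing logic" can mean, here is a minimal sketch of a tiny network for the delivery example. The weights and biases are hand-picked, hypothetical values rather than trained ones, and the names `neuron` and `tiny_network` are made up for this illustration.

```python
def relu(x):
    """Common activation function: negative values become zero."""
    return max(0.0, x)


def neuron(inputs, weights, bias):
    """One neuron: weighted sum of its inputs plus a bias, then activation."""
    return relu(sum(i * w for i, w in zip(inputs, weights)) + bias)


def tiny_network(distance_km, traffic_lights):
    """Two hidden neurons feeding one output neuron (hypothetical weights)."""
    inputs = [distance_km, traffic_lights]
    # Hidden neuron 1 looks only at distance: 5 minutes per kilometer.
    travel = neuron(inputs, [5.0, 0.0], 0.0)
    # Hidden neuron 2 looks only at traffic lights: 2 minutes per light.
    waiting = neuron(inputs, [0.0, 2.0], 0.0)
    # The output neuron adds both parts plus a fixed 10-minute overhead.
    return neuron([travel, waiting], [1.0, 1.0], 10.0)


print(tiny_network(3, 2))  # 29.0
```

In a real network the weights are not chosen by hand. Training adjusts them automatically so that the outputs match the examples in the dataset, and the network contains far more than three neurons.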
What Machine Learning Is Good At
At first, I thought the rise of AI would have little impact on traditional industries such as manual labor and the service sector. The reality turns out to be quite different. For now, businesses are reluctant to spend heavily on replacing low-wage jobs with AI. However, as AI moves toward AGI, it will gradually take over more human work on a large scale.
Meanwhile, many people hold overly high expectations for AI. We need to realize that AI is not a panacea. The distinction is easy to understand. Asking AI to generate reports and process data works well, because humans can give clear and complete instructions with definite operating logic. In contrast, it is unrealistic to expect AI to predict stock market trends or winning lottery numbers. Even humans cannot accomplish such tasks. We can feed in massive amounts of complex data for reference, including historical market trends, corporate operation reports, and traffic statistics, but the strong randomness still cannot be eliminated.
The same applies to popular smart vehicles and autonomous driving. Cars are equipped with cameras and radar to capture driving information, which is a typical input-output application. Yet they still struggle to recognize human body movements. For instance, an autonomous vehicle can plan routes and detect obstacles via its sensors, but may fail to identify a passenger hailing it by waving. Gestures are so varied that it is hard to learn a reliable mapping from input to output.
Likewise, many hospitals have introduced AI to analyze medical reports such as CT scans. Such systems work well only when scans follow standard rules: all images are placed correctly, and CT slices of the heart are positioned uniformly during supervised training. If applied to non-standard scans, for example images taken when the patient lies sideways, the recognition error will increase dramatically.
To sum up, we can judge the applicable scenarios of machine learning by the following rules:
Scenarios where ML performs well
Learning relatively simple concepts
Having a large volume of available data
Scenarios where ML performs poorly
Learning complex concepts with limited data
Processing unseen and brand-new types of data
