DEV Community: Alex Troitsky

Stop Guessing AI Metrics: Regression Explained with MSE, RMSE, MAE, R² & MAPE

Alex Troitsky — Tue, 16 Dec 2025 22:50:04 +0000

What is regression?

Regression is a type of supervised machine learning task in which a model is trained on data labeled with a continuous value and learns to predict that value from one or more input features.

The key difference between regression and classification is that regression predicts continuous values (for example, house price, temperature, number of sales), while classification predicts categorical labels (for example, yes/no, red/blue/green).

In other words, a regression task predicts a number, while a classification task is like choosing an answer from multiple options in a test.

Example

Let’s imagine that we run a classifieds platform similar to Idealista. We want to suggest to users, directly in the interface, the best price at which they should list their apartment, based on many factors, such as:

Apartment location
Size
Floor
Renovation quality
Year the building was constructed

As a result, we show the user a recommended price in euros.

We predicted the prices for 10 apartments, and a month later we learned the actual prices at which they were sold.

Next, we will perform some simple calculations with these results:

Subtract the actual price from the predicted price (first column)
Square this difference (second column)
Take the square root of this value, which gives the absolute error (third column)
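The three steps above can be sketched in Python as follows. The four prices are hypothetical values chosen for illustration, not the ten apartments from the example, and the variable names are assumptions.

```python
import math

# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]

# Column 1: predicted price minus actual price.
differences = [p - a for p, a in zip(predicted, actual)]

# Column 2: the squared difference.
squared = [d ** 2 for d in differences]

# Column 3: the square root of the squared difference,
# which is the same as the absolute value of column 1.
absolute = [math.sqrt(s) for s in squared]

for row in zip(differences, squared, absolute):
    print(row)
```

Squaring and then taking the root removes the sign, so an overestimate and an underestimate of the same size count equally.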

This gives us the following results for our example:

MSE

If we take the second column from the green table above, add up all the values in it, and then divide by the number of values (i.e., take the average), we get MSE, or Mean Squared Error. In our case:

MSE = 3,353,809,295

That’s a big number! Because of its scale, it’s hard to interpret from a business perspective. MSE is more often used during model development, where it’s important to penalize large errors more heavily than small ones, since the error grows quadratically. This makes MSE sensitive to outliers. MSE is useful when large errors are unacceptable and should strongly affect the model.
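As a minimal sketch of the calculation, assuming plain Python lists of prices: the function name `mean_squared_error` and the four prices are hypothetical and only illustrate how one large miss dominates the result.

```python
def mean_squared_error(predicted, actual):
    """Return the average of the squared prediction errors."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared) / len(squared)


# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]

mse = mean_squared_error(predicted, actual)
print(mse)  # 375000000.0

# The single 30,000 euro miss contributes 900,000,000 of the
# 1,500,000,000 total squared error, i.e. 60% of it.
share = (150_000 - 180_000) ** 2 / (mse * len(actual))
print(share)  # 0.6
```

In this made-up data, one apartment out of four accounts for more than half of the total, which is the outlier sensitivity described above.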

RMSE

RMSE, or Root Mean Squared Error, is the younger brother of MSE. To calculate it, you simply take the square root of MSE.

In our case, it equals 57,912.

RMSE also penalizes large errors, but unlike MSE, the scale of the error is the same as the original data, which makes it easier to interpret. This makes RMSE a good choice for many practical tasks where interpretability matters.
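A short sketch of the same step in Python. The first check uses the MSE and RMSE figures reported in the text; the function name `root_mean_squared_error` and the four prices are hypothetical.

```python
import math


def root_mean_squared_error(predicted, actual):
    """Return the square root of the mean squared error."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Check against the figures in the text: the square root of the
# reported MSE rounds to the reported RMSE.
print(round(math.sqrt(3_353_809_295)))  # 57912

# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]

rmse = root_mean_squared_error(predicted, actual)
print(round(rmse, 2))  # 19364.92
```

Because the result is in euros rather than squared euros, it can be read directly against the apartment prices.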

MAE

MAE, or Mean Absolute Error, is calculated using the third column of the green table above. Taking the square root of a squared difference gives the absolute difference, so you sum the absolute differences between the predicted and actual prices and divide by the number of observations. In simple terms, you take the average of the third column.

In our example, MAE = 49,243.

MAE is less sensitive to outliers compared to MSE and RMSE. This makes it a preferred option when outliers exist in the data but should not have a strong impact on the overall performance of the model.
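The difference in outlier sensitivity can be sketched with a small comparison. The function names and all prices here are hypothetical; the second data set is identical to the first except that one miss is doubled from 30,000 to 60,000 euros.

```python
def mean_absolute_error(predicted, actual):
    """Return the average of the absolute prediction errors."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mean_squared_error(predicted, actual):
    """Return the average of the squared prediction errors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]
# Same data, but the third apartment sells for 210,000, doubling that miss.
actual_outlier = [210_000, 300_000, 210_000, 400_000]

print(mean_absolute_error(predicted, actual))          # 17500.0
print(mean_absolute_error(predicted, actual_outlier))  # 25000.0
print(mean_squared_error(predicted, actual))           # 375000000.0
print(mean_squared_error(predicted, actual_outlier))   # 1050000000.0
```

With these made-up numbers, doubling one error raises MAE by a factor of about 1.4, while MSE grows by a factor of 2.8.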

Let’s make our green table a bit more complex

To understand how R-squared and MAPE are calculated, we need to add two more columns to our green table:

Subtract the mean actual price from each actual price and square the result (the fourth green column). P.S. Don’t ask why this is needed or what the practical meaning is, just do it 🙂

Divide the third green column by the actual sale price from the yellow table. In other words, divide the absolute difference between the predicted and actual price by the actual price (the fifth green column).

Coefficient of Determination (R-squared)

To calculate it, we divide the sum of the second green column by the sum of the fourth green column and subtract the result from 1:

R-squared = 1 − (sum of column 2 / sum of column 4)

In our case, R-squared = 85.2%.

R-squared measures how much of the variability of the dependent variable is explained by the independent variables in the model. It’s a good way to evaluate how well the model fits the data: the closer the value is to 1, the better the model explains the data. R-squared is best suited for comparing models trained on the same dataset.
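A minimal sketch of the formula, assuming the standard definition in which the denominator is the sum of squared deviations of the actual values from their own mean. The function name `r_squared` and the prices are hypothetical.

```python
def r_squared(predicted, actual):
    """Return 1 - SS_res / SS_tot, using the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    # Total sum of squares: spread of actual values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]

print(round(r_squared(predicted, actual), 4))  # 0.9491

# Two reference points: perfect predictions score 1.0, and always
# predicting the mean actual price scores 0.0.
mean_price = sum(actual) / len(actual)
print(r_squared(actual, actual))                      # 1.0
print(r_squared([mean_price] * len(actual), actual))  # 0.0
```

Read this way, R-squared compares the model against the naive baseline of always guessing the average price.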

MAPE

Mean Absolute Percentage Error (MAPE) is simply the average of the fifth green column.

In our case, MAPE = 14.2%.

MAPE measures how far predictions deviate from actual values in percentage terms and is a good choice when you need an easily interpretable error expressed as a percentage. However, MAPE can be unreliable when the data contains zero or very small values.
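A minimal sketch, assuming the common convention of dividing each absolute error by the actual value. The function name and the prices are hypothetical; the last two calls illustrate why zero or very small actual values cause trouble.

```python
def mean_absolute_percentage_error(predicted, actual):
    """Return the mean of |predicted - actual| / |actual|, as a percentage."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs(p - a) / abs(a) for p, a in zip(predicted, actual)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical prices in euros (illustrative only, not the article's data).
predicted = [200_000, 310_000, 150_000, 420_000]
actual = [210_000, 300_000, 180_000, 400_000]

mape = mean_absolute_percentage_error(predicted, actual)
print(round(mape, 2))  # 7.44

# A miss of 1 on an actual value of 1 already counts as 100%.
print(mean_absolute_percentage_error([2], [1]))  # 100.0

# A zero actual value makes the ratio undefined.
try:
    mean_absolute_percentage_error([5], [0])
except ValueError as error:
    print(error)
```

For apartment prices this is rarely an issue, but for targets such as daily sales counts, zeros and near-zeros are common.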

Conclusion

Congratulations! You’ve learned about the core metrics used in regression problems.

The 11 Largest AI Companies in the World (December 2025 Edition)

Alex Troitsky — Tue, 16 Dec 2025 22:00:10 +0000

Hey everyone! It's my first post here, so I'm a bit excited!

AI became a thing just 3-4 years ago, and by now we see startups flush with cash from big tech giants reaching $100M ARR in less than a year with cool new models and applications. In this article I tried to summarize data from open sources to make a list of the biggest AI companies as of now (December 2025).

What’s the methodology behind this?

This ranking includes only AI-first companies—organizations whose core business model is developing and monetizing their own foundation models (LLMs, multimodal models, code models, image/video generators, or autonomous AI systems).

To be included, a company must:

Sell direct access to its models through subscriptions or API usage
Generate the majority of its revenue from those models
Have a publicly reported or credibly estimated annual recurring revenue (ARR) above $100M as of the second half of 2025

I intentionally excluded:

Big Tech companies like Google, Meta, Microsoft, Amazon, and Apple, because AI is not their primary revenue driver, even though they produce world-class models.
I also excluded infrastructure providers, consulting firms, and startups that rely solely on open-source models without monetizing their own.

ARR data comes from verified reports, executive interviews, investor briefings, and reputable financial journalism, including Reuters, The Information, TechCrunch, and VC research platforms.

Below is a breakdown of the 11 biggest AI-first companies in the world, ranked by their most recent ARR (second half of 2025). Numbers are based on public statements, financial reports, and reputable media analysis.

1. OpenAI — $13–20 Billion ARR

OpenAI is the unchallenged leader of the AI-first economy.
Fueled by ChatGPT, GPT-4.1, enterprise contracts, and a massive API ecosystem, the company scaled from $1B ARR to over $13B in roughly two years, and is now approaching $20B.

With more than 800 million weekly active users, OpenAI has the largest distribution channel in the history of software. Its enterprise suite, now used by over a million businesses, is the engine behind the explosive revenue growth.

2. Anthropic — $7–9 Billion ARR

Anthropic, creator of Claude, is the fastest-growing enterprise AI provider ever recorded.
By late 2025, its ARR jumped to $7B, with internal forecasts nearing $9B.

Claude’s reputation for reliability and long-context reasoning made it the preferred option in finance, legal, and security sectors. The company reports that 80% of revenue comes from enterprise API deals—illustrating how deeply AI has penetrated core business operations.

3. Midjourney — ~$500 Million ARR

The most profitable creative AI company in the world.
Midjourney built a $500M+ ARR business with no venture capital, a team of fewer than 50 people, and a fanatically loyal creative community.

Their image-generation models (V6, V7) set the quality bar for the entire industry. With subscription-only revenue and zero enterprise sales, Midjourney remains one of the purest product-market-fit success stories in modern tech.

4. Cursor (Anysphere) — $500+ Million ARR

Cursor is the breakout star of the AI coding revolution (“vibe-coding”).
In 2025 it became the fastest-growing developer tool ever, passing $500M ARR and securing a valuation near $30B.

Cursor combines a lightweight IDE with a powerful AI pair programmer built on frontier models, enabling developers to modify multi-file projects, refactor code, and even generate full systems.

5. xAI — ~$500 Million ARR

Elon Musk’s xAI went from zero to half-a-billion ARR in under a year.
Its model Grok, integrated deeply into X (Twitter), created a massive built-in distribution channel, while government and enterprise contracts added predictable revenue streams.

xAI’s fundraising—over $10B in 2025—ensures it will remain a major contender in frontier model development.

6. Replit — ~$150 Million ARR

Replit transformed from an online coding sandbox into a fully AI-powered app creation platform.
In less than a year, the company jumped from ~$3M ARR to $150M+, driven by its AI agent that can build, run, and deploy applications with minimal human intervention.

Replit democratizes software creation, bringing coding capabilities to millions who never wrote code before.

7. Hugging Face — ~$130 Million ARR

The open-source powerhouse of the AI ecosystem.
Hugging Face does not rely on a single model; instead, it monetizes infrastructure, enterprise hosting, private model hubs, model inference, and professional AI tooling.

With over 100,000 organizations using its platform and ARR exceeding $130M, Hugging Face has become the GitHub of AI.

8. Cohere — $100–150 Million ARR

Cohere dominates the enterprise private-LLM segment.
Trusted by banks, governments, and regulated industries, Cohere provides models that run privately—on a company’s own servers or cloud.

By focusing on privacy, security, and sovereignty rather than mass adoption, Cohere reached $100M ARR by mid-2025 and now pushes toward the $150M mark.

9. Lovable — ~$100 Million ARR

Lovable became a phenomenon in the “vibe coding” wave.
In under a year, the company grew from zero to $100M ARR, turning app creation into a creative, remixable experience.

Beyond writing code, Lovable generates full application structures—frontend, backend, and deployment—making it one of the fastest-scaling SaaS products in AI history.

10. AI21 Labs — ~$100 Million ARR

AI21 Labs specializes in long-context reasoning and enterprise-grade text models.
Known for its Jamba and Jurassic families of LLMs, the company secured major partnerships with NVIDIA and Google, pushing revenue toward the $100M mark.

Its orchestration engine, Maestro, positions AI21 as a precision-focused alternative to giants like OpenAI and Anthropic.

11. Stability AI — < $100 Million ARR (but influential)

Stability AI, creator of Stable Diffusion, is the only company on this list not above $100M ARR—yet its impact on the generative ecosystem remains unmatched.

Despite a challenging transition from open-source diffusion models to enterprise monetization, the company continues to innovate across image, audio, and video generation.