Maksym Koval

Posted on Jun 11 • Originally published at Medium

An AI Module Smaller Than a Smartphone Photo. Are We Underestimating the Computer We Carry Every Day?

#android #machinelearning #computervision #software

The Idea and the Main Engineering Challenges

Recently, I released a new offline AI feature for my Android application as a separate module. The entire AI module ended up occupying just 4.78 MB, while recognition on modern smartphones typically takes less than 100 ms.

Working on this feature made me think more about why some modern applications seem to grow in size faster than their functionality.

While developing and improving my app, Morse Code Interpreter, I regularly visited communities such as /r/morse and /r/morsecode to better understand what users actually needed and which problems had already been solved. One thing quickly became apparent: many posts were asking for help decoding Morse code that had been handwritten or printed and captured in an image. That caught my attention, so I started looking for tools that could solve this problem.

To my surprise, I couldn’t find any dedicated solutions in either Google Play or the App Store. That seemed odd, so I dug deeper and found several online services that claimed to support this use case. Only one of them was able to correctly recognize a simple example, although it struggled with more complex inputs. The others either failed entirely or simply hallucinated answers.

After realizing that there was no reliable solution available, I decided to build one myself.

For this task, I chose to explore ML and Edge AI. Machine learning has always interested me, especially now that modern smartphones have reached a level of computing power capable of running games such as Total War and Subnautica while simultaneously handling real-time AI workloads directly on the device.

What particularly interests me about Edge AI today is the ability to solve problems locally using mobile sensors and on-device inference rather than relying on cloud processing. This can improve responsiveness, privacy, and offline availability while reducing infrastructure costs.

Around the same time, Google began promoting LiteRT as the successor to TensorFlow Lite. The hardware was ready. The software stack was ready. Yet most of the LiteRT demos and discussions I encountered focused almost entirely on local LLMs.

I wanted to explore a very different Edge AI use case on Android: building a lightweight offline ML + CV (Computer Vision) solution for a problem that still lacks practical real-time mobile implementations.

At first glance, recognizing a few dots and dashes in a photograph does not sound like an unsolved problem. Reality turned out to be much more interesting. The deeper I went, the less this looked like a simple pet project and the more it resembled a full-scale engineering challenge. Perhaps that is why there were so few good implementations. Or perhaps that is why there were none at all. At some point, I stopped being surprised by that fact.

Fortunately, or unfortunately, I belong to the category of developers who become even more interested in a problem immediately after hearing the phrase, “This is going to be difficult.”

This is also where the most interesting part of the story begins. Or perhaps the most boring one. Engineers usually refer to these things as “interesting challenges” only after they have successfully solved them.

After months of development, experimentation, optimization, and Android-specific edge cases, I released a new offline AI feature for Morse Code Interpreter: fully on-device Morse code recognition from images and live camera frames without any cloud processing.

I also decided to ship it as a separate module that users could download on demand and remove at any time directly from the application. Since the base application was only about 4 MB in size, I wanted to avoid inflating the main download with functionality that many users might never need.

To reassure readers that none of this was a fever dream and that an AI module smaller than a smartphone photo actually exists, here is a short demonstration of the feature in action.

The solution itself uses a custom hybrid ML + CV approach together with LiteRT and Dynamic Feature Delivery, Google’s mechanism for downloading functionality on demand.
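The pipeline's internals are not covered here, so the following is only an illustrative sketch of what the last stage of any such pipeline has to do: turn detected marks into dots and dashes, then into text. The class name, the width-ratio heuristic, and the input format are assumptions made for the example, not the module's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical final stage: detected marks -> Morse symbols -> text.
class MorseSketch {
    // International Morse code for A..Z, in alphabetical order.
    private static final String[] CODES = {
        ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....", "..",
        ".---", "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.",
        "...", "-", "..-", "...-", ".--", "-..-", "-.--", "--.."
    };
    private static final Map<String, Character> TABLE = new HashMap<>();
    static {
        for (int i = 0; i < CODES.length; i++) {
            TABLE.put(CODES[i], (char) ('A' + i));
        }
    }

    // Assumed heuristic: a dash is nominally three dot widths, so any mark
    // at least twice as wide as the narrowest mark is treated as a dash.
    // Limitation: an input made only of dashes would be read as dots.
    static String classifyMarks(int[] markWidths) {
        int min = Integer.MAX_VALUE;
        for (int w : markWidths) {
            min = Math.min(min, w);
        }
        StringBuilder symbols = new StringBuilder();
        for (int w : markWidths) {
            symbols.append(w >= 2 * min ? '-' : '.');
        }
        return symbols.toString();
    }

    // Letters are separated by spaces, words by "/".
    // Unknown symbol groups are rendered as '?'.
    static String decode(String morse) {
        StringBuilder out = new StringBuilder();
        for (String word : morse.trim().split("\\s*/\\s*")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            for (String letter : word.trim().split("\\s+")) {
                if (letter.isEmpty()) {
                    continue;
                }
                Character c = TABLE.get(letter);
                out.append(c != null ? c : '?');
            }
        }
        return out.toString();
    }
}
```

With these definitions, MorseSketch.decode("... --- ...") returns "SOS", and MorseSketch.classifyMarks(new int[]{10, 31, 9}) returns ".-.". Everything upstream of this step, finding the marks in a noisy photo or camera frame, is where the real difficulty lies.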

Some of the more interesting engineering challenges along the way included:

• Designing and building the entire ML + CV pipeline from scratch
• Training custom models and integrating them into Android
• Moving the AI functionality into a separate on-demand Dynamic Feature module
• Keeping the base APK at approximately 4 MB while the optional offline AI module remained around 5 MB
• Ensuring the feature was ready to use immediately after installation
• Allowing users to remove the AI functionality at any time
• Dealing with Android fragmentation and older devices
• Supporting legacy 32-bit Android devices

This turned into a significantly larger engineering challenge than I initially expected, although I am very happy with the final result.

One particularly interesting issue was that LiteRT currently does not provide official prebuilt binaries for 32-bit Android devices, while I wanted the feature to remain available on legacy hardware as well.

As a result, I ended up experimenting with custom LiteRT armeabi-v7a builds using Bazel and integrating them into the Dynamic Feature configuration.

I also documented the LiteRT armeabi-v7a build process here in case it helps someone dealing with legacy Android device support:

GitHub Guide

After all the optimization work, the feature ended up fast enough for interactive offline camera use, while still processing images efficiently on-device and keeping the AI module lightweight.
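For readers who want to check a similar latency claim on their own workload, here is a minimal timing sketch. It is not how the numbers in this article were measured; the class name, the warm-up count, and the choice of the median as the "typical" value are all assumptions.

```java
import java.util.Arrays;

// Hypothetical helper for measuring a "typical" per-call latency.
class LatencyProbe {
    // Runs the task a few times untimed (warm-up), then records the
    // wall-clock duration of each timed run in nanoseconds.
    static long[] time(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    // Median of the samples, converted to milliseconds.
    // Expects a non-empty array; the input is not modified.
    static double medianMillis(long[] nanos) {
        long[] sorted = nanos.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double medianNanos = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        return medianNanos / 1_000_000.0;
    }
}
```

A median is less sensitive than a mean to the occasional slow frame, which matters on phones where thermal throttling and background work produce outliers.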

What surprised me most was not that the solution fit into just a few megabytes.

It was how, once it did, some ordinary applications suddenly started looking much more mysterious.

Software Bloat, Dynamic Features, and the User’s Right to Their Own Device

Minification and Obfuscation

Over the last few years of my career, I have started paying much more attention to application size. What surprised me was how large some applications had become despite offering fairly limited functionality.

In my opinion, the technical reasons are usually quite simple: excessive dependency usage, a tendency to add a library for every small task, and in some cases the absence of code minification. Even more surprisingly, some applications are shipped without obfuscation enabled at all — although that is a separate discussion.

Why does this happen?

Perhaps time pressure. Perhaps fear of breaking something. Perhaps a lack of knowledge. Or perhaps something else entirely.

What makes it more disappointing is that this occasionally happens in well-known products. These applications were undoubtedly built by talented engineers who understand software engineering best practices and use them daily. Yet for some reason, they either cannot — or choose not to — embrace minification and obfuscation.

The strange part is that we are not talking about months of work or some exotic optimization techniques. Many of these capabilities are already built into modern development tools.

The irony is that while working on an AI module under 5 MB (4.78 MB to be precise), what surprised me most was not the neural networks. It was discovering applications that occupied ten or twenty times more storage while offering nothing nearly as complex.

Perhaps they were powered by some exceptionally heavy and sophisticated code inaccessible to ordinary developers.

The funniest part is that I was never actively trying to make the module as small as possible. I simply enabled minification, obfuscation, and resource shrinking from the start, and whenever I needed additional functionality, I asked myself a simple question: “Do I really need another dependency for this, or would it be simpler to implement it myself?” As a result, the AI module ended up surprisingly small.

I have also spent time adding these optimization practices to real-world production applications where they had either been intentionally postponed or simply forgotten. Suffice it to say, it was a memorable experience.

I do not consider myself an extremist obsessed with best practices for the sake of following best practices. I would describe myself as a pragmatic engineer: sometimes willing to cut corners when appropriate, and willing to invest additional effort when it truly matters. To me, minification, obfuscation, and resource shrinking are simply engineering tools, with both advantages and disadvantages.

These techniques are not about chasing the smallest possible APK. They are about avoiding situations where users download, store, and potentially execute code and resources that an application does not actually need. A smaller application is merely a side effect. The primary benefits are reduced complexity, improved security, and greater respect for the user’s resources.

These tools are often viewed solely as ways to reduce APK size.

In reality, everyone benefits.

Users get smaller downloads, faster updates, reduced bandwidth consumption, and a smaller attack surface.

Businesses get products that are more difficult to reverse engineer, lower update delivery costs, better control over technical debt, and fewer risks associated with legacy code.

Developers gain tools for identifying technical debt, discovering unused code, analyzing dependencies, and managing project complexity. Some of these benefits are not immediately obvious. Personally, I discovered a few of them much later than I would like to admit.

Of course, these approaches also come with trade-offs.

For users, incorrectly configured minification can introduce bugs that only appear in release builds.

For businesses, there are additional costs associated with testing, maintaining optimization rules, and storing artifacts such as mapping.txt files for crash analysis.

For developers, there is the need to understand how R8 works, maintain keep rules, and troubleshoot issues related to reflection, serialization, and third-party libraries.
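The reflection problem is easy to reproduce in plain Java. R8 may rename a class, but a class name stored in a string (in code, a config file, or serialized data) is not necessarily renamed with it, so a lookup by the old name fails at runtime. The sketch below only simulates that effect with an ordinary class lookup; the class names are hypothetical, and no obfuscator is actually run.

```java
// Simulates why classes loaded by name need keep rules.
class ReflectionLookupDemo {
    // Stand-in for a class that some library instantiates by name.
    static class MorseModel {}

    // Returns true if a class with exactly this runtime name can be loaded.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // This is the failure a release build hits when the string
            // still holds a name that no longer exists after renaming.
            return false;
        }
    }
}
```

Looking up MorseModel.class.getName() succeeds, while looking up a name that does not exist in the program, such as "com.example.renamed.MorseModel", fails. In a real project, a keep rule for the affected class in the ProGuard rules file tells R8 to preserve its name so that string-based lookups keep working.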

Like any engineering tool, minification, obfuscation, and resource shrinking are not free.

However, in most cases, their benefits significantly outweigh their maintenance costs.

That said, these techniques are not universally good, nor should they be considered mandatory for every project.

In some situations, postponing optimization may be entirely reasonable in order to deliver functionality faster or validate a business hypothesis. The important part is recognizing the risks involved and being aware of the technical debt being accumulated.

As software development repeatedly demonstrates, nothing is more permanent than a temporary solution. What was intended to be fixed next month often remains in production for years.

That is why, even when optimization is postponed, it should be a conscious engineering compromise rather than the result of simply forgetting the problem exists.

A lighter application is not just about megabytes.

It is about ensuring that users do not download, store, and execute code that will never actually be used.

Dynamic Features and the Real Size of an Application

Given the small size of my application and the fact that, in my opinion, the AI functionality would not be needed by every user, I decided to ship it as a separate module.

The idea was simple: users who wanted the feature could download it, while everyone else could keep the application lightweight. I also wanted users to be able to remove the module whenever they wished.
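On Android, this lifecycle is driven by Play Feature Delivery, where SplitInstallManager exposes startInstall to request a module and deferredUninstall to schedule its removal. The sketch below deliberately avoids that API so it can run anywhere. It is a plain-Java model of the idea with hypothetical names, and the sizes are rounded figures based on the numbers mentioned in this article (a base app of about 4 MB and a 4.78 MB module, expressed in KB with 1 MB taken as 1000 KB).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: the real size of an app is the base download
// plus every optional module currently installed.
class AppFootprint {
    private final long baseKb;
    private final Map<String, Long> installedModulesKb = new LinkedHashMap<>();

    AppFootprint(long baseKb) {
        this.baseKb = baseKb;
    }

    // Installs a module only if the user explicitly agreed to it.
    // Returns true when the module was installed by this call.
    boolean install(String module, long sizeKb, boolean userConsented) {
        if (!userConsented) {
            return false;
        }
        installedModulesKb.put(module, sizeKb);
        return true;
    }

    // Removes a module; returns true if it was installed.
    boolean remove(String module) {
        return installedModulesKb.remove(module) != null;
    }

    // Base size plus the sizes of all installed modules.
    long totalKb() {
        long total = baseKb;
        for (long kb : installedModulesKb.values()) {
            total += kb;
        }
        return total;
    }
}
```

Starting from new AppFootprint(4000), installing a 4780 KB module with consent brings totalKb() to 8780, and removing it returns the total to 4000. The point of the model is the pair of guarantees: nothing is added without the user's consent, and anything added can be taken away again.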

It sounded like a good idea. Then I started implementing it. And I quickly understood why this approach is not as common as one might expect.

As usual, I looked at how other applications handled similar functionality. What I found was disappointing. Several applications proudly advertised a relatively small download size on Google Play. However, immediately after installation — and sometimes before the user even launched the application — they quietly downloaded hundreds of megabytes of additional content.

No permission. No explanation. No meaningful choice. And, quite often, no way to remove it later.

I understand why this happens. A smaller reported download size improves installation conversion rates. From a business perspective, that makes perfect sense.

What bothered me was not the optimization itself. It was the lack of transparency. If additional functionality is being downloaded, users should at least know what is happening and why.

Instead, it often feels as if the decision has already been made on their behalf. Whether intentionally or unintentionally, the opinions of the people actually using the product become secondary.

Before going further, it is worth briefly looking at the advantages and disadvantages of Dynamic Features.

Advantages of Dynamic Features

For users, Dynamic Features can reduce the initial download size, save storage space, and allow them to install only the functionality they actually need. In some cases, they may even be able to remove that functionality later — although I would not always count on that.

For businesses, Dynamic Features can lower the barrier to entry for new users, improve installation conversion rates, and separate core functionality from rarely used features.

For developers, they provide better project modularity, improved dependency management, and a way to isolate large or highly specialized functionality from the base application.

Disadvantages of Dynamic Features

For users, there is an additional dependency on an internet connection during the first download, extra waiting time, and the possibility of installation failures.

For businesses, there are additional testing costs, more download scenarios to support, and more error handling to maintain.

For developers, Dynamic Features introduce architectural complexity, installation and removal workflows, and a growing collection of edge cases that must be considered during development and testing.

Like minification, Dynamic Features are not a free tool. They increase complexity.

However, they also make it possible to avoid forcing users to download functionality they may never need.

Personally, I like the concept. The underlying idea is simple and logical: Do not make users download features they may never use.

Unfortunately, there are times when it feels as though the industry looked at this idea and reached exactly the opposite conclusion. An application proudly announces that it occupies only a few dozen megabytes. The user installs it. Then additional modules, resources, and other “temporarily necessary” functionality quietly begin downloading in the background. Usually without explanation. Usually without the ability to decline. And, perhaps most interestingly, often without the ability to remove them later.

At some point, you begin to suspect that the only person unaware that a new module is being installed is the user themselves.

Of course, there are cases where automatic downloads are entirely justified. Large games frequently need to download textures, maps, or other assets without which they simply cannot function. In those situations, the behavior is more of a technical necessity than a product decision.

But when it comes to optional functionality, I always find myself asking a simple question: If users were given the choice of whether to install a module, why shouldn’t they also have the choice to remove it?

Perhaps the most underrated resource in modern smartphones is not the processor. Not the memory. Not even the battery. It is the user’s right to decide what exists on their own device.

Have We Forgotten That a Smartphone Is a Computer?

As I mentioned earlier, Edge AI is what initially drew me into this journey.

Perhaps Edge AI is interesting not simply because it allows models to run on-device. Perhaps it is interesting because, for the first time in a long while, it forces us to reconsider the role of the smartphone itself.

For years, we have become accustomed to thinking of smartphones primarily as clients connected to servers. Yet modern mobile hardware is gradually transforming them into personal computers. Computers that are always with us and increasingly capable of understanding the world around them through their sensors.

Today’s smartphone contains a multi-core CPU, a GPU, an NPU, multiple cameras, dozens of sensors, and computing power that would have made desktop computers jealous not that long ago.

And yet a significant portion of modern software still behaves as though it were running on a Nokia 3310 whose sole purpose is to display a loading indicator.

Of course, some functionality is difficult — or even impossible — to implement without servers.

However, modern smartphones are increasingly capable of handling many tasks on their own.

Naturally, neither online nor offline approaches are universally correct. Each comes with its own strengths and weaknesses.

It is worth briefly looking at both perspectives.

Advantages of Online Functionality

For users, online functionality provides access to real-time data, synchronization across devices, and the ability to use sophisticated services without requiring powerful hardware.

For businesses, it offers centralized control over functionality, the ability to update logic without requiring application updates, and easier analytics collection.

For developers, it simplifies algorithm updates, provides access to powerful server resources, and allows issues to be fixed without releasing a new version of the application.

Disadvantages of Online Functionality

For users, it introduces dependence on internet connectivity and server availability, potential latency, privacy concerns, and the risk of losing functionality if a service is discontinued.

For businesses, it requires infrastructure investments, server maintenance, scaling efforts, and creates risks associated with service outages.

For developers, it means maintaining backend systems, handling network failures, synchronization issues, caching strategies, and overall architectural complexity.

Advantages of Offline Functionality

For users, offline functionality offers immediate responsiveness, improved privacy, and independence from servers and external services.

For businesses, it can reduce infrastructure costs, decrease server load, and lower operational risks associated with service availability.

For developers, it reduces external dependencies and eliminates many network-related problems.

Disadvantages of Offline Functionality

For users, it may increase application size, consume additional device resources, and make access to up-to-date information more difficult.

For businesses, it can complicate updates, expose limitations of mobile hardware, and make intellectual property protection more challenging.

For developers, it requires optimization across a wide variety of devices while balancing performance, memory consumption, and battery usage. Updating algorithms after release can also become more difficult.

As with most engineering decisions, there is no universally correct answer. Both approaches have strengths and weaknesses, and the right choice often depends on the specific problem, product requirements, and available resources.

I do not believe the future belongs exclusively to online solutions or exclusively to offline solutions. The most interesting products will likely combine both.

The real question is how much work truly needs to be sent to a server and how much modern smartphones are already capable of doing on their own.

And this, at least to me, is where things become interesting.

For many years, the default answer to any sufficiently complex problem was almost automatic: “Let’s send it to the server.” Often, that was the correct decision.

But today, smartphones have long since evolved beyond being simple screens displaying results generated elsewhere.

What continues to surprise me is the strange paradox that our industry has somehow learned to fit neural networks into a few megabytes while occasionally struggling to make a static QR code inside an application work offline.

Perhaps the problem is not application size at all. Perhaps we have simply become so accustomed to treating smartphones as clients connected to servers that we no longer notice how powerful they have become.

That is one of the reasons Edge AI interests me. Not only because it allows models to run locally, but because it enables entirely new categories of mobile applications.

Many things that once belonged exclusively to science fiction, video games, and futuristic imagination are gradually becoming reality: understanding the surrounding world through a camera, real-time object recognition, local AI assistants, intelligent sensor processing, and countless other experiences that no longer require permanent connectivity.

Perhaps Edge AI is not merely another branch of artificial intelligence. Perhaps it is one of the first steps toward a new generation of mobile applications where smartphones begin behaving like true personal computers once again — computers that are always with us, capable of understanding their surroundings, and able to solve real-world problems locally.

At this point, the article could probably end with an optimistic conclusion about the future of mobile development.

We have smartphones.

We have Edge AI.

We have new possibilities.

We have a few strange questions about modern software and the way it is built.

Everyone is happy.

Unfortunately, I still have one more observation.

And if you have not fallen asleep yet, I now have a reasonably good chance of fixing that.

One unfortunate characteristic of good engineering problems is that sooner or later they stop being purely technical.

Because neural networks, Dynamic Features, servers, megabytes, and architectures rarely make decisions on their own.

People do.

And that is why the final section is not really about smartphones, models, or application sizes. It is about people, priorities, and the reasons why some good engineering solutions never make it beyond the stage of being good ideas.

When Technical Problems Stop Being Technical

While working on this feature, I gradually arrived at another observation: The problem is not always the technology.

Modern software development has excellent tools at its disposal: minification, Dynamic Features, on-device models, powerful mobile hardware, and countless other ways to make products faster, smaller, and more autonomous.

The problem is that tools, by themselves, solve nothing.

At some point, I began to suspect that many strange decisions are born not from technical limitations, but from organizational ones.

When engineering discussions are replaced by deadlines, and developers start being viewed not as partners in building a product but as resources for completing tasks, priorities gradually begin to shift.

In that environment, it becomes difficult to find time for questions such as:

“Do we really need this dependency?”

“Could this be done locally?”

“Would this genuinely make the product better for users?”

It is much easier to complete the assigned task, close the ticket, and move on to the next one. And that is entirely rational behavior. Especially when asking additional questions is viewed as an obstacle rather than a part of engineering work.

Perhaps that is why some of the problems of modern software do not begin with bad technology or bad engineers. Perhaps they begin the moment we stop asking questions.

The irony is that the industry has never had more tools for accelerating software development than it does today. There is also another interesting trend.

Modern software development increasingly rewards the speed of feature delivery. AI assistants, code generation, automation, and countless other tools allow us to create software faster than ever before.

These are wonderful tools, and I use them myself.

But the speed of producing code is not always the same thing as the speed of understanding it. Engineering is not merely about writing new code.

It is also about asking uncomfortable questions:

Do we need this dependency?

Could this be simpler?

Does this logic really need to live on a server?

Will this actually improve the experience for users?

Sometimes those questions create more value than the code itself.

Yet they are often the first things sacrificed when time becomes limited.

At this point, I realized that an article about an AI module smaller than a smartphone photo was slowly attempting to transform itself into a collection of reflections on the modern software industry.

Which, I must admit, looks somewhat suspicious.

Throughout the article, I used my own application as an example of why it can be worthwhile to fight for every megabyte, remove unnecessary complexity, and keep things compact. And then, almost without noticing, I managed to inflate the article itself to a size that is beginning to concern even me.

If I am being honest, that is slightly alarming.

Because nearly every topic we touched along the way — Edge AI, software bloat, Dynamic Features, the role of smartphones, engineering trade-offs, and the pace of modern development — could easily become a separate article. And sometimes an entire series of articles.

But I decided not to abuse my readers’ patience and to move on to the conclusion.

At least this time.

Conclusion

In the end, this article did not begin with Edge AI, Dynamic Features, or software bloat.

It began with a simple question: Why does there still not seem to be a tool that properly solves a very real problem users encounter?

While searching for an answer, I realized that the question was much broader than I initially expected.

Perhaps the biggest challenge in modern software development is not a lack of technology.

The industry already has plenty of tools, computing power, and knowledge to build fast, compact, and user-friendly products.

The real question is how we choose to use them.

Does every feature really need to run through a server?

Is every dependency truly necessary?

Will any of this actually make the product better for users?

And do we sometimes forget that software is built not only for businesses or developers, but primarily for the people who use it every day?

Good software reminds me a little of a good electrician. As long as everything works, almost nobody thinks about him. But the moment something stops working, his absence becomes immediately noticeable.

That is why I believe it is worth appreciating the engineers, teams, and companies that manage to balance the needs of users, businesses, and development — even when doing so requires additional effort.

Perhaps those are the decisions that make software genuinely good.

P.S.

If reading this article made you curious about what an AI module smaller than a smartphone photo actually looks like, feel free to take a look at Morse Code Interpreter.

And if you happen to leave a review, a rating, or simply tell me that I made all of this up and that QR codes secretly require 200 MB to function, I would appreciate that as well.

After all, if an AI module managed to end up smaller than a smartphone photo, it has probably earned at least a few reviews 😉