In programming, is it better to have many small files or one large file?

Cécile Lebleu on February 05, 2019

The title says it all. In general, is it better to have multiple small files linked together, or one large file? At least in the web, I know that ...
 

The idea of "files" is, for the most part, an abstraction developed to help people make sense of the contents of a disk. It's a lot easier to envision chunks of storage as documents with names and extensions and so forth than it is to try to work with byte offsets and lengths. Think of trying to find your way: street signs and buildings and landmarks help you orient yourself by dividing and delimiting space, but if you're lost in the desert, one sand dune looks much like the next.

It is perfectly possible to write programs in a single file in many languages, but tools that allow multiple source files to be "linked" into the finished product date back half a century for a reason: it makes an enormous difference to our ability to comprehend how individual parts of the system work and interact with each other. The ways in which programs may be divided into source files vary wildly depending on language, purpose, convention, and taste. But those are all essentially human factors. It doesn't matter to the computer.

 

Extending on this, my reasoning is that you should split files by logical group and make sure each one stays on topic. The split is obvious when you compare a CSS file with a JS file, but each of those can be split further by topic or purpose. You'll then find that there's opportunity for reuse rather than duplication :-)

I do like that you mention that the notion of "many files" is mostly a human construct, and I agree fully :-D

 

AgentDenton is right in the sense that files in a program should be logically grouped.

To extend on Dian's awesome answer, the number of files (or how you split a program into files) depends on what programming language you use and what the purpose of your program is. Generally speaking, in a production-ready application (or a service, etc.), the program should be split into multiple files that are grouped together based on the behaviour they provide.

For example, if you are building an application that reads files into memory and then puts them in a database, it makes sense to split the steps into files that are grouped accordingly (e.g. you can have the file-loading step, the decoding step, and the persistence step).
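To make that concrete, here is a minimal Python sketch of the three steps, assuming JSON input and an SQLite database. The function names, the `people` table, and the file names mentioned in the comments are all made up for illustration; they are shown in one block here, but the point is that each function would sit in its own file.

```python
import json
import sqlite3


# In a real project each step below would live in its own file,
# e.g. load.py, decode.py and persist.py (hypothetical names).

def load_text(path):
    """File-loading step: read the raw contents of one file into memory."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def decode_records(raw_text):
    """Decoding step: turn raw JSON text into a list of dicts."""
    return json.loads(raw_text)


def persist_records(connection, records):
    """Persistence step: store the decoded records in a database table."""
    connection.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, city TEXT)")
    connection.executemany(
        "INSERT INTO people (name, city) VALUES (:name, :city)", records
    )
    connection.commit()


def run_pipeline(path, connection):
    """Glue code: the only place that knows about all three steps."""
    persist_records(connection, decode_records(load_text(path)))
```

Because each step only takes plain values in and hands plain values back, you can change or test one of them without opening the other two.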

It is quite important to get a feel for splitting up your program into multiple files so that (and quoting Dian here) it can "help you orient by dividing and delimiting". This also applies to functions, packages (or modules) etc. In object-oriented programming (think Java, C++) one good rule is the "single responsibility principle" which states that "every function, class (in our case file) or module should have responsibility over a single part of the functionality provided by the software". This is basically saying that in most cases (certainly not all or not in all programming languages) your files should do one thing only and do it well. Try using this rule the next time you code a small application and see if it helps your code become more readable and clean.

(ref: en.wikipedia.org/wiki/Single_respo...)

 

Enlightening! I guess it does depend greatly on the people writing, using, and maintaining the code.
Thank you for your answer!

 

1/ Do not confuse the files you use to write your code with the files you deploy. You can perfectly well work with many small files that are packaged into a single file for deployment.

2/ As far as version control and collaboration are concerned, many small files is certainly the way to go. Merges will be far easier.

 

I'd like to share this amazing talk by Evan Czaplicki, the creator of Elm: The life of a file.

In Haskell and Elm, I like to think of files (i.e. modules) as small libraries that offer good APIs. Then, other modules build upon them to offer higher-level APIs, and so on. I try to apply this idea in other languages when possible.

 

It is a lot easier to find what you need in a nested directory structure of many files, rather than a single file or even a flat directory of files with no structure. Use the filesystem to your advantage!

I have a general guideline of trying to keep files, of any kind, under 200-300 lines each. Regardless of whether it is CSS, JS, Java, Kotlin, raw HTML, or anything else, you will do well to find a framework/toolkit/workflow that will compile multiple files together so you can break them apart logically. If a file is getting larger than just a couple hundred lines, it is probably doing too much and should be broken up, so that each file can focus on doing a single thing well.
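If you want to check a guideline like this automatically, a rough Python sketch could look like the following. The 300-line default and the `*.css` pattern are just assumptions taken from the numbers above, and `files_over_limit` is a made-up helper, not part of any tool.

```python
from pathlib import Path


def count_lines(text):
    """Number of lines in a piece of source text."""
    return len(text.splitlines())


def files_over_limit(root, limit=300, pattern="*.css"):
    """Return (path, line_count) pairs for files under root that exceed limit."""
    oversized = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        lines = count_lines(path.read_text(encoding="utf-8", errors="replace"))
        if lines > limit:
            oversized.append((str(path), lines))
    return oversized
```

Something like `files_over_limit("src", limit=300, pattern="*.js")` would then list the candidates for splitting. A long file is only a hint, so treat the output as a to-review list rather than a hard rule.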

 

Actually, version control encourages us to have plenty of small files rather than one big file. Working on different files across a team helps avoid merge conflicts!

 

At least in the web, I know that you should have one or very few CSS stylesheets, to better control the style across different pages and all.

That's only true for the final build, not for the development environment.

It is much better to have small files with good names that each target one small section or module. They're easier to browse, easier to understand, and even easier to manage. Then, of course, they should be "compiled" into a single file, because that's an HTTP/1 optimization.

In fact, modern JS frameworks also encourage this approach, like Components in React.

I don't know about other languages, though.

 

I prefer small to mid-sized files.

I think that files should act like bookmarks for your code that help you and other developers understand what code does what, and easily find a specific group of code when something needs to be changed.

My advice is to use one file per module or class, or when it comes to languages that don't use modules and classes (like CSS), to try to pick a specific 'theme' for what the code in that file does.

Large files in and of themselves aren't necessarily a problem, but I find they often indicate a class or module that's trying to do too much, and thus breaking the single responsibility principle.

 

When developing I'd say the best practice is to have a good balance between the number of files and their size. If you have too many very small files or too few huge files your productivity will suffer anyway.

When delivering to clients, on the other hand, you do whatever is best for the client platform/interpreter (browser, Java VM, OS, etc.).
For browsers specifically, you usually want to minimize the number of requests needed to fetch all the resources so that the page can be rendered quickly.

 

Wow, so many answers! Thank you everybody for your valuable input.
It's great to see different opinions from different backgrounds, but I see that generally the agreement is that having more small files tends to be better for the human side of building and maintaining programs. As the computer doesn't really "care" about how many files there are, it's a matter of keeping all the developers, present and future, in mind, when deciding how to organize code. Thanks!

 

Splitting your JS into multiple files is more readable for me. And I don't think it's a bad thing, because even if you load several scripts in an HTML file, for example, the browser runs all of them in the same global scope, much as if they were one giant script.

 

For front-end assets, it's straightforward these days to break them down logically into smaller files that are easier to work with, and then concatenate them during the build process. So you can have many CSS/SCSS and JS files, but only serve one of each in production. That way you get the best of both worlds.
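Real projects use a bundler for this, but the core idea fits in a few lines. Here is a rough Python sketch, assuming plain CSS files that can simply be joined in name order; `concatenate` and `bundle_directory` are made-up names for illustration, not functions from any build tool.

```python
from pathlib import Path


def concatenate(named_sources):
    """Join (name, text) pairs into one string, in sorted name order.

    A CSS comment marks where each original file starts in the bundle.
    """
    parts = [f"/* {name} */\n{text.strip()}" for name, text in sorted(named_sources)]
    return "\n\n".join(parts) + "\n"


def bundle_directory(source_dir, output_file, pattern="*.css"):
    """Read every matching file in source_dir and write one bundled file.

    Write the output outside source_dir so a later run does not pick it up.
    """
    sources = [
        (path.name, path.read_text(encoding="utf-8"))
        for path in Path(source_dir).glob(pattern)
    ]
    Path(output_file).write_text(concatenate(sources), encoding="utf-8")
```

You keep editing the small files and only ever ship the generated one. Actual bundlers also minify, resolve imports, and so on, which this sketch deliberately leaves out.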

It's nearly always easier to understand lots of small files for server-side code too, and it's not likely to affect performance.

 

Basically, if you need (almost) all data from a file, keep it in one file. If you need just a small subset at a time and you can explicitly request a specific subset, keep subsets in separate files.

Let's say you have to create an address book for a small company. It contains a few offices and you will want to show all of them at once. It makes sense to keep the addresses in one file. But what if you were to create an address book of all companies in your country?

You would probably need to find a way to group addresses and keep them in separate files. Otherwise, it would be difficult to send so much data to a client or even open it on a server.

One solution would be to group addresses alphabetically. One file would contain companies starting with 'a', another starting with 'b' and so on. Then, your address book could allow a user to pick the first letter and reduce the amount of data loaded unnecessarily. You could further reduce file size by grouping addresses by the first two letters: 'aa', 'ab', 'ac'... Then, you could easily create a search on your site that would work when a user enters the first two letters.

This is pretty much how databases work with indexes and partitioning.
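A small Python sketch of the address book idea, assuming each bucket would be saved as its own file (that part is left out). `partition_by_prefix` and `lookup` are hypothetical helpers, and the company names in the usage example are invented.

```python
from collections import defaultdict


def partition_by_prefix(company_names, prefix_length=1):
    """Group names into buckets keyed by their first prefix_length letters.

    Each bucket stands in for one file, e.g. 'a' or 'ab'.
    """
    buckets = defaultdict(list)
    for name in company_names:
        key = name.strip().lower()[:prefix_length]
        buckets[key].append(name)
    return {key: sorted(names) for key, names in buckets.items()}


def lookup(buckets, query, prefix_length=1):
    """Open only the bucket that matches the query, then filter inside it."""
    candidates = buckets.get(query.lower()[:prefix_length], [])
    return [name for name in candidates if name.lower().startswith(query.lower())]
```

For example, with `buckets = partition_by_prefix(["Acme", "Abacus", "Bolt"], prefix_length=2)`, a call to `lookup(buckets, "ac", prefix_length=2)` only ever touches the `"ac"` bucket. The rest of the data never has to be loaded.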

Sometimes, even if you need to send all the data to a user, you might want to keep it in separate files. Back in the days of floppy disks, a game might be shipped as 19 archive files (rar/zip), each under 1.44 MB so that it fit on one disk. This is still a relevant approach in the days of the internet, as you can deliver "just enough" data to a user more quickly and load subsequent packages while the user is already enjoying the first ones.
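The splitting itself is simple. Here is a rough Python sketch that works on bytes in memory; `split_into_parts` and `join_parts` are made-up names, and real archive tools add headers and checksums to each volume, which this ignores.

```python
# Capacity of a "1.44 MB" floppy disk: 2880 sectors of 512 bytes.
FLOPPY_BYTES = 2880 * 512


def split_into_parts(data, part_size=FLOPPY_BYTES):
    """Split a bytes object into consecutive parts of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[start:start + part_size] for start in range(0, len(data), part_size)]


def join_parts(parts):
    """Reassemble the original data from its parts, in order."""
    return b"".join(parts)
```

The same idea applies on the web: the first part can be used as soon as it arrives, while the remaining parts are still downloading.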

 
 