DEV Community

Herbert Breunung


Good File Structure

My last post ended on the note that code files need a dependable structure. Work becomes more productive and enjoyable if you know where everything is without having to read a single character - just from scanning the contours. And you don't even have to memorize the order of sections if the file structure adheres to one overarching rule: FROM THE GENERAL TO THE DETAIL. This rule is intuitive and helps you understand the code in one read. Let me explain.

Even readers whose native script runs right to left or top to bottom, such as Hebrew or Japanese, start reading a code file in the upper left corner. At that point they may or may not have some knowledge of the content. So it is better to give them a broad picture of the content explicitly, right where they start reading. After that, smaller and smaller details are added as needed - FROM THE GENERAL TO THE DETAIL.

The most general information is meta information such as author, date and license. In my opinion the license is way too much text to be included in a code file. Just insert a reference to the license file, or put that file in an expected place like the project's root directory, which is also legally binding. And keep the other meta info, including its formatting, to a minimum, since it's rarely of interest. But put it at the very top or bottom so it's easy to skip.

Often you also see one or a few lines in the meta header that summarize the content or purpose of the file. This summary is much more important than the rest of the header and should stand just above the name of the namespace / package / object, so that both form a unit that sets up your orientation and expectation. This part is crucial and worth spending time on. It helps the author sharpen their understanding of what is to be achieved here, which can dramatically improve code clarity and thus quality. It also ensures that the namespace / package / object name is the expressive and pointed summary of the summary. Finding such names is a good part of professional programming, and this arrangement supports you in that: if you have the feeling that some meaning is missing from the name, you have to add it to the summary. Just don't stop there, but think hard about which parts of the summary can be deleted by choosing a better name. This technique can also be applied to any identifier.

The next less general information is the version number of this namespace and of the language used. Next come the pragmas (optional language features) and the libraries used here: first the ones from the language core, then third-party libs and finally the internal ones. Global constants, enums and variables follow. If there are more than a few of these items, group them for a better overview. Now only the routines, functions or methods are left. Visual separators help to distinguish between the head section just described and the various types of methods. The following paragraphs are only about methods, because if you have only functions you can group them by topic without much deeper thought. But you should also apply some of the principles that are best taught in an object-oriented context.

It is logical to start with the constructor, and not only because it's the first method you will use and probably the first a reader will look for. The constructor also tells you about many of the arguments used in the other methods and, more importantly, about the internal data structure of the class. With that knowledge any subsequent method becomes far easier to grok. And - while we are on the topic of the life cycle of an object - the destructor method and even serialisation (if present) should be part of this section too, if only because code that relies on the same special knowledge should be kept together to minimize searching and make the code more self-documenting.
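A minimal sketch of such a life cycle section, assuming a hypothetical `Playlist` class. Python rarely uses real destructors, so an explicit `close()` method stands in for one here. Construction, serialisation and teardown sit next to each other because all three depend on the same internal layout.

```python
import json


class Playlist:
    """A named, ordered list of track titles."""

    # === life cycle: construction, serialisation, teardown ===
    def __init__(self, name, tracks=None):
        # The whole internal data structure is visible right here.
        self._name = name
        self._tracks = list(tracks or [])
        self._closed = False

    @classmethod
    def from_json(cls, text):
        """Rebuild a playlist from the output of to_json()."""
        data = json.loads(text)
        return cls(data["name"], data["tracks"])

    def to_json(self):
        """Serialise the internal state to a JSON string."""
        return json.dumps({"name": self._name, "tracks": self._tracks})

    def close(self):
        """Explicit teardown; stands in for a destructor."""
        self._tracks.clear()
        self._closed = True
```

If the internal representation ever changes, everything that has to change with it is in one place.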

The only other methods that reach into parts of the internal data structure should be the getter and setter methods - also known as accessors. These make up the second block of methods. They are usually very small and give you a good, fast overview of the internal and external interface (API) and its data flow.
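In Python, accessors are commonly written as properties. The following sketch assumes a hypothetical `Account` class with one read-only attribute and one validated, writable attribute. Apart from the constructor, only the accessors touch the underscore-prefixed fields.

```python
class Account:
    """A bank account with an owner and an overdraft limit."""

    def __init__(self, owner, overdraft_limit=0):
        self._owner = owner
        self._overdraft_limit = 0
        self.overdraft_limit = overdraft_limit  # reuse the setter's validation

    # === accessors ===
    @property
    def owner(self):
        """Read-only: there is deliberately no setter."""
        return self._owner

    @property
    def overdraft_limit(self):
        return self._overdraft_limit

    @overdraft_limit.setter
    def overdraft_limit(self, value):
        if value < 0:
            raise ValueError("overdraft limit must not be negative")
        self._overdraft_limit = value
```

Scanning this block alone shows which parts of the state are exposed, which are writable, and what constraints apply.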

Next are the methods that contain the most lines of code - let's call them the workhorse methods. There should be only a few of them, and they should be well commented.

The fourth section contains helper functions and methods that are so specific to this class that they should not be abstracted into a namespace of their own. Some might protest heavily and call this bad practice and a violation of the stated goal: being able to understand the class in one reading. They want to know the content of a method before it is called, to have a thorough understanding of what is happening. While I sympathize with this stance, it rests on misunderstandings. First off, most constructors already call methods that appear further down in our order - so it is not a rule we can comply with while holding to the proposed order. Secondly, an implication of our stated main rule is the sub-rule: from the public API inward. This alone places the auxiliary methods last. But the best defense recalls the purpose of a method: it is a piece of abstraction. If you need to see the internals, you have created a flawed abstraction. Either the method name does not tell you what is going on, or the method is too big (complex), or even worse: it has side effects.
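Put together, the four sections could be laid out as in the sketch below. The `WordIndex` class and all its method names are hypothetical. Note that the constructor calls a helper defined at the very bottom. That works only because the helper's name says what it does and the helper has no side effects.

```python
class WordIndex:
    """Counts how often each word occurs in a text."""

    # === 1. life cycle ===
    def __init__(self, text):
        self._counts = {}
        # Helper is defined in section 4; its name has to be enough here.
        for word in self._split_into_words(text):
            self._counts[word] = self._counts.get(word, 0) + 1

    # === 2. accessors ===
    @property
    def vocabulary_size(self):
        return len(self._counts)

    def count(self, word):
        return self._counts.get(self._normalize(word), 0)

    # === 3. workhorse methods ===
    def most_common(self, n):
        """Return the n most frequent words, ties broken alphabetically."""
        ranked = sorted(self._counts.items(),
                        key=lambda item: (-item[1], item[0]))
        return [word for word, _ in ranked[:n]]

    # === 4. helpers (pure functions, no side effects) ===
    @staticmethod
    def _normalize(word):
        """Strip surrounding punctuation and lowercase the word."""
        return word.strip(".,;:!?\"'()").lower()

    @classmethod
    def _split_into_words(cls, text):
        """Split on whitespace and drop tokens that normalize to nothing."""
        words = (cls._normalize(raw) for raw in text.split())
        return [word for word in words if word]
```

Declaring the helpers as static and class methods is one way to make the absence of side effects visible: they cannot touch instance state at all.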

Sure, abiding by all this takes discipline at first (read: pain). But it pays off in the long run. Your code gets easier to read, maintain and extend (everything that was promised by using OO). Even writing new classes becomes easier, since you no longer have to think about a lot of details. You become free to concentrate on the irreducible problems that your class solves. So have fun and fine-tune the rules to your sensibilities and needs - but be consistent.
