A real story from building a manuscript submission system at a scientific publishing company.
We were building a system at a publishing company that accepts a research manuscript and sends it through an automated scientific publishing pipeline.
Then we had to write the class that an API calls to create a submission.
class ManuscriptSubmissionService:
    def __init__(self, title, abstract, authors, affiliations, keywords,
                 references, funding, ethics, conflicts,
                 supplementary_files, cover_letter, supporting_files,
                 manuscript_attachment):
        self.title = title
        self.abstract = abstract
        self.authors = authors
        self.affiliations = affiliations
        self.keywords = keywords
        self.references = references
        self.funding = funding
        self.ethics = ethics
        self.conflicts = conflicts
        self.supplementary_files = supplementary_files
        self.cover_letter = cover_letter
        self.supporting_files = supporting_files
        self.manuscript_attachment = manuscript_attachment
And creating an object looked like this:
submission = ManuscriptSubmissionService(
    "Covid-19 study",
    "some random abstract goes here",
    "a1, a2, a3",
    "aff1, aff2, aff3",
    "keyword1, keyword2",
    "references1, references2",
    "funding1",
    "ethic1",
    "conflict1",
    "supplementary1",
    "manuscript1",
    "supporting_file1",
    "manuscript_attachment1",
)
Look at that. Thirteen positional arguments. No context. No idea what the fourth parameter is without scrolling back up to the constructor definition every single time.
This is called the monster constructor problem: one __init__ that keeps growing until nobody can read it.
Fix 1: Named Parameters
The first instinct is to use named parameters with defaults:
class ManuscriptSubmissionService:
    def __init__(self, title: str = None, abstract: str = None,
                 authors: list = None, affiliations: list = None,
                 keywords: list = None, references: list = None,
                 funding: str = None, ethics: str = None,
                 conflicts: str = None, supplementary_files: list = None,
                 cover_letter: str = None, supporting_files: list = None,
                 manuscript_attachments: list = None):
        self.title = title
        self.abstract = abstract
        self.authors = authors
        self.affiliations = affiliations
        self.keywords = keywords
        self.references = references
        self.funding = funding
        self.ethics = ethics
        self.conflicts = conflicts
        self.supplementary_files = supplementary_files
        self.cover_letter = cover_letter
        self.supporting_files = supporting_files
        self.manuscript_attachments = manuscript_attachments

submission = ManuscriptSubmissionService(
    title="Covid-19 study",
    abstract="some random abstract goes here",
    authors="a1, a2, a3",
    affiliations="aff1, aff2, aff3",
)
This looks cleaner at the call site. But now everything is optional — None by default. A researcher can submit a manuscript with no title, no authors, no actual manuscript file. The system accepts it silently and breaks somewhere downstream.
The problem: no mandatory field enforcement.
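To make that failure concrete, here is a minimal sketch using a trimmed-down two-field class. `LooseSubmission` and `format_citation` are hypothetical names invented for this illustration; they are not part of the real system.

```python
class LooseSubmission:
    """Trimmed-down stand-in: every field is optional, as in Fix 1."""

    def __init__(self, title=None, authors=None):
        self.title = title
        self.authors = authors


def format_citation(submission):
    # Hypothetical downstream step that assumes a title is present.
    return submission.title.upper()


empty = LooseSubmission()          # accepted without complaint
try:
    format_citation(empty)
    failure = None
except AttributeError as exc:      # None has no .upper()
    failure = exc

print(type(failure).__name__)      # AttributeError
```

The constructor call succeeds; the error only appears later, in code that has nothing to do with submission creation.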
Fix 2: Validation Inside the Constructor
Okay, so add validation:
class ManuscriptSubmissionService:
    def __init__(self, title: str = None, abstract: str = None,
                 authors: list = None, affiliations: list = None,
                 keywords: list = None, references: list = None,
                 funding: str = None, ethics: str = None,
                 conflicts: str = None, supplementary_files: list = None,
                 cover_letter: str = None, supporting_files: list = None,
                 manuscript_attachments: list = None):
        if title is None:
            raise IOError("ManuscriptSubmissionService requires a title")
        if abstract is None:
            raise IOError("ManuscriptSubmissionService requires an abstract")
        if not authors:
            raise IOError("ManuscriptSubmissionService requires authors")
        if not affiliations:
            raise IOError("ManuscriptSubmissionService requires affiliations")
        if not manuscript_attachments:
            raise IOError("ManuscriptSubmissionService requires a manuscript attachment")
        self.title = title
        self.abstract = abstract
        self.authors = authors
        self.affiliations = affiliations
        self.keywords = keywords
        self.references = references
        self.funding = funding
        self.ethics = ethics
        self.conflicts = conflicts
        self.supplementary_files = supplementary_files
        self.cover_letter = cover_letter
        self.supporting_files = supporting_files
        self.manuscript_attachments = manuscript_attachments
This looks like it works. But stop and think about what the constructor is doing now.
It's validating the input AND initialising the object's state. Two responsibilities. One method.
That's a Single Responsibility Principle violation.
And there's a deeper problem hiding here. By the time validation runs, the object already exists in memory. Python allocated space for it the moment ManuscriptSubmissionService(...) was called. You're validating an object that's already been born — and potentially destroying it right after.
An invalid manuscript should never exist in the first place.
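The ordering described above can be observed directly. Calling a class first runs `__new__`, which creates the instance, and only then runs `__init__`. The sketch below uses a hypothetical one-field class, `ValidatedInInit`, and a module-level `events` list purely to record that order.

```python
events = []


class ValidatedInInit:
    """Hypothetical one-field class that validates inside __init__."""

    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        events.append("instance created")
        return instance

    def __init__(self, title=None):
        events.append("validation started")
        if title is None:
            raise ValueError("title is required")
        self.title = title


try:
    ValidatedInInit()
except ValueError:
    events.append("validation failed")

print(events)
# ['instance created', 'validation started', 'validation failed']
```

The instance is allocated before the first validation check runs, and is discarded once the exception propagates.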
Fix 3: Setters
What about an empty constructor and setters?
class ManuscriptSubmissionService:
    def __init__(self):
        self.title = None
        self.abstract = None
        self.authors = []
        self.affiliations = []
        self.keywords = []
        self.references = []
        self.funding = None
        self.ethics = None
        self.conflicts = None
        self.supplementary_files = []
        self.cover_letter = None
        self.supporting_files = []
        self.manuscript_attachments = []

    def set_title(self, title: str):
        self.title = title
        return self

    def set_abstract(self, abstract: str):
        self.abstract = abstract
        return self

    def set_author(self, author: str):
        self.authors.append(author)
        return self

    def set_affiliation(self, affiliation: str):
        self.affiliations.append(affiliation)
        return self

    def set_keyword(self, keyword: str):
        self.keywords.append(keyword)
        return self

    def set_references(self, reference: str):
        self.references.append(reference)
        return self

    def set_funding(self, funding: str):
        self.funding = funding
        return self

    def set_ethics(self, ethics: str):
        self.ethics = ethics
        return self

    def set_conflict(self, conflict: str):
        self.conflicts = conflict
        return self

    def set_supplementary_files(self, supplementary_file: str):
        self.supplementary_files.append(supplementary_file)
        return self

    def set_cover_letter(self, cover_letter: str):
        self.cover_letter = cover_letter
        return self

    def set_supporting_files(self, supporting_file: str):
        self.supporting_files.append(supporting_file)
        return self

    def set_manuscript_attachment(self, manuscript_attachment: str):
        self.manuscript_attachments.append(manuscript_attachment)
        return self

    def validate(self):
        if self.title is None:
            raise IOError("ManuscriptSubmissionService requires a title")
        if self.abstract is None:
            raise IOError("ManuscriptSubmissionService requires an abstract")
        if len(self.authors) == 0:
            raise IOError("ManuscriptSubmissionService requires authors")
        if len(self.affiliations) == 0:
            raise IOError("ManuscriptSubmissionService requires affiliations")
        if len(self.manuscript_attachments) == 0:
            raise IOError("ManuscriptSubmissionService requires a manuscript attachment")

# Usage
submission = ManuscriptSubmissionService()
submission.set_title("Covid-19 study")
submission.set_abstract("This paper studies...")
submission.set_author("Dr. Smith")
submission.set_affiliation("Oxford University")
submission.set_manuscript_attachment("covid19_paper.pdf")
submission.validate()
Now the object exists with no data. Nothing stops you from calling send_to_pipeline(submission) before calling any setters. The half-built object silently enters the publishing pipeline and breaks somewhere downstream.
The problem: object exists in an invalid state before validate() is ever called.
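A short sketch of that scenario, assuming a trimmed-down two-field class. `SetterSubmission` is a hypothetical stand-in, and this `send_to_pipeline` is an invented placeholder for whatever the real pipeline entry point does.

```python
class SetterSubmission:
    """Trimmed-down stand-in for the setter-based class above."""

    def __init__(self):
        self.title = None
        self.authors = []

    def set_title(self, title):
        self.title = title
        return self

    def validate(self):
        if self.title is None:
            raise ValueError("a title is required")


def send_to_pipeline(submission):
    # Hypothetical pipeline entry point: it trusts whatever it is given.
    return {"title": submission.title, "author_count": len(submission.authors)}


half_built = SetterSubmission()          # no setters called, no validate()
payload = send_to_pipeline(half_built)   # nothing objects
print(payload)                           # {'title': None, 'author_count': 0}
```

Because `validate()` is a separate, optional call, the empty object is indistinguishable from a complete one at the point where it is handed over.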
Three Approaches. Three Failures.
| Approach | Problem |
|---|---|
| Monster constructor | Unreadable, positional confusion |
| Named params + validation in __init__ | SRP violation, object born before validation |
| Setters | Invalid object state, nothing enforced |
None of them fully solve it. What we actually need is a system that builds the object step by step and only creates it in memory when everything is in place.
This is exactly where the Builder pattern comes in.
What Is the Builder Pattern?
Builder is a creational design pattern — it falls under the umbrella of patterns that control how objects are created.
Its job: separate the construction and validation of an object from its representation. An object should only come to life when it has everything it needs.
Two classes. One job each:
- ManuscriptSubmissionService: holds data. Nothing else.
- ManuscriptBuilder: sets values, validates, and only then creates the object.
Building It Step by Step
Step 1: Separate the Classes
ManuscriptSubmissionService.__init__ does exactly one thing — stores what it receives. No validation. No logic. One responsibility.
class ManuscriptSubmissionService:
    def __init__(self, title: str, abstract: str, authors: list,
                 affiliations: list, keywords: list, references: list,
                 funding: str, ethics: str, conflicts: str,
                 supplementary_files: list, cover_letter: str,
                 supporting_files: list, manuscript_attachments: list):
        # just stores data, nothing else
        self.title = title
        self.abstract = abstract
        self.authors = authors
        self.affiliations = affiliations
        self.keywords = keywords
        self.references = references
        self.funding = funding
        self.ethics = ethics
        self.conflicts = conflicts
        self.supplementary_files = supplementary_files
        self.cover_letter = cover_letter
        self.supporting_files = supporting_files
        self.manuscript_attachments = manuscript_attachments

    @staticmethod
    def builder():
        return ManuscriptSubmissionService.ManuscriptBuilder()
Step 2: The Builder with Method Chaining
Every setter returns self. This enables method chaining — calling multiple setters in one readable fluent line.
Also notice: the builder initialises all fields to None or empty lists. This way, if someone calls build() without setting a field, our validation runs correctly — no AttributeError, just a clean IOError.
    # nested inside ManuscriptSubmissionService
    class ManuscriptBuilder:
        def __init__(self):
            self.title = None
            self.abstract = None
            self.authors = []
            self.affiliations = []
            self.keywords = []
            self.references = []
            self.funding = None
            self.ethics = None
            self.conflicts = None
            self.supplementary_files = []
            self.cover_letter = None
            self.supporting_files = []
            self.manuscript_attachments = []

        def set_title(self, title: str):
            self.title = title
            return self

        def set_abstract(self, abstract: str):
            self.abstract = abstract
            return self

        def set_author(self, author: str):
            self.authors.append(author)
            return self

        def set_affiliation(self, affiliation: str):
            self.affiliations.append(affiliation)
            return self

        def set_keyword(self, keyword: str):
            self.keywords.append(keyword)
            return self

        def set_references(self, reference: str):
            self.references.append(reference)
            return self

        def set_funding(self, funding: str):
            self.funding = funding
            return self

        def set_ethics(self, ethics: str):
            self.ethics = ethics
            return self

        def set_conflict(self, conflict: str):
            self.conflicts = conflict
            return self

        def set_supplementary_files(self, supplementary_file: str):
            self.supplementary_files.append(supplementary_file)
            return self

        def set_cover_letter(self, cover_letter: str):
            self.cover_letter = cover_letter
            return self

        def set_supporting_files(self, supporting_file: str):
            self.supporting_files.append(supporting_file)
            return self

        def set_manuscript_attachment(self, manuscript_attachment: str):
            self.manuscript_attachments.append(manuscript_attachment)
            return self
Step 3: build() — Validate First, Create Second
This is the heart of the pattern. Validation happens entirely inside build() — before ManuscriptSubmissionService(...) is ever called. If validation fails, the object is never created. Not even for a millisecond.
        def build(self):
            if self.title is None:
                raise IOError("ManuscriptSubmissionService requires a title")
            if self.abstract is None:
                raise IOError("ManuscriptSubmissionService requires an abstract")
            if len(self.authors) == 0:
                raise IOError("ManuscriptSubmissionService requires authors")
            if len(self.affiliations) == 0:
                raise IOError("ManuscriptSubmissionService requires affiliations")
            if len(self.manuscript_attachments) == 0:
                raise IOError("ManuscriptSubmissionService requires a manuscript attachment")
            # only NOW does the object get created
            return ManuscriptSubmissionService(
                self.title, self.abstract, self.authors, self.affiliations,
                self.keywords, self.references, self.funding, self.ethics,
                self.conflicts, self.supplementary_files, self.cover_letter,
                self.supporting_files, self.manuscript_attachments
            )
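The claim that a failed build never creates the product can be checked with a small experiment. This is a sketch on a hypothetical two-field pair, `Paper` and `PaperBuilder`, where `Paper.created` is an illustrative counter incremented in `__new__` every time an instance is allocated.

```python
class Paper:
    """Hypothetical two-field product; counts every instance ever allocated."""

    created = 0

    def __new__(cls, *args, **kwargs):
        cls.created += 1
        return super().__new__(cls)

    def __init__(self, title, authors):
        self.title = title
        self.authors = authors


class PaperBuilder:
    def __init__(self):
        self.title = None
        self.authors = []

    def set_title(self, title):
        self.title = title
        return self

    def set_author(self, author):
        self.authors.append(author)
        return self

    def build(self):
        # Validate first; Paper(...) is only reached when every check passes.
        if self.title is None:
            raise ValueError("a title is required")
        if not self.authors:
            raise ValueError("at least one author is required")
        return Paper(self.title, self.authors)


try:
    PaperBuilder().set_title("Untitled draft").build()   # no author set
except ValueError as exc:
    print(exc)                 # at least one author is required
print(Paper.created)           # 0

paper = PaperBuilder().set_title("Final").set_author("Dr. Smith").build()
print(Paper.created)           # 1
```

The counter stays at zero after the failed build, which is the difference from validating inside the constructor.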
Step 4: Block Direct Instantiation
There's still a loophole. Nothing stops someone from bypassing the builder entirely:
# bypasses all validation — object created in invalid state
submission = ManuscriptSubmissionService(None, None, [], [], [], [], None, None, None, [], None, [], [])
Python doesn't support private constructors. But we can enforce it with a UUID token and a custom exception:
from uuid import uuid4, UUID

class InitializationError(Exception):
    """Raised when ManuscriptSubmissionService is instantiated directly instead of via builder."""
    def __init__(self, message: str):
        super().__init__(message)

class ManuscriptSubmissionService:
    _Builder_token = uuid4()  # generated once at class definition time

    def __init__(self, token_: UUID, title: str, ...):
        if token_ != self._Builder_token:
            raise InitializationError(
                "ManuscriptSubmissionService cannot be initialized directly. "
                "Use ManuscriptSubmissionService.builder()"
            )
        # rest of init...
The token is generated once when the class is defined. The builder passes it to the constructor; any caller that does not supply the matching token gets an InitializationError immediately.
Why a custom exception instead of ValueError? In production, specific exceptions mean:
- Easier debugging: you know exactly what failed
- Callers can catch InitializationError specifically
- Cleaner, more descriptive logs
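Since the snippet above elides most of the constructor, here is a compact, runnable sketch of the same guard on a hypothetical one-field class. `Report`, its nested `Builder`, and the `_builder_token` attribute name are illustrative choices, not the article's real classes.

```python
from uuid import uuid4


class InitializationError(Exception):
    """Raised when Report is created without going through its builder."""


class Report:
    _builder_token = uuid4()   # created once, when the class body runs

    def __init__(self, token, title):
        if token != Report._builder_token:
            raise InitializationError("Use Report.builder() instead")
        self.title = title

    @staticmethod
    def builder():
        return Report.Builder()

    class Builder:
        def __init__(self):
            self.title = None

        def set_title(self, title):
            self.title = title
            return self

        def build(self):
            if self.title is None:
                raise ValueError("a title is required")
            return Report(Report._builder_token, self.title)


try:
    Report(uuid4(), "direct call")     # a fresh UUID does not match the token
except InitializationError as exc:
    print(exc)                         # Use Report.builder() instead

report = Report.builder().set_title("via builder").build()
print(report.title)                    # via builder
```

Callers can now handle `InitializationError` separately from the validation errors raised in `build()`.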
The Complete Implementation
from uuid import uuid4, UUID

class InitializationError(Exception):
    """Raised when ManuscriptSubmissionService is instantiated directly instead of via builder."""
    def __init__(self, message: str):
        super().__init__(message)

class ManuscriptSubmissionService:
    _Builder_token = uuid4()

    def __init__(self, token_: UUID, title: str, abstract: str, authors: list,
                 affiliations: list, keywords: list, references: list,
                 funding: str, ethics: str, conflicts: str,
                 supplementary_files: list, cover_letter: str,
                 supporting_files: list, manuscript_attachments: list):
        if token_ != self._Builder_token:
            raise InitializationError(
                "ManuscriptSubmissionService cannot be initialized directly. "
                "Use ManuscriptSubmissionService.builder()"
            )
        self.title = title
        self.abstract = abstract
        self.authors = authors
        self.affiliations = affiliations
        self.keywords = keywords
        self.references = references
        self.funding = funding
        self.ethics = ethics
        self.conflicts = conflicts
        self.supplementary_files = supplementary_files
        self.cover_letter = cover_letter
        self.supporting_files = supporting_files
        self.manuscript_attachments = manuscript_attachments

    @staticmethod
    def builder():
        return ManuscriptSubmissionService.ManuscriptBuilder()

    class ManuscriptBuilder:
        def __init__(self):
            self.title = None
            self.abstract = None
            self.authors = []
            self.affiliations = []
            self.keywords = []
            self.references = []
            self.funding = None
            self.ethics = None
            self.conflicts = None
            self.supplementary_files = []
            self.cover_letter = None
            self.supporting_files = []
            self.manuscript_attachments = []

        def set_title(self, title: str):
            self.title = title
            return self

        def set_abstract(self, abstract: str):
            self.abstract = abstract
            return self

        def set_author(self, author: str):
            self.authors.append(author)
            return self

        def set_affiliation(self, affiliation: str):
            self.affiliations.append(affiliation)
            return self

        def set_keyword(self, keyword: str):
            self.keywords.append(keyword)
            return self

        def set_references(self, reference: str):
            self.references.append(reference)
            return self

        def set_funding(self, funding: str):
            self.funding = funding
            return self

        def set_ethics(self, ethics: str):
            self.ethics = ethics
            return self

        def set_conflict(self, conflict: str):
            self.conflicts = conflict
            return self

        def set_supplementary_files(self, supplementary_file: str):
            self.supplementary_files.append(supplementary_file)
            return self

        def set_cover_letter(self, cover_letter: str):
            self.cover_letter = cover_letter
            return self

        def set_supporting_files(self, supporting_file: str):
            self.supporting_files.append(supporting_file)
            return self

        def set_manuscript_attachment(self, manuscript_attachment: str):
            self.manuscript_attachments.append(manuscript_attachment)
            return self

        def build(self):
            if self.title is None:
                raise IOError("ManuscriptSubmissionService requires a title")
            if self.abstract is None:
                raise IOError("ManuscriptSubmissionService requires an abstract")
            if len(self.authors) == 0:
                raise IOError("ManuscriptSubmissionService requires authors")
            if len(self.affiliations) == 0:
                raise IOError("ManuscriptSubmissionService requires affiliations")
            if len(self.manuscript_attachments) == 0:
                raise IOError("ManuscriptSubmissionService requires a manuscript attachment")
            return ManuscriptSubmissionService(
                ManuscriptSubmissionService._Builder_token,
                self.title, self.abstract, self.authors, self.affiliations,
                self.keywords, self.references, self.funding, self.ethics,
                self.conflicts, self.supplementary_files, self.cover_letter,
                self.supporting_files, self.manuscript_attachments
            )
Usage — Plain English at the Call Site
submission = (
    ManuscriptSubmissionService.builder()
    .set_title("Covid-19 and its effects on respiratory systems")
    .set_abstract("This paper studies the long-term effects of Covid-19...")
    .set_author("Dr. Smith")
    .set_author("Dr. Jones")
    .set_affiliation("Oxford University")
    .set_affiliation("Imperial College London")
    .set_keyword("covid")
    .set_keyword("pandemic")
    .set_keyword("respiratory")
    .set_funding("WHO Grant 2023")
    .set_manuscript_attachment("covid19_paper.pdf")
    .build()
)
print(submission.title)    # Covid-19 and its effects on respiratory systems
print(submission.authors)  # ['Dr. Smith', 'Dr. Jones']
The Director — Presets for Known Configurations
Builder has an optional companion — the Director. It doesn't build anything itself. It just knows the right combination of steps to create commonly used presets.
We had different submission types: standard journal, fast-track, open access. Each had different default configurations. Instead of copy-pasting builder chains everywhere, the Director encapsulates those presets.
The Director should get a fresh builder for every preset call — not reuse the same builder instance, which would cause state pollution across calls.
class ManuscriptDirector:
    def __init__(self, builder):
        # builder is a factory (a callable), so each preset starts from a fresh builder
        self.__builder = builder

    def standard_submission(self):
        return self.__builder().set_ethics("standard-ethics-declaration").set_conflict("no-conflicts-declared")

    def fast_track_submission(self):
        return self.__builder().set_ethics("expedited-ethics").set_cover_letter("fast-track-request-letter")

    def open_access_submission(self):
        return self.__builder().set_funding("open-access-fund-2024").set_ethics("standard-ethics-declaration")

director = ManuscriptDirector(ManuscriptSubmissionService.builder)
submission = (
    director.standard_submission()
    .set_title("Effects of Climate Change on Biodiversity")
    .set_abstract("This paper explores...")
    .set_author("Dr. Patel")
    .set_affiliation("Cambridge University")
    .set_manuscript_attachment("climate_paper.pdf")
    .build()
)
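The state-pollution concern mentioned earlier is easy to reproduce. The sketch below uses a hypothetical `ListBuilder` that only collects authors and returns a plain list from `build()`; it is a stand-in to show the effect, not the article's builder.

```python
class ListBuilder:
    """Hypothetical builder that only collects authors."""

    def __init__(self):
        self.authors = []

    def set_author(self, author):
        self.authors.append(author)
        return self

    def build(self):
        return list(self.authors)


# Reusing one builder instance: the second result inherits the first author.
shared = ListBuilder()
first = shared.set_author("Dr. Patel").build()
second = shared.set_author("Dr. Jones").build()
print(second)        # ['Dr. Patel', 'Dr. Jones']

# Passing the factory (here, the class itself) gives each call a clean slate.
make_builder = ListBuilder
third = make_builder().set_author("Dr. Patel").build()
fourth = make_builder().set_author("Dr. Jones").build()
print(fourth)        # ['Dr. Jones']
```

This is why the Director above receives the `builder` factory method rather than an already-created builder object.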
When do you need a Director? When you have multiple known configurations of the same object and you want a single place to manage those presets. When you only have one configuration — skip it. Don't over-engineer.
The Tradeoffs
What Builder Gives You
- ✅ Readable object creation
- ✅ Mandatory field enforcement
- ✅ No SRP violation
- ✅ Object never exists in an invalid state
- ✅ Optional fields with sensible defaults in the builder
- ✅ Method chaining
What Builder Costs You
1. Boilerplate — the OCP violation
Every new mandatory field means changes in 5 places:
- ManuscriptSubmissionService.__init__: add the parameter
- ManuscriptBuilder.__init__: add default None
- Add a new setter method
- Add validation in build()
- Pass it when calling ManuscriptSubmissionService(...)
This is an Open/Closed Principle violation — adding a field forces modification of existing code in multiple places.
2. More Code for Simple Cases
A class with 2-3 fields doesn't need a Builder. You've added a nested class, a static method, and 3 setters for what could have been a simple constructor with keyword arguments. Apply Builder when the complexity justifies it.
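As a rough illustration of the simpler alternative for a small class, here is a sketch with keyword-only arguments. `Reviewer` and its fields are hypothetical and exist only for this example.

```python
class Reviewer:
    """Hypothetical three-field class, small enough not to need a builder."""

    def __init__(self, *, name, email, expertise="general"):
        # The bare * makes every argument keyword-only; name and email have
        # no default, so Python itself rejects a call that omits them.
        self.name = name
        self.email = email
        self.expertise = expertise


reviewer = Reviewer(name="Dr. Lee", email="lee@example.org")
print(reviewer.expertise)      # general

try:
    Reviewer(name="Dr. Lee")   # email missing
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```

With only a handful of fields, the call site stays readable and the required arguments are enforced without any extra classes.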
3. Direct Instantiation is Not Truly Private
The UUID token trick works in practice, but it's a convention enforced at runtime — not a language feature. Python doesn't support private constructors. A determined developer can inspect ManuscriptSubmissionService._Builder_token and bypass the guard. This is an accepted limitation of the pattern in Python.
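A minimal sketch of that bypass, using a hypothetical one-field class `Guarded` with the same kind of token check:

```python
from uuid import uuid4


class Guarded:
    """Hypothetical one-field class using the same token guard."""

    _Builder_token = uuid4()

    def __init__(self, token_, title):
        if token_ != Guarded._Builder_token:
            raise RuntimeError("use the builder")
        self.title = title


# The leading underscore is only a naming convention, so the token is readable.
sneaky = Guarded(Guarded._Builder_token, None)
print(sneaky.title)    # None
```

The guard stops accidental direct construction; it does not stop someone who reads the class attribute on purpose.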
4. No Compile-Time Safety
Unlike Java or Kotlin where you can enforce mandatory fields at compile time, Python's Builder enforcement is entirely at runtime. A missing mandatory field only surfaces when build() is called — not when the setter is forgotten. Type checkers and tests help, but they don't eliminate the gap.