
Grizzly Coda

Posted on • Originally published at confdroid.com

Puppet with Foreman - Pilot

Understanding Puppet Core with Foreman as ENC

Introduction to Puppet DSL

Puppet DSL (Domain-Specific Language) serves as a declarative language designed to define the desired state of systems within an infrastructure. This approach facilitates automation and ensures consistent configuration management across varied environments.
Puppet boasts a vibrant community and offers flexibility in its application. Much like Linux, it can be tailored to meet the specific needs and requirements of any given infrastructure. While many infrastructures share common configuration practices, each one invariably includes unique elements not found in others. This diversity often prompts a key decision: whether to rely on community modules or to develop custom ones.
This series explores the process of building a custom infrastructure using proprietary modules, with Foreman employed as the External Node Classifier (ENC).

Puppet has evolved for over a decade, from its open-source roots to what is now known as Puppet Core. Over that time the author has developed approximately 75 modules, and that experience forms the foundation for the insights and best practices shared in this series.

Configuration Management vs. Infrastructure as Code

Configuration management (CM) and Infrastructure as Code (IaC) represent foundational concepts in modern IT operations, both aimed at automating and standardizing the deployment and maintenance of systems. However, they differ in scope, focus, and application.

Basics and Principles

Configuration management involves the systematic handling of changes to ensure that systems remain in a desired, consistent state over time. It emphasizes repeatability, auditability, and control, often through tools that enforce policies and remediate deviations automatically. The core principle here is idempotency—meaning operations can be applied multiple times without changing the result beyond the initial application—combined with version control for configurations.
Infrastructure as Code, on the other hand, extends this idea by treating infrastructure elements (such as servers, networks, and databases) as software code. IaC enables the provisioning, configuration, and management of entire environments through code repositories, allowing for versioning, collaboration, and automated testing similar to software development practices. Its principles include declarative definitions (describing "what" the infrastructure should look like) or imperative scripts (detailing "how" to achieve it), with a strong emphasis on scalability and integration with CI/CD pipelines.
In essence, CM focuses more on ongoing maintenance and compliance of existing systems, while IaC encompasses the full lifecycle, from creation to decommissioning, promoting infrastructure as a programmable entity.
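To make the idempotency principle concrete, here is a minimal Python sketch. It is not how Puppet implements resources; the helper name `ensure_line` and the example file are assumptions chosen purely for illustration. The function describes a desired state ("this line is present") and only changes the file when that state is not already met.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True if a change was made."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, nothing to do
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


# Demonstration in a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sshd_config"
    first_run = ensure_line(target, "PermitRootLogin no")   # changes the file
    second_run = ensure_line(target, "PermitRootLogin no")  # no-op
    print(first_run, second_run)
```

Running the same call twice leaves the file identical after the first run, which is the property a CM tool relies on when it re-applies a configuration on every agent run.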

Common Tools

Several tools dominate the landscape for both CM and IaC, each with strengths suited to different scenarios:

  • Chef: A Ruby-based CM tool that uses "recipes" and "cookbooks" for declarative or imperative configurations. It excels in environments requiring fine-grained control and is often favored for its extensibility in large-scale operations.
  • Ansible: Known for its agentless architecture and simplicity, Ansible employs YAML playbooks for automation. It's ideal for quick setups and ad-hoc tasks, blending CM with IaC through modules for provisioning cloud resources.
  • Puppet: A declarative CM system using its DSL to model system states. It supports master-agent or masterless modes and integrates well with IaC for consistent enforcement across hybrid environments.
  • SaltStack: Offers high-speed execution via its event-driven model, with "states" for CM and "pillars" for data management. It's particularly strong in real-time orchestration and remote execution.
  • Terraform: Primarily an IaC tool from HashiCorp, it uses HCL (HashiCorp Configuration Language) to provision and manage infrastructure across providers like AWS, Azure, and Google Cloud. Terraform shines in multi-cloud scenarios with its state management and modular design.

Where and When to Use Which

The choice between these tools depends on factors such as team expertise, environment complexity, and specific goals:

  • Opt for Chef or Puppet in traditional CM scenarios where long-term configuration drift prevention is critical, such as in enterprise servers or legacy systems requiring detailed auditing.
  • Choose Ansible for simpler, agentless deployments or when speed and ease of learning are priorities, like in DevOps teams handling diverse tasks from configuration to orchestration.
  • Use SaltStack for high-performance needs in large-scale, dynamic environments, such as cloud-native setups demanding real-time responses.
  • Turn to Terraform when IaC is the focus, particularly for provisioning infrastructure from scratch in cloud-heavy workflows, ensuring reproducibility across development, staging, and production.

In hybrid cases, tools can be combined—for instance, using Terraform for initial provisioning followed by Puppet for ongoing CM. Ultimately, the best tool aligns with the infrastructure's scale, the team's workflow, and the need for integration with existing systems.

Deep Dive into Puppet

Puppet Hiera vs. ENC

In Puppet, data management and node classification are handled through mechanisms like Hiera and External Node Classifiers (ENC), each serving distinct yet complementary roles.
Hiera functions as a built-in, hierarchical key-value lookup system integrated directly into Puppet. It allows data (such as configuration parameters, secrets, or environment-specific values) to be separated from the code, promoting reusability and modularity. Data is organized in a hierarchy—often based on facts like operating system, environment, or hostname—and queried during catalog compilation. For example, a default value might be overridden by a more specific one higher in the hierarchy. Hiera supports backends like YAML, JSON, or even custom databases, making it efficient for static data retrieval without external dependencies.
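The hierarchical lookup described here can be sketched in a few lines of Python. This is only an illustration of the "most specific layer wins" idea, not Hiera's actual implementation; the layer names, keys, and values below are hypothetical, and Hiera's merge strategies are deliberately left out.

```python
# Hypothetical hierarchy, ordered from most specific to least specific.
HIERARCHY = ["nodes/{fqdn}", "os/{os_family}", "common"]

# Hypothetical data layers standing in for YAML files on disk.
DATA = {
    "nodes/web01.example.com": {"ntp::servers": ["10.0.0.1"]},
    "os/RedHat": {"ssh::port": 2222},
    "common": {"ntp::servers": ["0.pool.ntp.org"], "ssh::port": 22},
}


def lookup(key, facts):
    """Return the value from the first (most specific) layer that defines `key`."""
    for template in HIERARCHY:
        layer = DATA.get(template.format(**facts), {})
        if key in layer:
            return layer[key]
    raise KeyError(key)


web01 = {"fqdn": "web01.example.com", "os_family": "RedHat"}
db01 = {"fqdn": "db01.example.com", "os_family": "Debian"}
print(lookup("ntp::servers", web01))  # node-level value overrides the default
print(lookup("ssh::port", db01))      # falls through to the common layer
```

The node-specific layer is consulted first, so a host only needs an entry there when it differs from the operating-system or common defaults.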
An ENC, conversely, is an external system that classifies nodes and provides class inclusions and parameters to Puppet. Rather than being embedded in Puppet's codebase, an ENC is an executable that the Puppet server runs with the node's name as its argument whenever that node requests a catalog; the executable may in turn query a separate service, for example over an API. This external approach enables dynamic classification based on external data sources, such as databases or inventory systems. ENCs return YAML-formatted responses specifying which classes to apply to a node and any associated parameters, offering greater flexibility for complex, enterprise-level setups.
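As a rough sketch of what an ENC returns, the following Python example turns a node name into a YAML document with the `classes`, `parameters`, and `environment` keys that Puppet expects from an ENC. The inventory contents, node name, and class names are hypothetical, and the tiny YAML renderer only covers the nested structure used here; a real ENC such as the one shipped with Foreman does considerably more.

```python
# Hypothetical inventory standing in for an external data source.
INVENTORY = {
    "web01.example.com": {
        "environment": "production",
        # A value of None means "include the class without parameters".
        "classes": {"nginx": {"worker_processes": 4}, "ntp": None},
        "parameters": {"role": "web"},
    },
}


def to_yaml(value, indent=0):
    """Render nested dicts of scalars as block-style YAML (enough for this sketch)."""
    pad = " " * indent
    lines = []
    for key, item in value.items():
        if item is None:
            lines.append(f"{pad}{key}:")
        elif isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(item, indent + 2))
        else:
            lines.append(f"{pad}{key}: {item}")
    return "\n".join(lines)


def classify(node_name):
    """Return the ENC document for a node, or None if the node is unknown."""
    node = INVENTORY.get(node_name)
    if node is None:
        return None
    return "---\n" + to_yaml(node) + "\n"


print(classify("web01.example.com"), end="")
```

When wired up as an actual ENC, a script like this would read the node name from its first command-line argument, write the document to standard output, and exit with a non-zero status for unknown nodes so that the failure is visible to the Puppet server.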
The key difference lies in scope and integration. Hiera is primarily for data lookup within Puppet's internal workflow, focusing on parameterizing manifests without altering class assignments. With hiera-eyaml, sensitive values in lookup files can be encrypted at rest and are decrypted on the Puppet server (the compile master) during catalog compilation. An ENC, however, handles both classification (deciding which modules or classes apply to a node) and parameterization externally, allowing for more programmatic control outside Puppet's core.

What Makes Foreman Superior as an ENC

Foreman stands out as a superior ENC option compared to relying solely on Hiera, particularly in environments demanding comprehensive management and scalability.
Unlike Hiera's internal, file-based hierarchy, Foreman provides a full-featured web-based interface for managing nodes, hosts, and configurations. As an ENC, it integrates seamlessly with Puppet by classifying nodes based on rich metadata, including host groups, smart class parameters, and facts. This enables automated, rule-based assignments, such as applying specific classes to all web servers in a production environment, without manual manifest edits.
Foreman's superiority is evident in several areas:

  • Provisioning Integration: Foreman doubles as a provisioning tool, supporting bare-metal, virtual, and cloud instances through integrations with services such as DHCP and DNS and with virtualization back ends such as libvirt. This lifecycle management, from initial boot to ongoing configuration, surpasses Hiera's data-only focus, reducing silos in operations.
  • User-Friendly UI and Reporting: Administrators can visualize and edit node classifications via a graphical dashboard, complete with search, auditing, and reporting features. This contrasts with Hiera's command-line or file-based interactions, making Foreman more accessible for teams and enabling better collaboration.
  • Extensibility and Plugins: With plugins for facts importation, smart proxies (for distributed environments), and integrations with tools like Ansible or Chef, Foreman offers a modular ecosystem. It handles complex scenarios, such as multi-tenant setups or compliance reporting, where Hiera alone would require custom scripting.
  • Dynamic and Scalable: Foreman's database-backed approach supports real-time updates and queries, ideal for large-scale infrastructures. It avoids the performance bottlenecks of deeply nested Hiera hierarchies and provides failover through smart proxies.
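The host-group-based classification mentioned above can be pictured with a small Python sketch. It is a simplified model and not Foreman's data model or code: the group names, classes, and parameters are hypothetical, and the only idea it illustrates is that nested host groups pass classes down to their children, with more specific groups overriding inherited parameters.

```python
# Hypothetical nested host groups; each names its parent and contributes classes.
HOST_GROUPS = {
    "base": {
        "parent": None,
        "classes": {"ntp": {"servers": ["0.pool.ntp.org"]}, "ssh": {}},
    },
    "base/web": {
        "parent": "base",
        "classes": {"nginx": {"worker_processes": 4}},
    },
    "base/web/prod": {
        "parent": "base/web",
        "classes": {"nginx": {"worker_processes": 8}},
    },
}


def resolve_classes(group_name):
    """Merge classes from the root group down to `group_name`; children override parents."""
    chain = []
    current = group_name
    while current is not None:
        group = HOST_GROUPS[current]
        chain.append(group)
        current = group["parent"]
    merged = {}
    for group in reversed(chain):  # apply the root first so that children win
        for class_name, params in group["classes"].items():
            merged.setdefault(class_name, {}).update(params)
    return merged


print(resolve_classes("base/web/prod"))
```

Assigning a host to `base/web/prod` in this model yields the `ntp` and `ssh` classes from the root group plus `nginx` with the production override, so no per-node manifest edits are needed.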

In summary, while Hiera excels at simple data separation, Foreman as an ENC elevates Puppet deployments by offering an all-in-one platform for classification, provisioning, and oversight, making it the preferred choice for robust, enterprise-grade automation.

Coming next

  • planning your Puppet infrastructure
  • setting up your first puppetmaster with Foreman

