DEV Community

mzandinia

Automating Linux Distribution Updates with Ansible and Monitoring with Splunk


When you plan to automate a project, it's essential to understand what you aim to achieve in the end. Automation is not achieved by tools alone; tools are merely aids that make our lives easier. For the architect responsible for the automation process, Stephen Covey's principle of "Begin with the end in mind" is crucial: you must consider every aspect of your automation project.

This project streamlines the updating process of the most popular Linux distributions (RedHat and Debian-based) by leveraging a Nexus repository as a centralized source for installation files and packages.
The project consists of the following key components:

  1. Nexus Repository: A private Nexus repository is set up to store and manage installation files and packages required for updating the Linux instances. This ensures a reliable and controlled source for the necessary files.
  2. Splunk Installation: An Ansible playbook is used to install Splunk on a dedicated instance within a private subnet. The Splunk instance retrieves the required installation files from the Nexus repository, ensuring a secure and efficient installation process.
  3. Ansible-based Updates: Ansible is employed to automate the updating process for all the Linux instances. It executes the necessary tasks to update the instances and logs the results for further analysis and monitoring.
  4. Splunk Dashboard: The update statuses and outcomes are visualized through a comprehensive dashboard in Splunk. This dashboard provides a clear and centralized view of the update statuses across different environments, enabling easy monitoring and troubleshooting.

Repository Structure

  • Terraform files: Essential files for AWS resource creation.

  • Scripts: Scripts for proper instance configuration.

  • ansible-project: Contains playbooks and roles for various configurations.

Usage
Prerequisites
Before you can use these configurations, you need to have the following:

  • An AWS account with appropriate permissions to create resources such as VPC, EC2 instances, and security groups.
  • Terraform installed on your local machine (see the official Install Terraform guide).
  • AWS credentials (Access Key and Secret Access Key) configured for use with the AWS CLI or Terraform.

Deploy all the infrastructure with Terraform

git clone https://github.com/mzandinia/sample-project-on-AWS
cd sample-project-on-AWS/week4
terraform init
terraform plan
terraform apply

After Terraform creates all the necessary resources, the terraform output command provides three values:

  1. Nexus initial password
  2. Load Balancer address
  3. Nexus login URL

terraform output
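
To read a single value instead of the whole listing, `terraform output -raw NAME` prints one output without quotes. The following is a minimal sketch: the output names `nexus_initial_password` and `nexus_login_url` are hypothetical, so replace them with the names that terraform output actually prints for this project.

```shell
# Print one Terraform output value without surrounding quotes.
# $1 = output name as declared in the Terraform configuration.
tf_value() {
  terraform output -raw "$1"
}

# Hypothetical output names; replace them with the names shown by `terraform output`.
# NEXUS_PASSWORD=$(tf_value nexus_initial_password)
# NEXUS_URL=$(tf_value nexus_login_url)
```

Run the helper from the directory where terraform apply was executed, since the outputs are read from that state.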

Now log in to Nexus using the URL and password from the Terraform output.

Click the "Sign In" button at the top right of the screen, then enter the username admin and the initial password from the Terraform output.

Click Next to continue.

Enter a new strong password.

Choose "Enable anonymous access" and then click Next to continue. Enabling anonymous access allows unauthenticated users to access certain repositories or resources.

Choose Finish to complete the login and configuration process.

In the following steps, we will configure Nexus, install and configure Splunk, and update all the instances with Ansible playbooks.

Step 1: Configure Nexus with Ansible

To configure Nexus using Ansible, first log in to the Nexus server via SSH using the demouser username and changeme password. Use the public address of the Nexus instance.

Next, from the Nexus server, log in to the Ansible server via SSH using the same demouser username and changeme password. The private IP of the Ansible server is 10.10.10.11.
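
The two logins can also be combined into one command with the ProxyJump option of OpenSSH (`-J`). The sketch below only prints the command. It assumes the Nexus host allows SSH forwarding, and the address 203.0.113.10 is a placeholder rather than a value from this project.

```shell
# Placeholder address: replace with the public address of your Nexus instance.
NEXUS_PUBLIC_ADDRESS="${NEXUS_PUBLIC_ADDRESS:-203.0.113.10}"
# Private IP of the Ansible server, as given in the article.
ANSIBLE_PRIVATE_IP="10.10.10.11"

# Build the two-hop command; -J tunnels the connection through the first host.
jump_command() {
  echo "ssh -J demouser@${NEXUS_PUBLIC_ADDRESS} demouser@${ANSIBLE_PRIVATE_IP}"
}

# Print the command so it can be reviewed before running it.
jump_command
```

With password authentication, SSH prompts once for each hop.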

Clone the repository containing the Ansible playbooks:

git clone https://github.com/mzandinia/sample-project-on-AWS.git
cd sample-project-on-AWS/week4/ansible-project

Run the ansible-playbook command to execute the nexus-config.yml playbook, which configures the Nexus OSS repository. When prompted, enter the new password you configured for Nexus during the login process.

ansible-playbook 01-nexus-config/nexus-config.yml

After the playbook execution completes, you can verify that the appropriate repositories are configured in Nexus.

Step 2: Configure Linux Instances to Use the Nexus Repository
The repo-config.yml playbook automates the configuration of the Linux instances to use the Nexus repository. It updates the package manager configuration files on each instance to point to the Nexus repository, ensuring that the instances will fetch packages and dependencies from Nexus instead of the default public repositories.
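
To make the idea concrete, here is a rough sketch of the kind of package manager configuration such a playbook could write. It is not taken from repo-config.yml: the Nexus address, the repository names (`yum-proxy`, `apt-proxy`), and the distribution codename are all hypothetical. Only the `/repository/<name>/` URL layout is standard for Nexus Repository 3.

```shell
# Hypothetical values: adjust the Nexus address and repository names to your setup.
NEXUS_URL="${NEXUS_URL:-http://nexus.example.internal:8081}"

# Render a yum/dnf repo definition for RedHat-based hosts.
# $1 = Nexus repository name (hypothetical, e.g. yum-proxy)
# The path below the repository name depends on the upstream mirror layout.
# gpgcheck=0 keeps the sketch short; enable GPG checks for real use.
render_yum_repo() {
  cat <<EOF
[nexus-$1]
name=Nexus $1
baseurl=${NEXUS_URL}/repository/$1/
enabled=1
gpgcheck=0
EOF
}

# Render an APT source line for Debian-based hosts.
# $1 = Nexus repository name, $2 = distribution codename, $3 = component
render_apt_source() {
  echo "deb ${NEXUS_URL}/repository/$1/ $2 $3"
}

# Print both definitions to stdout instead of writing under /etc.
render_yum_repo yum-proxy
render_apt_source apt-proxy jammy main
```

On a real host the first output would typically go into a file under /etc/yum.repos.d/ and the second into a file under /etc/apt/sources.list.d/.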

To configure the Linux instances to use the Nexus repository, run the following command:

ansible-playbook 02-repo-config/repo-config.yml

Step 3: Install Splunk and the linux-update App to View the Update Results

Splunk is a powerful log management and analysis tool that can help you monitor and visualize the results of the updates performed on Linux instances.
The splunk-config.yml playbook automates the installation of Splunk on a designated instance and configures it to collect logs from the Linux instances. It also sets up the necessary Splunk configurations and forwards the logs to the Splunk instance for centralized monitoring.
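
The article does not show the log format that the playbooks produce. As an illustration only, one Splunk-friendly option is a timestamped line of key=value pairs, which Splunk can extract as fields at search time. All field names in this sketch are hypothetical.

```shell
# Emit one key=value log line describing an update result.
# $1 = host, $2 = status (e.g. success|failed|unreachable), $3 = reboot required (yes|no)
# All field names are hypothetical; the project's real log format may differ.
log_update_result() {
  printf '%s host=%s update_status=%s reboot_required=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3"
}

# Example line for a hypothetical host named web01.
log_update_result web01 success no
```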

The linux-update app includes pre-configured reports and alerts that help you track the progress of the updates, identify any issues or failures, and ensure that all instances are successfully updated. The dashboard provides a comprehensive view of the update process, making it easier to monitor and manage the updates across multiple instances.

By using Splunk and the linux-update app, you can gain valuable insights into the update process, quickly identify and troubleshoot any problems, and ensure that your Linux instances are up-to-date and secure.

To install Splunk and set up a custom dashboard, run the following command:

ansible-playbook 03-splunk-config/splunk-config.yml

Step 4: Update Linux Instances

Keeping your Linux instances up-to-date is crucial for maintaining system security, stability, and performance. This step focuses on updating the Linux instances using Ansible, with different update strategies and reboot behaviors.
The update process is divided into four plays based on two factors:

  1. How to update the OS: update all packages or exclude some packages.
  2. How to reboot the server: only check the reboot status or reboot automatically with Ansible.
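
The two factors above can be sketched as plain shell to show what each choice amounts to on a host. This is an illustration under stated assumptions, not the logic of update-linux.yml: the function names and play labels are hypothetical, and holding packages with apt-mark is just one common way to exclude packages on Debian-based systems.

```shell
# Print the update command for one host.
# $1 = OS family (debian|redhat)
# $2 = optional comma-separated list of packages to exclude
update_command() {
  family=$1
  exclude=${2:-}
  case "$family" in
    redhat)
      if [ -n "$exclude" ]; then
        echo "dnf -y upgrade --exclude=$exclude"
      else
        echo "dnf -y upgrade"
      fi
      ;;
    debian)
      if [ -n "$exclude" ]; then
        # apt-get has no --exclude option; holding packages is one workaround.
        echo "apt-mark hold $(echo "$exclude" | tr ',' ' ') && apt-get -y upgrade"
      else
        echo "apt-get -y upgrade"
      fi
      ;;
    *)
      echo "unsupported OS family: $family" >&2
      return 1
      ;;
  esac
}

# Print a command that reports whether a reboot is pending.
reboot_check_command() {
  case "$1" in
    # needs-restarting -r exits with status 1 when a reboot is required.
    redhat) echo "needs-restarting -r" ;;
    # Debian-based systems create this file when a reboot is required.
    debian) echo "test -f /var/run/reboot-required" ;;
    *) return 1 ;;
  esac
}

# The two factors give four combinations, one per play (labels are hypothetical).
for scope in all-packages exclude-some; do
  for reboot in check-only auto-reboot; do
    echo "play: update=$scope reboot=$reboot"
  done
done
```

For example, `update_command redhat kernel` prints a dnf command that skips the kernel package, while `update_command debian` prints a plain apt-get upgrade.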

The update-linux.yml playbook handles the update process based on the specified variables, ensuring that the desired update strategy and reboot behavior are applied to each instance.
To demonstrate all possible outcomes, one instance cannot be reached, so the unreachable case can be checked. Another instance is not properly configured to use the Nexus repository, so its cache update fails.

To update all instances, run the following command:

ansible-playbook 04-update-linux/update-linux.yml

After executing the playbook, use the load balancer (ALB) address from the Terraform output to log in to Splunk with the username admin and password changeme. The Splunk dashboard provides a centralized view of the update process, allowing you to monitor the progress and view the results.

Click on the "Linux Update Dashboard" to access the custom dashboard created by the linux-update app.

This dashboard presents a comprehensive overview of the update results and various configurations.

The linux-update app also includes predefined reports and alerts to help you identify and troubleshoot issues that may occur during the update process:

Reports:

  • Cache Update Failure: Triggers if the cache update process fails for any reason.
  • Host Unreachable: Alerts if any host becomes unreachable during playbook execution.
  • Update Result Failure: Indicates if there is a problem during the update operation.

Alerts:

  • Playbook Execution Failure: Triggers if there are no logs, suggesting that the playbook was not executed.
  • Reboot Failure: Critical alert if an instance fails to come back up after an Ansible-initiated reboot.
  • Server(s) must be reboot: Notifies if some servers require manual rebooting.

By leveraging Ansible, Splunk, and the linux-update app, you can efficiently manage and monitor the update process across your Linux instances, ensuring system security and stability.
