This guide walks through converting raw Linux application logs to CSV, and then to structured JSON. To begin, create a working directory and move into it:
mkdir -p tutorial
cd tutorial
Step 1: Generate Raw Logs
We start by generating some log data. The echo commands below append three simulated application messages to myapp.log. Because >> appends rather than overwrites, running them a second time adds duplicate lines.
# Generate logs inside the tutorial folder
echo "2026-01-18 05:42:09 | INFO | system initialized" >> myapp.log
echo "2026-01-18 05:42:09 | ERROR | disk space low" >> myapp.log
echo "2026-01-18 05:42:09 | INFO | user logged in" >> myapp.log
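If you would rather generate the sample entries from Python, the echo commands above can be approximated with a small helper. This is only a sketch: the function names format_entry and append_entries are made up for illustration, and the "timestamp | LEVEL | message" layout is simply copied from the sample lines.

```python
from datetime import datetime


def format_entry(timestamp, level, message):
    """Build one line in the tutorial's 'timestamp | LEVEL | message' layout."""
    return f"{timestamp:%Y-%m-%d %H:%M:%S} | {level} | {message}"


def append_entries(log_path, entries, timestamp=None):
    """Append (level, message) pairs to log_path, like `echo ... >> file`."""
    # Fall back to the current time when no timestamp is supplied
    timestamp = timestamp or datetime.now()
    # Mode "a" appends, so existing lines in the file are preserved
    with open(log_path, mode="a", encoding="utf-8") as f:
        for level, message in entries:
            f.write(format_entry(timestamp, level, message) + "\n")
```

Calling append_entries("myapp.log", [("INFO", "system initialized"), ("ERROR", "disk space low"), ("INFO", "user logged in")]) writes three lines stamped with the current time, so the timestamps will differ from the fixed ones used in this tutorial.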
Step 2: Raw Log File
Once generated, the log file contains one entry per line, with the timestamp, level, and message separated by " | ".
File: myapp.log
2026-01-18 05:42:09 | INFO | system initialized
2026-01-18 05:42:09 | ERROR | disk space low
2026-01-18 05:42:09 | INFO | user logged in
Step 3: Convert to CSV
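Next, the raw logs are converted to CSV. Shell utilities such as sed or awk can do this. The sed command below replaces every " | " delimiter with "," and wraps each line in double quotes. It assumes that messages contain no double quotes and no " | " sequences of their own.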
Example Command:
sed 's/ | /","/g; s/^/"/; s/$/"/' myapp.log > myapp.csv
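The sed one-liner is fine for the sample data, but it does not escape double quotes inside a message and it would also split a message that happens to contain " | ". As a hedged alternative, here is a minimal Python sketch that leaves the quoting to the standard csv module. The function name log_lines_to_csv is hypothetical, and skipping lines that do not have three fields is my own assumption about how malformed lines should be handled.

```python
import csv


def log_lines_to_csv(lines, out_file):
    """Write 'timestamp | LEVEL | message' lines to out_file as quoted CSV rows."""
    # QUOTE_ALL matches the sed output, where every field is double-quoted
    writer = csv.writer(out_file, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\n")
        # Split on the first two delimiters only, so a " | " inside
        # the message stays part of the message field
        parts = line.split(" | ", 2)
        if len(parts) != 3:
            continue  # assumption: silently skip blank or malformed lines
        writer.writerow(parts)


def convert_log(log_path, csv_path):
    """Read log_path and write the converted rows to csv_path."""
    with open(log_path, encoding="utf-8") as src, \
            open(csv_path, mode="w", encoding="utf-8", newline="") as dst:
        log_lines_to_csv(src, dst)
```

Calling convert_log("myapp.log", "myapp.csv") should produce the same rows as the sed command for the three sample lines. The difference only shows up with awkward input: a double quote inside a message is written as two double quotes, which is the escaping that CSV readers expect.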
File: myapp.csv
"2026-01-18 05:42:09","INFO","system initialized"
"2026-01-18 05:42:09","ERROR","disk space low"
"2026-01-18 05:42:09","INFO","user logged in"
Step 4: Create the .py file
Create a file named csv_to_json.py:
touch csv_to_json.py
Then paste in the following code:
import csv
import json
import sys
from pathlib import Path


def convert(csv_file, json_file):
    csv_path = Path(csv_file)
    json_path = Path(json_file)

    if not csv_path.exists():
        print(f"Error: Input file '{csv_file}' not found.", file=sys.stderr)
        print("Make sure the CSV from Step 3 is in the same directory as this script.", file=sys.stderr)
        sys.exit(1)

    data = []
    # newline='' lets the csv module handle line endings itself
    with open(csv_path, mode='r', encoding='utf-8', newline='') as f:
        # The CSV has no header row, so the column names are supplied here
        reader = csv.DictReader(f, fieldnames=["timestamp", "level", "message"])
        for row in reader:
            data.append(row)

    # Ensure parent directory for output exists
    json_path.parent.mkdir(parents=True, exist_ok=True)

    with open(json_path, mode='w', encoding='utf-8') as f:
        json.dump(data, f, indent=4)

    print(f"Successfully converted {csv_file} to {json_file}!")


if __name__ == "__main__":
    # Find files relative to the script's own directory
    script_dir = Path(__file__).parent
    csv_file = script_dir / 'myapp.csv'
    json_file = script_dir / 'myapp.json'
    convert(csv_file, json_file)
Step 5: Use the Python file to create the JSON file
Now, run the script to generate your structured JSON data:
python3 csv_to_json.py
Step 6: Final JSON Output
The resulting JSON file is ready for consumption by web dashboards or other data analysis tools.
File: myapp.json
[
    {
        "timestamp": "2026-01-18 05:42:09",
        "level": "INFO",
        "message": "system initialized"
    },
    ...
]
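To give a feel for what "consumption" by another tool might look like, here is a small sketch that loads the JSON file back into Python and summarizes it. The helper names load_entries, count_by_level, and entries_with_level are illustrative only. The sketch assumes each record has the "timestamp", "level", and "message" keys written by csv_to_json.py.

```python
import json
from collections import Counter


def load_entries(json_path):
    """Load the list of log records written by csv_to_json.py."""
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)


def count_by_level(entries):
    """Count how many records exist for each log level."""
    return Counter(entry["level"] for entry in entries)


def entries_with_level(entries, level):
    """Return only the records whose level matches, e.g. 'ERROR'."""
    return [entry for entry in entries if entry["level"] == level]
```

For example, entries_with_level(load_entries("myapp.json"), "ERROR") returns only the error records, which is the kind of filtering a dashboard would typically do before displaying anything.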
I hope you found this tutorial useful!
Ben Santora - January 2026