DEV Community

Ajit Kumar


Dynamic Log File for Each Spider: Scrapy Logging

By the time a spider opens, Scrapy's settings object is frozen, so trying to set LOG_FILE from an extension raises a TypeError. To assign a log file dynamically, you need to set up logging outside the immutable settings object. Here's how to fix this:


Updated Solution for Dynamic Logging

Instead of modifying LOG_FILE in the immutable settings, attach a file handler through Python's logging module inside the spider_opened signal handler.
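
To see why a handler on the root logger is enough, here is a minimal standard-library sketch with no Scrapy involved. It assumes the loggers in question propagate to the root logger, which is Python's default behavior. The helper `attach_file_handler` and the logger name `demo.spider` are made up for illustration.

```python
import logging
import os
import tempfile


def attach_file_handler(log_path, level=logging.INFO):
    """Attach a FileHandler to the root logger and return it (hypothetical helper)."""
    handler = logging.FileHandler(log_path, mode="w")
    handler.setLevel(level)
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(name)s] %(levelname)s: %(message)s")
    )
    logging.getLogger().addHandler(handler)
    return handler


def demo():
    """Log through a named child logger and return what ended up in the file."""
    log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
    handler = attach_file_handler(log_path)

    child = logging.getLogger("demo.spider")  # illustrative name, not a Scrapy logger
    child.setLevel(logging.INFO)
    child.info("page scraped")  # propagates up to the root logger's handlers

    # Detach and close the handler so the file is flushed before reading it back
    logging.getLogger().removeHandler(handler)
    handler.close()

    with open(log_path) as f:
        return f.read()


if __name__ == "__main__":
    print(demo())
```

The record is emitted by a child logger but lands in the file because it propagates to the root logger, where the handler lives. The extension relies on the same mechanism.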

Modified DynamicLogFileExtension

import os
import datetime
import logging
from scrapy import signals

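class DynamicLogFileExtension:
    def __init__(self):
        # File handler for the currently open spider; created in spider_opened
        self.file_handler = None

    @classmethod
    def from_crawler(cls, crawler):
        # Instantiate the extension
        ext = cls()

        # Connect the spider_opened and spider_closed signals
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_opened(self, spider):
        # Create a logs directory if it doesn't exist
        log_dir = "logs"
        os.makedirs(log_dir, exist_ok=True)

        # Generate a dynamic log file name: <spider name>_<timestamp>.log
        timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        log_file_name = os.path.join(log_dir, f"{spider.name}_{timestamp}.log")

        # Set up logging to the file
        self.file_handler = logging.FileHandler(log_file_name, mode="w")
        self.file_handler.setLevel(logging.INFO)  # Adjust level as needed
        self.file_handler.setFormatter(
            logging.Formatter("%(asctime)s [%(name)s] %(levelname)s: %(message)s")
        )

        # Attach the handler to Python's root logger, which Scrapy's loggers propagate to
        logging.getLogger().addHandler(self.file_handler)

        # Debugging message
        print(f"Log file for spider {spider.name} set to: {log_file_name}")

    def spider_closed(self, spider):
        # Detach and close the handler so the file is flushed and no longer written to
        if self.file_handler is not None:
            logging.getLogger().removeHandler(self.file_handler)
            self.file_handler.close()
            self.file_handler = None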

Steps to Implement

  1. Add the Extension to settings.py. Register the extension in your settings.py:
   EXTENSIONS = {
       'project_name.extensions.DynamicLogFileExtension': 500,
   }
  2. Run the Spider. Execute the spider as usual:
   scrapy crawl your_spider_name
  3. Check the Log Directory. Logs will now appear in the logs/ directory with a unique file for each spider:
   logs/
   ├── your_spider_2024-11-28_17-00-00.log
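
The file name in the listing above follows from the timestamp format used in the extension. As a small sketch, the naming step can be isolated into a helper; `build_log_path` is a hypothetical name for illustration, not part of Scrapy or of the extension itself.

```python
import datetime


def build_log_path(spider_name, now=None, log_dir="logs"):
    """Return '<log_dir>/<spider_name>_<timestamp>.log' (hypothetical helper)."""
    now = now or datetime.datetime.now()
    timestamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{log_dir}/{spider_name}_{timestamp}.log"


# A fixed datetime makes the result reproducible
fixed = datetime.datetime(2024, 11, 28, 17, 0, 0)
print(build_log_path("your_spider", fixed))
# prints: logs/your_spider_2024-11-28_17-00-00.log
```

Because the timestamp has one-second resolution and the handler opens the file with mode="w", two runs of the same spider started within the same second would write to the same file.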

Key Differences in the Fix

  1. Avoids Mutating Immutable Settings:

    • Instead of trying to set LOG_FILE in Scrapy's settings, the code configures Python's logging system directly.
  2. Customizes Logging for Each Spider:

    • Creates a new file for each spider dynamically using the spider_opened signal.
  3. Supports Multiple Handlers:

    • If needed, you can add additional loggers or handlers (e.g., console logging).
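
The third point can be sketched with the standard library alone: one logger feeding both a file handler and a console-style handler, each with its own level. The helper `add_handlers` and the logger name are hypothetical, and a `StringIO` stands in for the console so the result can be inspected.

```python
import io
import logging
import os
import tempfile

FORMAT = "%(name)s %(levelname)s: %(message)s"


def add_handlers(logger, log_path, stream):
    """Attach a file handler (INFO and up) and a stream handler (WARNING and up)."""
    file_handler = logging.FileHandler(log_path, mode="w")
    file_handler.setLevel(logging.INFO)

    stream_handler = logging.StreamHandler(stream)
    stream_handler.setLevel(logging.WARNING)  # keep the console quieter than the file

    for handler in (file_handler, stream_handler):
        handler.setFormatter(logging.Formatter(FORMAT))
        logger.addHandler(handler)
    return file_handler, stream_handler


def demo():
    logger = logging.getLogger("multi_handler_demo")  # illustrative name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo isolated from any root handlers

    log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
    console = io.StringIO()  # stand-in for sys.stderr
    handlers = add_handlers(logger, log_path, console)

    logger.info("fetched 10 items")
    logger.warning("retrying request")

    # Detach and close the handlers so the file is flushed before reading
    for handler in handlers:
        logger.removeHandler(handler)
        handler.close()

    with open(log_path) as f:
        return f.read(), console.getvalue()
```

In this sketch the file receives both records while the console stand-in receives only the warning, since each handler filters by its own level.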

This approach avoids the TypeError raised when modifying the frozen settings and routes each spider's logs to its own dynamically named log file.
