shashank ms

Building Environmental Science Tools with LLMs: A Tutorial

We are building a command-line watershed quality analyzer that turns raw sensor readings into structured compliance reports. It helps environmental technicians assess water quality against EPA guidelines without maintaining complex lookup tables. Each run evaluates pH, dissolved oxygen, turbidity, temperature, and nitrates, then returns both JSON and a markdown summary.

What you'll need

Python 3 with the openai package installed (pip install openai), and an Oxlo.ai API key.

Step 1: Set up the Oxlo.ai client

I verify that my environment can reach Oxlo.ai and that my key is active. I use the OpenAI SDK as a drop-in client because Oxlo.ai is fully compatible, and its request-based pricing keeps costs flat no matter how much context I send. See https://oxlo.ai/pricing for plan details.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Reply with 'client ready' and nothing else."},
    ],
    max_tokens=10,
)

print(response.choices[0].message.content)
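The snippet above hardcodes the key as a placeholder. A minimal sketch of reading it from the environment instead follows; the variable name OXLO_API_KEY and the helper load_api_key are my own illustrative choices, not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env_var="OXLO_API_KEY"):
    """Return the API key stored in the given environment variable.

    The default name OXLO_API_KEY is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


# Usage with the client from Step 1 (not executed here):
# client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key())
```

Failing early with a clear message is easier to debug than an authentication error returned by the first API call.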

Step 2: Define the system prompt

I keep the analyst instructions in a dedicated constant so I can tune thresholds without touching the rest of the code. The prompt pins the model to the thresholds listed in it and asks for a bare JSON object; JSON mode in Step 3 is what actually enforces valid syntax.

SYSTEM_PROMPT = """You are a watershed quality analyst. Evaluate water sensor readings against EPA freshwater guidelines.

EPA thresholds:
- pH: 6.5 to 8.5
- Dissolved Oxygen: >= 5.0 mg/L
- Turbidity: <= 4.0 NTU
- Water Temperature: <= 27.0 C
- Nitrates: <= 10.0 mg/L

For each parameter return:
- value: the input number
- threshold: the EPA limit
- unit: the measurement unit
- status: COMPLIANT, WARNING, or CRITICAL
- notes: one-sentence explanation

Return ONLY a JSON object with keys: site_id, sample_date, overall_status, parameters (list), and recommended_actions (list). Do not use markdown formatting."""
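If the thresholds change often, the bullet list inside the prompt can be generated from a dictionary rather than edited by hand. This is a rough sketch: the THRESHOLDS layout and the build_threshold_block helper are assumptions of mine, with values copied from the prompt above.

```python
# Thresholds copied from the system prompt; the dictionary layout is an assumption.
THRESHOLDS = {
    "pH": "6.5 to 8.5",
    "Dissolved Oxygen": ">= 5.0 mg/L",
    "Turbidity": "<= 4.0 NTU",
    "Water Temperature": "<= 27.0 C",
    "Nitrates": "<= 10.0 mg/L",
}


def build_threshold_block(thresholds):
    """Render the thresholds as the bullet list used in the system prompt."""
    lines = [f"- {name}: {limit}" for name, limit in thresholds.items()]
    return "EPA thresholds:\n" + "\n".join(lines)


print(build_threshold_block(THRESHOLDS))
```

The returned string can be spliced into SYSTEM_PROMPT with an f-string or str.format, so a threshold change touches one dictionary entry.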

Step 3: Build the analysis function

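I wrap the API call in a function that accepts a site ID, a sample date, and a dictionary of readings. I use JSON mode (response_format={"type": "json_object"}) so the reply parses as JSON, and a low temperature to reduce run-to-run variation.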

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

def analyze_water_sample(site_id, sample_date, readings):
    user_message = json.dumps({
        "site_id": site_id,
        "sample_date": sample_date,
        "readings": readings
    }, indent=2)

    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )

    return json.loads(response.choices[0].message.content)
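JSON mode constrains syntax, not shape, so a reply can still parse and be missing a key or carry an unexpected status. Below is a minimal sketch of a stricter parser that could stand in for the bare json.loads call. The helper name parse_analysis is hypothetical, and the required keys and statuses are taken from the system prompt.

```python
import json

# Keys and statuses as requested in the system prompt.
REQUIRED_KEYS = {"site_id", "sample_date", "overall_status", "parameters", "recommended_actions"}
ALLOWED_STATUSES = {"COMPLIANT", "WARNING", "CRITICAL"}


def parse_analysis(raw_text):
    """Parse the model reply and check it against the shape the prompt asks for."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("Model reply is not a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    if not isinstance(data["parameters"], list):
        raise ValueError("'parameters' must be a list")
    for index, param in enumerate(data["parameters"]):
        status = param.get("status") if isinstance(param, dict) else None
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"Parameter {index} has unexpected status: {status!r}")
    return data
```

Raising ValueError keeps the failure local to the analysis step, so a caller can retry or skip the sample instead of passing a malformed result on to the report stage.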

Step 4: Generate the narrative report

Raw JSON feeds pipelines, but field teams need readable text. I pass the structured result to a second model to produce a concise markdown summary.

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

REPORT_PROMPT = """You are a senior environmental engineer. Write a short markdown summary of the provided JSON analysis. Include a status heading, a bullet list of flagged parameters, and two concrete remediation steps. Keep it under 150 words."""

def generate_report(analysis_json):
    response = client.chat.completions.create(
        model="kimi-k2.6",
        messages=[
            {"role": "system", "content": REPORT_PROMPT},
            {"role": "user", "content": json.dumps(analysis_json, indent=2)},
        ],
        temperature=0.3,
    )

    return response.choices[0].message.content
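Because the report depends on a second API call, it can help to have a local fallback that renders the same JSON without a model. The sketch below is an assumption of mine rather than part of the original tool; fallback_report is a hypothetical helper that expects the parameter fields named in the system prompt.

```python
def fallback_report(analysis):
    """Build a plain markdown summary from the analysis dictionary, with no API call."""
    lines = [f"**Site {analysis['site_id']}: {analysis['overall_status']}**", ""]
    # Keep only parameters the analyzer flagged as WARNING or CRITICAL.
    flagged = [p for p in analysis["parameters"] if p["status"] != "COMPLIANT"]
    if not flagged:
        lines.append("- No parameters flagged.")
    for param in flagged:
        lines.append(
            f"- {param['status']}: {param['value']} {param['unit']} "
            f"(limit {param['threshold']}). {param['notes']}"
        )
    return "\n".join(lines)
```

The output is terser than the model-written summary and has no remediation steps, but it keeps field teams informed when the report call fails.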

Step 5: Wire up the CLI

I combine the stages in a single script that defines a sample payload, calls the analyzer, and prints both the JSON and the markdown report. This block is the complete runnable file.

import json
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a watershed quality analyst. Evaluate water sensor readings against EPA freshwater guidelines.

EPA thresholds:
- pH: 6.5 to 8.5
- Dissolved Oxygen: >= 5.0 mg/L
- Turbidity: <= 4.0 NTU
- Water Temperature: <= 27.0 C
- Nitrates: <= 10.0 mg/L

For each parameter return:
- value: the input number
- threshold: the EPA limit
- unit: the measurement unit
- status: COMPLIANT, WARNING, or CRITICAL
- notes: one-sentence explanation

Return ONLY a JSON object with keys: site_id, sample_date, overall_status, parameters (list), and recommended_actions (list). Do not use markdown formatting."""

REPORT_PROMPT = """You are a senior environmental engineer. Write a short markdown summary of the provided JSON analysis. Include a status heading, a bullet list of flagged parameters, and two concrete remediation steps. Keep it under 150 words."""

def analyze_water_sample(site_id, sample_date, readings):
    user_message = json.dumps({
        "site_id": site_id,
        "sample_date": sample_date,
        "readings": readings
    }, indent=2)

    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )

    return json.loads(response.choices[0].message.content)

def generate_report(analysis_json):
    response = client.chat.completions.create(
        model="kimi-k2.6",
        messages=[
            {"role": "system", "content": REPORT_PROMPT},
            {"role": "user", "content": json.dumps(analysis_json, indent=2)},
        ],
        temperature=0.3,
    )

    return response.choices[0].message.content

if __name__ == "__main__":
    readings = {
        "pH": 6.2,
        "dissolved_oxygen_mg_l": 4.8,
        "turbidity_ntu": 3.5,
        "temperature_c": 22.0,
        "nitrates_mg_l": 12.0
    }

    analysis = analyze_water_sample("SITE-07", "2025-01-15", readings)
    report = generate_report(analysis)

    print("=== STRUCTURED ANALYSIS ===")
    print(json.dumps(analysis, indent=2))
    print("\n=== FIELD REPORT ===")
    print(report)
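The script above hardcodes its sample payload. To take the inputs from the command line, a sketch using the standard argparse module could look like the following. The flag names --site-id, --date, and --readings are illustrative choices of mine, and the readings file is assumed to hold a JSON object shaped like the dictionary above.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse command-line arguments; the flag names here are illustrative."""
    parser = argparse.ArgumentParser(description="Watershed quality analyzer")
    parser.add_argument("--site-id", required=True, help="Site identifier, e.g. SITE-07")
    parser.add_argument("--date", required=True, help="Sample date, e.g. 2025-01-15")
    parser.add_argument("--readings", required=True, help="Path to a JSON file of readings")
    return parser.parse_args(argv)


def load_readings(path):
    """Load the readings dictionary from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        readings = json.load(handle)
    if not isinstance(readings, dict):
        raise ValueError("Readings file must contain a JSON object")
    return readings
```

In the main block, args = parse_args() followed by analyze_water_sample(args.site_id, args.date, load_readings(args.readings)) would replace the hardcoded values.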

Run it

Running the script produces both machine-readable JSON and a human summary.

$ python watershed_analyzer.py
=== STRUCTURED ANALYSIS ===
{
  "site_id": "SITE-07",
  "sample_date": "2025-01-15",
  "overall_status": "CRITICAL",
  "parameters": [
    {
      "value": 6.2,
      "threshold": "6.5 to 8.5",
      "unit": "pH",
      "status": "WARNING",
      "notes": "Slightly acidic, below lower EPA bound."
    },
    {
      "value": 4.8,
      "threshold": ">= 5.0 mg/L",
      "unit": "mg/L",
      "status": "WARNING",
      "notes": "Dissolved oxygen below minimum for aquatic life health."
    },
    {
      "value": 3.5,
      "threshold": "<= 4.0 NTU",
      "unit": "NTU",
      "status": "COMPLIANT",
      "notes": "Turbidity within acceptable range."
    },
    {
      "value": 22.0,
      "threshold": "<= 27.0 C",
      "unit": "C",
      "status": "COMPLIANT",
      "notes": "Temperature within acceptable range."
    },
    {
      "value": 12.0,
      "threshold": "<= 10.0 mg/L",
      "unit": "mg/L",
      "status": "CRITICAL",
      "notes": "Nitrate concentration exceeds safe drinking water limit."
    }
  ],
  "recommended_actions": [
    "Investigate upstream agricultural or wastewater runoff sources.",
    "Schedule follow-up sampling within 48 hours to confirm nitrate levels."
  ]
}

=== FIELD REPORT ===
**Site SITE-07 Critical Status**

- pH at 6.2 is below the 6.5 minimum
- Dissolved Oxygen at 4.8 mg/L is below the 5.0 threshold
- Nitrates at 12.0 mg/L exceed the 10.0 mg/L limit

Remediation steps:
1. Deploy temporary aeration at the sampling point immediately.
2. Collect upstream samples to isolate the nitrate source.
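The system prompt lists limits but does not say when a breach counts as WARNING rather than CRITICAL, so the statuses above are the model's own judgement. A simple local check can at least confirm which readings fall outside the limits. This sketch copies the limits from the prompt and uses the reading keys from the sample payload; the LIMITS table and within_limits helper are hypothetical names.

```python
# Limits copied from the system prompt, as (minimum, maximum); None means unbounded.
LIMITS = {
    "pH": (6.5, 8.5),
    "dissolved_oxygen_mg_l": (5.0, None),
    "turbidity_ntu": (None, 4.0),
    "temperature_c": (None, 27.0),
    "nitrates_mg_l": (None, 10.0),
}


def within_limits(readings, limits=LIMITS):
    """Return {reading_name: True/False} for each reading that has a known limit."""
    result = {}
    for name, value in readings.items():
        if name not in limits:
            continue  # Unknown readings are skipped rather than guessed at.
        low, high = limits[name]
        result[name] = (low is None or value >= low) and (high is None or value <= high)
    return result


sample = {
    "pH": 6.2,
    "dissolved_oxygen_mg_l": 4.8,
    "turbidity_ntu": 3.5,
    "temperature_c": 22.0,
    "nitrates_mg_l": 12.0,
}
print(within_limits(sample))
# {'pH': False, 'dissolved_oxygen_mg_l': False, 'turbidity_ntu': True,
#  'temperature_c': True, 'nitrates_mg_l': False}
```

A reading reported as COMPLIANT by the model but False here, or the reverse, is a cue to re-run or review the sample before it reaches a compliance report.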

Next steps

Add a SQLite cache keyed by site ID and reading hash so that repeated identical payloads do not consume API requests. Alternatively, extend the tool to accept a CSV batch file and render the Oxlo.ai-generated JSON results into an HTML dashboard with the Jinja template engine.
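As a starting point for the cache idea, here is a minimal sketch built on the standard sqlite3 and hashlib modules. The table name, file name, and function names are hypothetical, and the analyze argument stands in for any callable that takes a site ID and readings, for example a small wrapper around analyze_water_sample that supplies the sample date.

```python
import hashlib
import json
import sqlite3


def reading_hash(readings):
    """Hash the readings with sorted keys so key order does not change the digest."""
    canonical = json.dumps(readings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def open_cache(path="analysis_cache.db"):
    """Open (or create) the cache database and its single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analyses ("
        "site_id TEXT NOT NULL, reading_hash TEXT NOT NULL, result TEXT NOT NULL, "
        "PRIMARY KEY (site_id, reading_hash))"
    )
    return conn


def cached_analysis(conn, site_id, readings, analyze):
    """Return a stored result if present; otherwise call analyze() and store it."""
    digest = reading_hash(readings)
    row = conn.execute(
        "SELECT result FROM analyses WHERE site_id = ? AND reading_hash = ?",
        (site_id, digest),
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    result = analyze(site_id, readings)
    conn.execute(
        "INSERT INTO analyses (site_id, reading_hash, result) VALUES (?, ?, ?)",
        (site_id, digest, json.dumps(result)),
    )
    conn.commit()
    return result
```

Because the key is only the site ID and the reading hash, two samples with identical readings taken on different dates share one cached result; add the date to the key if that matters for your reports.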
