Debugging a circuit with LTspice is basically like planning a road trip with Google Maps when you're not sure the roads actually exist. The map says everything will work. The real world says something else entirely. And you're standing on a corner that doesn't show up anywhere.
That's exactly the problem I've had with an electronics project that's been sitting on my bench for months: the simulation is perfect, the physical circuit doesn't behave the same way, and the gap between those two worlds feels like an uncrossable canyon.
When I found the SPICE + Claude Code + oscilloscope project, I literally said "this is what I was looking for" out loud. Alone. In my office. At 11pm.
Here's why it mattered to me — and what I learned from looking at it with a critical eye.
## SPICE simulation with Claude Code automatic verification: what this project actually is
The setup is conceptually elegant. Three pieces chained together:
- LTspice / ngspice runs the circuit simulation and exports results (voltages, currents, waveforms) as text or CSV files.
- A digital-output oscilloscope (USB, GPIB, or a Python script with PyVISA) captures the real signal from the physical circuit.
- Claude Code receives both — simulation and real measurement — and has to decide whether they match, where they diverge, and what change in the circuit or model would explain the difference.
The agent isn't guessing. It's comparing structured data from two sources and reasoning about the discrepancy. That's a different thing entirely.
The interesting part isn't "AI does electronics for you." The interesting part is the exact moment where the agent touches the physical world and has to process the fact that reality is more complicated than the model.
```python
# spice_verification.py
# Basic structure of the verification pipeline
import subprocess

import pandas as pd
import anthropic


def run_spice_simulation(netlist_path: str) -> pd.DataFrame:
    """
    Runs ngspice with the given netlist and parses the output.
    Returns a DataFrame with time, voltage, current.
    """
    result = subprocess.run(
        ["ngspice", "-b", "-r", "output.raw", netlist_path],  # -r writes the rawfile; -o only redirects the log
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"ngspice failed: {result.stderr}")

    # Basic parser for ngspice raw output
    # In production this gets more complex
    return parse_ngspice_raw("output.raw")


def capture_oscilloscope(channel: int = 1) -> pd.DataFrame:
    """
    Captures data from the oscilloscope via PyVISA.
    Assumes the oscilloscope is connected via USB-TMC.
    """
    import pyvisa

    rm = pyvisa.ResourceManager()

    # Find the first available instrument
    instruments = rm.list_resources()
    if not instruments:
        raise RuntimeError("No connected instruments found")

    scope = rm.open_resource(instruments[0])
    scope.timeout = 5000  # 5-second timeout

    # Identify the instrument first
    idn = scope.query("*IDN?")
    print(f"Oscilloscope connected: {idn}")

    # Capture the waveform from the requested channel
    scope.write(f":WAV:SOUR CHAN{channel}")
    scope.write(":WAV:MODE NORM")
    scope.write(":WAV:FORM ASCII")
    raw_data = scope.query(":WAV:DATA?")

    # Parse the response and build the DataFrame
    return parse_ascii_waveform(raw_data, scope)


def verify_with_claude(
    sim_data: pd.DataFrame,
    real_data: pd.DataFrame,
    circuit_context: str,
) -> dict:
    """
    Sends both datasets to Claude and requests discrepancy analysis.
    Returns dict with: matches, divergences, hypotheses, next_step.
    """
    client = anthropic.Anthropic()

    # Build statistical summaries — don't send millions of data points
    sim_summary = {
        "vmax": float(sim_data["voltage"].max()),
        "vmin": float(sim_data["voltage"].min()),
        "frequency_hz": calculate_frequency(sim_data),
        "rise_time_us": calculate_rise_time(sim_data),
    }
    real_summary = {
        "vmax": float(real_data["voltage"].max()),
        "vmin": float(real_data["voltage"].min()),
        "frequency_hz": calculate_frequency(real_data),
        "rise_time_us": calculate_rise_time(real_data),
    }

    prompt = f"""
You are analyzing the discrepancy between a SPICE simulation and a real measurement.

Circuit context:
{circuit_context}

SPICE simulation results:
{sim_summary}

Real oscilloscope measurement:
{real_summary}

Analyze:
1. Do the signals match within a reasonable margin (±10%)?
2. Which parameters diverge the most?
3. What are the most likely hypotheses to explain the difference?
4. What change to the netlist or physical circuit should be tried first?

Be specific. Give me values, not generalities.
"""

    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )

    # In a real system, this gets parsed with structured output
    return {
        "analysis": response.content[0].text,
        "sim_stats": sim_summary,
        "real_stats": real_summary,
    }


# Full pipeline
def verification_pipeline(
    netlist: str,
    circuit_description: str,
    oscilloscope_channel: int = 1,
):
    print("▶ Running SPICE simulation...")
    sim = run_spice_simulation(netlist)

    print("▶ Capturing oscilloscope signal...")
    real = capture_oscilloscope(oscilloscope_channel)

    print("▶ Sending to Claude for analysis...")
    result = verify_with_claude(sim, real, circuit_description)

    print("\n=== ANALYSIS ===")
    print(result["analysis"])
    return result
```
This pipeline isn't fiction. With ngspice (open source), PyVISA (instrumentation standard), and the Anthropic API, this works today. The most accessible hardware to start with is a Rigol DS1054Z — it has a USB-TMC interface, PyVISA handles it without weird drivers, and it costs under $400.
## What happens when reality doesn't match the simulation
This is the moment that interests me. Not the happy path where everything lines up. The moment where the agent receives two datasets that say different things and has to reason through it.
The most common divergences between SPICE simulation and physical circuit are predictable if you know where to look:
**Real components vs. ideal models.** A 100nF capacitor in SPICE is perfect. The physical one has ESR (equivalent series resistance) and ESL (equivalent series inductance). At high frequencies, that matters a lot. The standard BJT SPICE model generally doesn't include package parasitic capacitances.
**Ground plane and trace resistance.** In SPICE your GND is an ideal node. On the PCB or breadboard, ground has resistance and inductance, and can have loops that pick up noise. A 1mm wide, 10cm long trace in standard 1oz copper has roughly 50mΩ of resistance (about 0.5mΩ per square, and that trace is 100 squares) — which you typically don't model.
**Temperature.** Semiconductor parameters change with temperature. The simulation runs at 27°C by default. Your physical circuit might be at 45°C after running for 20 minutes.
**Tolerances.** A 5% tolerance on a nominal 10kΩ resistor means its real value can be anywhere between 9.5kΩ and 10.5kΩ. In sensitive circuits, that shifts behavior.
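To put numbers on that last point, here's a quick sketch of how far a nominal gain drifts with 5% resistors, using the same inverting-amplifier values that show up later in the post:

```python
def gain_range(r_feedback: float, r_input: float, tol: float = 0.05):
    """Worst-case gain spread of an inverting amplifier with toleranced resistors.

    Gain = -Rf / Rin, so the extremes occur when one resistor is at its
    upper bound and the other at its lower bound.
    """
    nominal = -r_feedback / r_input
    smallest = -(r_feedback * (1 - tol)) / (r_input * (1 + tol))
    largest = -(r_feedback * (1 + tol)) / (r_input * (1 - tol))
    return nominal, smallest, largest


nominal, smallest, largest = gain_range(100e3, 10e3)
# A nominal gain of -10 can land anywhere between about -9.05 and -11.05,
# a spread of roughly ±10% before you've touched anything else in the circuit.
```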
What Claude does in that moment is exactly what an experienced engineer would do: it ranks hypotheses by probability, suggests what to measure first to rule out causes, and proposes specific changes to the netlist to better model reality.
```python
# Example of enriched context you send to the agent
circuit_context = """
Inverting amplifier with LM741 op-amp.
Design gain: -10 (R_feedback = 100kΩ, R_input = 10kΩ).
Input signal: sinusoidal, 1kHz, 100mV peak.
Power supply: ±15V.
Mounted on breadboard. Test leads approximately 20cm long.
Measurement taken at op-amp output.

Subjective observation: the real signal looks more 'rounded' at the peaks
compared to the simulation. The DC level at rest is also slightly different.
"""
```
That subjective observation matters. You're giving qualitative context on top of the numbers. The LLM can cross-reference that with the numerical discrepancies and sharpen the hypothesis.
## The mistakes you're going to make (I made them reading through the project)
**Mistake 1: Assuming the oscilloscope and the simulation share the same time reference.**
Ngspice gives you data starting at t=0. The oscilloscope gives you data from the capture buffer, which can have a trigger offset. If you're comparing frequency, amplitude, and rise time, you're fine. If you try to compare phase directly, you'll see divergences that don't actually exist.
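One way to sidestep the trigger-offset problem before comparing anything phase-sensitive is to estimate the lag between the two captures with a cross-correlation. A sketch using NumPy, assuming both signals have already been resampled to the same sample rate:

```python
import numpy as np


def estimate_lag(sim: np.ndarray, real: np.ndarray, dt: float) -> float:
    """Estimate the time offset between two equally sampled signals.

    Returns a positive value when `real` lags `sim` (trigger offset).
    """
    # Remove DC so the correlation tracks waveform shape, not offset
    sim = sim - sim.mean()
    real = real - real.mean()

    corr = np.correlate(real, sim, mode="full")
    lag_samples = corr.argmax() - (len(sim) - 1)
    return lag_samples * dt
```

Once you know the lag, shift one trace by that amount (or just restrict comparisons to lag-insensitive metrics, as the summary-statistics approach above does).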
**Mistake 2: Sending too many data points to the LLM.**
An oscilloscope capture at 1MSa/s for 100ms is 100,000 points. That's expensive tokens and slow responses. The code above summarizes into key statistics. For most analyses, that's more than enough.
**Mistake 3: Not giving the agent circuit context.**
If you send two arrays of numbers without saying what circuit it is, you'll get generic analysis. The more specific context you give — topology, components, mounting conditions — the more useful the hypothesis it returns. This is the same thing I've learned with every AI tool: output quality is proportional to input quality. What I discussed in the post about token usage opacity applies here too: you'll burn more credits than you expect if you don't optimize context.
**Mistake 4: Assuming the manufacturer's SPICE model is correct.**
Some component SPICE models are old, approximate, or just plain wrong. I've seen netlists with BJT models that don't reflect real behavior at high frequencies. If the discrepancy is systematic and large, the problem might be the model, not the circuit.
**Mistake 5: Skipping verification of your measurement setup.**
Before running the pipeline, verify that your oscilloscope probe is calibrated (probe compensation adjustment), the channel is on the right scale, and the trigger is stable. Claude can't detect that your measurement is wrong if the data looks coherent but isn't. Garbage in, garbage out — no LLM solves that.
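Some of those setup checks can themselves be automated with read-only SCPI queries before the capture runs. The exact commands below (`:TRIG:STAT?`, `:CHANn:SCAL?`, `:CHANn:PROB?`) follow Rigol DS1000Z syntax and will differ on other scopes, so treat this as a sketch, and the thresholds as illustrative:

```python
def sanity_check_scope(scope, channel: int = 1) -> list:
    """Run read-only SCPI queries and return a list of warning strings.

    Command syntax follows the Rigol DS1000Z programming guide;
    other vendors use different command trees.
    """
    warnings = []

    # TD = triggered, STOP = frozen stable capture; anything else is suspect
    status = scope.query(":TRIG:STAT?").strip()
    if status not in ("TD", "STOP"):
        warnings.append(f"Trigger not stable (status: {status})")

    # A huge V/div setting for a small signal means quantization noise
    scale = float(scope.query(f":CHAN{channel}:SCAL?"))
    if scale >= 10.0:
        warnings.append(f"Vertical scale is {scale} V/div; check the range")

    # Probe attenuation mismatch silently scales every voltage you read
    probe = float(scope.query(f":CHAN{channel}:PROB?"))
    if probe not in (1.0, 10.0):
        warnings.append(f"Unusual probe attenuation: {probe}x")

    return warnings
```

It can't catch a miscompensated probe, but it does catch the two classic silent errors: an untriggered capture and a wrong attenuation setting.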
## FAQ: SPICE simulation with Claude Code and automatic verification
### Do I need an expensive oscilloscope to make this work?
No. Any oscilloscope with a USB-TMC or LAN interface and VISA support works. The Rigol DS1054Z (around $350–400 USD) is the most popular entry point — PyVISA detects it directly, it has 4 channels and 50MHz bandwidth. For audio signals or slow digital circuits, even a 20MHz oscilloscope is enough. There's also the option of using an acquisition board like the Red Pitaya, which doubles as a signal generator and has a native Python API.
### Which version of SPICE works best for this pipeline?
Ngspice is open source, well-documented, and runs easily from the command line — ideal for automation. LTspice from Analog Devices is more popular among hobbyists but its automation interface is less direct (though it exists). For the pipeline I described, ngspice is the cleaner option. If you're already using LTspice, you can export results in raw format and parse them with Python without running the simulation from the pipeline.
### Can Claude automatically modify the netlist based on the analysis?
Yes, and that's the next level. With Claude Code you have access to filesystem tools — it can read the netlist, propose specific changes ("increase R3 from 10kΩ to 12kΩ to compensate for the gain drop"), write the modified netlist, run the simulation again, and compare. It's an automatic refinement loop. The limit is that you still need a human to make the physical change in the real circuit — the agent can't act there on its own. Yet.
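The netlist side of that refinement loop is mechanically simple. A sketch, where the regex substitution and the line format it assumes (`R3 n1 n2 10k`) are my own illustration, not the project's actual mechanism:

```python
import re


def set_component_value(netlist: str, name: str, new_value: str) -> str:
    """Replace the value of a component (e.g. 'R3') in a SPICE netlist.

    Assumes simple two-terminal lines of the form: 'R3 n1 n2 10k'.
    """
    pattern = re.compile(
        rf"^({re.escape(name)}\s+\S+\s+\S+\s+)\S+",
        re.MULTILINE,
    )
    updated, count = pattern.subn(rf"\g<1>{new_value}", netlist)
    if count == 0:
        raise ValueError(f"Component {name} not found in netlist")
    return updated
```

The loop is then: apply the change Claude proposed, rerun `run_spice_simulation`, compare summaries again, and stop when the divergence falls under threshold.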
### What if the divergence between simulation and reality is huge?
That's a signal that something fundamental is wrong: either the component SPICE model is incorrect, a component is damaged, or there's a design error the simulation didn't catch (grounding problem, parasitic oscillations, latch-up in a CMOS circuit). In those cases, Claude can help you rank what to test, but the physical debugging is on you. The agent is good for hypotheses, not a replacement for hands on hardware.
### Can this be used with RF or high-frequency circuits?
With caution. SPICE is a lumped-circuit simulator — it assumes physical dimensions are much smaller than the wavelength. At high frequencies (say, above 100MHz) transmission line effects, radiation, and layout parasitic capacitances matter, and SPICE doesn't model them well without specific models. For serious RF work, you need electromagnetic simulation tools (EMsim, HFSS, or similar). The automatic verification pipeline still applies, but the discrepancies will be larger and harder to explain with just component parameters.
### Are there security risks in automating instrument interaction?
Yes, and it's worth flagging. An agent that can write SCPI commands to a measurement instrument could, in principle, also send commands that damage the equipment (abruptly changing ranges, disabling protections). The pipeline should always have the capture layer in read-only mode — queries only, never commands that modify instrument state beyond what's needed for the capture. And if the agent has access to modify netlists and run simulations automatically, set limits on what parameters it can change. It's the same principle of security as proof of work that applies to any automated system.
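One cheap way to enforce that read-only boundary in code is to wrap the instrument handle so the agent can only issue queries, plus an explicit allowlist of capture-setup commands (a sketch; the allowlist contents are illustrative):

```python
class ReadOnlyScope:
    """Wraps a PyVISA instrument so an agent can only send queries
    plus an explicit allowlist of capture-setup write commands."""

    # Illustrative: only waveform-readout configuration is permitted
    ALLOWED_WRITES = (":WAV:SOUR", ":WAV:MODE", ":WAV:FORM")

    def __init__(self, instrument):
        self._inst = instrument

    def query(self, cmd: str) -> str:
        # SCPI queries end in '?'; anything else could mutate state
        if not cmd.strip().endswith("?"):
            raise PermissionError(f"Not a query: {cmd}")
        return self._inst.query(cmd)

    def write(self, cmd: str) -> None:
        if not cmd.strip().startswith(self.ALLOWED_WRITES):
            raise PermissionError(f"Write not in allowlist: {cmd}")
        self._inst.write(cmd)
```

Hand the agent the wrapper, never the raw PyVISA resource, and the worst it can do is read a waveform with the wrong settings.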
## What this project unblocked for me (and what's still missing)
I have a motor control circuit that's been sitting dead on my bench for months. The simulation says the control loop is stable. The physical circuit oscillates at 3kHz every time I put a load on it. I never found the time to debug it systematically.
Looking at this project made me realize the problem isn't time — it's method. I was trying to debug without structure: measuring one thing, changing another, not recording anything. A pipeline like this forces me to do what I should've done from the start: capture data, compare with the model, generate hypotheses, verify.
The LLM isn't magic. But it's an interlocutor that doesn't get tired, doesn't have an ego, and can cross-reference symptoms with hypotheses faster than I can dig through electronics forums at 2am. That has real value.
What's still missing in this kind of project is the complete loop. Today the agent analyzes and suggests, but the physical intervention is still manual. The next interesting step — which already exists in industrial contexts — is connecting the agent to actuators: relays, programmable power supplies, controllable signal generators. That's where the agent truly "touches" the physical world and can iterate without a human in the loop.
That has implications that go well beyond electronics. An agent that can modify a real circuit based on its own observations is a system with physical agency. That's different from one that only processes text. And that difference matters — in terms of what conversations you keep, in terms of responsibility, in terms of what data it shares without you noticing.
For now, the manual pipeline already has enough value. And I have a motor project I'm finally going to pick back up this weekend.
Do you have a circuit stuck in the same limbo — simulation vs. reality — where something like this might help you debug it? Tell me. I genuinely want to know if this problem is as common as it feels from where I'm sitting.