DEV Community

OnlineProxy

Python + EVM without the paper cuts: a senior playbook for fast, correct, and scalable reads

You don’t forget the first time your Web3 script “works”… and still returns the wrong number. You grab a token balance, divide by 10^18 out of habit, and proudly print 2.085e-12 of “something” that’s clearly 2.085 USDC. Or your call to a token’s decimals silently reverts because it’s an NFT. Or the latest web3.py “works” but keeps spitting console errors. If any of that sounds familiar, this guide is for you.

Below is a practical, opinionated playbook for reading from EVM chains with Python — fast, async, and correct — without stepping on the same rakes over and over. It also comes with a beginner-friendly checklist at the end that you can hand to a teammate and trust.

Why does my Web3 Python stack fail on day one?

Because the stack is two things at once: brittle and deceptively simple.

  • Python version drift. The freshest Python release is not always supported by web3.py. Pin your runtime to a known-good minor (e.g., 3.11) and avoid the very latest major/minor until web3.py catches up.

  • Latest library ≠ greatest library. The bleeding-edge web3.py build may “work” but log errors due to regressions. Prefer pinning to the penultimate version known to be stable in your team. Treat versions as infrastructure, not vibes.

  • Async needs to be first-class, not bolted on. Use AsyncWeb3 out of the box with AsyncHTTPProvider. Middleware hacks for async are legacy. If you build sync-first, you pay interest later.

  • HTTP client choice matters. curl_cffi is a fast async HTTP layer but can break on older macOS; aiohttp is popular but has its own paper cuts; httpx is another viable option. Pick one that actually works on your machines and pin it.

  • Spoof your UA if you scrape. For network calls outside web3.py, use something like fake-user-agent when hitting explorers to reduce accidental rate-limiting.

Mental model: RPC is a dialogue, not a magic pipe

  • RPC URL = the node you’re talking to.
  • You ask: “What’s gas now?” “What block am I on?” “What’s the chain ID?” The node answers.
  • That’s it. Everything else is ceremony around getting, parsing, and composing those answers.

The high-impact habit here: always test await w3.is_connected() first. Then pull chain_id and block_number right away; it confirms you’re in the right universe.

Where do reliable RPCs come from, and how do you know you’re actually connected?

  • Use Chainlist to find RPCs for the network you want (Ethereum, BSC, Arbitrum, etc.). You’ll see both HTTP and WebSocket endpoints, with latency and privacy hints.

  • Programmatic sanity checks:

    • await w3.is_connected() should be True.
    • await w3.eth.chain_id should match the explorer (e.g., Ethereum is 1, BNB Chain is 56).
    • await w3.eth.block_number should increase between runs.
  • EIP-1559 nuance: w3.eth.gas_price and w3.eth.max_priority_fee look like attributes, but in async they’re awaitables. Use await, no parentheses. It’s counterintuitive, but that’s the API.
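The sanity checks above can be sketched as one small coroutine. This is a minimal sketch, not a definitive implementation: `connect`, `sanity_check`, and `main` are hypothetical helper names, the URL in the usage comment is a placeholder, and the web3 import sits inside `connect` so the rest of the module can be exercised without the library installed.

```python
import asyncio


def connect(rpc_url: str):
    """Build an async client for one RPC endpoint (requires web3.py)."""
    from web3 import AsyncWeb3, AsyncHTTPProvider

    return AsyncWeb3(AsyncHTTPProvider(rpc_url))


async def sanity_check(w3, expected_chain_id: int) -> int:
    """Return the latest block number, or raise if the node looks wrong."""
    if not await w3.is_connected():
        raise ConnectionError("RPC endpoint did not respond")

    # chain_id and block_number are awaitable properties: await them, no parentheses.
    chain_id = await w3.eth.chain_id
    if chain_id != expected_chain_id:
        raise RuntimeError(f"expected chain {expected_chain_id}, node reports {chain_id}")

    return await w3.eth.block_number


async def main(rpc_url: str) -> None:
    w3 = connect(rpc_url)
    block = await sanity_check(w3, expected_chain_id=1)  # 1 = Ethereum mainnet
    print("connected, latest block:", block)


# Usage (the URL is a placeholder for whichever endpoint you picked):
# asyncio.run(main("https://<your-rpc-endpoint>"))
```

Failing loudly on a chain ID mismatch is a design choice: a script that silently reads BNB Chain while you think it reads Ethereum produces plausible but wrong numbers.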

The number one reason your amounts are wrong

You’re mixing units and decimals.

  • Native coin (e.g., ETH): amounts arrive in wei. Convert with from_wei/to_wei.

  • ERC‑20 tokens: amounts arrive in the token’s smallest unit. Do not use from_wei blindly. Divide by 10**decimals returned by the token’s decimals().

Rules of thumb:

  • Treat all amounts internally as integers. Format for humans at the edges of your app.

  • Never add USDC and DAI together before normalizing by their decimals. It’s like adding centimeters to meters and calling it “distance.”
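To make the decimals rule concrete, here is a small sketch using only the standard library. `format_units` is a hypothetical helper name, and the raw values are made-up examples for a 6-decimal and an 18-decimal token.

```python
from decimal import Decimal


def format_units(raw: int, decimals: int) -> Decimal:
    """Scale an integer amount in smallest units into a human-readable Decimal."""
    return Decimal(raw) / (Decimal(10) ** decimals)


raw_usdc = 2_085_000                      # raw balance of a 6-decimal token
print(format_units(raw_usdc, 6))          # 2.085
print(format_units(raw_usdc, 18))         # 2.085E-12, the "divide by 10^18" bug

# Normalize each token by its own decimals before combining amounts.
raw_dai = 1_500_000_000_000_000_000       # raw balance of an 18-decimal token
print(format_units(raw_usdc, 6) + format_units(raw_dai, 18))  # 3.585
```

Decimal is used instead of float so the displayed value is exact; the raw integers stay untouched for any further arithmetic.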

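Why does a checksum address matter, and why will lower() betray you?

  • EVM addresses are case-insensitive on-chain, but the mixed-case (EIP-55) form encodes a checksum in its capitalization. to_checksum_address rejects malformed strings and normalizes the case; it does not verify the capitalization of an address that is already mixed-case, so use is_checksum_address when you need that check.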

  • Resist lower()/upper() as a crutch. Use w3.to_checksum_address for any user-provided or literal string before comparing or passing into contract calls. It will save you from invisible typos and silent mismatches.
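A possible shape for the normalization habit is sketched below. Both helper names are hypothetical; `w3` is assumed to be a Web3 or AsyncWeb3 instance, and the only library call used is `w3.to_checksum_address`.

```python
def same_address(w3, a: str, b: str) -> bool:
    """Compare two address strings after normalizing both to checksum form."""
    return w3.to_checksum_address(a) == w3.to_checksum_address(b)


def normalize_addresses(w3, raw_addresses: list) -> list:
    """Checksum every address and drop duplicates, keeping first-seen order."""
    seen = {}
    for raw in raw_addresses:
        # strip() guards against whitespace pasted in from explorers or CSV files
        seen.setdefault(w3.to_checksum_address(raw.strip()), None)
    return list(seen)
```

Running user input and config values through `normalize_addresses` once, at load time, means the rest of the script can compare addresses with plain `==`.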

Read vs. write isn’t a vibe — it’s a cost model

  • Read functions (view/pure) don’t change state, don’t cost gas, and don’t require a wallet. Great for analytics, monitoring, heuristics, and risk checks.

  • Write functions change chain state, must be signed, and cost gas. Save them for later. Build your read stack first.

Proxies will eat your lunch unless you handle them up front

Many major tokens are deployed behind upgradeable proxies. That means:

  • The proxy address is the “live” contract address you call.
  • The actual implementation code — and the full ABI — is elsewhere.
  • Explorers label this with “Read as Proxy” / “Write as Proxy” tabs.

That’s where the full ABI lives. If you use the proxy’s short ABI, you’ll swear the contract “has no functions.”

Do this every time:

  • Use the proxy address for all calls.

  • Fetch ABI from the implementation (the “as proxy” view).

  • Keep the ABI locally; don’t paste 1,500 lines inline. Store it in abi/token.json (or similar) and load it with json.loads or your own read_json helper.
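One way to load a saved ABI is sketched here. `load_abi` and `function_names` are hypothetical helpers; the second is an optional check that can reveal a proxy's short ABI, because the functions you expect will be missing from it.

```python
import json
from pathlib import Path


def load_abi(path: str) -> list:
    """Read an ABI saved from the explorer and check that it looks like an ABI."""
    abi = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(abi, list):
        raise ValueError(f"{path}: expected a JSON array of ABI entries")
    return abi


def function_names(abi: list) -> set:
    """Names of callable functions, handy for spotting a proxy's short ABI."""
    return {entry["name"] for entry in abi if entry.get("type") == "function"}


# Usage, assuming the file exists:
# abi = load_abi("abi/token.json")
# assert "balanceOf" in function_names(abi), "looks like the proxy's short ABI"
```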

A simple EVM read stack that scales

Framework: Explorer → ABI → Contract → Methods → Types

  • Explorer

    • Find the address of the thing (token, NFT, dApp contract).
    • Detect proxy. If so, open “Read as Proxy” and scroll to ABI.
  • ABI

    • Save it locally as a .json file under abi/.
    • For big projects, keep a folder of ABIs: abi/erc20.json, abi/erc721.json, abi/app_specific.json.
    • Optional advanced move: keep a minimal Python dict ABI with only methods you call (e.g., name, symbol, decimals, balanceOf). It’s lean and fast to iterate on.
  • Contract

    • Create once: contract = w3.eth.contract(address=addr, abi=abi)
  • Methods

    • Read example: await contract.functions.name().call()
    • Remember the .call() and the await. Forget either and you’ll chase ghosts.
  • Types

    • Human-readable strings: name(), symbol().
    • Ints: decimals() (uint8), balances (uint256).
    • You don’t need to cast types — web3.py maps them correctly.
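Put together, the Contract, Methods, and Types steps might look like the sketch below. `read_token_metadata` is a hypothetical name, `w3` is assumed to be an AsyncWeb3 instance, and the ABI is assumed to include name, symbol, and decimals.

```python
import asyncio


async def read_token_metadata(w3, address: str, abi: list) -> dict:
    """Create the contract once, then read name, symbol and decimals concurrently."""
    contract = w3.eth.contract(address=w3.to_checksum_address(address), abi=abi)

    # Each .call() returns an awaitable; gather sends the three reads at once.
    name, symbol, decimals = await asyncio.gather(
        contract.functions.name().call(),
        contract.functions.symbol().call(),
        contract.functions.decimals().call(),
    )
    # No casting needed: strings arrive as str, uint8 arrives as int.
    return {"name": name, "symbol": symbol, "decimals": decimals}
```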

The ERC‑20 trick that makes your life easier

Most fungible tokens implement the ERC‑20 interface. That means name, symbol, decimals, totalSupply, balanceOf, allowance, etc., have identical signatures across tokens.

Consequences:

  • One ABI can be reused for DAI, USDT, USDC, etc. As long as a token is ERC‑20‑compliant, a single erc20.json ABI covers most of your reads.
  • You still use each token’s address. The ABI is the interface; the address is the instance.
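As an illustration of the reusable interface, a minimal read-only ERC‑20 ABI can be written directly in Python. The `_view` helper and the `ERC20_READ_ABI` name are hypothetical; the entries follow the standard ABI JSON layout and the ERC‑20 function signatures.

```python
def _view(name: str, inputs: list, output_type: str) -> dict:
    """Build one read-only function entry in standard ABI JSON form."""
    return {
        "type": "function",
        "name": name,
        "stateMutability": "view",
        "inputs": [{"name": n, "type": t} for n, t in inputs],
        "outputs": [{"name": "", "type": output_type}],
    }


# Only the read methods discussed here; pass this list as abi=... for any ERC-20 token.
ERC20_READ_ABI = [
    _view("name", [], "string"),
    _view("symbol", [], "string"),
    _view("decimals", [], "uint8"),
    _view("totalSupply", [], "uint256"),
    _view("balanceOf", [("account", "address")], "uint256"),
    _view("allowance", [("owner", "address"), ("spender", "address")], "uint256"),
]
```

The same list works for every compliant token; only the address passed to w3.eth.contract changes.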

The ERC‑721 gotcha that will ruin a demo

NFTs (ERC‑721) don’t have decimals; each token is a single, non-divisible unit. A decimals() call on an NFT contract either fails when the node executes it or fails earlier, in Python, because decimals isn’t in the ABI you loaded. If you must probe contract type dynamically, wrap decimals() in try/except and log a clear error.
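A defensive probe could look like this sketch. `probe_decimals` and the logger name are hypothetical, and the except clause is deliberately broad as an assumption: it is meant to cover both a missing ABI entry and a failed on-chain call without importing library-specific exception classes.

```python
import logging
from typing import Optional

log = logging.getLogger("reader")


async def probe_decimals(contract, label: str) -> Optional[int]:
    """Return decimals() if the contract exposes it, otherwise None."""
    try:
        # Attribute access can already fail here if decimals is not in the ABI.
        return await contract.functions.decimals().call()
    except Exception as exc:  # broad on purpose: ABI mismatch or failed call
        log.warning("%s: decimals() unavailable (%s: %s)", label, type(exc).__name__, exc)
        return None
```

A `None` result is a reasonable signal to treat the contract as non-fungible, or at least as not ERC‑20, and to skip unit scaling for it.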

Why your async feels sync (and how to fix it)

  • Use AsyncWeb3(AsyncHTTPProvider(rpc_url)) directly. There’s no need to wrap sync with middleware in modern versions.

  • Entry point: asyncio.run(main()) where main() is your coroutine.

  • Run independent reads concurrently with asyncio.gather, for example many balanceOf calls across addresses and tokens.

  • Beware property-like awaitables:

    • await w3.eth.gas_price not await w3.eth.gas_price()
    • await w3.eth.max_priority_fee not await w3.eth.max_priority_fee()
  • Prefer HTTP for simple stateless polling. Use WebSockets only if you really need subscriptions.
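When an RPC starts throttling, one common approach is to bound concurrency with a semaphore. The sketch below uses only asyncio; `gather_limited` is a hypothetical helper name and the default limit of 10 is an arbitrary assumption to tune per provider.

```python
import asyncio
from typing import Awaitable, Iterable, List, TypeVar

T = TypeVar("T")


async def gather_limited(awaitables: Iterable[Awaitable[T]], limit: int = 10) -> List[T]:
    """Like asyncio.gather, but with at most `limit` awaitables running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(aw: Awaitable[T]) -> T:
        async with semaphore:
            return await aw

    # Results come back in input order, same as asyncio.gather.
    return await asyncio.gather(*(guarded(aw) for aw in awaitables))


# Usage inside a coroutine, assuming `contract` and `addresses` already exist:
# balances = await gather_limited(
#     [contract.functions.balanceOf(a).call() for a in addresses], limit=5
# )
```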

Practical patterns you’ll use immediately

  • Units conversion

    • Native: w3.from_wei(value, 'gwei'), w3.from_wei(value, 'ether')
    • Tokens: human = Decimal(raw) / 10**decimals, with Decimal from the decimal module (raw // 10**decimals is floor division and would truncate 2.085 to 2)
  • Checksums everywhere

    addr = w3.to_checksum_address(addr_str)

  • Validating connectivity

    await w3.is_connected()

  • Probing the network

    await w3.eth.chain_id

    await w3.eth.block_number

  • Gas context (EIP‑1559)

    await w3.eth.gas_price

    await w3.eth.max_priority_fee

  • Contract reads

    await contract.functions.name().call()

    await contract.functions.decimals().call()

    await contract.functions.balanceOf(addr).call()

Finding whales without overthinking it

  • Native whales: scan balances of addresses in a chain (native get_balance per address). Sort descending. This is fast and cheap.

  • Token whales: use ERC‑20 balanceOf. Don’t forget decimals. Keep balances in raw ints internally, add a formatting function at output time.

  • For a practical source of addresses: pick recent transactions from an explorer, or reputable exchange-labeled addresses (e.g., “Exchange Hot Wallet 1”). Build your sample list and go.
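The token-whale ranking might be sketched as follows. `rank_token_holders` is a hypothetical name, `contract` is assumed to be an async ERC‑20 contract instance, and the addresses are assumed to be checksummed already.

```python
import asyncio
from decimal import Decimal


async def rank_token_holders(contract, holders: list) -> list:
    """Return (address, human_balance) pairs, richest first."""
    decimals = await contract.functions.decimals().call()
    raw_balances = await asyncio.gather(
        *(contract.functions.balanceOf(addr).call() for addr in holders)
    )
    # Sort on raw integers; convert to Decimal only when building the output.
    ranked = sorted(zip(holders, raw_balances), key=lambda pair: pair[1], reverse=True)
    scale = Decimal(10) ** decimals
    return [(addr, Decimal(raw) / scale) for addr, raw in ranked]
```

For native balances the same shape applies, with `w3.eth.get_balance(addr)` in place of `balanceOf` and a fixed 18 decimals on Ethereum.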

Doing more with explorers without scraping headaches

  • For NFT collections: navigate from a mint page to its contract via “View on Etherscan.” The “Token Tracker” page gives holders and top holders. Copy the top holder address and call balanceOf for them via the contract.

  • For ownership and pause status: if an NFT contract implements Ownable or Pausable, you’ll find owner() and paused() in the Read tab. Call them directly via your contract instance.

  • For unverified contracts: the ABI tab may be missing. Your options are:

    • Use a canonical interface (e.g., ERC‑20 ABI for tokens; ERC‑721 ABI for NFTs) and hope the contract conforms.
    • Or bail out early rather than guessing. A clean try/except and a clear log beat undefined behavior.

A simple, reliable architecture for your reader scripts

Framework headline: the 7-layer reader sandwich

  1. Runtime layer
    Pin Python to a supported version (e.g., 3.11).
    Create a venv and keep it per-project.

  2. Dependencies layer
    Pin web3.py to a known-stable release (do not grab today’s release by default).
    Pick one HTTP async client (curl_cffi, aiohttp, or httpx) and pin it.
    Optional: fake-user-agent for non-web3 HTTP calls.

  3. Configuration layer
    Store RPCs in .env or config. Keep a fallback RPC per network.
    Store ABI files under abi/.

  4. Connectivity layer
    AsyncWeb3(AsyncHTTPProvider(rpc))
    Sanity: is_connected, chain_id, block_number

  5. Normalization layer
    to_checksum_address on all addresses
    from_wei only for native; custom 10**decimals for tokens

  6. Contract layer
    w3.eth.contract(address=..., abi=...)
    Keep minimal Python dict ABIs if you only need a subset (name, symbol, decimals, balanceOf)

  7. Concurrency layer
    Wrap independent reads in asyncio.gather. Cap concurrency if needed to avoid RPC throttling.
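The configuration and connectivity layers, including the fallback RPC, could be wired up roughly like this. Everything named here is an assumption for illustration: the environment variable pattern (`RPC_<NETWORK>` and `RPC_<NETWORK>_FALLBACK`), the helper names, and the `make_w3` factory, which is expected to turn a URL into an async client.

```python
import os


def rpc_candidates(network: str) -> list:
    """Primary and fallback RPC URLs from the environment (hypothetical variable names)."""
    keys = (f"RPC_{network.upper()}", f"RPC_{network.upper()}_FALLBACK")
    return [os.environ[key] for key in keys if os.environ.get(key)]


async def first_connected(urls: list, make_w3):
    """Return a client for the first URL that answers is_connected(), else raise."""
    for url in urls:
        w3 = make_w3(url)
        try:
            if await w3.is_connected():
                return w3
        except Exception:  # treat any transport error as "this endpoint is down"
            continue
    raise ConnectionError(f"none of {len(urls)} RPC endpoints responded")


# Usage inside a coroutine, with web3.py installed:
# from web3 import AsyncWeb3, AsyncHTTPProvider
# w3 = await first_connected(
#     rpc_candidates("ethereum"), lambda url: AsyncWeb3(AsyncHTTPProvider(url))
# )
```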

Common mistakes and their precise fixes

  • Mistake: Calling from_wei on an ERC‑20 token.
    Fix: Fetch decimals() and divide by 10**decimals.

  • Mistake: Comparing addresses as raw strings (one lowercased, the other checksummed).
    Fix: Normalize both with to_checksum_address and compare.

  • Mistake: Using the proxy’s short ABI, not the implementation’s ABI.
    Fix: “Read as Proxy” in explorer → copy full ABI.

  • Mistake: Forgetting .call() or await on contract methods.
    Fix: Always write the full form, e.g. await contract.functions.name().call()

  • Mistake: Calling decimals() on an NFT.
    Fix: Wrap in try/except; ERC‑721 has no decimals. Use name, symbol, tokenURI instead.

  • Mistake: Building sync-first because it feels simpler.
    Fix: Start with async; your future self will run 50 wallets in parallel with one gather.

Step-by-step guide: a beginner’s checklist you can trust

Tools and setup

  • Install Python 3.11.
  • Install PyCharm Community (or your preferred IDE).
  • Create a project and a venv.
  • Pin package versions in requirements.txt.

Dependencies

  • web3.py (stable, not freshest).
  • One of: curl_cffi, aiohttp, or httpx.
  • Optional: fake-user-agent.

Connect

  • Choose an RPC from Chainlist.
  • AsyncWeb3(AsyncHTTPProvider(rpc))
  • await w3.is_connected() → True
  • await w3.eth.chain_id → matches explorer
  • await w3.eth.block_number → increments

Baseline network data

  • await w3.eth.gas_price (wei)
  • await w3.eth.max_priority_fee (EIP‑1559 networks; awaitable attribute)
  • Convert with from_wei for display.

Addresses

  • Normalize everything: to_checksum_address(addr_str)
  • Store checksummed addresses in code/config.

Native balance

  • await w3.eth.get_balance(addr) → wei
  • from_wei(..., 'ether') for display.

Token basics (ERC‑20)

  • Save abi/erc20.json locally.
  • contract = w3.eth.contract(address=token_addr, abi=erc20_abi)
  • Read name, symbol, decimals.
  • raw = await contract.functions.balanceOf(addr).call()
  • human = Decimal(raw) / 10**decimals (not raw // 10**decimals, which drops the fractional part)

NFTs (ERC‑721)

  • Use ERC‑721 ABI; do not call decimals().
  • Use name, symbol, possibly tokenURI.

Proxies

  • Always check explorer for “as Proxy.”
  • Use proxy address + implementation ABI.

Sorting and whale detection

  • Build a list of addresses.
  • For each addr: get native balance or token balanceOf.
  • Normalize values. Sort descending.
  • Print address and human-readable balance.

Error handling

  • Wrap reads that may not exist (e.g., decimals() on NFTs) in try/except.
  • Log the ABI mismatch clearly. Move on.

Concurrency

  • Run independent calls with asyncio.gather.
  • Limit concurrency if your RPC gets grumpy.

Questions you’ll Google less after this

  • Why is max_priority_fee missing parentheses?
    It’s an awaitable attribute in async web3.py. Write await w3.eth.max_priority_fee.

  • Why do my token numbers look like dust in scientific notation?
    You used from_wei on a token. Use 10**decimals scaling instead.

  • Why does ABI X work for token Y?
    ERC‑20 interface is standard across compliant tokens. Same ABI, new address, same methods.

  • Why can’t I find allowance in the ABI?
    You copied the proxy’s short ABI. Get the full ABI from “Read as Proxy.”

  • Why does my code blow up comparing two addresses?
    One is checksummed, the other isn’t. Normalize both with to_checksum_address.

Real-world mini-scenarios (you can ship these this afternoon)

  • Top holder of an NFT collection that is still minting:
    Explorer → collection → “Token Tracker” → “Holders” → copy top holder address.
    In code: NFT contract instance → balanceOf(top_holder).
    Print name, symbol, holder balance.

  • Sorting USDT balances across 10 addresses:
    Gather 10 addresses (recent tx senders, exchange wallets, etc.).
    ERC‑20 contract for USDT at the Ethereum address.
    Fetch decimals (USDT uses 6).
    For each address: balanceOf → integer → divide by 10**6.
    Sort and print richest to poorest.

  • Owner and paused status of an NFT:
    NFT contract → read owner() and paused() (if present).
    If paused() returns False, the contract is not paused, which usually means mint is open; if True, it is paused.
    Note: Not every NFT implements Pausable. Wrap in try/except.

A few Python basics that save hours, not minutes

Reading stack traces: bottom line is the error type; earlier lines show file and line number. Go there first.

Inputs are strings by default. Convert explicitly when you expect numbers: int(input("...")). If you don’t, "1" + "2" == "12".

Augmented assignment is your friend: x += 1, x //= 10, etc. Write idiomatic code from the start.

Strings concatenate, numbers add. That’s a feature, not a bug. Don’t mix them in the same operation unless you mean it.

Final Thoughts

The EVM is simple; our tools make it feel complicated. Most “Web3 bugs” in read-only code are scaling errors in units, the wrong ABI for the right address, or forgetting that NFTs don’t have decimals. If you adopt a few non-negotiables (checksummed addresses everywhere, proxies handled by default, integers all the way down with formatting at the edges), your scripts become predictable, fast, and easy to extend.

Start with the basics:

  • Pin your Python and packages.
  • Connect once, verify often.
  • Normalize addresses and units aggressively.
  • Reuse ERC‑20 and ERC‑721 ABIs for what they are good at.
  • Keep reads async from day one.

Then ship something small today: print name, symbol, and top holder balance for one NFT, or sort 10 addresses by USDT holdings. The confidence you get from one correct script is worth more than another night of “research.”

What’s the one thing you can automate this week that will save you an hour every week after? Build that reader now.
