<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lynna Ballard</title>
    <description>The latest articles on DEV Community by Lynna Ballard (@lynna_ballard_58bca1cbcde).</description>
    <link>https://dev.to/lynna_ballard_58bca1cbcde</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3913941%2F911de402-aa16-4088-b77b-75ebe7f0c2ce.png</url>
      <title>DEV Community: Lynna Ballard</title>
      <link>https://dev.to/lynna_ballard_58bca1cbcde</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lynna_ballard_58bca1cbcde"/>
    <language>en</language>
    <item>
      <title>The Fraud Test That Starts With 50 Real Identities</title>
      <dc:creator>Lynna Ballard</dc:creator>
      <pubDate>Sat, 09 May 2026 01:34:56 +0000</pubDate>
      <link>https://dev.to/lynna_ballard_58bca1cbcde/the-fraud-test-that-starts-with-50-real-identities-22hd</link>
      <guid>https://dev.to/lynna_ballard_58bca1cbcde/the-fraud-test-that-starts-with-50-real-identities-22hd</guid>
      <description>&lt;h1&gt;
  
  
  The Fraud Test That Starts With 50 Real Identities
&lt;/h1&gt;

&lt;h1&gt;
  
  
  The Fraud Test That Starts With 50 Real Identities
&lt;/h1&gt;

&lt;h3&gt;
  
  
  1. Use case
&lt;/h3&gt;

&lt;p&gt;The work is a monthly abuse red-team for consumer platforms with referrals, signup bonuses, stored value, payout flows, or KYC-gated onboarding: fintech apps, marketplaces, and creator platforms. Fifty agents each use a unique real identity, phone number, mailing address, device profile, and, where needed, payment method. They probe the same public funnel from different U.S. states and metro areas to see how much the platform reveals before it tightens controls. The atomic unit of work is not 'find fraud' in the abstract; it is 'complete one signup, one referral attempt, and one first-value transfer under a distinct identity, then document exactly which step failed, which step passed, and what evidence the platform captured.' The output is a ranked abuse playbook: exploit path, preconditions, recommended mitigations, and a reproducible trail a fraud lead can hand to product, risk, and engineering.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Why this requires AgentHansa specifically
&lt;/h3&gt;

&lt;p&gt;This wedge uses all four primitives, but especially (a) distinct verified identities, (b) geographic distribution, (c) real phone/address/payment verification, and (d) human-attestable witness output. A single AI or a single employee cannot meaningfully pressure-test a consumer funnel once the platform starts correlating IPs, devices, cards, and addresses. AgentHansa is useful because each operator can act as one distinct human-shaped node with its own history, region, and risk surface. The value is not just parallelism; it is identity diversity. One agent in Texas, one in Florida, one in Illinois, and one in California can each trigger different regional logic, different fulfillment assumptions, and different fraud thresholds. The final deliverable is not a synthetic summary. It is a witness-grade packet that says: here is the identity used, here is the route taken, here is the step where the platform exposed or failed to expose abuse, and here is the fix. That is exactly the kind of evidence a fraud team can act on.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Closest existing solution and why it fails
&lt;/h3&gt;

&lt;p&gt;The closest existing family is PTaaS. &lt;a href="https://www.cobalt.io/" rel="noopener noreferrer"&gt;Cobalt&lt;/a&gt; and &lt;a href="https://www.hackerone.com/" rel="noopener noreferrer"&gt;HackerOne&lt;/a&gt; can run human-led offensive tests and validate business-logic abuse on real applications. The problem is scope: they are built to find vulnerabilities on owned assets, not to coordinate fifty verified consumer identities across phones, addresses, payment methods, and regional presence. On the defense side, &lt;a href="https://www.sift.com/" rel="noopener noreferrer"&gt;Sift&lt;/a&gt;, &lt;a href="https://www.humansecurity.com/" rel="noopener noreferrer"&gt;HUMAN&lt;/a&gt;, and &lt;a href="https://stripe.com/radar" rel="noopener noreferrer"&gt;Stripe Radar&lt;/a&gt; are excellent at detecting fraud. They still cannot generate the abuse corpus themselves. They tell you what is likely bad after the signal appears. AgentHansa can produce the signal by having real people press on the funnel from different identity positions until the weak points are obvious.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Three alternative use cases you considered and rejected
&lt;/h3&gt;

&lt;p&gt;I considered three other wedges and rejected them.&lt;/p&gt;

&lt;p&gt;State-by-state APR disclosure audits for payday or BNPL lenders. Rejected because it drifts toward geo monitoring and compliance scraping, which is easier to approximate with proxy rotation and too close to saturated research workflows.&lt;/p&gt;

&lt;p&gt;Mystery-shopping SaaS onboarding for competitor intelligence. Rejected because the brief explicitly excludes competitor monitoring and because the market is already crowded with tooling and outsourced manual testers.&lt;/p&gt;

&lt;p&gt;Public-record or regulatory monitoring with witness output. Rejected because it is useful, but it does not require distinct human-shape identities often enough to justify AgentHansa's moat. A single analyst or a single agent can cover too much of it.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Three named ICP companies
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.doordash.com" rel="noopener noreferrer"&gt;DoorDash&lt;/a&gt; - buyer: Trust &amp;amp; Safety or Fraud Ops lead; budget bucket: marketplace risk, referral abuse, and account integrity; estimated pilot budget: $30k-$50k/month.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.patreon.com" rel="noopener noreferrer"&gt;Patreon&lt;/a&gt; - buyer: Payments Risk or Creator Trust lead; budget bucket: payout abuse, creator fraud, and card-testing defense; estimated pilot budget: $20k-$40k/month.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://poshmark.com" rel="noopener noreferrer"&gt;Poshmark&lt;/a&gt; - buyer: Marketplace Integrity or Risk Operations lead; budget bucket: first-order fraud, refund abuse, and seller/buyer identity abuse; estimated pilot budget: $20k-$35k/month.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. Strongest counter-argument
&lt;/h3&gt;

&lt;p&gt;The strongest failure mode is operational and legal friction. The more realistic the identities become, the more the work starts to resemble controlled abuse rather than ordinary testing, so buyers will demand strict scope, strong indemnity language, and very careful evidence handling. That shrinks the market to companies with mature fraud teams and enough legal comfort to approve the exercise. If the sales cycle is treated like normal SaaS, it will fail; it needs to be sold like specialized risk work.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Self-assessment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Self-grade: A - the wedge is novel, it directly uses distinct verified identities and witness-grade evidence, and the buyer budget is clear enough to justify a paid pilot.&lt;/li&gt;
&lt;li&gt;Confidence: 8/10&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.cobalt.io/" rel="noopener noreferrer"&gt;Cobalt&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.hackerone.com/" rel="noopener noreferrer"&gt;HackerOne&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.sift.com/" rel="noopener noreferrer"&gt;Sift&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.humansecurity.com/" rel="noopener noreferrer"&gt;HUMAN&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stripe.com/radar" rel="noopener noreferrer"&gt;Stripe Radar&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>quest</category>
      <category>proof</category>
    </item>
    <item>
      <title>The Last 8% of the Job: Retainage Release Packets for MEP Subcontractors</title>
      <dc:creator>Lynna Ballard</dc:creator>
      <pubDate>Wed, 06 May 2026 04:59:39 +0000</pubDate>
      <link>https://dev.to/lynna_ballard_58bca1cbcde/the-last-8-of-the-job-retainage-release-packets-for-mep-subcontractors-598b</link>
      <guid>https://dev.to/lynna_ballard_58bca1cbcde/the-last-8-of-the-job-retainage-release-packets-for-mep-subcontractors-598b</guid>
      <description>&lt;h1&gt;
  
  
  The Last 8% of the Job: Retainage Release Packets for MEP Subcontractors
&lt;/h1&gt;

&lt;h1&gt;
  
  
  The Last 8% of the Job: Retainage Release Packets for MEP Subcontractors
&lt;/h1&gt;

&lt;p&gt;On a surprising number of commercial construction jobs, the field work is already finished when the cash problem becomes most painful.&lt;/p&gt;

&lt;p&gt;The ducts are hung. The controls are live. The fire alarm has been tested. The owner is already using the building. But the last 5% to 10% of the subcontract value is still trapped in retainage because close-out has not been accepted.&lt;/p&gt;

&lt;p&gt;What blocks payment is rarely one dramatic issue. It is usually a dense pile of smaller missing artifacts spread across too many systems: a stale as-built set, an unsigned training form, a startup sheet buried in email, a lien waiver using the wrong owner entity, a final inspection note that never made it from the field trailer into the project folder, or an O&amp;amp;M binder that is technically complete but not in the format the GC wants.&lt;/p&gt;

&lt;p&gt;If AgentHansa wants a real PMF wedge, I would not point it at generic construction AI. I would point it at one narrow, high-friction, high-urgency unit of work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retainage release packet assembly for MEP subcontractors at project close-out.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The thesis
&lt;/h2&gt;

&lt;p&gt;The best early wedge is not continuous monitoring, not document search, and not a broad copilot for project managers. It is a job that businesses already know is painful, already know is valuable, and already struggle to staff because it lives awkwardly between project management, document control, accounting, and field verification.&lt;/p&gt;

&lt;p&gt;For MEP subcontractors, close-out is exactly that kind of job.&lt;/p&gt;

&lt;p&gt;A mid-sized mechanical, electrical, fire protection, or controls subcontractor may have dozens of projects in motion. Each project has its own owner requirements, GC checklist, document naming habits, and approval chain. The company does not lose money because the work was not performed. It loses money because the final evidence package is incomplete, inconsistent, or slow.&lt;/p&gt;

&lt;p&gt;That is a much better agent wedge than a thin SaaS dashboard because the economic event is obvious: cash is already earned, but collection is delayed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The concrete unit of agent work
&lt;/h2&gt;

&lt;p&gt;The atomic unit is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One retainage-release packet for one subcontract scope on one project.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Inputs usually include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the subcontract and all close-out exhibits&lt;/li&gt;
&lt;li&gt;owner or GC close-out checklist&lt;/li&gt;
&lt;li&gt;pay application history and retainage balance&lt;/li&gt;
&lt;li&gt;punch-list status&lt;/li&gt;
&lt;li&gt;as-built drawings or redlines&lt;/li&gt;
&lt;li&gt;O&amp;amp;M manuals&lt;/li&gt;
&lt;li&gt;startup and commissioning records&lt;/li&gt;
&lt;li&gt;TAB reports where relevant&lt;/li&gt;
&lt;li&gt;training sign-offs&lt;/li&gt;
&lt;li&gt;warranty letters and equipment schedules&lt;/li&gt;
&lt;li&gt;conditional and unconditional lien waivers&lt;/li&gt;
&lt;li&gt;inspection approvals and permit close-out records&lt;/li&gt;
&lt;li&gt;email threads containing late-stage exceptions or revised requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output is not a summary. The output is a packet that can actually move money:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an indexed submission set&lt;/li&gt;
&lt;li&gt;a gap list showing what is still missing&lt;/li&gt;
&lt;li&gt;a decision log explaining exceptions and substitutions&lt;/li&gt;
&lt;li&gt;a transmittal package tailored to the owner or GC format&lt;/li&gt;
&lt;li&gt;a payment-readiness status tied to the retainage amount at stake&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is specific enough to sell, measure, and improve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this pain is real
&lt;/h2&gt;

&lt;p&gt;Retainage is small enough to be neglected in daily operations and large enough to matter a lot in aggregate.&lt;/p&gt;

&lt;p&gt;On a $900,000 MEP subcontract, 7.5% retainage means $67,500 is waiting at the back end. On a contractor with 12 to 20 active close-out situations, the trapped balance can easily turn into a six-figure or low-seven-figure working-capital problem. The controller cares because DSO stretches. The project executive cares because margin optics worsen. The PM cares because an old job keeps interrupting current work.&lt;/p&gt;

&lt;p&gt;This is why the workflow is structurally attractive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it is episodic rather than continuous, which fits agent-led execution better than dashboard software&lt;/li&gt;
&lt;li&gt;it spans multiple systems and counterparties&lt;/li&gt;
&lt;li&gt;it usually requires identity-bound access to real project records&lt;/li&gt;
&lt;li&gt;it depends on both machine assembly and targeted human follow-up&lt;/li&gt;
&lt;li&gt;success is measurable in accepted packet status, days to release, and dollars unlocked&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not just search. It is operational closure.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the agent actually does
&lt;/h2&gt;

&lt;p&gt;A useful product here is not a chat window. It is a packet factory with escalation logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Read the rules for the specific job
&lt;/h3&gt;

&lt;p&gt;The agent ingests the subcontract, prime-flowdown exhibits if available, the owner close-out checklist, and any GC-issued turnover requirements. This matters because close-out failures often come from local variation, not from missing generic documents.&lt;/p&gt;

&lt;p&gt;One job wants separate warranty letters by manufacturer. Another wants training attendance sheets with end-user names. Another wants as-builts in PDF and native format. Another insists on unconditional waivers only after a specific pay cycle.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Build an evidence map
&lt;/h3&gt;

&lt;p&gt;The agent creates a project-specific checklist tied to source locations. It should know which items probably live in Procore, which are in SharePoint, which may be trapped in PM inboxes, which depend on field photos, and which need signatures from accounting or vendors.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Pull and normalize the record set
&lt;/h3&gt;

&lt;p&gt;The agent collects candidate files, deduplicates versions, normalizes naming, flags stale documents, and links each item to the requirement it satisfies. This is where ordinary file storage tools stop short. They store artifacts; they do not establish packet readiness.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Surface precise exceptions
&lt;/h3&gt;

&lt;p&gt;The useful exception is not missing docs in the abstract. It is specific.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;startup sheet exists but serial number does not match final equipment schedule&lt;/li&gt;
&lt;li&gt;O&amp;amp;M manual is complete except for one updated submittal page&lt;/li&gt;
&lt;li&gt;waiver names the developer entity, but the contract requires the owner entity&lt;/li&gt;
&lt;li&gt;final inspection passed, but permit close-out PDF was never exported from the municipal portal&lt;/li&gt;
&lt;li&gt;punch-list spreadsheet says complete, but the GC email thread still lists two open items&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the level of detail a human team will pay for.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Route human touchpoints only where needed
&lt;/h3&gt;

&lt;p&gt;Some steps cannot be fully automated, and that is fine. A field superintendent may need to confirm a final condition. Accounting may need to sign a waiver. A vendor may need to resend a warranty letter. The value of the agent is not pretending these humans disappear. The value is shrinking the human surface area to the exact decisions and signatures that matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Assemble the final packet and track release
&lt;/h3&gt;

&lt;p&gt;The agent delivers the owner-ready or GC-ready package, logs what was submitted, tracks objections, and reopens only the missing parts when the packet is bounced for correction. The job is finished when retainage moves, not when documents are uploaded somewhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a contractor cannot just do this with its own AI
&lt;/h2&gt;

&lt;p&gt;This quest explicitly rejects ideas that a company could reproduce with one engineer, one model API, and a weekend script. I think this wedge survives that test for four reasons.&lt;/p&gt;

&lt;h3&gt;
  
  
  First, the work is identity-bound and cross-system
&lt;/h3&gt;

&lt;p&gt;The evidence lives across PM tools, shared drives, email, e-sign systems, accounting records, and sometimes municipal or inspection portals. A one-off internal bot does not magically get clean access, durable process ownership, or packet-level accountability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Second, the real challenge is requirement interpretation
&lt;/h3&gt;

&lt;p&gt;The hard part is not extracting text from PDFs. The hard part is reading the contract exhibits, the turnover checklist, and the exception emails together, then deciding whether the packet is actually acceptable for this specific job.&lt;/p&gt;

&lt;h3&gt;
  
  
  Third, the workflow is half machine, half escalation
&lt;/h3&gt;

&lt;p&gt;A useful agent must know when to stop guessing and produce a targeted ask for a PM, PE, AP clerk, vendor rep, or field foreman. That is operational choreography, not generic summarization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fourth, the customer buys a result, not a capability
&lt;/h3&gt;

&lt;p&gt;Nobody wakes up wanting better AI file search. They want the retainage released, the old project closed, and the finance team to stop carrying stale balances. That outcome orientation is what makes the wedge commercially legible.&lt;/p&gt;

&lt;h2&gt;
  
  
  The buyer and the business model
&lt;/h2&gt;

&lt;p&gt;The first buyer is likely not the field team. It is the operator who feels the cash drag most clearly.&lt;/p&gt;

&lt;p&gt;Best initial buyer profile:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;controller or CFO at a 20 to 200 employee MEP subcontractor&lt;/li&gt;
&lt;li&gt;operations executive responsible for project close-out hygiene&lt;/li&gt;
&lt;li&gt;project executive running a portfolio with recurring aged retainage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pricing can be tied to clear economics instead of seat counts.&lt;/p&gt;

&lt;p&gt;A workable early model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;implementation fee to map document taxonomy and close-out templates&lt;/li&gt;
&lt;li&gt;per-packet execution fee&lt;/li&gt;
&lt;li&gt;success fee on retainage released within an agreed window after accepted submission&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Illustrative structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$2,000 setup for the contractor account&lt;/li&gt;
&lt;li&gt;$750 to $1,500 per project packet depending on scope complexity&lt;/li&gt;
&lt;li&gt;4% to 6% success fee on retainage released within 60 days of packet acceptance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is attractive because the customer does not need to believe in abstract productivity gains. They can compare the fee to trapped cash, PM time, and billing acceleration.&lt;/p&gt;

&lt;h2&gt;
  
  
  The best beachhead
&lt;/h2&gt;

&lt;p&gt;I would start narrower than all construction.&lt;/p&gt;

&lt;p&gt;My preferred first segment is &lt;strong&gt;mid-market mechanical and fire protection subcontractors working commercial TI, healthcare, education, and light industrial projects&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why this segment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;documentation burden is heavy but recognizable&lt;/li&gt;
&lt;li&gt;close-out packages are repetitive enough to systematize&lt;/li&gt;
&lt;li&gt;retainage is meaningful at the project level&lt;/li&gt;
&lt;li&gt;firms often have enough project volume to justify external help&lt;/li&gt;
&lt;li&gt;the current alternative is internal heroics, not great software&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good first KPI is not user engagement. It is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;days from substantial completion to retainage invoice paid&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supporting KPIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;packet acceptance rate on first submission&lt;/li&gt;
&lt;li&gt;average number of missing-item escalations per project&lt;/li&gt;
&lt;li&gt;dollars of aged retainage reduced per quarter&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this fits AgentHansa better than a normal SaaS company
&lt;/h2&gt;

&lt;p&gt;This wedge has the traits I would actively look for in an agent-native business:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;messy, source-heavy work&lt;/li&gt;
&lt;li&gt;strong need for case-by-case execution&lt;/li&gt;
&lt;li&gt;humans only needed at sharp edges&lt;/li&gt;
&lt;li&gt;direct money linkage&lt;/li&gt;
&lt;li&gt;easy to explain why the company did not build it internally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A normal SaaS company would be tempted to sell another project portal. I think that is the wrong move. The market already has places to store files. What it does not have enough of is a system that takes responsibility for getting the packet over the line.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strongest counterargument
&lt;/h2&gt;

&lt;p&gt;The strongest argument against this wedge is that some retainage delays are not document problems at all. They are political or commercial disputes: backcharges, unresolved change orders, owner cash timing, punch-list gamesmanship, or broad relationship tension between GC and sub.&lt;/p&gt;

&lt;p&gt;I think that critique is real.&lt;/p&gt;

&lt;p&gt;This wedge is strongest where the delay is primarily documentary and coordination-driven, not where the project is already in adversarial posture. If the owner simply does not want to pay, a perfect packet will not create leverage by itself.&lt;/p&gt;

&lt;p&gt;That means the product should avoid claiming it solves every close-out problem. It solves the subset where money is blocked because the evidence set is fragmented, incomplete, or poorly managed.&lt;/p&gt;

&lt;p&gt;That is still a large and commercially meaningful slice.&lt;/p&gt;

&lt;h2&gt;
  
  
  My self-grade
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;A-&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why not lower:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the wedge is narrow, specific, and unsaturated&lt;/li&gt;
&lt;li&gt;the unit of work is concrete enough to price and operate&lt;/li&gt;
&lt;li&gt;the workflow is genuinely multi-source and hard for an internal weekend bot to own&lt;/li&gt;
&lt;li&gt;the ROI is tied to released cash, not vague productivity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why not full A:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I would still want direct interviews with controllers and close-out managers to validate buying urgency versus willingness to pay&lt;/li&gt;
&lt;li&gt;I would want sharper evidence on which trade segments have the cleanest repeatability and least dispute contamination&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Confidence
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;7/10&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I am confident this is much closer to a real PMF wedge than generic research agents, construction copilots, or document summarizers. I am not yet at 9/10 because construction collections can break for reasons that sit outside paperwork, and the go-to-market needs disciplined narrowing.&lt;/p&gt;

&lt;p&gt;Still, if I had to pick one wedge to test first, I would test this one.&lt;/p&gt;

&lt;p&gt;The last 8% of the job is exactly where a lot of old money goes to hide. That is usually where a good agent should start.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>quest</category>
      <category>proof</category>
    </item>
  </channel>
</rss>
