Why I Rent Expertise and Own Judgment

The crew I didn't build

Feb 10, 2026

I needed to know when regulatory filings change in ways that matter.

So I built a monitoring system last week. It watches a list of URLs, downloads whatever’s there, compares it to yesterday, decides whether the change is significant, and alerts me if it is.

Took about three hours.

Here’s what I didn’t build: the URL validator, the diff engine, or the change detection logic.

I built the part that’s actually mine: what counts as important to me.

A minimal isometric illustration of a central control room with industrial machinery in warm muted tones.

Let me back up.

I had a simple problem. A bunch of documents live at URLs. Regulatory filings, policy pages, terms of service. They change sometimes. When they change in ways that matter, I need to know.

Sounds like a weekend project. Download files, compare them, send an alert.

Except it’s not that simple.

Some of those URLs are sketchy. Phishing domains that look legitimate. Redirects to malware. The internet is full of traps, and my system was about to fetch whatever I pointed it at, no questions asked.

So I need URL validation. Is this domain trustworthy? How old is it? Does it have a history of malicious behavior? Is it impersonating something else?

That’s not a weekend project. That’s a security research problem. Months of work to do well. Years to do really well.

Then there’s the diff problem.

Two CSVs. One from yesterday, one from today. What changed?

Naive answer: run a diff. Here are 200 lines that are different.

Except 180 of those are formatting noise. A column moved. Dates reformatted. Whitespace changed. Nothing actually changed — it just looks different.

The 20 real changes? Buried in the noise.

Building a diff engine that understands the difference between “formatting changed” and “content changed” is hard. Really hard. You need to understand structure, not just text. You need heuristics that work across different file types. You need to handle edge cases that seem obvious until you try to code them.

That’s not a weekend project either.

So now my “simple monitoring system” requires:

A security research team for URL validation
A data engineering team for intelligent diffing
Me, to decide what matters

Three months of work. Minimum. For something I thought would take a weekend.

This is the build trap.

You start with a simple idea. You discover it requires hard subproblems. Each subproblem looks tractable until you’re deep into it. Before you know it, you’re building infrastructure instead of solving your actual problem.

Most people respond to this in one of two ways.

Option A: Build it anyway. Spend three months. End up with a mediocre URL validator and a diff engine that mostly works. Your actual problem, what changes matter, gets 10% of your attention because the other 90% went to infrastructure.

Option B: Give up. The problem was too big. Maybe next quarter. Maybe never.

There’s a third option now.

I didn’t build the URL validator.

I used one. An agent that’s already solved the “is this URL safe to fetch” problem. Checks domain age, reputation, redirect chains, known malicious patterns. Returns a verdict: safe, suspicious, or dangerous.

I didn’t build the diff engine.

I used one. An agent that looks at two versions of structured data and tells you what actually changed. Not “here are 200 different lines.” Instead: “3 real changes, 197 formatting noise.”

What I built was the part that’s actually mine.

The backstory: what am I monitoring and why. The goals: what kind of changes matter to me. The decision logic: when do I alert a human, when do I act automatically, when do I ignore.

That’s the part nobody else can build. That’s the part that requires my context, my judgment, my risk tolerance.

The URL validation? Commodity expertise. Someone else already solved it.

The intelligent diffing? Commodity expertise. Someone else already solved it.

My monitoring system became a composition. Three agents working together. Two I rented. One I built.

Three hours.

This is what crews actually are.

Not frameworks. Not orchestration layers. Not another abstraction to learn.

A crew is a collection of agents solving a problem together. Some you build. Some you rent. The magic is knowing which is which.

The question isn’t “how do I build an agent that does X.”

The question is: “Is X my secret sauce, or is it something I should rent from someone who’s already solved it?”

If URL safety is your core business, build the URL agent. Make it the best in the world. That’s your edge.

If URL safety is a subproblem you need solved so you can get to your actual work? Rent it. Don’t spend months becoming a mediocre security researcher when you could be great at the thing you actually care about.

Same logic applies everywhere.

Need to parse PDFs? That’s probably not your secret sauce. Rent it.

Need to extract entities from news articles? Probably not your edge. Rent it.

Need to assess credit risk using your proprietary methodology that you’ve refined over fifteen years? That’s your edge. Build it. Own it. Don’t let anyone else touch it.

The crew pattern makes this composition explicit. You’re not building a monolith. You’re assembling authorities.

Here’s what my monitoring crew looks like:

URL safety check - Rent. Commodity. Someone else has better threat intel.
Structured diff - Rent. Commodity. “Real change vs. noise” is solved.
Is this filing material? - Unclear. Could go either way depending on how differentiated my judgment is. Maybe a situation where multiple opinions would be helpful?
What matters to me - Build. My context. My risk model. My workflow.
Orchestration - Build. How I wire things together is part of my secret sauce.

The rented agents don’t know about each other. They don’t know about my problem. They just do their job well.

The agent I built knows everything. It knows why I’m monitoring these URLs. It knows what kind of changes I care about. It knows when to wake me up at 2am versus when to log it for tomorrow.

That’s the separation of concerns.

Specialist expertise at the edges. My judgment in the middle.

Here’s how the wiring actually works.

The orchestration layer is simple. A loop that processes URLs, gates on safety, fetches if clean, diffs against previous, then decides.

async def process_url(url: str, previous_content: dict) -> Alert | None:
    
    # First gate: is this URL safe to touch?
    safety = await url_checker.check(url)
    
    if safety.verdict == "dangerous":
        log.warn(f"Skipping malicious URL: {url}")
        return None
    
    if safety.verdict == "suspicious":
        return Alert(
            type="manual_review",
            reason=f"Suspicious URL flagged: {safety.reason}",
            url=url
        )
    
    # Safe to fetch
    current_content = await fetch_document(url)
    
    # What actually changed?
    diff = await rvl.compare(
        previous=previous_content.get(url),
        current=current_content
    )
    
    if diff.real_changes == 0:
        log.info(f"No real changes: {url} ({diff.formatting_changes} formatting only)")
        return None
    
    # This is the part that's mine
    decision = await my_agent.evaluate(
        url=url,
        changes=diff.real_changes,
        context=get_monitoring_context(url)
    )
    
    if decision.should_alert:
        return Alert(
            type=decision.alert_type,
            reason=decision.reasoning,
            changes=diff.real_changes,
            url=url
        )
    
    return None

The url_checker and rvl calls are the rented expertise. They’re specialized agents - I call them over the network, they return structured responses. I don’t know or care how they work internally. I just trust their verdict.

The my_agent.evaluate call is where my judgment lives:

class MonitoringAgent:
    
    def __init__(self, config: MonitoringConfig):
        self.goals = config.goals  # What am I watching for?
        self.thresholds = config.thresholds  # When do I care?
        self.escalation = config.escalation  # Who gets woken up?
    
    async def evaluate(self, url: str, changes: list[Change], context: dict) -> Decision:
        
        # This prompt encodes MY judgment, MY priorities
        prompt = f"""
        You are monitoring {url} for {self.goals}.
        
        The following real changes were detected:
        {format_changes(changes)}
        
        Context about why this document matters:
        {context}
        
        Based on my thresholds:
        - Alert immediately if: {self.thresholds.immediate}
        - Flag for review if: {self.thresholds.review}
        - Ignore if: {self.thresholds.ignore}
        
        What should I do?
        """
        
        response = await llm.complete(prompt)
        return parse_decision(response)

That’s the whole thing. The rented agents handle “is this safe” and “what actually changed.” My agent handles “does this matter to me.”

The orchestration (the order of operations, the error handling, the logging) that’s also mine. It expresses my workflow. Someone else monitoring different documents for different reasons would wire it differently.

Three agents. Maybe 50 lines of orchestration. The hard problems (URL security, intelligent diffing) are solved by people who specialize in them. The easy problem (what do I care about) is the only part I had to think through.

A hand places a golden piece into a modular structure in a digital isometric illustration.

The economics of this are counterintuitive.

Renting agents feels expensive. You’re paying for something you could theoretically build yourself.

But building agents is way more expensive. You’re just not accounting for the cost correctly.

The URL safety agent I used has been running for months. Thousands of calls. Edge cases discovered and handled. Threat intelligence updated. A team thinking about this problem full-time.

If I built my own, I’d get a snapshot. Whatever I knew about URL safety at the moment I built it. No updates. No edge case refinement. No team iterating on it.

Renting gets me the accumulated expertise. Building gets me my own amateur hour.

The only exception: when the thing you’re building is the expertise. When you’re going to be the authority that other people rent.

Then you build. And you build it better than anyone else. And you make it available to the people who should be renting instead of building.

This is the mental shift.

Old model: I need to build a system that does X, Y, and Z.

New model: I need to build a system. X and Z are commodity expertise: I’ll rent those. Y is my secret sauce: I’ll build that.

The crew isn’t a framework you adopt. It’s a way of thinking about problems.

When you design a crew, you’re asking: where does my unique value live? What am I building that nobody else can build? And what am I building that I shouldn’t be building, because someone else has already solved it better?

The monitoring system I built in three hours would have taken three months if I’d built everything myself. And the version I built in three hours is better, because the rented components are better than what I would have built.

Rent expertise. Own judgment.

That’s the crew I didn’t build.

Discussion about this post

Ready for more?