Rapid Prototyping · Internal Tool · Context Radar

Session Notes: Building a Lead Signal Engine

Designing and iterating on a system to surface high-signal consulting opportunities.

The Problem

Finding consulting opportunities as a senior product person is fundamentally different from finding a full-time job. The best opportunities aren't posted on job boards. They're buried in company blogs, mentioned in passing during funding announcements, or hidden in the gaps between what a company says they're doing and what they're actually struggling with.

Job boards optimize for volume and full-time roles. Aggregators repost the same listings across multiple sites. And when you do find something labeled "contract," it might be a 6-month engagement worth pursuing, or it might be a government procurement RFP that has nothing to do with product consulting.

I didn't need another lead generation tool that blasts out cold emails. I needed something that would increase my surface area of awareness: a system that could scan wider than I could manually, surface contexts that might be worth attention, and then let me decide what's actually interesting.

Why I Built This as a Fast, Iterative Tool

I didn't set out to design the "perfect" system for finding consulting work. Perfect is the enemy of useful, especially when you're still figuring out what the problem actually is.

One of the best things about having modern tools at our disposal is the ability to build throwaway code that solves real problems. I don't write PRDs anymore; I build until I've satisfied my own understanding of the problem, then share immediately for feedback. The code might be rough, but it's real, and that's what matters.

What I wanted was something I could get immediate value from, even if it was rough: something I could run, look at, react to, and then adjust as my understanding of the problem evolved. The fastest way for me to think clearly about a messy problem is to put a real artifact in front of myself.

So the goal of this first version was deliberately simple:

  • work end to end,
  • surface real results,
  • and give me something concrete to react to.

I started with two data sources: Greenhouse board scanning (for structured ATS data) and Google Custom Search (to go wide and find things posted on company sites or niche boards). The pipeline ran, returned results, scored them with keyword matching. Technically, it worked.
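To give a sense of how naive that first scoring pass was, it amounted to something like the sketch below: a flat keyword count with no notion of how signals combine. The keyword list and scoring here are illustrative, not the actual code.

// Illustrative sketch of the MVP's bag-of-words scoring: any keyword hit counts,
// so "flexible full-time" scores just as well as "flexible contract."
const CONTRACT_KEYWORDS = ['contract', 'interim', 'fractional', 'part-time', 'flexible'];

function naiveScore(text: string): number {
  const textLower = text.toLowerCase();
  return CONTRACT_KEYWORDS.filter((keyword) => textLower.includes(keyword)).length;
}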

But once I could see actual outputs, good ones and bad ones, it became obvious where my assumptions were wrong: which signals were too noisy, which sources were misleading, and which distinctions I hadn't articulated clearly enough yet.

That feedback loop is the point. Rather than over-designing upfront, I optimized for short build cycles, visible output, and the ability to change direction quickly as my thinking sharpened. This approach let the product thinking emerge through the work: each iteration clarified the real problem a little more, which then informed the next change.

It's the same way I like to work with teams: build something small that's real, learn from it immediately, and let the system get smarter as your understanding does.

The MVP Ran, but It Wasn't Solving the Right Problem Yet

The first version ran end to end. It pulled data, surfaced results, and gave me something real to look at. But once I started reviewing the output, it was clear it wasn't quite solving the problem I actually cared about.

Most of what came back was noise:

  • full-time roles slipping through because they mentioned "flexible" or "hybrid,"
  • job board aggregators repeating the same postings across multiple sites,
  • product roles that didn't translate into real consulting opportunities (government RFPs, defense contracts, etc.),
  • and legitimate contract roles buried under hundreds of irrelevant results.

Nothing was technically "broken": the system was doing what I asked it to do. But the results weren't helping me decide who to reach out to. That gap was useful.

It forced me to step back and realize the issue wasn't tuning or scoring. The problem was how I was framing the question in the first place. I wasn't trying to find product jobs. I was trying to surface situations where outside product help would actually make sense, whether they'd posted a role or not.

That reframe changed everything. It shaped the next iteration.

Reality Check

Before jumping into fixes, I sat down and catalogued exactly how the system was failing. Understanding failure modes is how you know what to actually fix.

  1. Aggregator pollution. Sites like Jobright, BuiltIn, and Indeed were ranking well in search results but adding no signal, just reposting the same roles I'd already seen elsewhere.
  2. "Contract" ≠ consulting opportunity. The word appears everywhere: government RFPs, defense procurement, legal boilerplate. Most of these had nothing to do with product consulting.
  3. Snippets lie. Google search snippets often cut off at misleading points, making it look like a contract role when the actual employment type was buried deeper in the full description.
  4. Full-time leakage. Roles that mentioned "flexible" or "hybrid" were getting through because my bag-of-words scoring couldn't distinguish between "flexible contract" and "flexible full-time."

But here's what became clear: the problem wasn't the data sources, and it wasn't the scoring algorithm. It was something more fundamental.

Breakthrough: Split the Problem

The unlock came when I stopped trying to optimize one pipeline for two incompatible goals. I needed two lanes, not one blended system.

The split, at a glance:

  • Lane A (explicit demand): "Contract Product Manager," "Interim Head of Product," "Fractional CPO." Gate: employment_type ∈ allowlist.
  • Lane B (implicit demand): "Building product function from scratch," "Leading platform migration," "Series B, first product hire." Gate: context_reason ≠ null.

Separate thresholds, separate outputs.

Lane A: Explicit Demand

They posted a contract, interim, or part-time role. The employment type is stated clearly. These need hard gating: if the detected employment type isn't in my allowlist, reject it. No exceptions. This lane is about precision, not recall.

Lane B: Implicit Demand

They didn't post a contract role, but something risky or interesting is happening: a migration, a platform rebuild, a monetization shift, a first product hire. These need context detection, not keyword matching: compositional rules that look for combinations of signals. This lane is about recall, not precision.

Once I separated them, I could tune each lane independently. The explicit lane got stricter filters. The implicit lane got broader context detection. And suddenly, both started working better.

Signal Tiers

  • Boulder: leadership + time-bounded ("Interim CPO," "Acting VP Product")
  • Rock: senior role + initiative ("Contract PM leading migration")
  • Pebble: advisory + seeking language ("Product advisor," "part-time PM")

System Design: From Lead Engine → Context Radar

Renaming the thing from "Lead Engine" to "Context Radar" was a small shift, but it clarified the purpose. This isn't about generating leads. It's about surfacing contexts that make me curious β€” situations where I might be able to help, whether they've posted a role or not.

That reframe led to a set of design principles that shaped how the system works:

  • Surface contexts, not answers. Don't pretend the system knows what's worth pursuing. Show me why something matched, what signals triggered it, and let me decide.
  • Optimize for "this makes me curious." False positives are fine if they're interesting. False negatives, missing something that would have been worth pursuing, are the real failure.
  • Explainability > score. A result with tier_reasons: ["interim", "head of product"] tells me instantly if it's worth clicking. A score of 7.2 tells me nothing. I need to understand why something matched, not just that it did.
  • Fewer, better items. 20 well-qualified, well-explained results per run beats 200 noisy ones. Quality over quantity, always.

These principles shaped how I structured the queries, how I classified results, and how I presented them. Everything flows from the core idea: this is a tool for augmented judgment, not automated decision-making.
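To make that concrete, a surfaced item looks roughly like the shape below. The tier, lane, tierReasons, and context_reason fields come from the snippets that follow; the other field names are illustrative.

// Rough shape of one surfaced item. Field names beyond tier, lane, tierReasons,
// and context_reason are illustrative, not the exact schema.
interface RadarItem {
  url: string;
  title: string;
  lane: 'explicit' | 'implicit';
  tier: 'boulder' | 'rock' | 'pebble';
  tierReasons: string[];         // why it matched, e.g. ['interim', 'head of product']
  context_reason: string | null; // implicit lane only, e.g. 'platform migration'
  employmentType: string | null; // detected employment type, if any
}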

Tiered web queries: structured by lane and priority
discover.ts
const WEB_QUERIES: TieredQuery[] = [
  // BOULDER EXPLICIT - Leadership + time-bounded terms
  { tier: 'boulder', lane: 'explicit', 
    query: `("interim head of product" OR "interim CPO")`, 
    purpose: 'Interim leadership explicit' },
  
  // ROCK EXPLICIT - Contract/FTC + product role  
  { tier: 'rock', lane: 'explicit',
    query: `("contract product manager") ("platform" OR "analytics")`,
    purpose: 'Contract PM initiative' },
  
  // ROCK IMPLICIT - Initiative + senior role
  { tier: 'rock', lane: 'implicit',
    query: `("product manager") ("migration" OR "replatform")`,
    purpose: 'PM leading migration' },
  
  // PEBBLE IMPLICIT - Advisory/help seeking
  { tier: 'pebble', lane: 'implicit',
    query: `("looking for" OR "seeking") ("product advisor")`,
    purpose: 'Seeking product help' },
];
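Each tiered query then goes out to Google Custom Search. A minimal sketch of running one query against the Custom Search JSON API, assuming env var names and a simple result shape that aren't from the real code:

// Sketch: execute one tiered query against the Google Custom Search JSON API.
// GOOGLE_CSE_KEY and GOOGLE_CSE_ID are assumed env var names, not the real config.
async function runQuery(tiered: TieredQuery) {
  const params = new URLSearchParams({
    key: process.env.GOOGLE_CSE_KEY!,
    cx: process.env.GOOGLE_CSE_ID!,
    q: tiered.query,
    num: '10',
  });
  const res = await fetch(`https://www.googleapis.com/customsearch/v1?${params}`);
  const data = await res.json();
  // Carry tier/lane metadata forward so downstream gating knows which rules apply
  return (data.items ?? []).map((item: { link: string; title: string; snippet: string }) => ({
    url: item.link,
    title: item.title,
    snippet: item.snippet,
    tier: tiered.tier,
    lane: tiered.lane,
  }));
}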

What Changed

Once I had the two-lane model, the improvements became obvious. Here's what changed, in order of impact:

1. Hard employment-type gating for explicit lane

If the detected employment type isn't in the allowlist, the result is rejected from the explicit lane. No exceptions, no scoring, no second chances. This eliminated most of the full-time leakage and government RFP noise.

Employment type detection: allowlist-based gating
discover.ts
// Eligible employment types - ONLY these pass the EXPLICIT lane gate
const ELIGIBLE_EMPLOYMENT_TYPES = [
  'contract', 'interim', 'fractional', 'consulting', 'fixed-term',
  'part-time', 'temporary', 'freelance', 'secondment', 'retainer',
];

// Domain blocklist - skip these aggregator/spam sites entirely
const DOMAIN_BLOCKLIST = [
  'jobright.ai', 'builtin.com', 'indeed.com', 'glassdoor.com',
  'ziprecruiter.com', 'simplyhired.com', 'talent.com', 'monster.com',
  'linkedin.com', 'reddit.com', 'quora.com', 'medium.com',
  // ... 20+ more
];
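The gate itself is just a membership check against that allowlist. One way isEligibleEmploymentType (referenced in the lane-assignment code further down) could be implemented, assuming a simple substring match on the detected type:

// Sketch of the allowlist gate: no detected type means no pass, and the detected
// type must contain one of the eligible terms. Assumes substring matching is enough.
function isEligibleEmploymentType(employmentType: string | null): boolean {
  if (!employmentType) return false;
  const normalized = employmentType.toLowerCase().trim();
  return ELIGIBLE_EMPLOYMENT_TYPES.some((eligible) => normalized.includes(eligible));
}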

2. Domain blocklist for aggregators

Kill known bad domains before they even hit scoring. LinkedIn, Indeed, Glassdoor, BuiltIn, Reddit: all gone. If a domain doesn't add signal, it doesn't get processed. This cut the noise by about 60%.
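A sketch of that pre-filter, assuming results arrive as URLs (isBlockedDomain is an illustrative helper name, not from the real code):

// Illustrative pre-filter: blocklisted domains are dropped before scoring runs at all.
function isBlockedDomain(url: string): boolean {
  try {
    const host = new URL(url).hostname.replace(/^www\./, '');
    return DOMAIN_BLOCKLIST.some((domain) => host === domain || host.endsWith(`.${domain}`));
  } catch {
    return true; // an unparseable URL adds no signal either
  }
}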

3. ATS trust bonus only for real ATS domains

Greenhouse, Lever, Ashby get a source quality bonus because they're structured and reliable. Random company blogs don't. This helps prioritize results that are more likely to be actionable.
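A minimal sketch of that bonus; the domains and the weight here are assumptions, not the real values:

// Assumed domains and weight: structured ATS boards get a small source-quality bump.
const ATS_DOMAINS = ['greenhouse.io', 'lever.co', 'ashbyhq.com'];

function sourceQualityBonus(url: string): number {
  const host = new URL(url).hostname;
  return ATS_DOMAINS.some((domain) => host === domain || host.endsWith(`.${domain}`)) ? 1 : 0;
}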

4. Compositional rules for Boulder/Rock/Pebble

A "Boulder" isn't just a keyword hit. It requires leadership title AND time-bounded signal. A "Rock" requires senior role AND initiative signal. This eliminates most false positives by requiring multiple signals to align.

Tier classification: compositional rules, not bag-of-words
discover.ts
function classifyTier(fullText: string): TierResult {
  const textLower = fullText.toLowerCase();

  // Collect the matching terms so they can double as human-readable tier reasons
  const matchTerms = (terms: string[]) => terms.filter((term) => textLower.includes(term));

  // BOULDER: Leadership title + time-bounded signal
  const matchedLeadership = matchTerms(LEADERSHIP_TITLES);
  const matchedTime = matchTerms(TIME_BOUNDED_SIGNALS);

  if (matchedLeadership.length > 0 && matchedTime.length > 0) {
    return { tier: 'boulder', tierReasons: [...matchedLeadership, ...matchedTime] };
  }

  // ROCK: Senior role + initiative signal
  const matchedRole = matchTerms(SENIOR_PRODUCT_ROLES);
  const matchedInit = matchTerms(INITIATIVE_SIGNALS);

  if (matchedRole.length > 0 && matchedInit.length > 0) {
    return { tier: 'rock', tierReasons: [...matchedRole, ...matchedInit] };
  }

  // PEBBLE: Advisory signals + seeking language
  const matchedAdvisory = matchTerms(ADVISORY_SIGNALS);
  const matchedSeeking = matchTerms(SEEKING_SIGNALS);

  if (matchedAdvisory.length > 0 && matchedSeeking.length > 0) {
    return { tier: 'pebble', tierReasons: [...matchedAdvisory, ...matchedSeeking] };
  }

  return { tier: 'unqualified', tierReasons: [] };
}

5. Separate lane assignment logic

Each lane has its own gate. The explicit lane checks employment type. The implicit lane checks for context signals. They don't interfere with each other anymore.

Lane assignment: separate gates for explicit vs. implicit
discover.ts
function assignLane(
  fullText: string,
  employmentType: string | null,
  hasProductRole: boolean
): LaneAssignment {
  // EXPLICIT lane: employment type in allowlist
  if (isEligibleEmploymentType(employmentType)) {
    return { lane: 'explicit', context_reason: null };
  }
  
  // Skip full-time for both lanes
  if (employmentType === 'full-time') {
    return { lane: null, context_reason: null };
  }
  
  // IMPLICIT lane: context detected with product role
  if (hasProductRole) {
    const contextReason = detectImplicitContext(fullText);
    if (contextReason) {
      return { lane: 'implicit', context_reason: contextReason };
    }
  }
  
  return { lane: null, context_reason: null };
}
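detectImplicitContext isn't shown above. A rough sketch of what it could look like, assuming context signals are grouped into lists the way the tier terms are (the signal lists here are examples, not the real configuration):

// Illustrative: return the first implicit-demand context found in the text, or null.
const IMPLICIT_CONTEXT_SIGNALS: Record<string, string[]> = {
  'platform migration': ['migration', 'replatform', 're-platform'],
  'building product function': ['first product hire', 'founding product', 'building the product function'],
  'monetization shift': ['monetization', 'pricing overhaul'],
};

function detectImplicitContext(fullText: string): string | null {
  const textLower = fullText.toLowerCase();
  for (const [reason, signals] of Object.entries(IMPLICIT_CONTEXT_SIGNALS)) {
    if (signals.some((signal) => textLower.includes(signal))) {
      return reason;
    }
  }
  return null;
}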

6. Fetch HTML for borderline results

For results that score near the threshold, the system can fetch the actual page HTML and re-run detection. No AI, just more context to make a better decision. This catches cases where Google snippets were misleading.
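A sketch of that step; SCORE_THRESHOLD and BORDERLINE_MARGIN are assumed constants rather than the real tuning values, and the tag-stripping is deliberately crude:

// Sketch: for near-threshold results, pull the real page and re-run tier detection on the full text.
const SCORE_THRESHOLD = 5;   // assumed value
const BORDERLINE_MARGIN = 1; // assumed value

async function recheckBorderline(result: { url: string; score: number }): Promise<TierResult | null> {
  if (Math.abs(result.score - SCORE_THRESHOLD) > BORDERLINE_MARGIN) return null;
  const response = await fetch(result.url);
  if (!response.ok) return null;
  const html = await response.text();
  const fullText = html.replace(/<[^>]*>/g, ' '); // crude tag strip is enough for term detection
  return classifyTier(fullText);
}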

Before: Lead Engine

  • One mixed pipeline of results
  • "Technically correct, practically useless"
  • Aggregator pollution everywhere
  • Full-time roles leaking through
  • Bag-of-words scoring (any keyword = match)

After: Context Radar

  • Two lanes with separate thresholds
  • Surfaces contexts, not answers
  • Hard domain blocklist for aggregators
  • Explicit employment-type gating
  • Compositional rules (role + signal + context)

What This Says About How I Work

This whole process, from MVP to two-lane system to Context Radar, reveals something about how I approach product problems:

  • Define the problem, but be willing to refine it as new information surfaces. I started thinking I needed a "lead engine." But when the MVP output showed me what I was actually looking for, I realized the problem wasn't finding leads; it was surfacing contexts worth attention.
  • Prototyping and rapid iteration beat endless planning at these early stages. The fastest way to think clearly about a messy problem is to put a real artifact in front of yourself. The MVP output was mostly useless, but it taught me what actually mattered. You can't learn that from theory; you need real outputs to react to.
  • When something isn't working, check if you're solving the right problem. The MVP wasn't broken; it was doing exactly what I asked. But the results weren't helping me decide. That gap forced me to realize the issue wasn't tuning or scoring. It was how I was framing the question. Sometimes the best fix is to reframe the problem entirely.

What's Next

This is a working internal tool, not a product. It solves my problem, and that's enough. The plan going forward is simple:

  • Keep tuning sources and context detectors as I learn what actually converts: which signals lead to conversations, which don't.
  • Improve the model to further refine context insights, but only once the signal is clean enough to be worth refining. Garbage-in-garbage-out applies whether you're using rules or models. Clean the signal first, then improve the model.
  • Don't overbuild. It runs on Vercel, takes 30 seconds, outputs a JSON file. That's enough. More infrastructure wouldn't make it better, just more complicated.

The system works because it's focused. It does one thing well: surfaces contexts worth attention. That's the whole point.
