You have 15 candidates interested in a housekeeping position. You contact them on WhatsApp one by one. The first one doesn't have her papers in order. The third one is asking for double the offered salary. The seventh lives on the other side of the city and isn't willing to commute. You've spent three hours chatting and you've only found two candidates worth presenting to the family.
And when the family asks "why do you recommend this one and not that one?" all you have are scattered notes to justify your reasoning.
The problem isn't the number of candidates. It's that the initial screening is completely manual, inconsistent, and leaves no comparable data.
The Bottleneck Nobody Sees
In domestic worker recruitment, the WhatsApp screening stage is the step that consumes the most time and produces the least useful data. It's pure conversation — no structure.
The result is an accumulation of problems:
- Long interviews with incompatible candidates. The deal-breaker (paperwork, salary, location) shows up at turn 8 of a 12-turn conversation. All the prior time was wasted.
- No traceability. There's no structured record of why each candidate was rejected. If the family asks, you're improvising.
- No comparability. Every interview has different questions depending on how the conversation flowed. Comparing candidates to each other is impossible.
- Total dependency on the operator. If the person running the interviews is busy, the process stops.
The solution isn't a new spreadsheet or a Google Form. It's an AI agent that conducts interviews via WhatsApp autonomously, with judgment, and produces reports you can use directly.
What I Built
A three-workflow system in n8n that operates like this:
Workflow 1: Start interview. Receives the job posting and the candidate from the database, generates a personalized interview plan based on the job type (live-in, live-out, nanny, elderly caregiver), and sends the first message via WhatsApp.
Workflow 2: Conduct the interview. Triggered by each message from the candidate. The agent reads the full context (job posting, plan, conversation history, data already extracted) and decides what to ask next, whether to follow up on something ambiguous, or whether it found an incompatibility that warrants closing.
Workflow 3: Generate the report. When the interview closes (due to incompatibility, complete coverage, or reaching the 12-question limit), a separate reporting model analyzes the full conversation and produces two documents: a technical JSON with scores per dimension, sentiment, and intent level, and a clear Markdown summary for the family.
The Agent: Carla
The interviewing agent's name is Carla. She's a warm, professional recruiter — at least that's how she presents herself and how she behaves.
Carla knows several things from the very first turn:
- The job type, neighborhood, salary, schedule, and main responsibilities
- The non-negotiable requirements (if the candidate fails any of them, the interview closes immediately)
- The prioritized question plan for that specific position, generated by a previous model that analyzes the job posting
Her conversation rules are simple but fundamental:
- One question per message. This is WhatsApp, not an in-person interview.
- Maximum two sentences per turn. If the response is evasive, she asks again once. Just once.
- The data she confirms (paperwork, salary, experience, availability) is extracted as structured JSON on each turn and accumulated in Postgres.
- If she detects a deal-breaker, she closes immediately. She thanks the candidate warmly and leaves the door open for future positions. She doesn't keep asking.
The hard limit is 12 questions. If no deal-breaker comes up, the interview ends as "interview completed" once the plan is covered, or as "limit reached" when she hits question 12 with topics still pending. Either way, the report works with everything she managed to cover.
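The closing rules above can be sketched as a small decision function. This is an illustrative sketch, not the actual n8n code node: the `InterviewState` fields and the reason strings are assumed names, and the precedence (deal-breaker first, then coverage, then the limit) is one reasonable reading of the rules.

```python
from dataclasses import dataclass
from typing import Optional

MAX_QUESTIONS = 12  # hard limit stated in the article


@dataclass
class InterviewState:
    # Hypothetical shape; the real interview row may look different.
    questions_asked: int
    deal_breaker_found: bool
    pending_dimensions: list[str]  # plan dimensions not yet covered


def close_reason(state: InterviewState) -> Optional[str]:
    """Return why the interview should close, or None to keep asking."""
    if state.deal_breaker_found:
        return "incompatibility"
    if not state.pending_dimensions:
        return "interview_completed"
    if state.questions_asked >= MAX_QUESTIONS:
        return "limit_reached"
    return None


# A deal-breaker at question 3 closes even though topics remain.
print(close_reason(InterviewState(3, True, ["references"])))
```

Checking the deal-breaker before anything else keeps the "close immediately" rule from being masked by the other two conditions.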
How the Question Plan Works
Before the first message, a normalizing model analyzes the job description and produces a JSON with:
- The critical dimensions to evaluate and in what order (deal-breaker detection questions come first)
- The specific deal-breakers for that position and how to detect them in conversation
- Specific questions based on job type (an elderly caregiver position has questions about medication management and mobility assistance; a nanny position, about first aid and age-appropriate stimulation)
Ten dimensions are common to every position and stay stable across job types: basic identity, current documents, prior experience, references, availability, salary match, health and physical capacity, relevant habits, communication, and motivation.
Order matters: if salary is a frequent deal-breaker, it's asked at turn 2. Not turn 10.
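To make the ordering idea concrete, here is a minimal sketch of how a plan could be sorted so that deal-breaker checks come first. The entry fields (`dimension`, `deal_breaker`, `priority`) are hypothetical; the article does not show the real plan schema.

```python
import json

# Hypothetical plan entries; field names are illustrative only.
plan = [
    {"dimension": "motivation", "deal_breaker": False, "priority": 3},
    {"dimension": "salary match", "deal_breaker": True, "priority": 1},
    {"dimension": "prior experience", "deal_breaker": False, "priority": 1},
    {"dimension": "current documents", "deal_breaker": True, "priority": 2},
]


def order_plan(entries):
    """Deal-breaker dimensions first, then ascending priority within each group."""
    # `not e["deal_breaker"]` is False (sorts first) for deal-breaker entries.
    return sorted(entries, key=lambda e: (not e["deal_breaker"], e["priority"]))


ordered = order_plan(plan)
print(json.dumps([e["dimension"] for e in ordered]))
```

With this ordering, salary and documents are asked before experience or motivation, which is what lets an incompatible interview close in the first few turns.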
The Reports: One for You, One for the Family
When the interview closes, the pipeline generates two completely different documents.
Technical report (JSON): For the recruiter. It includes the match score (0–100), the recommendation (advance/maybe/reject), turn-by-turn sentiment analysis, the candidate's intent level, concrete red and green flags, and any pending questions if the interview closed before covering them. The match score is a weighted average of the per-dimension scores, with non-negotiable requirements weighted triple, and it is forced to 0 if there is any deal-breaker.
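The match score rule can be written out in a few lines. This is a sketch under assumptions: the function name, the dictionary shape, and the rounding to an integer are illustrative choices, and only the triple weight and the forced zero come from the description above.

```python
def match_score(scores, non_negotiable, deal_breaker):
    """Weighted average of per-dimension scores (0-100).

    scores: dict mapping dimension name -> score
    non_negotiable: set of dimension names that count triple
    deal_breaker: True forces the result to 0
    """
    if deal_breaker:
        return 0
    total = 0.0
    weight_sum = 0.0
    for dimension, score in scores.items():
        weight = 3.0 if dimension in non_negotiable else 1.0
        total += weight * score
        weight_sum += weight
    # Rounding to an integer is an assumption for display purposes.
    return round(total / weight_sum) if weight_sum else 0


# Made-up scores for illustration, not real candidate data.
scores = {
    "current documents": 100,
    "salary match": 70,
    "references": 50,
    "communication": 90,
}
non_negotiable = {"current documents", "salary match"}

# (3*100 + 3*70 + 50 + 90) / (3 + 3 + 1 + 1) = 650 / 8 = 81.25
print(match_score(scores, non_negotiable, deal_breaker=False))  # 81
```

Because the non-negotiable dimensions carry six of the eight weight units here, a weak reference score moves the result far less than a weak salary match would.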
Client report (Markdown): For the family. No technical jargon. Includes the recommendation with emoji (✅ advance, ⚠️ maybe, ❌ do not advance), the two or three most relevant points about the candidate, strengths in everyday language, and "topics to discuss in person" framed as areas to explore further — not as flaws. 200–350 words. Readable in one minute.
An example of the client report for a candidate with a match score of 78:
# Marlene G. — Elderly Caregiver
**Recommendation:** ✅ Advance to in-person interview
**Compatibility with the position:** 78/100
## Most Relevant Points
8 years of experience, the last 3 caring for an elderly woman with
mild dependency. Lives nearby, paperwork in order, available immediately.
## Strengths
- Experience closely aligned with what you're looking for
- Lives nearby, low risk of delays
- Clear and warm communication throughout the conversation
## Topics to Discuss in Person
- Asking for $650,000 net (offer is $620,000) — minor gap, negotiable
- Only provided one verifiable reference. Ask for a second.
- Explore specifics of medication management for this position.
## How the Conversation Felt
Warm and honest. She asked twice about the start date and asked
for details about the neighborhood — clear signals of genuine interest.
The Technical Architecture
The full stack:
| Component | Tool |
|---|---|
| Workflow engine | n8n (self-hosted) |
| LLM (all roles) | OpenAI GPT-4o |
| Database | Postgres (Supabase) |
| Messaging channel | WhatsApp via HTTP Request (placeholder) |
The database has five tables: positions, candidates, interviews, messages, and reports. The interviews table accumulates the structured data extracted by the agent in a JSONB field that gets merged turn by turn using Postgres's || operator.
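One detail worth keeping in mind with this approach: for two JSONB objects, `||` is a shallow merge, so a top-level key in the new value replaces the old one entirely instead of merging nested objects. The sketch below mimics that behavior in Python next to an example query. The column name `extracted_data` and the positional parameters are assumptions; the article only names the table.

```python
# Hypothetical column name and parameters; shown as a string, not executed here.
MERGE_SQL = """
UPDATE interviews
SET extracted_data = COALESCE(extracted_data, '{}'::jsonb) || $1::jsonb
WHERE id = $2
"""


def jsonb_concat(left: dict, right: dict) -> dict:
    """Mimic jsonb || jsonb for two objects: top-level keys from `right` win."""
    return {**left, **right}


accumulated = {"documents": {"id_card": True}, "salary_expectation": 650000}
turn_data = {"documents": {"health_card": True}, "availability": "immediate"}

merged = jsonb_concat(accumulated, turn_data)
print(merged)
# "documents" now holds only {"health_card": True}: the nested object
# was replaced, not merged, because the merge is shallow.
```

If nested fields need to survive across turns, one option is to keep the extracted data flat (for example `documents_id_card`, `documents_health_card`) so each turn only overwrites the keys it actually confirms.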
The agent doesn't use tool calls — it emits structured JSON on each turn with the extracted data, whether there was a deal-breaker, whether it should close, the reason, and its response to the candidate. A code node in n8n processes that JSON, decides the new interview state, and triggers the report workflow if applicable.
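As a rough illustration of what that processing step has to do, the sketch below parses the agent's output, checks that the expected keys are present, and derives the next interview status. It is written in Python for readability; the key names (`extracted_data`, `deal_breaker`, `should_close`, `close_reason`, `reply`) and the status strings are assumptions, not the actual schema.

```python
import json

# Assumed output contract for one agent turn.
REQUIRED_KEYS = {"extracted_data", "deal_breaker", "should_close", "close_reason", "reply"}


def parse_agent_turn(raw: str) -> dict:
    """Parse the model output and fail loudly if it is malformed."""
    try:
        turn = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"agent output is not valid JSON: {exc}") from exc
    if not isinstance(turn, dict):
        raise ValueError("agent output must be a JSON object")
    missing = REQUIRED_KEYS - set(turn)
    if missing:
        raise ValueError(f"agent output is missing keys: {sorted(missing)}")
    return turn


def next_status(turn: dict) -> str:
    """A deal-breaker closes the interview even if should_close was left false."""
    if turn["deal_breaker"] or turn["should_close"]:
        return "closed"
    return "in_progress"


raw = json.dumps({
    "extracted_data": {"salary_expectation": 650000},
    "deal_breaker": False,
    "should_close": False,
    "close_reason": None,
    "reply": "Thanks! Could you share a reference from your last job?",
})
turn = parse_agent_turn(raw)
print(next_status(turn))  # in_progress
```

Validating before acting matters here because the model's JSON is the only interface between the conversation and the database; a malformed turn should stop the workflow rather than write partial data.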
Estimated cost per interviewed candidate: $0.16 USD. A 12-turn interview consumes about $0.06 in agent tokens, and generating the two reports adds about $0.10.
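For a sense of scale, the arithmetic can be extended to a whole batch. The per-interview figures are the estimates above; the `batch_cost` helper is just an illustration, and it assumes every interview runs the full 12 turns.

```python
from decimal import Decimal

AGENT_COST = Decimal("0.06")   # agent tokens for a 12-turn interview (estimate)
REPORT_COST = Decimal("0.10")  # generating the two reports (estimate)

per_candidate = AGENT_COST + REPORT_COST


def batch_cost(candidates: int) -> Decimal:
    """Estimated cost in USD if every interview runs the full 12 turns."""
    return candidates * per_candidate


print(per_candidate)   # 0.16
print(batch_cost(15))  # 2.40
```

So the 15-candidate batch from the opening example comes to about $2.40 USD at these estimates.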
What It Solves in Practice
Before, screening 15 candidates took 3–4 hours of fragmented manual conversation. With the agent:
- Deal-breakers are detected at turns 2–3 in 60–70% of cases. Those interviews close in minutes, not hours.
- All evaluated candidates are in the same comparable database. I can sort them by match score, filter by recommendation, and see exactly which dimension each one falls short on.
- The family receives a formatted, clear, jargon-free report. I don't have to write anything by hand.
- The process works even when I'm busy with something else. Carla handles incoming messages and conducts the interview autonomously.
What's Next
The next step is connecting a real WhatsApp provider (I'm evaluating Twilio sandbox for testing and Meta Cloud API for production) and building a minimal UI in Retool or Appsmith on top of Postgres to manage positions and launch interviews without opening n8n.
If you're interested in solving something similar, whether in domestic staff recruiting or any structured qualification process via WhatsApp, the pattern is the same: an agent with injected context, early incompatibility detection, structured data accumulated per turn, and auto-generated reports.