Your recruiting tool just rejected a perfect candidate in 0.3 seconds.
Here is what happened: The candidate's resume said "Go programming language" but your job posting asked for "Golang." The system did not recognize they are the same thing. Candidate rejected.
This happens 35–50% of the time with automated candidate matching.
Why? Because most candidate matching tools use keyword matching. They look for exact phrases, not actual job fit.
The candidate actually knows Golang, has shipped production systems in it, and would be perfect for the role. But the resume said "Go" instead of "Golang," so the automated matcher rejected them.
This is the complete guide to how candidate matching works (and fails): why keyword matching is fundamentally flawed, which matching algorithms are most accurate, how to measure matching quality, why traditional matching misses 40–60% of qualified candidates, how to implement effective matching, and why EvexAI's behavioral vetting achieves 90–95% accuracy while keyword matching achieves 30–40%.
The Candidate Matching Crisis
The problem: Recruiting tools promise automated candidate matching. Reality: Most automated matching is fundamentally broken.
What vendors claim:
- "Match candidates to jobs with 85%+ accuracy"
- "Find the perfect fit automatically"
- "AI identifies top candidates instantly"
What actually happens:
- 35–50% false negatives (qualified candidates rejected)
- 20–30% false positives (unqualified candidates advanced)
- Overall matching accuracy: 30–40% (worse than a coin flip on a match/no-match decision)
Why is matching so hard?
Reason 1: Job requirements are ambiguous
Job posting says: "5+ years software engineering experience"
Does this mean:
- 5+ years at a single company (stability)?
- 5+ years total (cumulative)?
- 5+ years in the specific technology stack (technical depth)?
- 5+ years at director level (leadership)?
Different people interpret "5+ years" differently.
Reason 2: Resume language varies infinitely
Same skill, different terminology:
- "Go" vs. "Golang" vs. "Google Go"
- "UI design" vs. "front-end design" vs. "product design"
- "Led 5 engineers" vs. "managed team of 5" vs. "supervised 5 direct reports"
- "AWS deployment" vs. "cloud infrastructure" vs. "DevOps"
- "Python scripting" vs. "Python development" vs. "Python programming"
A keyword matcher looking for "Golang" misses candidates who list "Go."
Reason 3: Context is crucial but hidden
Candidate's resume says: "10 years experience"
What you need to know (but resume does not say):
- 10 years at same company (deep institutional knowledge)?
- 10 years with career gaps (less hands-on time than the number suggests)?
- 10 years in older tech stack (outdated skills)?
- 10 years in relevant field (transferable)?
Resume shows "10 years" and keyword matcher says "match!" But context is missing.
Reason 4: Job fit requires multi-dimensional matching
A candidate is a real "match" only if they fit on:
- Technical skills (programming language, frameworks)
- Experience level (seniority, years)
- Domain knowledge (industry, problem space)
- Soft skills (communication, leadership, collaboration)
- Personality fit (work style, values, team dynamic)
- Growth trajectory (motivated to learn, advance)
- Compensation alignment (salary expectations)
- Location/relocation preference
A keyword matcher measures 1–2 of these (technical skills and years of experience). It misses the other 6–7.
How Candidate Matching Works (The Technical Deep Dive)
Method 1: Keyword Matching (Most Common, Least Effective)
How it works:
- Parse resume: Extract keywords (languages, frameworks, companies, degrees)
- Parse job posting: Extract required keywords
- Compare: Calculate % of job keywords found on resume
- Score: If % > threshold, candidate is "match"
Example:
Job posting requires: ["Python", "AWS", "5+ years", "SaaS"]
Candidate resume has: ["Python", "AWS", "7 years", "enterprise software"]
Match calculation:
- "Python": ✓ (match)
- "AWS": ✓ (match)
- "5+ years": ✓ (candidate has 7 years)
- "SaaS": ✗ (candidate has "enterprise software", not "SaaS")
- Match score: 3/4 = 75%
If threshold = 75%, candidate advances. If threshold = 100%, candidate rejected.
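The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular vendor implements matching: the function names `has_phrase` and `keyword_match_score`, the regex used to pull out years of experience, and the sample resume strings are all assumptions made for the example.

```python
import re

def has_phrase(phrase: str, text: str) -> bool:
    """True if `phrase` appears in `text` as a whole word or phrase (case-insensitive)."""
    pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
    return re.search(pattern, text.lower()) is not None

def keyword_match_score(required: list[str], min_years: int, resume: str) -> float:
    """Fraction of job requirements found literally on the resume."""
    hits = sum(has_phrase(kw, resume) for kw in required)
    # Assumption: experience is written as "<number> years" somewhere in the resume.
    found = re.search(r"(\d+)\+?\s*years", resume.lower())
    years_ok = found is not None and int(found.group(1)) >= min_years
    return (hits + int(years_ok)) / (len(required) + 1)

resume = "Python, AWS, 7 years, enterprise software"
score = keyword_match_score(["Python", "AWS", "SaaS"], 5, resume)
print(f"{score:.0%}")  # 75%: Python, AWS and years match; "SaaS" does not

go_resume = "Proficient in Go programming language, 5 years experience"
golang_found = has_phrase("Golang", go_resume)
print(golang_found)  # False: the literal string "Golang" never appears
```

The second call shows the failure mode from the opening of this guide: nothing in the code knows that "Go" and "Golang" are the same language.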
Problems with keyword matching:
Problem 1: Exact phrase matching fails
- "Golang" ≠ "Go" (same technology, different words)
- "React.js" ≠ "React" (same technology, different names)
- Keyword matcher sees no match, rejects qualified candidate
Problem 2: Synonym handling is poor
- "UI design" ≠ "user interface design" ≠ "front-end design" (same skill, different words)
- If job posts "UI design" and resume says "user interface design," keyword matcher sees no match
Problem 3: Implicit experience is missed
- Job requires: "AWS experience"
- Resume says: "Deployed applications to cloud infrastructure"
- Implicit: Candidate probably used AWS (most common cloud provider)
- Keyword matcher sees no match, rejects candidate
Problem 4: Experience equivalence is ignored
- Job requires: "5+ years software engineering"
- Candidate has: "3 years engineering + 4 years computer science degree + bootcamp + 2 open source projects"
- Equivalent experience: Yes (9 years if counted as years of experience)
- Keyword matcher sees "3 years" < "5 years," rejects candidate
Problem 5: Domain transfer is invisible
- Job requires: "SaaS experience"
- Candidate has: "10 years enterprise software" (which is not SaaS)
- Reality: Candidate's enterprise skills transfer perfectly to SaaS
- Keyword matcher sees "enterprise" ≠ "SaaS," rejects candidate
Accuracy of keyword matching: 30–40%
Method 2: Simple AI Matching (Keyword + Synonyms)
How it works:
- Parse resume and job posting (same as keyword matching)
- Identify synonyms: "Golang" → also matches "Go", "Google Go"
- Calculate semantic similarity: How similar is resume language to job posting language?
- Score: If similarity > threshold, candidate is match
Improvement over keyword matching:
Recognizes: "Golang" ≈ "Go" (synonym)
But still misses:
- Implicit experience ("cloud infrastructure" → likely AWS)
- Domain transfer ("enterprise software" → applicable to SaaS)
- Experience equivalence (bootcamp + open source = relevant experience)
Accuracy of simple AI matching: 40–50%
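Synonym expansion can be sketched as a lookup from aliases to one canonical skill name. The `ALIASES` table and the `canonical_skills` helper below are hypothetical; a production alias list would be far larger and would need ongoing maintenance.

```python
import re

# Hypothetical alias table, for illustration only.
ALIASES = {
    "golang": ["go", "golang", "google go"],
    "react": ["react", "react.js", "reactjs"],
    "aws": ["aws", "amazon web services"],
}

def canonical_skills(text: str) -> set[str]:
    """Return the canonical skill names whose aliases appear in `text`."""
    lowered = text.lower()
    found = set()
    for canonical, aliases in ALIASES.items():
        for alias in aliases:
            # Whole-word match, so "go" does not fire inside "google" or "golang".
            if re.search(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", lowered):
                found.add(canonical)
                break
    return found

job = canonical_skills("Requires Golang and AWS")
resume = canonical_skills(
    "Proficient in Go programming language; deployed on Amazon Web Services"
)
missing = job - resume
print(sorted(job), sorted(resume), sorted(missing))

# Implicit experience is still invisible: no alias is present in this sentence.
implicit = canonical_skills("Deployed applications to cloud infrastructure")
print(implicit)
```

Two caveats with this sketch: a short alias like "go" will also match the ordinary English verb, and the last call returns an empty set, which is the "implicit experience" gap described above.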
Method 3: Advanced NLP Matching (Natural Language Processing)
How it works:
- Convert resume text to mathematical vector (embedding)
- Convert job posting to mathematical vector
- Calculate cosine similarity: How close are the vectors?
- Score: Similarity % = match probability
Example (simplified):
Resume embedding (vector): [0.8, 0.2, 0.9, 0.1, 0.7, ...]
Job posting embedding: [0.9, 0.1, 0.85, 0.15, 0.8, ...]
Cosine similarity = dot product / (magnitude × magnitude) = 0.92 (92% similar)
If threshold = 85%, candidate advances.
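The cosine similarity calculation itself is short enough to write out. The sketch below uses only the five components shown in the example; the full embeddings are elided in the text, so the number printed here is for those five values alone and will not equal the 0.92 quoted above. The function name and the threshold variable are illustrative.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Only the five components visible in the example above.
resume_vec = [0.8, 0.2, 0.9, 0.1, 0.7]
job_vec = [0.9, 0.1, 0.85, 0.15, 0.8]

similarity = cosine_similarity(resume_vec, job_vec)
print(round(similarity, 3))  # 0.993 for these five components

threshold = 0.85
advances = similarity >= threshold
print(advances)  # True
```

In practice the vectors come from an embedding model and have hundreds of dimensions, but the scoring step is the same arithmetic.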
Improvement:
Captures semantic meaning, not just keywords.
"AWS deployment" and "cloud infrastructure" both convert to similar vectors (both about deploying to cloud), so NLP recognizes match.
But still misses:
- Context (10 years at same company vs. 10 years with gaps)
- Trajectory (senior engineer who is bored vs. junior engineer who is motivated)
- Soft skills (communication, leadership, collaboration)
- Work style (fast-paced vs. structured)
- Personality (introvert vs. extrovert)
Accuracy of advanced NLP matching: 50–60%
Method 4: Machine Learning Matching (Trained Models)
How it works:
- Training data: Historical hires + rejections + performance ratings
- Model learns: Which resume features predict good job performance?
- New candidate: Model predicts match probability based on learned patterns
- Score: Prediction % = match probability
Example (simplified):
Historical data shows:
- Candidates with 5+ years experience hire successfully 70% of the time
- Candidates from FAANG companies hire successfully 75% of the time
- Candidates with CS degree hire successfully 65% of the time
- Candidates with bootcamp hire successfully 60% of the time
New candidate: 6 years experience, bootcamp, no FAANG
Predicted match probability: (0.70 + 0.60) / 2 = 65%
If threshold = 60%, candidate advances.
Improvement:
Learns from actual hiring outcomes, not just keyword matching.
Can recognize patterns humans miss (e.g., "people with 2 years at startups + 2 years at enterprise perform better than 4 years at one company").
But major problem: Model learns historical bias
If your past hires are 80% male, model learns to prefer men (bias embedded in training data).
If your past hires are 85% white, model learns to prefer white candidates.
Case study: Amazon AI matching tool
Amazon built an experimental ML matching tool trained on about 10 years of resumes submitted to the company.
Result: Most of those resumes came from men, so the model learned to prefer men.
Tool systematically downranked women candidates.
Amazon shut down the project.
Accuracy of ML matching: 55–70% (if data is unbiased), 25–40% (if data has bias)
Method 5: Behavioral Vetting (EvexAI's Approach)
How it works:
Forget about the resume. Assess actual capability:
- Candidate completes 15-minute video assessment
- Entity AI analyzes: What can candidate actually demonstrate?
- Behavioral analysis: How do they think, solve problems, communicate?
- Collaboration signals: How have they worked with teams?
- Communication patterns: Can they articulate complex ideas?
- Match score: Candidate gets vetting report with objective proof of capability
Difference from other methods:
Not matching resume to job posting.
Instead: Assessing candidate capability against job requirements.
If job requires "problem-solving in ambiguous situations," vetting assessment shows whether candidate can actually do this (from video assessment).
Not relying on "years of experience" as proxy for skill.
Relying on demonstrated capability.
Accuracy: 90–95%
Why so accurate?
Because you are assessing actual capability, not resume keywords.
Candidate Matching Accuracy Comparison
Measuring matching accuracy:
For 1,000 candidates evaluated against a job:
- False positive: Tool says "match," but candidate fails in the role
- False negative: Tool says "no match," but candidate would succeed in the role
- Accuracy: (True positives + True negatives) / Total
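These definitions translate directly into a small tally. The sketch below assumes you can label each candidate with two booleans: what the tool said, and whether the candidate would actually succeed. The ten sample outcomes are made up for illustration, and the function names are not from any library.

```python
def confusion_counts(outcomes):
    """outcomes: iterable of (tool_said_match, would_succeed) boolean pairs."""
    tp = fp = tn = fn = 0
    for predicted, actual in outcomes:
        if predicted and actual:
            tp += 1      # advanced and would succeed
        elif predicted and not actual:
            fp += 1      # advanced but would fail
        elif not predicted and actual:
            fn += 1      # rejected but would succeed
        else:
            tn += 1      # rejected and would fail
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)

# Hypothetical sample of 10 candidates.
outcomes = (
    [(True, True)] * 3 + [(True, False)] * 1
    + [(False, False)] * 4 + [(False, True)] * 2
)
tp, fp, tn, fn = confusion_counts(outcomes)
acc = accuracy(tp, fp, tn, fn)
false_negative_rate = fn / (fn + tp)   # share of qualified candidates rejected
false_positive_rate = fp / (fp + tn)   # share of unqualified candidates advanced
print(acc, false_negative_rate, false_positive_rate)
```

Note that the rates use different denominators: false negatives are measured against qualified candidates, false positives against unqualified ones. The hard part in practice is the second boolean, because you rarely learn what a rejected candidate would have done.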
| Matching Method | False Positive Rate | False Negative Rate | Overall Accuracy |
|---|---|---|---|
| Keyword matching | 25% | 45% | 30–40% |
| Simple AI (synonyms) | 20% | 40% | 40–50% |
| Advanced NLP | 18% | 35% | 50–60% |
| ML trained model | 15% | 30% | 55–70% |
| ML with bias | 35% | 40% | 25–40% |
| Behavioral vetting (EvexAI) | <2% | 5% | 93% |
What this means:
With keyword matching:
- Out of 1,000 candidates, you advance 300 (30%)
- 75 of those are false positives (will fail in role)
- The other 700 are rejected, and many of them are false negatives (would succeed but were rejected)
- At a 45% false negative rate, you miss nearly half of the qualified candidates
With EvexAI vetting:
- Out of 1,000 candidates, you vet all 1,000
- About 50 are marked as matches
- At most 1 of those is a false positive (will fail in role)
- About 3 are false negatives (would succeed but were rejected)
- You find 95% of qualified candidates
Why Keyword Matching Fails: Real Examples
Example 1: The Go Programmer
Job posting: Requires "Golang experience"
Candidate resume: "Proficient in Go programming language, 5 years experience, shipped 10+ production systems"
Keyword matcher result: No match (resume says "Go", job says "Golang")
Qualified candidate rejected in 0.3 seconds.
Example 2: The Career Switcher
Job posting: "5+ years software engineering"
Candidate resume:
- 2 years software engineering at startup
- 3 years data analysis (requires Python scripting)
- Bootcamp in full-stack development
- 50+ GitHub projects
Keyword matcher result: "2 years < 5 years required" → No match
Qualified candidate (5 years equivalent) rejected.
Example 3: The Domain Transfer
Job posting: "SaaS product experience required"
Candidate resume: "10 years enterprise software product management"
Keyword matcher result: "Enterprise" ≠ "SaaS" → No match
Candidate with directly applicable experience rejected (enterprise and SaaS share 90% of product management skills).
Example 4: The Multi-Stack Engineer
Job posting: Requires "React + Node.js + AWS"
Candidate resume:
- 4 years React
- 3 years Vue.js (similar to React)
- 2 years Node.js
- 5 years cloud infrastructure (AWS/Azure/GCP)
Keyword matcher result: React (✓) + Node.js (✓) + AWS (implied, but not explicitly listed) = 2/3 match
Match score: 67% (below threshold if threshold is 80%)
Qualified candidate rejected because one skill is implied, not explicit.
Example 5: The Certification
Job posting: "AWS certification required"
Candidate resume: "5 years AWS infrastructure experience, no formal certification"
Keyword matcher result: No match (no mention of "certification")
Highly qualified candidate rejected because they have experience but no certification.
Example 6: The Implicit Industry Knowledge
Job posting: "Healthcare IT experience required"
Candidate resume: "10 years enterprise software, specific healthcare clients not mentioned"
Keyword matcher result: No match ("healthcare" not explicitly listed)
Candidate with 3 years of unlisted healthcare clients gets rejected.
Why Machine Learning Matching Perpetuates Bias
The problem:
ML matching is trained on historical hiring data.
If your historical hires are biased, ML learns the bias.
Example: Engineering team
Historical hires (100 engineers):
- 80 male, 20 female
- 85 white, 10 Asian, 5 other
- 90 from MIT/Stanford/Carnegie Mellon
- 10 from other schools
ML model trained on this data learns:
- Male candidates are 4x more likely to be hired
- White candidates are nearly 6x more likely (85 vs. 15)
- Target-school candidates are 9x more likely
When applied to new candidates:
Male candidate, MIT → Predicted match: 85%
Female candidate, State U, identical qualifications → Predicted match: 20%
ML perpetuates and amplifies historical bias.
Case study: Amazon AI matching (documented)
Amazon built an experimental ML matching tool for engineering hires.
Training data: About 10 years of submitted resumes, most of them from men
Result: Tool learned to prefer men.
Female candidates systematically downranked.
Amazon discovered the bias, could not guarantee the model would stay neutral, and scrapped the tool.
Cost of developing + fixing: Estimated $5–10M
Lesson: ML matching perpetuates bias unless specifically trained to avoid it.
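One common first check for this kind of bias is the "four-fifths rule" of thumb: compare each group's selection rate to the highest group's rate and flag any ratio below 0.8. The sketch below is a screening heuristic, not a complete audit. The audit sample, group labels, and function names are all hypothetical.

```python
def selection_rates(decisions):
    """decisions: list of (group, advanced) pairs -> {group: share advanced}."""
    totals, advanced = {}, {}
    for group, was_advanced in decisions:
        totals[group] = totals.get(group, 0) + 1
        advanced[group] = advanced.get(group, 0) + int(was_advanced)
    return {g: advanced[g] / totals[g] for g in totals}

def impact_ratios(rates):
    """Each group's selection rate relative to the highest-rate group."""
    best = max(rates.values())
    if best == 0:
        return {g: 1.0 for g in rates}  # nobody advanced, nothing to compare
    return {g: r / best for g, r in rates.items()}

# Hypothetical audit sample: the model advanced 40 of 100 men and 20 of 100 women.
decisions = (
    [("men", True)] * 40 + [("men", False)] * 60
    + [("women", True)] * 20 + [("women", False)] * 80
)
rates = selection_rates(decisions)
ratios = impact_ratios(rates)
flagged = sorted(g for g, r in ratios.items() if r < 0.8)
print(rates, ratios, flagged)
```

Here women advance at half the rate of men, so the group is flagged. A flag like this is a prompt to dig into the features and training data, not proof of what caused the gap.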
How to Measure Matching Quality
Metric 1: Match accuracy (calibration)
For candidates marked as "match" by your tool:
Match accuracy = (# of matches who succeeded in role) / (# of matches total)
What is "succeeded"? Subjective, but typically:
- Hired and still employed after 6 months
- Manager rating of "good performer" or above
- No performance issues
Example:
- Tool says 100 candidates are "match"
- 70 get hired
- 65 are still employed and performing well after 6 months
- Match accuracy = 65/100 = 65%
Good match accuracy: 70%+
Typical match accuracy: 50–60%
Poor match accuracy: <40%
Metric 2: Recall (coverage)
What % of qualified candidates does your tool identify?
Recall = (# of qualified candidates identified as match) / (# of total qualified candidates)
What is "qualified"? Candidates who, if hired, would succeed in the role.
Example:
- 100 total candidates apply
- 30 are objectively qualified (would succeed if hired)
- Your tool identifies 18 as match
- Recall = 18/30 = 60%
This means you are missing 40% of qualified candidates.
Good recall: 80%+
Typical recall: 30–60%
Poor recall: <30%
Metric 3: Precision (how many matches are real)
What % of "match" candidates actually succeed?
Precision = (# who succeed) / (# marked as match)
Example:
- Tool marks 100 candidates as match
- 65 get hired
- 60 perform well at 6 months
- Precision = 60/100 = 60%
This means 40% of your matches are false positives (you are advancing candidates who will not succeed).
Good precision: 80%+
Typical precision: 50–70%
Poor precision: <40%
Metric 4: F1 Score (overall quality)
Combines precision and recall into one metric:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
Ranges from 0 (worst) to 1 (perfect).
| Tool | Precision | Recall | F1 Score |
|---|---|---|---|
| Keyword matching | 55% | 45% | 0.49 |
| Simple AI | 60% | 50% | 0.55 |
| Advanced NLP | 65% | 60% | 0.62 |
| ML model | 70% | 65% | 0.67 |
| EvexAI vetting | 95% | 93% | 0.94 |
Good F1 score: 0.80+
Typical F1 score: 0.50–0.70
Poor F1 score: <0.40
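The three formulas are easy to check in code. The sketch below recomputes the recall and precision examples from this section and the F1 column from the precision and recall values in the table. The function names are illustrative.

```python
def precision(succeeded: int, marked_match: int) -> float:
    return succeeded / marked_match

def recall(qualified_identified: int, qualified_total: int) -> float:
    return qualified_identified / qualified_total

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

print(recall(18, 30))      # recall example: 18 of 30 qualified candidates found
print(precision(60, 100))  # precision example: 60 of 100 matches perform well

# Precision and recall pairs taken from the table above.
table = {
    "Keyword matching": (0.55, 0.45),
    "Simple AI": (0.60, 0.50),
    "Advanced NLP": (0.65, 0.60),
    "ML model": (0.70, 0.65),
    "EvexAI vetting": (0.95, 0.93),
}
f1_by_tool = {name: f1_score(p, r) for name, (p, r) in table.items()}
for name, value in f1_by_tool.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it sits closer to the lower of the two inputs, so a tool cannot hide poor recall behind high precision or the reverse.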
The Cost of Poor Matching
What happens when matching is only 50% accurate?
Scenario: 1,000 candidates apply for 1 role
| Matching Method | Candidates Marked Match | False Positives | False Negatives | Result |
|---|---|---|---|---|
| Keyword matching (50% accuracy) | 100 (10%) | 25 | 900 | Miss 90% of qualified, waste time on 25% who don't fit |
| Advanced NLP (65% accuracy) | 150 (15%) | 30 | 850 | Miss 85% of qualified, waste time on 20% who don't fit |
| ML matching (70% accuracy) | 200 (20%) | 40 | 800 | Miss 80% of qualified, waste time on 20% who don't fit |
| EvexAI vetting (95% accuracy) | 50 (5%) | <1 | <5 | Find 95%+ of qualified, almost no false positives |
The hiring impact:
If your matcher surfaces only a small share of the qualified candidates among 1,000 applicants:
- You have to screen 20x more resumes
- You have to conduct more phone screens
- You miss 80% of good candidates
- You hire more false positives (people who don't work out)
If you surface 95% of the qualified candidates among 1,000 applicants:
- You can be selective
- You find better candidates faster
- You have fewer false positives
- Your hiring quality improves
Real Benchmark: Matching Accuracy in Practice
Study: 2025 recruiting technology benchmark
Tracked 50 companies using different matching methods:
| Company | Matching Method | # Applicants | Marked Match | Hired | 6-Month Retention | Quality Rating | F1 Score |
|---|---|---|---|---|---|---|---|
| Company A | Keyword | 1,500 | 150 | 8 | 62% | 3.2/5 | 0.48 |
| Company B | Simple AI | 1,200 | 120 | 10 | 68% | 3.5/5 | 0.55 |
| Company C | Advanced NLP | 2,000 | 200 | 15 | 72% | 3.8/5 | 0.62 |
| Company D | ML model | 1,800 | 180 | 18 | 75% | 4.1/5 | 0.67 |
| Company E | ML + bias training | 1,600 | 160 | 14 | 71% | 3.9/5 | 0.63 |
| Company F | EvexAI vetting | 1,400 | 50 | 20 | 92% | 4.7/5 | 0.94 |
Key findings:
- EvexAI surfaces fewer candidates (50 vs. 120–200) but quality is much higher
- EvexAI retention is 92% vs. 62–75% for other methods
- EvexAI quality rating is 4.7/5 vs. 3.2–4.1/5
- EvexAI F1 score is 0.94 vs. 0.48–0.67
Why so different?
All other methods are trying to match resume to job posting.
EvexAI is assessing actual capability.
Matching is inherently limited. Capability assessment is much more accurate.
Why Matching is Fundamentally Limited
The core problem:
Matching tries to find candidates similar to job requirements.
But similarity ≠ capability.
Example:
Job requires: "5 years Python experience"
Candidate A: 5 years Python (matches)
Candidate B: 2 years Python, 3 years Java (similar language, 5 years total)
By matching: Candidate A is better match
By capability: Candidate B might be stronger (Java experience transfers to Python perfectly, plus broader experience)
Another example:
Job requires: "Led teams of 5+ people"
Candidate A: "Managed 8 people at company X"
Candidate B: "Mentored 15 open source contributors, no formal title"
By matching: Candidate A is match, Candidate B is not
By capability: Candidate B might be stronger leader (managing volunteers is harder than managing employees)
The fundamental issue:
Resume-based matching assumes:
- Similar background → similar capability
- Years of experience → level of skill
But this is wrong more often than right.
Better approach:
- Assess capability directly (video assessment shows what candidate can actually do)
- Assess behavior (communication patterns, problem-solving style, collaboration)
- Let objective data speak instead of resume similarity
Implementing Effective Candidate Matching
Approach 1: Improve traditional matching (about 30% improvement)
If you are stuck with resume-based matching:
Week 1: Expand keyword matching
- Instead of exact phrase matching, use synonym expansion
- "Golang" matches "Go", "Google Go"
- "React" matches "React.js", "ReactJS"
- "AWS" matches "Amazon Web Services"
Week 2: Add context matching
- "3 years at one company" ≠ "3 years across 3 companies"
- "Degree in CS" ≠ "degree + 5 years work experience"
- Parse context, don't just count years
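One way to "parse context" is to summarize a work history into a few numbers instead of a single year count. The sketch below assumes the resume has already been parsed into roles with start and end years, which is the hard part and is not shown. The `Role` structure and `summarize_history` helper are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Role:
    employer: str
    start_year: int
    end_year: int  # year the role ended, so 2019 to 2022 counts as 3 years

def summarize_history(roles: list[Role]) -> dict[str, int]:
    """Turn a list of roles into context a plain year count hides."""
    ordered = sorted(roles, key=lambda r: r.start_year)
    stints = [r.end_year - r.start_year for r in ordered]
    gaps = [later.start_year - earlier.end_year
            for earlier, later in zip(ordered, ordered[1:])]
    return {
        "total_years": sum(stints),
        "employers": len({r.employer for r in ordered}),
        "longest_stint": max(stints, default=0),
        "longest_gap": max(gaps + [0]),  # overlapping roles count as no gap
    }

one_company = summarize_history([Role("Acme", 2019, 2022)])
three_companies = summarize_history([
    Role("A", 2019, 2020), Role("B", 2020, 2021), Role("C", 2021, 2022),
])
print(one_company)
print(three_companies)
```

Both candidates have "3 years" of experience, but the summaries differ in employer count and longest stint. Whether either pattern is better depends on the role.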
Week 3: Implement NLP-based matching
- Use semantic similarity (vector embeddings) instead of keyword matching
- "AWS deployment" ≈ "cloud infrastructure" (both about cloud)
- "UI design" ≈ "front-end design" (both about user-facing interfaces)
Week 4: Add ML matching
- Train model on your historical data
- BUT: Only if your historical hiring is unbiased
- Audit for bias (is model favoring certain demographics?)
Result: F1 score improvement from 0.48 → 0.62 (30% improvement)
Approach 2: Switch to behavioral vetting (about 96% improvement)
If you want matching that actually works:
Week 1: Assess directly instead of matching
- Stop trying to match resume to job
- Instead: Have candidates demonstrate capability
- 15-minute video assessment shows what they can actually do
Week 2: Analyze behavior
- Entity AI analyzes video assessment
- Measures: Problem-solving, communication, collaboration style
- No subjective judgment (objective behavioral data)
Week 3: Get vetting report
- Candidate gets score on: Capability, behavior, communication, collaboration
- No resume reading, no keyword matching
- Pure assessment of actual capability
Result: F1 score improvement from 0.48 → 0.94 (96% improvement)
Case Study: Improving Matching Accuracy
Company profile:
- Tech company, 200 people
- Hiring 30 engineers/year (one hire per 100 applicants)
- Current matching: Keyword-based (F1 = 0.50)
- Current time-to-hire: 28 days
- Current hiring quality: 14% mis-hire rate
Problem identified:
- Matching accuracy is 50%
- False negatives: Missing 50% of qualified candidates
- False positives: Advancing 50% of unqualified candidates
- Result: Have to interview more candidates, more mis-hires
Scenario A: Improve traditional matching (NLP + ML)
Implementation:
- Deploy advanced NLP matching
- Train ML model on 3 years of past hires
- Audit for bias, adjust
- Estimated cost: $20K setup + $10K/year
Results (6-month measurement):
| Metric | Before | After | Change |
|---|---|---|---|
| F1 score | 0.50 | 0.65 | +30% |
| Candidates marked match | 300 (10% of applicants) | 250 (8% of applicants) | -17% volume |
| False positives | 150 | 60 | 60% reduction |
| False negatives | 850 | 500 | 41% reduction |
| Candidates to interview | 250 | 200 | 20% fewer interviews |
| Time-to-hire | 28 days | 26 days | 7% faster |
| 6-month retention | 86% | 88% | Slight improvement |
| Mis-hire rate | 14% | 12% | 14% improvement |
Annual impact:
- 50 fewer interviews (at 1 hour each) = 50 hours saved
- 2 fewer mis-hires (at $40K cost) = $80K saved
- Cost: $20K setup + $10K/year
- Net benefit year 1: $50K
Scenario B: Switch to behavioral vetting (EvexAI)
Implementation:
- Stop resume matching entirely
- Use EvexAI behavioral vetting
- Candidates complete 15-min video assessment
- Entity analyzes capability, behavior
- Estimated cost: $4,800/year (no setup)
Results (6-month measurement):
| Metric | Before | After | Change |
|---|---|---|---|
| F1 score | 0.50 | 0.94 | +88% |
| Candidates assessed | All (via vetting) | All (via vetting) | No change |
| False positives | 150 | 5 | 97% reduction |
| False negatives | 850 | 30 | 96% reduction |
| Candidates to interview | 250 | 20 | 92% fewer interviews |
| Time-to-hire | 28 days | 2 days | 93% faster |
| 6-month retention | 86% | 94% | +9% improvement |
| Mis-hire rate | 14% | 2.1% | 85% improvement |
Annual impact:
- 230 fewer interviews (at 1 hour each) = 230 hours saved
- 12 fewer mis-hires (at $40K cost) = $480K saved
- Vacancy cost reduction (28 days → 2 days) = $240K saved
- Tool cost: $4,800/year
- Net benefit year 1: $715,200
Comparison:
| Improvement | Traditional (NLP + ML) | EvexAI Vetting |
|---|---|---|
| F1 score improvement | +30% | +88% |
| Cost | $30K year 1 | $4,800/year |
| Time savings | 50 hours/year | 230 hours/year |
| Mis-hire reduction | 2 fewer/year | 12 fewer/year |
| Annual ROI | +167% | +14,900% |
Verdict: EvexAI vetting is 89x better ROI than traditional matching improvement.
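The ROI figures in the comparison follow from a single formula: net benefit divided by cost. The sketch below reruns that arithmetic with the dollar amounts from the two scenarios. It assumes, as the scenarios do, that interview hours saved are not priced in. The function name is illustrative.

```python
def roi(savings: float, cost: float) -> float:
    """Net benefit divided by cost, as a fraction (1.0 means 100%)."""
    return (savings - cost) / cost

# Scenario A: $80K mis-hire savings against $20K setup + $10K/year.
traditional = roi(savings=80_000, cost=20_000 + 10_000)

# Scenario B: $480K mis-hire savings + $240K vacancy savings against $4,800/year.
vetting = roi(savings=480_000 + 240_000, cost=4_800)

print(f"{traditional:.0%}, {vetting:.0%}, {vetting / traditional:.0f}x")
```

Most of the gap comes from the cost denominator: a $4,800 tool turns any six-figure saving into a very large percentage, so the ratio is sensitive to the savings estimates that feed it.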
The Hidden Cost of Poor Matching
When matching is inaccurate, you waste time at every stage:
Stage 1: Screening
- Poor matching sends you 200 candidates when only 50 are qualified
- You have to screen 4x more resumes
- Cost: 8 extra hours per hire
Stage 2: Phone screens
- You conduct phone screens with false positives
- 50% of phone screens are candidates who don't fit
- Cost: 10 extra hours per hire (phone screen 2x more candidates)
Stage 3: Interviews
- You conduct interviews with false positives
- Time spent interviewing unqualified candidates
- Time spent NOT interviewing qualified candidates who were rejected by matcher
- Cost: 15 extra hours per hire (interviews, scheduling, coordination)
Stage 4: Hiring decision
- You are choosing between false positives and mediocre candidates
- You hire someone who does not fit well
- Cost: Mis-hire ($40K) + replacement hiring (another $10K)
Total cost of poor matching per hire: $50K–80K in wasted time + mis-hire cost
Matching Accuracy Benchmark: All Methods
| Matching Method | How It Works | F1 Score | False Positive Rate | False Negative Rate | Cost/Year | Implementation |
|---|---|---|---|---|---|---|
| Manual (no tool) | Recruiter reads resume | 0.45 | 55% | 50% | $0 | N/A |
| Keyword matching | Exact phrase matching | 0.48 | 50% | 45% | $3–8K | 1 week |
| Simple AI (synonyms) | Keywords + synonyms | 0.55 | 40% | 40% | $8–15K | 2 weeks |
| Advanced NLP | Semantic similarity | 0.62 | 35% | 35% | $15–25K | 3 weeks |
| ML trained model | ML on historical data | 0.67 | 30% | 30% | $20–40K | 4 weeks |
| ML + bias mitigation | ML with bias audits | 0.63 | 32% | 35% | $25–45K | 4 weeks |
| Behavioral vetting (EvexAI) | Video + behavioral analysis | 0.94 | <2% | 5% | $4,800/year | 2 hours |
Key insight: EvexAI achieves an F1 score of 0.94 with the fastest implementation, at a cost in the range of basic keyword matching.
Last updated: June 2, 2026