AI-Driven Recommendation System: Vector Embeddings for Intelligent Healthcare Matching

About the Project

  • Status: Beta testing with 200+ doctors and 1,000+ active users
  • Duration: 4 weeks from conception to beta release
  • Team: 2 engineers (1 Backend+ML Engineer, 1 Frontend Engineer)
  • Technologies: OpenAI Embedding API (text-embedding-3-large), Weaviate Vector Database, AWS ECS, AWS Lambda, API Gateway, Python, FastAPI
  • Next Milestone: Production release with 10,000+ doctor profiles (Q1 2026)

TechCare.Inc developed an AI-powered doctor recommendation system that intelligently matches patients with healthcare providers based on 50+ criteria. By implementing vector embedding technology with Weaviate and OpenAI, we created a fast, reliable, and semantically aware matching system that understands the nuanced relationship between patient needs and doctor specializations.

Performance Highlights

  • Matching Response: <2 seconds
  • Relevance Accuracy: 82%
  • Doctors Indexed: 200+
  • Active Users: 1,000+

The Challenges

Doctor Pool Management: Maintain a comprehensive database with dynamic updates and efficient querying across large datasets.

Multi-Dimensional Criteria (50+ attributes):

  • Professional: Expertise, specializations, certifications, experience
  • Logistical: Location, time slots, appointment types, telehealth
  • Demographic: Age groups, patient types, gender preferences
  • Communication: Languages, communication styles
  • Legal & Compliance: Insurance, licensing, accessibility
  • Cultural: Ethnicity, cultural competencies, religious considerations
  • Operational: Fees, wait times, emergency availability

User-Centric Onboarding: Intelligent questionnaire capturing patient preferences while balancing comprehensiveness with user experience and privacy.

Matching Performance: Sub-second response times, high accuracy, explainable recommendations, scalable to thousands of concurrent users.

AI-Driven Intelligence: Semantic understanding beyond keyword matching, context-aware recommendations, learning from interactions, handling ambiguous inputs.

Why Vector Embeddings?

After evaluating rule-based algorithmic matching versus vector embeddings, we chose the vector approach for its superior semantic understanding. The system recognizes that "pediatrician specializing in autism" and "child development specialist with ASD experience" represent highly similar concepts, even without exact keyword matches.

Key Advantages:

  • Semantic understanding of relationships between criteria
  • Natural language processing capabilities
  • Automatic adaptation to new patterns
  • Scalable similarity search
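
To make the idea of "highly similar concepts" concrete, here is a minimal sketch of cosine similarity, the measure used later in the semantic search step. The three 4-dimensional vectors are made up for illustration; real text-embedding-3-large vectors have 3072 dimensions and would come from the embedding API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors standing in for real embeddings (assumption, not real data).
autism_pediatrician = [0.9, 0.8, 0.1, 0.0]
asd_child_specialist = [0.8, 0.9, 0.2, 0.1]
sports_orthopedist = [0.1, 0.0, 0.9, 0.8]

close = cosine_similarity(autism_pediatrician, asd_child_specialist)
far = cosine_similarity(autism_pediatrician, sports_orthopedist)
```

With these toy values, `close` is much higher than `far`, which is the behavior the matching system relies on: related descriptions land near each other even when they share no keywords.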

Solution Architecture

Phase 1: Doctor Profile Vectorization Pipeline

Profile Document Preparation

Each doctor profile includes professional identity, specializations, operational details, patient preferences, communication capabilities, and administrative information.

Batch Processing Infrastructure (AWS ECS)

  • Document Parsing: Normalize text, extract key-value pairs, expand medical terminology
  • Embedding Generation: OpenAI's text-embedding-3-large API (3072-dimensional vectors)
  • Vector Storage: Weaviate database with optimized indexes and metadata
  • Metadata Management: Composite indexes, geospatial indexing, temporal availability indexing
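
The document parsing step might look roughly like the sketch below. It is not the production parser: the field names, the abbreviation table, and the sample profile ("Dr. Jane Doe") are all hypothetical, chosen only to show normalization, key-value flattening, and terminology expansion before a profile is sent for embedding.

```python
import re

# Hypothetical abbreviation table; a real pipeline would use a curated medical vocabulary.
ABBREVIATIONS = {
    "ASD": "autism spectrum disorder (ASD)",
    "ENT": "ear, nose, and throat (ENT)",
}
_ABBREVIATION_RE = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in ABBREVIATIONS) + r")\b"
)


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def expand_terms(text: str) -> str:
    """Replace known abbreviations (whole words only) in a single pass."""
    return _ABBREVIATION_RE.sub(lambda match: ABBREVIATIONS[match.group(1)], text)


def profile_to_document(profile: dict) -> str:
    """Flatten a profile into 'Label: value.' sentences, skipping empty fields."""
    parts = []
    for key, value in profile.items():
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(item) for item in value)
        text = expand_terms(normalize(str(value))) if value is not None else ""
        if not text:
            continue  # empty fields would only add noise to the embedding
        label = key.replace("_", " ").capitalize()
        parts.append(f"{label}: {text}.")
    return " ".join(parts)


# Made-up sample profile for illustration.
example_profile = {
    "name": "Dr. Jane Doe",
    "specializations": ["Pediatrics", "ASD"],
    "languages": ["English", "Spanish"],
    "appointment_types": [],
    "notes": "  Accepts   new patients ",
}
document = profile_to_document(example_profile)
```

The resulting `document` string is what would be passed to the embedding model; keeping the labels in the text gives the model context for each value.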

Phase 2: User Query Processing Pipeline

User-Facing Lambda Service

Serverless architecture handling patient queries:

  • Input Processing: Receive the questionnaire via API Gateway, then validate and structure the responses
  • Query Vectorization: Convert the answers into a natural-language query
      • Example: {age: 35, concern: "back pain", language: "Spanish", location: "Miami"}
      • Generated: "Looking for a doctor who treats back pain in adults, speaks Spanish fluently, and practices in the Miami area. Prefer orthopedic or physical medicine expertise."
      • Embed using the OpenAI API (same model as the profiles)
  • Semantic Search: Query Weaviate with the patient vector and retrieve the top-K nearest neighbors (K = 20-50) by cosine similarity
  • Filtering & Ranking: Apply hard filters (location, insurance, availability), then re-rank by semantic similarity, exact matches, availability, reviews, and distance
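
The query vectorization step can be sketched as a small template function. This is an assumption about how the sentence could be assembled, not the production logic: the age cut-offs and the `SPECIALTY_HINTS` lookup are hypothetical, and the questionnaire keys simply mirror the example above.

```python
# Hypothetical lookup from a stated concern to likely specialties.
SPECIALTY_HINTS = {
    "back pain": ["orthopedic", "physical medicine"],
}


def age_group(age: int) -> str:
    """Map an age to a coarse patient group (cut-offs are illustrative)."""
    if age < 18:
        return "children"
    if age < 65:
        return "adults"
    return "older adults"


def join_clauses(clauses: list[str]) -> str:
    """Join clauses as 'a', 'a and b', or 'a, b, and c'."""
    if len(clauses) <= 2:
        return " and ".join(clauses)
    return ", ".join(clauses[:-1]) + ", and " + clauses[-1]


def build_query(answers: dict) -> str:
    """Turn structured questionnaire answers into a natural-language query."""
    for required in ("age", "concern"):
        if required not in answers:
            raise ValueError(f"missing required answer: {required}")
    clauses = [f"treats {answers['concern']} in {age_group(answers['age'])}"]
    if answers.get("language"):
        clauses.append(f"speaks {answers['language']} fluently")
    if answers.get("location"):
        clauses.append(f"practices in the {answers['location']} area")
    query = f"Looking for a doctor who {join_clauses(clauses)}."
    hints = SPECIALTY_HINTS.get(answers["concern"].lower())
    if hints:
        query += f" Prefer {' or '.join(hints)} expertise."
    return query


query = build_query(
    {"age": 35, "concern": "back pain", "language": "Spanish", "location": "Miami"}
)
```

The returned string is then embedded with the same model used for the doctor profiles, so that query and profile vectors live in the same space.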

[Figure: System Architecture]

Implementation Details

Technology Stack

  • Backend: AWS ECS, AWS Lambda, API Gateway, Python 3.11+, FastAPI
  • Vector: Weaviate with HNSW algorithm, hybrid search (vector + keyword)
  • AI: OpenAI Embedding API (text-embedding-3-large), OpenAI Python SDK
  • Data: NumPy, Pandas
  • Monitoring: CloudWatch, custom metrics, A/B testing framework

Performance Optimization

  • Batch processing with OpenAI's batch API (50% cost reduction)
  • Pre-computed embeddings with caching
  • Approximate nearest neighbor search (ANN)
  • Progressive filtering and result caching
  • Horizontal Lambda scaling, Weaviate clustering
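
As an illustration of the batch processing item, the sketch below builds the JSONL request lines that OpenAI's Batch API expects for the embeddings endpoint (one JSON object per line with `custom_id`, `method`, `url`, and `body`). The doctor IDs and profile texts are made up, and the upload and batch-creation calls are deliberately left out; check the current Batch API documentation before relying on the details.

```python
import json


def build_batch_lines(
    documents: dict[str, str], model: str = "text-embedding-3-large"
) -> list[str]:
    """Return one JSONL request line per profile document.

    `documents` maps a doctor ID (used as custom_id, so results can be matched
    back to profiles) to the prepared profile text.
    """
    lines = []
    for doctor_id, text in documents.items():
        request = {
            "custom_id": doctor_id,
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": text},
        }
        lines.append(json.dumps(request))
    return lines


# Made-up sample input for illustration.
batch_lines = build_batch_lines(
    {
        "doctor-001": "Specializations: Pediatrics. Languages: English, Spanish.",
        "doctor-002": "Specializations: Orthopedics. Languages: English.",
    }
)
jsonl_payload = "\n".join(batch_lines)
```

Writing `jsonl_payload` to a file and submitting it as a batch trades latency for cost, which suits profile vectorization because it does not sit on the patient-facing request path.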

Current Status & Results

System Performance (Beta)

  • Average response time: 2,000 ms (target: <500 ms)
  • Vectorization: 50 profiles/minute
  • Database: 200+ doctor profiles
  • Capacity: 500+ concurrent users
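
A quick back-of-the-envelope calculation shows what the vectorization rate implies for a full re-index. It assumes a single worker running at a constant 50 profiles/minute and uses the lower bounds of the "200+" and "10,000+" figures; parallel workers or the batch API would change the numbers.

```python
def backfill_minutes(profile_count: int, profiles_per_minute: float) -> float:
    """Minutes needed to vectorize every profile at a constant rate."""
    if profiles_per_minute <= 0:
        raise ValueError("profiles_per_minute must be positive")
    return profile_count / profiles_per_minute


beta_minutes = backfill_minutes(200, 50)            # current beta pool
production_minutes = backfill_minutes(10_000, 50)   # planned production pool
production_hours = production_minutes / 60
```

Under these assumptions the beta pool re-indexes in about 4 minutes, while the production target would take about 200 minutes (roughly 3 hours 20 minutes).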

Matching Accuracy

  • User satisfaction (top 3): 78%
  • Relevance agreement: 82%
  • Cross-validation agreement: 73%
  • Successful bookings: 64%

Key Learnings

  • Semantic Understanding: Handles synonyms, misspellings, and related concepts (e.g., "joint pain" → "rheumatology")
  • Hybrid Value: Cross-validation catches edge cases, provides explainability
  • User Preferences: Users prefer 5-7 recommendations; explanations increase bookings; location remains top priority
  • Technical Issues: 15-20 min re-vectorization latency, Lambda cold starts, balancing semantic similarity with constraints

Benefits Achieved

For Patients: Time reduction from hours to seconds, discovery of relevant specialists, personalized recommendations, improved confidence

For Healthcare Providers: Better matching accuracy, improved utilization, reduced no-shows, demand analytics

For TechCare: 60% reduction in support queries, scalable growth, demand insights, foundation for AI features

Challenges & Solutions

1. API Latency & Cost: Pre-compute embeddings, use batch API (50% savings), implement CDN distribution

2. Cold Start: Hybrid scoring weights structured criteria higher for new doctors until interaction data exists

3. Criteria Imbalance: Two-stage filtering, with hard constraints first and semantic similarity within the qualified subset

4. Compliance: HIPAA-compliant architecture, disclaimers, human oversight for flagged cases
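
The cold-start and criteria-imbalance solutions can be combined in one small sketch: hard constraints are applied first, then the survivors are ranked by a hybrid score whose semantic weight grows as a doctor accumulates interaction data. The field names, the 0.3 to 0.7 weight range, and the `warm_after` threshold are hypothetical values picked for illustration, not the production settings.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doctor_id: str
    similarity: float        # cosine similarity from the vector search, 0..1
    structured_score: float  # share of structured criteria matched exactly, 0..1
    accepts_insurance: bool
    in_service_area: bool
    interactions: int        # bookings and reviews observed so far


def passes_hard_constraints(candidate: Candidate) -> bool:
    """Stage 1: non-negotiable filters."""
    return candidate.accepts_insurance and candidate.in_service_area


def hybrid_score(candidate: Candidate, warm_after: int = 20) -> float:
    """Stage 2: blend semantic and structured scores.

    New doctors (few interactions) lean on structured criteria; the semantic
    weight rises linearly from 0.3 to 0.7 once `warm_after` interactions exist.
    """
    maturity = min(candidate.interactions / warm_after, 1.0)
    semantic_weight = 0.3 + 0.4 * maturity
    return (
        semantic_weight * candidate.similarity
        + (1 - semantic_weight) * candidate.structured_score
    )


def recommend(candidates: list[Candidate], top_n: int = 5) -> list[Candidate]:
    """Filter by hard constraints, then rank the qualified subset."""
    qualified = [c for c in candidates if passes_hard_constraints(c)]
    return sorted(qualified, key=hybrid_score, reverse=True)[:top_n]


# Made-up candidates for illustration.
candidates = [
    Candidate("a", 0.95, 0.40, accepts_insurance=False, in_service_area=True, interactions=100),
    Candidate("b", 0.80, 0.60, accepts_insurance=True, in_service_area=True, interactions=100),
    Candidate("c", 0.80, 0.90, accepts_insurance=True, in_service_area=True, interactions=0),
]
ranked_ids = [c.doctor_id for c in recommend(candidates)]
```

Here candidate "a" is dropped despite the highest similarity because it fails a hard constraint, and the new doctor "c" outranks "b" because structured matches carry more weight until interaction data exists.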

Lessons Learned

Hybrid Approach: Combines AI innovation with reliability

Explainability: Critical for healthcare context and user trust

Data Quality: Well-structured profiles produce better embeddings

API Optimization: Batch processing and caching control costs

Continuous Monitoring: Track both technical and business metrics

Scale Planning: Design for 10x growth from day one

Conclusion

TechCare's AI-driven doctor recommendation system represents a significant advancement in healthcare matching technology. By leveraging vector embeddings and semantic search, we've created a system that understands nuanced relationships between patient needs and provider capabilities far beyond traditional keyword matching.