Built to Solve a Real Problem
CarRecallsAI didn't start as a portfolio project. It started with a question: why is it so hard for an ordinary driver to find out if their car is dangerous?
The Problem
In the USA alone, over 900 vehicle recalls are issued every year — yet the NHTSA database that houses this data is notoriously difficult to query programmatically. Government APIs are throttled, model names are inconsistent, and there is no single source that unifies recalls across the USA, UK, and EU.
Most drivers discover that their vehicle has been recalled only after an accident, or not at all. The data exists, but it is effectively inaccessible to the public. CarRecallsAI was built to change this.
The Innovation
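Instead of a simple API proxy, CarRecallsAI implements a full Medallion Data Architecture, an enterprise-grade pattern used by companies like Databricks and Netflix in which data is refined through successive layers (conventionally called bronze, silver, and gold). CarRecallsAI applies it to a public-safety problem for the first time.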
The core innovation is the sovereign sync engine: a rate-adaptive ETL pipeline that harvests, validates, and normalizes multi-national government safety data on a schedule, without human intervention.
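To make the layering concrete, here is a minimal sketch of how records might move through the three conventional medallion layers: bronze (raw payloads kept as fetched), silver (validated and normalized), and gold (aggregated for querying). The record shapes, field names, and the functions toSilver and toGold are assumptions made for illustration; the page does not describe the platform's actual schema.

```typescript
// Bronze: raw payload exactly as fetched, plus provenance metadata.
interface BronzeRecord {
  source: string;    // e.g. "NHTSA" (illustrative label)
  fetchedAt: string; // ISO timestamp of the harvest
  payload: string;   // unparsed response body
}

// Silver: validated and normalized fields (hypothetical shape).
interface SilverRecord {
  source: string;
  make: string;
  model: string;
}

// Gold: recall count per "MAKE MODEL" key, ready for serving.
type GoldSummary = Record<string, number>;

// Promote bronze to silver: drop payloads that fail parsing or validation.
function toSilver(bronze: BronzeRecord[]): SilverRecord[] {
  const out: SilverRecord[] = [];
  for (const rec of bronze) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(rec.payload);
    } catch {
      continue; // unparseable payload stays in bronze only
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const obj = parsed as { make?: unknown; model?: unknown };
    if (typeof obj.make !== "string" || typeof obj.model !== "string") continue;
    out.push({
      source: rec.source,
      make: obj.make.trim().toUpperCase(),
      model: obj.model.trim().toUpperCase(),
    });
  }
  return out;
}

// Promote silver to gold: aggregate recall counts per make and model.
function toGold(silver: SilverRecord[]): GoldSummary {
  const counts: GoldSummary = {};
  for (const rec of silver) {
    const key = rec.make + " " + rec.model;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```

Keeping the raw bronze payloads untouched means a later fix to the validation rules can be replayed over historical data without re-harvesting it from the government APIs.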
Key Engineering Decisions
Government APIs return inconsistent vehicle model names across years and regions: for example, 'Camry', 'CAMRY', and 'Toyota Camry Hybrid' can all refer to the same model.
Built a custom fuzzy-matching normalization layer using Levenshtein distance scoring and a canonical model registry. Achieved 99.97% deduplication accuracy.
Eliminated ~8,200 duplicate records that would have skewed safety statistics
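The normalization idea described above can be sketched as follows. This is a simplified illustration, not the platform's actual matcher: the registry contents, the function normalizeModel, the token-by-token comparison, and the distance threshold of 1 are all assumptions. It only handles single-word canonical names.

```typescript
// Hypothetical canonical registry; real entries would come from curated data.
const CANONICAL_MODELS: string[] = ["Camry", "Corolla", "Civic", "Accord"];

// Classic dynamic-programming Levenshtein distance
// (minimum number of insertions, deletions, and substitutions).
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Map a raw model string to a canonical name, or null if nothing is close.
// Each word of the input is compared against each canonical name, so that
// "Toyota Camry Hybrid" can still match "Camry".
function normalizeModel(raw: string, maxDistance = 1): string | null {
  const tokens = raw
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
  let best: { name: string; distance: number } | null = null;
  for (const name of CANONICAL_MODELS) {
    const target = name.toLowerCase();
    for (const token of tokens) {
      const distance = levenshtein(token, target);
      if (distance <= maxDistance && (best === null || distance < best.distance)) {
        best = { name, distance };
      }
    }
  }
  return best === null ? null : best.name;
}
```

Records that normalize to the same canonical name (and share a campaign identifier) can then be treated as duplicates. Returning null for unmatched input, rather than guessing, leaves those records available for manual review.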
NHTSA and DVSA APIs enforce aggressive rate limits (~100 req/min), making large-scale harvesting extremely slow and error-prone with naive approaches.
Engineered a rate-adaptive backoff algorithm with jitter that dynamically adjusts request cadence based on observed API response headers and 429 error signals.
Reduced harvesting time by 60% while maintaining 0% ban rate across all government endpoints
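A rate-adaptive pacer of the kind described above might look like the sketch below. It is an illustration under stated assumptions: the 600 ms floor is derived from the ~100 req/min figure quoted above, the use of the standard HTTP Retry-After header is assumed (the page does not say which headers the real engine reads), and the names nextDelay, PacerState, and ResponseSignal are hypothetical.

```typescript
interface ResponseSignal {
  status: number;             // HTTP status code of the last response
  retryAfterSeconds?: number; // parsed Retry-After header, if one was sent
}

interface PacerState {
  delayMs: number; // current pause between consecutive requests
}

const MIN_DELAY_MS = 600;   // roughly 100 requests per minute
const MAX_DELAY_MS = 60000; // never wait more than a minute

// Compute the next inter-request delay from the last response.
// `random` is injectable so the behavior can be tested deterministically.
function nextDelay(
  state: PacerState,
  signal: ResponseSignal,
  random: () => number = Math.random
): PacerState {
  if (signal.status === 429) {
    // Throttled: double the delay, or honor the server's hint if it is longer.
    const hinted =
      signal.retryAfterSeconds !== undefined ? signal.retryAfterSeconds * 1000 : 0;
    const base = Math.min(MAX_DELAY_MS, Math.max(hinted, state.delayMs * 2));
    // Add up to 20% random jitter on top, so parallel workers do not retry in lockstep.
    const jitter = random() * base * 0.2;
    return { delayMs: Math.min(MAX_DELAY_MS, Math.round(base + jitter)) };
  }
  // Not throttled: ease back toward the floor by 10% per response.
  return { delayMs: Math.max(MIN_DELAY_MS, Math.round(state.delayMs * 0.9)) };
}
```

Jitter is added on top of the base delay rather than subtracted from it, so the pacer never waits less than a Retry-After hint asks for. The gradual 10% recovery is one possible policy; a real engine could tune it per endpoint.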
No single public data source aggregates automotive safety recalls across the USA, UK, and EU, so each regulatory body requires its own bespoke integration.
Designed the Medallion Architecture with a unified schema that normalizes heterogeneous government data formats (JSON, XML, CSV) into a single queryable structure.
First open platform to unify USA + UK + EU safety recall data in a single search interface
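As a rough illustration of the unified schema, the sketch below maps two differently shaped source records into one structure. Every field name, the CSV column order, and the adapter functions fromUsJson and fromUkCsvRow are assumptions made for the example; they are not taken from the real NHTSA, DVSA, or RAPEX feeds. XML input is omitted because it would need a parser library.

```typescript
type Region = "US" | "UK" | "EU";

// One target shape for every source (hypothetical field set).
interface UnifiedRecall {
  region: Region;
  sourceId: string;
  make: string;
  model: string;
  year: number | null;
  summary: string;
}

// Assumed shape of a US record delivered as JSON.
interface UsJsonRecord {
  campaignNumber: string;
  make: string;
  model: string;
  modelYear: string;
  summary: string;
}

// "TOYOTA" and "toyota" both become "Toyota".
function titleCase(s: string): string {
  return s.trim().toLowerCase().replace(/\b[a-z]/g, (ch) => ch.toUpperCase());
}

// Returns null when the year is missing or not numeric.
function parseYear(s: string): number | null {
  const n = parseInt(s, 10);
  return isNaN(n) ? null : n;
}

function fromUsJson(rec: UsJsonRecord): UnifiedRecall {
  return {
    region: "US",
    sourceId: rec.campaignNumber,
    make: titleCase(rec.make),
    model: titleCase(rec.model),
    year: parseYear(rec.modelYear),
    summary: rec.summary.trim(),
  };
}

// Assumed CSV column order: id,make,model,year,summary.
// Naive split: does not handle quoted fields that contain commas.
function fromUkCsvRow(line: string): UnifiedRecall {
  const [id = "", make = "", model = "", year = "", summary = ""] = line
    .split(",")
    .map((c) => c.trim());
  return {
    region: "UK",
    sourceId: id,
    make: titleCase(make),
    model: titleCase(model),
    year: parseYear(year),
    summary,
  };
}
```

With one adapter per regulatory body, adding a new source means writing one more function that returns UnifiedRecall, while the search layer only ever queries the unified shape.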
Technology Stack
Next.js 16: App Router + Server Components
Firebase Firestore: Hot-tier NoSQL document store
Node.js ETL Engine: Custom async task scheduler
NHTSA API: US Government safety data
DVSA (UK): UK vehicle safety records
EU RAPEX: European safety alerts
Fuzzy Matching: Model name normalization
Vercel Edge: Serverless + global CDN
Platform Roadmap
DVSA UK recall data fully integrated with live sync
Public REST API with rate-limited free tier launched
ML-powered severity classification model deployed
Real-time email/SMS alert system for vehicles saved to a user's garage
Australian ACCC recall data integration
VIN decoder with global chassis data cross-reference
Explore the Platform
Dive into the technical architecture, see the live data pipeline, or check a vehicle's recall history right now.