Annecy, France

Benjamin Berhault

Solo Founder & Full-Stack Engineer. Building 6 products across EdTech, HR-tech, Dating, GovTech, PropTech, and Data Infrastructure. Venture studio operator — own and operate, not sell.

Projects

6 products designed, launched, and grown across varied sectors.

Real Estate Intelligence


PropTech · Data Platform

Live

Interactive map platform scoring French property transactions as investment opportunities. Official DVF data, PostGIS analytics, vector tile visualization.

Python, Dagster, dbt, PostGIS, MapLibre, D3.js, Docker

Key result

Building-level investment scoring across 2 departments, extensible to all 96 metropolitan French departments

Built investment scoring engine on official French government property data

Problem

French property investors rely on listings (biased) or gut feeling. Official transaction data (DVF) is public but raw and unusable without significant processing.

What I did

Built a full data pipeline: Dagster orchestration, dbt transforms, PostGIS analytics computing commune-level statistics (percentiles, medians, yield estimates). Generated vector tiles with Tippecanoe for MapLibre rendering. Scoring algorithm ranks every transaction 0-100 relative to its local market — a property at the 15th percentile in its commune scores 85 (opportunity).
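The relative-scoring idea can be sketched in a few lines. This is an illustrative stand-in, not the production algorithm: the percentile computation and sample prices are toy values.

```python
from bisect import bisect_left

def relative_score(price_m2: float, commune_prices_m2: list[float]) -> int:
    """Score a transaction 0-100 against its local market: the cheaper it
    is relative to its commune, the higher the score. A property at the
    15th percentile of its commune scores ~85 (an opportunity)."""
    prices = sorted(commune_prices_m2)
    # Percentile rank of this price within the commune's distribution.
    rank = bisect_left(prices, price_m2) / len(prices)
    return round((1 - rank) * 100)

# Toy commune distribution: 100 evenly spaced EUR/m2 prices.
prices = [float(p) for p in range(1000, 3000, 20)]
print(relative_score(prices[15], prices))  # -> 85
```

The production scoring presumably weighs more signals than price alone; this only shows why a percentile rank transfers across communes while an absolute price does not.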

Main constraint

DVF data has inconsistencies — missing coordinates, multi-lot transactions, price anomalies. Had to build robust cleaning pipelines before scoring was meaningful.

Lesson learned

Relative scoring (vs local market) is far more actionable than absolute prices. A EUR150K apartment can be a great deal in one commune and overpriced in the next.

Scale: Millions of DVF transactions, building-level geospatial resolution

Data Infrastructure · Open Source

Live

Progressive, self-hosted data platform — from DuckDB on a laptop to full Spark/Kafka/Flink enterprise stack. 20+ composable services, 5 deployment levels. Apache 2.0.

Docker, Dagster, dbt, Spark, Kafka, Flink, PostgreSQL, MinIO, Superset

Key result

20+ composable services, 5 deployment levels, 80% of Databricks at 5% of the cost

Designed progressive 5-level data platform architecture — DuckDB to Spark/Kafka

Problem

Small companies either overspend on Databricks/Snowflake (USD2-5K+/mo) or waste weeks assembling Docker services manually. No tool offered an honest progressive path.

What I did

Architected a composable platform with 5 deployment levels (Level 0: DuckDB+dbt at USD0/mo → Level 4: Spark+Kafka+Flink+Keycloak at USD150-300/mo). Built a Python CLI for interactive component selection, a docker-compose generator, and 20+ pre-configured services with health checks and resource limits. Apache 2.0 licensed.
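A minimal sketch of how a cumulative level-to-services mapping could drive compose generation. The service lists, image tags, health checks, and memory limits below are placeholders, not the project's actual presets.

```python
# Each level adds services on top of the previous one (cumulative).
# Illustrative assignment only; the real platform ships 20+ services.
LEVELS = {
    0: ["duckdb", "dbt"],
    1: ["postgres", "dagster"],
    2: ["minio", "superset"],
    3: ["kafka", "flink"],
    4: ["spark", "keycloak"],
}

def services_for(level: int) -> list[str]:
    """Cumulative service list: level N includes everything below it."""
    return [s for lvl in range(level + 1) for s in LEVELS[lvl]]

def compose_for(level: int) -> dict:
    """Minimal docker-compose structure with the pre-configured health
    checks and resource limits mentioned above (placeholder values)."""
    return {
        "services": {
            name: {
                "image": f"{name}:latest",  # placeholder tags
                "healthcheck": {"test": ["CMD", "true"], "interval": "10s"},
                "deploy": {"resources": {"limits": {"memory": "512m"}}},
            }
            for name in services_for(level)
        }
    }

print(sorted(compose_for(1)["services"]))
```

The point of the cumulative mapping is that upgrading a level only ever adds services, so a Level 0 project's data and configs carry forward unchanged.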

Main constraint

Ensuring 20+ services work together reliably across 5 different configurations. Health checks, resource limits, and dependency ordering are critical and unglamorous work.

Lesson learned

Most companies will never need Level 3+. The value is in giving them an honest framework to know when to upgrade — not pushing them to over-engineer from day one.

Scale: 20+ Docker services, 31 Dockerfiles, 5 preset environments

CMS Platform


GovTech · Emergency Services

In progress

Real-time geospatial monitoring for emergency dispatch. C++ routing engine, Kafka streaming, GPS tracking, coverage analysis with building-level isochrones.

C++, Kafka, Redis, FastAPI, MapLibre, Deck.gl, PostGIS, Docker

Key result

Building-level coverage analysis with real-time GPS tracking, sub-second route computations

Built C++ real-time routing engine for emergency services coverage analysis

Problem

French fire services (SDIS/BSPP) need to know if every building in their territory can be reached within response time targets. Existing tools are slow and batch-oriented.

What I did

Built a C++ routing engine using RoutingKit on OSM road graphs with contraction hierarchies, forbidden turns, and one-way streets. Kafka streams GPS positions in real-time from dispatch systems. Coverage isochrones computed per-building (not per-zone) with parallel workers. Frontend renders 31K+ buildings with viewport-based culling on MapLibre.
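A plain-Python stand-in for the per-building coverage idea (the real engine uses RoutingKit contraction hierarchies in C++ and is far faster). The graph, building-to-node mapping, and 100-second target are toy values.

```python
import heapq

def travel_times(graph: dict, source: str) -> dict:
    """Dijkstra over a directed road graph {node: [(neighbor, seconds)]}.
    One-way streets are simply edges with no reverse counterpart."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def covered(buildings: dict, dist: dict, target_s: float) -> set:
    """Per-building coverage (not per-zone): a building is covered if its
    nearest road node is reachable within the response-time target."""
    return {b for b, node in buildings.items()
            if dist.get(node, float("inf")) <= target_s}

graph = {"fire_station": [("a", 60), ("b", 200)], "a": [("b", 60)]}
buildings = {"school": "a", "mill": "b", "farm": "c"}
dist = travel_times(graph, "fire_station")
print(covered(buildings, dist, 100))  # -> {'school'}
```

Computing this per building rather than per zone is what surfaces the edge cases (dead ends, one-way detours) that zone-level averages hide.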

Main constraint

C++ binary distribution — cannot source-compile in CI without exposing RoutingKit internals. Had to pre-compile and distribute binaries.

Lesson learned

GovTech sales cycles are 12-18 months minimum. The technical product can be ready in weeks, but the procurement process takes a year. Build relationships first, code second.

Scale: 31K+ buildings, real-time GPS streaming, parallel C++ workers

HR-tech · AI Job Matching

Live

AI-powered, zero-friction job matching platform. Candidate-first, brutally honest analysis. Telegram-native with browser extension. Targeting the $500B job search market.

Python, FastAPI, PostgreSQL, Redis, Claude AI, Tailwind CSS, Stripe, Telegram Bot, pgvector

Key result

Sub-30 second time-to-value from URL paste to full analysis with match score, red flags, salary data, and tailored CV generation

Built multi-LLM job analysis pipeline processing 6 job board APIs

Problem

Job search platforms optimize for engagement, not outcomes. Candidates waste hours on poorly matched listings. No tool gave honest, instant feedback on fit.

What I did

Designed and built end-to-end: job aggregation from 6 APIs (France Travail, Indeed, LinkedIn, WTTJ, Adzuna, Jooble), multi-LLM routing (Claude, OpenAI, DeepSeek, Mistral) with cost optimization, pgvector semantic matching, real-time Telegram bot, Chrome extension, and full application tracking.
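The cost-aware routing described here might look like this in miniature. The provider names match the stack above, but the prices, quality tiers, and `route` signature are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    usd_per_1k_tokens: float  # blended price, illustrative only
    quality: int              # 1 (basic) .. 3 (best), subjective tier

PROVIDERS = [
    Provider("deepseek", 0.0004, 1),
    Provider("mistral", 0.002, 2),
    Provider("claude", 0.009, 3),
]

def route(task_complexity: int, est_tokens: int, budget_usd: float) -> Provider:
    """Pick the cheapest provider whose quality tier meets the task and
    whose estimated cost fits the per-analysis budget; if none fits,
    fall back to the cheapest provider overall."""
    ok = [p for p in PROVIDERS
          if p.quality >= task_complexity
          and p.usd_per_1k_tokens * est_tokens / 1000 <= budget_usd]
    pool = ok or PROVIDERS
    return min(pool, key=lambda p: p.usd_per_1k_tokens)

print(route(1, 2000, 0.02).name)  # -> deepseek
print(route(3, 2000, 0.02).name)  # -> claude
```

A router like this is how a ~USD 0.02/analysis cost ceiling stays enforceable: cheap tasks never touch the expensive model.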

Main constraint

EUR200 total launch budget. Every LLM call costs real money — had to build smart routing to keep cost at ~USD0.02/analysis while maintaining quality.

Lesson learned

Over-engineered the first version with microservices. Collapsed back to a monolith — shipping speed matters more than architecture purity at this stage.

Scale: 10K+ job analyses, 6 API integrations, 6 LLM providers

Zero-friction Telegram-first UX — no signup, no forms, instant value

Problem

Traditional job platforms require 10-30 minutes of setup (account, CV upload, preferences) before any value. Most candidates abandon during onboarding.

What I did

Built a Telegram bot where users paste a job URL and get a full analysis in 30 seconds — no account, no forms. Profile is built conversationally over time. Browser extension adds one-click analysis directly on LinkedIn/Indeed.
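Session state without signup can be sketched by keying everything on the Telegram chat id. The `SessionStore` class and TTL below are hypothetical, with an in-memory dict standing in for Redis.

```python
import time

class SessionStore:
    """Stand-in for a Redis session store: state is keyed by the Telegram
    chat id, so no account, password, or form is ever needed. Production
    would use Redis with the same keys plus a TTL (e.g. SETEX/EXPIRE)."""

    def __init__(self, ttl_s: float = 30 * 24 * 3600):
        self.ttl_s = ttl_s
        self._data: dict[str, tuple[float, dict]] = {}

    def profile(self, chat_id: int) -> dict:
        """Return the profile for this chat, creating or expiring it as
        needed; callers mutate the dict to build it up conversationally."""
        key = f"profile:{chat_id}"
        expires, prof = self._data.get(key, (0.0, {}))
        if time.monotonic() > expires:
            prof = {}  # expired or new: start fresh
        self._data[key] = (time.monotonic() + self.ttl_s, prof)
        return prof

store = SessionStore()
store.profile(42)["target_role"] = "data engineer"  # built up over chat turns
print(store.profile(42)["target_role"])  # -> data engineer
```

The chat id is stable and delivered with every Telegram update, which is what makes "no traditional auth" workable in the first place.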

Main constraint

Telegram API rate limits, browser extension cross-origin restrictions, maintaining session state without traditional auth.

Lesson learned

The biggest barrier to adoption is not features — it is friction. Removing the signup wall increased engagement dramatically.

Scale: Targeting 500 founding members

HestiaMatch


Dating · Mobile App

Live

Values-based dating app for family-oriented singles. React Native, 73 screens, proximity crossings, compatibility scoring, Stripe payments, matchmaker B2B mode.

React Native, Expo, Stripe, WebSocket, REST API, i18n

Key result

73 screens, 96+ builds, cross-platform iOS/Android/Web, full Stripe integration

Shipped 96+ builds of a cross-platform dating app with values-based matching algorithm

Problem

Mainstream dating apps optimize for engagement (more swiping = more ads). Values-oriented singles — people looking for marriage, not hookups — are underserved.

What I did

Built a full React Native app from scratch: 73 screens, weighted compatibility scoring (values 25%, family goals 25%, financial philosophy 20%, intimacy 15%, communication 15%), proximity crossings (Happn-style), real-time WebSocket chat, Stripe payments, physical QR charm cards for offline-to-online conversion, and a B2B matchmaker dashboard.
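The weighted scoring can be sketched directly from the stated weights. The similarity formula and profile fields are an assumed simplification of the real matching algorithm.

```python
# Weights from the description above; dimension answers are scored 0-1.
WEIGHTS = {
    "values": 0.25,
    "family_goals": 0.25,
    "financial_philosophy": 0.20,
    "intimacy": 0.15,
    "communication": 0.15,
}

def compatibility(a: dict, b: dict) -> int:
    """Weighted similarity: 100 when two profiles answer every dimension
    identically, lower as their answers diverge (assumed formula)."""
    score = sum(w * (1 - abs(a[dim] - b[dim])) for dim, w in WEIGHTS.items())
    return round(score * 100)

alice = {"values": 0.9, "family_goals": 1.0, "financial_philosophy": 0.5,
         "intimacy": 0.6, "communication": 0.8}
bob = dict(alice)  # identical answers
print(compatibility(alice, bob))  # -> 100
```

Because the weights sum to 1, each dimension's contribution is capped at its stated share, so no single answer can dominate the match score.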

Main constraint

React Native ecosystem moves fast — breaking changes between Expo versions, platform-specific bugs on iOS vs Android, App Store review requirements.

Lesson learned

Dating apps are a distribution problem, not a technical one. Building the product is 20% of the challenge — getting users to trust and try a new platform is 80%.

Scale: 50+ services (~14K lines), 25+ reusable components

Questus-AI


EdTech · SaaS

Live

AI-powered flashcard & quiz platform. Upload documents, generate questions, learn with spaced repetition. 27+ content parsers, 6 LLM providers, 30+ languages.

Python, FastAPI, PostgreSQL, Redis, Elasticsearch, Vite, HTMX, OpenAI, Docker

Key result

40+ question types rendered, 30+ languages supported, cost per generation optimized by 70% via smart routing

Built 27+ AI content parsers with smart LLM routing across 6 providers

Problem

Existing flashcard tools (Anki, Quizlet) require manual card creation. AI-generated content was low quality and expensive due to single-provider lock-in.

What I did

Engineered a content ingestion pipeline supporting DOCX, PDF, PPTX, images (OCR), YouTube transcripts. Built 27+ pattern parsers (MCQ, scenario, fill-blank, matching, ordering) with confidence scoring. Implemented smart LLM routing across OpenAI, DeepSeek, Mistral, Claude, Grok, and Ollama — selecting model based on task complexity and cost.

Main constraint

Each LLM provider has different strengths — GPT-4 for nuance, DeepSeek for cost, Mistral for speed. Had to build a routing layer that balances quality vs cost per task type.

Lesson learned

Token counting and cost tracking per user per provider was essential from day one — without it you cannot price a SaaS sustainably.

Scale: 635 source files, 6 LLM providers, 30+ languages
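The per-user, per-provider cost tracking lesson can be sketched as a small ledger. The `CostLedger` class and the prices below are illustrative, not actual provider pricing.

```python
from collections import defaultdict

# Illustrative per-1K-token prices; real prices vary by provider and model.
PRICE_PER_1K = {"openai": 0.01, "deepseek": 0.0004, "mistral": 0.002}

class CostLedger:
    """Track spend per (user, provider) so unit economics are known
    before setting subscription prices."""

    def __init__(self):
        self.usd = defaultdict(float)

    def record(self, user: str, provider: str, tokens: int) -> float:
        """Record one generation's token usage; returns its cost in USD."""
        cost = PRICE_PER_1K[provider] * tokens / 1000
        self.usd[(user, provider)] += cost
        return cost

    def user_total(self, user: str) -> float:
        """Total spend for one user across all providers."""
        return sum(c for (u, _), c in self.usd.items() if u == user)

ledger = CostLedger()
ledger.record("u1", "deepseek", 50_000)
ledger.record("u1", "openai", 2_000)
print(round(ledger.user_total("u1"), 4))  # -> 0.04
```

With per-user totals like this, a subscription price floor falls out directly: price must exceed the heaviest plausible user's monthly LLM spend plus margin.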

Built multi-LLM job analysis pipeline processing 6 job board APIs
2024–present · Solo Venture · HR-tech
Scale: 10K+ job analyses, 6 API integrations, 6 LLM providers. Stack: Python, FastAPI, PostgreSQL, pgvector, Redis, Claude, Stripe, Telegram.

Zero-friction Telegram-first UX — no signup, no forms, instant value
2024–present · Solo Venture · HR-tech
Result: Zero-to-value in under 60 seconds vs the industry standard of 10-30 minutes. Scale: Targeting 500 founding members. Stack: Telegram Bot API, Chrome Extension, FastAPI, Redis.

Built 27+ AI content parsers with smart LLM routing across 6 providers
2024–present · Solo Venture · EdTech
Scale: 635 source files, 6 LLM providers, 30+ languages. Stack: Python, FastAPI, PostgreSQL, OpenAI, Tesseract OCR, tiktoken, Vite, HTMX.

Shipped 96+ builds of a cross-platform dating app with values-based matching algorithm
2024–present · Solo Venture · Dating
Scale: 50+ services (~14K lines), 25+ reusable components. Stack: React Native, Expo, Stripe, WebSocket, i18n.

Built investment scoring engine on official French government property data
2026–present · Solo Venture · PropTech
Scale: Millions of DVF transactions, building-level geospatial resolution. Stack: Dagster, dbt, PostGIS, MapLibre, Tippecanoe, D3.js, FastAPI.

Designed progressive 5-level data platform architecture — DuckDB to Spark/Kafka
2025–present · Solo Venture · Data Infrastructure
Scale: 20+ Docker services, 31 Dockerfiles, 5 preset environments. Stack: Docker, Dagster, dbt, Spark, Kafka, Flink, Trino, Superset, Keycloak.

Built C++ real-time routing engine for emergency services coverage analysis
2025–present · Solo Venture · GovTech
Scale: 31K+ buildings, real-time GPS streaming, parallel C++ workers. Stack: C++14, RoutingKit, Kafka, Redis, FastAPI, MapLibre, Deck.gl, Protobuf.

Technical expertise

Level: Expert

SQL
Microsoft SQL Server
SSIS
Data Warehouse Design
ETL/ELT
Power BI
Apache Kafka
Redis
Data Integration Architecture
Geospatial Algorithms
Real-time Data Pipelines

Level: Proficient

Python, PostgreSQL, Docker, Git, GitLab CI/CD, Linux, Debezium (CDC), FastAPI, Flask, C++ (Modern C++17), PostGIS, JavaScript (ES6+), MapLibre GL JS, Leaflet, Server-Sent Events (SSE)

Level: Familiar

React Native, Cloudflare, Azure DevOps, PowerShell, Bash, LLM integration (OpenAI/DeepSeek), Vector Tiles, GeoJSON, Protocol Buffers

Career

2024–present

Independent Data Engineer

Independent Consultant

2024

Data Engineer

Energy sector client (PoC)

2023–2024

Data Engineer

City of Lausanne (data synchronization platform)

2021–2023

Data Engineer

ICRC (Resolve / Family Visit Program / Red Loop)

2019–2021

Data Architect

Brigade de sapeurs-pompiers de Paris (BSPP)

2017–2019

Data Warehouse Developer / BI Developer

Brigade de sapeurs-pompiers de Paris (BSPP)

Let's build together.

Send an email: benjamin.berhault@hotmail.com