May 30, 2026, 1:50 PM · 10 min · case-study

AI-Powered Engineering Portfolio Platform

Started as a small redesign, ended as a full rebuild — frontend, backend, database, CMS, and an AI assistant. This writeup covers why I did it and what I learned.

Problem

Most engineering portfolios stop at screenshots and technology lists. That makes it hard for a recruiter or engineer to understand how the system was actually designed, what tradeoffs shaped it, or whether the work reflects real product thinking.

Decision

Treat the portfolio as a real product system instead of a static site: Next.js for the public experience, FastAPI for content and assistant APIs, PostgreSQL for structured CMS data, pgvector for semantic retrieval, and a grounded AI assistant that answers only from indexed project writeups and architecture notes—not generic internet knowledge.

Outcome

A portfolio that behaves like an engineering knowledge system: visitors can read case studies, inspect architecture decisions, and ask the AI guide questions grounded in the same content—making the work explorable and credible instead of purely presentational.

Overview

Product direction and engineering scope

The story behind why I rebuilt my portfolio from scratch as a proper product: the problems with the old static site, the design choices I made, the things that broke along the way, and what I would do differently next time.

Full writeup

Implementation notes

Key engineering decision

Treat the portfolio as a product, not a brochure.

The biggest decision was to stop hardcoding content inside React files and instead build a proper backend with a shared content_items table. One table, six content types, JSONB for type-specific extras. This single choice removed almost all the duplicate code I was about to write, and made the AI assistant possible later on.

Architecture summary

System topology and platform shape

The platform is a small monorepo with three main pieces:

Next.js app — handles both the public site and the dashboard CMS. Public pages use ISR; dashboard pages are client-driven and talk to the backend through a same-origin BFF layer.
FastAPI service — owns all the business logic. Routes are thin, services hold the rules, repositories talk to Postgres, models stay clean.
Postgres — single source of truth. One shared content_items table, plus knowledge_documents and knowledge_chunks for the RAG pipeline.

Everything is glued together with TypeScript on the frontend, async SQLAlchemy on the backend, and Alembic for migrations. Docker Compose runs the whole thing locally with one command.
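
The "thin routes, services hold the rules, repositories talk to Postgres" layering can be sketched in plain Python without any framework. This is a minimal illustration, not the actual service code: the names `ContentRepository`, `ContentService`, and `publish_route`, the in-memory store standing in for Postgres, and the "no empty body" rule are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass
class ContentItem:
    slug: str
    title: str
    body: str = ""
    status: str = "draft"


class ContentRepository(Protocol):
    def get(self, slug: str) -> ContentItem | None: ...
    def save(self, item: ContentItem) -> None: ...


class InMemoryRepository:
    """Stand-in for a Postgres-backed repository (assumption)."""

    def __init__(self) -> None:
        self._items: dict[str, ContentItem] = {}

    def get(self, slug: str) -> ContentItem | None:
        return self._items.get(slug)

    def save(self, item: ContentItem) -> None:
        self._items[item.slug] = item


class ContentService:
    """Holds the business rules; knows nothing about HTTP or SQL."""

    def __init__(self, repo: ContentRepository) -> None:
        self._repo = repo

    def publish(self, slug: str) -> ContentItem:
        item = self._repo.get(slug)
        if item is None:
            raise LookupError(f"no content item with slug {slug!r}")
        if not item.body.strip():
            # Hypothetical rule, chosen only to show where rules live.
            raise ValueError("cannot publish an item with an empty body")
        item.status = "published"
        self._repo.save(item)
        return item


def publish_route(service: ContentService, slug: str) -> tuple[int, dict]:
    """Thin route: only translates service outcomes into status codes."""
    try:
        item = service.publish(slug)
    except LookupError as exc:
        return 404, {"detail": str(exc)}
    except ValueError as exc:
        return 422, {"detail": str(exc)}
    return 200, {"slug": item.slug, "status": item.status}
```

Because the service depends only on the repository protocol, the same rules can be exercised in tests with the in-memory store and in production with a database-backed one.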

Why this shape

I wanted one content pipeline, not five. The same Markdown that powers a public project page also feeds the AI assistant and the search index. By keeping the data model unified and the rendering pipeline shared, I avoided the usual "CMS says one thing, site shows another" problem.

Full writeup — Implementation notes

The problem with the old portfolio

The old site was a static React app built on Vite. Content was hardcoded inside .tsx files. To add a project I had to:

Open the code
Add an entry to a constants file
Commit, push, wait for Netlify to redeploy

For one project this is fine. For ten projects with case studies, architecture notes, articles, and experiment logs, it becomes real friction. I had simply stopped writing because the cost of writing was too high.

Also, recruiters were asking me the same five questions on every call. "What stack did you use here? Why this database? How did you handle scaling?" I wanted a way for that information to be available without me being in the loop every time.

So the rebuild had two goals:

Make writing content cheap and pleasant — so I actually do it
Make the content queryable — so visitors can ask questions and get real answers

First attempt — separate tables per content type

Started with the "obvious" design — one table for projects, one for articles, one for case_studies, etc. Each with its own model, repository, service, API route, and frontend component.

This worked for about two days. Then I noticed every table had the same 12 columns — title, slug, description, body, SEO fields, published_at, status, tags. The only real difference was a few type-specific fields like tech stack for projects or read time for articles.

I was writing the same code five times. So I scrapped it and went with:

one content_items table
a type column
a JSONB metadata field for type-specific extras

Painful refactor. But after it was done everything became smoother — adding a new content type now takes maybe an hour instead of a full day.

Lesson: If you find yourself duplicating the same table structure with minor variations, stop and unify. The polymorphism is worth the small bit of JSONB messiness.
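
To make the unified shape concrete, here is a minimal sketch of one record type with a `type` discriminator and a free-form `metadata` field. The field names, the `REQUIRED_METADATA` mapping, and the `read_time_minutes` key are assumptions for illustration; the real table lives in Postgres with a JSONB column, which a JSON string stands in for here.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

# Hypothetical per-type metadata keys. The writeup only names tech stack
# (projects) and read time (articles) as examples of type-specific fields.
REQUIRED_METADATA: dict[str, set[str]] = {
    "project": {"tech_stack"},
    "article": {"read_time_minutes"},
    "case_study": set(),
}


@dataclass
class ContentItem:
    type: str
    slug: str
    title: str
    body: str
    status: str = "draft"
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.type not in REQUIRED_METADATA:
            raise ValueError(f"unknown content type: {self.type!r}")
        missing = REQUIRED_METADATA[self.type] - set(self.metadata)
        if missing:
            raise ValueError(f"{self.type} is missing metadata: {sorted(missing)}")

    def to_row(self) -> dict:
        """Flatten to the column layout of a single shared table."""
        return {
            "type": self.type,
            "slug": self.slug,
            "title": self.title,
            "body": self.body,
            "status": self.status,
            # JSONB in Postgres; a JSON string stands in for it here.
            "metadata": json.dumps(self.metadata, sort_keys=True),
        }
```

Under this sketch, adding a content type means adding one entry to the mapping rather than a new table, model, repository, and route.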

The CMS journey — three attempts

Attempt 1 — A separate admin app. Decided to build the dashboard as its own React app on a subdomain. Looked clean architecturally. But auth across subdomains was a headache, deployment was double the work, and sharing components meant publishing to a private npm registry. Killed it after a week.

Attempt 2 — Dashboard inside Next.js, calling FastAPI directly from the browser. Better. But now I had CORS issues, the auth cookie story was confusing, and CSRF protection needed manual handling. Worked, but felt fragile.

Attempt 3 — Dashboard inside Next.js, talking through a BFF layer. Created /api/dashboard/** routes in Next.js which forward to FastAPI admin routes. Same-origin, same cookies, no CORS, simple CSRF. This is the version which stuck.

Lesson: Same-origin BFF is underrated. People reach for direct API calls because they feel "purer", but the operational simplicity of a BFF wins almost every time for internal tools.
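
The real BFF routes live in Next.js, but the forwarding rule itself is language-independent, so here is a rough sketch of it in Python. The upstream base URL, the `/admin/` path, and the list of forwarded headers are assumptions; the writeup only says that `/api/dashboard/**` forwards to FastAPI admin routes.

```python
from __future__ import annotations

BFF_PREFIX = "/api/dashboard/"
# Hypothetical upstream location; not taken from the actual deployment.
UPSTREAM_BASE = "http://backend:8000/admin/"
# Assumed allow-list: only pass through what the backend needs.
FORWARDED_HEADERS = ("cookie", "content-type")


def build_upstream_request(method: str, path: str, headers: dict[str, str]) -> dict:
    """Map a same-origin dashboard call onto the backend admin API."""
    if not path.startswith(BFF_PREFIX):
        raise ValueError(f"not a dashboard route: {path}")
    # The session cookie is available here because the browser sent the
    # request to the same origin that served the dashboard page.
    forwarded = {name: headers[name] for name in FORWARDED_HEADERS if name in headers}
    return {
        "method": method.upper(),
        "url": UPSTREAM_BASE + path[len(BFF_PREFIX):],
        "headers": forwarded,
    }
```

The browser never talks to the backend origin in this arrangement, which is why CORS drops out of the picture entirely.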

The rich text editor disaster

I really wanted a Notion-style editor for the CMS. Spent two days integrating one. The writing experience was great. The output was a nightmare:

The HTML had weird wrapper divs everywhere
Code blocks lost their language tags
Pasting from other places brought random inline styles
Worst of all — when I fed this HTML into the RAG chunker, the chunks were full of markup noise. Embedding quality dropped, retrieval became worse, and the AI answers got vague.

Ripped it out and went pure Markdown. Now:

Storage is clean Markdown text
The dashboard shows live preview using the exact same renderer as the public site
The RAG chunker can split on headings, preserve code blocks, and produce clean chunks

The writing experience is slightly less fancy, but everything downstream (storage, preview, and chunking) is dramatically simpler.

Lesson: Choose your data format based on what consumes it, not based on what feels nice to author in. Markdown is boring and that's exactly why it works.

The RAG assistant — what broke and what got fixed

This was the hardest part.

First version (naive) — dump the full Markdown of each content piece as one big chunk, embed it, retrieve top 3 on query. Two problems:

Chunks were too large. A 2000-word article became one chunk. Retrieval would return the whole thing even if only one paragraph was relevant. LLM context got bloated, answers became wishy-washy.
No heading awareness. A question about "the database choice" might match a section in the middle of a chunk, but the LLM had no idea which part of the chunk was actually answering the question.
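
For reference, the naive retrieval step looks roughly like the sketch below: one embedding per whole document, ranked by similarity, top 3 returned. The vectors here are toy values and cosine similarity is an assumption about the metric; in the real system the ranking happens inside Postgres via pgvector rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=3):
    """Return the ids of the k documents most similar to the query.

    docs is a list of (doc_id, embedding) pairs, one per whole document,
    which is exactly what makes this version naive.
    """
    scored = [(cosine_similarity(query_vec, emb), doc_id) for doc_id, emb in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

Nothing in the ranking is wrong here. The problem is the unit being ranked: each hit drags an entire article into the LLM context.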

Second version (fixed):

Heading-aware chunking — split on ## and ### headings. Each chunk is now a coherent sub-section with its own heading as context. Average chunk size came down to around 300 tokens.
Strict grounding prompt — the LLM is told "Answer only using the provided context. If the answer is not there, say you don't have that information." This stops it from making things up.
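
The heading-aware split can be sketched in a few lines. This is a simplified illustration rather than the production chunker: the `Chunk` type and function names are hypothetical, there is no token-based size limit, and fence detection is deliberately basic. It does show the two properties that matter, splitting on `##` and `###` and never splitting inside a code block.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

HEADING = re.compile(r"^(#{2,3})\s+(.*)$")
# Built with multiplication so this example does not contain a literal fence.
FENCE_MARKERS = ("`" * 3, "~" * 3)


@dataclass
class Chunk:
    heading: str  # empty for text before the first heading
    text: str

    def embedding_input(self) -> str:
        """Prefix the heading so the chunk carries its own context."""
        return f"{self.heading}\n\n{self.text}" if self.heading else self.text


def chunk_markdown(markdown: str) -> list[Chunk]:
    chunks: list[Chunk] = []
    heading = ""
    lines: list[str] = []
    in_fence = False

    def flush() -> None:
        body = "\n".join(lines).strip()
        if body:
            chunks.append(Chunk(heading=heading, text=body))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE_MARKERS):
            in_fence = not in_fence
        # Lines that look like headings inside a code block are just code.
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            heading = match.group(2).strip()
            lines = []
        else:
            lines.append(line)
    flush()
    return chunks
```

A real chunker would also need a fallback for very long sections, for example splitting on paragraphs once a section exceeds a size budget.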

Now the assistant gives short, accurate answers with citations back to the source content.
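
A grounded prompt with citable sources might be assembled like the sketch below. The grounding sentence is the one quoted above; the numbered `[n]` source format, the extra citation instruction, and the `build_prompt` name are assumptions added for illustration.

```python
from __future__ import annotations

GROUNDING_RULE = (
    "Answer only using the provided context. "
    "If the answer is not there, say you don't have that information."
)


def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Assemble a prompt from (source_label, text) pairs.

    Each chunk gets a bracketed number so the model can cite it and the
    UI can link the citation back to the source content.
    """
    context = "\n\n".join(
        f"[{i}] {source}\n{text}"
        for i, (source, text) in enumerate(chunks, start=1)
    )
    return (
        f"{GROUNDING_RULE}\n"
        "Cite the bracketed source numbers you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Keeping the rule as the first thing in the prompt and the question as the last is a common arrangement, not something the model strictly requires.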

Lesson: RAG quality is 80% about chunking strategy, not about which LLM you use. Get the chunks right and even a small model gives great answers.

The edit page hydration bug

This one took me a full afternoon to debug. The dashboard had auto-save drafts in localStorage. But when I opened an existing item to edit, sometimes it would show stale draft content instead of the latest saved data from the server.

Root cause — the form was initialising from localStorage first, then the server data was loading, but the form state had already committed. So the server data got ignored.

Fix — in edit mode, always prefer server data as the source of truth. Show drafts only through an explicit "Recover draft" banner, never silently overwrite.

Lesson: When you have two sources of state (server + local), be very explicit about which one wins in which situation. Don't let "whichever loads first" decide.
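
The precedence rule can be written down as one small pure function. The dashboard itself is TypeScript, so treat this Python version as a sketch of the decision logic only: `FormInit` and `resolve_initial_state` are hypothetical names, and restoring the draft directly in create mode is an assumption the writeup does not confirm.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FormInit:
    values: dict                          # what the form fields start with
    recoverable_draft: dict | None = None  # shown via a "Recover draft" banner


def resolve_initial_state(mode: str, server: dict | None, draft: dict | None) -> FormInit:
    """Decide explicitly which state source wins, independent of load order."""
    if mode == "edit":
        if server is None:
            # Do not initialise the form until the server data has arrived.
            raise ValueError("edit mode needs server data before the form initialises")
        # The server always wins; a differing draft is only offered, never applied.
        offer = draft if draft is not None and draft != server else None
        return FormInit(values=dict(server), recoverable_draft=offer)
    # Create mode (assumption): there is no server record, so the local
    # draft is the only state available and is restored directly.
    return FormInit(values=dict(draft or {}))
```

Raising when the server data is missing forces the caller to wait for the fetch, which removes the "whichever loads first" race by construction.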

Things I would do differently

Start with the unified table from day one. I wasted time on the per-type tables.
Skip the rich text editor experiment. Should have gone straight to Markdown.
Set up the RAG pipeline earlier. I treated it as a "final step" but actually the content structure should be designed for RAG from the beginning.
Write more tests on the backend. Frontend tests are decent, backend tests are thinner than I would like. Adding more now.

What I am happy with

Area	Result
Monorepo structure	Clean, scales well
Unified content model	Removed so much duplication
Markdown-first decision	Paid off in three different ways — authoring, rendering, RAG
Performance	ISR + cache tags + good Postgres indexes make the site genuinely fast
AI assistant	Actually works, gives honest answers, no hallucinations

Closing note

The portfolio is now itself a project I can point to. Every design decision I made is visible in the code, the architecture docs, and the case studies. That was the whole point — instead of telling people I can build production software, the site quietly *sh