Computer Vision

Image Caption Generator

A CNN + LSTM model trained on the COCO dataset to describe images in English, using transfer learning from a pretrained Inception network.

Overview

Generates English captions for arbitrary images using a CNN-LSTM hybrid trained on COCO (80 object categories grouped into 12 supercategories).

The Problem

Accessibility tools needed scalable, automated image descriptions.

The Solution

A pretrained Inception network, reused through transfer learning, extracts image features; a custom LSTM decoder turns those features into a caption.
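To make the decoding step concrete, here is a minimal sketch of how a caption could be produced word by word from the extracted features. It assumes greedy decoding (the project may have used beam search or sampling instead), and every name in it is hypothetical: `greedy_caption`, the `<start>`/`<end>` tokens, the 2048-length feature vector, and `toy_step`, a scripted stand-in for the trained LSTM decoder so the example runs without a model.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical special tokens marking the caption boundaries.
START, END = "<start>", "<end>"

# One decoder step: (image features, words so far) -> probability per vocabulary word.
StepFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float], step: StepFn, max_len: int = 20) -> str:
    """Build a caption one word at a time, always taking the most likely next word."""
    words = [START]
    for _ in range(max_len):
        probs = step(features, words)
        next_word = max(probs, key=probs.get)
        if next_word == END:
            break
        words.append(next_word)
    # Drop the start token before returning the caption text.
    return " ".join(words[1:])


def toy_step(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for the trained decoder: follows a fixed script and ignores the features."""
    script = ["a", "dog", "on", "grass", END]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {w: (0.9 if w == target else 0.025) for w in script}


# Assumed feature size: a 2048-dimensional vector standing in for the CNN output.
caption = greedy_caption([0.0] * 2048, toy_step)
print(caption)
```

In a real pipeline, `toy_step` would be replaced by a call into the trained decoder, and `features` by the vector the Inception network produces for the input image. The `max_len` cap guards against a decoder that never emits the end token.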

Impact

Produced captions accurate enough to be usable in accessibility and content-tagging workflows.