Projects

This page documents my applied AI engineering and research projects, spanning LLM systems, retrieval-augmented generation, model compression, and production-grade ML pipelines.

Text-to-3D mesh generation using diffusion models - Tesseract v1

August 2025

Tesseract v1 - Text-to-3D Mesh Generation System

Context & Motivation

I built Tesseract as my first flagship end-to-end ML system, a deliberate move beyond notebook-driven experiments to understand what separates production-grade AI systems from typical college-level projects.

Early-stage 3D asset creation is slow and labor-intensive, often forcing artists and developers to start from scratch before any meaningful iteration can begin. While text-to-3D diffusion models exist, most are released as fragile scripts or demos that are difficult to integrate into real workflows, lack reproducibility, and are not designed for deployment or scaling.

I wanted to explore how a research model could be wrapped into a reliable, scriptable, and scalable system, usable through real interfaces rather than isolated experiments.

Problem Statement

How can text-conditioned 3D mesh generation be exposed as a reliable, production-ready system suitable for batch workflows, service-based integration, and reproducible experimentation—rather than remaining a research demo or notebook artifact?

System Overview

Tesseract v1 is a modular, production-oriented ML pipeline that wraps a diffusion-based 3D generation model and exposes it through multiple interfaces:

Stateless FastAPI service supporting asynchronous, job-based mesh generation
Custom CLI for local usage, batch generation, and scripting
Config-driven execution model (YAML-based) enabling reproducible runs and controlled experimentation

The system manages prompt ingestion, latent generation, mesh decoding, file export, and optional preview rendering in a unified pipeline. It is intentionally designed to be stateless at the service layer, device-aware (GPU with CPU fallback), and compatible with containerized deployment patterns.
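
As a rough illustration of the staged pipeline described above, the following sketch chains the stages using stand-in functions. Every name here (`RunConfig`, `generate_latents`, `decode_mesh`, `export_mesh`, `run_pipeline`) is hypothetical and not taken from the Tesseract codebase. In a real run the stages would call into the diffusion model and the settings would be parsed from a YAML file rather than a literal dictionary.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical run settings; in practice these would be parsed from YAML."""
    prompt: str
    num_samples: int = 1
    output_format: str = "obj"
    device: str = "cpu"  # e.g. "cuda" when a GPU is available, otherwise "cpu"


def generate_latents(cfg: RunConfig) -> list[list[float]]:
    # Stand-in for the diffusion sampler: one fake latent vector per sample.
    return [[float(i), float(len(cfg.prompt))] for i in range(cfg.num_samples)]


def decode_mesh(latent: list[float]) -> str:
    # Stand-in for mesh decoding: emit a single-triangle OBJ document.
    return "v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"


def export_mesh(mesh: str, out_dir: Path, index: int, fmt: str) -> Path:
    # Write one mesh to disk under a predictable, zero-padded file name.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"sample_{index:03d}.{fmt}"
    path.write_text(mesh)
    return path


def run_pipeline(cfg: RunConfig, out_dir: Path) -> list[Path]:
    """Run prompt -> latents -> meshes -> files; preview rendering is omitted."""
    latents = generate_latents(cfg)
    return [
        export_mesh(decode_mesh(latent), out_dir, i, cfg.output_format)
        for i, latent in enumerate(latents)
    ]


if __name__ == "__main__":
    raw = {"prompt": "a small wooden chair", "num_samples": 2}  # as if read from YAML
    with tempfile.TemporaryDirectory() as tmp:
        for path in run_pipeline(RunConfig(**raw), Path(tmp)):
            print(path.name)
```

Keeping the configuration in a single frozen object means a run can be reproduced from its config alone, which is the property the YAML-driven design is aiming for.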

Key Engineering Decisions

Rather than focusing on model novelty, I centered this project around system reliability, iteration ergonomics, and realistic integration constraints:

Custom CLI with explicit flags for prompts, batching, output formats, and dry-runs, enabling reproducible experimentation outside notebooks
Batch processing support to generate multiple samples per prompt, both to improve output selection and to stress-test the pipeline under heavier workloads
Latent cache saving and resume support, allowing generation to continue from intermediate latent states instead of restarting full runs
Asynchronous job handling to avoid blocking execution and support concurrent requests
Modular system boundaries separating API, orchestration, decoding, rendering, and configuration layers
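
To make the CLI item above concrete, here is a minimal `argparse` sketch of how prompt, batching, output-format, and dry-run flags might fit together. The flag names, the `tesseract-sketch` program name, and the format choices are assumptions made for illustration; they are not the actual Tesseract flags.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI; flag names are illustrative, not Tesseract's."""
    parser = argparse.ArgumentParser(prog="tesseract-sketch")
    parser.add_argument("--prompt", action="append", required=True,
                        help="text prompt; repeat the flag for several prompts")
    parser.add_argument("--samples", type=int, default=1,
                        help="number of meshes to generate per prompt")
    parser.add_argument("--format", choices=["obj", "ply"], default="obj",
                        help="mesh export format")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the planned jobs without running generation")
    return parser


def plan_jobs(args: argparse.Namespace) -> list[dict]:
    """Expand prompts x samples into one job description per output mesh."""
    return [
        {"prompt": prompt, "sample": index, "format": args.format}
        for prompt in args.prompt
        for index in range(args.samples)
    ]


def main(argv: Optional[list[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    jobs = plan_jobs(args)
    if args.dry_run:
        # A dry run only reports what would be generated.
        for job in jobs:
            print(job)
        return 0
    print("generation is not implemented in this sketch")
    return 1


if __name__ == "__main__":
    # Example invocation with explicit arguments instead of sys.argv.
    main(["--prompt", "a low-poly fox", "--samples", "2", "--dry-run"])
```

Separating job planning from execution is what makes a dry-run flag cheap: the same plan can be printed, logged, or handed to the generator.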

While initially attempting to extract only the minimal components required for text-to-3D generation, I found that Shap-E's research-oriented codebase was highly coupled, with core functionality spanning a large dependency graph. Rather than risk subtle breakage or silent correctness issues, I chose to vendor the full Shap-E project into the core and build clean abstraction layers around it instead of aggressively pruning internals.

This decision favored correctness, stability, and debuggability over premature modularization.

Engineering & Stack

Core: Python, PyTorch, Shap-E (text-conditioned diffusion)
Interfaces: Custom CLI, FastAPI (REST)
Execution: Async job orchestration, batch processing, latent caching
Configuration: YAML-driven runtime configuration
Observability: Structured logging across pipeline stages
Deployment posture: Designed to be container-friendly and compatible with common Docker and Kubernetes deployment patterns

The focus was on keeping the system simple to reason about, easy to debug, and reproducible, rather than optimizing prematurely for scale.
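
One common way to realize the async job orchestration listed in the stack above is to bound concurrency with a semaphore and push the blocking model call onto a worker thread. The sketch below uses only `asyncio` from the standard library and is an assumption about the general pattern, not the project's actual FastAPI service code; `generate_mesh_blocking`, `run_job`, and `run_all` are hypothetical names.

```python
import asyncio
import time


def generate_mesh_blocking(prompt: str) -> str:
    # Stand-in for the slow, blocking model call.
    time.sleep(0.05)
    return f"mesh for {prompt!r}"


async def run_job(prompt: str, limit: asyncio.Semaphore) -> str:
    # Bound concurrency, then move the blocking call off the event loop.
    async with limit:
        return await asyncio.to_thread(generate_mesh_blocking, prompt)


async def run_all(prompts: list[str], max_concurrent: int = 2) -> list[str]:
    limit = asyncio.Semaphore(max_concurrent)
    # gather returns results in the same order as the submitted prompts.
    return await asyncio.gather(*(run_job(p, limit) for p in prompts))


if __name__ == "__main__":
    for result in asyncio.run(run_all(["a chair", "a lamp", "a mug"])):
        print(result)
```

The concurrency cap matters on a single GPU, where unbounded parallel generation would typically exhaust memory rather than improve throughput.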

Evaluation & Validation

Evaluation focused on system correctness and operational behavior, rather than benchmark scores:

Qualitative inspection of generated meshes to assess usability as starting canvases
Validation of batch execution under different memory and latency configurations
Failure-mode testing around decoding, rendering, and partial pipeline crashes
Verification that cached latents could reliably resume downstream stages

The emphasis was on ensuring the pipeline behaved predictably under iteration and failure, rather than optimizing for output aesthetics alone.

Limitations & Reflections

Several engineering choices emerged directly from repeated friction during development.

I introduced batch processing not only to increase the likelihood of obtaining useful outputs from stochastic diffusion, but also to intentionally stress-test the pipeline and observe how resource usage and failure modes surfaced under load.

Similarly, latent cache saving was added after encountering multiple cases where decoding or rendering failures—often due to small bugs or misconfigurations—forced the entire diffusion process to restart. Persisting intermediate latents significantly reduced wasted compute and made debugging faster and less frustrating.
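
The save-and-resume idea can be sketched as a small load-or-compute helper keyed on the generation inputs. This is a simplified illustration under stated assumptions: the latent is a plain list of floats stored as JSON, and `cache_key`, `sample_latent`, and `load_or_sample` are hypothetical names. A real pipeline would persist model tensors instead.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(prompt: str, seed: int) -> str:
    # Identical (prompt, seed) pairs map to the same cache file.
    payload = json.dumps({"prompt": prompt, "seed": seed}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def sample_latent(prompt: str, seed: int) -> list[float]:
    # Stand-in for the expensive diffusion sampling step.
    return [float(seed), float(len(prompt))]


def load_or_sample(prompt: str, seed: int, cache_dir: Path) -> tuple[list[float], bool]:
    """Return (latent, cache_hit); sample and persist the latent only on a miss."""
    path = cache_dir / f"{cache_key(prompt, seed)}.json"
    if path.exists():
        return json.loads(path.read_text()), True
    latent = sample_latent(prompt, seed)
    cache_dir.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(latent))
    # Rename into place so a crash mid-write does not leave a truncated entry
    # under the final file name.
    tmp.replace(path)
    return latent, False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        for attempt in range(2):
            latent, hit = load_or_sample("a chair", 0, Path(tmp_dir))
            print(attempt, "cache hit" if hit else "sampled", latent)
```

With this shape, a failure in decoding or rendering only requires rerunning the downstream stage, since the second call returns the stored latent without sampling again.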

Integrating a research codebase like Shap-E also highlighted the tradeoff between ideal modularity and practical correctness. This project reinforced that, in applied ML systems, stability and recoverability often matter more than architectural purity.

As my first flagship, end-to-end ML system, Tesseract helped solidify my understanding of what distinguishes robust engineering from experimental code: careful state management, clear boundaries, and designing for things to fail gracefully.

Status

Completed (v1) — a production-oriented architecture with clear scope for future improvements in model quality, evaluation rigor, and modular refinement.

LLM-based Reddit user persona generation system - Reddit-Persona

July 2025

LLM-based user persona generation from Reddit activity

Reddit-Persona - LLM-based User Persona Generation

I built Reddit-Persona to explore how large language models can synthesize coherent behavioral personas from noisy, unstructured, real-world text data. The project originated as a pre-interview internship assignment, but I intentionally extended it beyond the initial requirements to understand how LLMs behave when tasked with higher-level reasoning over fragmented user activity rather than direct question answering.