AI / ML & Computer Vision Engineer

Harsh Akula

I build AI systems that actually work in production. Not demos. Real systems that make money and don't break at 3 AM.

What I've Built

Projects That Actually Matter

Here's the stuff I'm actually proud of. The ones where I learned something, broke something, and then fixed it properly.

01

Parrot — Movement Made Social

Enterprise-Grade iOS Movement Platform

Parrot App Interface
Pose Detection
Social Features
Leaderboards
Multi-Modal Activities

This is the big one. A production iOS app that's actually live in the App Store. Real-time AR pose detection, frame-accurate movement matching, multi-stream video capture, and a full social platform—all built in SwiftUI. It's not a prototype. It's not an MVP. It's a shipping product that real people use every day. And yes, you can download it right now.

Live in App Store

Status

99.7%

Pose Accuracy

<100ms

Latency

SwiftUI
Snap Camera Kit
MediaPipe
Firebase
AVFoundation
ARKit
Swift Concurrency

Multi-Stream Video Capture Pipeline

Simultaneously captures three streams: a 1080p recording stream, a 720p processing stream, and depth data. A triple-buffer architecture handles memory management and frame synchronization across all three. Because users want beautiful videos, but pose detection needs efficiency.

3 Streams Simultaneous

Real-Time AR Pose Detection

Snap Camera Kit integration with 20-joint 3D skeleton tracking at 30 FPS and under 100 ms latency. A MediaPipe Heavy model fallback keeps detection accuracy at 99.7% even in challenging lighting. Frame-accurate timestamping with normalized pose data.

99.7% Accuracy • 30 FPS

Frame-Accurate Pose Matching

Frame-by-frame comparison with 33 pose landmarks, Euclidean distance in 3D space, temporal alignment, confidence-weighted scoring, and beat-synchronized analysis. Produces scores with tenth-of-a-percent precision (for example, 87.3%) that users actually trust.

Sub-Percentage Accuracy
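A minimal sketch of confidence-weighted scoring over 3D landmarks, to make the idea concrete. This is illustrative Python, not the app's Swift code: the landmark layout `(x, y, z, confidence)`, the function names, and the `max_distance` cutoff of 0.5 are all assumptions, and temporal alignment and beat synchronization are left out.

```python
import math
from typing import Sequence, Tuple

# Hypothetical landmark layout: normalized x, y, z plus a detection confidence in [0, 1].
Landmark = Tuple[float, float, float, float]


def frame_score(
    reference: Sequence[Landmark],
    attempt: Sequence[Landmark],
    max_distance: float = 0.5,  # assumed cutoff: distances at or beyond this score zero
) -> float:
    """Return a 0-100 similarity score for one pair of frames."""
    if len(reference) != len(attempt):
        raise ValueError("both frames must have the same number of landmarks")
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")

    weighted_sum = 0.0
    weight_total = 0.0
    for (rx, ry, rz, rc), (ax, ay, az, ac) in zip(reference, attempt):
        # Euclidean distance between the two landmarks in 3D space.
        distance = math.dist((rx, ry, rz), (ax, ay, az))
        closeness = max(0.0, 1.0 - distance / max_distance)
        # A landmark only counts as much as the less confident detection.
        weight = min(rc, ac)
        weighted_sum += weight * closeness
        weight_total += weight

    if weight_total == 0.0:
        return 0.0
    return round(100.0 * weighted_sum / weight_total, 1)


def clip_score(
    reference_frames: Sequence[Sequence[Landmark]],
    attempt_frames: Sequence[Sequence[Landmark]],
) -> float:
    """Average per-frame scores over frames that are already aligned in time."""
    scores = [frame_score(r, a) for r, a in zip(reference_frames, attempt_frames)]
    return round(sum(scores) / len(scores), 1) if scores else 0.0
```

Weighting by the lower of the two confidences means an occluded joint cannot drag the score down on its own, which is one plausible reason a score like 87.3% feels fair to the person dancing.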

Enterprise Social Platform

Follow requests, threaded comments, user discovery, engagement analytics, content moderation. Full Firebase integration with composite indexes, atomic uploads, and intelligent retry logic. Infrastructure that scales.

Firebase Cloud

3-Tier Smart Caching System

L1 cache (500MB, 7 days) for previews, L2 cache (100MB, 30 days) for thumbnails, L3 permanent downloads. LRU eviction with access frequency boosting, automatic expiration, cache integrity validation, and emergency cleanup at 95% capacity.

3-Tier Architecture
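Here is a rough sketch of one cache tier with expiration, LRU-style eviction, a frequency boost, and cleanup at 95% capacity. It is Python for illustration only, not the shipping Swift implementation. The scoring formula, the `frequent_after` hit count, and all names are assumptions; only the 20% boost and the 95% trigger come from the write-up.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Entry:
    size: int
    stored_at: float
    last_access: float
    hits: int = 0


class CacheTier:
    def __init__(
        self,
        capacity_bytes: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        frequent_after: int = 3,   # assumed: hits needed to count as "frequent"
        boost: float = 1.2,        # the 20% score boost
        cleanup_at: float = 0.95,  # cleanup starts above 95% of capacity
    ) -> None:
        self.capacity = capacity_bytes
        self.ttl = ttl_seconds
        self.clock = clock
        self.frequent_after = frequent_after
        self.boost = boost
        self.cleanup_at = cleanup_at
        self.entries: Dict[str, Entry] = {}

    def used(self) -> int:
        return sum(entry.size for entry in self.entries.values())

    def _score(self, entry: Entry, now: float) -> float:
        # Higher score means "keep". Recently used entries score higher,
        # and frequently used entries get the boost on top.
        score = 1.0 / (1.0 + (now - entry.last_access))
        if entry.hits >= self.frequent_after:
            score *= self.boost
        return score

    def put(self, key: str, size: int) -> None:
        now = self.clock()
        self.entries[key] = Entry(size=size, stored_at=now, last_access=now)
        self._cleanup(now)

    def get(self, key: str) -> bool:
        """Return True on a hit; expired entries are removed and count as misses."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is None:
            return False
        if now - entry.stored_at > self.ttl:
            del self.entries[key]
            return False
        entry.hits += 1
        entry.last_access = now
        return True

    def _cleanup(self, now: float) -> None:
        # Drop expired entries first, then evict the lowest-scoring ones.
        expired = [k for k, e in self.entries.items() if now - e.stored_at > self.ttl]
        for key in expired:
            del self.entries[key]
        while self.entries and self.used() > self.cleanup_at * self.capacity:
            victim = min(self.entries, key=lambda k: self._score(self.entries[k], now))
            del self.entries[victim]
```

The boost is what separates this from plain LRU: an entry that was hit several times can outlive a slightly more recent entry that was only touched once.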

Multi-Modal Activities Platform

Beyond dance: Boxing mode with 3D SceneKit visualization, exercise tracking, yoga poses, table tennis. Each mode uses specialized pose models optimized for the specific activity. Because movement isn't just dancing.

4+ Activity Modes

Lessons & Real Talk

  • Atomic transaction pattern for cloud uploads. All-or-nothing with explicit rollback. Learned the hard way that partial failures leave orphaned files.
  • User-scoped directory architecture for multi-user devices. Complete data isolation by userId. GDPR compliance from day one, not an afterthought.
  • 3-tier caching with LRU eviction and frequency boosting. Frequently accessed items get a 20% score boost. Simple key-value caches don't cut it at scale.
  • Defensive state management prevents race conditions. Explicit flags stop auth listener interference. Async timing bugs are impossible to reproduce otherwise.
  • Performance instrumentation on every operation. Production monitoring, bottleneck identification. Can't fix what you don't measure.
  • VideoContentPipeline as unified facade. Views don't need to know about 8+ managers. Separation of concerns done right.
  • Health monitoring with self-healing. App monitors itself and takes corrective action. Operational maturity prevents user-facing failures.
  • Content moderation with warning flags. Threaded comment system includes report workflows, admin review queue, and automatic flagging. Community management at scale.
  • Backwards compatibility with deprecated upload methods. @available markers guide toward atomic patterns without breaking existing uploads. Incremental refactoring done right.
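
The all-or-nothing upload pattern from the first lesson can be sketched like this. It is a generic Python illustration, not the app's Firebase code: `run_atomically` and the `(do, undo)` step shape are hypothetical names chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Each step pairs an action with the action that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_atomically(steps: Sequence[Step]) -> None:
    """Run every step, or roll back the completed ones if any step fails."""
    completed_undos: List[Callable[[], None]] = []
    try:
        for do, undo in steps:
            do()
            completed_undos.append(undo)
    except Exception:
        # Explicit rollback in reverse order, so nothing is left orphaned.
        for undo in reversed(completed_undos):
            try:
                undo()
            except Exception:
                # Best effort in this sketch; real code would log the failure
                # and queue the leftover file for later cleanup.
                pass
        raise
```

A typical sequence would be: upload the video, upload the thumbnail, write the metadata document. If the metadata write fails, the two uploaded files are deleted again and the original error is re-raised for the caller's retry logic.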

02

CakeGPT App

AI-Powered Restaurant Intelligence Platform

Menu Management
Inventory Management
Sales Analytics
Enterprise Sales
Employee Management
Business Insights

Built a full-stack restaurant management platform that talks to ChatGPT. Natural language queries, real-time POS sync, multi-supplier inventory, and enterprise analytics—all through conversational AI. Deployed as a Cloudflare Worker because edge computing is the future.

Cloudflare Edge

Deployment

MCP/JSON-RPC

Protocol

MongoDB DSL

Query Engine

Cloudflare Workers
React
TypeScript
MCP
JSON-RPC 2.0
MongoDB Query DSL
Chart.js

Menu Management System

Hierarchical navigation: Menu → Categories → Items → Modifiers → Ingredients. Full CRUD-ready with parent-child relationships and dynamic ingredient generation.

Hierarchical Structure

Multi-Supplier Inventory

Unified schema supporting SYSCO and USFOODS. MongoDB-style query syntax with operators ($eq, $gt, $contains, $or, $and) and nested field queries. Real-time stock tracking.

Unified Schema
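A compact sketch of how a MongoDB-style matcher with these operators and dotted field paths can be put together. The real engine runs in TypeScript on a Cloudflare Worker; this Python version only illustrates the idea, and the function names, the sample fields, and the case-insensitive behavior of `$contains` are assumptions.

```python
from typing import Any, Callable, Dict

_MISSING = object()  # sentinel for "field not present"

OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda value, expected: value == expected,
    "$gt": lambda value, expected: value > expected,
    # Assumed semantics: case-insensitive substring match.
    "$contains": lambda value, expected: str(expected).lower() in str(value).lower(),
}


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'prices.case.netPrice'."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def _field_matches(value: Any, condition: Any) -> bool:
    if not isinstance(condition, dict):
        # A bare value is shorthand for equality.
        return value is not _MISSING and value == condition
    for op, expected in condition.items():
        if op not in OPERATORS:
            raise ValueError(f"Unsupported operator: {op}")
        if value is _MISSING or not OPERATORS[op](value, expected):
            return False
    return True


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Return True when the document satisfies every clause of the query."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(doc, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif not _field_matches(get_path(doc, key), condition):
            return False
    return True
```

With this in place, a query like `{"prices.case.netPrice": {"$gt": 50}}` filters a list of inventory items with `[item for item in items if matches(item, query)]`. Matching on whole sub-documents is deliberately not handled here.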

Enterprise Sales Analytics

Multi-location, multi-region aggregation with advanced grouping. Payment breakdowns, order type analysis, top performers with percentage contribution. Analytics that drive decisions.

Multi-Location

Employee & Customer CRM

Employee performance metrics, shift scheduling, customer loyalty tiers, lifetime value tracking. All queryable through natural language.

CRM Integration

AI Business Insights

Categorized insights with actionable recommendations. Sales trends, price optimization, event-based forecasting. Because data is useless without context.

AI-Powered

Lessons & Real Talk

  • Built a MongoDB-style query engine from scratch. Not because I had to, but because I wanted operators like $gt, $contains, $or that work across nested fields. Because sometimes you need to query 'prices.case.netPrice > 50'.
  • Unified multi-supplier schema instead of separate handlers. SYSCO and USFOODS have different formats, but the frontend doesn't care. Abstraction layer handles it. Adding a third supplier? Minimal code changes.
  • Dynamic date generation: dates are relative to 'now', not hardcoded. getDate(-5) = 5 days ago. Demos always look current, no embarrassing 'last updated 2 years ago' moments.
  • 8-second timeout on external API calls. MCP has strict timeouts, and I've learned that external APIs will hang if you let them. Fail fast, fail gracefully.
  • Query validation with helpful errors. Instead of returning empty results, it tells you 'Invalid supplier(s): ACME. Available options: SYSCO, USFOODS.' Because debugging is easier when the system helps you.
  • Designed schemas for easy porting to Snowflake. Comments explicitly mention 'relational structure' because I know this will need to scale to real databases eventually.
  • Widget meta system abstracted into reusable functions. Because copying the same metadata pattern 10 times is how bugs happen. DRY isn't just a principle—it's survival.
  • Observability enabled from day one in wrangler.toml. Because when something breaks at 3 AM, you want logs. Not 'I'll add monitoring later.'
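
Two of the lessons above, relative dates and validation with helpful errors, are small enough to sketch together. This is illustrative Python, not the Worker's TypeScript: `get_date` mirrors the `getDate(-5)` idea, and its ISO-string return type and the optional `today` parameter are assumptions added to keep the example testable.

```python
from datetime import date, timedelta
from typing import List, Optional, Sequence

KNOWN_SUPPLIERS = ("SYSCO", "USFOODS")


def get_date(offset_days: int, today: Optional[date] = None) -> str:
    """Return an ISO date relative to 'now'; get_date(-5) is five days ago."""
    base = today if today is not None else date.today()
    return (base + timedelta(days=offset_days)).isoformat()


def validate_suppliers(requested: Sequence[str]) -> List[str]:
    """Normalize supplier names, or fail with an error that lists the valid options."""
    invalid = [name for name in requested if name.upper() not in KNOWN_SUPPLIERS]
    if invalid:
        raise ValueError(
            f"Invalid supplier(s): {', '.join(invalid)}. "
            f"Available options: {', '.join(KNOWN_SUPPLIERS)}."
        )
    return [name.upper() for name in requested]
```

For example, `get_date(-5, today=date(2024, 3, 10))` returns `"2024-03-05"`, and `validate_suppliers(["ACME"])` raises with the same message quoted in the lesson instead of quietly returning nothing.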

03

Enterprise ML & Research

Machine Learning Systems at Scale

Published Research
Research Methodology
Research Results
EEG Cross Decoding
EEG Analysis Results

The full arc: published research, neuroscience applications, and production ML that actually makes money. Because academic papers are nice, but shipping models that work is where it's at.

99.96%

Accuracy

$1M+

Revenue Impact

Published

Status

Python
XGBoost
MATLAB
SVM
Snowflake
Statistical Analysis
K-fold Validation

Published ML Research — Auto Test Set Predictor

Got something published. Peer-reviewed and everything. It's about ML validation—the boring stuff that prevents disasters.

Published & Peer-Reviewed

Neuroscience EEG Research

Found out memory encoding happens between neurons, not within them. The hypothesis was wrong, but that's what made it interesting.

Novel Discovery

Predictive Customer Health Scoring

XGBoost model that tells sales which customers are about to churn. 99.96% accuracy sounds impressive until you realize the 0.04% are the ones that matter.

99.96% Accuracy
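To make the "0.04% are the ones that matter" point concrete, here is a tiny worked example. The counts are made up for illustration and are not the project's data: 10,000 customers, 20 of whom actually churn, with the model missing 4 of them.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of actual churners the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical confusion-matrix counts, chosen only to illustrate the arithmetic.
TP, FP, TN, FN = 16, 0, 9980, 4

overall_accuracy = accuracy(TP, FP, TN, FN)  # 9996 / 10000
churn_recall = recall(TP, FN)                # 16 / 20
```

With these numbers accuracy comes out at 99.96% while recall on churners is only 80%. When churn is rare, the headline accuracy says very little about the customers the model exists to catch.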

Lessons & Real Talk

  • Chose XGBoost over neural networks for interpretability. Sales team needed to explain predictions. SHAP values > 'trust me, it's AI.'
  • Feature engineering took 3x longer than model training. The boring work is where accuracy comes from.
  • K-fold validation isn't optional when decisions affect customer relationships. One bad prediction = one angry client.
  • EEG research: hypothesis was wrong, and that's the discovery. Memory encoding between neurons, not within. 🧠
  • Built model drift monitoring before deployment. Watched 'high-accuracy' models silently degrade. Never again.
  • SHAP for feature importance. Stakeholders don't read confusion matrices; they read 'customer at risk because X.'
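
The write-up does not say how drift was monitored. One common approach, shown here purely as an assumed example, is the Population Stability Index (PSI) between a feature's training distribution and its recent production values. Function names and the bin count are illustrative.

```python
import bisect
import math
from typing import List, Sequence


def _bin_fractions(values: Sequence[float], edges: List[float]) -> List[float]:
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect.bisect_right(edges, value)] += 1
    total = len(values)
    # A small floor keeps empty bins from producing log(0).
    return [max(count / total, 1e-6) for count in counts]


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
) -> float:
    """PSI over equal-width bins derived from the baseline sample."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    if bins < 2:
        raise ValueError("bins must be at least 2")
    low, high = min(baseline), max(baseline)
    width = (high - low) / bins
    edges = [low + width * i for i in range(1, bins)]
    expected = _bin_fractions(baseline, edges)
    actual = _bin_fractions(current, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating. That threshold is a convention, not something taken from this project.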

04

CAKE Menu Maestro

Full-Stack AI Restaurant Platform → Direct POS Integration

Visual Menu Builder
AI-Powered Insights
Order Management
Invoice Processing

Take a photo of a menu, get a working POS system. That's the pitch. The reality involves Claude AI, embeddings, dual supplier APIs, and way too many edge cases. But it works, and restaurants are using it.

End-to-End

Pipeline

3 LLMs

AI Models

POS-Ready

Output

React
TypeScript
NestJS
Claude AI
OpenAI Embeddings
GraphQL
React Flow
Supabase

AI Menu Digitization

Claude Opus reads menus from photos. It's not perfect, but it's way better than typing everything manually. Auto-categorizes, finds modifiers, spots allergens—the whole deal.

Photo → Structured Data

Semantic Ingredient Matching

Embeddings that understand 'ground beef 80/20' and 'Angus patties' are basically the same thing. Regex can't do that. Rich context descriptions make the matching actually work.

Vector Embeddings
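A small sketch of threshold-based matching on embedding vectors, assuming cosine similarity is the metric (the write-up gives the 0.81 threshold but does not name the metric). The vectors in any example are toy values; real embeddings have far more dimensions, and `best_match` is a hypothetical helper name.

```python
import math
from typing import Dict, Optional, Sequence

MATCH_THRESHOLD = 0.81  # the tuned threshold mentioned in the lessons


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = MATCH_THRESHOLD,
) -> Optional[str]:
    """Return the most similar candidate at or above the threshold, else None."""
    best_name: Optional[str] = None
    best_score = threshold
    for name, vector in candidates.items():
        score = cosine_similarity(query, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` below the threshold is the important part: an ingredient with no confident match should go to a human for review rather than be attached to the nearest wrong product.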

Inventory Tracking That Actually Works

OCR invoices, detect suppliers automatically, fix pack sizes in the background. Because '1 case' means different things to different people, and inventory math matters.

Real-time Correction

AI Business Insights

Claude Opus 4.1 with web search for local events and weather. It can actually call Sysco/US Foods APIs mid-conversation. Real data, real recommendations, not generic advice.

Live Intelligence

Dual Supplier Integration

Sysco and US Foods APIs, both live. OAuth, GraphQL, the works. Built it extensible from day one because I knew a third supplier was coming.

Sysco + US Foods

Lessons & Real Talk

  • Multi-supplier architecture from day one. Adding a third supplier requires minimal code changes. I knew it was coming.
  • Rich text descriptions for embeddings include brand, category, pack size, price. Context matters: 'organic flour' vs 'all-purpose flour' need different matches.
  • Pack size normalization is where the money is. '1 case' means different things to different suppliers. Get this wrong and inventory is fiction. 📦
  • Background correction: invoice saves immediately, pack-size correction happens async. Users get instant feedback; data improves behind the scenes. 🎯
  • Food-item-only filtering: Claude excludes paper products automatically. Nobody needs 'Bounty paper towels' in ingredient inventory.
  • Tool-use architecture: Claude Opus 4.1 makes *real* Sysco/US Foods API calls mid-conversation. Live data, not canned responses.
  • Extended Thinking with 10K tokens. Weather + events + sales history = demand forecasting that works. Not just 'sell more burgers.'
  • Semantic matching at a 0.81 similarity threshold. Tuned by watching real workflows. High enough to avoid false positives, low enough to catch true matches that are worded differently.
  • Stable UUID generation based on content hashing. Menu items keep identical IDs across exports. POS systems won't duplicate items on re-sync.
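
Content-based IDs like the ones in the last lesson can be sketched with name-based UUIDs (version 5), which hash a namespace plus a string. This is an assumed approach shown in Python, not the platform's actual code: the namespace string, the choice of identifying fields, and `stable_item_id` are all hypothetical.

```python
import json
import uuid
from typing import Any, Dict, Sequence

# Hypothetical namespace; any fixed UUID works as long as it never changes.
MENU_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "menu.example.invalid")


def stable_item_id(
    item: Dict[str, Any],
    fields: Sequence[str] = ("category", "name"),  # assumed identifying fields
) -> str:
    """Derive the same UUID every time from the fields that define an item."""
    normalized = {f: str(item.get(f, "")).strip().lower() for f in fields}
    # Sorted keys and fixed separators keep the hashed string byte-for-byte stable.
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return str(uuid.uuid5(MENU_NAMESPACE, canonical))
```

Which fields go into the hash is the real design decision. Including price, for instance, would mint a new ID on every price change and bring the duplicate-on-re-sync problem right back.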

05

Siren Platform

Commissioned Full-Stack Application

Siren Platform Interface
Siren Game Screen
Siren Features

Built this for a music social startup. Spotify API integration, daily puzzles, OAuth token refresh logic that handles edge cases. The kind of project where you learn more from what breaks than what works.

Commissioned

Type

Spotify API

Integration

Daily

Updates

Node.js
JavaScript
Spotify API
Real-time Data
OAuth 2.0
Game Logic

Lessons & Real Talk

  • OAuth token refresh handles edge cases tutorials skip: expired tokens mid-request, race conditions, 3 AM silent failures.
  • Spotify API rate limits are per-endpoint, not global. Got 429'd on search while charts worked fine. Read the docs twice.
  • Caching layer invalidates intelligently. Daily charts don't need real-time; playlists do. Not all data is equal.
  • Timezone handling: server stores UTC, client converts. Sounds obvious until debugging why yesterday's puzzle loaded today. 🌍
  • Exponential backoff with jitter on external calls. When Spotify hiccups, synchronized retries make it worse.
  • Scope creep from 'Wordle for music' to full platform. Managed by shipping incrementally—MVP first, features later.
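
Exponential backoff with jitter, as mentioned above, looks roughly like this. The project itself is Node.js; this is a Python sketch of the "full jitter" variant, and the function name, defaults, and injectable `sleep` and `rng` parameters are assumptions made so the example can be tested.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base: float = 0.5,   # first backoff ceiling, in seconds
    cap: float = 30.0,   # upper bound on any single wait
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Retry an operation, waiting a random time up to an exponentially growing ceiling."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:
            # This sketch retries on any exception; real code would retry only
            # on retryable failures such as HTTP 429 or 5xx responses.
            last_error = error
            if attempt == attempts - 1:
                break
            ceiling = min(cap, base * (2 ** attempt))
            # Full jitter: a random wait in [0, ceiling) spreads clients apart
            # so they do not all retry at the same instant.
            sleep(rng() * ceiling)
    assert last_error is not None
    raise last_error
```

Without the random factor, every client that failed at the same moment retries at the same moment too, which is exactly the synchronized pile-up the lesson warns about.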

Get in Touch

Let's talk

Always open to interesting projects, new opportunities, or just chatting about AI/ML, computer vision, or the latest thing that broke in production. Remote work preferred, but I'll travel for the right team.

Based in Tampa, Florida — but honestly, I'm usually just wherever my laptop is.