SmartPotato
Project Overview
SmartPotato is an AI-assisted video system for home or small-site security—think continuous recording, motion-aware clips, and search—without requiring a powerful machine or a paid cloud service.
Why “Potato”? The idea is simple: if it runs well on an old or low-end PC (“potato-grade” hardware), it can run almost anywhere. The aim is to make useful, privacy-respecting video review accessible, not locked behind expensive gear or mandatory subscriptions.
What it does
Instead of rewinding through hours of clips, you can search in everyday language—for example “person in a red shirt” or “delivery truck”—and get ranked moments from your saved recordings, similar to how you might search a photo library, but for your own footage stored locally.
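To make "ranked moments" a bit more concrete, here is a minimal sketch of one common way to rank clips against a text query. It assumes the query and each saved clip have already been turned into vectors in a shared embedding space, then sorts by cosine similarity. This overview does not say how SmartPotato actually ranks results, so treat the function names (`cosine_similarity`, `rank_moments`), the clip IDs, and the toy vectors as hypothetical illustrations rather than the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_moments(query_vec, index, top_k=3):
    """Return the top_k (clip_id, score) pairs, best match first.

    `index` maps a clip ID to its embedding vector (hypothetical layout).
    """
    scored = [(cosine_similarity(query_vec, vec), clip_id)
              for clip_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(clip_id, round(score, 3)) for score, clip_id in scored[:top_k]]


# Toy data: made-up 3-dimensional vectors standing in for real embeddings.
toy_index = {
    "clip_0001": [0.9, 0.1, 0.0],
    "clip_0002": [0.1, 0.8, 0.2],
    "clip_0003": [0.7, 0.3, 0.1],
}
toy_query = [1.0, 0.0, 0.0]  # stands in for an embedded text query

results = rank_moments(toy_query, toy_index, top_k=2)
print(results)
```

With these toy values, `clip_0001` ranks ahead of `clip_0003`, and `clip_0002` falls outside the top two. In a real setup the vectors would come from a text-and-image embedding model, and the index would live on local disk.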
The app records short clips when motion and simple category filters suggest something worth keeping, then builds a search index so those moments are easy to find later. Once models are set up, it can run without relying on the cloud for day-to-day use.
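The "motion plus simple category filter" decision described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's actual pipeline: it uses plain frame differencing on tiny grayscale frames, and the thresholds, the label set, and the names `motion_fraction` and `should_keep_clip` are all hypothetical.

```python
def motion_fraction(prev_frame, curr_frame, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold.

    Frames are lists of rows of grayscale values (0-255), same shape.
    """
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def should_keep_clip(prev_frame, curr_frame, detected_labels,
                     wanted_labels=frozenset({"person", "package"}),
                     min_motion=0.02):
    """Keep a clip only if there is enough motion AND a wanted category was detected.

    `detected_labels` stands in for the output of an object detector;
    the label names and thresholds here are assumptions for illustration.
    """
    has_motion = motion_fraction(prev_frame, curr_frame) >= min_motion
    has_wanted = bool(wanted_labels & set(detected_labels))
    return has_motion and has_wanted


# Toy 4x4 frames: a static background, then two pixels brighten sharply.
background = [[10] * 4 for _ in range(4)]
with_motion = [row[:] for row in background]
with_motion[1][1] = 200
with_motion[1][2] = 200

print(should_keep_clip(background, with_motion, ["person"]))  # motion + wanted label
print(should_keep_clip(background, with_motion, ["cat"]))     # motion, unwanted label
print(should_keep_clip(background, background, ["person"]))   # wanted label, no motion
```

Checking cheap motion first and only then consulting a detector is a typical way to keep CPU use low on modest hardware, which fits the "potato-grade" goal, though the real app may order or combine these checks differently.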
What makes it stand out
- Built for real hardware you might already own — designed with modest CPUs and edge-style deployment in mind, not only datacenter GPUs.
- Natural-language search over your recordings, so review feels closer to “ask a question” than “scrub a timeline.”
- Local-first mindset — your footage and search stay under your control; no subscription required for the core idea.
- Open source — code, training tooling, and documentation are on GitHub for others to learn from, reuse, and improve.
Training drew on broad public datasets and custom tooling so the detector stays practical for home-security–style scenes. There is also a related open dataset on Hugging Face for research and reuse:
- person-face-package-home-security-detection — licensing, splits, and how to cite are on the dataset page.
Open source and license
The project is published on GitHub under AGPL-3.0 (see the repo for full terms and third-party notices). Contributions are welcome.
What’s next
In the near term I’m focusing on other work (for example, my capstone). Down the road I’d like to add richer search tuned for typical CCTV-style video, plus support for more than one camera in a single setup.
Gallery
Project walkthrough: