Active · Internal tool · Built & maintained 2024–present

Vision Lab

The data pipeline behind a card recognition model — collecting, curating, and labelling training images at scale.

The problem

Card recognition — identifying a Pokémon card from a photo — is a genuinely hard computer vision problem. Cards share visual structure (same border style, same layout), differ in subtle ways (artwork, set symbol, card number, holographic pattern), and come in hundreds of variant printings that a model needs to distinguish reliably.

Training a model that performs well requires a large, clean, well-labelled dataset. Assembling that dataset — collecting images, verifying they map to the correct card and variant, normalising quality, and structuring labels in a format the training pipeline can consume — is its own substantial engineering problem.

Vision Lab is the tooling that makes that dataset assembly tractable. It's not the model itself — it's the infrastructure for building and maintaining the training data that the model learns from.

Approach

The core challenge is that training data quality matters more than quantity. A dataset of 100,000 poorly labelled or inconsistently cropped images produces a worse model than 20,000 carefully curated ones. Vision Lab is designed around that constraint — making it practical to collect images at volume while applying enough structure and verification to keep quality high.

We treat the dataset as a product: versioned, auditable, with clear provenance for every image. When the model performs badly on a specific card type, we can trace back to the training data for those cards, identify the issue, and fix it — rather than treating the dataset as an opaque blob that occasionally gets added to.
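The "dataset as a product" idea can be sketched as a provenance record per image plus a trace-back query. This is a minimal illustration, not the production schema; the field names and example card IDs are assumptions.

```python
from dataclasses import dataclass

# Hypothetical provenance record -- field names are illustrative,
# not the actual Vision Lab schema.
@dataclass(frozen=True)
class ImageProvenance:
    image_id: str
    card_id: str   # TCGDex card ID the image is labelled with
    source: str    # where the image came from
    reviewer: str  # who verified the label
    snapshot: str  # dataset snapshot the image first appeared in

def trace_card(records: list[ImageProvenance], card_id: str) -> list[ImageProvenance]:
    """All training images behind one card: the starting point for
    debugging poor model performance on that card."""
    return [r for r in records if r.card_id == card_id]

records = [
    ImageProvenance("img-001", "swsh4-025", "user-upload", "alice", "v12"),
    ImageProvenance("img-002", "swsh4-025", "partner-feed", "bob", "v12"),
    ImageProvenance("img-003", "base1-4", "user-upload", "alice", "v11"),
]
charizard_images = trace_card(records, "swsh4-025")
```

Because every image carries its source, reviewer, and snapshot, a bad prediction on one card becomes a bounded audit over a handful of records rather than a search through an opaque blob.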

Key decisions
Canonical image identity from TCGDex — every training image is keyed to a TCGDex card ID, so labels are precise to the exact printing and variant
Curated over scraped — images are sourced and verified rather than bulk-scraped, which produces cleaner labels and fewer edge cases
Versioned dataset — each training run uses a snapshot of the dataset, so model performance can be traced to specific data states
Structured label format — labels include card ID, set, variant type, and image quality metadata, not just a class name
Review queue with pass/reject — every image goes through a verification step before entering the training set
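The structured label format above might look roughly like the following. This is a hedged sketch: the field names, example card ID, and quality keys are illustrative assumptions, not the exact production schema.

```python
# A sketch of a structured label -- card ID, set, variant type, and
# quality metadata, not just a class name. Field names are assumptions.
label = {
    "card_id": "swsh12-160",    # canonical TCGDex card ID (hypothetical example)
    "set": "swsh12",
    "variant": "reverse_holo",  # variant type is part of the label
    "quality": {                # per-image quality metadata
        "glare": 0.12,
        "blur": 0.05,
        "crop_alignment": 0.97,
    },
    "status": "pass",           # outcome of the review queue
}

def class_key(label: dict) -> str:
    """Training class = card ID + variant, so each variant printing
    is a distinct class the model must separate."""
    return f"{label['card_id']}:{label['variant']}"
```

Deriving the training class from card ID plus variant is what makes variant-level recognition possible downstream: two printings of the same card never collapse into one class.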
What was built

Vision Lab has two main surfaces: an image collection and labelling interface, and a dataset management layer that produces the structured exports the training pipeline consumes.

Image intake and validation — accepts card image submissions, checks format and resolution, extracts metadata
Labelling interface — reviewers assign canonical card IDs to submitted images, confirm variant type, flag quality issues
Side-by-side reference view — submitted image shown alongside the canonical TCGDex reference image to verify correct identification
Quality scoring — each image receives metadata on crop alignment, lighting, glare, and focus that downstream training jobs can use as sample weights
Dataset versioning — point-in-time snapshots of the labelled dataset for reproducible training runs
Export pipeline — produces structured label files in formats compatible with common training frameworks
Coverage dashboard — shows which cards and variants have sufficient training examples and which are underrepresented
Contribution tracking — records image sources and reviewer decisions for provenance and audit
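The dataset versioning idea can be sketched as a content-addressed snapshot: a point-in-time export whose identity is a hash of the labelled data, so a training run can name the exact data state it consumed. A minimal sketch, assuming a list-of-dicts label store; the function and field names are illustrative.

```python
import hashlib
import json

# Minimal sketch of point-in-time snapshotting: a snapshot is an
# immutable, content-addressed export of the labelled set.
def snapshot(labels: list[dict]) -> dict:
    payload = json.dumps(
        sorted(labels, key=lambda l: l["image_id"]),  # stable ordering
        sort_keys=True,
    ).encode()
    return {
        "digest": hashlib.sha256(payload).hexdigest(),  # identifies the data state
        "count": len(labels),
        "labels": payload.decode(),
    }

a = snapshot([{"image_id": "img-001", "card_id": "swsh4-025"}])
b = snapshot([{"image_id": "img-001", "card_id": "swsh4-025"}])
```

Two snapshots of the same data state share a digest, which is what lets model performance be traced back to a specific, reproducible version of the dataset.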
What was hard

Variant-level label precision

Pokémon cards have many variants: base, reverse holofoil, full art, alternate art, Poké Ball pattern, Master Ball pattern. A model that can identify 'Charizard' but not 'Charizard Poké Ball pattern reverse holofoil' is only partially useful. Getting labels to variant granularity required that the labelling interface understand the card data model well enough to present the right options — which means it's tightly coupled to the TCGDex dataset structure.
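The coupling described above can be sketched as a lookup that constrains the labelling UI to the variants that actually exist for a given printing. In Vision Lab this data comes from TCGDex; here it is hard-coded, and the card IDs and variant names are hypothetical examples.

```python
# Hypothetical variant table -- in practice derived from the TCGDex
# dataset; hard-coded here to illustrate the coupling.
VARIANTS_BY_CARD = {
    "swsh12-160": ["normal", "reverse_holo", "poke_ball", "master_ball"],
    "base1-4": ["holo"],
}

def variant_options(card_id: str) -> list[str]:
    """Only present the variants that exist for this printing, so a
    reviewer cannot assign an impossible label."""
    if card_id not in VARIANTS_BY_CARD:
        raise KeyError(f"unknown card: {card_id}")
    return VARIANTS_BY_CARD[card_id]
```

Constraining the options at entry time is cheaper than validating labels after the fact, but it is also why the interface has to track the TCGDex data model so closely.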

Defining 'good enough' image quality

Not every image in a training set needs to be perfect — some variation in lighting, angle, and quality is actually useful for making the model robust. But there are thresholds: too much glare makes holographic patterns unreadable, too much blur makes text illegible, too much crop removes identifying features. Defining these thresholds in a way that reviewers could apply consistently, and encoding them as structured metadata rather than a binary pass/fail, took iteration.
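Encoding those thresholds as structured metadata rather than a binary pass/fail might look like the sketch below. The threshold values and field names are placeholders; as the text notes, the real cut-offs were tuned by iteration.

```python
# Illustrative thresholds -- placeholder numbers, not the tuned values.
THRESHOLDS = {"glare": 0.6, "blur": 0.5, "crop_loss": 0.3}

def assess(scores: dict) -> dict:
    """Turn raw quality scores into structured metadata plus a sample
    weight, instead of a single pass/fail bit."""
    flags = [k for k, limit in THRESHOLDS.items() if scores.get(k, 0.0) > limit]
    usable = not flags
    # Mild imperfection lowers the training weight rather than
    # excluding the image outright.
    weight = 0.0 if not usable else 1.0 - max(scores.values(), default=0.0)
    return {"usable": usable, "flags": flags, "sample_weight": round(weight, 2)}
```

The structured form preserves useful variation (a slightly glary image still trains the model, at reduced weight) while hard failures are flagged with the specific reason, which keeps reviewer decisions consistent and auditable.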

Coverage gaps and class imbalance

Common cards from recent sets are easy to collect images for. Older cards, regional variants, and lower-print-run promos are hard to find in sufficient quantity. A model trained on an imbalanced dataset will perform well on common cards and poorly on rare ones — which is the opposite of what's useful. The coverage dashboard exists specifically to surface these gaps so we can prioritise targeted collection rather than just adding more images of cards that are already well represented.
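The check behind the coverage dashboard reduces to counting examples per card-and-variant class and flagging those below a minimum. A minimal sketch; the threshold and data shape are assumptions.

```python
from collections import Counter

# Illustrative minimum-examples threshold -- an assumption, not the
# production value.
MIN_EXAMPLES = 3

def coverage_gaps(labels: list[tuple[str, str]]) -> list[str]:
    """Return card:variant classes with fewer than MIN_EXAMPLES images,
    i.e. the classes worth prioritising for targeted collection."""
    counts = Counter(f"{card}:{variant}" for card, variant in labels)
    return sorted(cls for cls, n in counts.items() if n < MIN_EXAMPLES)

labels = [("swsh4-025", "normal")] * 5 + [("base1-4", "holo")]
gaps = coverage_gaps(labels)
```

Surfacing the underrepresented classes directly is what lets collection effort go to rare cards instead of piling more images onto classes that are already saturated.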

Keeping the dataset consistent as cards change

TCGDex occasionally corrects card data — a card number that was wrong, a variant relationship that was mismodelled. When that happens, training images labelled against the old data need to be updated. Building the dataset with card IDs as the primary key (rather than human-readable names or numbers) means these corrections can be propagated systematically rather than requiring manual re-labelling.
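Because labels key on card IDs rather than names or numbers, propagating an upstream correction becomes a single systematic pass over the dataset. A sketch under those assumptions; the function and field names are illustrative.

```python
# Sketch of propagating an upstream ID correction across the dataset
# in one pass -- no manual re-labelling. Names are illustrative.
def remap_labels(labels: list[dict], id_corrections: dict[str, str]) -> list[dict]:
    """Rewrite any label whose card ID appears in the corrections map;
    leave everything else untouched."""
    return [
        {**label, "card_id": id_corrections.get(label["card_id"], label["card_id"])}
        for label in labels
    ]

labels = [
    {"image_id": "img-001", "card_id": "swsh4-25"},
    {"image_id": "img-002", "card_id": "base1-4"},
]
fixed = remap_labels(labels, {"swsh4-25": "swsh4-025"})
```

Corrections to card data that leave the ID unchanged are even cheaper: the labels never stored the corrected fields, so nothing in the training set needs to move at all.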

Stack
Frontend — Next.js · React — labelling and review interface
Backend — Node.js · REST API
Card data — TCGDex — canonical card and variant reference
Storage — Cloud storage — image store with versioned snapshots
Database — PostgreSQL — labels, provenance, quality metadata
ML export — Python — structured label export for training frameworks
Auth — Internal only — reviewer access control
Outcomes
Labelled training dataset with variant-level precision keyed to TCGDex card IDs
Reproducible training runs via versioned dataset snapshots
Coverage dashboard identifying underrepresented cards and variants
Provenance record for every image — source, reviewer, quality metadata
Export pipeline producing training-framework-compatible label files

Vision Lab is internal tooling. The card recognition model it trains is an ongoing project — the dataset and tooling are the current focus.