End‑to‑end toolkit for collecting hand‑gesture landmark data, training gesture‑recognition models, and exporting ready‑to‑deploy ONNX weights.
GestureForge lets you quickly record palm/arm landmarks for static and dynamic gestures, train a model (Random‑Forest for static gestures, GRU for dynamic ones), sanity‑check the result, and export the trained model as portable ONNX weights for deployment across C++, web, and edge environments.
GestureForge/
│   .gitignore
│   LICENSE                               # MIT or Apache 2.0 (your pick)
│   README.md                             # short project summary
│   requirements.txt                      # exact library versions
│
├── ImageDataGenerator_Training_Inference/
│   ├── data_collect.py                   # record single-frame samples
│   ├── train.py                          # RandomForest pipeline
│   ├── inference.py                      # quick accuracy sanity-check
│   ├── outputs/                          # pickles + encoders after each run
│   └── utils/                            # detector utilities
│       ├── arm_detect.py                 # arm landmark detector
│       ├── hand_detect.py                # hand landmark fusion
│       ├── palm_detect.py                # palm landmark detector
│       └── pose_landmarker_heavy.task    # MediaPipe pose landmarker model
│
└── VideoDataGenerator_Training_Inference/
    ├── data_collect.py                   # record multi-frame sequences
    ├── train.py                          # GRU training driven by YAML
    ├── inference.py                      # sanity-check sequence model
    ├── train_config.yaml                 # epochs, hidden_size, etc.
    ├── outputs/                          # *.pt, pickles, CSV metadata
    └── utils/                            # same detector modules as above
        ├── arm_detect.py                 # arm landmark detector
        ├── hand_detect.py                # hand landmark fusion
        ├── palm_detect.py                # palm landmark detector
        ├── trainer.py                    # training-loop classes
        └── pose_landmarker_heavy.task    # MediaPipe pose landmarker model
python3.11 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cd ImageDataGenerator_Training_Inference
# (1) Record samples of label "okay" using only palm landmarks
python data_collect.py okay -r
# (2) Train Random‑Forest (defaults)
python train.py
# (3) Sanity‑check inference
python inference.py
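What the static `train.py` stage does can be sketched as follows. This is a hedged illustration, not the actual script: the synthetic data, the 63-feature vector size (21 landmarks × 3 coordinates), and the output file names are assumptions; only the use of a default-parameter `RandomForestClassifier` plus a pickled label encoder reflects the pipeline described here.

```python
import pickle
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Stand-in for recorded palm-landmark samples: one row per sample,
# 63 values per frame (21 landmarks x 3 coordinates) -- illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 63))
y = np.array(["okay", "wave"] * 100)

enc = LabelEncoder()
y_enc = enc.fit_transform(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y_enc, test_size=0.2, random_state=0)
clf = RandomForestClassifier()             # default hyper-parameters
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")

# Persist model + encoder, as the pipeline's outputs/ folder does
with open("model.pkl", "wb") as f:
    pickle.dump(clf, f)
with open("label_encoder.pkl", "wb") as f:
    pickle.dump(enc, f)
```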
cd VideoDataGenerator_Training_Inference
# (1) Record sequences (palm + arm landmarks, 60 frames each)
python data_collect.py clap -r
# (2) Configure training
vim train_config.yaml
# (3) Train GRU
python train.py # writes model.pt & label_encoder.pkl
# (4) Sanity‑check inference
python inference.py
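Sequence inference needs a rolling window of frames before the model can produce a label. A minimal sketch of that buffering logic, assuming 60-frame windows and a per-frame vector of 63 values; the `predict` function is a hypothetical stand-in for the trained GRU, not the repository's API:

```python
from collections import deque
import numpy as np

SEQ_LEN = 60      # frames per sequence, matching data_collect.py
N_FEATURES = 63   # illustrative per-frame landmark vector size

buffer = deque(maxlen=SEQ_LEN)  # oldest frame drops out automatically

def predict(seq):
    """Hypothetical stand-in for the trained sequence model."""
    assert seq.shape == (1, SEQ_LEN, N_FEATURES)
    return "clap"

def on_frame(landmarks):
    """Feed one frame of landmarks; return a label once the window is full."""
    buffer.append(landmarks)
    if len(buffer) < SEQ_LEN:
        return None                       # still warming up
    seq = np.stack(buffer)[None, ...]     # (1, seq_len, features)
    return predict(seq)

# Simulate a stream of frames
result = None
for _ in range(SEQ_LEN):
    result = on_frame(np.zeros(N_FEATURES))
```

Because the deque has `maxlen=SEQ_LEN`, every frame after warm-up yields a fresh prediction over the most recent 60 frames.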
MediaPipe (raw landmarks)
  └─ palm_detect → palm tensor (scaled + normalised)
  └─ arm_detect  → arm tensor (same normalisation)
  └─ hand_detect (fusion + relative geometry)
       → final hand vector    # used by trainer / inference
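The "scaled + normalised" step can be sketched as wrist-relative, scale-invariant coordinates, a common recipe for landmark features. This is an assumption about the transform; the exact normalisation used by palm_detect/arm_detect may differ:

```python
import numpy as np

def normalise_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """Translate landmarks so the wrist (index 0) is the origin,
    then scale so the largest distance from the wrist is 1."""
    rel = landmarks - landmarks[0]            # wrist-relative geometry
    scale = np.linalg.norm(rel, axis=1).max()
    if scale > 0:
        rel = rel / scale                     # scale-invariant
    return rel.flatten()                      # final feature vector

# 21 MediaPipe hand landmarks, 3 coordinates each
hand = np.random.default_rng(1).uniform(size=(21, 3))
vec = normalise_landmarks(hand)
```

Normalising this way makes the features invariant to where the hand sits in the frame and how close it is to the camera.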
The static pipeline trains a sklearn.ensemble.RandomForestClassifier with default hyper‑parameters. The dynamic pipeline trains a GRU on tensors of shape (batch, seq_len, features), configured via train_config.yaml:
data:
  batch_size: 2
model:
  bidirectional_gru: false
  gru_hidden_size: 32
  gru_num_layers: 2
training:
  epochs: 100
  learning_rate: 1e-3
  early_stopping_accuracy_thresh: false
  early_stopping_toll: 4
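A minimal sketch of how the early-stopping keys could behave, assuming `early_stopping_toll` acts as a patience counter and `early_stopping_accuracy_thresh` as an optional accuracy cutoff; the real semantics live in `utils/trainer.py`, and the fake accuracy curve here is purely illustrative:

```python
def train_loop(epochs=100, patience=4, acc_thresh=None):
    """Toy loop: stops after `patience` epochs without validation-accuracy
    improvement, or as soon as `acc_thresh` is reached (when set)."""
    best, stale, history = 0.0, 0, []
    # Fake validation-accuracy curve that plateaus at 0.9
    curve = [min(0.1 * (e + 1), 0.9) for e in range(epochs)]
    for epoch, acc in enumerate(curve):
        history.append(acc)
        if acc_thresh is not None and acc >= acc_thresh:
            return epoch + 1, history          # accuracy threshold reached
        if acc > best:
            best, stale = acc, 0               # improvement: reset patience
        else:
            stale += 1
            if stale >= patience:
                return epoch + 1, history      # patience exhausted
    return epochs, history

ran_epochs, hist = train_loop()
```

With the plateauing curve above, the loop stops a few epochs after accuracy stops improving instead of running all 100 epochs.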
Both pipelines include an inference.py script that reproduces the training‑time preprocessing and runs the freshly trained model, so you can confirm the data and training are sound before downstream deployment.
Contributions are welcome! Please fork the repository and submit a pull request for any enhancements or bug fixes. For more details and updates, visit the GitHub Repository.
GestureForge is built as a modular utility framework to streamline recording, training, and validating hand‑gesture recognition models from landmark data. While the current implementation focuses on preparing models for accuracy verification and proof‑of‑concept testing, it lays the foundation for several future extensions.