End‑to‑end toolkit for collecting hand‑gesture landmark data, training gesture‑recognition models, and exporting ready‑to‑deploy ONNX weights.
GestureForge lets you quickly record palm/arm landmarks for both static and dynamic gestures, train a model (Random Forest for static gestures, a GRU for dynamic ones), sanity‑check the result, and export the trained model as portable ONNX weights for deployment across C++, web, and edge environments.
GestureForge/
│ .gitignore
│ LICENSE # MIT or Apache 2.0 (your pick)
│ README.md # short project summary
│ requirements.txt # exact library versions
│
├── ImageDataGenerator_Training_Inference/
│ ├── data_collect.py # record single‑frame samples
│ ├── train.py # RandomForest pipeline
│ ├── inference.py # quick accuracy sanity‑check
│ ├── outputs/ # pickles + encoders after each run
│ └── utils/ # utilities
│ ├── arm_detect.py # arm landmark extraction
│ ├── hand_detect.py # hand landmark fusion
│ ├── palm_detect.py # palm landmark extraction
│ └── pose_landmarker_heavy.task # MediaPipe pose‑landmarker model
│
└── VideoDataGenerator_Training_Inference/
├── data_collect.py # record multi‑frame sequences
├── train.py # GRU training driven by YAML
├── inference.py # sanity‑check sequence model
├── train_config.yaml # epochs, hidden_size, etc.
├── outputs/ # *.pt, pickles, CSV metadata
└── utils/ # same detector modules as above
├── arm_detect.py # arm landmark extraction
├── hand_detect.py # hand landmark fusion
├── palm_detect.py # palm landmark extraction
└── trainer.py # trainer classes
python3.11 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cd ImageDataGenerator_Training_Inference
# (1) Record samples of the label "okay" using only palm landmarks
python data_collect.py okay -r
# (2) Train Random‑Forest (defaults)
python train.py
# (3) Sanity‑check inference
python inference.py
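
For reference, here is a minimal sketch of what the static sanity check does: load the trained Random Forest and its label encoder, then predict a label from one flattened landmark vector. The file names under `outputs/` and the 63‑dim feature size are assumptions for illustration, not the exact artifacts written by `train.py`:

```python
# Minimal sketch -- file names and feature size are assumptions,
# not the exact objects produced by train.py.
import pickle
import numpy as np

with open("outputs/model.pkl", "rb") as f:          # trained RandomForest
    model = pickle.load(f)
with open("outputs/label_encoder.pkl", "rb") as f:  # sklearn LabelEncoder
    encoder = pickle.load(f)

# One flattened hand vector, e.g. produced by utils/hand_detect.py
features = np.random.rand(1, 63)  # placeholder: 21 landmarks x (x, y, z)

pred = model.predict(features)
print(encoder.inverse_transform(pred))  # e.g. ["okay"]
```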
cd VideoDataGenerator_Training_Inference
# (1) Record sequences (palm + arm landmarks, 60 frames each)
python data_collect.py clap -r
# (2) Configure training
vim train_config.yaml
# (3) Train GRU
python train.py # writes model.pt & label_encoder.pkl
# (4) Sanity‑check inference
python inference.py
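
The sequence model classifies a whole buffer of frames at once. A hedged sketch of loading `model.pt` and `label_encoder.pkl` and scoring one 60‑frame sequence (the checkpoint format and per‑frame feature size are assumptions):

```python
# Sketch only: checkpoint contents and feature size are assumptions,
# not the exact objects written by train.py.
import pickle
import torch

model = torch.load("outputs/model.pt", weights_only=False)  # trained GRU classifier
model.eval()

with open("outputs/label_encoder.pkl", "rb") as f:
    encoder = pickle.load(f)

# One sequence: (batch=1, seq_len=60 frames, features per frame)
seq = torch.randn(1, 60, 63)  # placeholder feature size

with torch.no_grad():
    logits = model(seq)
label = encoder.inverse_transform(logits.argmax(dim=1).numpy())
print(label)  # e.g. ["clap"]
```

Both pipelines build their feature vectors through the same detector chain: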
MediaPipe (raw landmarks)
└─ palm_detect → palm tensor (scaled + normalised)
└─ arm_detect → arm tensor (same normalisation)
└─ hand_detect (fusion + relative geometry)
→ final hand vector # used by trainer / inference
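
The "scaled + normalised" step typically means re‑centring the landmarks on the wrist and dividing by the hand's extent, so the feature vector is invariant to position and camera distance. A generic sketch of that idea (not the exact code in `palm_detect.py`):

```python
# Generic landmark normalisation sketch -- illustrates the technique,
# not the exact logic inside utils/palm_detect.py.
import numpy as np

def normalise_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """landmarks: (21, 3) array of MediaPipe hand landmarks."""
    centred = landmarks - landmarks[0]             # wrist (landmark 0) as origin
    scale = np.linalg.norm(centred, axis=1).max()  # largest distance from wrist
    return (centred / (scale + 1e-8)).flatten()    # -> (63,) feature vector
```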
The static pipeline fits a `sklearn.ensemble.RandomForestClassifier` with default hyper‑parameters. The dynamic pipeline trains a GRU on batches of shape `(batch, seq_len, features)`, driven by `train_config.yaml`:
data:
batch_size: 2
model:
bidirectional_gru: false
gru_hidden_size: 32
gru_num_layers: 2
training:
epochs: 100
learning_rate: 1e-3
early_stopping_accuracy_thresh: false
early_stopping_toll: 4
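
To make those keys concrete, an illustrative PyTorch module wired up from the same names might look like the sketch below; the real classes live in `utils/trainer.py`, and the input size and class count here are assumed:

```python
# Illustrative GRU classifier matching the YAML keys above;
# the actual implementation lives in utils/trainer.py.
import torch
import torch.nn as nn

class GestureGRU(nn.Module):
    def __init__(self, input_size=63, hidden_size=32, num_layers=2,
                 num_classes=5, bidirectional=False):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_layers,
                          batch_first=True, bidirectional=bidirectional)
        out_dim = hidden_size * (2 if bidirectional else 1)
        self.fc = nn.Linear(out_dim, num_classes)

    def forward(self, x):           # x: (batch, seq_len, features)
        out, _ = self.gru(x)        # out: (batch, seq_len, out_dim)
        return self.fc(out[:, -1])  # classify from the last time step
```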
Both pipelines include an `inference.py` script that reproduces the training‑time preprocessing and runs the freshly trained model, confirming that the data and training are sound before downstream deployment.
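
The repository's exact ONNX export entry point isn't shown in this section, but for the GRU a standard `torch.onnx.export` call would look roughly like this (paths, shapes, and axis names are assumptions):

```python
# Hedged sketch of ONNX export -- paths, shapes, and axis names
# are assumptions, not a script shipped with the repo.
import torch

model = torch.load("outputs/model.pt", weights_only=False)
model.eval()

dummy = torch.randn(1, 60, 63)  # (batch, seq_len, features)
torch.onnx.export(
    model, dummy, "gesture_gru.onnx",
    input_names=["landmarks"], output_names=["logits"],
    dynamic_axes={"landmarks": {0: "batch"}, "logits": {0: "batch"}},
)
```

The Random Forest side can be converted with a tool such as `skl2onnx` (`convert_sklearn`), which targets the same runtimes.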
Contributions are welcome! Please fork the repository and submit a pull request for any enhancements or bug fixes. For more details and updates, visit the GitHub Repository.