Research Models

GRU-Based Sequence-to-Sequence Model

Using MediaPipe

Leveraging MediaPipe's landmark detection capabilities, this model achieves 99% testing accuracy across 6 distinct action classes. It supports basic conversational signs, 'hello', 'how are you', 'sorry', 'welcome', and 'thank you', along with a 'blank' class for no action.

Key Features:
• Landmark extraction from student video data
• Ready-to-use model implementation
• Depth detection capabilities
• ChatGPT integration for sentence construction
• Automated sign processing

99% Testing Accuracy
6 Action Classes
Real-time Processing
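The landmark-extraction stage can be sketched as follows. This is a minimal illustration, not the project's actual code: the landmark counts match MediaPipe Holistic (33 pose points, 21 per hand), but the function name, the zero-filling for missed detections, and the 30-frame clip length are assumptions.

```python
import numpy as np

# Landmark counts used by MediaPipe Holistic: 33 pose points, 21 per hand.
POSE_POINTS, HAND_POINTS = 33, 21

def extract_keypoints(pose, left_hand, right_hand):
    """Flatten one frame's landmarks into a fixed-length feature vector.

    Each argument is an array of (x, y, z) coordinates, or None when the
    detector found nothing in that frame; missing parts are zero-filled so
    every frame yields the same vector length (the 'blank' case).
    """
    def flatten(landmarks, n_points):
        if landmarks is None:
            return np.zeros(n_points * 3)
        return np.asarray(landmarks, dtype=float).reshape(-1)

    return np.concatenate([
        flatten(pose, POSE_POINTS),        # 99 values
        flatten(left_hand, HAND_POINTS),   # 63 values
        flatten(right_hand, HAND_POINTS),  # 63 values
    ])

# One frame -> a 225-dim vector; a 30-frame clip becomes a (30, 225)
# sequence that can be fed to the GRU sequence model.
frame = extract_keypoints(np.zeros((POSE_POINTS, 3)), None, np.ones((HAND_POINTS, 3)))
print(frame.shape)  # (225,)
```

Zero-filling missing detections keeps every frame the same length, which is what lets a fixed-size recurrent model consume the sequence directly.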

ConvLSTM-Based Model

Hybrid Architecture

A hybrid approach combining convolutional layers with LSTM cells for video classification. The architecture balances spatial and temporal feature extraction and was trained extensively on our custom dataset.

Key Features:
• Hybrid CNN-LSTM architecture
• Comprehensive video classification
• CPU-optimized training process
• Efficient model storage system
• Real-time prediction capabilities

8 Hours Training
Optimized Architecture
Real-time Inference
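The spatial-then-temporal split behind the hybrid architecture can be sketched in plain NumPy. Everything here is illustrative: the average-pooling stand-in for the convolutional stage, the layer sizes, and the 16-frame clip are all assumptions, not the trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MiniLSTMCell:
    """A single LSTM cell: input/forget/output gates plus candidate state."""
    def __init__(self, in_dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix covering all four gates.
        self.W = rng.normal(scale=0.1, size=(4 * hidden, in_dim + hidden))
        self.b = np.zeros(4 * hidden)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g          # update cell state
        h = o * np.tanh(c)         # emit hidden state
        return h, c

def spatial_features(frame, pool=8):
    """Stand-in for the convolutional stage: average-pool the frame into a
    coarse grid and flatten it to a per-frame feature vector."""
    h, w = frame.shape
    grid = frame[: h - h % pool, : w - w % pool]
    grid = grid.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))
    return grid.reshape(-1)

# Per-frame spatial features feed the recurrent cell frame by frame;
# the final hidden state summarizes the whole clip for classification.
video = np.random.default_rng(1).random((16, 64, 64))  # 16 grayscale frames
cell = MiniLSTMCell(in_dim=64, hidden=32)
h, c = np.zeros(32), np.zeros(32)
for frame in video:
    h, c = cell.step(spatial_features(frame), h, c)
print(h.shape)  # (32,)
```

The real model replaces the pooling stand-in with learned convolutions, but the data flow, spatial reduction per frame followed by temporal accumulation, is the same.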

Conv3D-Based Video Classification

Advanced Approach

An ambitious implementation utilizing 3D convolutions for comprehensive spatio-temporal feature learning. While currently limited by hardware constraints, this model represents our vision for future development; the trained model alone is approximately 3GB.

Technical Requirements:
• 32GB+ RAM
• High-performance GPU
• Powerful CPU

3GB Model Size
32GB+ RAM Required
GPU Optimized
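A back-of-the-envelope cost model shows why 3D convolutions strain hardware. The source states only the hardware limits, so every number here (channel counts, kernel size, input resolution) is an assumption for illustration:

```python
def conv3d_param_count(in_ch, out_ch, kernel):
    """Weights + biases for one Conv3D layer; kernel = (kt, kh, kw).
    Each output channel has one filter spanning all input channels."""
    kt, kh, kw = kernel
    return out_ch * (in_ch * kt * kh * kw + 1)

def conv3d_activation_bytes(out_ch, frames, height, width, bytes_per_val=4):
    """float32 memory for one layer's output volume ('same' padding assumed)."""
    return out_ch * frames * height * width * bytes_per_val

# Hypothetical first layer: RGB input, 64 filters, 3x3x3 kernel.
params = conv3d_param_count(in_ch=3, out_ch=64, kernel=(3, 3, 3))
print(params)  # 5248

# A 30-frame 112x112 clip produces ~92 MiB of activations for this one
# layer; multiply by depth, batch size, and backprop buffers and the
# 32GB+ RAM requirement follows quickly.
act = conv3d_activation_bytes(out_ch=64, frames=30, height=112, width=112)
print(round(act / 2**20, 1))  # 91.9
```

Unlike 2D convolutions, the kernel also spans the time axis, so both parameter counts and activation volumes grow with clip length, which is the core of the hardware constraint noted above.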