An optimized object detection and tracking system that counts inventory items in videos using YOLO models. Optimizations cut per-frame processing from ~700ms to ~24ms (about 29x faster) on Apple Silicon.
- Real-time Object Detection - Detects and counts objects in video frames
- Object Tracking - Tracks unique objects to avoid duplicate counts
- Multi-Model Support - Supports grocery_yolov5s.pt, YOLOv8n, ONNX, and OpenVINO formats
- High Performance - 29x faster than baseline with optimizations
- Automatic Fallback - Intelligently falls back to available models
| Configuration | Speed per Frame | Speedup |
|---|---|---|
| Original (YOLOv8x @ 640px) | ~700ms | 1x |
| Optimized (YOLOv8n @ 320px) | ~24ms | 29x faster ⚡ |
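The speedup column follows directly from the per-frame latencies in the table. A minimal sketch of that arithmetic, which also converts each latency to an approximate throughput (the helper names are illustrative and not part of the project):

```python
# Per-frame latencies taken from the table above, in milliseconds.
BASELINE_MS = 700.0   # YOLOv8x @ 640px
OPTIMIZED_MS = 24.0   # YOLOv8n @ 320px


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    return baseline_ms / optimized_ms


def frames_per_second(latency_ms: float) -> float:
    """Approximate throughput for a given per-frame latency."""
    return 1000.0 / latency_ms


print(f"speedup:   {speedup(BASELINE_MS, OPTIMIZED_MS):.1f}x")
print(f"baseline:  {frames_per_second(BASELINE_MS):.1f} fps")
print(f"optimized: {frames_per_second(OPTIMIZED_MS):.1f} fps")
```

Both latencies are approximate, so treat the resulting figures as rough estimates for your own hardware.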
- ✅ Reduced image size from 640px to 320px (3x speedup)
- ✅ Using YOLOv8n instead of YOLOv8x (smaller, faster model)
- ✅ ONNX Runtime support (3x-6x faster on some systems)
- ✅ OpenVINO support (5x-10x faster on Intel CPUs)
- ✅ Optimized for Apple Silicon (M1/M2/M3)
Video_model/
├── src/
│   ├── detect_inventory.py      # Frame-based object detection
│   ├── track_inventory.py       # Video-based object tracking
│   ├── extract_frames.py        # Video frame extraction
│   ├── run.py                   # Main execution script
│   └── utils.py                 # Utility functions
├── videos/
│   └── sample.mp4               # Input video files
├── output/
│   └── inventory.json           # Detection results
├── grocery_yolov5s.pt           # Grocery-specific model (optional)
├── yolov8n.pt                   # Default YOLO model
├── yolov8n.onnx                 # ONNX export (optional)
├── yolov8n_openvino_model/      # OpenVINO export (optional)
├── requirements.txt             # Python dependencies
└── readme.md                    # This file
- Python 3.8 or higher
- pip (Python package manager)
- Virtual environment (recommended)
- Clone the repository
git clone https://github.com/NSTKrishna/Video_model.git
cd Video_model
- Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate # On macOS/Linux
# OR
venv\Scripts\activate # On Windows
- Install dependencies
pip install -r requirements.txt
- Add your model (optional, for grocery detection)
  - Place grocery_yolov5s.pt in the project root directory
  - If not available, the system will automatically use YOLOv8n
Run the inventory detection on your video:
python src/run.py
📦 Final Inventory:
{
  "frame_based_count": {
    "mouse": 1,
    "book": 2
  },
  "tracking_based_count": {
    "mouse": 1,
    "book": 4
  }
}
Place your video file in the videos/ directory and update src/run.py:
VIDEO_PATH = "videos/your_video.mp4"
The system automatically selects the best available model:
- grocery_yolov5s.pt (if present) - Grocery-specific detection
- yolov8n.pt (fallback) - General object detection
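The fallback order above can be sketched as a simple "first file that exists wins" lookup. This is only an illustration: `select_model_path` and `MODEL_PREFERENCE` are hypothetical names, and the project's own selection code may differ.

```python
from pathlib import Path
from typing import Optional, Sequence

# Preference order taken from the list above. The names below are
# hypothetical and are not the project's actual code.
MODEL_PREFERENCE = ("grocery_yolov5s.pt", "yolov8n.pt")


def select_model_path(
    root: Path, candidates: Sequence[str] = MODEL_PREFERENCE
) -> Optional[Path]:
    """Return the first candidate weight file found under root, else None."""
    for name in candidates:
        path = root / name
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    chosen = select_model_path(Path("."))
    print(f"Selected model: {chosen}")
```

Keeping the preference order in one tuple means a new model can be added to the fallback chain without touching the lookup logic.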
Edit src/detect_inventory.py to customize:
# Change image size (smaller = faster, larger = more accurate)
results = model(frame, imgsz=320, verbose=False)
# Adjust frame extraction rate in extract_frames.py
frames = extract_frames(VIDEO_PATH, fps_interval=1) # Extract 1 frame per second
Run the export script to create optimized model formats:
python export_model.py
This creates:
- yolov8n.onnx - ONNX format (portable, faster)
- yolov8n_openvino_model/ - OpenVINO format (fastest on Intel CPUs)
Test different model formats on your system:
python benchmark.py
Main dependencies:
- ultralytics - YOLO implementation
- opencv-python - Video processing
- numpy - Numerical operations
- onnxruntime - ONNX inference (optional)
- openvino-dev - OpenVINO optimization (optional)
See requirements.txt for complete list.
- Extracts frames from video at specified intervals
- Runs YOLO detection on each frame
- Counts all detected objects across frames
- Processes video stream continuously
- Assigns unique IDs to objects using ByteTrack
- Counts only unique objects (avoids duplicates)
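Counting by track ID is what keeps an object that stays in view for many frames from being counted again. A minimal sketch of the idea, assuming the tracker yields `(track_id, class_name)` pairs for every detection in every frame (the function name and input shape are assumptions, not the project's actual interface):

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_unique_tracks(observations: Iterable[Tuple[int, str]]) -> Dict[str, int]:
    """Count distinct track IDs per class label.

    Each track ID is counted once, under the label it had when first seen.
    """
    seen_ids = set()
    counts: Counter = Counter()
    for track_id, label in observations:
        if track_id not in seen_ids:
            seen_ids.add(track_id)
            counts[label] += 1
    return dict(counts)


# Hypothetical tracker output: one (track_id, class_name) pair per detection.
observations = [
    (1, "book"), (2, "mouse"),  # frame 0
    (1, "book"), (2, "mouse"),  # frame 1: same objects, same IDs
    (1, "book"), (3, "book"),   # frame 2: a second book appears
]
print(count_unique_tracks(observations))
```

Summing raw detections over the same input would report five books and two mice, which is the duplicate counting that tracking is meant to avoid.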
Video Input → Frame Extraction → YOLO Detection (320px) → Object Counting
↓
Tracking (ByteTrack) → Unique ID Assignment
Solution: This is normal if you don't have the grocery model. The system uses YOLOv8n as a fallback. To remove the warning, add the grocery_yolov5s.pt file to the project root; otherwise, the code continues working with YOLOv8n.
Solutions:
- Ensure imgsz=320 is set in detection/tracking calls
- Use YOLOv8n instead of larger models (YOLOv8x)
- Extract fewer frames (increase fps_interval)
- Run python export_model.py and use ONNX/OpenVINO
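To see how much a larger fps_interval reduces the work, the sketch below computes which frame indices would be kept. It assumes fps_interval means "seconds between sampled frames", which is inferred from the "Extract 1 frame per second" comment. `sampled_frame_indices` is a hypothetical helper, not the project's extract_frames.

```python
from typing import List


def sampled_frame_indices(
    total_frames: int, video_fps: float, fps_interval: float
) -> List[int]:
    """Indices of the frames kept when sampling one frame every fps_interval seconds.

    Assumption: fps_interval is expressed in seconds between sampled frames.
    """
    if total_frames < 0 or video_fps <= 0 or fps_interval <= 0:
        raise ValueError("total_frames must be >= 0; video_fps and fps_interval must be > 0")
    # Number of source frames to advance between two kept frames (at least 1).
    step = max(1, round(video_fps * fps_interval))
    return list(range(0, total_frames, step))


# A 10-second clip at 30 fps contains 300 frames.
every_second = sampled_frame_indices(300, 30.0, 1)
every_two_seconds = sampled_frame_indices(300, 30.0, 2)
print(len(every_second), len(every_two_seconds))
```

Under this assumption, doubling fps_interval halves the number of frames passed to the detector, so detection time falls roughly in proportion.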
Solution: The code is optimized for CPU inference. GPU support is automatic if PyTorch detects CUDA.
- For Speed: Use imgsz=320, YOLOv8n model, reduce frame extraction rate
- For Accuracy: Use imgsz=640, grocery_yolov5s.pt or larger models, extract more frames
- For Balance: Current default settings (320px, YOLOv8n, 1 fps)
Additional documentation files:
- FINAL_RESULTS.md - Complete optimization results
- GROCERY_MODEL_SETUP.md - Grocery model setup guide
- OPTIMIZATIONS.md - Detailed optimization explanations
- QUICK_REFERENCE.md - Quick command reference
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source and available under the MIT License.
NSTKrishna
- GitHub: @NSTKrishna
- Ultralytics for YOLO implementation
- OpenVINO for CPU optimization
- ONNX Runtime for cross-platform inference
For issues and questions:
- Check the troubleshooting section above
- Review documentation in the project
- Open an issue on GitHub
Last Updated: November 25, 2025
Version: 2.0 (Optimized)
⭐ If you find this project useful, please give it a star!