· 4+ years of experience applying AI to practical uses
· Develop and train computer vision models for tasks like:
· Object detection and tracking (YOLO, Faster R-CNN, etc.)
· Image classification, segmentation, OCR (e.g., PaddleOCR, Tesseract)
· Face recognition/blurring, anomaly detection, etc.
· Optimize models for performance on edge devices (e.g., NVIDIA Jetson) using toolkits such as OpenVINO and TensorRT.
· Process and annotate image/video datasets; apply data augmentation techniques.
· Proficiency with large language models (LLMs).
· Strong understanding of statistical analysis and machine learning algorithms.
· Hands-on experience implementing machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering.
· Understanding of image processing concepts (thresholding, contour detection, transformations, etc.)
· Experience in model optimization, quantization, or deployment to edge devices (Jetson Nano/Xavier, Coral, etc.)
· Strong programming skills in Python (or C++), with expertise in:
· OpenCV, NumPy, PyTorch/TensorFlow
· Computer vision models like YOLOv5/v8, Mask R-CNN, DeepSORT
· Implement and optimize machine learning pipelines and workflows for seamless integration into production systems.
· Hands-on experience with at least one real-time CV application (e.g., surveillance, retail analytics, industrial inspection, AR/VR).
· Engage with multiple teams and contribute to key decisions.
· Provide solutions to problems that span multiple teams.
· Lead the implementation of large language models in AI applications.
· Research and apply cutting-edge AI techniques to enhance system performance.
· Contribute to the development and deployment of AI solutions across various domains.
· Design, develop, and deploy ML models for:
· OCR-based text extraction from scanned documents (PDFs, images)
· Table and line-item detection in invoices, receipts, and forms
· Named entity recognition (NER) and information classification
· Evaluate and integrate third-party OCR tools (e.g., Tesseract, Google Vision API, AWS Textract, Azure OCR, PaddleOCR, EasyOCR)
· Develop pre-processing and post-processing pipelines for noisy image/text data
· Familiarity with video analytics platforms (e.g., DeepStream, Streamlit-based dashboards).
· Experience with MLOps tools (MLflow, ONNX, Triton Inference Server).
· Background in academic CV research or published papers.
· Knowledge of GPU acceleration, CUDA, or hardware integration (cameras, sensors).