Browse a curated set of segmentation, detection, and vision architectures. Use the filters or search to quickly find what you need.
Sobel Attention Guided Multi-Scale UNet for complex 2D palm line segmentation. Combines explicit input-level feature engineering with attention mechanisms on the UNet skip connections to highlight fine, thin-line structures.
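One way to picture the input-level feature engineering is a Sobel edge map stacked onto the grayscale image as a second channel. The sketch below is an assumption: the description does not say how the Sobel response enters the network, and the helper names (`filter3x3`, `sobel_magnitude`, `with_edge_channel`) are hypothetical.

```python
# Sketch only: Sobel gradient magnitude as an extra input channel.
# Assumption: the card does not specify how Sobel features are fed to the UNet.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(image, kernel):
    """Cross-correlate a 2D grayscale image (list of rows) with a 3x3 kernel.

    Pixels outside the image are treated as zero.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def sobel_magnitude(image):
    """Per-pixel gradient magnitude sqrt(gx^2 + gy^2)."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [
        [(a * a + b * b) ** 0.5 for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


def with_edge_channel(image):
    """Return a 2-channel input: [intensity, Sobel magnitude]."""
    return [image, sobel_magnitude(image)]
```

For a one-pixel-wide vertical line, the magnitude peaks on the pixels flanking the line, which gives the network an explicit cue for thin structures.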
Dual-stream architecture integrating a Vision Transformer (ViT) for image feature extraction and Medical BERT for textual symptom processing. Each stream produces a 768-dimensional feature vector; the two are combined via early fusion into a 1536-dimensional representation for oral disease classification.
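The 1536-dimensional fused size is consistent with simple concatenation of the two 768-dimensional vectors (768 + 768 = 1536). A minimal sketch under that assumption follows; the function names and the class count are hypothetical, and the linear head merely stands in for whatever classifier the model actually uses.

```python
IMAGE_DIM = 768   # from the description: ViT feature size
TEXT_DIM = 768    # from the description: Medical BERT feature size
NUM_CLASSES = 4   # hypothetical; the description gives no class count


def fuse(image_feat, text_feat):
    """Concatenate the two modality vectors into one 1536-dim vector."""
    if len(image_feat) != IMAGE_DIM or len(text_feat) != TEXT_DIM:
        raise ValueError("expected 768-dim features from each stream")
    return list(image_feat) + list(text_feat)


def linear_head(fused, weights, bias):
    """Assumed classifier: one logit per class, logits = W @ fused + b."""
    return [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
```

Concatenation keeps each modality's features intact and leaves it to the classifier to learn how to weight them.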
A Variational Attention Framework with Content-Aware Upsampling. Integrates variational autoencoder principles with attention mechanisms and content-aware upsampling strategies for robust segmentation.
Lightweight model for steel surface defect detection, optimized for edge computing devices. Uses YOLOv5 with a GhostNet backbone and anchor boxes tuned by K-Means++ clustering for efficient real-time inference.
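Anchor tuning of this kind can be sketched as clustering the (width, height) pairs of the training boxes. The code below is an illustration, not the model's actual procedure: it assumes the `1 - IoU` distance commonly used for YOLO anchors, and the function names are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_init(boxes, k, rng):
    """K-Means++ seeding: each new center is drawn with probability
    proportional to its squared distance (1 - IoU) to the nearest center."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        if sum(d2) == 0:  # every box coincides with an existing center
            centers.append(rng.choice(boxes))
        else:
            centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def cluster_anchors(boxes, k, iters=50, seed=0):
    """Lloyd iterations after K-Means++ seeding; anchors sorted by area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[best].append(b)
        new_centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[i]  # keep the old center for an empty cluster
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])
```

Here `boxes` would be a list of positive `(width, height)` tuples from the labeled dataset, and `k` the number of anchors the detector expects.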
Hybrid model combining CSPDarkNet53 for local feature extraction and Vision Transformers for global context modeling. Processes 4×4 patches (256-dim) through multi-head self-attention with positional encodings to classify fingernail diseases.
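The patch step can be illustrated by cutting a CNN feature map into non-overlapping 4×4 patches and flattening each into a token. The description does not say whether 256 is the flattened patch size or the width of a learned projection; the sketch below assumes the former, with a hypothetical 16-channel feature map so that 4 × 4 × 16 = 256. `patchify` is an illustrative name.

```python
PATCH = 4         # from the description: 4x4 patches
TOKEN_DIM = 256   # from the description: 256-dim tokens


def patchify(feature_map, patch=PATCH):
    """Split an H x W x C nested-list feature map into flattened,
    non-overlapping patch tokens in row-major patch order."""
    h, w = len(feature_map), len(feature_map[0])
    if h % patch or w % patch:
        raise ValueError("height and width must be multiples of the patch size")
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            token = []
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    token.extend(feature_map[y][x])  # append all C channels
            tokens.append(token)
    return tokens
```

An 8×8 map with 16 channels yields 4 tokens of length 256; positional encodings would then be added to these tokens before self-attention.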