Encoder/Decoder Transformer Model

A CTC Alignment-Based Non-Autoregressive Transformer for End-to-End Automatic Speech Recognition

Abstract: Recently, end-to-end models have been widely used in automatic speech recognition (ASR) systems. Two of the most representative approaches are connectionist temporal classification (CTC) and ...

Tech Times

Baidu OCR Breaks Long-Document Memory Wall: New Architecture Beats DeepSeek

Open-source OCR from Baidu eliminates the GPU memory wall that limits long-document parsing. Unlimited OCR uses a constant KV ...

GitHub

NVIDIA-AI-IOT/nanosam

NanoSAM is a Segment Anything (SAM) model variant that is capable of running in 🔥 real-time 🔥 on NVIDIA Jetson Orin Platforms with NVIDIA TensorRT. NanoSAM is trained by distilling the MobileSAM ...

IEEE

DEMAE: Diffusion-Enhanced Masked Autoencoder for Hyperspectral Image Classification With Few Labeled Samples

Abstract: Unlike other deep learning (DL) models, Transformer has the ability to extract long-range dependency features from hyperspectral image (HSI) data. Masked autoencoder (MAE), which is based on ...
