Abstract: Recently, end-to-end models have been widely used in automatic speech recognition (ASR) systems. Two of the most representative approaches are connectionist temporal classification (CTC) and ...
Open-source OCR from Baidu eliminates the GPU memory wall that limits long-document parsing. Unlimited OCR uses a constant KV ...
NanoSAM is a Segment Anything (SAM) model variant that is capable of running in 🔥 real-time 🔥 on NVIDIA Jetson Orin Platforms with NVIDIA TensorRT. NanoSAM is trained by distilling the MobileSAM ...
Abstract: Unlike other deep learning (DL) models, Transformer has the ability to extract long-range dependency features from hyperspectral image (HSI) data. Masked autoencoder (MAE), which is based on ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results