Chinese Blogs
Paper Reading
CV
- RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
- RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
- Learning Optical Flow from Continuous Spike Streams
- Advances in spike vision (about SpikeCV)
- "GrabCut": Interactive Foreground Extraction using Iterated Graph Cuts
- Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
ROBOT
- GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping
- Graspness Discovery in Clutters for Fast and Accurate Grasp Detection
- 1000 FPS HDR Video with a Spike-RGB Hybrid Camera
NLP
VOICE
- Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables
- Hybrid neural networks for on-device directional hearing
- Creating speech zones with self-distributing acoustic swarms
- Underwater 3D positioning on smart devices
Speech Recognition
- SPEECH-TRANSFORMER: A NO-RECURRENCE SEQUENCE-TO-SEQUENCE MODEL FOR SPEECH RECOGNITION
- THE SPEECHTRANSFORMER FOR LARGE-SCALE MANDARIN CHINESE SPEECH RECOGNITION
- Conformer: Convolution-augmented Transformer for Speech Recognition
Milestone articles
- ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)
- Deep Residual Learning for Image Recognition (ResNet)
- Attention Is All You Need (Transformer)
- A Gentle Introduction to Graph Neural Networks (GNN)
- Generative Adversarial Nets (GAN)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE (ViT)
- Masked Autoencoders Are Scalable Vision Learners (MAE)
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Domain Adaptation
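Stereo Matching
Research Notes
Collection of Key Points from Meetings
Project Notes
Using Some Tools
Machine Learning and Deep Learning
Review Notes for Postgraduate Recommendation (保研)
- Linear Algebra
- Probability Theory
- Data Structures
- Machine Learning
- Computer Networks
- Operating Systems
- English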