[Paper Review] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want July 6, 2024 Alpha-CLIP paper review (CVPR 2024) Tags: Computer Vision, CVPR
[Paper Review] Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models July 4, 2024 Monkey paper review (CVPR 2024) Tags: Computer Vision, CVPR, Large Multimodal Model, NLP
[Paper Review] TokenCompose: Grounding Diffusion with Token-level Supervision July 2, 2024 TokenCompose paper review (CVPR 2024) Tags: Computer Vision, CVPR, Diffusion, Text-to-Image
[Paper Review] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding June 30, 2024 MA-LMM paper review (CVPR 2024) Tags: Computer Vision, CVPR, Large Multimodal Model, Meta, Video Understanding
[Paper Review] Pixel Aligned Language Models June 28, 2024 PixelLLM paper review (CVPR 2024) Tags: Computer Vision, CVPR, Google, Image Captioning, NLP