[Paper Review] TokenCompose: Grounding Diffusion with Token-level Supervision July 2, 2024 TokenCompose paper review (CVPR 2024) Tags: Computer Vision, CVPR, Diffusion, Text-to-Image
[Paper Review] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding June 30, 2024 MA-LMM paper review (CVPR 2024) Tags: Computer Vision, CVPR, Large Multimodal Model, Meta, Video Understanding
[Paper Review] Pixel Aligned Language Models June 28, 2024 PixelLLM paper review (CVPR 2024) Tags: Computer Vision, CVPR, Google, Image Captioning, NLP
[Paper Review] Aligning and Prompting Everything All at Once for Universal Visual Perception June 26, 2024 APE paper review (CVPR 2024) Tags: Computer Vision, CVPR, Image Segmentation
[Paper Review] PixelLM: Pixel Reasoning with Large Multimodal Model June 24, 2024 PixelLM paper review (CVPR 2024) Tags: Computer Vision, CVPR, Image Segmentation, Large Multimodal Model, NLP