
The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization
Robust Nearest Neighbors for Source-Free Domain Adaptation under Class Distribution Shift
How to Defend Image-Text Matching against Adversarial Attacks
Source-Free Domain Adaptation with Class Distribution Shift via Generic Features
LayoutFlow: Flow Matching for Layout Generation
Complementary-Contradictory Feature Regularization against Multimodal Overfitting
Multimodal color recommendation in vector graphic documents
Dissecting multimodal learning via regularized masking of multimodal features
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation
Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization
An Intelligent Color Recommendation Tool for Landing Page Design
Optimal Correction Cost for Object Detection Evaluation
Does robustness on ImageNet transfer to downstream tasks?
Video Summarization Overview
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
BERT representations for Video Question Answering
Knowledge-Based Visual Question Answering in Videos
KnowIt VQA: Answering Knowledge-Based Questions about Videos
Visually Grounded Paraphrase Identification via Gating and Phrase Localization
Laughter Prediction Using Subtitles and Facial Expressions in Comedy Dramas