MindCube
CVinW @ CVPR 2026

MindCube Challenge

Spatial Question Answering from Limited Multi-View Observations

📢

MindCube Test Set is now released!

The held-out test set is available for download here: MindCube Challenge Test Set.

Heads-up: The Test Phase on EvalAI is currently experiencing an issue, and we are working with the EvalAI team to fix it. In the meantime, you can download the test set and generate your predictions locally. We will notify everyone via the Challenge Slack as soon as submissions open.

🏛️
Hosted by: The 5th Workshop on Computer Vision in the Wild (CVinW), CVPR 2026.

We are organizing the MindCube Challenge, a spatial question answering benchmark designed to evaluate spatial mental modeling from limited multi-view observations. Participants will be ranked by accuracy on a held-out test set.

Challenge Overview

Goal

Given a multi-view observation and a question, predict the correct answer for each example.

What You Do

  1. Train / fine-tune on the MindCube training set.
  2. Develop and validate on MindCube_tinybench.
  3. Run inference on the held-out test set (now released) and submit predictions.

Data Splits

  • 📚 Train: MindCube_train.jsonl (Public)
  • 🔬 Validation: MindCube_tinybench.jsonl (Public)
  • 🏆 Test (Held-out): Final evaluation set (Released)

Dataset: Available at huggingface.co/datasets/MLL-Lab/MindCube

Format & loading: Please refer to the official instructions in the MindCube repository.
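Both public splits are distributed as .jsonl files, so a generic reader is enough for a first look at the data. The sketch below assumes only that each non-empty line holds one JSON object. The helper name read_jsonl is hypothetical, and the record fields are whatever the official files contain, so consult the repository for the actual schema and loaders.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return a list with one dict per non-empty line of a JSONL file."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                # Report the offending line so a corrupt download is easy to spot.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({err})") from err
    return records
```

For example, read_jsonl("MindCube_tinybench.jsonl") would return the validation records, assuming the file has been downloaded to the working directory.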

Evaluation

  • Metric: Accuracy (exact match) on the held-out test set.
  • Ranking: Teams are ranked by overall accuracy.
  • (Optional) We may additionally report accuracy by setting/sub-category for analysis.
  • Tie-break: Higher accuracy on a specific subset, then earlier submission time.
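For local validation, the metric above can be approximated with a few lines of Python. This is a minimal sketch, not the official scorer: it assumes predictions and gold answers are dicts mapping question id to option letter, that missing predictions count as incorrect, and that the sub-category is the text before the first underscore of the id (a guess based on example ids such as "among_group693_q1_5_2"). The function names are hypothetical; the evaluation scripts in the MindCube repository are authoritative.

```python
from collections import defaultdict


def exact_match_accuracy(predictions, gold):
    """Percentage of gold ids whose predicted letter equals the gold letter.

    predictions, gold: dicts mapping question id -> option letter.
    Ids missing from predictions are scored as incorrect (an assumption).
    """
    if not gold:
        raise ValueError("gold is empty")
    correct = sum(1 for qid, ans in gold.items() if predictions.get(qid) == ans)
    return 100.0 * correct / len(gold)


def accuracy_by_prefix(predictions, gold):
    """Per-category accuracy, using the id prefix as the category (assumed)."""
    groups = defaultdict(dict)
    for qid, ans in gold.items():
        groups[qid.split("_", 1)[0]][qid] = ans
    return {cat: exact_match_accuracy(predictions, g) for cat, g in groups.items()}
```

The comparison is a strict string match with no case folding or whitespace trimming, so predictions should contain the bare option letter.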

Challenge Leaderboard

Performance of submitted methods on the held-out test set.


Rank  Team / Method       Overall  Rotation  Among  Around
-     Random (chance)     32.35    36.36     32.29  30.66
-     Random (frequency)  33.02    38.30     32.66  35.79
Challenge submissions coming soon...
The leaderboard will be updated once submissions are evaluated.

Submission

Submission File Format (JSONL)

Submit a single .jsonl file with one JSON object per line, containing:

  • id (string) — the question ID from the test set
  • answer (string) — the predicted option letter (e.g., "A", "B", "C", "D")
Example submission format
{"id": "among_group693_q1_5_2", "answer": "C"}
{"id": "around_group012_q3_1_0", "answer": "A"}
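A file in this format can be produced with the standard json module. The sketch below assumes your predictions are held in a dict mapping question id to option letter; the function name write_submission is hypothetical.

```python
import json


def write_submission(predictions, path):
    """Write {id: answer} pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for qid, answer in predictions.items():
            # Both fields are strings, as required by the submission format.
            row = {"id": str(qid), "answer": str(answer)}
            f.write(json.dumps(row) + "\n")
```

Building the file from a dict also guarantees that each id appears at most once.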

Requirements

  • Provide exactly one prediction for each id in the test set.
  • Duplicate IDs: only the last entry is kept, or the submission is marked invalid.
  • Missing IDs: counted as incorrect, or the submission is marked invalid.
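Because duplicate or missing ids can cost accuracy or invalidate a submission, it is worth checking the file before uploading. The following is a minimal sketch of such a check, not the official validator: check_submission is a hypothetical helper, and it assumes you have already collected the test-set ids into a list.

```python
import json
from collections import Counter


def check_submission(lines, test_ids):
    """Return a list of problems found in a submission (empty list = OK).

    lines: iterable of JSONL strings; test_ids: iterable of expected ids.
    """
    problems = []
    seen = Counter()
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {n}: not valid JSON")
            continue
        if not isinstance(row, dict):
            problems.append(f"line {n}: expected a JSON object")
            continue
        qid, answer = row.get("id"), row.get("answer")
        if not isinstance(qid, str) or not isinstance(answer, str):
            problems.append(f"line {n}: 'id' and 'answer' must both be strings")
            continue
        seen[qid] += 1
    expected, seen_ids = set(test_ids), set(seen)
    problems += [f"duplicate id: {q}" for q in sorted(seen_ids) if seen[q] > 1]
    problems += [f"missing id: {q}" for q in sorted(expected - seen_ids)]
    problems += [f"unknown id: {q}" for q in sorted(seen_ids - expected)]
    return problems
```

Passing an open file handle as lines works, since iterating a text file yields one line at a time; an empty return value means exactly one prediction was found for each expected id.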

How to Submit

  1. Download the held-out test set from the Challenge Test Set link.
  2. Generate your predictions.jsonl following the required format.
  3. Name the file as: TeamName_MethodName.jsonl
  4. Submit your predictions on EvalAI. Questions? Contact qinengw@u.northwestern.edu.

Submission Limit: Up to 5 submissions per day, 100 submissions total per team across the challenge
Dev Phase Deadline: May 22, 2026 (AoE)
Test Phase Deadline: May 25, 2026 (AoE)
Results Announcement: May 31, 2026

Rules

  • External data / models / APIs: Open-source models and external data are allowed. Commercial API-only (closed-source) models are disallowed. Please disclose any external resources used in your method description.
  • Human-in-the-loop labeling on test: Disallowed
  • Participants must not attempt to obtain test labels or manipulate evaluation.
  • Verification: Top teams may be asked to provide a brief method description and reproducibility details.

Baselines & Starter Kit

Baselines, data loaders, and evaluation scripts are available in the official MindCube repository:

github.com/mll-lab-nu/MindCube

Contact

For questions, please reach out via: