MindCube Challenge
Spatial Question Answering from Limited Multi-View Observations
We are organizing the MindCube Challenge, a spatial question answering benchmark designed to evaluate spatial mental modeling from limited multi-view observations. Participants will be ranked by accuracy on a held-out test set.
Quick Links
- Dataset & Code: github.com/mll-lab-nu/MindCube
- Challenge Contact: qinengw@u.northwestern.edu
- Submission Portal: Coming Soon
Challenge Overview
Goal
Given a multi-view observation and a question, predict the correct answer for each example.
What You Do
- Train / fine-tune on the MindCube training set.
- Develop and validate on MindCube_tinybench.
- Run inference on the held-out test set (to be released) and submit predictions.
Data Splits
| Split | File | Notes |
|---|---|---|
| Train | MindCube_train.jsonl | |
| Validation | MindCube_tinybench.jsonl | |
| Test (Held-out) | Coming soon | Final evaluation set |

Dataset: Data can be found at huggingface.co/datasets/MLL-Lab/MindCube
Format & loading: Please refer to the official instructions in the MindCube repository.
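For quick local experimentation, the train and validation splits can be read directly as JSON Lines. This is a minimal sketch, assuming the files have been downloaded from the Hugging Face page above; the official repository's loaders and field schema are authoritative.

```python
import json

def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts (one object per line)."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# File names follow the split table above.
train = load_jsonl("MindCube_train.jsonl")
val = load_jsonl("MindCube_tinybench.jsonl")
print(f"train: {len(train)} examples, val: {len(val)} examples")
```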
Evaluation
- Metric: Accuracy (exact match) on the held-out test set.
- Ranking: Teams are ranked by overall accuracy.
- (Optional) We may additionally report accuracy by setting/sub-category for analysis.
- Tie-break: Ties are broken by higher accuracy on a designated subset, then by earlier submission time.
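Conceptually, scoring is plain exact-match accuracy over all test questions. The sketch below assumes answers are compared as whitespace-stripped strings; the official evaluation scripts in the MindCube repository define the exact normalization.

```python
def exact_match_accuracy(predictions, gold):
    """Exact-match accuracy in percent.

    predictions, gold: dicts mapping question_id -> answer.
    Missing predictions count as incorrect, per the submission rules below.
    """
    correct = sum(
        str(predictions.get(qid, "")).strip() == str(ans).strip()
        for qid, ans in gold.items()
    )
    return 100.0 * correct / len(gold)
```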
Challenge Leaderboard
Performance of submitted methods on the held-out test set.
| Rank ↕ | Team / Method ↕ | Overall ↕ | Rotation ↕ | Among ↕ | Around ↕ |
|---|---|---|---|---|---|
| - | Random (chance) | 32.35 | 36.36 | 32.29 | 30.66 |
| - | Random (frequency) | 33.02 | 38.30 | 32.66 | 35.79 |
| - | Challenge submissions coming soon... | - | - | - | - |
Submission
Submission File Format (JSONL)
Submit a single .jsonl file with one JSON object per line, containing:
- question_id (string)
- prediction (string or integer; choose one convention and keep it consistent)

```json
{"question_id":"mc_000001","prediction":"B"}
{"question_id":"mc_000002","prediction":"A"}
```

Requirements
- Provide exactly one prediction for each question_id in the test set (a sanity-check sketch follows this list).
- Duplicate IDs: the last prediction is kept, or the submission may be marked invalid.
- Missing IDs: counted as incorrect, or the submission may be marked invalid.
- (Recommended) You may gzip the file for size: predictions.jsonl.gz
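Here is a minimal sketch for writing and sanity-checking a submission before upload. The write_predictions and validate_submission helpers and the test_ids argument are illustrative, not part of the official tooling; the real ID list ships with the held-out test set.

```python
import gzip
import json

def write_predictions(preds, path="predictions.jsonl.gz"):
    """preds: dict mapping question_id -> prediction.
    Writes one JSON object per line; gzip is optional but accepted."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt", encoding="utf-8") as f:
        for qid, pred in preds.items():
            f.write(json.dumps({"question_id": qid, "prediction": pred}) + "\n")

def validate_submission(path, test_ids):
    """Check that there is exactly one prediction per test question_id."""
    opener = gzip.open if path.endswith(".gz") else open
    seen = set()
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            qid = json.loads(line)["question_id"]
            if qid in seen:
                print(f"duplicate id: {qid} (only the last is kept)")
            seen.add(qid)
    missing = set(test_ids) - seen
    extra = seen - set(test_ids)
    if missing:
        print(f"{len(missing)} missing ids (counted as incorrect)")
    if extra:
        print(f"{len(extra)} unknown ids")
    return not missing and not extra
```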
How to Submit
1. Download the held-out test set (coming soon).
2. Generate your predictions.jsonl following the required format.
3. Name the file TeamName_MethodName.jsonl (or .jsonl.gz).
4. Submit via the upload form (coming soon) or email it to qinengw@u.northwestern.edu with the subject: MindCube Challenge Submission: TeamName
Rules
- External data / models / APIs: policy TBD; if allowed, disclosure will be required.
- Human-in-the-loop labeling on test: Disallowed
- Participants must not attempt to obtain test labels or manipulate evaluation.
- Verification: Top teams may be asked to provide a brief method description and reproducibility details.
Baselines & Starter Kit
Baselines, data loaders, and evaluation scripts are available in the official MindCube repository:
github.com/mll-lab-nu/MindCube

Contact
For questions, please reach out via email: qinengw@u.northwestern.edu