1. ABOUT THE DATASET
--------------------

Title: RoomSpace

Creator(s): Fangjun Li [1], David Hogg [1], Anthony Cohn [1,2]

Organisation(s): 1. University of Leeds. 2. Alan Turing Institute.

Rights-holder(s): Unless otherwise stated, Copyright 2024 University of Leeds

Publication Year: 2024

Description: This dataset, associated with the IJCAI-24 paper "Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning", aims to enhance spatial reasoning evaluation in language models. Our benchmark encompasses a broad spectrum of qualitative spatial relationships, including topological, directional, and distance relations. These are presented from different viewpoints, at varied granularities, and with varying densities of relation constraints to mimic real-world complexity, promoting more accurate evaluation of language models' capabilities in spatial reasoning tasks.

Cite as: Fangjun Li, David Hogg and Anthony Cohn (2024): RoomSpace. [Dataset]. https://doi.org/10.5518/1518

Related publication: Fangjun Li, David Hogg, and Anthony Cohn, "Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning", IJCAI, 2024 (accepted).

Contact: lifangjun95@gmail.com

2. TERMS OF USE
---------------

Copyright 2024 University of Leeds

3. PROJECT AND FUNDING INFORMATION
----------------------------------

Title: Microsoft Research - Accelerating Foundation Models Research program
Dates: 2023.04 - 2024.06
Funding organisation: Microsoft

Title: The Turing's Defence and Security programme, through a partnership with the UK government in accordance with the framework agreement between GCHQ and The Alan Turing Institute
Funding organisation: Alan Turing Institute

Title: Economic and Social Research Council (ESRC)
Grant no.: ES/W003473/1

4. CONTENTS
-----------

## Directory Structure

The repository is structured as follows:

```
Data/
├── SD-100/
├── SD-1K/
└── SD-10K/
```

- `SD-100`: Contains data for 100 room samples.
- `SD-1K`: Contains data for 1,000 room samples, with the first 100 rooms identical to those in `SD-100`.
- `SD-10K`: Contains data for 10,000 room samples, with the first 1,000 rooms identical to those in `SD-1K`.

Each `SD-*` subdirectory includes the following:

```
SD-*/
├── SD-*.json
├── Image/
│   ├── top-down/
│   ├── robot_south/
│   └── robot_south_rec/
├── Text/
└── Logic/
```

### SD-*.json

This JSON file contains the settings for each scene in the respective subdirectory. It details the configuration of the rooms, the objects placed, and any specific parameters or constraints relevant to the simulations in that dataset.

### Image

This directory contains subdirectories for different views of the rooms:

- `top-down`: Top-down images of the rooms.
- `robot_south`: North-facing square images, assuming a robot stands at the south door.
- `robot_south_rec`: North-facing rectangular images.

Each image file is named `ID.png`, corresponding to the room's ID.

### Text

Contains JSON files named `n*_m*_d*.json`, which include stories, questions, and answers derived from the images and a CSP (Constraint Satisfaction Problem) reasoner. The parameters `n`, `m`, and `d` represent the object number, constraint density, and domain size, respectively.
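For a quick start, a minimal Python sketch for reading one of these files is shown below. The concrete filename (`n5_m3_d100.json`) and the field names (`story`, `question`, `answer`) are assumptions for illustration only; inspect an actual file for the exact schema.

```python
import json
from pathlib import Path

# Hypothetical example: n=5 objects, m=3 constraint density, d=100 domain size.
# The field names below are assumptions -- check a real file for the schema.
text_file = Path("Data/SD-100/Text/n5_m3_d100.json")

with text_file.open() as f:
    examples = json.load(f)

first = examples[0]  # assumes the file holds a list of examples
print(first["story"], first["question"], first["answer"], sep="\n")
```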
### Logic

Includes JSON files under different spatial reasoning settings such as 'Layout', 'TPP', 'O2', 'O2+D2', 'O2+D3', 'Layout+O2+D2', and 'Layout+O2+D3'. These files contain logical facts and reasoning results:

- `n*_m*_d*.json`: Logical facts and reasoning results, parameterised by `n`, `m`, and `d` as described above.
- `answers_lengths_d*.json`: Records the answer length for each setting and each example, for different `d` values.
- `times_fr_take_d*.json`: CPU time taken, for each setting, to answer the find-relation questions.
- `times_yn_take_d*.json`: CPU time taken, for each setting, to answer the yes/no questions.
- `skip_id_d*.json`: Records the IDs of examples that do not meet the `m` requirement for their object number; these examples are skipped.
- `solution_id_d*.json`: Records the IDs of examples that have solutions (answer length > 0) under each spatial setting.

A usage sketch combining the last two diagnostics files appears at the end of this README.

5. METHODS
----------

Detailed generation settings are described in the paper.
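As a closing usage sketch, the `skip_id_d*.json` and `solution_id_d*.json` files described in Section 4 could be combined to select evaluable examples. Everything concrete below (the `d100` suffix, the per-setting dictionary layout, the setting name `O2+D2`) is an assumption for illustration, not a documented schema.

```python
import json
from pathlib import Path

logic_dir = Path("Data/SD-100/Logic")

# Assumed layout: each file maps a spatial setting (e.g. "O2+D2") to a list
# of example IDs. Verify against the actual files before relying on this.
with (logic_dir / "skip_id_d100.json").open() as f:
    skipped = json.load(f)
with (logic_dir / "solution_id_d100.json").open() as f:
    solvable = json.load(f)

setting = "O2+D2"
usable = sorted(set(solvable.get(setting, [])) - set(skipped.get(setting, [])))
print(f"{len(usable)} usable examples for setting {setting!r}")
```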