Context. Current observational and simulated large-scale structure (LSS) catalogues often assign galaxies to structures inconsistently, because there is no universally accepted classification criterion.
Aims. To generate synthetic empirical data for fine-tuning LSS classification algorithms, and for training machine learning (ML) and deep learning (DL) models for the same purpose, this work presents a purely geometrical simulation based on the statistical spatial properties found in LSS surveys, using the spectroscopic main galaxy sample of the Sloan Digital Sky Survey (SDSS) catalogue up to a redshift of z ≃ 0.1 as a specific use case.
Methods. The simulation exploits a parallelism between the LSS and the 3D Voronoi tessellation, in which the nodes, links, surfaces, and cells of the diagram correspond to clusters, filaments, walls, and voids, respectively. Random positions within voids were used as seeds for tessellating the 3D space. The resulting tessellation structures were then randomly populated with galaxies that follow the statistical properties of the corresponding observed structures. Each galaxy was tagged with its structure as it was generated.
Results. Each simulation generated six LSS mock catalogues whose galaxies follow the statistical behaviour observed in the SDSS catalogue for the structure they belong to. The Malmquist bias and the redshift-space distortion known as the Fingers of God (FoG) effect were also simulated.
Conclusions. We present a novel geometrical LSS simulator in which generated galaxies mimic the statistical properties of the observed structure to which they belong. As an example, the simulator was tuned to mimic the SDSS catalogue, although any other catalogue can be used in similar studies.
With the generated catalogue, it is possible to adjust the LSS classification algorithms, train and test ML and DL models, and benchmark several LSS classification methods using this pre-labelled data to compare and contrast their results and performance.
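The correspondence described in the Methods (nodes, links, surfaces, and cells mapped to clusters, filaments, walls, and voids) can be illustrated with a minimal Python sketch built on `scipy.spatial.Voronoi`. This is not the authors' simulator. The box size, seed count, scatter widths, equal galaxy counts per structure, and the helper names `in_box` and `populate` are all hypothetical choices made for illustration, and no SDSS statistics, Malmquist bias, or FoG effect are modelled.

```python
import numpy as np
from scipy.spatial import Voronoi

BOX = 100.0      # hypothetical box side (arbitrary length units)
N_SEEDS = 200    # hypothetical number of void seeds
rng = np.random.default_rng(0)

# Random seed positions play the role of void centres.
seeds = rng.uniform(0.0, BOX, size=(N_SEEDS, 3))
vor = Voronoi(seeds)

def in_box(p):
    """True for each point of an (n, 3) array that lies inside the box."""
    return np.all((p >= 0.0) & (p <= BOX), axis=-1)

# Nodes -> clusters: Voronoi vertices inside the box.
nodes = vor.vertices[in_box(vor.vertices)]

# Surfaces -> walls: finite ridges (polygons) lying fully inside the box.
walls = [np.asarray(r) for r in vor.ridge_vertices
         if -1 not in r and in_box(vor.vertices[r]).all()]

# Links -> filaments: sides of the wall polygons.
# Assumption: SciPy returns 3D ridge vertices in polygon order.
links = set()
for r in walls:
    for a, b in zip(r, np.roll(r, -1)):
        links.add((min(a, b), max(a, b)))
links = np.array(sorted(links))

def populate(n=500, scatter=0.5):
    """Return (positions, labels) with n tagged galaxies per structure type."""
    # Clusters: Gaussian clouds around randomly chosen nodes.
    clusters = nodes[rng.integers(len(nodes), size=n)]
    clusters = clusters + rng.normal(0.0, 2.0 * scatter, (n, 3))

    # Filaments: random points along randomly chosen links, plus scatter.
    e = links[rng.integers(len(links), size=n)]
    a, b = vor.vertices[e[:, 0]], vor.vertices[e[:, 1]]
    t = rng.uniform(size=(n, 1))
    filaments = a + t * (b - a) + rng.normal(0.0, scatter, (n, 3))

    # Walls: random convex combinations of the vertices of a chosen polygon.
    sheets = np.empty((n, 3))
    for i in range(n):
        poly = vor.vertices[walls[rng.integers(len(walls))]]
        sheets[i] = rng.dirichlet(np.ones(len(poly))) @ poly
    sheets += rng.normal(0.0, scatter, (n, 3))

    # Voids: sparse uniform background filling the cells.
    voids = rng.uniform(0.0, BOX, (n, 3))

    positions = np.vstack([clusters, filaments, sheets, voids])
    labels = np.repeat(["cluster", "filament", "wall", "void"], n)
    return positions, labels

positions, labels = populate(n=100)
```

Because every galaxy is tagged when it is generated, `positions` and `labels` together form a pre-labelled mock sample of the kind that could be passed to a classifier for training or benchmarking.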