## What's Changed

- [PR] Added Information About SC2AnonServerPy by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/4
- [PR] Final Article Version Pull Request by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/3
- [PR] Re-introduced the DatasetPreparator section, fixing section levels by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/5
- [PR] Fixing Enumeration in Markdown by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/6
- [PR] Sync Dev With Main by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/7
- [PR] Fixing Links in README, Fixing References Headin by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/8
- [PR] Generate Paper Pull Request Fix by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/10
- [PR] Latest Revision After Review by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/11
- [PR] Fixing Known Bad DOIs by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/12
- [PR] Bibtex Edits by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/16
- [PR] Content Edits by @Kaszanas in https://github.com/Kaszanas/SC2Tools/pull/15

## New Contributors

- @Kaszanas made their first contribution in https://github.com/Kaszanas/SC2Tools/pull/4

**Full Changelog**: https://github.com/Kaszanas/SC2Tools/compare/1.0.0...1.1.0

# SC2Tools: StarCraft II Toolset and Dataset API

This repository contains a comprehensive toolset for working with StarCraft II replay files and datasets. The tools span multiple programming languages and are included as git submodules for easy management and development.

## Quickstart

### What Each Tool Does

- **SC2InfoExtractorGo**: Extracts detailed game data from `.SC2Replay` files into JSON format. For anonymization, see SC2AnonServerPy.
- **DatasetPreparator**: Prepares and organizes large replay datasets for processing.
- **SC2AnonServerPy**: Provides a gRPC anonymization service for player data and chat messages. Works with SC2InfoExtractorGo.
- **SC2_Datasets**: Python library for loading and working with processed SC2 datasets.

For comprehensive information on each tool, please refer to its individual `README.md` file.

### Prerequisites

- Docker (recommended), or:
  - Go 1.19+ for SC2InfoExtractorGo
  - Python 3.10+ for Python-based tools
  - Poetry for Python dependency management
  - Git for submodule management

> [!NOTE]
> The DatasetPreparator container image includes SC2InfoExtractorGo by default. Please refer to the DatasetPreparator README for more details.

### Docker Usage (Recommended)

The easiest way to get started is to use our pre-built Docker images:

1. Collect your `.SC2Replay` files into a replaypack, for example: `replaypack_1/*.SC2Replay`. If you do not have any replays and wish to run the following example, you can download some replaypacks from SC2ReSet HuggingFace or SC2ReSet Zenodo.
2. Pull and run DatasetPreparator (full processing pipeline):
   - Run the following to see the available options:

     ```bash
     docker pull kaszanas/datasetpreparator:latest
     docker run -it --rm \
       -v "${PWD}/processing":/app/processing \
       kaszanas/datasetpreparator:latest \
       python sc2egset_pipeline.py --help
     ```

   - Place your replaypack directories in the `./processing/data/replays` directory. For example:

     ```
     ./processing/data/replays/replaypack_1/*.SC2Replay
     ./processing/data/replays/replaypack_2/*.SC2Replay
     ```

   - To run the full processing pipeline (as in the SC2ReSet and SC2EGSet datasets), execute:

     ```bash
     docker run -it --rm \
       -v "${PWD}/processing/data":/app/processing/data \
       kaszanas/datasetpreparator:latest \
       python sc2egset_pipeline.py \
       --input_path processing/data/replays \
       --output_path processing/data/output \
       --maps_path processing/maps \
       --n_processes 4 \
       --force_overwrite True
     ```

To verify that everything worked correctly, check the generated logs and the `processing/data/output` directory for processed files. The `directory_flattener` directory should mirror the input directory structure, but with single-level directories containing the raw `.SC2Replay` files and a `processed_mapping.json` file that maps the old directory structure to the new filenames.
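
The `directory_flattener` check can be automated with a short script. The following is a minimal sketch, not part of the toolset: the function name `summarize_flattened` is hypothetical, and it assumes that each replaypack directory sits directly under `processing/data/output/directory_flattener` and that `processed_mapping.json` holds a JSON object or array. Adjust the path if your `--output_path` or layout differs.

```python
import json
from pathlib import Path


def summarize_flattened(root: Path) -> dict[str, dict]:
    """Summarize each replaypack directory found directly under `root`.

    Hypothetical helper: reports how many .SC2Replay files each directory
    holds and how many entries its processed_mapping.json contains
    (None when the mapping file is missing).
    """
    summary = {}
    for pack_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        replays = list(pack_dir.glob("*.SC2Replay"))
        mapping_file = pack_dir / "processed_mapping.json"
        mapping_entries = None
        if mapping_file.is_file():
            with mapping_file.open(encoding="utf-8") as handle:
                # Assumes the mapping is a JSON object or array.
                mapping_entries = len(json.load(handle))
        summary[pack_dir.name] = {
            "replays": len(replays),
            "mapping_entries": mapping_entries,
        }
    return summary


if __name__ == "__main__":
    # Assumed location, derived from the --output_path used above.
    root = Path("processing/data/output/directory_flattener")
    if root.is_dir():
        for name, info in summarize_flattened(root).items():
            print(name, info)
    else:
        print(f"{root} not found; has the pipeline run?")
```

A replaypack that reports zero replays or a missing mapping is a hint to inspect the generated logs for that directory.
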
Moreover, the `sc2egset_replaypack_processor` directory should contain the output of SC2InfoExtractorGo run with the same arguments as in the SC2EGSet dataset processing. Finally, the `SC2ReSet` and `SC2EGSet` directories should contain the raw replay files as organized in the respective datasets.

### Installation (Without Docker)

1. Clone the repository with submodules:

   ```bash
   git clone --recurse-submodules https://github.com/Kaszanas/SC2Tools.git
   cd SC2Tools
   ```

2. Initialize and update submodules:

   ```bash
   git submodule update --init --recursive
   ```

> [!NOTE]
> At this point, you should be able to use the tools directly on your system, provided the necessary dependencies are installed. Please refer to each tool's `README.md` for specific installation and usage instructions.

## Individual Tool Documentation

Each tool has its own comprehensive documentation:

- SC2InfoExtractorGo Documentation
- DatasetPreparator Documentation
- SC2AnonServerPy Documentation
- SC2_Datasets Documentation

## Contributing

Contributions are welcome! Please see the individual tool repositories for contribution guidelines and development setup instructions.

## Licenses

> [!NOTE]
> Each of the repositories (submodules) contains a separate license. Please refer to the respective submodule for its specific license terms.