OntoSplit

This repository contains the dataset, experimental scripts, and results for a study on ontology partitioning and SPARQL query optimization. The research aims to reduce the execution time of complex SPARQL queries by splitting large RDF/XML ontologies into parts and querying those parts in parallel in Apache Jena Fuseki.
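
A minimal sketch of the parallel-execution idea described above, not the repository's benchmarking code: the same query is sent concurrently to one endpoint per ontology part and the result rows are concatenated. The function names `sparql_select` and `query_partitions`, and the assumption that each part is exposed as its own SPARQL endpoint, are illustrative. Plain concatenation is only valid for queries whose answers can be computed per part; queries with aggregates, DISTINCT, ORDER BY, or joins spanning several parts would need an extra merge step.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def sparql_select(endpoint, query, timeout=60):
    """POST a SELECT query to one SPARQL endpoint and return its result rows."""
    # URL-encoded POST body with a "query" parameter (SPARQL 1.1 Protocol).
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


def query_partitions(endpoints, query, fetch=sparql_select):
    """Run the same query on every part concurrently and concatenate the rows.

    `fetch` is injectable so the fan-out logic can be exercised without a server.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        # pool.map keeps the order of `endpoints` in its results.
        per_partition = list(pool.map(lambda ep: fetch(ep, query), endpoints))
    return [row for rows in per_partition for row in rows]
```

With hypothetical Fuseki datasets named `part1` and `part2` on a local server, a call might look like `query_partitions(["http://localhost:3030/part1/sparql", "http://localhost:3030/part2/sparql"], "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10")`.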

📂 Contents:

📂 ./data - all ontology files (RDF/XML structures), original and partitioned, plus any sample datasets or additional resources needed for experiments and demonstrations
📊 ./benchmarking-data - experimental data
📊 ./benchmarking-data/benchmark.xlsx - final test results: tables and charts
📜 ./benchmarking-data/sparql-queries - test SPARQL queries categorized by execution time (1 - fast, 2 - medium, 3 - slow)
📜 ./benchmarking-data/results-time - JSON files recording SPARQL query execution times for each category (1 - fast, 2 - medium, 3 - slow) across ontology partition configurations (1–15 parts)
🔧 ./benchmarking-data/scripts - Python scripts for running the benchmarks and calculating results
🔧 ./scripts/ontology-creation - Python scripts for ontology creation (PDF to JSON; JSON to RDF/XML ontology with different splitting options)
📕 ./parsed-pdfs-json - files related to the dataset PDFs: the original PDFs (optional) and the JSON output of the parsing scripts
📖 ./docs/ - methodology, findings, and implementation details (TODO)
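
One simple way to produce N parts from an RDF/XML file can be sketched as follows. This is an illustrative round-robin split of the top-level elements under rdf:RDF, written with the Python standard library only; it is not the splitting logic of the scripts in ./scripts/ontology-creation, whose options are not described here. The function name `split_rdfxml` is hypothetical. The sketch ignores blank-node identity (`rdf:nodeID`) across parts and makes no attempt to keep related resources in the same part.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def split_rdfxml(xml_text, parts):
    """Distribute the top-level children of rdf:RDF round-robin over `parts` documents."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Keep the conventional "rdf" prefix in the serialized output.
    ET.register_namespace("rdf", RDF_NS)
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{RDF_NS}}}RDF":
        raise ValueError("expected an rdf:RDF root element")
    # Each part gets its own rdf:RDF root carrying the original root attributes.
    roots = [ET.Element(root.tag, dict(root.attrib)) for _ in range(parts)]
    for index, child in enumerate(root):
        roots[index % parts].append(child)
    return [ET.tostring(part_root, encoding="unicode") for part_root in roots]
```

Each returned string is a standalone RDF/XML document that could be written to its own file and loaded as a separate dataset.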

🚀 Sponsor this project

Please support @malakhovks. Despite the war in Ukraine, R&D in the fields of Digital Health and Ontology Engineering is being resumed:

Via credit card: https://send.monobank.ua/jar/5ad56oNAcD

Public Address to Receive USDT (BEP20): 0x1128A7b84728123dd4F55176c378754Dd396A674

Pay me via Trust Wallet: https://link.trustwallet.com/send?asset=c20000714_t0x55d398326f99059fF775485246999027B3197955&address=0x1128A7b84728123dd4F55176c378754Dd396A674

🔍 Key Topics:

SPARQL query optimization
Ontology partitioning (sharding)
Parallel query execution
Apache Jena Fuseki performance benchmarking
Semantic Web & RDF processing

🚀 Future Work:

The repository will be updated with further optimizations, including machine learning-based query performance prediction and dynamic ontology partitioning.

Contributions and discussions are welcome!

📖 How to Cite

If you use this repository in your research, please cite it as follows:

🔹 APA citation format for articles:

Palagin, O.V., Petrenko, M.G., Kaverinskiy, V.V., & Malakhov, K.S. (2025). Method for Increasing the Efficiency of OWL/RDF-Structures Processing in Apache Jena Semantic Web Framework Environment. Cybernetics and Systems Analysis, __(_), __ - __. https://doi.org/
Kaverinskiy, V.V., Petrenko, M.G., & Malakhov, K.S. (2025).

🔹 BibTeX citation format for repository:

@misc{OntoSplit,
  author = {Kyrylo Malakhov and Vladislav Kaverinskiy},
  title = {OntoSplit: Ontology Partitioning and SPARQL Query Optimization},
  year = {2024},
  howpublished = {GitHub Repository},
  url = {https://github.com/knowledge-ukraine/OntoSplit}
}

📕 Dataset

EBSCO articles dataset (domain: rehabilitation medicine), plus a JSON representation of every article

wget -O ./ebsco-rehabilitation-dataset.zip https://cdn.e-rehab.pp.ua/u/ebsco-rehabilitation-dataset.zip

💳 Funding

This study would not have been possible without the financial support of the National Research Foundation of Ukraine (Open Funder Registry: 10.13039/100018227). Our work was funded under the following grant contract:

Development of the cloud-based platform for patient-centered telerehabilitation of oncology patients with mathematical-related modeling, application ID: 2021.01/0136.

