Swahili-English Dictionary Data and Interface

This repository contains code that extracts the entries of a Swahili-English dictionary from a PDF into a JSON dataset. It also contains a simple CLI for searching the data and a PWA that serves the same purpose.

An article explaining the code in this repo can be read online here or from docs/kamusi.qmd.

Usage

The finished results are already part of the repo and can be used in two ways: through the CLI or through the PWA. The CLI requires duckdb and a Python environment with rich installed. First load the data into DuckDB and create an FTS (full-text search) index with:

duckdb data/kamusi.db < create_kamusi.sql

Then query for a term by passing it as an argument to query_kamusi.py.

To use the PWA, visit the online version here, or start a local server in the website directory of this repo:

cd website && python -m http.server
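If you would rather start the server from a script, the command above can be approximated with Python's standard library. This is only a minimal sketch and not part of the repo: the helper names `make_server` and `serve` are hypothetical, and it assumes it is run from the repo root so that the `website` directory resolves. Unlike `python -m http.server`, which listens on all interfaces by default, this sketch binds to `127.0.0.1` only.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory="website", host="127.0.0.1", port=8000):
    """Build (but do not start) a static-file server rooted at `directory`.

    `make_server` is a hypothetical helper for illustration. Port 8000 matches
    the default of `python -m http.server`; pass port=0 to let the OS choose
    a free port.
    """
    # Bind the directory into the handler instead of calling os.chdir().
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


def serve(directory="website", host="127.0.0.1", port=8000):
    """Serve `directory` until interrupted with Ctrl+C (hypothetical helper)."""
    server = make_server(directory, host, port)
    bound_host, bound_port = server.server_address[:2]
    print(f"Serving {directory!r} at http://{bound_host}:{bound_port}/")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling `serve()` from the repo root should then make the PWA reachable at http://127.0.0.1:8000/.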

To run the extraction code yourself, first install the project's Python requirements:

pip install --requirement requirements.txt

Then run python kamusi.py.

TODO: add link to GitHub repo to PWA
