Sources & Methodology

KitabiHub is built from open bibliographic data. This page explains exactly where the records come from, how they're filtered, and how often we re-build the catalogue.

Current build

  • Primary source: Open Library Search API (https://openlibrary.org/developers/api)
  • Records indexed: 600
  • Last build: May 3, 2026

Primary source: Open Library

Our default source is Open Library, an open, editable catalogue of more than 20 million books maintained by the Internet Archive. We query the Search API filtered to language:ara across the major Arabic literary subjects — literature, poetry, fiction, novels, drama, short stories — to surface Arabic-language works.
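
As a rough illustration of the query described above, the sketch below assembles one Search API URL per subject. It is not our build script: the function name, the page size, and the exact query string are assumptions for illustration, and the description field is left out because this sketch only requests fields we are sure the search endpoint exposes.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://openlibrary.org/search.json"

# Subjects listed on this page; the query wording itself is an assumption.
SUBJECTS = ["literature", "poetry", "fiction", "novels", "drama", "short stories"]

# Search-result fields matching the data points named on this page.
FIELDS = "key,title,author_name,first_publish_year,subject,edition_count,cover_i"


def build_search_url(subject, page=1, limit=100):
    """Return a Search API URL for Arabic-language works on one subject."""
    params = {
        "q": f'language:ara subject:"{subject}"',
        "fields": FIELDS,
        "limit": limit,  # hypothetical page size
        "page": page,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


urls = [build_search_url(subject) for subject in SUBJECTS]
```

Building the URLs separately from fetching them keeps the query logic easy to inspect and test without touching the network.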

For each record we pull the title, author(s), publication year, subject tags, edition count, cover image identifier, and the longest available description. We then apply our own normalisation: deduplicating editions of the same work, mapping subject tags to KitabiHub's editorial categories, slugifying titles for clean URLs, and writing the result to disk as a static JSON file the PHP templates read at request time.
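
Two of the normalisation steps above, slugifying and deduplicating, can be sketched as follows. This is a minimal sketch under stated assumptions: the record keys (title, authors, edition_count) are hypothetical, and keeping the record with the highest edition count is one plausible tie-break, not necessarily the rule our build applies.

```python
import re
import unicodedata


def slugify(title):
    """Lowercase a title and join its words with hyphens, keeping Arabic letters."""
    text = unicodedata.normalize("NFKC", title).lower()
    # \w is Unicode-aware in Python 3, so Arabic letters survive;
    # punctuation, whitespace, and combining diacritics become hyphens.
    text = re.sub(r"[^\w]+", "-", text)
    return text.strip("-_")


def dedupe_works(records):
    """Keep one record per (title slug, authors) pair."""
    best = {}
    for rec in records:
        authors = tuple(sorted(a.strip().lower() for a in rec.get("authors") or []))
        key = (slugify(rec["title"]), authors)
        kept = best.get(key)
        # Assumption: prefer the record that aggregates the most editions.
        if kept is None or rec.get("edition_count", 0) > kept.get("edition_count", 0):
            best[key] = rec
    return list(best.values())
```

For example, slugify("Season of Migration to the North!") gives "season-of-migration-to-the-north", and an Arabic title keeps its letters with spaces replaced by hyphens.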

Fallback source: Wikidata

If the Open Library API is unavailable or returns too few records during a build, KitabiHub falls back to the Wikidata SPARQL endpoint. We query for literary works (Q7725634) written in Arabic whose authors hold citizenship of an Arab League state, retrieving the title, author, genre, and publication year, together with the lead paragraph of the linked Wikipedia article.
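
A fallback query of this shape might look like the sketch below. Only Q7725634 comes from this page; the other identifiers (P31, P407, P50, P27, P136, P577, and Q13955 for Arabic) reflect our reading of Wikidata and should be verified before use. The country list is supplied by the caller rather than hard-coded, and fetching the Wikipedia lead paragraph is not covered here.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_sparql(country_qids, limit=500):
    """Build a SPARQL query for Arabic literary works by authors from the given countries."""
    values = " ".join(f"wd:{qid}" for qid in country_qids)
    return f"""
SELECT ?work ?workLabel ?authorLabel ?genreLabel ?date WHERE {{
  VALUES ?country {{ {values} }}
  ?work wdt:P31 wd:Q7725634 ;   # instance of: literary work
        wdt:P407 wd:Q13955 ;    # language of work: Arabic (assumed ID)
        wdt:P50 ?author .       # author
  ?author wdt:P27 ?country .    # country of citizenship
  OPTIONAL {{ ?work wdt:P136 ?genre . }}
  OPTIONAL {{ ?work wdt:P577 ?date . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "ar,en". }}
}}
LIMIT {limit}
""".strip()


def build_request_url(country_qids, limit=500):
    """Return the GET URL that asks the endpoint for JSON results."""
    params = {"query": build_sparql(country_qids, limit), "format": "json"}
    return f"{WDQS_ENDPOINT}?{urlencode(params)}"


# Example with two member states: Q79 (Egypt) and Q822 (Lebanon).
query = build_sparql(["Q79", "Q822"])
```

Passing the countries in as a list keeps the membership question outside the query, so the list can be reviewed and updated by hand.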

Editorial filtering

We exclude:

  • Translations into Arabic — KitabiHub catalogues works originally written in Arabic.
  • Religious scripture and primary religious texts (these belong in dedicated specialist catalogues).
  • Textbooks, grammars, and language-learning materials.
  • Records with no usable title or author.
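
The exclusion rules above could be expressed as a single predicate, sketched below. The record keys (title, authors, subjects, original_language) and the tag list are hypothetical; the sketch also assumes a record is dropped when either the title or the author is missing.

```python
# Hypothetical subject tags that signal an excluded kind of record.
EXCLUDED_TAGS = {
    "scripture",
    "sacred books",
    "textbooks",
    "grammar",
    "language learning",
}


def is_excluded(record):
    """Return True if a record should be left out of the catalogue."""
    title = (record.get("title") or "").strip()
    authors = [a for a in record.get("authors") or [] if a and a.strip()]
    if not title or not authors:
        return True  # no usable title or author
    # Assumption: records carry the language the work was first written in.
    if record.get("original_language", "ara") != "ara":
        return True  # translation into Arabic
    tags = {t.strip().lower() for t in record.get("subjects") or []}
    return bool(tags & EXCLUDED_TAGS)
```

A build step would then keep only the records for which is_excluded returns False.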

Categories

The category vocabulary on KitabiHub is editorial, not algorithmic. We map raw subject tags from the source databases onto a smaller, more readable set of literary categories: Poetry, Novels, Short Stories, Classical Arabic Literature, Modern Arabic Fiction, Drama, Essays & Letters, and a handful of regional and thematic shelves. Each book may appear in multiple categories.
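
In code, an editorial mapping like this is just a lookup table. The sketch below shows the idea; the table entries are invented examples rather than our actual mapping, and tags with no entry are simply ignored.

```python
# Hypothetical excerpt of a tag-to-category table (lowercased source tags).
TAG_TO_CATEGORIES = {
    "poetry": {"Poetry"},
    "arabic poetry": {"Poetry"},
    "novels": {"Novels"},
    "short stories": {"Short Stories"},
    "drama": {"Drama"},
    "essays": {"Essays & Letters"},
    "letters": {"Essays & Letters"},
}


def categorise(subjects):
    """Map raw subject tags to a sorted list of editorial categories."""
    categories = set()
    for tag in subjects:
        categories |= TAG_TO_CATEGORIES.get(tag.strip().lower(), set())
    return sorted(categories)
```

Because the result is a set union, a book tagged with both "Poetry" and "Drama" lands on both shelves, and duplicate tags do not produce duplicate categories.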

Cover images

Cover thumbnails are loaded directly from the source catalogue's image CDN — we do not re-host them. If a cover is missing or has been taken down upstream, you'll see KitabiHub's neutral placeholder.
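
For Open Library records, the thumbnail address can be derived from the cover identifier, as in the sketch below. The placeholder path is hypothetical, and the sketch only handles a missing identifier at build time; a cover removed upstream after the build has to be caught when the image loads.

```python
COVERS_BASE = "https://covers.openlibrary.org/b/id"
PLACEHOLDER = "/assets/cover-placeholder.svg"  # hypothetical local path


def cover_url(cover_id, size="M"):
    """Return the cover CDN URL for a record, or the placeholder if it has none."""
    if size not in {"S", "M", "L"}:
        raise ValueError("size must be 'S', 'M' or 'L'")
    if not cover_id:
        return PLACEHOLDER
    return f"{COVERS_BASE}/{cover_id}-{size}.jpg"
```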

Update cadence

We re-build the full catalogue on a rolling schedule, typically every few weeks, so titles newly indexed upstream appear here with the next build. Major source-data corrections (a renamed author, a re-categorised work) propagate automatically on the next build.

Reproducibility

Because every page is rendered from the same on-disk JSON snapshot, two visits to the same URL between builds return identical content, which is handy for citing KitabiHub in academic work. The "Last build" date under Current build tells you which snapshot you're looking at.