From db9321ca32f283f7fd59e2e8b5f8a695c66fce98 Mon Sep 17 00:00:00 2001
From: Terry Truong
Date: Wed, 28 Dec 2022 19:46:34 +1100
Subject: Use scale-associated event scores for uniform spread

Add 'dist' and 'scores' db tables
Add 'scale' server query parameter (and remove 'excl')
Disregard event-end when querying database
---
 backend/hist_data/README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

(limited to 'backend/hist_data/README.md')

diff --git a/backend/hist_data/README.md b/backend/hist_data/README.md
index bfecc1e..1a832ef 100644
--- a/backend/hist_data/README.md
+++ b/backend/hist_data/README.md
@@ -27,9 +27,15 @@ This directory holds files used to generate the history database data.db.
 - `event_imgs`:
  Format: `id INT PRIMARY KEY, img_id INT`
  Assocates events with images
-- `descs`
+- `descs`:
  Format: `id INT PRIMARY KEY, wiki_id INT, desc TEXT`
  Associates an event's enwiki title with a short description.
+- `dist`:
+ Format: `scale INT, unit INT, count INT, PRIMARY KEY (scale, unit)`
+ Maps scale units to event counts.
+- `scores`:
+ Format: `id INT, scale INT, score INT, PRIMARY KEY (id, scale)`
+ Maps events to score values for each scale (used to show events by popularity and uniformity across scale).
 # Generating the Database
@@ -76,3 +82,6 @@ Some of the scripts use third-party packages:
 ## Remove Events Without Images/Descs
 1. Run `reduce_event_data.py` to remove data for events that have no image/description.
+
+## Generate Distribution and Score Data
+1. Run `gen_score_data.py`, which adds the `dist` and `scores` tables.
-- 
cgit v1.2.3
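
The two table formats documented in this patch can be tried out with a minimal sketch like the one below. It uses Python's standard `sqlite3` module against an in-memory database. The column definitions are copied from the README text; the sample rows, the `create_tables` and `top_events` helpers, and the query itself are hypothetical illustrations, not the logic of `gen_score_data.py` or the server, which this diff does not show.

```python
import sqlite3


def create_tables(db: sqlite3.Connection) -> None:
    """Create `dist` and `scores` using the formats given in the README."""
    db.execute(
        'CREATE TABLE dist (scale INT, unit INT, count INT, PRIMARY KEY (scale, unit))')
    db.execute(
        'CREATE TABLE scores (id INT, scale INT, score INT, PRIMARY KEY (id, scale))')


def top_events(db: sqlite3.Connection, scale: int, limit: int) -> list[int]:
    """Hypothetical query: ids of the highest-scoring events for one scale."""
    rows = db.execute(
        'SELECT id FROM scores WHERE scale = ? ORDER BY score DESC, id LIMIT ?',
        (scale, limit))
    return [row[0] for row in rows]


db = sqlite3.connect(':memory:')
create_tables(db)

# Made-up sample data: scale 100 has two units holding 2 and 1 events.
db.executemany('INSERT INTO dist VALUES (?, ?, ?)', [(100, 19, 2), (100, 20, 1)])
# Made-up scores for three events at scale 100.
db.executemany('INSERT INTO scores VALUES (?, ?, ?)',
               [(1, 100, 50), (2, 100, 80), (3, 100, 10)])

print(top_events(db, 100, 2))
```

Because `scores` is keyed on `(id, scale)`, the same event can carry a different score at each scale, which is what a per-scale lookup such as the hypothetical `top_events` relies on.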