path: root/backend/hist_data/enwiki/README.md
author: Terry Truong <terry06890@gmail.com> 2022-10-01 23:14:08 +1000
committer: Terry Truong <terry06890@gmail.com> 2022-10-02 00:01:01 +1000
commit: 1b4fc8667714ef4ce9f326bd14f795fc2417ecb9
tree: 474f54e527a1f26e33c92fd54a718e697de75744 /backend/hist_data/enwiki/README.md
parent: 4edb7998012bcc804482a76277cd25b90fb373c9
Add per-event-category image limit
Diffstat (limited to 'backend/hist_data/enwiki/README.md')
-rw-r--r-- backend/hist_data/enwiki/README.md | 24
1 files changed, 12 insertions, 12 deletions
diff --git a/backend/hist_data/enwiki/README.md b/backend/hist_data/enwiki/README.md
index e50c7e2..dd090ca 100644
--- a/backend/hist_data/enwiki/README.md
+++ b/backend/hist_data/enwiki/README.md
@@ -29,6 +29,18 @@ This directory holds files obtained/derived from [English Wikipedia](https://en.
- `redirects`: `id INT PRIMARY KEY, target TEXT`
- `descs`: `id INT PRIMARY KEY, desc TEXT`
+# Page View Files
+- `pageviews/pageviews-*-user.bz2`
+ Each holds wikimedia article page view data for some month.
+ Obtained via <https://dumps.wikimedia.org/other/pageview_complete/monthly/>.
+ Some format info was available from <https://dumps.wikimedia.org/other/pageview_complete/readme.html>.
+- `gen_pageview_data.py` <br>
+ Reads pageview/* and `dump_index.db`, and creates a database holding average monthly pageview counts.
+- `pageview_data.db` <br>
+ Generated using `gen_pageview_data.py`. <br>
+ Tables: <br>
+ - `views`: `title TEXT PRIMARY KEY, id INT UNIQUE, views INT`
+
# Image Files
- `gen_img_data.py` <br>
Used to find infobox image names for page IDs, and store them into a database.
@@ -46,15 +58,3 @@ This directory holds files obtained/derived from [English Wikipedia](https://en.
Might lack some matches for `img_name` in `page_imgs`, due to licensing info unavailability.
- `download_imgs.py` <br>
Used to download image files into imgs/.
-
-# Page View Files
-- `pageviews/pageviews-*-user.bz2`
- Each holds wikimedia article page view data for some month.
- Obtained via <https://dumps.wikimedia.org/other/pageview_complete/monthly/>.
- Some format info was available from <https://dumps.wikimedia.org/other/pageview_complete/readme.html>.
-- `gen_pageview_data.py` <br>
- Reads pageview/* and `dump_index.db`, and creates a database holding average monthly pageview counts.
-- `pageview_data.db` <br>
- Generated using `gen_pageview_data.py`. <br>
- Tables: <br>
- - `views`: `title TEXT PRIMARY KEY, id INT UNIQUE, views INT`
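
Reviewer note: the relocated section describes `pageview_data.db` as holding average monthly pageview counts in a `views` table. The diff does not show `gen_pageview_data.py` itself, so the following is only a minimal sketch of how such a table could be filled. The function name `store_average_views`, the shape of its input (one dict per month, mapping title to a page ID and view count), and the choice to average over the number of months processed with integer division are all assumptions, not the script's actual behaviour. Only the table schema is taken from the README.

```python
import sqlite3
from collections import defaultdict


def store_average_views(monthly_counts, db_path=":memory:"):
    """Hypothetical helper, not taken from gen_pageview_data.py.

    monthly_counts: list with one dict per month, each mapping
    title -> (page_id, views). This input shape is an assumption.
    """
    totals = defaultdict(int)  # title -> summed views across all months
    ids = {}                   # title -> page ID (last one seen)
    for month in monthly_counts:
        for title, (page_id, views) in month.items():
            totals[title] += views
            ids[title] = page_id

    # Assumption: average over every month processed, so a page that is
    # missing from a month counts as having zero views in that month.
    num_months = len(monthly_counts)

    con = sqlite3.connect(db_path)
    # Schema as listed in the README for pageview_data.db.
    con.execute(
        "CREATE TABLE views (title TEXT PRIMARY KEY, id INT UNIQUE, views INT)"
    )
    con.executemany(
        "INSERT INTO views VALUES (?, ?, ?)",
        [(title, ids[title], total // num_months)
         for title, total in totals.items()],
    )
    con.commit()
    return con
```

With two months of input, for example `store_average_views([{"A": (1, 10), "B": (2, 4)}, {"A": (1, 20)}])`, the table would hold one row per title with the summed views divided by two. Parsing of the `pageviews-*-user.bz2` files is deliberately left out, since their line format is not described in this diff.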