diff options
Diffstat (limited to 'backend/hist_data/enwiki/README.md')
| -rw-r--r-- | backend/hist_data/enwiki/README.md | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/backend/hist_data/enwiki/README.md b/backend/hist_data/enwiki/README.md index 29fc2ff..76d33e5 100644 --- a/backend/hist_data/enwiki/README.md +++ b/backend/hist_data/enwiki/README.md @@ -33,12 +33,12 @@ This directory holds files obtained/derived from [English Wikipedia](https://en. # Image Files - `gen_img_data.py` <br> - Used to find infobox image names for page IDs, and store them into a database. + Finds infobox image names for page IDs, and stores them into a database. - `download_img_license_info.py` <br> - Used to download licensing metadata for image names, via wikipedia's online API, and store them into a database. + Downloads licensing metadata for image names, via wikipedia's online API, and stores them into a database. - `img_data.db` <br> - Used to hold metadata about infobox images for a set of page IDs. - Generated using `get_enwiki_img_data.py` and `download_img_license_info.py`. <br> + Holds metadata about infobox images for a set of page IDs. + Generated using `gen_img_data.py` and `download_img_license_info.py`. <br> Tables: <br> - `page_imgs`: `page_id INT PRIMARY KEY, title TEXT UNIQUE, img_name TEXT` <br> `img_name` may be NULL, which means 'none found', and is used to avoid re-processing page IDs. @@ -47,7 +47,7 @@ This directory holds files obtained/derived from [English Wikipedia](https://en. <br> Might lack some matches for `img_name` in `page_imgs`, due to licensing info unavailability. - `download_imgs.py` <br> - Used to download image files into imgs/. + Downloads image files into imgs/. # Description Files - `gen_desc_data.py` <br> |
