| author | Terry Truong <terry06890@gmail.com> | 2022-05-07 11:09:03 +1000 |
|---|---|---|
| committer | Terry Truong <terry06890@gmail.com> | 2022-05-07 11:09:03 +1000 |
| commit | ad82c9dc1eb35036c4078b9cd36ae0924e1ff0d2 (patch) | |
| tree | 83db4a0308009e7d516daf864bf23897224f0508 /backend | |
| parent | 5f8c7e12b6978e50850b434efbdf4062a4284979 (diff) | |
Update README line breaks
Diffstat (limited to 'backend')
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | backend/data/README.md | 17 |
| -rw-r--r-- | backend/data/enwiki/README.md | 12 |
| -rw-r--r-- | backend/data/eol/README.md | 8 |
| -rw-r--r-- | backend/data/otol/README.md | 8 |
4 files changed, 25 insertions, 20 deletions
diff --git a/backend/data/README.md b/backend/data/README.md
index 329de09..209a2cc 100644
--- a/backend/data/README.md
+++ b/backend/data/README.md
@@ -25,12 +25,17 @@ File Generation Process
 
 data.db tables
 ==============
-nodes: name TEXT PRIMARY KEY, children TEXT, parent TEXT, tips INT, p\_support INT
-names: name TEXT, alt\_name TEXT, pref\_alt INT, PRIMARY KEY(name, alt\_name)
-eol\_ids: id INT PRIMARY KEY, name TEXT
-spellfix\_alt\_names
-images: eol\_id INT PRIMARY KEY, source\_url TEXT, license TEXT, copyright\_owner TEXT
-descs: name TEXT PRIMARY KEY, desc TEXT, redirected INT
+- nodes <br>
+    name TEXT PRIMARY KEY, children TEXT, parent TEXT, tips INT, p\_support INT
+- names <br>
+    name TEXT, alt\_name TEXT, pref\_alt INT, PRIMARY KEY(name, alt\_name)
+- eol\_ids <br>
+    id INT PRIMARY KEY, name TEXT
+- spellfix\_alt\_names
+- images <br>
+    eol\_id INT PRIMARY KEY, source\_url TEXT, license TEXT, copyright\_owner TEXT
+- descs <br>
+    name TEXT PRIMARY KEY, desc TEXT, redirected INT
 
 spellfix.so
 ===========
diff --git a/backend/data/enwiki/README.md b/backend/data/enwiki/README.md
index 8e748c9..e4e1aae 100644
--- a/backend/data/enwiki/README.md
+++ b/backend/data/enwiki/README.md
@@ -1,22 +1,22 @@
 Downloaded Files
 ================
-- enwiki\_content/enwiki-20220420-pages-articles-*.xml.gz:
+- enwiki\_content/enwiki-20220420-pages-articles-*.xml.gz <br>
     Obtained via https://dumps.wikimedia.org/backup-index.html (site suggests downloading from a mirror).
     Contains text content and metadata for pages in English Wikipedia (current revision only, excludes talk pages).
     Some file content and format information was available from
     https://meta.wikimedia.org/wiki/Data_dumps/What%27s_available_for_download.
-- enwiki-20220420-page.sql.gz:
+- enwiki-20220420-page.sql.gz <br>
     Obtained like above. Contains page-table information including page id, namespace, title, etc.
     Format information was found at https://www.mediawiki.org/wiki/Manual:Page_table.
-- enwiki-20220420-redirect.sql.gz:
+- enwiki-20220420-redirect.sql.gz <br>
     Obtained like above. Contains page-redirection info.
     Format information was found at https://meta.wikimedia.org/wiki/Data_dumps/What%27s_available_for_download.
 
 Generated Files
 ===============
-- enwiki\_content/enwiki-*.xml and enwiki-*.sql:
+- enwiki\_content/enwiki-*.xml and enwiki-*.sql <br>
     Uncompressed versions of downloaded files.
-- enwikiData.db:
+- enwikiData.db <br>
     An sqlite database representing data from the enwiki dump files.
     Generation:
     1 Install python, and packages mwsql, mwxml, and mwparsefromhell. Example:
@@ -31,5 +31,5 @@ Generated Files
     4 Run genDescData.py, which reads the page-content xml dumps, and the 'pages' and 'redirects' tables,
         and associates page ids with (potentially redirect-resolved) pages, and attempts to parse some wikitext
         within those pages to obtain the first descriptive paragraph, with markup removed.
-- .venv:
+- .venv <br>
     Provides a python virtual environment for packages needed to generate data.
diff --git a/backend/data/eol/README.md b/backend/data/eol/README.md
index d863099..6f1f6c6 100644
--- a/backend/data/eol/README.md
+++ b/backend/data/eol/README.md
@@ -1,15 +1,15 @@
 Downloaded Files
 ================
-- imagesList.tgz:
+- imagesList.tgz <br>
     Obtained from https://opendata.eol.org/dataset/images-list on 24/04/2022.
     Listed as being last updated on 05/02/2020.
-- vernacularNames.csv:
+- vernacularNames.csv <br>
     Obtained from https://opendata.eol.org/dataset/vernacular-names on 24/04/2022.
     Listed as being last updated on 27/10/2020.
 
 Generated Files
 ===============
-- imagesList/:
+- imagesList/ <br>
     Obtained by extracting imagesList.tgz.
-- imagesList.db:
+- imagesList.db <br>
     Represents data from eol/imagesList/*, and is created by genImagesListDb.sh.
diff --git a/backend/data/otol/README.md b/backend/data/otol/README.md
index 58aad3c..a6f13c2 100644
--- a/backend/data/otol/README.md
+++ b/backend/data/otol/README.md
@@ -1,6 +1,6 @@
 Downloaded Files
 ================
-- labelled\_supertree\_ottnames.tre
-    Obtained from https://tree.opentreeoflife.org/about/synthesis-release/v13.4
-- annotations.json
-    Obtained from https://tree.opentreeoflife.org/about/synthesis-release/v13.4
+- labelled\_supertree\_ottnames.tre <br>
+    Obtained from https://tree.opentreeoflife.org/about/synthesis-release/v13.4.
+- annotations.json <br>
+    Obtained from https://tree.opentreeoflife.org/about/synthesis-release/v13.4.
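
For readers who want to see the data.db table listing from the backend/data/README.md diff as an actual schema, the following is a minimal sketch using Python's standard `sqlite3` module. The table and column names come from the README text; everything else is an assumption made for illustration: the `CREATE TABLE` statements, the in-memory database, and the helper names `create_schema` and `table_columns` are hypothetical and are not taken from the project's scripts. The `spellfix_alt_names` table is left out because the README gives no columns for it and it presumably depends on the spellfix extension. Note that `desc` is an SQL keyword, so the sketch quotes it as a column name.

```python
import sqlite3

# Assumed DDL, written from the column lists in the README diff.
# The project's real scripts may create these tables differently.
SCHEMA = """
CREATE TABLE nodes (
    name TEXT PRIMARY KEY, children TEXT, parent TEXT, tips INT, p_support INT);
CREATE TABLE names (
    name TEXT, alt_name TEXT, pref_alt INT, PRIMARY KEY (name, alt_name));
CREATE TABLE eol_ids (
    id INT PRIMARY KEY, name TEXT);
CREATE TABLE images (
    eol_id INT PRIMARY KEY, source_url TEXT, license TEXT, copyright_owner TEXT);
-- "desc" is an SQL keyword, so it is quoted when used as a column name.
CREATE TABLE descs (
    name TEXT PRIMARY KEY, "desc" TEXT, redirected INT);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the tables listed in the README (hypothetical helper)."""
    conn.executescript(SCHEMA)


def table_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return column names for a table; `table` must be a trusted identifier."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
create_schema(conn)

# Example rows with made-up values, only to show the composite key on `names`.
conn.execute("INSERT INTO names VALUES (?, ?, ?)", ("panthera leo", "lion", 1))
conn.execute("INSERT INTO names VALUES (?, ?, ?)", ("panthera leo", "african lion", 0))
alt_names = [r[0] for r in conn.execute(
    "SELECT alt_name FROM names WHERE name = ? ORDER BY pref_alt DESC",
    ("panthera leo",))]
```

Because `names` uses `(name, alt_name)` as its primary key, one node can carry several alternative names, while inserting the same pair twice raises `sqlite3.IntegrityError`.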
