Diffstat (limited to 'backend/tol_data/dbpedia/README.md')
-rw-r--r-- backend/tol_data/dbpedia/README.md | 29
1 files changed, 29 insertions, 0 deletions
diff --git a/backend/tol_data/dbpedia/README.md b/backend/tol_data/dbpedia/README.md
new file mode 100644
index 0000000..a708122
--- /dev/null
+++ b/backend/tol_data/dbpedia/README.md
@@ -0,0 +1,29 @@
+This directory holds files obtained or derived from [DBpedia](https://www.dbpedia.org).
+
+# Downloaded Files
+- `labels_lang=en.ttl.bz2` <br>
+ Obtained via <https://databus.dbpedia.org/dbpedia/collections/latest-core>.
+ Downloaded from <https://databus.dbpedia.org/dbpedia/generic/labels/2022.03.01/labels_lang=en.ttl.bz2>.
+- `page_lang=en_ids.ttl.bz2` <br>
+ Downloaded from <https://databus.dbpedia.org/dbpedia/generic/page/2022.03.01/page_lang=en_ids.ttl.bz2>.
+- `redirects_lang=en_transitive.ttl.bz2` <br>
+ Downloaded from <https://databus.dbpedia.org/dbpedia/generic/redirects/2022.03.01/redirects_lang=en_transitive.ttl.bz2>.
+- `disambiguations_lang=en.ttl.bz2` <br>
+ Downloaded from <https://databus.dbpedia.org/dbpedia/generic/disambiguations/2022.03.01/disambiguations_lang=en.ttl.bz2>.
+- `instance-types_lang=en_specific.ttl.bz2` <br>
+ Downloaded from <https://databus.dbpedia.org/dbpedia/mappings/instance-types/2022.03.01/instance-types_lang=en_specific.ttl.bz2>.
+- `short-abstracts_lang=en.ttl.bz2` <br>
+ Downloaded from <https://databus.dbpedia.org/vehnem/text/short-abstracts/2021.05.01/short-abstracts_lang=en.ttl.bz2>.
+
+# Other Files
+- `gen_desc_data.py` <br>
+ Generates `desc_data.db`, a database holding data from the ttl files listed above.
+- `desc_data.db` <br>
+ Generated by `gen_desc_data.py`. <br>
+ Tables: <br>
+ - `labels`: `iri TEXT PRIMARY KEY, label TEXT`
+ - `ids`: `iri TEXT PRIMARY KEY, id INT`
+ - `redirects`: `iri TEXT PRIMARY KEY, target TEXT`
+ - `disambiguations`: `iri TEXT PRIMARY KEY`
+ - `types`: `iri TEXT, type TEXT`
+ - `abstracts`: `iri TEXT PRIMARY KEY, abstract TEXT`
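
The table list in the README above could be queried along the following lines. This is only a sketch, not part of the commit: it assumes `desc_data.db` is an SQLite file, that `redirects.target` holds an IRI, and that a single redirect hop is enough because the downloaded redirects file is the transitive one. The function `lookup_abstract`, the helper `make_demo_db`, and the `ex:` IRIs are hypothetical names used for illustration; the demo builds a small in-memory database with the listed schema rather than reading the real file.

```python
import sqlite3

# Schema copied from the README's table list. Treating desc_data.db as an
# SQLite database is an assumption; the README does not name the engine.
SCHEMA = """
CREATE TABLE labels (iri TEXT PRIMARY KEY, label TEXT);
CREATE TABLE ids (iri TEXT PRIMARY KEY, id INT);
CREATE TABLE redirects (iri TEXT PRIMARY KEY, target TEXT);
CREATE TABLE disambiguations (iri TEXT PRIMARY KEY);
CREATE TABLE types (iri TEXT, type TEXT);
CREATE TABLE abstracts (iri TEXT PRIMARY KEY, abstract TEXT);
"""


def lookup_abstract(con, label):
    """Return (iri, abstract) for a label, or None if nothing usable is found."""
    # Labels are not unique in the schema, so this takes the first match.
    row = con.execute(
        "SELECT iri FROM labels WHERE label = ?", (label,)
    ).fetchone()
    if row is None:
        return None
    iri = row[0]

    # Assumed: redirects are already transitive, so one hop reaches the target.
    redirect = con.execute(
        "SELECT target FROM redirects WHERE iri = ?", (iri,)
    ).fetchone()
    if redirect is not None:
        iri = redirect[0]

    # Disambiguation pages carry no single description, so skip them.
    is_disambiguation = con.execute(
        "SELECT 1 FROM disambiguations WHERE iri = ?", (iri,)
    ).fetchone()
    if is_disambiguation:
        return None

    row = con.execute(
        "SELECT abstract FROM abstracts WHERE iri = ?", (iri,)
    ).fetchone()
    return (iri, row[0]) if row else None


def make_demo_db():
    """Build an in-memory database with toy rows (hypothetical data)."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany(
        "INSERT INTO labels VALUES (?, ?)",
        [("ex:Lion", "Lion"), ("ex:Lions", "Lions"), ("ex:Mercury", "Mercury")],
    )
    con.execute("INSERT INTO redirects VALUES (?, ?)", ("ex:Lions", "ex:Lion"))
    con.execute("INSERT INTO disambiguations VALUES (?)", ("ex:Mercury",))
    con.execute("INSERT INTO abstracts VALUES (?, ?)", ("ex:Lion", "A big cat."))
    con.commit()
    return con


if __name__ == "__main__":
    demo = make_demo_db()
    for name in ("Lion", "Lions", "Mercury"):
        print(name, "->", lookup_abstract(demo, name))
```

To try this against the real file, `sqlite3.connect("desc_data.db")` would replace `make_demo_db()`, provided the SQLite assumption holds.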