Concept2vec: Metrics for Evaluating Quality of Embeddings for Ontological Concepts
March 12, 2018 · Entered Twilight · arXiv.org
"Last commit was 7.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: Code, Data, README.md, Results, Tasks (Random Data Sampling)
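The evidence line above reflects a simple staleness rule: flag a repository once its most recent commit is at least five years old. Below is a minimal sketch of such a check, assuming plain git and Python; the function names and threshold handling are illustrative assumptions, not the scanner's actual implementation.

```python
# Illustrative staleness check in the spirit of the scanner's evidence.
# Assumption: the repository is cloned locally and `git` is on PATH.
import subprocess
import time

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_since_last_commit(repo_path: str) -> float:
    """Return the age of the most recent commit, in years."""
    # %ct prints the committer date as a Unix timestamp.
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "-1", "--format=%ct"],
        capture_output=True, text=True, check=True,
    )
    last_commit_ts = int(out.stdout.strip())
    return (time.time() - last_commit_ts) / SECONDS_PER_YEAR

def is_abandoned(repo_path: str, threshold_years: float = 5.0) -> bool:
    """Flag the repo if its last commit meets the age threshold."""
    return years_since_last_commit(repo_path) >= threshold_years
```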
Authors
Faisal Alshargi, Saeedeh Shekarpour, Tommaso Soru, Amit Sheth
arXiv ID
1803.04488
Category
cs.CL: Computation & Language
Cross-listed
cs.AI
Citations
13
Venue
arXiv.org
Repository
https://github.com/alshargi/Concept2vec
⭐ 15
Last Checked
1 month ago
Abstract
Although there is an emerging trend towards generating embeddings for primarily unstructured data and, more recently, for structured data, no systematic suite for measuring the quality of embeddings has been proposed yet. This deficiency is felt especially for embeddings of structured data, because no concrete evaluation metrics measure how well structural as well as semantic patterns are encoded in the embedding space. In this paper, we introduce a framework containing three distinct tasks concerned with the individual aspects of ontological concepts: (i) the categorization aspect, (ii) the hierarchical aspect, and (iii) the relational aspect. Then, within the scope of each task, a number of intrinsic metrics are proposed for evaluating the quality of the embeddings. Furthermore, using this framework, multiple experimental studies were run to compare the quality of the available embedding models. Employing this framework in future research can reduce misjudgment and provide greater insight into quality comparisons of embeddings for ontological concepts. Our sampled data and code are available at https://github.com/alshargi/Concept2vec under the GNU General Public License v3.0.
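To make the flavor of these intrinsic metrics concrete, here is a minimal sketch of one plausible measure for the categorization task: the cosine similarity between a concept's embedding and the centroid of its instances' embeddings. This formulation, and every name in it, is an illustrative assumption; the paper and the repository define the actual metrics.

```python
# Illustrative categorization-style metric: how close a concept's embedding
# lies to the centroid of its instances' embeddings. Higher is better.
# This is a sketch, not the paper's exact formulation.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def categorization_score(concept_vec: np.ndarray,
                         instance_vecs: np.ndarray) -> float:
    """Similarity of a concept embedding to the mean of its instances."""
    centroid = instance_vecs.mean(axis=0)
    return cosine(concept_vec, centroid)

# Toy usage with random 50-dimensional embeddings: instances are sampled
# near the concept vector, so the score should be close to 1.
rng = np.random.default_rng(0)
concept = rng.normal(size=50)
instances = concept + 0.1 * rng.normal(size=(10, 50))
print(f"categorization score: {categorization_score(concept, instances):.3f}")
```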
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P. · 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P. · 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P. · 👻 Ghosted