Linked Crunchbase: A Linked Data API and RDF Data Set About Innovative Companies

July 19, 2019 · Entered Twilight · 🏛 arXiv.org

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitattributes, .gitignore, README.md, index.html, jquery-1.9.1.min.js, jquery.singlePageNav.js, jquery.singlePageNav.min.js

Authors: Michael Färber
arXiv ID: 1907.08671
Category: cs.DB (Databases)
Cross-listed: cs.AI, cs.IR
Citations: 3
Venue: arXiv.org
Repository: https://github.com/ChrisWojcik/single-page-nav (⭐ 158)
Last Checked: 29 days ago
Abstract
Crunchbase is an online platform collecting information about startups and technology companies, including attributes and relations of companies, people, and investments. Data contained in Crunchbase is, to a large extent, not available elsewhere, making Crunchbase a unique data source. In this paper, we present how to bring Crunchbase to the Web of Data so that its data can be used in the machine-readable RDF format by anyone on the Web. First, we give insights into how we developed and hosted a Linked Data API for Crunchbase and how sameAs links to other data sources are integrated. Then, we present our method for crawling RDF data based on this API to build a custom Crunchbase RDF knowledge graph. We created an RDF data set with over 347 million triples, including 781k people, 659k organizations, and 343k investments. Our Crunchbase Linked Data API is available online at http://linked-crunchbase.org.
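To make the crawling idea in the abstract concrete, here is a minimal sketch of a breadth-first "follow the links" crawl over a Linked Data API. It is not the authors' method: the `FAKE_API` dictionary, the `fetch_triples` and `crawl` helpers, and every `example.org` / `example.net` URI and predicate are hypothetical stand-ins, and the in-memory dictionary replaces real HTTP dereferencing so the example runs offline. Only the `owl:sameAs` predicate URI is a real, standard identifier.

```python
from collections import deque

# Standard OWL predicate used to link equivalent resources across data sets.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical stand-in for dereferencing URIs through a Linked Data API.
# All URIs and predicates below are invented for illustration.
FAKE_API = {
    "http://example.org/organization/acme": [
        ("http://example.org/organization/acme",
         "http://example.org/vocab#founder",
         "http://example.org/person/alice"),
        ("http://example.org/organization/acme",
         OWL_SAME_AS,
         "http://other.example.net/resource/Acme"),
    ],
    "http://example.org/person/alice": [
        ("http://example.org/person/alice",
         "http://example.org/vocab#investedIn",
         "http://example.org/organization/acme"),
    ],
}


def fetch_triples(uri):
    """Return the (subject, predicate, object) triples served for a URI."""
    return FAKE_API.get(uri, [])


def crawl(seed, namespace):
    """Breadth-first crawl that only follows objects inside `namespace`."""
    seen = {seed}
    queue = deque([seed])
    triples = set()
    while queue:
        uri = queue.popleft()
        for s, p, o in fetch_triples(uri):
            triples.add((s, p, o))
            # Follow links within the API's namespace; external sameAs
            # targets are recorded as triples but not dereferenced.
            if o.startswith(namespace) and o not in seen:
                seen.add(o)
                queue.append(o)
    return triples


graph = crawl("http://example.org/organization/acme", "http://example.org/")
same_as = sorted(o for _, p, o in graph if p == OWL_SAME_AS)
```

In this sketch the crawl stays inside one namespace, so links to other data sources end up in the collected graph as `owl:sameAs` triples without triggering further requests.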

📜 Similar Papers

In the same crypt: Databases

R.I.P. 👻 Ghosted

Datasheets for Datasets

Timnit Gebru, Jamie Morgenstern, ... (+5 more)

cs.DB · 🏛 CACM · 📚 2.6K cites · 8 years ago