An Exercise in Data Analytics on Bibliographic Data

TL;DR: straight to results

Andreas Harth, November 2016

Goal

Create a taxonomy of "data intelligence" in the context of the Semantic Web, Linked Data, Stream Reasoning, and the Internet of Things.

Straightforward Approach

Data-driven Approach

Idea: could we use data from the Semantic Web to get an overview of the topic of "data intelligence"?

Groups that likely represent the (vaguely defined) term "data intelligence"

Now, how do we get the relevant topics, papers and persons in these groups?

Step 1: Identify Data Sources

Step 2: Extract and Prepare Data

Step 3: Integrate Data

DBLP schema
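PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dblp: <http://dblp.org/rdf/schema-2015-01-26#>

# ${FOCUS} is a placeholder for the group whose focus file is queried.
DESCRIBE ?x ?y ?paper ?paper1
FROM <focus-${FOCUS}.nt>
FROM <dblp-2016-07-03.nt>
FROM <dblp-citation-good-links.nt>
FROM <ngrams.nt>
WHERE {
  # ?x: a person that some resource in the focus file points to
  ?s foaf:focus ?x .
  # ?paper: a paper authored by ?x
  ?paper dblp:authoredBy ?x .
  # ?y: any author of that paper (this also matches ?x itself)
  ?paper dblp:authoredBy ?y .
  # ?paper1: any paper authored by ?y, if there is one
  OPTIONAL { ?paper1 dblp:authoredBy ?y . }
}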
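The `${FOCUS}` placeholder suggests that the query is instantiated once per group of researchers. The following is a minimal sketch of that substitution step using only Python's standard library. The group identifiers in `FOCUS_GROUPS` and the helper `build_queries` are hypothetical, since the actual focus file names are not given here, and the sketch only produces query text; it does not run the queries against a SPARQL engine.

```python
from string import Template

# Query text as given in this section; ${FOCUS} is left as a placeholder.
QUERY_TEMPLATE = Template("""\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dblp: <http://dblp.org/rdf/schema-2015-01-26#>

DESCRIBE ?x ?y ?paper ?paper1
FROM <focus-${FOCUS}.nt>
FROM <dblp-2016-07-03.nt>
FROM <dblp-citation-good-links.nt>
FROM <ngrams.nt>
WHERE {
  ?s foaf:focus ?x .
  ?paper dblp:authoredBy ?x .
  ?paper dblp:authoredBy ?y .
  OPTIONAL { ?paper1 dblp:authoredBy ?y . }
}
""")

# Hypothetical group identifiers (assumption): the real names are not stated.
FOCUS_GROUPS = ["iswc2015-spc", "eswc2016-chairs"]


def build_queries(groups):
    """Return a dict mapping each group identifier to its instantiated query."""
    return {group: QUERY_TEMPLATE.substitute(FOCUS=group) for group in groups}


if __name__ == "__main__":
    for group, query in build_queries(FOCUS_GROUPS).items():
        print(f"# --- {group} ---")
        print(query)
```

Each generated query differs only in the `FROM <focus-….nt>` line, so the same pattern over persons, papers and co-authors is applied to every group.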

The results are RDF graphs containing data about the groups of researchers:

Step 4: Rank and Visualise

Step 5: Inspect Results

Authors: Data intelligence on the Internet of Things

Authors: The Clinical Data Intelligence Project - A smart data initiative

Stream Reasoning Workshop 2016 Programme Committee

ISWC 2015 Senior PC

ESWC 2016 Area Chairs

Internet Architecture Board Semantic Interoperability in IoT Workshop Chairs

Web Services and Formal Methods Workshop Chairs (3rd to 11th Edition)

Big Data Value Association Officials

Interpretation of Results

Observations

Follow-up Questions

You have an interesting thought regarding the results? Send me an email.

References