"Data-driven Assessment of Structural Evolution of RDF Graphs"

Execution Times of Global Comparisons

This page reports the execution times of the global comparisons performed in the paper. The following table gives the raw execution times (an associated heat-map version is also available). Recall that the versions in the rows are the ones against which we compare the versions in the columns (i.e., we codify the transactions contained in the versions of the columns). Finally, note that every comparison takes less than 5 minutes (the slowest, DBpedia 3.6 against DBpedia 2015-10, takes 277.487 seconds), which shows the scalability of our approach.

Execution Times (seconds)

| | DBpedia 3.6 | DBpedia 3.7 | DBpedia 3.8 | DBpedia 3.9 | DBpedia 2014 | DBpedia 2015-10 | DBpedia 2016-10 |
|---|---|---|---|---|---|---|---|
| DBpedia 3.6 | - | 102.443 | 117.010 | 170.439 | 180.098 | 277.487 | 272.551 |
| DBpedia 3.7 | 84.796 | - | 87.725 | 131.463 | 171.871 | 214.211 | 211.441 |
| DBpedia 3.8 | 70.712 | 70.870 | - | 98.002 | 124.336 | 175.739 | 182.236 |
| DBpedia 3.9 | 70.642 | 77.645 | 69.110 | - | 119.715 | 164.336 | 180.310 |
| DBpedia 2014 | 64.707 | 76.045 | 80.116 | 114.240 | - | 156.982 | 169.473 |
| DBpedia 2015-10 | 72.486 | 85.573 | 95.254 | 118.817 | 138.493 | - | 193.315 |
| DBpedia 2016-10 | 68.585 | 79.547 | 83.774 | 136.107 | 127.810 | 157.255 | - |

Here, we can find the details of each dataset:
Dataset Details

| | NonSingletonCodes | Alphabet Size | #Transactions | #Items | Average Row |
|---|---|---|---|---|---|
| DBpedia 3.6 | 1,554 | 16,466 | 2,476,538 | 40,567,138 | 16.38 |
| DBpedia 3.7 | 1,199 | 26,810 | 2,899,989 | 64,303,680 | 22.17 |
| DBpedia 3.8 | 859 | 29,416 | 3,581,783 | 83,231,510 | 23.24 |
| DBpedia 3.9 | 851 | 37,136 | 4,685,189 | 114,064,977 | 24.35 |
| DBpedia 2014 | 637 | 45,162 | 5,063,500 | 159,266,389 | 31.45 |
| DBpedia 2015-10 | 907 | 61,580 | 5,948,202 | 206,837,396 | 34.77 |
| DBpedia 2016-10 | 741 | 61,198 | 6,601,796 | 209,169,342 | 31.68 |

To show the trend, we have to analyze the cost of the algorithms used to calculate the similarity. Without going into detail, the main components of the cost of codifying a database are the number of nonSingleton codes that have to be checked for each transaction and the number of transactions (the actual size of each transaction should also appear in the equation, but it is usually much smaller than the other two terms, so for the trend we treat it as a constant). Naively, each measure requires codifying the database twice, once with each code table. To show the trend of the algorithm, the following graph therefore plots the execution times against the sum of the nonSingleton codes in both code tables multiplied by the number of transactions. The execution times follow a linear trend on this variable (NonSingletonCodes × #Transactions), which indicates that our approach scales well.
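As a rough illustration of this analysis, the following Python sketch recomputes the cost variable from the two tables on this page and fits a least-squares line to the execution times. It is not the code used in the paper. It assumes that, for each cell, the codified transactions are those of the column version and that the two code tables are those of the row and column versions; the names `cost`, `points`, and `linear_fit` are hypothetical helpers introduced only for this example.

```python
VERSIONS = ["3.6", "3.7", "3.8", "3.9", "2014", "2015-10", "2016-10"]

# From the "Dataset Details" table: version -> (NonSingletonCodes, #Transactions)
DETAILS = {
    "3.6": (1554, 2476538),
    "3.7": (1199, 2899989),
    "3.8": (859, 3581783),
    "3.9": (851, 4685189),
    "2014": (637, 5063500),
    "2015-10": (907, 5948202),
    "2016-10": (741, 6601796),
}

# From the "Execution Times (seconds)" table: TIMES[i][j] is the time for
# row version i against column version j; the diagonal is not measured.
TIMES = [
    [None, 102.443, 117.010, 170.439, 180.098, 277.487, 272.551],
    [84.796, None, 87.725, 131.463, 171.871, 214.211, 211.441],
    [70.712, 70.870, None, 98.002, 124.336, 175.739, 182.236],
    [70.642, 77.645, 69.110, None, 119.715, 164.336, 180.310],
    [64.707, 76.045, 80.116, 114.240, None, 156.982, 169.473],
    [72.486, 85.573, 95.254, 118.817, 138.493, None, 193.315],
    [68.585, 79.547, 83.774, 136.107, 127.810, 157.255, None],
]


def cost(row, col):
    """Cost proxy: (sum of nonSingleton codes in both code tables) * #transactions.

    Assumption: the transactions being codified are those of the column version.
    """
    codes_row, _ = DETAILS[row]
    codes_col, transactions_col = DETAILS[col]
    return (codes_row + codes_col) * transactions_col


def points():
    """Return (cost, seconds) pairs for every off-diagonal cell."""
    pairs = []
    for i, row in enumerate(VERSIONS):
        for j, col in enumerate(VERSIONS):
            if i != j:
                pairs.append((cost(row, col), TIMES[i][j]))
    return pairs


def linear_fit(pairs):
    """Ordinary least-squares line plus Pearson correlation coefficient."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


if __name__ == "__main__":
    slope, intercept, r = linear_fit(points())
    print(f"slope = {slope:.3e} s per unit of cost")
    print(f"intercept = {intercept:.3f} s")
    print(f"Pearson r = {r:.3f}")
```

A correlation coefficient close to 1 would be consistent with the linear trend described above. Because the sketch ignores transaction size, as the text does, some spread around the fitted line is expected for the later versions, whose average rows are longer.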