TamTam Research blog

Czech vocabularies development

Within the LOD-ROADTRAN18 project, we proposed 2 satellite ontologies.

For both ontologies, several resources could be used to create linked data records. However, the available sources are rather sparse for the Roads vocabulary.

Roads vocabulary creation

Roads in Czechia are functionally separated into 3 types: motorways, national roads, and purpose roads and streets. National roads are further separated into 4 classes, where the 1st class is maintained at the state level by ŘSD ČR and the lower classes at the district level by the respective administrative domain.

Czech Roads vocabulary
source: TamTam Research

There is no publicly available official database of all roads maintained by either the Ministry of Transport or ŘSD ČR. There are, however, a number of semi-official resources that can be used as a basis to fill in the vocabulary. As a starting point we used the Czech ALERT-C location database, from which we extracted information about motorways and national roads, i.e. Name, Destination and Origin, and Location Code. As additional sources, to fill in the road itinerary, we used the official ŘSD website combined with Wikipedia information on the particular road.
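The extraction step can be illustrated with a minimal sketch. It assumes the location database has been exported to a semicolon-delimited text table; the column names (`LCD`, `ROADNUMBER`, `ORIGIN`, `DESTINATION`) and the sample rows are hypothetical placeholders, not the actual layout of the Czech ALERT-C tables.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class RoadRecord:
    """One road entry with the four fields named in the text."""
    location_code: int
    name: str
    origin: str
    destination: str


# Hypothetical export: column names and values are placeholders only.
SAMPLE = """LCD;ROADNUMBER;ORIGIN;DESTINATION
101;X1;Town A;Town B
102;X2;Town C;Town D
"""


def read_roads(text: str) -> list[RoadRecord]:
    """Parse a semicolon-delimited table into RoadRecord objects."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        RoadRecord(
            location_code=int(row["LCD"]),
            name=row["ROADNUMBER"].strip(),
            origin=row["ORIGIN"].strip(),
            destination=row["DESTINATION"].strip(),
        )
        for row in reader
    ]


roads = read_roads(SAMPLE)
for road in roads:
    print(road.location_code, road.name, road.origin, "->", road.destination)
```

Keeping the Location Code as the record key makes it straightforward to attach further attributes, such as the itinerary, from other sources later.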

To add the linked data element, we connected the road records to their counterparts (partly by manual and partly by automatic processing) from:

The records for the 3rd and 4th functional classes could not be connected to anything, since there is no comprehensive and complete online resource for such data. During the creation of the Roads vocabulary we considered using the road geometry data published under INSPIRE, but they did not have sufficient granularity or quality.

Administrative Units vocabulary creation

The Administrative Units ontology and vocabulary were rather difficult to create because there are a number of different classification schemes, and because some historical administrative areas have been split into smaller areas, one of which retains the name of the previous larger one.

The NUTS coding scheme is problematic because Czechia does not use the NUTS1 level, i.e. levels 0 and 1 are the same. Level 2 consists of purely statistical areas, and level 3 (the last level) represents the topmost Czech administrative units, called Regions. This is quite different from the Spanish division, where NUTS4 represents Districts (in Czechia these are LAU1).
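The level structure described above can be made concrete with a small sketch. It relies on the general NUTS convention that a code is a two-letter country prefix followed by one character per level; the function names and the wording of the Czech interpretation table are assumptions made for illustration.

```python
def nuts_level(code: str) -> int:
    """Return the NUTS level (0-3) implied by the length of a NUTS code."""
    if not (2 <= len(code) <= 5) or not code[:2].isalpha():
        raise ValueError(f"not a NUTS code: {code!r}")
    return len(code) - 2


# Interpretation of each level for Czechia, following the text above.
CZECH_MEANING = {
    0: "country",
    1: "country (same territory as level 0)",
    2: "statistical area",
    3: "Region",
}


def describe(code: str) -> str:
    """Describe what a Czech NUTS code stands for."""
    return CZECH_MEANING[nuts_level(code)]


for code in ("CZ", "CZ0", "CZ01", "CZ010"):
    print(code, nuts_level(code), describe(code))
```

Because levels 0 and 1 cover the same territory, a vocabulary built on this scheme has to decide whether to model them as two concepts or to collapse them into one.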

Czech ADU vocabulary
source: TamTam Research

The City of Prague is another problem: it is a Region in itself, but it is not divided into Districts; instead it has ONE municipality divided into several City parts. In addition, Prague has no district at all (the level is skipped) in some classifications, while in others it does.
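One way to cope with a skipped level is to store each unit with a link to its parent and to treat the district as optional. The sketch below is only an illustration under that assumption: the `Unit` class, the level labels and all unit names other than Prague are hypothetical, and it shows the variant in which Prague has no district.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """An administrative unit linked to its parent unit."""
    name: str
    level: str  # "region", "district", "municipality" or "city_part"
    parent: Optional["Unit"] = None

    def path(self) -> list[str]:
        """Names from the topmost unit down to this one."""
        names = []
        node: Optional[Unit] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


def district_of(unit: Unit) -> Optional[Unit]:
    """Return the enclosing district, or None when the level is skipped."""
    node: Optional[Unit] = unit
    while node is not None:
        if node.level == "district":
            return node
        node = node.parent
    return None


# Regular case (placeholder names): region -> district -> municipality.
region = Unit("Region Y", "region")
district = Unit("District Z", "district", parent=region)
municipality = Unit("Municipality W", "municipality", parent=district)

# Prague: region -> one municipality -> city parts, with no district.
prague_region = Unit("Prague", "region")
prague_city = Unit("Prague", "municipality", parent=prague_region)
city_part = Unit("City part X", "city_part", parent=prague_city)

print(district_of(municipality).name)
print(district_of(city_part))
print(city_part.path())
```

Code that consumes the vocabulary then has to handle a missing district explicitly instead of assuming a fixed depth of the hierarchy.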

There are several official databases for administrative units. As a starting point we used the Czech ALERT-C location database, from which we extracted information about administrative levels, hierarchy, names and Location Code. The NUTS and LAU codes were extracted from online databases available from the Czech Statistical Office.

To add the linked data element, we connected the records to their counterparts (partly by manual and partly by automatic processing) from:

We processed the information into separate MS Excel tables, created XML templates and, using transformation code written in Python, converted the database into an RDF vocabulary that we imported into our SPARQL endpoint.

Czech Roads template
source: TamTam Research
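The conversion from tabular rows to RDF can be sketched as follows. This is not the project's transformation code: instead of XML templates it writes N-Triples directly, using only the Python standard library. The base URI `http://example.org/roads/`, the choice of SKOS and RDFS terms, and the sample row are assumptions made for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"
SKOS_NOTATION = "http://www.w3.org/2004/02/skos/core#notation"

# Hypothetical namespace; the real vocabulary URIs are not given in the text.
BASE = "http://example.org/roads/"


def literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def row_to_triples(row: dict) -> list[str]:
    """Turn one table row into N-Triples statements about one concept."""
    subject = f"<{BASE}{row['location_code']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{SKOS_CONCEPT}> .",
        f"{subject} <{RDFS_LABEL}> {literal(row['name'])} .",
        f"{subject} <{SKOS_NOTATION}> {literal(str(row['location_code']))} .",
    ]


# Placeholder row standing in for one line of an exported spreadsheet.
rows = [{"location_code": 101, "name": 'Road "X1"'}]
ntriples = "\n".join(t for row in rows for t in row_to_triples(row))
print(ntriples)
```

A file in this format can be loaded by most triple stores, so the output of such a script could be imported into a SPARQL endpoint without an intermediate XML step.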


The LOD-RoadTran18 project, which aims to support the re-use of dynamic Road Traffic Data in and across the Czech Republic and Spain, is co-financed by the European Health and Digital Executive Agency (HaDEA) (2018-EU-IA-0088) through the CEF Telecom programme.1


  1. The contents of this publication are the sole responsibility of TamTam Research and do not necessarily reflect the opinion of the European Union. 
