1. This paper is a part of the family of policy and guidance documents associated with the cross-Government Enterprise Architecture (xGEA). Further guidance will be created to cover related topics such as:
2. This document defines the design considerations and guidance by which UK public sector Uniform Resource Identifier (URI) sets should be developed and maintained. They are designed both to encourage those that definitively own reference data to make it available for re-use, and to give those with data that could be linked the confidence to re-use a URI set that is not under their direct control.
- Owners of reference data in the UK public sector
- Data owners who wish to improve the re-use of their data by incorporating URIs that they do not control
- Solution providers to the UK public sector
4. Some definitions and frameworks are laid out to define the types of resources that URIs can name, and the relationships between those types. A number of principles are then proposed against which a series of design considerations are made.
- Choosing the right domain for URI sets
- The path structure for URIs
- Coping with change and the passage of time
- How to ‘look up’ a URI
- The quality characteristics that apply to all URIs within a set
- Machine-readable and human-readable formats
- The governance arrangements necessary to allow the confidence to use and re-use UK public sector URIs
5. Typically, government departments and agencies keep a list for each type of ‘Thing’ that they are responsible for, or handle in some way, and associate an identifying reference to each entry on the list. They then make use of that identifier as they make statements about the ‘Thing’ in their data. The lists therefore contain the ‘Reference Data’ that provide a common meaning and common identifier to refer to the same ‘Thing’ within that department or agency.
6. URI sets provide an opportunity to share common meaning and common identifiers across the public sector, and with the public, to join up otherwise disparate data from many sources. Those that have confidence that the set is fit for their purpose are then likely to re-use it, rather than create their own.
7. Uniform Resource Identifiers (URIs), a component of the World Wide Web, provide one means of uniquely naming a ‘Thing’ (or ‘Resource’). The principles of ‘Linked Open Data’ rely on the RDF data model, where statements are made about resources identified by URIs.
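As a rough illustration of the point above, the sketch below models RDF-style statements as plain subject/predicate/object tuples and shows how two publishers that re-use the same URI can have their data joined. It does not use an RDF library, and every URI, property name and value in it is hypothetical, invented only for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str    # URI naming the 'Thing' the statement is about
    predicate: str  # URI naming the property being stated
    obj: str        # a literal value or another URI


# Hypothetical URIs, used only for illustration.
SCHOOL = "http://example.org/id/school/123"
NAME = "http://example.org/def/name"
PUPILS = "http://example.org/def/pupil-count"

# Two independent publishers describe the same school by re-using its URI.
register = [Statement(SCHOOL, NAME, "Example Primary School")]
census = [Statement(SCHOOL, PUPILS, "210")]


def merge(*sources):
    """Group statements from every source by the resource they describe."""
    merged = defaultdict(dict)
    for source in sources:
        for s in source:
            merged[s.subject][s.predicate] = s.obj
    return dict(merged)


linked = merge(register, census)
print(linked[SCHOOL])
```

Because both sources name the school with the same URI, the merge needs no name matching or manual reconciliation; that is the practical benefit of a shared identifier.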
8. URI sets will be an integral component of a UK Public Sector Information Architecture that supports many goals including the release of government data, reduced duplication, and increased information sharing towards transforming government services.
9. URI sets can be published by the UK public sector to provide comprehensive and reliable identifiers for ‘Things’ such as schools, roads, legislation, locations, projects, events and so on. Where the quality of these sets can be described consistently, other data owners will have the confidence to re-use them in their own data, leading to a web of data that can be linked, queried, and aggregated.
10. The existing UK public sector standards for metadata and ‘findability’ work well when applied to documents, but are not sufficient to support a ‘Web of Data’, where each individual statement can be queried and linked.
11. As at September 2009, there are only a handful of early adopters of URI sets in the UK public sector, such as
It is noticeable that, while each has faced similar design issues and choices, the implementations are quite different.
12. Much of the design in this paper is based on established and emerging good practice, whereas some implementation decisions are made to meet the specific needs of the UK public sector. In brief, these include:
- Use of data.gov.uk as the domain to root those URI sets that are promoted for re-use
- Organisation of URI sets into ‘sectors’ (e.g. education, transport, health) with a lead department or agency
- Consistent use of metadata to describe the quality characteristics of each URI set
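The three decisions listed above could be sketched in code roughly as follows. This is a minimal illustration, not the prescribed design: the path layout (sector as a subdomain of data.gov.uk, followed by `/id/`, a concept name and a reference), the `UriSet` class, its field names and the quality metadata keys are all assumptions made for the example, since this section does not define them.

```python
from dataclasses import dataclass, field
from urllib.parse import quote

ROOT_DOMAIN = "data.gov.uk"  # the domain proposed for re-usable URI sets


@dataclass
class UriSet:
    """A URI set owned by one sector (all field names are illustrative)."""
    sector: str     # e.g. "education", "transport", "health"
    concept: str    # the type of 'Thing' the set names, e.g. "school"
    lead_body: str  # the lead department or agency for the sector
    quality: dict = field(default_factory=dict)  # hypothetical quality metadata

    def uri_for(self, reference: str) -> str:
        """Build the URI for one entry, percent-encoding its reference."""
        safe_ref = quote(reference, safe="")
        # Assumed layout: http://{sector}.data.gov.uk/id/{concept}/{reference}
        return f"http://{self.sector}.{ROOT_DOMAIN}/id/{self.concept}/{safe_ref}"


schools = UriSet(
    sector="education",
    concept="school",
    lead_body="(lead department or agency)",
    quality={"coverage": "complete", "update-frequency": "annual"},
)
print(schools.uri_for("123456"))
```

Note that the URI is built from the sector and the concept, not from the name of the lead body, so a departmental reorganisation would change only the `lead_body` metadata and leave every published URI untouched.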
13. This document is the result of a series of workshops organised by the Chief Technology Officer (CTO) Council’s Information Domain during July and August 2009, and wider feedback to early drafts.
14. As URI sets are built and trialled, some scenarios may emerge that suggest that a rule may not be appropriate in some circumstances. Similarly, it may become apparent that there is value in considering that a piece of guidance should become a rule. It can therefore be expected that the interim design will be tested, challenged and proved by some early adopters, leading to a refresh.
15. Some scenarios, such as defining locations, may not fit well with the general principles of naming resources in this way. A further refresh of the design will illustrate how various scenarios are incorporated and aligned with sector-specific approaches to publishing reference data.
- Technical implementation guidance and bindings
- Worked examples
- Glossary of terms and definitions
- Links to the published material that was used to support this guidance
- Links to further material and good practice
Comments
How will “sectors” work if departments are reorganised? Will we see, say, “data.gov.uk/schools/” being redirected to “data.gov.uk/dcfs/”?
I assume from reading the paper that “Sectors” MUST remain static, so is the answer “they won’t change”? I assume that, if you reorganise from schools to dcfs, the sector won’t change and data.gov.uk/schools will not change, but your local “pointers” will now say that “dcfs” POINTS TO data.gov.uk/schools.
Please someone tell me if this is correct, or if I am completely lost here.
Thanks!
Yes, the intention is that the URI components describe broad categories for the sets of Things, not the political bodies that deal with them. For example, ‘schools’ are always Schools, regardless of this year’s name for the department that looks after them.
The trick will be to ensure that the selected names really are static over the long term. There needs to be careful thought about the future, and the past. Once upon a time, you might have had “polytechnics” in there, for example, which is a redundant term now. So the components need to be extremely generic.
“Where the quality of these sets can be described consistently” – I think therein lies the key, and also the biggest stumbling block.
Some concepts probably will remain consistent for a long, long time, such as a school, which is a “thing”, a place to obtain education. But other concepts might not be so easy: there could be more than one accepted term for an important, unique thing. And yes, you can use the “pick one and stick with it” method, but invariably you will pick the term that then goes out of favour in five years’ time.
Does one also assume that the entire concept of “level” just disappears, that all URI terms are created equal? That doesn’t bother me. I’ve seen the dissolution of the hierarchical approach on the way for years, and I welcome it rather than fear it (in anticipation, I am building my own hierarchies as shallow as possible until URIs rule the world anyway). But what happens when one URI represents what is generally recognised, by some significant criteria, to always be a “subset” of another URI? Do they exist on an equal plane, even though in their previous existence they were on different levels?
What if we are seeking one term but don’t know it, and we approach via what would normally be a subset term, which leads us to the real term? Shouldn’t it be clear that this subset is at a lower “level”, or somehow designated as NOT the primary term? I guess I am asking: how do we handle synonyms, related terms, and what used to be “lower level” or “navigational” terms?
I’d like to get a clearer picture of this, and I wonder if indeed this is a totally flat (?), or very shallow (?) structure?
It seems that it should be, but then how do all the relationships I am talking about here get “described”?
I hope this makes sense, it’s difficult to articulate.
Thank you,
Dave
Although a matter of implementation rather than definition, the devolved nature of government together with government commitments to equality of languages needs to be referenced here, if not elsewhere in the document. Not having a clear recognition, appreciation or methodology to handle these issues might see the work become geographically ghettoised, in contradiction of the UK-wide focus of the intentions.
Bill,
I am not quite sure I understand your reply. I feel that Councils’ commitment to using common languages (such as those created via esd-toolkit) is less than what it should be, and I feel that most Councils haven’t even caught up with those languages yet. They don’t really understand them or use them, although they may be STARTING to.
So how can Councils make the leap from not even having implemented or understood such basic lists as the esd-toolkit published Scottish Services List (SSL) to the even more abstract concept of the URI? I don’t see that they are “ready”.
And while someone like our good Peter Winstanley or your good self can “explain” these topics to SOME people who work for Councils, I’ve worked with hierarchies, categorisation schemes, and standardised languages for over a decade, and even I am struggling to “see” how the URI concept “works”. More so, at a practical level, I am not sure that Councils even understand the concept that they have a “mapping” to “Things”: there is this Thing called “School”, and the Council has a department which is a “pointer” to that Thing.
Instead, the perception quickly becomes, that the department IS the Thing, or it represents it directly, and the reality of the abstract connection (e.g. the concept that the POINTER can and will change, based on political flavour of the moment, but the THING remains static) becomes lost.
So for us, it used to be called “Children’s Services”. That meant – “Schools”. But…it also meant “Psychological Services”. Sometimes. OR maybe “Social Care For Children”. Sometimes. It meant a LOT of things, but, very quickly, it is assumed that if you said “Children’s Services” you were probably talking about one or two primary “Things”. Schools mainly. But it was never…clear.
So I don’t think there is ANY awareness, at the purely abstract level that the URI concept needs to operate at to succeed, of the fact that Councils have multiple, ever-changing names for departments that “point” at “Things” that are UNCHANGING.
How to get this across? I am not even happy with this post, I’ve explained it REALLY badly. And if I can’t explain it, how will you be able to?
There is a supposed intention to honour the concept of common languages. But the reality is that very few have implemented them; some are implementing; many are probably still at square one, saying “What is IPSV? What is SSL? What is SNL? What is LRCRS? And what on EARTH am I supposed to DO with them?” Imagine how folk at that level would react to the language used in the conversations we are having about URIs.
How can we get people to see the truly “floating” terms, the unchanging, real Things, as a set of real, static objects, and understand that their department is a CONNECTION, a POINTER to a Thing or Things?
I really think 99% of people don’t “get it” or ever think about it. Which means…you have more than an uphill battle both selling the concept of, and populating the terms of, the URI.
But I hope you win this battle. I would like to help, but I also have a sick feeling in my stomach that this technology is misunderstood, and it’s arrived about five or ten years TOO EARLY for Councils to fully embrace.
Please prove me wrong.
Thanks – and if you could clarify your post please Bill, I really did not understand it very well.
Cheers, Dave
It may help here to include a real example for context.
Hi Steve
Do you mean a real example of a list of “Things”? Point 10?
I mean, to my mind, that could be anyTHING. It might be a concept that “everyone knows”. For example, here at my Council, if we see ES, that means, to us, Environmental Services. That’s a thing that was defined, and everybody used the same name, Environmental Services, to describe that arm of the service. However, that service has now been split into many services, so that definition is obsolete.
So now we have to get used to some new concepts, new service arms with new names. In a few years’ time, we will all “agree” certain terms mean certain “things”.
The mindset of the URI group is to find ways to name things so they can WITHSTAND a change like I describe.
The challenge is, how to name “things” so you don’t ever need to CHANGE their name.
So – at a lower level, if we could all agree to call “Housing” “Housing” – and not some other term, and we could agree that that is a “THING” that WILL NOT CHANGE, it’s permanent, people will always need housing, and housing of some kind will always exist. So that MIGHT be an example of a truly static “THING” – maybe.
However, what doesn’t help is when you do one of two things: 1) keep changing the name just to put your own stamp or ego onto it (aka a change of a political nature), or 2) keep referring to a single thing by multiple nicknames instead of by the standard “THING” name.
Or am I completely out of line here?
CAN someone post a “real example”?
It’s all about finding a single, repeatable, unique name that will not need to be changed every few years – it can be kept static, PERMANENTLY.
That’s what I think we are talking about. We need to define what those are. “Housing” is the one example I can readily think of, it seems quite unique, but other “THINGS” may not be so easily made unique…
Sigh.
Dave