| Size: 21592 Comment:  | Size: 42191 Comment: Final edit?!? after Alun's comments on the summary | 
| Line 1: | Line 1: | 
| This page contains notes building towards a formal document regarding the role of Shibboleth with grids.  It necessarily challenges some basic assumptions of the way that authentication and authorisation are currently managed in grids. This work forms the bulk of the eSP-grid workpackage five (Shibboleth Evaluation). | ## page was renamed from ShibEvaluation This work forms the output of the ESP-GRID workpackages five and six (PKI and Shibboleth Evaluations). ---- = Summary = This document examines the theoretical applicability of Shibboleth and of client-certificate based public key infrastructure (PKI) to use within a grid environment. The analysis is addressed through examinations of identity management (where and by whom?), attribute management, levels of assurance and trust. Shibboleth is highly compatible with the current 'Information Environment' standards and procedures of identity management and role based access control. PKI has been seen as more secure, with its authentication assertions having a higher level of assurance. However, there are flaws in the implementations of PKI-based identity management and the solutions to these flaws could be found in either identity being managed completely separately to authorisation attributes or in identity being managed by local institutions. In current PKI implementations, the conflation of identity management and authorisation management is a difficult problem to solve. Shibboleth provides some answers to the separation of authentication and authorisation and also enables identities to be managed very 'close to home', within the organisations to which the individual belongs. PKI implementations suffer from the problem of 'trustworthy' individuals (for example, Registration Authorities) being trusted to do more than they are able. This security problem is therefore somewhat hidden: it is better to be explicit and honest about trust. 
When the situation regarding trust is analysed, there is not a great difference between well-run organisation-based identity management and highly controlled PKI identity management via competent Registration Authorities. However, due to the usual lack of active revocation procedures with centralised PKI, it is likely that, on balance, a well-run locally managed system is more secure. There are several reasons why scalability is an issue with centralised PKI, not least that of usability for less technical grid users. Therefore - as Shibboleth is more naturally compatible with locally managed identities - it is very likely that Shibboleth will have to be considered a strong candidate in the drive to expand the use of grids to greater numbers of users. | 
| Line 8: | Line 15: | 
| [#mustscale Grids must scale] ; [#identitymanscalability Identity management is a scalability bottleneck] ; [#identtrustorg Identity is best managed by a very trustworthy organisation] ; [#attributemanscale Attribute management is a scalability bottleneck] ; [#trustminimum Trust must be kept to a minimum on grids] ; [#securityieinadeq Security levels in the 'information environment' are inadequate] | [#mustscale Grids must scale] ; [#identitymanscalability Identity management is a scalability bottleneck] ; [#identtrustorg Identity is best managed by a very trustworthy organisation] ; [#attributemanscale Attribute management is a scalability bottleneck] ; [#LoA Grids need good levels of assurance] ; [#trustminimum Trust must be kept to a minimum on grids] ; [#securityieinadeq Security levels in the 'information environment' are inadequate] | 
| Line 10: | Line 17: | 
| [#PKImustscale Grids must scale] ; [#PKIidentitymanscalability Identity management is a scalability bottleneck] ; [#PKIidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#PKIattributemanscale Attribute management is a scalability bottleneck] ; [#PKItrustminimum Trust must be kept to a minimum on grids] ; [#PKIsecurityieinadeq Security levels in the 'information environment' are inadequate] | [#PKImustscale Grids must scale] ; [#PKIidentitymanscalability Identity management is a scalability bottleneck] ; [#PKIidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#PKIattributemanscale Attribute management is a scalability bottleneck] ; [#PKILoA Grids need good levels of assurance] ; [#PKItrustminimum Trust must be kept to a minimum on grids] ; [#PKIsecurityieinadeq Security levels in the 'information environment' are inadequate] | 
| Line 12: | Line 19: | 
| [#SHIBmustscale Grids must scale] ; [#SHIBidentitymanscalability Identity management is a scalability bottleneck] ; [#SHIBidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#SHIBattributemanscale Attribute management is a scalability bottleneck] ; [#SHIBtrustminimum Trust must be kept to a minimum on grids] ; [#SHIBsecurityieinadeq Security levels in the 'information environment' are inadequate] | [#SHIBmustscale Grids must scale] ; [#SHIBidentitymanscalability Identity management is a scalability bottleneck] ; [#SHIBidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#SHIBattributemanscale Attribute management is a scalability bottleneck] ; [#SHIBLoA Grids need good levels of assurance] ; [#SHIBtrustminimum Trust must be kept to a minimum on grids] ; [#SHIBsecurityieinadeq Security levels in the 'information environment' are inadequate] | 
| Line 14: | Line 21: | 
| [#ConcsDevolvedAuthn Devolved authentication and ID management] ; [#ConcsAuthZAttribMgmt Authorisation and attribute management] ; [#ConcsTrustworthiness Trustworthiness and security] ; [#ConcsScalability Scalability]  ; [#ConcsUsabilityCerts The usability of client certificates] ; [#ConcsSummary Summary] | |
| Line 17: | Line 26: | 
| Following this introduction, this document is arranged into three major sections. The first addresses the [#contempassumpts Contemporary Assumptions] of grid security and other aspects of access management. Most of the assumptions portrayed are based on sound security principles, but some are possibly a little misplaced. Following this assertion of the current basic principles, we consider (briefly) [#PKIvsassumpts How PKI lives up to these assumptions], considering each assumption in turn. This is followed by a similar treatment regarding [#SHIBrole How Shibboleth could play a role]. This is followed by the general [#conclusions Conclusions]. | Following this introduction, this document is arranged into four major sections. The first addresses the ''[#contempassumpts Contemporary Assumptions]'' of grid security and other aspects of access management. Most of the assumptions portrayed are based on sound security principles, but some are possibly a little misplaced. Following this assertion of the current basic principles, we consider (briefly) ''[#PKIvsassumpts How PKI lives up to these assumptions]'', considering each assumption in turn. This is followed by a similar treatment regarding ''[#SHIBrole How Shibboleth could play a role]''. This is followed by the general ''[#conclusions Conclusions]''. | 
| Line 24: | Line 33: | 
| "The Grid" or "grids" are currently viewed by many as to be at the equivalent stage of conceptual development as was the world wide web and information environment intranets in the late 1980s. There is a widespread assumption that grid use will grow enormously as more people (and other end entities) find a use for high powered and distributed (computing) resources. It is also clear that access management is a far greater issue than for the web, as much more than 'read only' access is required. We have to assume, therefore, that secure access management is a current limiting factor for the ability of the technologies to scale to serve large numbers of users. As an extension to this assumption, resource owners of computing power and expensive instrumentation are far more likely to open up their resources to a grid if they are confident that their resources are secure from harm and from the use of unauthorised others outside their (grid) community. | "The Grid" or "grids" are currently viewed by many as being at a stage of conceptual development equivalent to that of the world wide web and [:Information_Environment:information environment] intranets in the late 1980s. There is a widespread assumption that grid use will grow enormously as more people (and other end entities) find a use for high powered and distributed (computing) resources. It is also clear that access management is a far greater issue than for the web, as much more than 'read-only' access is required. We have to assume, therefore, that secure access management is a current limiting factor for the ability of the technologies to scale to serve large numbers of users. As an extension to this assumption, resource owners of computing power and expensive instrumentation are far more likely to open up their resources to a grid if they are confident that their resources are secure from harm and from use by unauthorised others outside their (grid) community. | 
| Line 28: | Line 37: | 
| Unlike a resource that is meant to be accessed 'read-only', a grid needs to identify its users. The management of those identities is an onerous task and one that needs to be executed via policies which all owners of grid resources can trust. As the numbers of users (or end entities) increases, this becomes an even more difficult task. | Unlike a resource that is meant to be accessed read-only, a grid needs to identify its users. The management of those identities is an onerous task and one that needs to be executed via policies which all owners of grid resources can trust. As the numbers of users (or end entities) increases, this becomes an even more difficult task. | 
| Line 32: | Line 41: | 
| The concept of authentication is often (erroneously) associated with the separate elements of identity establishment and subsequent on-line authentication. Authentication is the act of verifying that an electronic identity (username, distinguished name etc.) is being employed by the entity, person or process to whom it was issued. Therefore, this relies upon the fact that the electronic identity was issued accurately in the first place. Thus, this early establishment of identity and the subsequent use of the identity needs to be managed by a trustworthy organisation. | The concept of authentication is often (somewhat erroneously) associated with the separate elements of identity establishment and subsequent on-line identity assertion. Authentication is the act of verifying that an electronic identity (username, distinguished name etc.) is being employed by the entity, person or process to whom/which it was issued. This relies upon the fact that the electronic identity was issued accurately in the first place. Thus, this early establishment of identity and the subsequent use of the identity need to be managed by a trustworthy organisation. | 
| Line 38: | Line 47: | 
| [[Anchor(LoA)]] == Grids need good levels of assurance == When a request is made and supported by an assertion that 'This is user 1234' or 'This user has been authenticated and I am happy that they are who they appear to be', there is a natural ''confidence level'' associated with the assertion. That level of confidence may be informed by the fact that ''we know'' that Organisation A doesn't issue its accounts (e.g. usernames/passwords, digital certificates etc.) in a secure way or that the Organisation has never been known to revoke a user, even though people obviously leave and are sacked etc. This would lead me to be less certain that the user is who he appears to be. Conversely, I may know that Organisation B has excellent procedures and that ''this'' user has a digital certificate, so I am very confident that she is who she says she is. This confidence comes from the nature of the electronic credential, and how it is used, as well as the procedures (security policies) of the issuing organisation (the Identity Provider). Thus, when an assertion is made, either by something like Security Assertion Markup Language (SAML) or the presenting of a digital certificate, a ''level of assurance'' (LoA) can either be inferred or asserted as well. Different !LoAs are commonly expressed as 'Minimum Level', 'Basic Level', 'Medium Level' or 'High Level'. These different expressions have become associated with different encryption standards, security practices and, especially, how the original identity of the user was established. The current UK e-Science Grid uses client digital certificates (PKI - Public Key Infrastructure) and is considered to be at 'Medium Level Assurance'. Many currently believe that this is the minimum LoA that could be applied for grid use. We argue that this is too simplistic and short-sighted (see the discussion on ''[#PKILoA PKI Levels of Assurance]'' below). | |
| Line 40: | Line 57: | 
| This is always true as a general principle. Nevertheless, people often do not consider the related question of the difficulty of the task that they are choosing to trust another entity to carry out. For example, it may be better to trust a total of three entities to carry out a task (if it can be divided and where each sub-task is appropriately handled by each entity) than to trust one entity to carry out that same task (if the task is too difficult for that one entity). | This is always true as a general principle.  Nevertheless, people often do not consider the related question of the difficulty of the task that they are choosing to trust another entity to carry out.  For example, it may be better to trust a total of three entities to carry out a task (if it can be divided and where each sub-task is appropriately handled by each entity) than to trust one entity to carry out that same task (if the task is too difficult for that one entity).  This is a potentially complex concept, which is expanded upon below (see ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'' for an argument about Registration Authorities (RAs) having responsibility for checking data that they cannot easily access).  However, to illustrate this point, here is an abstract scenario: ''Consider a hotel that may have a special service where a Greeter meets guests and takes them directly to their rooms. This is the Greeter's primary or only job: assigning guests to rooms. The Greeter does not have time to look at the booking database and so receives instructions from Reception as to which room to take new arrivals.'' {{{ Task: Assigning guests to rooms, accurately and politely Responsibility: Greeter}}} ''Now imagine that the hotel owner hears that some guests have been assigned to the wrong rooms. An audit or study of the Greeter at work finds that he is 100% accurate in assigning the guests. 
Should the Greeter get the sack because he is responsible for this task? No, the audit should have looked at the whole operation and determined that the accuracy of Reception is also critical in this process. The Greeter cannot take responsibility for the whole task as he is not able to complete the whole task. The task should be expressed as:'' {{{ Task: Assigning guests to rooms, accurately and politely Responsibility: Reception (Assigning Rooms), Greeter (Delivering guests to rooms)}}} ''And if there are any problems, both the work of Reception and the Greeter should be investigated.'' When expressing the security of a system, because the architects know that trust must be kept to a minimum, it is tempting to conflate roles or tasks to a single entity (the Greeter, in our example). This can be highly misleading. If one entity (e.g. the Greeter) must trust another entity (e.g. Reception), then this should be highlighted explicitly. | 
| Line 44: | Line 75: | 
| By 'information environment' we mean the environment that is managed for most of the users who join a local network and access many (often web based) resources.  It is in contrast to a 'grid' environment. This assumption has been included so that it can be explored further, below. Many consider that large grids cannot trust the identity management and authentication credentials issued from users' home organisations where levels of security may reflect the historic situation where users play more passive roles. In short, many grid users believe that universities, businesses and government agencies - to name a few examples - cannot be trusted to manage identities and user attributes that are used on grids. | By [:Information_Environment:'information environment'] we mean the environment that is managed for most of the users who join a local network and access many (often web-based) resources.  It is in contrast to a 'grid' environment where users wish to execute/run a process or operate something remotely via grid middleware. This assumption has been included so that it can be explored further, below. Many consider that large grids cannot trust the identity management and authentication credentials issued from users' home organisations where levels of security may reflect the historic situation in which users play more passive roles. In short, many grid users believe that universities, businesses and government agencies - to name a few examples - cannot be trusted to manage identities and user attributes that are used on grids. | 
| Line 53: | Line 84: | 
| == Grids must scale == The eSP-grid project remains agnostic on this issue. Some experienced commentators believe that the scalability of 'the grid' is being limited by the difficult usability problems of client-based PKI. In the background to this project is the DCOCE project (http://www.dcoce.ox.ac.uk) which found that the usability problems tend to come from many small issues, most of which would be trivial to fix. Nevertheless, the usability problems are many and this has great negative effects on the user experience. | == (PKI) Grids must scale == The ESP-GRID project remains agnostic on this issue. Some experienced commentators believe that the scalability of 'the grid' is being limited by the difficult usability problems of client-based PKI (Public Key Infrastructure)[[FootNote(See the [http://www.usabilitea.org/ Usabilitea] initiative and, for example, Beckles, B., Brostoff, S., and Ballard, B. A first attempt: initial steps toward determining scientific users’ requirements and appropriate security paradigms for computational grids (2004). Proceedings of the Workshop on Requirements Capture for Collaboration in e-Science, Edinburgh, 14-15 January 2004, 17-43.)]]. In the background to this project is the DCOCE project (http://www.dcoce.ox.ac.uk) which found that the usability problems tend to come from many small issues, most of which would be trivial to fix. Nevertheless, the usability problems are many and, with the passing years, are ''not'' being fixed; this has a great negative effect on the user experience. | 
| Line 57: | Line 89: | 
| == Identity management is a scalability bottleneck == PKI-based grids address the requirement of good identity management by authorizing nominated, trustworthy Registration Authorities (RAs) to check the identity of the certificate applicant. This system requires the RAs to work to strict security policies, which is good. However, as the RAs usually have to attend a training course in order to be deemed to be trustworthy, there tends to be fewer of them than there should be, ideally. This therefore becomes a scalability bottleneck. Many users have to physically travel large distances to visit their nearest RA. This is likely to deter some ligitimate users. Furthermore, even though the main purpose of the RA is for authentication and identity management, there is some implicit authorisation taking place during the transaction. This is greatly undesirable, but most PKI implementations mandate it. For example, consider a regional RA performing duties for users at 3 educational establishments, 2 businesses and one government office. The RA may require users to bring proof of identity, such as a passport, and/or a university/security card/pass with their name (and/or photograph) on it. The user will subsequently be issued a deigital certificate where their 'organisational unit' (OU) is (for example) University A. To perform the role of RA properly, the RA should, therefore, check the lists of whoever leaves the 6 organisational units for which s/he performs RA duties. This is an onerous task that is probably never properly fulfilled. Failing this, the Certification Authority (CA) should take on the task. However, the CA is at an even greater distance - organisationally - from the end units than is the RA, and therefore this is a near impossible task. We conclude that identity management via PKI should work effectively, but only if the RA is an integral part of the home organisation (or at teh absolute ideal, the OU) of the end user. 
Further, the RA should therefore be in prime control over the revocation process. As neither of these are usually true, the PKI is fundamentally flawed, or at best very difficult to scale (as large numbers of RAs are needed). | == (PKI) Identity management is a scalability bottleneck == PKI-based grids address the requirement of good identity management by authorising nominated, trustworthy RAs to check the identity of the certificate applicant. This system requires the RAs to work to strict security policies, which is good. However, as the RAs usually have to attend a training course in order to be deemed to be trustworthy, there tends to be too few of them. This therefore becomes a scalability bottleneck. Many users have to physically travel large distances to visit their nearest RA. This is likely to deter some legitimate users. Furthermore, even though the main purpose of the RA is for authentication and identity management, there is some implicit authorisation taking place during the transaction. This is greatly undesirable, but most PKI implementations mandate it. For example, consider a regional RA performing duties for users at 3 educational establishments, 2 businesses and one government office. The RA may require users to bring proof of identity, such as a passport, and/or a university/security card/pass with their name (and photograph) on it. The RA is therefore trusting that the university card or security pass was issued correctly in the first instance. The user will subsequently be issued a digital certificate where their 'Organisation' (O) or 'Organisational Unit' (OU) is (for example) University A. To perform the role of RA properly, the RA should, therefore, check the lists of whoever departs from the 6 "O"s for which he performs RA duties. This is an onerous task that is probably never properly fulfilled. Failing this, the Certification Authority (CA) should take on the task. 
However, the CA is at an even greater distance - organisationally - from the end units than is the RA, and therefore this is a near impossible task. We conclude that identity management via PKI ''should'' work effectively, but only if the RA is an integral part of the home organisation (or at the absolute ideal, the group or department) of the end user and has access to the main enterprise directory of the users/members. Further, the RA should therefore be in prime control over the revocation process. As neither of these is usually true, the PKI is fundamentally flawed, or at best very difficult to scale (as large numbers of RAs are needed). [[Anchor(assocRevo)]] For the sake of brevity, the above two problems - that of the RA having to trust another 'identity provider' and that of no appropriate person(s) managing the revocation process - are hereby summarised as the '''association-revocation gap'''. | 
| Line 65: | Line 100: | 
| === Identity is best managed by a very trustworthy organisation === As noted above, identity is easier to manage than attributes such as role and status. Again, as noted above, it is futile that a very trustworthy body or person is trusted to carry out a task that is nearly impossible for it to achieve to an adequate level of service in practice. As identity in PKI is almost always closely coupled with OU status, then most RAs are unable to manage this task, despite being well trained and trustworthy. RAs ''have'' to trust the personnel or registration departments within the OUs in order for them to carry out this task. This extra 'leaf' to the chain of trust is rarely, if ever, acknowledged. If identity could be truly un-coupled from changeable attributes, such as OU and other status information, then PKI may be more reliable, but this seems to be difficult to implement. | === (PKI) Identity is best managed by a very trustworthy organisation === As noted above, identity is easier to manage than attributes such as role and status. Again, as noted above, it is futile to trust a very trustworthy body or person to carry out a task that is, in practice, nearly impossible for them to achieve to an adequate level of service. As identity in PKI is almost always tied to the membership of the O or OU, most RAs are unable to manage this task (again, due to the 'association-revocation gap'), despite being well trained and trustworthy. RAs ''have'' to trust the personnel or registration departments within the OUs in order to carry out this task. This extra 'leaf' (or link) in the chain of trust is rarely, if ever, acknowledged. (This concept is very important, but difficult to express concisely. 
For more information, see Mark Norman's presentation ''[http://www.nesc.ac.uk/action/esi/contribution.cfm?Title=622 The case for devolved authentication: over-centralised security doesn't work]'' (PowerPoint format), given at the National e-Science Centre in Edinburgh, 20 October 2005. The same presentation is to be found here: ''[http://users.ox.ac.uk/~markn/Presentations/JiscNeSCMiddwareBriefingOct05.pdf Devolved authentication (lower file size and in PDF)]''.) If identity could be truly un-coupled from changeable attributes, such as O, OU and other status information, then PKI might be more reliable, but there seems to be little will in the community to implement it in this way. | 
| Line 71: | Line 106: | 
| == Attribute management is a scalability bottleneck == Attribute management is not usually performed using PKI (although there are some possibilities using attribute certificates, or even attribute fields in the certificates themselves: the use of which should generally be greatly discouraged!). In the two sections immediately above, we argue that RAs are in a very poor place to manage attributes that will be used for authorisation decisions. Therefore this may be considered as either a flaw of PKI or a necessary absence. | == (PKI) Attribute management is a scalability bottleneck == Attribute management is not usually performed using PKI (although there are some possibilities using attribute certificates, or even attribute fields in the certificates themselves: the use of which is often inappropriate with long-term certificates). In the two sections immediately above, we argue that RAs are in a very poor place to manage attributes that will be used for authorisation decisions. Therefore this may be considered as either a flaw of PKI or a necessary absence. [[Anchor(PKILoA)]] == (PKI) Grids need good levels of assurance == In future we anticipate that there will be much grid use that is operated in a Customer-!ServiceProvider (like) manner and basic level assurance may be acceptable for these uses. Nevertheless, there will be much activity that requires medium or high level assurance, as users are able to - at least in part - control the behaviour of machines and to exclude others, temporarily, from using those machines. Thus medium level assurance will still be a requirement for some technical users, but a mixed economy of users will probably exist where basic level will be adequate for the majority of users. A confusing side issue is the high degree of trust in the electronic nature of the credential (e.g. digital certificate). 
This can mask the importance of the security policies that were used to issue the credential: so much so, that many people will forget this important, 'bureaucratic' consideration altogether. Within the 'access management community' in the UK, there is certainly an emphasis at present on trusting the electronic credential more than the issuing policy and provenance of that credential. Thus, a high degree of trust is given to a user presenting a digital certificate, but a lower degree of trust is given to a user who is known to have authenticated via username and password with HTTPS. The electronic credential is relevant, but may be secondary in importance to the policy by which the credential is issued. For example, some CAs will issue digital certificates to people who need only a valid email address[[FootNote(Some certificate issuers will place a value in a field within the certificate or have the certificate signed by a particular root certificate that will indicate (in some way) 'Minimum Level Assurance'. Nevertheless people and machines often place too much trust in such client digital certificates.)]], whereas other institutions have very careful procedures for handing out their usernames and passwords. In this case the latter is more trustworthy and should be given a higher level of assurance (LoA). We also argue that the [#assocRevo 'association-revocation gap'] in current practices of issuing digital certificates (see ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'', above) means that the LoA of many 'information environment' identity providers (!IdPs) is as good as, if not better than, that of the Grid CA. This is mostly due to the lack of active revocation by the CA or RA. | 
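The argument above, that the issuing policy can matter more than the type of electronic credential, can be illustrated with a minimal sketch. The class names, the four-level scale as an ordered enumeration, and the rule that an issuer without active revocation is capped at basic level are all assumptions made for illustration; they are not taken from any grid or Shibboleth specification.

```python
from dataclasses import dataclass
from enum import IntEnum


class LoA(IntEnum):
    """Hypothetical ordered scale matching the four levels named in the text."""
    MINIMUM = 0
    BASIC = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class IssuerPolicy:
    identity_vetting: LoA    # how carefully the identity was first established
    active_revocation: bool  # does the issuer revoke accounts when people leave?


@dataclass(frozen=True)
class Assertion:
    credential_strength: LoA  # strength of the electronic credential itself
    issuer: IssuerPolicy


def effective_loa(assertion: Assertion) -> LoA:
    """An assertion is only as strong as its weakest element (assumed rule)."""
    level = min(assertion.credential_strength, assertion.issuer.identity_vetting)
    if not assertion.issuer.active_revocation:
        # Assumed cap: without active revocation, never exceed basic level.
        level = min(level, LoA.BASIC)
    return LoA(level)


# A certificate issued against nothing more than a valid email address.
email_only_cert = Assertion(LoA.HIGH, IssuerPolicy(LoA.MINIMUM, False))
# A username/password issued by an institution with careful procedures.
careful_password = Assertion(LoA.MEDIUM, IssuerPolicy(LoA.HIGH, True))
```

Under these assumed rules the carefully issued password receives a higher effective level than the email-only certificate, which is the ordering the text argues for.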
| Line 75: | Line 119: | 
| == Trust must be kept to a minimum on grids == As noted above, this general principle is true. Due to the difficulties of handling authorisation-triggering attributes via PKI, authorisation is almost synonymous with authentication at most grid resources. i.e. authorisation decisions are usuallly based upon knowledge of identity. For example, a resource considers that these 200 distinguished-name-holders are able to use the resource. This level of sophistication with respect to authorisation is unlikely to be satisfactory in the future where many more grid users exist and where the membership of virtual organisations change dynamically. | == (PKI) Trust must be kept to a minimum on grids == As noted above, this general principle is true. Due to the difficulties of handling authorisation-triggering attributes via PKI, authorisation is almost synonymous with authentication at many grid resources; i.e. authorisation decisions are usually based upon knowledge of identity. For example, a resource owner may consider that only 'these 200 distinguished-name-holders are able to use this resource'. This level of sophistication with respect to authorisation is unlikely to be satisfactory in the future where many more grid users exist and where the memberships of virtual organisations (VOs) change dynamically. | 
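The difference between identity-based and attribute-based authorisation described above can be sketched as follows. This is an illustration only: the distinguished names, the attribute strings and both function names are hypothetical, and no real grid middleware API is being shown.

```python
# Identity-based authorisation: the resource keeps its own list of permitted
# distinguished names (DNs). The DNs below are invented for this example.
ALLOWED_DNS = {
    "/C=UK/O=eScience/OU=University A/CN=alice example",
    "/C=UK/O=eScience/OU=University A/CN=bob example",
}


def authorise_by_identity(dn: str) -> bool:
    """Permit access only if this exact DN is on the resource's list."""
    return dn in ALLOWED_DNS


def authorise_by_attributes(asserted: set[str], required: set[str]) -> bool:
    """Permit access if the asserted attributes include every required one.

    The resource no longer needs to know who the user is, only that a trusted
    party has asserted the attributes (e.g. current membership of a VO).
    """
    return required <= asserted


# Hypothetical attribute values for a virtual organisation (VO) member.
REQUIRED = {"vo:climate-model", "role:researcher"}
member = {"vo:climate-model", "role:researcher", "org:university-a"}
former_member = {"org:university-a"}
```

With the DN list, every change in VO membership requires each resource owner to edit a list; with attributes, membership changes are made once, by whoever manages the attribute, which is why the latter copes better with dynamically changing VOs.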
| Line 80: | Line 124: | 
| The main area where PKI falls short in this concept is regarding the [#assocRevo 'association-revocation gap'] (see ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'', above).  It is desirable (for some) to depict the PKI as having a chain of trust that ends with the RA.  However, this is misleading and untrue: the chain of trust ends with the person who issued the identity credential upon which the RA depends.  (The section ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'', above, gives more details of this argument.) | |
| Line 81: | Line 128: | 
| == Security levels in the 'information environment' are inadequate == In the above sections, we highlight the flaws in the current arrangements with respect to PKI implementations. Because it is a fact that PKI ''can'' support very high levels of security, there is a danger that people perceive that the technology=security, rather than the ''careful implementation of the technology and following strict policies''=''good security''. It is our assertion that PKI would work well if RAs were embedded deeply in the organisational units (OUs). This is mostly due to security policies and knowledge of the status of individual users. | == (PKI) Security levels in the 'information environment' are inadequate == In the above sections, we highlight the flaws in the current arrangements with respect to PKI implementations. Because it is a fact that PKI ''can'' support very high levels of security, there is a danger that people perceive that the ''technology=security'', rather than the ''careful implementation of the technology and following strict policies''=''good security''. It is our assertion that PKI would work well if RAs were embedded deeply in the organisational units (OUs). This is mostly due to security policies and knowledge of the status of individual users. | 
| Line 90: | Line 137: | 
| == Grids must scale == Having considered that identity management performed centrally may be a threat to the scalability of grids, Shibboleth as a system of supporting ''devolved authentication'' can be seen to be a solution. The need for devolved authentication is highlighted by the arguments within the [#PKIidentitymanscalability Identity management is a scalability bottleneck] (PKI) section above. Some assume that PKI avoids the difficulties of devolved authentication by using RAs and long term, secure digital authentication credentials. This is clearly false as RAs rely on third parties to vouch for the identities and statuses of the applicants that come before them. This is (conveniently) not generally recognised. However, who issued that user's university card? It was a trusted devolved third party. | == (Shib) Grids must scale == Having considered that identity management performed centrally may be a threat to the scalability of grids, Shibboleth as a system of supporting ''devolved authentication'' can be seen to be a solution. The need for devolved authentication is highlighted by the arguments within the ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'' section above. Some assume that PKI avoids the difficulties of devolved authentication by using RAs and long term, secure digital authentication credentials. This is clearly false as RAs rely on third parties to vouch for the identities and statuses of the applicants that come before them. This is the "association" part within the [#assocRevo 'association-revocation gap'] concept that was mentioned earlier. This is (conveniently) not generally recognised. However, who issued that user's university card? It was a trusted devolved third party. | 
| Line 94: | Line 141: | 
| == Identity management is a scalability bottleneck == | == (Shib) Identity management is a scalability bottleneck == | 
| Line 98: | Line 145: | 
| === Identity is best managed by a very trustworthy organisation === If identity can be separated completely from status, roles and other attributes - a separation in which Shibboleth excels - then identity becomes of lesser importance. Identity can be established once, using strict security policies and does not need active management. | === (Shib) Identity is best managed by a very trustworthy organisation === If identity can be separated completely from status, roles and other attributes - a separation in which Shibboleth excels - then identity becomes of lesser importance. Identity can be established once, using strict security policies and needs little active management. | 
| Line 102: | Line 149: | 
| == Attribute management is a scalability bottleneck == In a grid that relied upon Shibboleth, this would be true. Active attribute management would become the most onerous part of the security matrix. This is as it should be. Organisationally and procedurally, this is the most complex responsibility. Therefore, identifying this as the true bottleneck should be as expected. Using attribute authorities (AAs) and Shibboleth/SAML transport, attributes can be managed so that they can change frequently and in near-real time. Most advantageously, they can be managed by the individuals who are truly in a position to judge or control their status. One area where the Shibboleth architecture may fall a little short is in supporting virtual organisations (VOs). Currently, there is space in the main architecture for only one attribute authority (AA) to be associated with an identity provider (IdP). For VOs to be managed easily, a users main/home AA must be able to chain or devolve to third party AAs. An alternative model would be for the resource to initiate a new query to the secondary or VO AA (let's call this AA`2`). These two possible mechanisms are illustrated below: === Chaining/devolving AAs === In a Shibboleth transaction, a user tries to access or use a grid resource and is directed back to his/her IdP to actively authenticate. This is followed by the resource contacting the user's AA to pick up attributes with which to make the authorisation decision. (For the sake of our example, consider that the resource only allows access/use to entities which belong to a particular VO and have some particular attributes managed by that VO). The VO manages lists of users (or unique IDs) and the users' attributes. The resource 'asks' the home/main AA whether the user is a member of the VO. The AA is unable to manage this information but contains a pointer to the VO's AA`2`. The AA thus queries AA`2` and sends the information back to the resource in the SAML exchange. 
(In this way, the VO AA`2` trusts the IdP to authenticate the user - something which it cannot do itself - but then confirms back to the home/main AA that the user is (not) a member of the VO and he/she holds these attributes. Of course, the AA would also have to transmit the signed assertion from AA`2` to the resource). The only difficulty with this model is that it is incumbent on the AA to know where is the appropriate AA`2`. This may be achieved by setting it as a user-editable attribute. === Secondary query to AA2 === This is a possibly simpler mechanism whereby the resource has the IdP authenticate the user, and possibly supply some attributes, but the resource initiates a separate transaction to the VO AA`2` to find out whether the user is a member and to obtain his/her VO attributes. This is achievable using the current Shibboleth architecture (although the query to the AA`2` would be an extension). However, the AA would have to always pass a permanent unique user identifier to the resource. There will, therefore, be some future use cases where this model will break an anonymity or pseudonymity requirement. The user will always have to present the same unique identifier to the resource to gain access or use. This is likely to be an uncommon requirement, but will almost inevitably exist in some cases. Therefore, it may be that the ''chaining/devolving AAs'' model is preferable as it should be usable in both of these situations. | == (Shib) Attribute management is a scalability bottleneck == In a grid that relied upon Shibboleth, this would be true. Active attribute management would become the most onerous part of the security matrix. This is as it should be. Organisationally and procedurally, this is the most complex responsibility. Therefore, attribute management should be, as expected, the true bottleneck. 
Using attribute authorities (AAs) and Shibboleth/SAML transport, attributes can be managed so that they can change frequently and in near-real time. Most advantageously, they can be managed by the individuals who are truly in a position to judge or control their status. One area where the Shibboleth architecture may fall a little short is in supporting virtual organisations (VOs). Currently, there is space in the main architecture for only one attribute authority (AA) to be associated with an identity provider (IdP). For VOs to be managed easily, a user's main/home AA must be able to chain or devolve to third party AAs. An alternative model would be for the resource to initiate a new query to the secondary or VO AA (let's call this AA`2`). These two possible mechanisms are illustrated below: === (Shib) Chaining/devolving AAs === Figure 1, below, shows a possible generalised mechanism for this scenario. The direct user interaction has not been shown in order to highlight the most important points. In a Shibboleth transaction, a user tries to access or use a grid resource and is directed back to his/her IdP to actively authenticate. After the SP receives a 'handle' (temporary session identifier for the user - 1), this is followed by the resource contacting the user's AA to pick up attributes with which to make the authorisation decision (2). (For the sake of our example, consider that the resource only allows access/use to entities which belong to a particular VO and have some particular attributes managed by that VO). The VO manages lists of users (or unique IDs) and the users' attributes. The resource 'asks' the home/main AA whether the user is a member of the VO. The AA is unable to manage this information but contains a pointer to the VO's AA`2`. The AA thus queries AA`2` (3 and 4) and sends the information back to the resource in the SAML exchange (5). 
(In this way, the VO AA`2` trusts the IdP to authenticate the user - something which it cannot do itself - but then confirms back to the home/main AA that the user is (or is not) a member of the VO and that he/she holds these attributes. The AA transmits the signed assertions from AA`2` to the resource.) http://users.ox.ac.uk/~markn/wikifiles/AttribsFromVO1.png ~-Figure 1. Diagram of possible mechanism of chaining AAs for enabling VOs-~ The two difficulties with this model are that: * It is incumbent on the AA to know where the appropriate AA`2` is. This may be achieved by setting it as a user-editable attribute. * The home IdP/AA must again be used if - at a later time in the session - the SP needs to check membership of another VO (or the same one if this needs to be checked frequently). === (Shib) Secondary query to AA2 === This is a possibly simpler mechanism whereby the resource has the IdP authenticate the user, and possibly supply some attributes, but the resource initiates a separate transaction to the VO AA`2` to find out whether the user is a member and to obtain his/her VO attributes. This is achievable using the current Shibboleth architecture (although the query to the AA`2` would be an extension). However, the AA would always have to pass a permanent unique user identifier to the resource. Figure 2 shows a brief summary of such a mechanism. There will, therefore, be some future use cases where this model will break an anonymity or pseudonymity requirement. The user will always have to present the same unique identifier to the resource to gain access or use. This is likely to be an uncommon requirement, but will inevitably exist in some cases. Therefore, it may be that the ''chaining/devolving AAs'' model is preferable as it should be usable in both of these situations. Nevertheless, in the major initiative investigating these issues, it is variations on this second model that are apparently favoured (see the GridShib project). 
http://users.ox.ac.uk/~markn/wikifiles/AttribsFromVO2.png ~-Figure 2. Diagram of possible mechanism of chaining AAs for enabling VOs-~ [[Anchor(SHIBLoA)]] == (Shib) Grids need good levels of assurance == We hope that the arguments expressed in ''[#PKILoA (PKI) Grids need good levels of assurance]'' above have fully addressed this issue. In short, if there is a great distance between the day-to-day identity management and revocation and the central security system the (apparent) LoA is misleading. A user who robbed his university's computer suite last week and has had all his university passes, cards and accounts revoked is very likely to still hold his grid digital certificate because that is managed nationally. In this case, the university's HTTPS username/password system may provide far more ''assurance'' than his digital certificate that takes a year to expire. Of course, in the scenario above, the villain's identity has not changed, so you could argue that there is nothing wrong with assuring a third party that he is 'Mr Smith' but we have already discussed the implicit authorisation involved in holding a digital certificate identifying the user as a member of the O and the OU (and access decisions will be based upon these attributes). There are many scenarios where the university could have found that he is an identity fraudster and removed his cards and accounts. The central grid security personnel may be the last to know. | 
| Line 120: | Line 183: | 
| == Trust must be kept to a minimum on grids == We hope that the arguments put forward in the sections above, but especially in the [#PKIidenttrustorg Identity is best managed by a very trustworthy organisation] (PKI) section help to prove that there are more levels of trust in PKI than are usually acknowledged. In a devolved authentication system, the trust is always explicit and this may be seen as an advantage. With the planned support of virtual organisations (VOs) comes a necessary devolution of authorisation-supporting attributes. Shibboleth is generally a good architecture to support this. | == (Shib) Trust must be kept to a minimum on grids == We hope that the arguments put forward in the sections above, but especially in the ''[#PKIidenttrustorg (PKI) Identity is best managed by a very trustworthy organisation]'' section help to prove that there are more levels of trust in PKI than are usually acknowledged. In a devolved authentication system, the trust is always explicit and this may be seen as an advantage. With the planned support of virtual organisations (VOs) comes a necessary devolution of authorisation-supporting attributes. Shibboleth is generally a good architecture to support this. | 
| Line 124: | Line 187: | 
| == Security levels in the information environment are inadequate == See the above [#PKIsecurityieinadeq Security levels in the information environment are inadequate (PKI)] section as it is equally applicable here. | == (Shib) Security levels in the information environment are inadequate == See the above ''[#PKIsecurityieinadeq (PKI) Security levels in the information environment are inadequate]'' section as it is equally applicable here. In summary: * The PKI-based identity management is dependent upon the current identity management procedures of the 'information environment', so why not trust Shibboleth? * The use of digital certificates by users should not be taken as a short-hand for 'high security'. It is the policies that support the identity management system, combined with cryptographically secure assertions, that give a higher level of security assertion. * The over-centralisation of the PKI-mediated identity management means that active revocation is not pursued. This, arguably, tends to give the 'information environment' somewhat better security than the PKI-based arrangement. As Shibboleth is highly compatible with the existing information environment procedures and there are some security flaws in the existing PKI model, this lends some weight to the argument for using Shibboleth for realms of higher security, such as grids. Of course, Shibboleth and the local organisations' single sign-on mechanisms could be extended to accept certificates and, conversely, the PKI could be modified so that it ''explicitly'' trusts local identity management staff so that users are revoked more actively. | 
| Line 129: | Line 197: | 
| == Devolved authentication == The first argument to explore is the need for '''devolved authentication''', which Shibboleth can support easily, and which PKI ''could'' support, but usually does not. We believe that there is a strong argument for devolved authentication for both good security and high scalability reasons. As long as the mal-practicing user can be successfully and quickly traced by the resource provider or grid node, then devolved authentication is very nearly a necessity for a highly scalable and secure grid. Shibboleth provides possibilities for devolved authentication that should be secure enough for grid use as long as Federation security policies are followed by the home organisations (Identity Providers: `IdPs`). | [[Anchor(ConcsDevolvedAuthn)]] == Devolved authentication and ID management == The first argument to explore is the need for '''devolved ID management''' and identity assertion, which Shibboleth can support easily, and which PKI ''could'' support, but usually does not. We believe that there is a strong argument for devolved ID management for both good security and high scalability reasons. As long as the malpractising user can be traced, successfully and quickly, by the resource provider or grid node, then devolved ID management and authentication is very nearly a necessity for a highly scalable and secure grid. Shibboleth provides possibilities for devolved authentication and ID management that should be secure enough for grid use as long as Federation security policies are followed by the home organisations (Identity Providers: !IdPs). [[Anchor(ConcsAuthZAttribMgmt)]] | 
| Line 135: | Line 205: | 
| In any case - and especially for virtual organisations to be a possibility - it must be possible for the appropriate 'managers/administrators' to be able to set user attributes as easily as possible, and to be able to change these frequently and easily. Shibboleth could be easily extended to provide this functionality (see the section on [#SHIBattributemanscale Attribute management is a scalability bottleneck] (Shib) above for ideas as to how this may be achieved). See also the [http://gridshib.globus.org/ GridShib project] for further ideas and proposed solutions. Solutions such as [http://edg-wp2.web.cern.ch/edg-wp2/security/voms/voms.html VOMS] are addressing this with in the PKI arena. | In any case - and especially for virtual organisations to be a possibility - it must be possible for the appropriate 'managers/administrators' to be able to set user attributes as easily as possible, and to be able to change these frequently and easily.  Shibboleth could be easily extended to provide this functionality (see the section on ''[#SHIBattributemanscale (Shib) Attribute management is a scalability bottleneck]'' above for ideas as to how this may be achieved).  See also the ''[http://gridshib.globus.org/ GridShib project]'' for further ideas and proposed solutions.  Solutions such as ''[http://edg-wp2.web.cern.ch/edg-wp2/security/voms/voms.html VOMS (Virtual Organization Membership Service)]'' are addressing this within the PKI arena.  See also our thoughts on [:PolicyManagementAndExchange:policy management] for a summary of some of the work in this area. [[Anchor(ConcsTrustworthiness)]] == Trustworthiness and security == We outlined the difficulties of the [#assocRevo 'association-revocation gap'] in various sections above, but especially in ''[#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck]'' and ''[#SHIBmustscale (Shib) Grids must scale]''. 
If resource owners or grid middleware experts accept the premise that ''many trustworthy individuals performing tasks that are within their capabilities'' is a more secure situation than ''few very trustworthy individuals performing tasks that they cannot carry out securely'', then Shibboleth can be proved easily to be an excellent architecture for use within a next generation grid. Furthermore, Shibboleth allows for !IdPs and Resource Providers to belong to Federations with specific security rules. It could be possible that Federations include specific security policies for grids. Another possibility is that the [http://www.oasis-open.org/committees/security/ SAML] assertions that underpin Shibboleth can transmit a value of 'Level of Assurance' and/or 'Authentication Method' (e.g. password, kerberos, certificate etc.). We would prefer ''Level of Assurance'' to be used, and for this to remain separate from the ''Authentication Method''. LoA could then be used with grid security policies without the heavyweight need to establish new Federations. [[Anchor(ConcsScalability)]] == Scalability == Current grid security methods and policies - for example, those used within the UK e-Science Grid community - will not scale to encompass many more users. The current policies are geared towards each resource owner having an active and trusting relationship with each user (individually). This is clearly unscalable. A scalable solution necessarily involves devolved authentication, or at least the devolution of the management of user identities to local organisations. This is very difficult to achieve - although not impossible - using PKI and client certificates. It is easier to achieve via Shibboleth, as devolved authentication was a primary driver behind its design. Another 'model' of grid use that will grow in popularity is the Customer-!ServiceProvider model whereby a Service Provider takes on the responsibility of authenticating and authorising users. 
This could be direct or could again be devolved, using Shibboleth, for example. With this model, the Service Provider is the entity that is truly the grid 'user' but will run jobs on the grid on behalf of the person or entity making the request. This model is likely to be the most frequently employed by 'users' and would usually break the [http://www.dcoce.ox.ac.uk/glossary/index.xml.ID=CPS Certification Practices Statement] (issuing policy) associated with client-certificate PKI, but does not necessarily break the use of the technology. [[Anchor(ConcsUsabilityCerts)]] == The usability of client certificates == There appears to be a tension between the belief that client certificates are too difficult for 'normal' end users and the need for high security. We hope that, in our arguments above, we have clearly called into question the high security of client certificate PKI. The [http://www.dcoce.ox.ac.uk DCOCE project] found that digital certificates ''should'' not be too difficult for such users to employ successfully, but that currently they do pose difficulties. There are many small usability issues, each of which appears trivial for manufacturers and service providers to solve, but yet these issues remain, and their cumulative effect is of severe usability difficulties. Shibboleth would allow for the use of certificates as well as more user-friendly modes of authentication, such as https-based username/password authentication systems employed by a large number of organisations at present. This 'mixed economy' may provide the flexibility needed by the grid community. [[Anchor(ConcsSummary)]] == Summary == Shibboleth appears to provide a solution to the issues of scalability and of managing the large amount of identity information that is necessary. PKI could provide this solution in some forms, but it would require the policies that accompany PKI to be changed to devolve the identity management to people and bodies that are able to properly undertake this task. 
Whereas this ''can'' be achieved (as the [http://www.dcoce.ox.ac.uk DCOCE project] established), it should be far easier to integrate grid access management with that of the information environment community - i.e. using Shibboleth - and to concentrate efforts on security policies and Levels of Assurance. Thus, in practical terms, Shibboleth may be the best way of achieving grid access scalability and the high security and encryption that PKI provides can be used where it fits the purpose. Furthermore, it may be beneficial to allow many users who do not require 'deep' or 'technical' access to grid resources - but who nevertheless need to benefit from the grid - to avoid having to use client digital certificates. Shibboleth should provide an excellent mechanism to facilitate the use of many electronic identity/authentication credentials from username/password to digital certificates. | 
This work forms the output of the ESP-GRID workpackages five and six (PKI and Shibboleth Evaluations).
Summary
This document examines the theoretical applicability of Shibboleth and of client-certificate based public key infrastructure (PKI) to use within a grid environment. The analysis is addressed through examinations of identity management (where and by whom?), attribute management, levels of assurance and trust. Shibboleth is highly compatible with the current 'Information Environment' standards and procedures of identity management and role based access control. PKI has been seen as more secure, with its authentication assertions having a higher level of assurance. However, there are flaws in the implementations of PKI-based identity management and the solutions to these flaws could be found in either identity being managed completely separately to authorisation attributes or in identity being managed by local institutions. In current PKI implementations, the conflation of identity management and authorisation management is a difficult problem to solve. Shibboleth provides some answers to the separation of authentication and authorisation and also enables identities to be managed very 'close to home', within the organisations to which the individual belongs.
PKI implementations suffer from the problem of 'trustworthy' individuals (for example, Registration Authorities) being trusted to do more than they are able. This security problem is therefore somewhat hidden: it is better to be explicit and honest about trust. When the situation regarding trust is analysed, there is not a great difference between well-run organisation-based identity management and highly controlled PKI identity management via competent Registration Authorities. However, due to the usual lack of active revocation procedures with centralised PKI, it is likely that, on balance, a well-run locally managed system is more secure. There are several reasons why scalability is an issue with centralised PKI, not least that of usability for less technical grid users. Therefore - as Shibboleth is more naturally compatible with locally managed identities - it is very likely that Shibboleth will have to be considered a strong candidate in the drive to expand the use of grids to greater numbers of users.
Contents
- [#intro Introduction: how to use this document]
- [#contempassumpts Contemporary Assumptions] - [#mustscale Grids must scale] ; [#identitymanscalability Identity management is a scalability bottleneck] ; [#identtrustorg Identity is best managed by a very trustworthy organisation] ; [#attributemanscale Attribute management is a scalability bottleneck] ; [#LoA Grids need good levels of assurance] ; [#trustminimum Trust must be kept to a minimum on grids] ; [#securityieinadeq Security levels in the 'information environment' are inadequate]
 
- [#PKIvsassumpts How does PKI live up to these assumptions?] - [#PKImustscale Grids must scale] ; [#PKIidentitymanscalability Identity management is a scalability bottleneck] ; [#PKIidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#PKIattributemanscale Attribute management is a scalability bottleneck] ; [#PKILoA Grids need good levels of assurance] ; [#PKItrustminimum Trust must be kept to a minimum on grids] ; [#PKIsecurityieinadeq Security levels in the 'information environment' are inadequate]
 
- [#SHIBrole How could Shibboleth play a role?] - [#SHIBmustscale Grids must scale] ; [#SHIBidentitymanscalability Identity management is a scalability bottleneck] ; [#SHIBidenttrustorg Identity is best managed by a very trustworthy organisation] ; [#SHIBattributemanscale Attribute management is a scalability bottleneck] ; [#SHIBLoA Grids need good levels of assurance] ; [#SHIBtrustminimum Trust must be kept to a minimum on grids] ; [#SHIBsecurityieinadeq Security levels in the 'information environment' are inadequate]
 
- [#conclusions Conclusions] - [#ConcsDevolvedAuthn Devolved authentication and ID management] ; [#ConcsAuthZAttribMgmt Authorisation and attribute management] ; [#ConcsTrustworthiness Trustworthiness and security] ; [#ConcsScalability Scalability] ; [#ConcsUsabilityCerts The usability of client certificates] ; [#ConcsSummary Summary] 
 
Introduction: how to use this document
Following this introduction, this document is arranged into four major sections. The first addresses the [#contempassumpts Contemporary Assumptions] of grid security and other aspects of access management. Most of the assumptions portrayed are based on sound security principles, but some are possibly a little misplaced. Following this assertion of the current basic principles, we consider (briefly) [#PKIvsassumpts How PKI lives up to these assumptions], taking each assumption in turn. A similar treatment of [#SHIBrole How Shibboleth could play a role] comes next, and the document closes with the general [#conclusions Conclusions].
Contemporary Assumptions
Grids must scale
"The Grid" or "grids" are currently viewed by many as being at a stage of conceptual development equivalent to that of the world wide web and [:Information_Environment:information environment] intranets in the late 1980s. There is a widespread assumption that grid use will grow enormously as more people (and other end entities) find a use for high powered and distributed (computing) resources. It is also clear that access management is a far greater issue than for the web, as much more than 'read-only' access is required. We have to assume, therefore, that secure access management is a current limiting factor for the ability of the technologies to scale to serve large numbers of users. As an extension to this assumption, owners of computing power and expensive instrumentation are far more likely to open up their resources to a grid if they are confident that their resources are secure from harm and from use by unauthorised parties outside their (grid) community.
Identity management is a scalability bottleneck
Unlike a resource that is meant to be accessed read-only, a grid needs to identify its users. The management of those identities is an onerous task and one that needs to be executed via policies which all owners of grid resources can trust. As the number of users (or end entities) increases, this becomes an even more difficult task.
Identity is best managed by a very trustworthy organisation
The concept of authentication is often (somewhat erroneously) associated with the separate elements of identity establishment and subsequent on-line identity assertion. Authentication is the act of verifying that an electronic identity (username, distinguished name etc.) is being employed by the entity, person or process to whom/which it was issued. This relies upon the fact that the electronic identity was issued accurately in the first place. Thus, this early establishment of identity and the subsequent use of the identity need to be managed by a trustworthy organisation.
Attribute management is a scalability bottleneck
A user's attributes (roles, status etc.) change frequently, whereas his/her identity should change very infrequently. Therefore, the management of such attributes - which may be used as decision-triggers during authorisation - may be more onerous than the management of the identity.
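The distinction between a rarely changing identity and frequently changing attributes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Identity`, `AttributeStore` and `authorise` are hypothetical and do not correspond to any Shibboleth or PKI interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    """Established once under a strict policy; expected to change very rarely."""
    unique_id: str


@dataclass
class AttributeStore:
    """Hypothetical stand-in for the records held by whoever manages attributes."""
    _attrs: dict = field(default_factory=dict)

    def grant(self, who: Identity, attribute: str) -> None:
        self._attrs.setdefault(who.unique_id, set()).add(attribute)

    def revoke(self, who: Identity, attribute: str) -> None:
        self._attrs.get(who.unique_id, set()).discard(attribute)

    def attributes_of(self, who: Identity) -> frozenset:
        return frozenset(self._attrs.get(who.unique_id, set()))


def authorise(store: AttributeStore, who: Identity, required: set) -> bool:
    """The decision is triggered by current attributes, not by the identity alone."""
    return required <= store.attributes_of(who)


# Illustrative use: the identity never changes, but the attributes do.
user = Identity("user-1234")
store = AttributeStore()
store.grant(user, "member-of-vo-x")
allowed_before = authorise(store, user, {"member-of-vo-x"})
store.revoke(user, "member-of-vo-x")
allowed_after = authorise(store, user, {"member-of-vo-x"})
```

In this sketch all of the management effort falls on the `grant` and `revoke` calls, which is the sense in which attribute management may be more onerous than identity management.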
Grids need good levels of assurance
When a request is made and supported by an assertion that 'This is user 1234' or 'This user has been authenticated and I am happy that they are who they appear to be', there is a natural confidence level associated with the assertion. That level of confidence may be informed by the fact that we know that Organisation A doesn't issue its accounts (e.g. usernames/passwords, digital certificates etc.) in a secure way or that the Organisation has never been known to revoke a user, even though people obviously leave and are sacked etc. This would lead me to be less certain that the user is who he appears to be. Conversely, I may know that Organisation B has excellent procedures and that this user has a digital certificate, so I am very confident that she is who she says she is.
This confidence comes from the nature of the electronic credential, and how it is used, as well as the procedures (security policies) of the issuing organisation (the Identity Provider). Thus, when an assertion is made, either by something like Security Assertion Markup Language (SAML) or the presenting of a digital certificate, a level of assurance (LoA) can either be inferred or asserted as well. Different LoAs are commonly expressed as 'Minimum Level', 'Basic Level', 'Medium Level' or 'High Level'. These different expressions have become associated with different encryption standards, security practices and, especially, how the original identity of the user was established.
The current UK e-Science Grid uses client digital certificates (PKI - Public Key Infrastructure) and is considered to be at 'Medium Level Assurance'. Many currently believe that this is the minimum LoA that could be applied for grid use. We argue that this is too simplistic and short-sighted (see the discussion on [#PKILoA PKI Levels of Assurance] below).
Trust must be kept to a minimum on grids
This is always true as a general principle. Nevertheless, people often do not consider the related question of the difficulty of the task that they are choosing to trust another entity to carry out. For example, it may be better to trust a total of three entities to carry out a task (if it can be divided and where each sub-task is appropriately handled by each entity) than to trust one entity to carry out that same task (if the task is too difficult for that one entity). This is a potentially complex concept, which is expanded upon below (see [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck] for an argument about Registration Authorities (RAs) having responsibility for checking data that they cannot easily access). However, to illustrate this point, here is an abstract scenario:
- Consider a hotel that may have a special service where a Greeter meets guests and takes them directly to their rooms. This is the Greeter's primary or only job: assigning guests to rooms. The Greeter does not have time to look at the booking database and so receives instructions from Reception as to which room to take new arrivals. {{{ Task: Assigning guests to rooms, accurately and politely 
Responsibility: Greeter}}}
- Now imagine that the hotel owner hears that some guests have been assigned to the wrong rooms. An audit or study of the Greeter at work finds that he is 100% accurate in assigning the guests. Should the Greeter get the sack because he is responsible for this task? No, the audit should have looked at the whole operation and determined that the accuracy of Reception is also critical in this process. The Greeter cannot take responsibility for the whole task as he is not able to complete the whole task. The task should be expressed as: {{{ Task: Assigning guests to rooms, accurately and politely 
Responsibility: Reception (Assigning Rooms), Greeter (Delivering guests to rooms)}}}
- And if there are any problems, both the work of Reception and the Greeter should be investigated. 
When expressing the security of a system, because the architects know that trust must be kept to a minimum, it is tempting to conflate roles or tasks to a single entity (the Greeter, in our example). This can be highly misleading. If one entity (e.g. the Greeter) must trust another entity (e.g. Reception), then this should be highlighted explicitly.
Security levels in the 'information environment' are inadequate
By [:Information_Environment:'information environment'] we mean the environment that is managed for most of the users who join a local network and access many (often web-based) resources. It is in contrast to a 'grid' environment where users wish to execute/run a process or operate something remotely via grid middleware.
This assumption has been included so that it can be explored further, below. Many consider that large grids cannot trust the identity management and authentication credentials issued from users' home organisations where levels of security may reflect the historic situation in which users play more passive roles. In short, many grid users believe that universities, businesses and government agencies - to name a few examples - cannot be trusted to manage identities and user attributes that are used on grids.
How does PKI live up to these assumptions?
(PKI) Grids must scale
The ESP-GRID project remains agnostic on this issue. Some experienced commentators believe that the scalability of 'the grid' is being limited by the difficult usability problems of client-based PKI (Public Key Infrastructure)FootNote(See the [http://www.usabilitea.org/ Usabilitea] initiative and, for example, Beckles, B., Brostoff, S., and Ballard, B. A first attempt: initial steps toward determining scientific users’ requirements and appropriate security paradigms for computational grids (2004). Proceedings of the Workshop on Requirements Capture for Collaboration in e-Science, Edinburgh, 14-15 January 2004, 17-43.). In the background to this project is the DCOCE project (http://www.dcoce.ox.ac.uk) which found that the usability problems tend to come from many small issues, most of which would be trivial to fix. Nevertheless, the usability problems are many, they are not being fixed as the years pass, and they have a large negative effect on the user experience.
Anchor(PKIidentitymanscalability)
(PKI) Identity management is a scalability bottleneck
PKI-based grids address the requirement of good identity management by authorising nominated, trustworthy RAs to check the identity of the certificate applicant. This system requires the RAs to work to strict security policies, which is good. However, as the RAs usually have to attend a training course in order to be deemed trustworthy, there tend to be too few of them. This therefore becomes a scalability bottleneck. Many users have to physically travel large distances to visit their nearest RA. This is likely to deter some legitimate users.
Furthermore, even though the main purpose of the RA is for authentication and identity management, there is some implicit authorisation taking place during the transaction. This is greatly undesirable, but most PKI implementations mandate it. For example, consider a regional RA performing duties for users at 3 educational establishments, 2 businesses and one government office. The RA may require users to bring proof of identity, such as a passport, and/or a university/security card/pass with their name (and photograph) on it. The RA is therefore trusting that the university card or security pass was issued correctly in the first instance. The user will subsequently be issued a digital certificate where their 'Organisation' (O) or 'Organisational Unit' (OU) is (for example) University A. To perform the role of RA properly, the RA should, therefore, check the lists of whoever departs from the 6 "O"s for which he performs RA duties. This is an onerous task that is probably never properly fulfilled. Failing this, the Certification Authority (CA) should take on the task. However, the CA is at an even greater distance - organisationally - from the end units than is the RA, and therefore this is a near impossible task.
We conclude that identity management via PKI should work effectively, but only if the RA is an integral part of the home organisation (or at the absolute ideal, the group or department) of the end user and has access to the main enterprise directory of the users/members. Further, the RA should therefore be in prime control over the revocation process. As neither of these is usually true, the PKI is fundamentally flawed, or at best very difficult to scale (as large numbers of RAs are needed).
Anchor(assocRevo) For the sake of brevity, the above two problems - that of the RA having to trust another 'identity provider' and that of no appropriate person(s) managing the revocation process - are hereby summarised as the association-revocation gap.
(PKI) Identity is best managed by a very trustworthy organisation
As noted above, identity is easier to manage than attributes such as role and status. Again, as noted above, it is futile to trust a very trustworthy body or person to carry out a task that is, in practice, nearly impossible for them to achieve to an adequate level of service. As identity in PKI is almost always tied to the membership of the O or OU, most RAs are unable to manage this task (again, due to the 'association-revocation gap'), despite being well trained and trustworthy. RAs have to trust the personnel or registration departments within the OUs in order for them to carry out this task. This extra 'leaf' (or link) to the chain of trust is rarely, if ever, acknowledged. (This concept is very important, but difficult to express concisely. For more information, see Mark Norman's presentation [http://www.nesc.ac.uk/action/esi/contribution.cfm?Title=622 The case for devolved authentication: over-centralised security doesn't work] (Powerpoint format), given at the National e-Science Centre in Edinburgh, 20 October 2005. The same presentation is to be found here: [http://users.ox.ac.uk/~markn/Presentations/JiscNeSCMiddwareBriefingOct05.pdf Devolved authentication (lower file size and in PDF)].)
If identity could be truly un-coupled from changeable attributes, such as O, OU and other status information, then PKI may be more reliable, but there seems to be little will in the community to implement it in this way.
(PKI) Attribute management is a scalability bottleneck
Attribute management is not usually performed using PKI (although there are some possibilities using attribute certificates, or even attribute fields in the certificates themselves: the use of which is often inappropriate with long-term certificates). In the two sections immediately above, we argue that RAs are in a very poor place to manage attributes that will be used for authorisation decisions. Therefore this may be considered as either a flaw of PKI or a necessary absence.
(PKI) Grids need good levels of assurance
In future we anticipate that there will be much grid use that is operated in a Customer-ServiceProvider (like) manner and basic level assurance may be acceptable for these uses. Nevertheless, there will be much activity that requires medium or high level assurance, as users are able to - at least in part - control the behaviour of machines and to exclude others, temporarily, from using those machines. Thus medium level assurance will still be a requirement for some technical users, but a mixed economy of users will probably exist where basic level will be adequate for the majority of users.
A confusing side issue is the high degree of trust in the electronic nature of the credential (e.g. digital certificate). This can mask the importance of the security policies that were used to issue the credential: so much so, that many people will forget this important, 'bureaucratic' consideration altogether. Within the 'access management community' in the UK, there is certainly an emphasis at present on trusting the electronic credential more than the issuing policy and provenance of that credential. Thus, a high degree of trust is given to a user presenting with a digital certificate, but a lower degree of trust is given to a user who is known to have authenticated via username and password with HTTPS. The electronic credential is relevant, but may be secondary in importance to the policy by which the credential is issued. For example, some CAs will issue digital certificates to people who need only a valid email addressFootNote(Some certificate issuers will place a value in a field within the certificate or have the certificate signed by a particular root certificate that will indicate (in some way) 'Minimum Level Assurance'. Nevertheless people and machines often place too much trust in such client digital certificates.), whereas other institutions have very careful procedures for handing out their usernames and passwords. In this case the latter is more trustworthy and should be given a higher level of assurance (LoA).
We also argue that the [#assocRevo 'association-revocation gap'] in current practices of issuing digital certificates (see [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck], above) means that the LoA of many 'information environment' identity providers (IdPs) is as good as, if not better than, that of the Grid CA. This is mostly due to the lack of active revocation by the CA or RA.
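The argument above can be made concrete with a minimal sketch in Python. It assumes, purely for illustration, that an identity provider's effective LoA is capped by the weakest of three factors: the strength of the electronic credential, the rigour of the issuing policy, and how actively revocation is pursued. The `LoA` ordering follows the four level names used earlier; the class names, the weakest-link rule and the individual ratings are assumptions made for this sketch and are not drawn from any standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class LoA(IntEnum):
    """The four level names used in this document, weakest to strongest."""
    MINIMUM = 1
    BASIC = 2
    MEDIUM = 3
    HIGH = 4


@dataclass(frozen=True)
class IdentityProvider:
    name: str
    credential_strength: LoA   # e.g. client certificate vs password over HTTPS
    issuance_policy: LoA       # how carefully identity was first established
    revocation_practice: LoA   # how actively departed users are revoked

    def effective_loa(self) -> LoA:
        # Assumption for this sketch: assurance is capped by the weakest factor.
        return min(self.credential_strength,
                   self.issuance_policy,
                   self.revocation_practice)


# Illustrative ratings only; they are not measurements of any real provider.
email_only_ca = IdentityProvider(
    "CA issuing certificates on an email address alone",
    credential_strength=LoA.HIGH,
    issuance_policy=LoA.MINIMUM,
    revocation_practice=LoA.BASIC,
)
careful_university = IdentityProvider(
    "University with careful username/password issuing",
    credential_strength=LoA.BASIC,
    issuance_policy=LoA.MEDIUM,
    revocation_practice=LoA.MEDIUM,
)
```

Under these assumed ratings the certificate-issuing CA ends up below the password-based university: the credential type alone does not determine the assurance that a relying party should place in an assertion.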
(PKI) Trust must be kept to a minimum on grids
As noted above, this general principle is true. Due to the difficulties of handling authorisation-triggering attributes via PKI, authorisation is almost synonymous with authentication at many grid resources; i.e. authorisation decisions are usually based upon knowledge of identity. For example, a resource owner may consider that only 'these 200 distinguished-name-holders are able to use this resource'. This level of sophistication with respect to authorisation is unlikely to be satisfactory in the future where many more grid users exist and where the memberships of virtual organisations (VOs) change dynamically.
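The difference between the two styles of authorisation can be sketched as follows. This is a simplified illustration in Python, not a model of any grid middleware: the distinguished names, attribute names and values are all hypothetical.

```python
from typing import Mapping

# Identity-based authorisation: the resource owner maintains the list by hand.
ALLOWED_DNS = {
    "/C=UK/O=eScience/OU=UniversityA/CN=alice example",
    "/C=UK/O=eScience/OU=UniversityB/CN=bob example",
}


def authorise_by_identity(dn: str) -> bool:
    """Grant access only if this exact distinguished name is on the list."""
    return dn in ALLOWED_DNS


# Attribute-based authorisation: the resource owner states a policy once and
# relies on third parties (home organisation, VO) to assert the attributes.
REQUIRED_ATTRIBUTES = {"vo": "climate-vo", "role": "researcher"}


def authorise_by_attributes(attributes: Mapping[str, str]) -> bool:
    """Grant access if every required attribute is asserted with the right value."""
    return all(attributes.get(key) == value
               for key, value in REQUIRED_ATTRIBUTES.items())


# A user who joined the VO yesterday: unknown to the DN list, but holding
# the attributes that the policy asks for.
carol_dn = "/C=UK/O=eScience/OU=UniversityC/CN=carol example"
carol_attributes = {"vo": "climate-vo", "role": "researcher"}
```

With the identity list, every change in VO membership requires the resource owner to edit `ALLOWED_DNS`. With the attribute policy, the new member is admitted as soon as the VO asserts her attributes, and the resource owner's policy does not change.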
As a resource owner, it may not be possible for you to manage more than n users, as there must be an optimum number with whom you could have a direct relationship. Therefore you will have to trust third parties. Even for a very low number of users, a grid resource owner may be the last to find out that a user has been convicted for fraud, or has been determined to have hacked another resource.
The main area where PKI falls short in this concept is regarding the [#assocRevo 'association-revocation gap'] (see [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck], above). It is desirable (for some) to depict the PKI having a chain of trust that ends with the RA. However, this is misleading and untrue: the chain of trust ends with the person who issued the identity credential upon which the RA depends. (The section [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck], above, gives more details of this argument).
(PKI) Security levels in the 'information environment' are inadequate
In the above sections, we highlight the flaws in the current arrangements with respect to PKI implementations. Because it is a fact that PKI can support very high levels of security, there is a danger that people perceive that the technology=security, rather than the careful implementation of the technology and following strict policies=good security. It is our assertion that PKI would work well if RAs were embedded deeply in the organisational units (OUs). This is mostly due to security policies and knowledge of the status of individual users.
It follows that the concept of deeply embedded RAs is nearly that which exists with registration or personnel officers that manage the information environment authentication process. On this level, the information environment processes are more secure. Some of the technology used may be less secure (although this is changing rapidly). Therefore, with a greater knowledge of security pervading these 'information environment' managers, it could be that an adequate level of security could be achieved for grid use.
How could Shibboleth play a role?
(Shib) Grids must scale
Having considered that identity management performed centrally may be a threat to the scalability of grids, Shibboleth as a system of supporting devolved authentication can be seen to be a solution. The need for devolved authentication is highlighted by the arguments within the [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck] section above. Some assume that PKI avoids the difficulties of devolved authentication by using RAs and long term, secure digital authentication credentials. This is clearly false as RAs rely on third parties to vouch for the identities and statuses of the applicants that come before them. This is the "association" part within the [#assocRevo 'association-revocation gap'] concept that was mentioned earlier. This is (conveniently) not generally recognised. However, who issued that user's university card? It was a trusted devolved third party.
Anchor(SHIBidentitymanscalability)
(Shib) Identity management is a scalability bottleneck
Shibboleth would avoid this bottleneck as, almost by definition, the identity providers already support all of the users that are necessary. The only bottleneck would be if there were different authentication policies required by the grid communities: a reasonable request in some cases, but one which may place more demands on the identity managers than they have currently.
(Shib) Identity is best managed by a very trustworthy organisation
If identity can be separated completely from status, roles and other attributes - a separation in which Shibboleth excels - then identity becomes of lesser importance. Identity can be established once, using strict security policies and needs little active management.
(Shib) Attribute management is a scalability bottleneck
In a grid that relied upon Shibboleth, this would be true. Active attribute management would become the most onerous part of the security matrix. This is as it should be. Organisationally and procedurally, this is the most complex responsibility. Therefore, attribute management should be identified as, and expected to be, the true bottleneck. Using attribute authorities (AAs) and Shibboleth/SAML transport, attributes can be managed so that they can change frequently and in near-real time. Most advantageously, they can be managed by the individuals who are truly in a position to judge or control their status.
One area where the Shibboleth architecture may fall a little short is in supporting virtual organisations (VOs). Currently, there is space in the main architecture for only one attribute authority (AA) to be associated with an identity provider (IdP). For VOs to be managed easily, a user's main/home AA must be able to chain or devolve to third party AAs. An alternative model would be for the resource to initiate a new query to the secondary or VO AA (let's call this AA2). These two possible mechanisms are illustrated below:
(Shib) Chaining/devolving AAs
Figure 1, below, shows a possible generalised mechanism for this scenario. The direct user interaction has not been shown in order to highlight the most important points. In a Shibboleth transaction, a user tries to access or use a grid resource and is directed back to his/her IdP to actively authenticate. After the SP receives a 'handle' (temporary session identifier for the user - 1), this is followed by the resource contacting the user's AA to pick up attributes with which to make the authorisation decision (2). (For the sake of our example, consider that the resource only allows access/use to entities which belong to a particular VO and have some particular attributes managed by that VO). The VO manages lists of users (or unique IDs) and the users' attributes. The resource 'asks' the home/main AA whether the user is a member of the VO. The AA is unable to manage this information but contains a pointer to the VO's AA2. The AA thus queries AA2 (3 and 4) and sends the information back to the resource in the SAML exchange (5).
(In this way, the VO AA2 trusts the IdP to authenticate the user - something which it cannot do itself - but then confirms back to the home/main AA that the user is (or is not) a member of the VO and she holds these attributes. The AA transmits the signed assertions from AA2 to the resource.)
- http://users.ox.ac.uk/~markn/wikifiles/AttribsFromVO1.png - Figure 1. Diagram of possible mechanism of chaining AAs for enabling VOs 
The two difficulties with this model are that:
- It is incumbent on the AA to know where the appropriate AA2 is. This may be achieved by setting it as a user-editable attribute. 
- The home IdP/AA must again be used if - at a later time in the session - the SP needs to check membership of another VO (or the same one if this needs to be checked frequently).
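The numbered steps of the chaining mechanism can be sketched in Python as below. This is an in-memory illustration only: the class names, handles and attribute names are hypothetical, and the SAML transport and the signing of assertions are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VoAttributeAuthority:
    """AA2: run by the VO; holds membership and VO attributes per user ID."""
    vo_name: str
    members: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def query(self, user_id: str) -> Dict[str, str]:
        attrs = self.members.get(user_id)
        if attrs is None:
            return {"vo_member": "false"}
        return {"vo_member": "true", "vo": self.vo_name, **attrs}


@dataclass
class HomeAttributeAuthority:
    """Home AA: resolves session handles and holds a pointer to each user's AA2."""
    handles: Dict[str, str]                        # handle -> user ID
    local_attributes: Dict[str, Dict[str, str]]    # user ID -> home attributes
    aa2_pointer: Dict[str, VoAttributeAuthority]   # user ID -> AA2

    def query(self, handle: str) -> Dict[str, str]:
        user_id = self.handles[handle]             # (2) resource asks, using the handle
        result = dict(self.local_attributes.get(user_id, {}))
        aa2 = self.aa2_pointer.get(user_id)
        if aa2 is not None:
            result.update(aa2.query(user_id))      # (3) and (4) home AA queries AA2
        return result                              # (5) combined answer to the resource


def resource_decision(handle: str, home_aa: HomeAttributeAuthority,
                      required_vo: str) -> bool:
    """The resource sees only the handle (1), never the user's identity."""
    attrs = home_aa.query(handle)
    return attrs.get("vo_member") == "true" and attrs.get("vo") == required_vo


vo_aa2 = VoAttributeAuthority("climate-vo", {"user-1234": {"vo_role": "researcher"}})
home_aa = HomeAttributeAuthority(
    handles={"handle-7f3a": "user-1234", "handle-91bc": "user-5678"},
    local_attributes={"user-1234": {"affiliation": "staff"},
                      "user-5678": {"affiliation": "student"}},
    aa2_pointer={"user-1234": vo_aa2, "user-5678": vo_aa2},
)
```

Note that the user ID never leaves the home AA and AA2 in this sketch, so pseudonymity towards the resource is preserved, and that every VO check passes through `HomeAttributeAuthority.query`, which is the second difficulty listed above.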
(Shib) Secondary query to AA2
This is a possibly simpler mechanism whereby the resource has the IdP authenticate the user, and possibly supply some attributes, but the resource initiates a separate transaction to the VO AA2 to find out whether the user is a member and to obtain his/her VO attributes. This is achievable using the current Shibboleth architecture (although the query to the AA2 would be an extension). However, the AA would have to always pass a permanent unique user identifier to the resource. Figure 2 shows a brief summary of such a mechanism.
There will, therefore, be some future use cases where this model will break an anonymity or pseudonymity requirement. The user will always have to present the same unique identifier to the resource to gain access or use. This is likely to be an uncommon requirement, but will inevitably exist in some cases. Therefore, it may be that the chaining/devolving AAs model is preferable as it should be usable in both of these situations. Nevertheless, in the major initiative investigating these issues, it is variations on this second model that are apparently favoured (see the GridShib project).
- http://users.ox.ac.uk/~markn/wikifiles/AttribsFromVO2.png - Figure 2. Diagram of possible mechanism of a secondary query to AA2 for enabling VOs 
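The secondary-query mechanism described above can be sketched in the same simplified way. The Python below is an illustration under stated assumptions: the two dictionaries stand in for the home IdP/AA and the VO's AA2, and every identifier in them is hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the home IdP/AA: handle -> (permanent ID, attributes).
HOME_AA: Dict[str, Tuple[str, Dict[str, str]]] = {
    "handle-7f3a": ("user-1234@university-a.example", {"affiliation": "staff"}),
}

# Hypothetical stand-in for the VO's AA2, keyed by the permanent identifier.
VO_AA2: Dict[str, Dict[str, str]] = {
    "user-1234@university-a.example": {"vo": "climate-vo", "role": "researcher"},
}


def query_home_aa(handle: str) -> Tuple[str, Dict[str, str]]:
    """The home AA must release a permanent identifier, not just attributes."""
    return HOME_AA[handle]


def query_aa2(permanent_id: str) -> Dict[str, str]:
    """AA2 can only look the user up by an identifier that is stable across sessions."""
    return VO_AA2.get(permanent_id, {})


def resource_decision(handle: str, required_vo: str) -> Tuple[bool, str]:
    permanent_id, _home_attrs = query_home_aa(handle)
    vo_attrs = query_aa2(permanent_id)   # separate transaction initiated by the resource
    allowed = vo_attrs.get("vo") == required_vo
    # The identifier is returned to make explicit what the resource has learned.
    return allowed, permanent_id
```

The home AA is contacted only once, but the resource now holds the user's permanent identifier, which is the loss of anonymity or pseudonymity discussed above.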
(Shib) Grids need good levels of assurance
We hope that the arguments expressed in [#PKILoA (PKI) Grids need good levels of assurance] above have fully addressed this issue. In short, if there is a great distance between the day-to-day identity management and revocation and the central security system, the (apparent) LoA is misleading. A user who robbed his university's computer suite last week and has had all his university passes, cards and accounts revoked is very likely to still hold his grid digital certificate because that is managed nationally. In this case, the university's HTTPS username/password system may provide far more assurance than his digital certificate that takes a year to expire.
Of course, in the scenario above, the villain's identity has not changed, so you could argue that there is nothing wrong with assuring a third party that he is 'Mr Smith' but we have already discussed the implicit authorisation involved in holding a digital certificate identifying the user as a member of the O and the OU (and access decisions will be based upon these attributes). There are many scenarios where the university could have found that he is an identity fraudster and removed his cards and accounts. The central grid security personnel may be the last to know.
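The scenario can be expressed as a small Python sketch. All dates are hypothetical, and the acceptance rule is deliberately simplified: a credential is accepted if it has not expired and has not been revoked on or before the day it is presented.

```python
from datetime import date, timedelta
from typing import Optional


def credential_accepted(on: date, expires: date,
                        revoked_on: Optional[date]) -> bool:
    """Simplified relying-party check: not revoked yet, and not expired."""
    if revoked_on is not None and on >= revoked_on:
        return False
    return on <= expires


incident = date(2005, 10, 1)               # hypothetical date of the robbery

# The university acts locally the next day.
account_expires = date(2006, 9, 30)
account_revoked = incident + timedelta(days=1)

# The nationally managed certificate: nobody tells the CA, so no revocation.
certificate_expires = date(2006, 6, 30)
certificate_revoked = None

one_week_later = incident + timedelta(days=7)
account_ok = credential_accepted(one_week_later, account_expires, account_revoked)
certificate_ok = credential_accepted(one_week_later, certificate_expires,
                                     certificate_revoked)
```

Here `account_ok` is false and `certificate_ok` is true: the cryptographically stronger credential remains usable until it expires, because the party that knows about the incident is not the party that controls revocation.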
(Shib) Trust must be kept to a minimum on grids
We hope that the arguments put forward in the sections above, but especially in the [#PKIidenttrustorg (PKI) Identity is best managed by a very trustworthy organisation] section help to prove that there are more levels of trust in PKI than are usually acknowledged. In a devolved authentication system, the trust is always explicit and this may be seen as an advantage. With the planned support of virtual organisations (VOs) comes a necessary devolution of authorisation-supporting attributes. Shibboleth is generally a good architecture to support this.
(Shib) Security levels in the information environment are inadequate
See the above [#PKIsecurityieinadeq (PKI) Security levels in the information environment are inadequate] section as it is equally applicable here. In summary:
- The PKI-based identity management is dependent upon the 'information environment' current identity management procedures, so why not trust Shibboleth?
- The use of digital certificates by users should not be taken as a short-hand for 'high security'. It is the policies that support the identity management system, combined with cryptographically secure assertions, that give a higher level of security.
- The over-centralisation of the PKI-mediated identity management means that active revocation is not pursued. This, arguably, tends to give the 'information environment' somewhat better security than the PKI-based arrangement.
As Shibboleth is highly compatible with the existing information environment procedures and there are some security flaws in the existing PKI model, this lends some weight to the argument for using Shibboleth for realms of higher security, such as grids. Of course, Shibboleth and the local organisations' single sign-on mechanisms could be extended to accept certificates and, conversely, the PKI could be modified so that it explicitly trusts local identity management staff so that users are revoked more actively.
Conclusions
Devolved authentication and ID management
The first argument to explore is the need for devolved ID management and identity assertion, which Shibboleth can support easily, and which PKI could support, but usually does not. We believe that there is a strong argument for devolved ID management for both good security and high scalability reasons. As long as the malpractising user can be traced, successfully and quickly, by the resource provider or grid node, then devolved ID management and authentication is very nearly a necessity for a highly scalable and secure grid. Shibboleth provides possibilities for devolved authentication and ID management that should be secure enough for grid use as long as Federation security policies are followed by the home organisations (Identity Providers: IdPs).
Authorisation and attribute management
The blurring of what is authorisation and where it occurs often confuses thinking on these matters. It is our assertion that resource owners should set authorisation policies. Home organisations and virtual organisations should associate attributes (such as roles and status) with users and entities that allow the grid resources to make the authorisation decisions. We make no apologies for re-stating these principles. However, much thinking seems to be along the lines that a virtual organisation grants access to a resource; whereas this may be true in practice in many cases, it is theoretically only a subset case of the main principle that has just been stated.
In any case - and especially for virtual organisations to be a possibility - it must be possible for the appropriate 'managers/administrators' to be able to set user attributes as easily as possible, and to be able to change these frequently and easily. Shibboleth could be easily extended to provide this functionality (see the section on [#SHIBattributemanscale (Shib) Attribute management is a scalability bottleneck] above for ideas as to how this may be achieved). See also the [http://gridshib.globus.org/ GridShib project] for further ideas and proposed solutions. Solutions such as [http://edg-wp2.web.cern.ch/edg-wp2/security/voms/voms.html VOMS (Virtual Organization Membership Service)] are addressing this within the PKI arena. See also our thoughts on [:PolicyManagementAndExchange:policy management] for a summary of some of the work in this area.
Trustworthiness and security
We outlined the difficulties of the [#assocRevo 'association-revocation gap'] in various sections above, but especially in [#PKIidentitymanscalability (PKI) Identity management is a scalability bottleneck] and [#SHIBmustscale (Shib) Grids must scale].
If resource owners or grid middleware experts accept the premise that many trustworthy individuals performing tasks that are within their capabilities is a more secure situation than few very trustworthy individuals performing tasks that they cannot carry out securely, then Shibboleth can be proved easily to be an excellent architecture for use within a next generation grid.
Furthermore, Shibboleth allows for IdPs and Resource Providers to belong to Federations with specific security rules. It could be possible that Federations include specific security policies for grids. Another possibility is that the [http://www.oasis-open.org/committees/security/ SAML] assertions that underpin Shibboleth can transmit a value of 'Level of Assurance' and/or 'Authentication Method' (e.g. password, kerberos, certificate etc.). We would prefer Level of Assurance to be used, and for this to remain separate from the Authentication Method. LoA could then be used with grid security policies without the heavyweight need to establish new Federations.
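A minimal Python sketch of this preference follows. The `Assertion` class is a simplified stand-in and does not reproduce the SAML schema; its field names, the issuer URLs and the `LoA` ratings are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class LoA(IntEnum):
    MINIMUM = 1
    BASIC = 2
    MEDIUM = 3
    HIGH = 4


@dataclass(frozen=True)
class Assertion:
    """Simplified stand-in for the statements an IdP might make about a session."""
    subject_handle: str
    issuer: str
    level_of_assurance: LoA
    authentication_method: str   # carried for information, not used in the policy


def meets_grid_policy(assertion: Assertion, minimum: LoA = LoA.MEDIUM) -> bool:
    """The grid policy is stated in terms of LoA only, whatever the method."""
    return assertion.level_of_assurance >= minimum


password_login = Assertion("handle-7f3a", "https://idp.university-a.example",
                           LoA.MEDIUM, "password")
weak_certificate = Assertion("handle-91bc", "https://idp.email-only-ca.example",
                             LoA.BASIC, "certificate")
```

Because the policy never inspects `authentication_method`, a well-governed password login can satisfy it while a carelessly issued certificate does not, and no new Federation is needed to express the rule.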
Scalability
Current grid security methods and policies - for example, those used within the UK e-Science Grid community - will not scale to encompass many more users. The current policies are geared towards each resource owner having an active and trusting relationship with each user (individually). This is clearly unscalable. A scalable solution necessarily involves devolved authentication, or at least the devolution of the management of user identities to local organisations. This is very difficult to achieve - although not impossible - using PKI and client certificates. It is easier to achieve via Shibboleth, as devolved authentication was a primary driver behind its design.
Another 'model' of grid use that will grow in popularity is the Customer-ServiceProvider model whereby a Service Provider takes on the responsibility of authenticating and authorising users. This could be direct or could again be devolved, using Shibboleth, for example. With this model, the Service Provider is the entity that is truly the grid 'user' but will run jobs on the grid on behalf of the person or entity making the request. This model is likely to be the most frequently employed by 'users' and would usually break the [http://www.dcoce.ox.ac.uk/glossary/index.xml.ID=CPS Certification Practices Statement] (issuing policy) associated with client-certificate PKI, but does not necessarily break the use of the technology.
The usability of client certificates
There appears to be a tension between the belief that client certificates are too difficult for 'normal' end users and the need for high security. We hope that, in our arguments above, we have clearly called into question the high security of client certificate PKI. The [http://www.dcoce.ox.ac.uk DCOCE project] found that digital certificates should not be too difficult for such users to employ successfully, but that currently they do pose difficulties. There are many small usability issues, each of which appears trivial for manufacturers and service providers to solve, yet these issues remain, and their cumulative effect is one of severe usability difficulties. Shibboleth would allow for the use of certificates as well as more user-friendly modes of authentication, such as the HTTPS-based username/password authentication systems employed by a large number of organisations at present. This 'mixed economy' may provide the flexibility needed by the grid community.
Summary
Shibboleth appears to provide a solution to the issues of scalability and of managing the large amount of identity information that is necessary. PKI could provide this solution in some forms, but it would require the policies that accompany PKI to be changed to devolve the identity management to people and bodies that are able to properly undertake this task. Whereas this can be achieved (as the [http://www.dcoce.ox.ac.uk DCOCE project] established), it should be far easier to integrate grid access management with that of the information environment community - i.e. using Shibboleth - and to concentrate efforts on security policies and Levels of Assurance. Thus, in practical terms, Shibboleth may be the best way of achieving grid access scalability, while the high security and encryption that PKI provides can be used where they fit the purpose. Furthermore, it may be beneficial to allow many users who do not require 'deep' or 'technical' access to grid resources - but who nevertheless need to benefit from the grid - to avoid having to use client digital certificates. Shibboleth should provide an excellent mechanism to facilitate the use of many electronic identity/authentication credentials, from username/password to digital certificates.