What is a VO? - towards a definition
Contents
- General Expectations of a VO
  - First principles
  - What is a VO from a user's perspective?
  - What is a VO from a service developer's (or owner's) perspective?
  - What is a VO from a resource provider's perspective?
    - Where resource virtualization is not present
    - In the future, where resource virtualization exists
  - Responsibilities diagram
- A definition?
  - General notes and justifications
  - First attempt at a definition
General Expectations of a VO
First principles
- AuthZ is performed at the resource or service. It isn't the responsibility of the VO to do this (although this is usually the effective outcome). 
- The VO houses and maintains attributes about users. These are given, on demand or 'up front', to the resource/service where the access decision is made.
- AuthN is not normally performed at the VOs (this would arguably make them "O's").
- Groups of resources are not VOs. They might be "grids" or they may have another collective name. For example, a campus grid is not a VO: it is a set of resources used by a subset of people on and off the campus. The set of users (and other entities) may be a VO.
- In the examples that follow, the term "grid" denotes a group of resources that typically collaborate in some way. Typically, they would share the same middleware or key protocols. However, a single resource could theoretically belong to many grids.
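The first three principles can be illustrated with a minimal Python sketch. The names used here (`VirtualOrganisation`, `Service`, `attributes_for`, `authorise`) are hypothetical and do not correspond to any existing grid middleware. The sketch assumes the user has already been authenticated elsewhere: the VO only stores and releases attributes, and the access decision is taken at the service.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    """Holds attributes about users; it neither authenticates nor authorises."""
    name: str
    members: dict = field(default_factory=dict)  # user id -> set of attributes

    def attributes_for(self, user_id):
        # Release attributes on demand; unknown users get an empty set.
        return set(self.members.get(user_id, set()))


@dataclass
class Service:
    """A grid service that makes its own access decision (AuthZ)."""
    vo: VirtualOrganisation
    required_attribute: str

    def authorise(self, authenticated_user_id):
        # The user id is assumed to have been authenticated by a third party.
        attributes = self.vo.attributes_for(authenticated_user_id)
        return self.required_attribute in attributes


ies = VirtualOrganisation("IES", {"alice": {"member"}})
service = Service(vo=ies, required_attribute="member")
print(service.authorise("alice"))  # True
print(service.authorise("bob"))    # False
```

Keeping `authorise` on the service rather than on the VO mirrors the principle that the VO informs the decision but does not make it.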
What is a VO from a user's perspective?
An (end) user (i.e. someone who does not run a service or own a grid node) should know (or find out) that for him to use a particular grid service, or collection of services, he must join a particular VO.
- For example, a biologist wants to run her data through the grid-based extinction-rate algorithm service. She finds out that this is provided by computer scientists working at the International Ecological Society. She joins the IES. When she attempts to use the Extinction Rate Grid Application, the underlying service finds out that she is a member of the IES and allows her to proceed.
What is a VO from a service developer's (or owner's) perspective?
A service may be provided and maintained within a grid by someone who decides to (or is mandated to) serve users within a particular community. The VO represents that 'community'.
- For example, a developer has been funded by the IES to provide a service for all authenticated members of higher education institutions throughout the world as well as all members of the IES. The service developer is partly responsible for ensuring that the service cannot be used by people outside these communities.
What is a VO from a resource provider's perspective?
A resource provider may own machines upon which grid services are run. The resource provider may have a personal preference (or one which comes from his organisation) as to which services are run on his resource.
- For example, the resource provider may wish to exclude all biological services in favour of providing resource for text-mining services. Therefore, he does not have to worry about biological VOs.
Where the resource provider allows services, he may wish to account for the use of his resource. He could do this in three ways:
- Count the cycles used by particular services or applications and bill the owners/maintainers of those services/apps (and leave them to charge their users if they wish to do so).
- Identify every user and bill them directly for cycles used.
- (As a combination of the above) identify every user, look them up in the VOs, exclude members of the VOs to which he wishes to provide his resource free of charge, and bill the remaining users.
Clearly the first option in the above list is logically the easiest, but mechanisms need to exist to enable this scenario.
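The accounting options listed above can be compared with a small sketch. The usage records, the unit price and the function names are all invented for illustration; the text does not prescribe any billing mechanism.

```python
from collections import defaultdict

# Hypothetical usage records: (user, VO, service, cycles used).
USAGE = [
    ("alice", "IES", "extinction-rate", 100),
    ("bob", "TextMiners", "text-mining", 50),
    ("carol", "IES", "text-mining", 30),
]
PRICE_PER_CYCLE = 2  # arbitrary unit price, an assumption for this sketch


def bill_by_service(usage):
    """Bill the owner of each service for the cycles that service consumed."""
    totals = defaultdict(int)
    for _user, _vo, service, cycles in usage:
        totals[service] += cycles * PRICE_PER_CYCLE
    return dict(totals)


def bill_by_user(usage, free_vos=frozenset()):
    """Bill each user directly, skipping users whose VO is served free of charge."""
    totals = defaultdict(int)
    for user, vo, _service, cycles in usage:
        if vo not in free_vos:
            totals[user] += cycles * PRICE_PER_CYCLE
    return dict(totals)


print(bill_by_service(USAGE))               # per-service totals
print(bill_by_user(USAGE))                  # per-user totals
print(bill_by_user(USAGE, free_vos={"IES"}))  # per-user totals, IES exempt
```

Billing by service needs no knowledge of users or VOs at the resource, whereas the combined approach requires a VO lookup for every user, which is why the former is the simpler scenario.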
Those resource providers who are not deliberately joining a diverse grid may wish to restrict the use of their resources to only some services and only some VOs. In this case, the resource provider has the same concerns and
- restricts the services and/or
- restricts the VOs
that can use the resource.
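A restriction of this kind might be expressed as a simple policy object at the resource. This is a sketch under stated assumptions: `ResourcePolicy` and its fields are hypothetical names, and `None` is used here to mean "no restriction" so that either restriction can be applied on its own.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ResourcePolicy:
    # None means "no restriction" for that dimension.
    allowed_services: Optional[FrozenSet[str]] = None
    allowed_vos: Optional[FrozenSet[str]] = None

    def permits(self, service, user_vos):
        """Allow a request only if both the service and the user's VOs pass."""
        service_ok = self.allowed_services is None or service in self.allowed_services
        vo_ok = self.allowed_vos is None or bool(self.allowed_vos & set(user_vos))
        return service_ok and vo_ok


policy = ResourcePolicy(
    allowed_services=frozenset({"text-mining"}),
    allowed_vos=frozenset({"TextMiners", "IES"}),
)
print(policy.permits("text-mining", {"IES"}))          # True
print(policy.permits("extinction-rate", {"IES"}))      # False
print(policy.permits("text-mining", {"Physicists"}))   # False
```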
Where resource virtualization is not present
In grids where the end user is fully aware of using a particular grid node, the node owner may be considered to have a similar interest to that of the service developer in the above examples. The node owner is directly concerned with users and what they do on her machine.
In the future, where resource virtualization exists
End users will not have a direct relationship with resource owners. AuthN, AuthZ (access control) and accounting will have to be performed at either the application or service level (whichever is appropriate). Alternatively, resource brokers may have to exclude users in certain VOs from accessing certain resources. Resource owners may wish to bill service providers and/or application providers (which may be synonymous with VOs) for the CPU time used when those services or applications are active on their resources.
Responsibilities diagram
Architecture
Level:         Resource          Service/application         User
Authentication AuthN of          AuthN of user (may be       AuthN'ed by
               service           devolved, but proof         service or
                                 needed)                     resource or
                                                             3rd party
                                                             trusted by
                                                             serv/resource
Authorization  AuthZ of          AuthZ of user               AuthZ at
               service           <*VO lookup*>               service
                                    -- OR --
               AuthZ of                                      AuthZ at
               all users   --------------------------------> every
             <*VO lookup*> --------------------------------> resource
                                    -- OR --
               AuthZ
               devolved to                                   AuthZ at
               res. broker                                   res. broker
Accounting/    Bills       ----->      Bills [1,2]  -------> Is billed [1]
Billing        service/app             User                  by service/app.
               or VO                                         or VO
                                    -- OR --
               Bills user  --------------------------------> Is billed by
               directly    --------------------------------> (possibly many)
                                                             resources
                                                             
                                                             
[1] Optional: Some services will not bill users, as they may be funded
directly (without accounting) by the VO.
[2] Service/app may have to provide usage info to VO so that VO can bill
users accurately.
VO possible        1. Provide info. to service/app for AuthZ decision.
responsibilities   2. Provide info. to resource    for AuthZ decision.
                   3. Provide info. to res. broker for AuthZ decision.
                   4. Hold a repository of usage statistics for individual
                      users.
                   5. Hold a mapping of identifiers from IdPs (e.g. DNs) to
                      VO's own user identifier.
                      
VO mandatory       1. Provide user info. to service/apps and/or resources
responsibilities      and/or resource brokers for AuthZ decisions.
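Two of the possible responsibilities listed above, holding usage statistics and mapping IdP identifiers to the VO's own identifiers, can be sketched as follows. `VORegistry`, its methods, the example DN and the identifier `ies-0001` are all hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


class VORegistry:
    """Hypothetical VO back end: identifier mapping plus usage statistics."""

    def __init__(self):
        self._dn_to_vo_id = {}          # IdP identifier (e.g. a DN) -> VO user id
        self._usage = defaultdict(int)  # VO user id -> cycles reported by services

    def register(self, dn, vo_user_id):
        self._dn_to_vo_id[dn] = vo_user_id

    def vo_id_for(self, dn):
        # Returns None when the identifier is not known to the VO.
        return self._dn_to_vo_id.get(dn)

    def record_usage(self, dn, cycles):
        vo_user_id = self.vo_id_for(dn)
        if vo_user_id is None:
            raise KeyError(f"{dn!r} is not a member of this VO")
        self._usage[vo_user_id] += cycles

    def usage_for(self, vo_user_id):
        return self._usage.get(vo_user_id, 0)


registry = VORegistry()
registry.register("/O=Example/CN=Alice", "ies-0001")  # made-up DN
registry.record_usage("/O=Example/CN=Alice", 120)
registry.record_usage("/O=Example/CN=Alice", 30)
print(registry.usage_for("ies-0001"))  # 150
```

Recording usage against the VO's own identifier, rather than the DN, means a user's statistics survive a change of identity provider, provided the mapping is updated.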
A definition?
General notes and justifications
Foster et al. (2001) defined VOs as follows: "A Virtual Organization is a collection of individuals and institutions that is defined according to a set of resource sharing rules".[A] This definition seems too narrow and possibly alien to the real world, for the following reasons:
- "institutions" clouds the issue. In the last few years, VOs have been thought of as most likely to be subsets of users from within institutions linking with other subsets of users at other institutions. Perhaps the definition would be better with the word "institutions" absent altogether, or included as "and/or institutions" if necessary.
- "a set of resource sharing rules" is again too narrow. Users can belong to a VO (such as the International Ecological Society) and resources (or services) can choose to be available to the VO's users.
- the definition also implies that the VO owns the resources (or at least drives the access policies of the resources), which it may or may not do in the real-world examples that have arisen recently.
Although they cite the Foster et al. definition above, the community developing the Virtual Organization Membership Service (VOMS) software discussed VOs in 2004 in terms of users, rather than institutions or resources.[B] Alfieri et al. noted that "VOs generally share resources", but VOs clearly will not always share resources, and there will certainly be VOs that do not own any resources at all. Later in the same document, Alfieri et al. stress that "VOs administer users, grant them permissions and establish agreements with resource providers (RPs). RPs, in turn, enforce local authorization." This seems far closer to a realistic definition of a VO. Alfieri et al. also state that "the owner of a resource (i.e. the RP) should be able to enforce local user authorization based on various user characteristics such as his membership in a VO, roles he can have or his identity."
[A] I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid, International Journal of High Performance Computing Applications, 15, 3, 2001.
[B] R. Alfieri, R. Cecchini, V. Ciaschini, F. Spataro, L. dell'Agnello, A. Frohner and K. Lörentey, From gridmap-file to VOMS: managing Authorization in a Grid environment, xxxxURLxxxx, April, 2004.
Characteristics of a VO
The following are therefore characteristics of a VO, expressed in terms that attempt to avoid over-restriction and to avoid leaning too heavily on qualifiers such as "usually" or "generally". VOs
- represent groups of users which may cross administrative boundaries
- should imply definable membership in such groups, in that users can join and leave
- (with respect to grids) represent a community for which access to grid resources or services or applications may be granted or denied
- may contain varying statuses of user and attributes about those users
- are definable in themselves as lists of members
- do not normally provide identity, but instead rely on externally trusted parties for identity establishment (authentication).
First attempt at a definition
A VO is definable as a list of identified users that represents a real-world group of people that have a clear membership. The VO is not usually the primary point for the establishment or assertion of identity and may be relied upon by grid resources, services and applications to provide information for authorisation decisions. At its simplest a VO contains a list of members and their unique identifiers. At its most complex a VO may contain different status levels of members and many attributes about the members as well as accounting information regarding members' use of grid resources, services or applications.
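As a rough illustration of this definition, the simplest and most complex forms of a VO might be modelled with the same data structure. This is only a sketch: the type and field names (`Member`, `VO`, `status`, `cycles_used`) and the example identifiers are assumptions introduced here, not part of any VO software.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    identifier: str                                 # unique within the VO
    status: str = "member"                          # optional status level
    attributes: dict = field(default_factory=dict)  # optional user attributes
    cycles_used: int = 0                            # optional accounting information


@dataclass
class VO:
    name: str
    members: dict = field(default_factory=dict)     # identifier -> Member

    def join(self, member):
        self.members[member.identifier] = member

    def leave(self, identifier):
        self.members.pop(identifier, None)

    def is_member(self, identifier):
        return identifier in self.members


# Simplest case: a list of members and their unique identifiers.
simple_vo = VO("IES")
simple_vo.join(Member("ies-0001"))

# Most complex case: status levels, attributes and accounting information.
rich_vo = VO("IES")
rich_vo.join(Member("ies-0002", status="fellow",
                    attributes={"role": "analyst"}, cycles_used=150))
```

Note that neither form stores credentials: identity is assumed to be established by an external party, in line with the definition.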