What is a VO? - towards a definition
Please edit - or at least put in your comments to - this page.
You need to log in first. Create yourself a Profile first, by clicking the Login link above. Please do!
When you then leave your comment, please leave your name and the date/time. A quick way of doing this is to type the following:
@USER@ @TIME@
which will render something like:
-- markn DateTime(2006-02-20T18:47:22Z)
Contents
[#General General Expectations of a VO]
- [#1stPrinc First principles]
- [#Userperspec What is a VO from a user's perspective?]
- [#ServicePerspec What is a VO from a service developer's (or owner's) perspective?]
- [#RPsPerspec What is a VO from a resource provider's perspective?]
- [#ResVirtAbs Where resource virtualization is not present]
- [#ResVirtExists In the future, where resource virtualization exists]
[#Adef A definition?]
- [#GenNotes General notes and justifications]
- [#Definition First attempt at a definition]
General Expectations of a VO
First principles
(N.B. "authN" = Authentication; "authZ" = Authorisation)
- AuthZ is performed at the resource or service. It isn't the responsibility of the VO to do this (although this is usually the effective outcome).
- The VO houses and maintains attributes about users. These are given, on demand or 'up front', to the resource/service where the access decision is made.
- AuthN is not normally performed at the VOs (this would arguably make them "O's").
- Groups of resources are not VOs. They might be "grids" or they may have another collective name. (e.g. a campus grid is not a VO: it is a set of resources used by a sub-set of people on and off the campus. The set of users (and other entities) may be a VO.)
- In the examples that follow, the terms "grid" and "grids" denote groups of resources that are typically collaborating in some way. Typically, they would share the same middleware or key protocols. However, a single resource could theoretically belong to many grids.
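The split of responsibilities in these principles can be sketched in code. This is only an illustration: the class names `VO` and `Service`, the `role` attribute and the user names are all hypothetical, and authentication is assumed to have happened elsewhere before `authorise` is called.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VO:
    """Hypothetical VO: holds attributes about users, makes no access decisions."""
    name: str
    attributes: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def lookup(self, user_id: str) -> Optional[Dict[str, str]]:
        """Return the attributes held for user_id, or None if not a member."""
        return self.attributes.get(user_id)


@dataclass
class Service:
    """Hypothetical service: the access decision (authZ) is made here."""
    name: str
    required_role: str

    def authorise(self, user_id: str, vo: VO) -> bool:
        # Assumption: user_id has already been authenticated by a trusted party.
        attrs = vo.lookup(user_id)
        if attrs is None:
            return False  # not a member of the VO
        return attrs.get("role") == self.required_role


ies = VO("IES", {"alice": {"role": "member"}, "bob": {"role": "guest"}})
extinction = Service("extinction-rate", required_role="member")
```

The VO only answers the lookup; the service decides what to do with the answer.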
What is a VO from a user's perspective?
An (end) user (i.e. someone who does not run a service or own a grid node) should know (or find out) that for him to use a particular grid service, or collection of services, he must join a particular VO.
- For example, a biologist wants to run her data through the grid-based extinction-rate algorithm service. She finds out that this is provided by computer scientists working at the International Ecological Society. She joins the IES. When she attempts to use the Extinction Rate Grid Application, the underlying service finds out that she is a member of the IES and allows her to proceed.
What is a VO from a service developer's (or service owner's) perspective?
A service may be provided and maintained within a grid by someone who decides to (or is mandated to) serve users within a particular community. The VO represents that 'community'.
- For example, a developer has been funded by the IES to provide a service for all authenticated members of higher education institutions throughout the world as well as all members of the IES. The service developer is partly responsible for ensuring that the service cannot be used by people outside these communities.
What is a VO from a resource provider's perspective?
A resource provider may own machines upon which grid services are run. The resource provider may have a personal preference (or one which comes from his organisation) as to which services are run on his resource.
- For example, the resource provider may wish to exclude all biological services in favour of providing resource for text-mining services. Therefore, he does not have to worry about biological VOs.
Where the resource provider allows services, he may wish to account for the use of his resource. He could do this in one of three ways:
- Count the cycles used by particular services or applications and bill the owners/maintainers of those services/apps (and leave them to charge their users if they wish to do so).
- Identify every user and bill them directly for cycles used.
- (As a combination of the above) identify every user, look each one up in the VOs, exempt the members of those VOs to which he wishes to provide his resource free of charge, and bill the remaining users. (Clearly the first option in this list is the easiest, logically, but mechanisms need to exist to enable this scenario.)
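The combined option, where users are looked up in VOs before billing, might look something like the sketch below. The function name `bill_users`, the flat rate per cycle and the sample figures are assumptions made purely for illustration.

```python
def bill_users(usage, membership, free_vos, rate_per_cycle):
    """Return {user: charge} for users not covered by a free-of-charge VO.

    usage:      {user: cycles used on this resource}
    membership: {user: set of VO names the user belongs to}
    free_vos:   set of VO names served free of charge (hypothetical policy)
    """
    bills = {}
    for user, cycles in usage.items():
        user_vos = membership.get(user, set())
        if user_vos & free_vos:
            continue  # user belongs to at least one VO served free of charge
        bills[user] = cycles * rate_per_cycle
    return bills


usage = {"alice": 1000, "bob": 500, "carol": 200}
membership = {"alice": {"IES"}, "bob": {"TextMining"}, "carol": set()}
bills = bill_users(usage, membership, free_vos={"IES"}, rate_per_cycle=2)
```

Here alice is exempt through her IES membership, while bob and carol are billed directly.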
Those resource providers who are not deliberately joining a diverse grid may wish to restrict the use of their resources to only some services and only some VOs. In this case, the resource provider has the same concerns and
- restricts the services and/or
- restricts the VOs
that can use the resource.
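A resource provider's restriction of services and/or VOs could be expressed as a simple policy check. This is a sketch under assumptions: `ResourcePolicy` and its fields are hypothetical names, and `None` is used here to mean "no restriction".

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class ResourcePolicy:
    """Hypothetical policy held by a resource provider."""
    allowed_services: Optional[FrozenSet[str]] = None  # None = any service
    allowed_vos: Optional[FrozenSet[str]] = None       # None = any VO

    def permits(self, service: str, user_vos: Set[str]) -> bool:
        # Restrict the services that may run on the resource...
        if self.allowed_services is not None and service not in self.allowed_services:
            return False
        # ...and/or restrict the VOs whose members may use it.
        if self.allowed_vos is not None and not (self.allowed_vos & user_vos):
            return False
        return True


policy = ResourcePolicy(
    allowed_services=frozenset({"text-mining"}),
    allowed_vos=frozenset({"IES"}),
)
```

Leaving either field unset gives the "and/or" behaviour: only the restriction that is actually configured is applied.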
Where resource virtualization is not present
In grids where the end user is fully aware of using a particular grid node, the node owner may be considered to have a similar interest to that of the service developer in the above examples. The node owner is directly concerned with users and what they do on her machine.
(In the future, or...) Where resource virtualization exists
End users will not have a direct relationship with resource owners. AuthN, AuthZ (access control) and accounting will have to be performed at either the application or service levels (whichever is appropriate). Alternatively, resource brokers may have to exclude users in certain VOs from accessing certain resources. Resource owners may wish to bill service providers and/or application providers (which may be synonymous with VOs) for the CPU time when those services or applications are active at their resources.
Responsibilities diagram
Note that I have used the concept of "billing" below as an example of accounting. I think that this is a useful concept in bringing VOs into focus. Even if we cannot envisage actually charging for the use of a service or resource, it provides a useful example of metering and quota-filling (which are far easier to envisage) as well as sophisticated authZ.
Architecture Level:  Resource                   Service/application        User

Authentication       AuthN of service           AuthN of user (may be      AuthN'ed by service or
                                                devolved, but proof        resource or 3rd party
                                                needed)                    trusted by serv/resource

Authorization        AuthZ of service           AuthZ of user              AuthZ at service
                                                <*VO lookup*>
                     -- OR --
                     AuthZ of all users  ------------------------------->  AuthZ at every
                     <*VO lookup*>                                         resource
                     -- OR --
                     AuthZ devolved to                                     AuthZ at res. broker
                     res. broker

Accounting/          Bills service/app  ----->  Bills user [1,2]  ----->   Is billed [1] by
Billing              or VO                                                 service/app. or VO
                     -- OR --
                     Bills user directly ------------------------------->  Is billed by (possibly
                                                                           many) resource
                                                                           provider(s)

[1] Optional: Some services will not bill users, as they may be funded directly (without accounting) by the VO.
[2] Service/app may have to provide usage info to VO so that VO can bill users accurately.

VO possible          1. Provide info. to service/app for AuthZ decision.
responsibilities     2. Provide info. to resource for AuthZ decision.
                     3. Provide info. to res. broker for AuthZ decision.
                     4. Hold a repository of usage statistics for individual users.
                     5. Hold a mapping of identifiers from IdPs (e.g. DNs) to the VO's own user identifier.

VO mandatory         1. Provide user info. to service/apps and/or resources
responsibilities        and/or resource brokers for AuthZ decisions.
A definition?
General notes and justifications
Foster et al. (2001) suggested, "A Virtual Organization is a collection of individuals and institutions that is defined according to a set of resource sharing rules".[A] This definition seems too narrow and possibly alien to the real world. The reasons for saying this are:
- "institutions" clouds the issue. In the last few years, VOs have been thought of as most likely to be subsets of users from within institutions grouped with other subsets of users at other institutions. Perhaps the definition would be better with the word "institutions" absent altogether or included as "and/or institutions" if necessary.
- "a set of resource sharing rules" is again too narrow. Users can belong to a VO (such as the International Ecological Society) and resources (or services) choose to be available to the VO's users.
- the definition also implies that the VO owns the resources (or at least drives the access policies of the resources), which may or may not be the case (in the real-world examples that have arisen recently).
Although they cite the above Foster et al. definition, in 2004 the community developing the Virtual Organization Membership Service (VOMS) software discussed VOs in terms of users, rather than institutions or resources.[B] Alfieri et al. noted that "VOs generally share resources", but they clearly will not always share resources, and there will certainly be VOs in existence that do not own any resources at all. Later in the same document, Alfieri et al. stress that "VOs administer users, grant them permissions and establish agreements with resource providers (RPs). RPs, in turn, enforce local authorization." This seems far closer to a realistic definition of a VO. Alfieri et al. also state that "the owner of a resource (i.e. the RP) should be able to enforce local user authorization based on various user characteristics such as his membership in a VO, roles he can have or his identity".
[A] I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid, International Journal of High Performance Computing Applications, 15, 3, 2001
[B] R. Alfieri, R. Cecchini, V. Ciaschini, F. Spataro, L. dell'Agnello, A. Frohner and K. Lörentey, From gridmap-file to VOMS: managing Authorization in a Grid environment, Future Generation Computer Systems, 2005, http://grid-auth.infn.it/docs/voms-FGCS.pdf, April, 2004.
Characteristics of a VO
The following are therefore characteristics of a VO, expressed in terms that attempt to avoid over-restriction or promoting concepts such as "usually" or "generally" too highly. VOs
- represent groups of users which may cross administrative boundaries
- should imply a definable membership in that users can join and leave such groups
- (with respect to grids) represent communities for which access to grid resources or services or applications may be granted or denied
- may contain varying statuses of user and attributes about those users
- are definable in themselves as lists of members
- do not normally provide identity, but instead rely on externally trusted parties for identity establishment (authentication).
First attempt at a definition
A VO is definable as a list of identified users that represents a real-world group of people [*] that have a clear membership. The VO is not usually the primary point for the establishment or assertion of identity and may be relied upon by grid resources, services and applications to provide information for authorisation decisions. At its simplest, a VO contains a list of members and their unique identifiers. At its most complex, a VO may contain different status levels of members and many attributes about the members as well as accounting information regarding members' use of grid resources, services or applications.
END OF DEFINITION
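As a rough illustration of the definition above, the simplest and most complex forms of a VO could share one data structure. Everything in this sketch is hypothetical (the class names `Member` and `VirtualOrganisation`, the default status, the use of cycle counts as accounting information); note that it stores identifiers but does not establish identity.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Member:
    identifier: str                                           # unique identifier within the VO
    status: str = "member"                                    # optional status level
    attributes: Dict[str, str] = field(default_factory=dict)  # optional attributes
    usage: Dict[str, int] = field(default_factory=dict)       # optional accounting data


@dataclass
class VirtualOrganisation:
    name: str
    members: Dict[str, Member] = field(default_factory=dict)

    def join(self, identifier: str, status: str = "member") -> Member:
        member = Member(identifier, status)
        self.members[identifier] = member
        return member

    def leave(self, identifier: str) -> None:
        self.members.pop(identifier, None)

    def is_member(self, identifier: str) -> bool:
        return identifier in self.members

    def record_usage(self, identifier: str, service: str, cycles: int) -> None:
        # Accumulate usage per service for an existing member.
        member = self.members[identifier]
        member.usage[service] = member.usage.get(service, 0) + cycles


ies = VirtualOrganisation("IES")
ies.join("alice")
ies.join("bob", status="administrator")
ies.record_usage("alice", "extinction-rate", 120)
ies.leave("bob")
```

At its simplest only `identifier` is used; the status, attributes and usage fields cover the more complex end of the definition.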
Questions
[*] I was very tempted to put "and other entities" here, but maybe we don't need those words? It is hard to imagine a 'machine', for example, being given a membership of a VO so that any user or service running on that machine can use other machines that have been made available to the VO. Surely, it's the end users who are the important factors. But maybe someone needs to come up with a good example!!
Previous Work
I have created a page of links etc. to PreviousVOwork. This should help us see how definitions have evolved and how they have been used in the past few years.