Work in progress. This is work building up to a final requirements document on grid AAA/Security. It is based directly upon [wiki:RequirementsBibliography Shawn Mullen et al.'s] (2004) document on "Grid Authentication, Authorization and Accounting Requirements" and will hopefully form a thread in taking the Mullen et al. work forward. Most of the requirement text is from that document; where we disagree, we have indicated this.

Thus:


REAL TITLE: The requirements for an ideal grid

Should this be "an 'ideal' grid" or "an idealised grid" or whatever?

This document

The sections of this document are:

  1. Abstract
  2. Terminology and definitions
  3. Grid use models
  4. Site Authentication Requirements
    • (Terminology and definitions)
  5. Site Authorization Requirements
    • Unfinished list - make it better later! (Could also use some internal anchors.)

Abstract

To be written last

However, it should include a brief description of the grid models (e.g. Customer-Service; Technical Research User (agnostic of node); Technical Research User (node specific); Privacy/Conf./High security; etc.). The musts and shoulds will be different for each grid model.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [http://www.ietf.org/rfc/rfc2119.txt RFC2119] (Bradner 1997).

Terminology and definitions

To be written later on, when we know what we need.

Grid use models

Introduction

The use models of grids are written about elsewhere by this project. However, for clarity, they are reproduced here in summary form.

BLAH BLAH

(Chapter 1) Site Authentication Requirements

Terminology and definitions

"User secrets" refers to values intended to be known only by the user, known by the user and an authentication infrastructure, or known only to an authentication infrastructure and employed on the user's behalf after the user has authenticated with some other secret(s).

To sidestep such questions as whether "a day" means eight hours or 24 hours and just how long a month is, we will deal in seconds but not quibble over implementation variances at the 10% or 20% level.

Credentials are assumed to have lifetimes which bound their period of validity. "Long-lived" credentials have lifetimes of 1,000,000 seconds (1 megasecond or 1 Ms) or more. "Short-lived" credentials have lifetimes of 100,000 seconds (0.1 Ms) or less. Lifetimes between those limits are "intermediate." The terms long-lived and short-lived may also be applied to the secrets employed by a user to acquire credentials, although the only short-lived user secrets known to be commonly employed are one-time (or "single-use") authenticators.

(Conversions: 0.1 Ms is a bit more than a day; 1 Ms is a bit less than 2 weeks.)
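
The lifetime classes above can be expressed as a small classifier. This is only a minimal sketch: the function and constant names are illustrative assumptions, and the strict comparisons deliberately ignore the 10% to 20% tolerance mentioned earlier.

```python
# Thresholds taken from the definitions above, in seconds.
SHORT_LIVED_MAX = 100_000      # 0.1 Ms
LONG_LIVED_MIN = 1_000_000     # 1 Ms
SECONDS_PER_DAY = 86_400       # a 24-hour day


def classify_lifetime(lifetime_seconds: float) -> str:
    """Return 'short-lived', 'intermediate' or 'long-lived' (hypothetical helper)."""
    if lifetime_seconds <= SHORT_LIVED_MAX:
        return "short-lived"
    if lifetime_seconds >= LONG_LIVED_MIN:
        return "long-lived"
    return "intermediate"


def to_days(lifetime_seconds: float) -> float:
    """Convert a lifetime in seconds to 24-hour days."""
    return lifetime_seconds / SECONDS_PER_DAY
```

With these definitions, to_days(100_000) is roughly 1.16 days and to_days(1_000_000) is roughly 11.6 days, which agrees with the conversions given above.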

If a credential's lifetime can be extended by the user, using no more proof of identity than the credential itself, this is considered "renewal" of the credential, while if the process of extending the lifetime requires measures equivalent to those employed in its initial acquisition, we consider the result a new credential.

We specifically do not consider "post-dated" credentials -- those with lifetimes that begin at some point later than the time of the authentication act. Neither do we consider the relative strengths of cryptographic protocols, algorithms, and key lengths. We assume they are always designed, selected and implemented appropriately.

Identity

Sites will often make authorization decisions on an aggregate basis: on Virtual Organization (VO) membership or group membership. However, at times it may be necessary to set access rights at the granularity of a single user. Sites therefore may need to reserve the right, and preserve the ability, to set authorization at this level. Incident handling requires the ability to identify the legitimate owner of credentials presented during transactions under investigation. However, this may be done in concert with a trusted partner (i.e. the site could accept a pseudonym with good provenance but later need to know who/what is the actual owner of the credentials presented, and could co-operate with that partner to determine the identity of the end entity in question).

IMPORTANT REMOVAL FROM THE MULLEN et al DOC:
"Accordingly, every set of authentication credentials should be tied to the identity of an individual..."

Also minor changes in the next paragraph (shown in italics).

Accordingly, every set of authentication credentials should be traceable to the identity of an individual, because this provides stronger security by way of auditability, revocation, and problem determination. However, in very special cases there may be occasion to forfeit these benefits in order to provide temporary and generic identities.

For example, an Internet cafe could provide temporary (very limited lifetime) credentials authorizing use of grid resources based solely on the fact that access was purchased. Such an identity may be a pseudonym such as "Customer 24." However, please note that it may be better to achieve a solution where a customer-service provider (CSP) model [see above xxxx] is used in such a situation.

Where credentials are supplied pseudonymously by an identity provider (home organisation) to a grid node or nodes, a service level agreement must be in place between that identity provider and the grid community that will ensure the rapid revocation of those credentials when demanded. This may be invoked by a service provider (or grid node) that detects a security problem associated with those credentials. It may be advantageous that credentials may thus be issued and re-issued easily, and therefore the consequences to the user (or end entity) may be minimised following a false positive (i.e. mistaken) revocation.

May need to re-write the above as it is rather wordy and verbose and, erm, wordy.

Other, similar identity indirections are expected:

Secure anonymous communications may still be allowable, and appropriate, for functions that do not require user authentication.

For example, in the Internet cafe case above, the user may still require a secure conversation because the results derived from the data may have some proprietary value.

Assurance

An authentication system may provide multiple methods for a user to perform their initial authentication, and these methods may differ in their convenience, resistance to attack, and risks of exposure of secrets. Even when an implementation offers its users only one method, it may not be clear to relying parties which method it is. Since some inverse correlation does exist between convenience and strength of authentication, there may be inducements to allow and employ multiple levels of authentication if sites make some class of services available through weaker but less burdensome authentication methods.

  V Many changes (unless highlighted) to the Mullen et al doc exist in the following sections V 

In a deviation from the Mullen et al document, we suggest that there is certainly a greater variety of assurance levels possible, due to both the nature of the authentication tokens (and their storage/encryption) and the policies and procedures regarding their issue and timely revocation. Thus, a numbered scale of assurance level should exist and a value should be passed from the identity provider (home organisation) to the grid node or nodes for short term credentials, or should be kept permanently (implicitly or explicitly) within longer term credentials. This assurance 'grading' needs to be of a value and format recognised by an international standard or (more likely in the shorter term) be based upon a system agreed upon by collaborators within a particular grid.

The system for grading the assurance level of each authentication assertion is beyond the scope of this document. However, this assertion should be made and transmitted. If practicable, the method used to perform authentication should be deducible from credentials, but this requirement is secondary to the requirement of the transmission of assurance level.

Levels of authentication strength

We consider this to be beyond the scope of this document. However, we demur from Mullen et al. in that we believe that the concept of "authentication strength" is too close to the concept of "assurance level". The reliability or trustworthiness of an authentication event or authentication token is based equally upon the technology used (short/long term credentials, encryption, etc.) and the policies used to initially authenticate, maintain data, and renew and revoke credentials.

Mode of storage

We recognize the following modes of storage of users' long-term secrets (whether used directly to authenticate to a grid resource or to a proxy), each with its own set of vulnerabilities:

What you know

What you have

Ranking of storage modes

We agree broadly with the Mullen et al document, but believe that this is where home organisation (or "identity provider") security policies play a role in ranking the storage mode in a way that would affect a judgement of assurance level.

It is not possible to give a strict ranking of storage modes discussed in section 1.5.3.3 relative to safety without asking and answering a number of questions about the details of the secrets, their storage, and their registration as the users' authentication information. Also, users may perform unsafe actions (knowingly or unknowingly) which place their secrets at much greater risk of disclosure.

Deducible Authn strength

  ^ END of Many changes (unless highlighted) from the Mullen et al doc ^ 

There are a number of cases where processes running on a machine need to authenticate to other processes. Automated processes may have to act as authenticated clients and users may wish to have automatic software ("cron jobs") that require automatic authentication. All of these should be somehow restricted such that theft of credentials from an individual machine does not easily permit their reuse elsewhere. In either case, secrets will be of the "stored" class and must be considered to be stored in cleartext form, regardless of any measures which obfuscate them.

Lifetimes

All forms of digital credential in common use are subject to possible theft and misuse. The probability of such an event is monotonically nondecreasing with time. The countermeasures against eventual credential theft are expiration and revocation. Neither measure alone is sufficient to prevent all misuse, nor is the combination of the two.

Two types of digital credential should be highlighted here to take into account the roles and behaviour of proxy parties. In many use-cases it has been found to be necessary to generate proxy credentials. These may differ in digital nature and in assurance level from the original user digital credential. They are typically shorter lived than the original digital credentials, but there may be exceptions to this generalisation. Thus, we hereby draw out the two concepts of user credential and proxy credential.

V Much changed in the following sections from the Mullen et al document to accommodate user and proxy credentials and to make some of the timings more onerous/rigorous V 

Note: In the next paragraph, I changed "authority parties" (which was undefined, I believe) to "trusted members of the grid community". I also beefed this up a bit, as the Mullen et al. doc said that 3rd-party requests for revocation were not time bound. They really should be, and therefore I removed quite a bit of original text from "...under some circumstances" to the end of the paragraph. N.B. The Mullen et al. doc seemed to suggest that in some circumstances RAs should be able to demand an immediate revocation: in the DCOCE findings, we thought that these people should be able to issue revocations more freely than most other parties.

The lifetime of authentication secrets is a separate parameter from the lifetime of credentials.

N.B. The above is original text.  However, is this skewed in support of X.509 certs?  i.e. username/password combinations need to be considered as single objects.  (I'm not sure, but) I suggest changing the text to:
The lifetime of authentication secrets is a separate parameter from the lifetime of the digital credentials (even where the combination of secret and public credential is used in some way to form an authentication credential). 

All of the above does not talk about proxy authentication credentials (except for what we have added in the 'lifetime' section). In the Mullen et al doc., these are considered in the authorization section (i.e. later). However, we will need to include them here somehow (as well?, instead?) as they are, arguably, relevant to X.509 and, definitely, Shibboleth. Nevertheless, it may be worth debating whether proxy authN credentials exist at all (i.e. once an entity is authenticated, everything that follows is authZ???).

(Chapter 2) Site Authorization Requirements

Terminology

Terminology used in this document strives to be consistent with that used in the Authorization Frameworks working group REF? xxxx.

"User" is a synonym for end entity and for subject used in the more general framework document. We preserve the use of "user" since it is more widely used within the site operations community.

"Groups" refer to groups of end entities which are accorded equivalent rights for purposes of obtaining a particular set of privileges.

"Role" refers to the set of attributes an end entity is presenting with a particular request for obtaining or asserting a privilege.

"Provenance" refers to information about the history of a request or of any type of assertion. Examples include: the identity of the original requester; and the identity of the entity that is making the assertion.

Authorization Process

V We take an important and different view from Mullen et al, expressed between these marks V 

A VO must manage information regarding users that can be used for authorization decisions. This information may be made available to certain trusted third parties. Thus, a typical authorization process may have several steps (for example, but not necessarily in the following order, user authorization, VO authorization, site authorization, resource authorization) with various implementations.

Users and VO managers must be able to rely on consistent interpretation of their policies.

The Virtual Organization must be able to decide user membership policy and allow sites to set user authorization policy. However, it is likely that a degree of co-operation between the VO and the site will be desirable in setting the site's authorization policy in most cases.

^ We take an important and different view from Mullen et al, expressed between these marks ^ 

The authorization method must be application independent.

Mutual authorization may be required.

An application or end entity may need assurances that the resource is authorized to run a specific job. The distributed program or grid job in and of itself may be of value. The results may be of value and need protection from dubious resources.

A grid job may need to specify that it is only run on systems with security level B operating systems, or systems not directly connected to the Internet, or some other operations requirement. This is more relevant in the OGSA model where service factories may incorporate more resources to handle service request loads.

Maintain Provenance

The authorization mechanism must preserve the Subject Identity of the user who originated the request, except in transactions taking part within the Customer-Service model of operations: in this case, the grid service is trusted to have carried out the authorization check on the customer and to be acting on his/her/its behalf.

Provide for method of grouping users

It should be possible to assign a user to a group. The authorization of resource access can be managed by managing permissions of the group.
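
As a minimal sketch of this requirement, access can be decided from group permissions alone, so that managing the group manages its members. The group names, user names and permissions below are hypothetical and are not taken from the Mullen et al. document.

```python
# Hypothetical data: group membership and per-group permissions.
GROUP_MEMBERS = {
    "analysts": {"alice", "bob"},
    "operators": {"carol"},
}
GROUP_PERMISSIONS = {
    "analysts": {"read_dataset", "submit_job"},
    "operators": {"cancel_job"},
}


def is_permitted(user: str, action: str) -> bool:
    """Allow the action if any group containing the user grants it."""
    return any(
        user in members and action in GROUP_PERMISSIONS.get(group, set())
        for group, members in GROUP_MEMBERS.items()
    )
```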

Authorization Level Dependent on Authentication Strength

The authorization for access to a resource at a particular level may depend on the strength of the authentication. The level of authentication or, more likely, the level of assurance (agreed upon by the grid community, see [wiki:RequirementsDoc Section 5.3]), must be included with the credential information presented to all resource managers.
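
One way to picture this is a credential record that carries its assurance level to the resource manager, which then compares it with a per-level minimum. This is a sketch under stated assumptions: the field names, the numeric scale (higher taken to mean stronger) and the thresholds are all hypothetical, since the grading system itself is outside the scope of this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PresentedCredential:
    """Credential information as seen by a resource manager (illustrative)."""
    subject: str
    assurance_level: int                 # assumed scale: higher means stronger
    authn_method: Optional[str] = None   # optional, if deducible from the credential


# Hypothetical minimum assurance level required for each access level.
REQUIRED_ASSURANCE = {"read": 1, "write": 2, "admin": 3}


def access_allowed(cred: PresentedCredential, access_level: str) -> bool:
    """Grant access only if the credential meets the level's minimum assurance."""
    required = REQUIRED_ASSURANCE.get(access_level)
    if required is None:
        return False  # unknown access levels are denied by default
    return cred.assurance_level >= required
```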

Call-outs

Call-outs prior to access to resources may be provided as a form of authorization control by the virtual organization, the site(s) and each resource provider.

Revocation

There must be the ability to quickly revoke a particular remote authorized service that may be operated under dubious procedures. The timescale for this revocation should be of order 0.1 Ms.

For example, if a remote processing resource steals computation results, it should be removed from the directory of processing resources. This is difficult in the context of the current Grid technology because of the open resource registration process and aggressive discovery algorithms. Similar such directory services on the Internet have a history of exploitation.

Authorization Attributes

Attribute Authorities

In expected grid operations, authorization attributes are managed by authorization authority servers run by VOs, by sites or other authoritative entities. These authorization attributes may contain specific authorization privileges, indicating to sites that they should be authorized to act in a particular role, or may contain statements of membership in a particular group within the VO.

Numbers of Attributes

Users or end entities may have any number of roles within a given Virtual Organization. Whereas VOs may choose to structure themselves and express recommended authorization policy in an arbitrary form, resource providers need appropriate mechanisms to enforce that policy in the local authorization infrastructure. Therefore, the user attributes should be stored in a standard form, and the recommended policy should be expressed from the VO authorization authority server to the site in a manner agreed between the VO and the site.

Users or end entities may be members of any number of Virtual Organizations.

Currency of Membership

Relying parties must be able to validate assertions of membership in roles and groups within a VO. Validation of such assertions should not succeed more than 1 Ms after an authority removes the subject's membership.
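
One possible way to meet the 1 Ms bound, assuming the authority stops issuing assertions for a subject as soon as the membership is removed, is for relying parties to reject any assertion issued more than 1 Ms ago. The sketch below illustrates that check; the function name and parameters are hypothetical.

```python
MAX_ASSERTION_AGE = 1_000_000  # 1 Ms, in seconds


def assertion_is_current(issued_at: float, now: float,
                         max_age: float = MAX_ASSERTION_AGE) -> bool:
    """Accept an assertion only if it was issued no more than max_age seconds ago."""
    age = now - issued_at
    return 0 <= age <= max_age
```

Under that assumption, the last assertion issued before removal stops validating at most 1 Ms after the removal. Other designs, such as an online check against the authority, would also satisfy the requirement.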

Resource Administrators Authorize by Groups and Roles

VO attributes describing the roles and groups must follow a published standard, agreed upon at least within the domain of the VO. This consistency gives the Authorizer or Resource Administrator a manageable and trusted view of the membership pool. The administrator must be able to trust the currency of the roles and groups. This removes the need for the Authorizer to have an understanding of each member. The Authorizer needs only to understand the groups and roles within this assigned membership pool.

User Selection of roles

A user must be able to select and de-select VOs and roles for a specific access. This is analogous to the substitute user or 'su' command on UNIX systems, which allows an entity to change the current role briefly for a critical section before returning to a role and access privilege that is less vulnerable or potentially dangerous.

In addition, a user should be able to individually define the set of privileges to be used with a specific service request. This allows for least privilege access tailored to the requested service and increases system security.
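
The least-privilege selection described above might look like the following sketch, in which a user attaches only a chosen subset of the privileges they hold to a single request. The function name and the privilege strings used to exercise it are hypothetical.

```python
def select_privileges(held: set, requested: set) -> set:
    """Return the subset of held privileges to attach to a single request.

    Raises ValueError if the user asks for a privilege they do not hold.
    """
    missing = requested - held
    if missing:
        raise ValueError(f"privileges not held: {sorted(missing)}")
    return set(requested)
```

For example, a user holding reader and writer roles in one VO could submit a read-only request carrying just the reader role.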

Policies

Authorization decision criteria

The owner of a resource or data should be able to allow or deny the authorization of an end entity to carry out an action using any of the following criteria:

Precedence rules for applying authorization decision criteria must be clearly stated.

Source of authorization also a decision criterion

It may be desirable for a resource manager to be able to disable access based on the source of the authorization attributes presented in case of compromise of a particular remote attribute authority.
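
A minimal sketch of this criterion, assuming each presented attribute records the authority that issued it: the resource manager keeps a set of distrusted issuers and ignores attributes from them. The type and function names are illustrative only.

```python
from typing import Iterable, List, NamedTuple, Set


class Attribute(NamedTuple):
    issuer: str   # attribute authority that issued the attribute
    name: str     # for example, a role or group name


def usable_attributes(presented: Iterable[Attribute],
                      distrusted_issuers: Set[str]) -> List[Attribute]:
    """Drop attributes issued by authorities the resource manager has disabled."""
    return [a for a in presented if a.issuer not in distrusted_issuers]
```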

Combinations

The authorization method should allow any combination of the above authorization requirements, including any combination of VOs and roles (see [wiki:RequirementsDoc requirement 6.3.2]). Nevertheless, this is still a business decision to be taken between the resource owner and the VO/Attribute Authority.

Authorization may be based on Operation criteria

It should be possible to base authorization on any of the following, in addition to the authorization requirements of [wiki:RequirementsDoc section 6.4.1].

Granularity of Authorization

Depending on the application scenario, the granularity requirement for authorization decisions varies from fine grain (e.g. based on individual subject, requested action, privilege restrictions, and assets involved) to coarser-grained authorization on the basis of groups or even sites. Support for role based access control mechanisms is specifically requested for future collaborative environments but may also be desirable for other grid systems.

Collections

There should be no restrictions on the degree/level of granularity of authorization. In particular, no hard-coded limits to how the granularity is set should exist. This should include, for example, allowing authorization to a hierarchy of directories, individual directories, or individual files. It may become burdensome on the resource to support a high level of granularity; it is therefore left to the resource to set a practical level of granularity, collecting objects into manageable sets.

Catalog by user

It must be possible to determine the list of resources to which an end entity has access and what actions that entity is allowed to carry out as a member of the VO(s) and role(s) set for the current session. The burden of creating this list is on the end entity. It is left to the end entity to know or lookup or discover the resource and query for access permissions. This relieves the resource from having to know how to report to the end entities. This also averts a security vulnerability similar to the historical NIS (Network Information Services) hack in which the complete access lists being pushed to slave servers were intercepted and exploited. It is recommended that resources reveal access permissions only to the authenticated entities that hold these permissions and to administrative entities (see also next paragraph).
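
The recommendation that a resource reveal permissions only to their holder and to administrators can be sketched as a pull-style query. The access list, the administrator set and the function name are hypothetical, and the requester is assumed to have been authenticated already.

```python
# Hypothetical access-control list for one resource: subject -> allowed actions.
ACL = {"alice": {"read", "write"}, "bob": {"read"}}
ADMINISTRATORS = {"root-admin"}


def query_permissions(requester: str, subject: str) -> set:
    """Return the subject's permissions, but only to the subject or an administrator."""
    if requester != subject and requester not in ADMINISTRATORS:
        raise PermissionError(
            "permissions are revealed only to their holder or an administrator"
        )
    return set(ACL.get(subject, set()))
```

Because the end entity asks the resource, the resource never has to push complete access lists to other parties.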

Catalog by role

It must be possible to determine if a role or group has access to a resource. This access information is necessary to accurately stage and schedule jobs. This access information is sensitive because it could be used to exploit the Grid's security. For example, knowing that Bob has access to the targeted resource, an attacker's attention turns to Bob or his home computer.

Therefore, the following access levels are needed. A resource's access information must be accessible in its complete form to the administrator of that resource and to security personnel, for security audit and forensic purposes. An authenticated user may have information about all accesses he/she is allowed on that resource using the asserted identity and authorizations. Others must have access to authorization data only in the form

Authorization control points

Authorization Policy Change Control

Policy coherency tools needed

Authorization policies may change over time. Mechanisms to manage policy specification across the administrative domain of the resource, site, VO, application manager, and user should be provided.

Timely updates of policy needed

A time delay between publication of a policy change and implementation or enforcement is to be expected. There should be prompt implementation of policy change. The resource manager will implement the policy change and log compliance. The resource manager will define a prompt and reasonable time delay appropriate for the resource. Policy changes may require verification and validation before deployment.

Suspension of privileges should not delete policy

Sites and virtual organizations should have the ability to suspend resource authorization for a particular grid identity without actually deleting the authorization and therefore possibly losing tracking information.
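
A minimal sketch of suspension without deletion, assuming each authorization record carries a flag: the roles stay on record while the flag alone blocks access. The class and function names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Authorization:
    identity: str
    roles: set = field(default_factory=set)
    suspended: bool = False   # suspension flag; the record itself is kept


def suspend(auth: Authorization) -> None:
    """Block access without touching the recorded roles."""
    auth.suspended = True


def reinstate(auth: Authorization) -> None:
    """Lift the suspension; the original roles apply again."""
    auth.suspended = False


def is_authorized(auth: Authorization, role: str) -> bool:
    """A suspended identity is denied, even though its roles are still recorded."""
    return not auth.suspended and role in auth.roles
```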

Transparency

Directory of user's roles

VOs should provide a method providing membership and role/group information for a given user. Examples of this might include extended attributes within the user's proxy certificate and SAML attribute assertions containing agreed user attributes that are related to roles or privileges.

Transparency of Authorization information and policy

Certain groups or roles may require additional authorization before membership information is released (so as to not leak information about which accounts are privileged).

Protection of Authorization-informing attributes

Alterations of the information should only be possible through secure, authenticated access paths using procedures such that the sites are willing to trust the role / membership information returned. This requirement may involve a detailed description of how virtual organizations maintain and protect this data. (Similar, perhaps to a Certificate Policy / Certification Practices Statement for Certificate Authorities.)

Current proxy certificate specifications ensure that proxy and delegation operations never require private keys to be sent across the network. It is important to state clearly to developers that all future protocols must continue this practice. If it is necessary to send a passphrase or password across the network, they need to be encrypted at a strength equivalent to the strength of the key.

Dynamic Revocation of authorization

There is a dynamic nature to authorized access in that it may depend on the resource load, quality of service, or time of day. If authorization access changes during access, an error code should be propagated back to the application or the application should query for the authorization deny qualifier.

Standard Error Codes

Consistency and transparency for the application are aided by the use of standardized error codes for authorization denials. The error information should not provide more information than necessary, lest it create a security risk. An error return code may be accompanied by a log entry number to assist the resource administrator in locating the denial instance. For example, a user may call a helpdesk to report access problems, giving the error code and log entry number. The resource administrator can reference this log entry number to provide detailed information.
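
The pairing of an error code with a log entry number could be sketched as follows. The codes shown are purely illustrative (this document does not define a standard set), and the in-memory log stands in for whatever store a resource administrator would actually use.

```python
import itertools
from enum import Enum


class DenialCode(Enum):
    """Illustrative codes only; no standard set is defined in this document."""
    NOT_AUTHORIZED = 1
    POLICY_CHANGED = 2


_entry_numbers = itertools.count(1)
AUDIT_LOG = {}   # log entry number -> detailed reason (for administrators only)


def deny(code: DenialCode, detail: str) -> tuple:
    """Record the detailed reason locally and return only (code, log entry number)."""
    entry = next(_entry_numbers)
    AUDIT_LOG[entry] = detail
    return code, entry
```

The caller sees only the code and the entry number; the detailed reason stays in the log, where the administrator can look it up when the user quotes the number.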

Role Confirmation

Trust Model

It must be possible for the resource to confirm that a user has the VO membership(s) they claim. This is done through the trust model with the authority vetting the identity of the user. This is described in the "CA-based Trust Model for Grid Authentication and Identity Delegation" from the GGF Grid Certificate Policy Working Group.

Timeliness

It must be possible for the resource to confirm the user's claimed role(s) or group membership at the time access to a resource is requested. For example, in the Globus environment, resources assign these groups via the grid-mapfile.

Privacy

It must not be possible for unauthorized users to produce a list of members of a VO, or the list of VOs to which a user belongs. Authorized VO administrators may have access to the full list of members.

Operations

Logging

Logs documenting the resource access decisions, policies, policy changes, and resource implementation of policies should be kept. The virtual organization, site(s) and resource managers should log such events and retain these logs for 10Ms (approximately 4 months). The logs should be protected to ensure privacy and integrity.

Logs should be frequently archived on a machine different from the one on which they were generated.

When archived, the logs should be digitally signed by the archive server.
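
The retention and integrity requirements above can be illustrated with a short sketch. Note the substitution: the standard library offers no public-key signatures, so an HMAC-SHA256 tag under a key held by the archive server is used here as a stand-in for the digital signature; a real deployment would use an actual signature scheme. All names are hypothetical.

```python
import hashlib
import hmac

RETENTION_SECONDS = 10_000_000  # 10 Ms, as stated above


def may_discard(logged_at: float, now: float) -> bool:
    """A log may be discarded only once the retention period has fully elapsed."""
    return now - logged_at > RETENTION_SECONDS


def tag_archive(log_bytes: bytes, archive_key: bytes) -> str:
    """Compute an HMAC-SHA256 tag (a stand-in for the archive server's signature)."""
    return hmac.new(archive_key, log_bytes, hashlib.sha256).hexdigest()


def verify_archive(log_bytes: bytes, archive_key: bytes, tag: str) -> bool:
    """Check integrity using a constant-time comparison."""
    return hmac.compare_digest(tag_archive(log_bytes, archive_key), tag)
```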

Revocation

It should be possible for a resource owner to demand the removal of a user from a VO, or at least to demand the revocation of the role(s) or attributes through which the VO attribute authority effectively enables the user to gain access to the resource.

It must be possible for the authorized administrators to revoke a user's assertion of privileges by removing the user's ability to claim a given role, a number of roles, or other attributes issued by an authority.

Revocation Timeliness

Authorization revocation should be done in a time frame consistent with the authentication revocation of 0.1 Ms.

Fault Tolerance

Grids should gracefully survive partitioning so that local services can continue their operation in case a resource is disconnected or to avoid a DoS attack. This may require redundant or distributed Authorization Services.

Providing credentials to service

The authentication and authorization credentials that a user presents should be made available to the execution environment by something like a gatekeeper or job manager. In other words, the gatekeeper may have passed a request based on the presented credentials, but if this results in delegation of the request (e.g. running a job) the authentication/authorization credentials should be made available to the final execution environment via some standard mechanism.

Authorization for Replicated Data

Dependency on unreplicated authorization service.

If files are replicated, authorization for access to this replicated data should not depend on the availability of a single source of authorization. Simply put, the source site and the source site authorization server can go down without affecting access to the replicated data at other sites. Otherwise the service is not replicated.
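
A sketch of what this implies for a replica site: the authorization decision is requested from each of several replicated authorization services in turn, so that the loss of any single one does not block access. Modelling each service as a callable that raises ConnectionError when unreachable, and failing closed when none responds, are both assumptions of this example.

```python
from typing import Callable, Sequence


def authorize_with_failover(servers: Sequence[Callable[[str, str], bool]],
                            user: str, action: str) -> bool:
    """Ask each replicated authorization service in turn; skip any that is unreachable."""
    for ask in servers:
        try:
            return ask(user, action)
        except ConnectionError:
            continue  # this replica is down; try the next one
    return False      # no service reachable: fail closed (an assumption of this sketch)
```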

Consistent authorization on all replicas.

The authorization requirements on data access should be consistently applied for all replicas of the same data.

(Chapter 3) Site Accounting and Audit Requirements

Accounting and Audit Requirements Introduction

Accounting has historically had close ties to Authentication and Authorization because of the certainty with which they need to identify the entity to be associated with the accounting data. This is particularly important in the areas of security audits, intrusion detection, and computer and network forensics.

Accounting also has importance beyond accurate billing. IT management use accounting for controlling and managing operational costs. Accounting links to other IT disciplines such as capacity planning, service level management, and performance management.

Terminology

Grid Resource Accounting

Grid resource accounting is accounting in the more traditional sense: it accounts for resource usage and billing.

Grid Auditing

Grid Auditing is the focus on accounting as a security component, and the need for a seamless relationship between accounting, and the authentication and authorization components of the Grid. Simply put, with a small addition to existing accounting data, an audit mechanism could greatly enhance Grid security.

Monitoring

The term "monitoring" refers, in the accounting and audit context, to the recording of transaction data. It is synonymous with "logging" in this document and does not imply timely human oversight.

Requirements Gathering

Requirements Gathering for Grid Accounting

Requirements for Grid accounting focus on the relationship of monitoring and metering authentication and authorization for auditing security. This information binds an end entity to the resource for the time and duration of access. The consumers of this information are Grid administrators, helpdesk staff, intrusion detection, and computer forensics.

Requirements Gathering for Grid Resource Accounting

It is important to understand how the audit data will be used. This will help define the accounting data gathered and the data flow. It is the goal of this document to describe the requirements of Grid accounting and audit components which satisfy a broad range of instances and usage. This chapter will also identify other current Grid working groups and accounting standards that are addressing these needs.

Non-Goals

This chapter will consider the consumers of the accounting data and their requirements, but will not analyze the consumers or make recommendations on how consumers should process the accounting data. It is not the goal of this chapter to reproduce or reinvent past accounting standards or duplicate current Grid accounting work.

Grid Auditable Data

Grid auditing examines accounting requirements from a security perspective: audit logs, intrusion detection, and forensics. These requirements are not disjoint from those of mainstream accounting concerned with billing and metering, but in this section the requirements are described from the security perspective.

Grid Accounting must log the following data per resource access.

Unfinished....