Access controls

Introduction

Astronomical data is often subject to complex data rights policies. As a data platform designed to ingest and serve data from multiple experiments and groups, each potentially with different access policies, SkyPortal must be able to enforce arbitrary logic governing who can see, interact with, and modify data. SkyPortal enforces such policies using a custom row-level security (RLS) framework in the API layer.

RLS allows SkyPortal developers to define policies that restrict, with row- and user-level granularity, which rows of a table (e.g., photometry, spectra, groups, followup requests, data streams, etc.) a user can read, update, create, and delete. SkyPortal uses baselayer’s ORM-based framework for RLS within API transactions. With baselayer’s RLS framework, SkyPortal developers can

  • Be sure that access policies will be consistently enforced when protected records are accessed in an API transaction

  • Efficiently filter database queries by RLS accessibility to ensure that API endpoints only return records that users can access

  • Take advantage of vectorization to efficiently apply policy checks and filters to bulk record queries

  • Define arbitrarily complex, relational, and scalable row-level CRUD access policies on any SkyPortal table

  • Use predefined access patterns to restrict access to records by stream access, group membership, user ACLs, and more

  • Implement different policies for different types of access on a single type of record

Note: Although SkyPortal uses PostgreSQL as the database backend, SkyPortal does not use PostgreSQL’s RLS implementation for row-level security.

RLS Architecture and Design

In SkyPortal, RLS policies are defined on SQLAlchemy mapped classes. Each class has a single policy for each of the create, delete, update, and read access modes. Records are read- and create-public by default, and update- and delete-restricted (i.e., only accessible to users with the “System admin” ACL) by default. Join table classes are also update- and delete-restricted by default, but they can (by default) be created or read by any user that can read both their left and right records. These defaults can be overridden on any mapped class.

When an API handler brings an instance of a mapped class into the session, either by a database query or by creating a new instance within the handler and adding it to the session, the SQLalchemy ORM tracks changes to the instance’s state. At the end of a transaction, if the handler calls verify_and_commit, the RLS permission checker will introspect the database session and get the sets of records that were read, updated, deleted, or created during the current transaction. It will then iterate through each of these collections and check that every record was accessed in an allowable way. If it encounters any permissions violations during this process, it causes the handler to rollback the transaction and return an HTTP status code of 400 with a message that identifies the specific row where an access policy was violated. If no access policies are violated, then the transaction will finish successfully.

This design ensures that every object pulled into the handler’s session during an API transaction has its access policy checked in a consistent way, automatically preventing leaks of sensitive information to the end user. In most cases, the developer only needs to define a policy on a mapped class, then call verify_and_commit at the end of an API handler call to ensure that the policy is applied.

Create a New Policy

Policies are instances of UserAccessControl defined in baselayer/app/models.py. The only method of UserAccessControl that subclasses must provide is query_accessible_rows:

class UserAccessControl:

    def query_accessible_rows(self, cls, user_or_token, columns=None):
        """Construct a Query object that, when executed, returns the rows of a
        specified table that are accessible to a specified user or token.

        Parameters
        ----------
        cls: `baselayer.app.models.DeclarativeMeta`
            The mapped class of the target table.
        user_or_token: `baselayer.app.models.User` or `baselayer.app.models.Token`
            The User or Token to check.
        columns: list of sqlalchemy.Column, optional, default None
            The columns to retrieve from the target table. If None, queries
            the mapped class directly and returns mapped instances.

        Returns
        -------
        query: sqlalchemy.Query
            Query for the accessible rows.
        """

        raise NotImplementedError

query_accessible_rows constructs (but does not execute) a query against the table that cls is mapped to. The query returns the rows of the cls table that user_or_token could access.

To construct a new policy, subclass UserAccessControl and override query_accessible_rows.

Restricted

Baselayer provides a restricted policy that grants record access only to users or tokens that have the System admin ACL. Restricted records are not restricted to all users.

Examples

The following public policy will grant record access to any user:


class Public(UserAccessControl):
    """A record accessible to anyone."""

    def query_accessible_rows(self, cls, user_or_token, columns=None):
        if columns is not None:
            return DBSession().query(*columns).select_from(cls)
        return DBSession().query(cls)


public = Public()

The following accessible_to_owner policy will grant record access to any user that matches the User pointed to by the owner relationship of cls, with access automatically granted to users with the "System admin" ACL:

from baselayer.app.models import AccessibleIfUserMatches
accessible_to_owner = AccessibleIfUserMatches('owner')

Bitwise operations can be used to compose access policies. The accessible_to_owner_or_last_modified_by policy below grants access to any user that matches the User pointed to by the owner relationship of cls or the last_modified_by relationship of cls:

from baselayer.app.models import AccessibleIfUserMatches
accessible_to_owner_or_last_modified_by = AccessibleIfUserMatches('owner') | AccessibleIfUserMatches('last_modified_by')

Policies can be custom (potentially complex and relational) database queries. This policy allows adding a user to a group only if that user has access to all of the group’s streams:

from baselayer.app.env import CustomserAccessControl
GroupUser.create = (
    CustomUserAccessControl(
        # Can only add a user to a group if they have all the requisite
        # streams required for entry to the group
        lambda cls, user_or_token: DBSession()
        .query(cls)
        .join(Group)
        .outerjoin(Stream, Group.streams)
        .outerjoin(
            StreamUser,
            sa.and_(
                StreamUser.user_id == cls.user_id,
                StreamUser.stream_id == Stream.id,
            ),
        )
        .group_by(cls.id)
        .having(
            sa.or_(
                sa.func.bool_and(StreamUser.stream_id.isnot(None)),
                sa.func.bool_and(Stream.id.is_(None)),  # group has no streams
            )
        )
    )
)

Binding Policies to Classes

To bind a policy to a class, set the create, read, update, or delete attribute of the class to the policy, depending on which access mode you want the policy to apply to:

Spectrum.delete = accessible_by_owner

The above will ensure that only the owner of a spectrum (or anyone with the “System admin” ACL) can delete the spectrum.

Convenience Methods

Instances of Base provide convenience methods to (a) check if an instance of a mapped class is accessible to a specified user, (b) retrieve specific records of a mapped class (identified by a primary key), if they are accessible by a specified user, otherwise raise an AccessError (c) retrieve all records of a mapped class accessible to a specified user, and (d) generate a query object that returns records in a table that are accessible to a given user:

(a) Can the User user read the Spectrum spectrum?

spectrum.is_accessible_by(user, mode="read")

Can the User user update the Spectrum spectrum?

spectrum.is_accessible_by(user, mode='update')

(b) Retrieve spectra with IDs [1, 3] if they are updatable by the User user, raising an AccessError if either does not exist:

Spectrum.get_if_accessible_by([1, 3], user, raise_if_none=True, mode="update")

(c) Retrieve all photometry that User user can read

Photometry.get_records_accessible_by(user, mode="read")

(d) Construct a query object for all photometry that User user can delete with a signal-to-noise ratio of at least 10:

deletable_phot_query = Photometry.query_records_accessible_by(user, mode="delete")
filtered_deletable_query = deletable_phot_query.filter(Photometry.snr > 10)

Enforcing Permissions in Handlers

Calling self.verify_and_commit() within a SkyPortal API handler will trigger the RLS permission checker to introspect the current database session and verify that all of the records it currently is tracking are being accessed in an allowable way. If it encounters any permissions violations during this process, it causes the handler to rollback the transaction and return an HTTP status code of 400 with a message that identifies the specific row where an access policy was violated. If no access policies are violated, then the current database transaction will be committed.

If self.verify_and_commit() is not called in an API handler, the handler will not automatically check for permissions violations. The best practice is to call self.verify_and_commit() at the end of a handler’s transaction, immediately before self.success or self.error.

Further examples of policies:

Examples of permissions being checked in an API handler: