Astronomical data is often subject to complex data rights policies. As a data platform designed to ingest and serve data from multiple experiments and groups, each potentially with different access policies, SkyPortal must be able to enforce arbitrary logic governing who can see, interact with, and modify data. SkyPortal enforces such policies using a custom row-level security (RLS) framework in the API layer.
RLS allows SkyPortal developers to define policies that restrict, with row- and user-level granularity, which rows of a table (e.g., photometry, spectra, groups, followup requests, data streams, etc.) a user can read, update, create, and delete. SkyPortal uses baselayer’s ORM-based framework for RLS within API transactions. With baselayer’s RLS framework, SkyPortal developers can
Be sure that access policies will be consistently enforced when protected records are accessed in an API transaction
Efficiently filter database queries by RLS accessibility to ensure that API endpoints only return records that users can access
Take advantage of vectorization to efficiently apply policy checks and filters to bulk record queries
Define arbitrarily complex, relational, and scalable row-level CRUD access policies on any SkyPortal table
Use predefined access patterns to restrict access to records by stream access, group membership, user ACLs, and more
Implement different policies for different types of access on a single type of record
Note: Although SkyPortal uses PostgreSQL as the database backend, SkyPortal does not use PostgreSQL’s RLS implementation for row-level security.
RLS Architecture and Design¶
In SkyPortal, RLS policies are defined on SQLAlchemy mapped classes. Each class has a single policy for each of the create, delete, update, and read access modes. Records are read- and create-public by default, and update- and delete-restricted (i.e., only accessible to users with the “System admin” ACL) by default. Join table classes are also update- and delete-restricted by default, but they can (by default) be created or read by any user that can read both their left and right records. These defaults can be overridden on any mapped class.
When an API handler brings an instance of a mapped class into the session, either by a database query or by creating a new instance within the handler and adding it to the session, the SQLalchemy ORM tracks changes to the instance’s state. At the end of a transaction, if the handler calls
verify_and_commit, the RLS permission checker will introspect the database session and get the sets of records that were read, updated, deleted, or created during the current transaction. It will then iterate through each of these collections and check that every record was accessed in an allowable way. If it encounters any permissions violations during this process, it causes the handler to rollback the transaction and return an HTTP status code of 400 with a message that identifies the specific row where an access policy was violated. If no access policies are violated, then the transaction will finish successfully.
This design ensures that every object pulled into the handler’s session during an API transaction has its access policy checked in a consistent way, automatically preventing leaks of sensitive information to the end user. In most cases, the developer only needs to define a policy on a mapped class, then call
verify_and_commit at the end of an API handler call to ensure that the policy is applied.
Create a New Policy¶
Policies are instances of
UserAccessControl defined in baselayer/app/models.py. The only method of
UserAccessControl that subclasses must provide is
class UserAccessControl: def query_accessible_rows(self, cls, user_or_token, columns=None): """Construct a Query object that, when executed, returns the rows of a specified table that are accessible to a specified user or token. Parameters ---------- cls: `baselayer.app.models.DeclarativeMeta` The mapped class of the target table. user_or_token: `baselayer.app.models.User` or `baselayer.app.models.Token` The User or Token to check. columns: list of sqlalchemy.Column, optional, default None The columns to retrieve from the target table. If None, queries the mapped class directly and returns mapped instances. Returns ------- query: sqlalchemy.Query Query for the accessible rows. """ raise NotImplementedError
query_accessible_rows constructs (but does not execute) a query against the table that
cls is mapped to. The query returns the rows of the
cls table that
user_or_token could access.
To construct a new policy, subclass
UserAccessControl and override
Baselayer provides a restricted policy that grants record access only to users or tokens that have the
System admin ACL. Restricted records are not restricted to all users.
public policy will grant record access to any user:
class Public(UserAccessControl): """A record accessible to anyone.""" def query_accessible_rows(self, cls, user_or_token, columns=None): if columns is not None: return DBSession().query(*columns).select_from(cls) return DBSession().query(cls) public = Public()
accessible_to_owner policy will grant record access to any user that matches the
User pointed to by the
owner relationship of
cls, with access automatically granted to users with the
"System admin" ACL:
from baselayer.app.models import AccessibleIfUserMatches accessible_to_owner = AccessibleIfUserMatches('owner')
Bitwise operations can be used to compose access policies. The
accessible_to_owner_or_last_modified_by policy below grants access to any user that matches the
User pointed to by the
owner relationship of
cls or the
last_modified_by relationship of
from baselayer.app.models import AccessibleIfUserMatches accessible_to_owner_or_last_modified_by = AccessibleIfUserMatches('owner') | AccessibleIfUserMatches('last_modified_by')
Policies can be custom (potentially complex and relational) database queries. This policy allows adding a user to a group only if that user has access to all of the group’s streams:
from baselayer.app.env import CustomserAccessControl GroupUser.create = ( CustomUserAccessControl( # Can only add a user to a group if they have all the requisite # streams required for entry to the group lambda cls, user_or_token: DBSession() .query(cls) .join(Group) .outerjoin(Stream, Group.streams) .outerjoin( StreamUser, sa.and_( StreamUser.user_id == cls.user_id, StreamUser.stream_id == Stream.id, ), ) .group_by(cls.id) .having( sa.or_( sa.func.bool_and(StreamUser.stream_id.isnot(None)), sa.func.bool_and(Stream.id.is_(None)), # group has no streams ) ) ) )
Binding Policies to Classes¶
To bind a policy to a class, set the
delete attribute of the class to the policy, depending on which access mode you want the policy to apply to:
Spectrum.delete = accessible_by_owner
The above will ensure that only the owner of a spectrum (or anyone with the “System admin” ACL) can delete the spectrum.
Base provide convenience methods to (a) check if an instance of a mapped class is accessible to a specified user, (b) retrieve specific records of a mapped class (identified by a primary key), if they are accessible by a specified user, otherwise raise an AccessError (c) retrieve all records of a mapped class accessible to a specified user, and (d) generate a query object that returns records in a table that are accessible to a given user:
(a) Can the
user read the
user update the
(b) Retrieve spectra with IDs [1, 3] if they are updatable by the
user, raising an AccessError if either does not exist:
Spectrum.get_if_accessible_by([1, 3], user, raise_if_none=True, mode="update")
(c) Retrieve all photometry that
user can read
(d) Construct a query object for all photometry that
user can delete
with a signal-to-noise ratio of at least 10:
deletable_phot_query = Photometry.query_records_accessible_by(user, mode="delete") filtered_deletable_query = deletable_phot_query.filter(Photometry.snr > 10)
Enforcing Permissions in Handlers¶
self.verify_and_commit() within a SkyPortal API handler will trigger the RLS permission checker to introspect the current database session and verify that all of the records it currently is tracking are being accessed in an allowable way. If it encounters any permissions violations during this process, it causes the handler to rollback the transaction and return an HTTP status code of 400 with a message that identifies the specific row where an access policy was violated. If no access policies are violated, then the current database transaction will be committed.
self.verify_and_commit() is not called in an API handler, the handler will not automatically check for permissions violations. The best practice is to call
self.verify_and_commit() at the end of a handler’s transaction, immediately before