Federation and Replication Services

Mediaflux Federation services enable multiple Mediaflux servers in different locations to act in concert, so that a single command can execute a service on any or all of them in parallel. Unlike clustering, where multiple Mediaflux servers in a single location are bound together to parallelise local operations, Federation enables distributed operations across multiple sites, whether across a campus or around the world.

Mediaflux Replication services support the sharing of data between multiple Mediaflux servers, whether the servers are separated by a few metres or located on different continents. Typically a Mediaflux server is placed wherever important data resides, and replication services are used to provide convenient access to that data throughout an organisation.

Federation and replication services support:

  • the federation of any Mediaflux service, including queries, in order to distribute and localise computation
  • the replication of data throughout the network as required to localise access for improved performance or to implement data protection policy.

Peers

A Mediaflux server can be made aware of another Mediaflux server by adding a peer server. Using the peer concept, Mediaflux servers can be configured into a network of connected peers to reflect the data topography of an organisation.

Distributed Query

Any Mediaflux service may be federated to peers. A primary example is a distributed query, which executes on the origin server and also propagates to that server’s peers, which in turn propagate it to their own peers. Query results are then routed back to the origin and merged.
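The propagate-and-merge behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not Mediaflux code: the Server class, the distributed_query function and the server names are hypothetical, the traversal is sequential rather than parallel, and the visited set that stops a query looping around cyclic peer links is an assumption about how such loops might be avoided.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for a Mediaflux server; not a Mediaflux API."""
    name: str
    assets: list[str]
    peers: list["Server"] = field(default_factory=list)


def distributed_query(origin: Server, predicate) -> list[tuple[str, str]]:
    """Run predicate on the origin and, transitively, on its peers.

    Returns (server name, asset) pairs merged into a single list.
    """
    visited: set[str] = set()  # assumption: guards against cyclic peer links
    merged: list[tuple[str, str]] = []

    def visit(server: Server) -> None:
        if server.name in visited:
            return
        visited.add(server.name)
        # Execute the query locally on this server.
        merged.extend((server.name, a) for a in server.assets if predicate(a))
        # Propagate to this server's peers, which propagate to their own.
        for peer in server.peers:
            visit(peer)

    visit(origin)
    return merged


sydney = Server("sydney", ["scan-001.dcm", "notes.txt"])
london = Server("london", ["scan-002.dcm"])
tokyo = Server("tokyo", ["scan-003.dcm", "readme.md"])
sydney.peers.append(london)
london.peers.extend([tokyo, sydney])  # includes a link back to the origin

results = distributed_query(sydney, lambda a: a.endswith(".dcm"))
print(results)
```

Each result carries the name of the server that produced it, so the merged list at the origin still shows where every match lives.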

Figure 1 - Mediaflux Federation

Trust

Data is always owned by the Mediaflux server storing it. There is no need for a central server to arbitrate data access, an arrangement that could create uncertainty over data ownership.

When a service is executed on a remote server, the user’s authentication context is prefixed with the unique identity of the calling server. This allows remote access rights to metadata and data to be configured in detail for each calling server.
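As a rough illustration of why the prefix is useful, the sketch below builds a prefixed identity and checks it against per-server rules. Everything in it is hypothetical: the "server:domain:user" layout, the server identifiers and the ALLOWED_SERVERS table are assumptions made for the example and do not describe the actual Mediaflux format or access control configuration.

```python
def remote_identity(calling_server_id: str, domain: str, user: str) -> str:
    """Build the identity a remote server sees for a federated call.

    The "server:domain:user" layout is an illustrative assumption.
    """
    return f"{calling_server_id}:{domain}:{user}"


# Hypothetical access rules held by the remote server, keyed by calling server.
ALLOWED_SERVERS = {"1128": {"read"}, "2045": {"read", "write"}}


def is_permitted(identity: str, action: str) -> bool:
    """Check an action against the rules for the calling server prefix."""
    server_id, _, _ = identity.partition(":")
    return action in ALLOWED_SERVERS.get(server_id, set())


identity = remote_identity("1128", "research", "jsmith")
print(identity)                         # 1128:research:jsmith
print(is_permitted(identity, "read"))   # True
print(is_permitted(identity, "write"))  # False
```

Because the calling server is part of the identity, the same user name arriving from two different servers can be granted different rights.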

If the local and remote servers share an authentication domain, a trust relationship may be established so that users automatically authenticate to the same domain on both.

Intelligent Replication to Remote Locations

Replication within Mediaflux is a regularly scheduled event in which data selected by a query is packaged, transmitted to a peer and restored there. Restoration takes the peer’s existing metadata and data into account and merges in only the versions that are required. Replication may also be “by reference”, where metadata is replicated according to the schedule and the data itself is replicated on demand when it is requested.
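The version-aware restore step can be pictured with a small sketch. It assumes, purely for illustration, that a store maps an asset identifier to a dictionary of numbered versions; the restore function and this layout are hypothetical and say nothing about how Mediaflux packages or stores assets internally.

```python
def restore(peer_store: dict[str, dict[int, bytes]],
            package: dict[str, dict[int, bytes]]) -> int:
    """Merge a replication package into a peer's store.

    Versions the peer already holds are skipped; the return value is the
    number of versions actually added.
    """
    added = 0
    for asset_id, versions in package.items():
        existing = peer_store.setdefault(asset_id, {})
        for version, content in versions.items():
            if version not in existing:  # only merge what is missing
                existing[version] = content
                added += 1
    return added


peer = {"asset-1": {1: b"v1"}}
package = {"asset-1": {1: b"v1", 2: b"v2"}, "asset-2": {1: b"new"}}

added = restore(peer, package)
print(added)         # 2
print(sorted(peer))  # ['asset-1', 'asset-2']
```

Restoring the same package a second time adds nothing, which is the property that makes a regularly repeated schedule safe.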

A replication policy of "replicate everything" can serve as the basis for disaster recovery.

Replica data is marked as such, and its source is shown whenever it is displayed. Replication also tracks the server time of the last successful replication event, so that the next event transmits only the updates made after that point.
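The incremental behaviour can be sketched as follows. The Asset and ReplicationJob classes, the modified field and the send callback are hypothetical names chosen for the example; the one point taken from the text is that the timestamp advances only after a successful event, so a failed run is retried in full the next time.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    asset_id: str
    modified: float  # server time of the last change (hypothetical field)


class ReplicationJob:
    """Tracks the time of the last successful replication event."""

    def __init__(self) -> None:
        self.last_success = 0.0

    def run(self, assets, now, send):
        # Select only assets changed since the last successful event.
        batch = [a for a in assets if a.modified > self.last_success]
        send(batch)              # an exception here leaves last_success as is
        self.last_success = now  # advance only after a successful send
        return batch


def failing_send(batch):
    raise ConnectionError("peer unreachable")


assets = [Asset("a", 10.0), Asset("b", 20.0)]
job = ReplicationJob()
sent = []

first = job.run(assets, now=25.0, send=sent.extend)   # sends a and b
assets.append(Asset("c", 30.0))
try:
    job.run(assets, now=35.0, send=failing_send)      # fails, nothing recorded
except ConnectionError:
    pass
second = job.run(assets, now=40.0, send=sent.extend)  # sends c only

print([a.asset_id for a in first], [a.asset_id for a in second])
```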

Figure 2 - Mediaflux Replication

Data integrity is verified with CRC32 checksums, and data may be encrypted for transmission.
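A CRC32 integrity check of this general kind can be shown with Python's standard zlib module. The checksum and verify helpers are illustrative names; whether Mediaflux computes checksums per asset, per package or at some other granularity is not stated here, and encryption is not shown.

```python
import zlib


def checksum(data: bytes) -> int:
    """Return the CRC32 of data as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def verify(data: bytes, expected: int) -> bool:
    """Recompute the checksum on receipt and compare it with the sender's."""
    return checksum(data) == expected


payload = b"replicated asset content"
sent_crc = checksum(payload)

corrupted = b"replicated asset c0ntent"  # one byte altered in transit
print(verify(payload, sent_crc))    # True
print(verify(corrupted, sent_crc))  # False
```

CRC32 is designed to catch accidental corruption; it is not a cryptographic hash and does not by itself protect against deliberate tampering.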

See also Mediaflux Cluster Edition.