mRFC 0042: (Ratatosk) Workspaces
Concept
Introduction
Workspaces are a generalized combination of the concepts of SiteGroups and MultiLang. Applications can use workspaces to replace either or both, and core is simplified by supporting only the general idea of workspaces.
Workspaces form a tree. Each workspace has a local ID (not exposed to developers), a name that must be unique on its tree level, and optionally a parent. To set a workspace, the application calls midgard_connection::set_workspace('/path/to/my/workspace'). Workspaces can have record extensions and/or be extended via MgdSchema inheritance.
Objects closer to the root of the tree are inherited towards the leaves (read privileges apply). If an object is edited on a higher level (higher meaning towards the leaves), a copy is made; the copies are full objects in their own right, with separate metadata.
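The inheritance and copy-on-edit behaviour described above can be sketched with a small in-memory model. This is only an illustration: the Store class, the ancestors() helper and the dictionary layout are hypothetical and not part of any Midgard API, read privileges are left out, and the first-level workspace is assumed to be the top of the tree.

```python
# Hypothetical in-memory model; none of these names are Midgard API.

def ancestors(path):
    """Yield the workspace path itself, then each parent up to the first level."""
    parts = [p for p in path.split("/") if p]
    for depth in range(len(parts), 0, -1):
        yield "/" + "/".join(parts[:depth])


class Store:
    def __init__(self):
        # (workspace path, object identifier) -> dict of properties
        self.instances = {}

    def read(self, workspace, identifier):
        """Return (owning workspace, data) for the nearest instance, or None."""
        for ws in ancestors(workspace):
            if (ws, identifier) in self.instances:
                return ws, self.instances[(ws, identifier)]
        return None

    def edit(self, workspace, identifier, **changes):
        """Edit in `workspace`, copying the inherited instance first if needed."""
        found = self.read(workspace, identifier)
        if found is None:
            raise KeyError(identifier)
        owner, data = found
        if owner != workspace:
            # Full copy; the instance in the parent workspace stays untouched.
            data = dict(data)
            self.instances[(workspace, identifier)] = data
        data.update(changes)
        return data


store = Store()
store.instances[("/foo", "X")] = {"title": "original"}
inherited = store.read("/foo/bar", "X")        # seen through inheritance
store.edit("/foo/bar", "X", title="edited in bar")  # creates a copy in /foo/bar
```

After the edit, '/foo/bar' has its own instance while '/foo' still holds the original data.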
Replication is workspace agnostic.
Identifiers
Objects will always have identifiers:
- Local DB id; this is process/thread transient and will be abstracted on the API level as an opaque midgard_reference object
- Global Midgard identifier, like the classic GUID
- UUID; this is universally unique for the specific object "instance" in a specific workspace.
Instantiating objects
Objects can be instantiated via a midgard_reference object in the following ways:
- Pass the reference as argument to object class constructor
- Pass the reference as argument to midgard_object::get()
- Call the get() method on the reference (<- PONDER: is this useful or necessary?)
Objects can be instantiated with their identifier in the following ways:
- Pass the identifier as argument to midgard_object::get()
- Pass the identifier as argument to object class constructor
In both cases core checks whether the object has an instance in the current workspace; if not, it checks the parent, and so on until it reaches the root. If at any level it encounters an object instance for which the read privilege is denied, it stops checking and throws an "access denied" exception. This means that if we have object instances on levels 1 and 2, and level 1 has read allowed while level 2 does not, then on any level above 1 (towards the leaves) the user will not see the object.
Finally, a specific instance of an object can be instantiated via its UUID in the following ways:
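The lookup rule above can be sketched as a walk from the current workspace towards the root that stops at the first instance it finds. The resolve() function, the AccessDenied exception and the "readable" flag are hypothetical names chosen for this illustration; how core actually stores privileges is not specified here.

```python
# Hypothetical sketch of identifier resolution; not Midgard API.

class AccessDenied(Exception):
    """Raised when the nearest instance exists but may not be read."""


def ancestors(path):
    """Yield the workspace path itself, then each parent up to the first level."""
    parts = [p for p in path.split("/") if p]
    for depth in range(len(parts), 0, -1):
        yield "/" + "/".join(parts[:depth])


def resolve(instances, current_workspace, identifier):
    """Return (workspace, data) for the nearest instance, or None if none exists.

    `instances` maps (workspace, identifier) to a dict with the assumed keys
    "readable" (bool) and "data".
    """
    for ws in ancestors(current_workspace):
        inst = instances.get((ws, identifier))
        if inst is None:
            continue  # nothing on this level, try the parent
        if not inst["readable"]:
            # Stop here: do not fall through to a readable parent instance.
            raise AccessDenied(f"{identifier} in {ws}")
        return ws, inst["data"]
    return None


# Level 1 is readable, level 2 is not.
instances = {
    ("/a", "X"): {"readable": True, "data": "level 1"},
    ("/a/b", "X"): {"readable": False, "data": "level 2"},
}
```

With this data, resolving "X" from '/a' returns the level 1 instance, while resolving it from '/a/b' or '/a/b/c' raises AccessDenied even though the level 1 instance is readable.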
- Create new midgard_reference object by passing the UUID to constructor and instantiate as via reference
- Pass the UUID as argument to midgard_object::get()
- Pass the UUID as argument to object class constructor
Cases 2 and 3 basically do case 1 transparently. This way of instantiating also respects the workspace tree: if we have UUID '113' in workspace '/foo' and our current workspace is '/baz', we cannot see that the object exists. If our workspace were '/foo/bar/baz', we would get the object from workspace '/foo' even if an object with the same identifier exists in '/foo/bar/baz'.
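The UUID rule can be sketched as a visibility check: an instance is returned only if its workspace is the current workspace or one of its ancestors. The get_by_uuid() function and the index layout are hypothetical, chosen for illustration only.

```python
# Hypothetical sketch of UUID lookup; not Midgard API.

def is_same_or_ancestor(candidate, workspace):
    """True if `candidate` is `workspace` itself or lies on its path to the root."""
    return workspace == candidate or workspace.startswith(candidate + "/")


def get_by_uuid(uuid_index, current_workspace, uuid):
    """Return (workspace, data) for the instance, or None if it is not visible.

    `uuid_index` maps a UUID to an assumed (workspace, data) tuple.
    """
    entry = uuid_index.get(uuid)
    if entry is None:
        return None
    workspace, data = entry
    if is_same_or_ancestor(workspace, current_workspace):
        return workspace, data
    return None  # exists, but outside the current workspace's ancestry


# Two instances of the same object identifier, each with its own UUID.
uuid_index = {
    "113": ("/foo", {"identifier": "X", "title": "foo version"}),
    "114": ("/foo/bar/baz", {"identifier": "X", "title": "baz version"}),
}
```

Asking for UUID '113' from '/foo/bar/baz' yields the '/foo' instance, not the local one with the same identifier, and asking from '/baz' yields nothing.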
Replication
Replication is workspace agnostic; however, it is forbidden to replicate between workspaces in the same database, because UUIDs must not clash.
Serialize/export
Replication works on object "instances", i.e. serializing an object instance from workspace '/foo/bar' will not also include the instance from workspace '/foo'.
No workspace info is passed in the serialized data.
Unserialize/import
For UUIDs that do not yet exist in the database this is simple: importing creates a new copy in the current connection workspace.
If the connection workspace is '/foo/bar/baz' and the UUID already exists on the '/foo' level, then the '/foo' level instance is updated on import.
If the connection workspace is '/foo/bar/baz' and the UUID already exists in the '/bar' tree, then an error is thrown.
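The three import cases above can be sketched as a single decision. The import_instance() function, the WorkspaceConflict exception and the index layout are hypothetical names for this illustration; the actual error type thrown by core is not specified here.

```python
# Hypothetical sketch of the import rules; not Midgard API.

class WorkspaceConflict(Exception):
    """Raised when the UUID exists outside the current workspace's ancestry."""


def is_same_or_ancestor(candidate, workspace):
    """True if `candidate` is `workspace` itself or lies on its path to the root."""
    return workspace == candidate or workspace.startswith(candidate + "/")


def import_instance(uuid_index, current_workspace, uuid, data):
    """Import serialized `data`; return the workspace that was written to.

    `uuid_index` maps a UUID to a dict with assumed keys "workspace" and "data".
    """
    existing = uuid_index.get(uuid)
    if existing is None:
        # Unknown UUID: create a new copy in the connection workspace.
        uuid_index[uuid] = {"workspace": current_workspace, "data": dict(data)}
        return current_workspace
    if is_same_or_ancestor(existing["workspace"], current_workspace):
        # Known UUID on our path to the root: update that instance in place.
        existing["data"] = dict(data)
        return existing["workspace"]
    raise WorkspaceConflict(f"UUID {uuid} exists in {existing['workspace']}")


uuid_index = {
    "u-foo": {"workspace": "/foo", "data": {"title": "old"}},
    "u-bar": {"workspace": "/bar", "data": {"title": "elsewhere"}},
}
```

Importing "u-foo" from '/foo/bar/baz' updates the '/foo' instance, importing an unknown UUID creates it in '/foo/bar/baz', and importing "u-bar" raises WorkspaceConflict.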
Use cases
With well-thought-out conventions, and attention to stacking when implementing them, all of the following could be stacked on top of each other, so you could have multiple companies with multilingual staging/live sites.
Basically it boils down to what the implementation considers "root" workspace.
Sitegroup emulation
For hosting multiple sites of different companies with different needs and data structures, using separate databases is probably more efficient, or at the very least logically cleaner.
However, for application hosting where all customers use mostly the same MgdSchemas, workspaces can be used to neatly separate and share data. For example, there might be a workspace tree like the following:
- Huge enterprise
  - Main company
  - Another company
  - And another
- Company not related to huge enterprise
Here the huge enterprise and the unrelated company are unaware of each other. Data shared across the whole enterprise can be created in the first-level workspace.
For security we should probably have a second argument to set_workspace() which forbids changing the workspace to above or outside the given workspace. If we call set_workspace('/my_company', true), then a later set_workspace('/another_company') should fail but set_workspace('/my_company/fi') should not (provided the workspace exists...). Similarly, if set_workspace('/my_enterprise/my_company', true) has been called, a call to set_workspace('/my_enterprise') should fail.
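The proposed locking argument can be sketched as follows. The Connection class here is a hypothetical stand-in for midgard_connection, and two details are assumptions of this sketch rather than part of the proposal: a later locking call inside the locked subtree narrows the lock further, and the check that the workspace exists is omitted.

```python
# Hypothetical stand-in for midgard_connection; only the lock rule is modelled.

def is_within(path, root):
    """True if `path` is `root` itself or lies below it in the tree."""
    return path == root or path.startswith(root + "/")


class Connection:
    def __init__(self):
        self.workspace = None
        self.locked_root = None  # set once set_workspace(..., lock=True) is used

    def set_workspace(self, path, lock=False):
        if self.locked_root is not None and not is_within(path, self.locked_root):
            raise PermissionError(
                f"{path} is above or outside locked workspace {self.locked_root}"
            )
        self.workspace = path
        if lock:
            # Assumption: locking again inside the subtree narrows the lock.
            self.locked_root = path


conn = Connection()
conn.set_workspace("/my_company", True)
conn.set_workspace("/my_company/fi")  # allowed: below the locked workspace
```

After the locking call, set_workspace('/another_company') raises PermissionError, while any path under '/my_company' is still accepted.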
Multilang emulation
A CMS application could handle multilingual content by defining that, by convention, it will look for a workspace named according to locale naming conventions, for example 'fi_FI' for Finnish localized content. Core would provide the "fallback language" if the workspaces are organized in the order '/fallback/preferred_language'. This could be a longer path as well, for example '/en_US/fi_FI/fi_SV', so that we prefer Finnish-Swedish but fall back to Finnish if that is unavailable, and finally fall back to English.
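Under this convention the language preference order is simply the workspace path read from the leaf towards the root. A minimal sketch, where language_preference() and localized() are hypothetical helper names and translations are modelled as a plain dictionary keyed by locale:

```python
# Hypothetical helpers illustrating the '/fallback/preferred_language' convention.

def language_preference(workspace_path):
    """Return locales from most preferred (leaf) to final fallback (first level)."""
    return [part for part in workspace_path.split("/") if part][::-1]


def localized(translations, workspace_path):
    """Return (locale, text) for the most preferred available locale, or None."""
    for locale in language_preference(workspace_path):
        if locale in translations:
            return locale, translations[locale]
    return None


path = "/en_US/fi_FI/fi_SV"
translations = {"en_US": "Welcome", "fi_FI": "Tervetuloa"}
choice = localized(translations, path)  # no fi_SV content, so fi_FI is used
```

In core the same effect would come from the ordinary workspace inheritance; the helper only makes the resulting order explicit.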
User interface locale info could be stored as record extension(s) to the workspace object.
Staging/Live
One could have the live site use workspace '/My-site' and the staging site use workspace '/My-site/staging', and on publish to live just overwrite the '/My-site' instance of the object with data from the staging instance.
Deletes, instead of being applied to live at the same time as they are made in staging (as with the current replication-based system), could wait for, for example, re-approval of the parent.
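The publish step can be sketched as copying the data of the staging instance over the live instance. The publish() function and the instance layout are hypothetical; the sketch also assumes that each instance keeps its own metadata, in line with copies being full objects with separate metadata.

```python
# Hypothetical sketch of publishing from staging to live; not Midgard API.

def publish(instances, live_ws, staging_ws, identifier):
    """Overwrite the live instance's data with the staging instance's data.

    `instances` maps (workspace, identifier) to a dict with the assumed keys
    "data" and "metadata". Metadata of the live instance is left as it is.
    """
    staged = instances[(staging_ws, identifier)]
    live = instances.get((live_ws, identifier))
    if live is None:
        # Object was created in staging only: create the live instance.
        live = {"data": {}, "metadata": {}}
        instances[(live_ws, identifier)] = live
    live["data"] = dict(staged["data"])
    return live


instances = {
    ("/My-site", "article"): {"data": {"title": "Old"}, "metadata": {"rev": 3}},
    ("/My-site/staging", "article"): {"data": {"title": "New"}, "metadata": {"rev": 7}},
}
publish(instances, "/My-site", "/My-site/staging", "article")
```

After the call the live instance carries the staged title while the staging instance remains in place for further editing.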
Corner cases
Copying
How to copy objects
Do we use the update() method to create a copy in the current workspace, or do we use a separate method to create the copy? The con of the update() approach is that it is "magick"; the advantage is that component developers do not need special handlers.
Copying to parent workspace
What happens if we have an object with identifier X in '/foo/bar' but not in '/foo', and we want to copy it to '/foo'? In other words, how do we achieve this?
If we had a separate copy method, giving the path would be the obvious solution. With the update() approach I suppose there is a need to change workspace (and if changing to a workspace above the current one is forbidden, the separate copy method should fail too).
Attachments and record extensions
How should these be handled by core when an object is copied from one workspace to another?
Idea 1: Record extensions copied on copy, attachments not.
Staging/live and/or Multilang applications/helpers handle copying/fallback for attachments.
Idea 2: Only object copied
This is simpler for core and is less "magick", but makes the required helpers more complicated. Also, it can be argued that the record extensions are more a part of the object than the attachments are.
