📃 Overview

🔰 Introduction

A local installation of Ellipsis Drive (ED) allows you to run ED in your own (virtual) environment. You can launch ED as a completely stand-alone solution, or launch it on top of your existing file storage, whether that is S3, a classic file server, or another form of cloud storage.

ED can be deployed in a geographically distributed environment. That is to say, your ED deployment can perform well even if users are distributed around the world.

🔐 Security configuration

ED has various settings that can be configured to improve security. The following list gives a few examples of these possible settings:

List of things that can be configured 👈

1️⃣ Two-factor authentication required to access an account - yes/no

2️⃣ Direct read access to passive (S3 / file server) storage allowed - yes/no

3️⃣ Public and Link Sharing options - enabled/disabled

4️⃣ Access through the internet, or through the intranet only

5️⃣ Possible for users to register their own account - yes/no/only for certain domains

6️⃣ Allow creation of permanent access tokens - yes/no

7️⃣ Maximum duration of validity of a token
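The settings listed above could be collected in a single configuration object. The sketch below is only an illustration: the class `SecuritySettings`, its field names, and its default values are assumptions made for this example and do not reflect ED's actual configuration format.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from enum import Enum


class Registration(Enum):
    """Who may register their own account (setting 5)."""
    YES = "yes"
    NO = "no"
    DOMAINS_ONLY = "only for certain domains"


@dataclass
class SecuritySettings:
    # Every name and default here is hypothetical, chosen for illustration.
    require_two_factor: bool = True                     # setting 1
    allow_direct_passive_read: bool = False             # setting 2
    allow_public_and_link_sharing: bool = False         # setting 3
    intranet_only: bool = True                          # setting 4
    self_registration: Registration = Registration.NO   # setting 5
    allowed_domains: list[str] = field(default_factory=list)
    allow_permanent_tokens: bool = False                # setting 6
    max_token_validity: timedelta = timedelta(days=30)  # setting 7

    def problems(self) -> list[str]:
        """Return human-readable inconsistencies; an empty list means OK."""
        found = []
        if (self.self_registration is Registration.DOMAINS_ONLY
                and not self.allowed_domains):
            found.append("domain-restricted registration needs at least one domain")
        if self.max_token_validity <= timedelta(0):
            found.append("maximum token validity must be positive")
        return found
```

Writing the choices down in one place like this makes it easier to review them with your ED consultant before the launch.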

💽 File storage

ED works as a layer on top of classic file storage (which we call passive storage). This layer takes care of access management, search, interoperability and scalability.

It is important to note that content in passive storage is not altered. ED needs read and write access to this file storage, but it will not alter the files themselves in any way.

If you have processes that need direct read access to files in passive storage, that is no problem. This can be necessary, for example, if some algorithms need extremely high read speeds and you want to bypass the API. It can also be needed if you have legacy code that you do not wish to migrate.

ED is compatible with other processes reading the files, but not with other processes having write access to the storage. You will need to add files via the ED API; otherwise glitches in the system can occur.

💠 Components

ED consists of 11 containerized components, each of which should be launched on a (virtual) server running Ubuntu Linux 16 or higher. Multiple components can run on the same server, but for optimal performance we recommend installing them on separate servers.

List of components 👈

1️⃣ Container with source code: to launch and run the API. (pool)

2️⃣ Cache database: A container with a database that caches relevant information to ensure scalability under heavy usage (across geographical regions). This component is not required and can be skipped if the system runs in a single geographical location and the expected load is modest. (pool)

3️⃣ Central metadata database: A container with the central database with metadata and some logic to communicate with cache databases. (unique)

4️⃣ Temporary file storage: A container that receives uploads and downloads before passing them on to their permanent location. (unique)

5️⃣ Active vector data storage: Container with log(n) efficient, scalable and flexible vector reading capabilities (pool)

6️⃣ Active raster data storage: Container with scalable and flexible raster storage capabilities. (pool)

7️⃣ Background processes: This container contains some logic that runs at low frequency in the background. (pool)

8️⃣ Main cluster node: This container contains logic that distributes tasks over the cluster. (unique)

9️⃣ Raster activation node: This container is assigned tasks by the main cluster node and activates raster layers on request. (pool)

🔟 Vector activation node: This container is assigned tasks by the main cluster node and activates vector layers on request. (pool)

  • S3 storage or file server: This container does not host any ED-specific logic and can be any storage system that a client wishes to run their ED on. (pool)

  • The ED UI/app can be hosted at any convenient location: either on the API or, if preferred, at some other location of your choosing.

📶 Scaling

The components marked with '(pool)' can have multiple instances running. Instances can be added or removed from the system when scaling up or down.

The API and cache database can be placed at any geographic location. They do not need to be near the main location.

List of requirements and recommendations 👈
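The '(pool)' and '(unique)' markers in the component list translate into a simple rule on instance counts, which the sketch below checks for a planned deployment. The short component keys and the function `check_instance_counts` are hypothetical names invented for this example; only the pool/unique markings and the optional status of the cache database come from the list above.

```python
# "pool" components may run several instances; "unique" components run one.
# The keys are hypothetical labels for the 11 components in the list above.
COMPONENT_KINDS = {
    "api": "pool",
    "cache_db": "pool",
    "central_db": "unique",
    "temp_file_storage": "unique",
    "vector_storage": "pool",
    "raster_storage": "pool",
    "background": "pool",
    "main_cluster_node": "unique",
    "raster_activation": "pool",
    "vector_activation": "pool",
    "passive_storage": "pool",
}

# The cache database may be skipped for single-location, modest-load setups.
OPTIONAL = {"cache_db"}


def check_instance_counts(counts: dict[str, int]) -> list[str]:
    """Return a list of problems with the planned instance counts."""
    problems = []
    for name, kind in COMPONENT_KINDS.items():
        n = counts.get(name, 0)
        if n < 1 and name not in OPTIONAL:
            problems.append(f"{name}: required component is missing")
        elif kind == "unique" and n > 1:
            problems.append(f"{name}: unique component cannot have {n} instances")
    for name in counts:
        if name not in COMPONENT_KINDS:
            problems.append(f"{name}: unknown component")
    return problems
```

For example, a plan with three API instances passes this check, while a plan with two central databases is reported as a problem.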

✔️ Each component requires an environment running Ubuntu 16 or higher.

✔️ A component needs at least 16GB RAM, 32GB hard disk and 8 CPUs.

✔️ We recommend running each component on its own (virtual) server for optimal performance, but this is not required and can also be done at a later stage.

✔️ When launching with more than one API component in the pool, a load balancer is required.

✔️ The main database and the temporary file receiver must be hosted at the same physical location (geographically close).

✔️ Communicating directly with the passive S3 bucket (bypassing ED) is allowed in read-only mode. If you also write to the S3 bucket directly, ED cannot be guaranteed to work correctly. We highly recommend not doing this and disabling this functionality. Files should always be added to your S3 bucket via ED.
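The hardware minimums in the list above can be checked on a candidate server before launch. The following is a minimal sketch using only the Python standard library; it is not an ED tool, the function names are made up for this example, and reading "GB" as decimal gigabytes is an assumption. `measure_host` relies on `os.sysconf` keys that are available on Linux.

```python
import os
import shutil

GB = 10 ** 9  # decimal gigabytes; this reading of "GB" is an assumption

# Minimums per component, taken from the requirements list above.
MIN_RAM_GB = 16
MIN_DISK_GB = 32
MIN_CPUS = 8


def shortfalls(ram_bytes: int, disk_bytes: int, cpus: int) -> list[str]:
    """Compare measured resources with the minimums; return what falls short."""
    issues = []
    if ram_bytes < MIN_RAM_GB * GB:
        issues.append(f"RAM: {ram_bytes / GB:.1f} GB, need {MIN_RAM_GB} GB")
    if disk_bytes < MIN_DISK_GB * GB:
        issues.append(f"disk: {disk_bytes / GB:.1f} GB, need {MIN_DISK_GB} GB")
    if cpus < MIN_CPUS:
        issues.append(f"CPUs: {cpus}, need {MIN_CPUS}")
    return issues


def measure_host(path: str = "/") -> tuple[int, int, int]:
    """Read total RAM, total disk size at `path`, and CPU count (Linux)."""
    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk = shutil.disk_usage(path).total
    cpus = os.cpu_count() or 0
    return ram, disk, cpus
```

On a Linux host, `shortfalls(*measure_host())` returns an empty list when the server meets all three minimums.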

🚀 Launch

An ED consultant will be dedicated to your launch, so you will not have to take these steps alone.

List of steps to launch per component 👈

🔘 Download the ED Docker image

🔘 Run the Docker image

🔘 Edit the configuration file (inside the running container) to match your needs and connect your file storage

🔘 Run the setup file (inside the running container)

The component will now be launched on the server.

It is important to launch the central database and file system components first, as they are prerequisites for a few other components.
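The launch order and the four per-component steps can be sketched as follows. This is an illustration under stated assumptions, not ED's launch tooling: the component labels are hypothetical, reading "file system components" as the passive and temporary file storage is an assumption, and the image name, configuration path, editor, and setup command are all supplied by the caller because the source does not specify them. Only standard Docker CLI subcommands (`pull`, `run`, `exec`) are used.

```python
# Hypothetical labels for the components that must be launched first.
# Treating both storage components as "file system components" is an assumption.
LAUNCH_FIRST = ("central_db", "passive_storage", "temp_file_storage")


def launch_order(components: list[str]) -> list[str]:
    """Put the prerequisite components first; keep the rest in the given order."""
    first = [c for c in LAUNCH_FIRST if c in components]
    rest = [c for c in components if c not in LAUNCH_FIRST]
    return first + rest


def launch_commands(image: str, name: str, config_path: str,
                    setup_command: list[str],
                    editor: str = "vi") -> list[list[str]]:
    """Build the Docker commands for the four launch steps of one component."""
    return [
        ["docker", "pull", image],                              # download the image
        ["docker", "run", "-d", "--name", name, image],         # run it in the background
        ["docker", "exec", "-it", name, editor, config_path],   # edit the configuration
        ["docker", "exec", name, *setup_command],               # run the setup file
    ]
```

Each returned command could then be executed in turn, for example with `subprocess.run(cmd, check=True)`, so that a failing step stops the launch of that component.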

What to do yourself?

Your Ellipsis Drive instance takes care of all internal features. Some general cloud infrastructure features, however, are inherited from the cloud you are in. You will need to configure these features in your cloud, outside of your Ellipsis Drive instance. They include:

  • Backups. We recommend backing up your passive storage component as well as the active raster and vector components. Since the active storage can be restored from the passive storage, the passive storage backup is the only true requirement.

  • Cloud armour. To protect yourself against DDoS attacks, we recommend using some form of cloud armour from your cloud provider.

  • Load balancing. If your Ellipsis Drive instance is multi-region, you will need a load balancer from your cloud provider in order to redirect users to the location nearest to them.

  • Auto scaling. If you wish to add additional nodes dynamically based on usage, you will need to create scripts for this. Ellipsis Drive will switch resources allocated to it on and off based on need, but it will not create new resources.
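As a starting point for such an auto scaling script, the decision of how many nodes to run can be separated from the cloud-specific calls that create them. The sketch below uses a proportional rule (the same idea as the Kubernetes Horizontal Pod Autoscaler formula); the function name, the 60 percent target, and the node limits are assumptions for illustration, and how utilisation is measured and how nodes are created depends on your cloud provider.

```python
def desired_nodes(current: int, utilisation_pct: int, target_pct: int = 60,
                  minimum: int = 1, maximum: int = 10) -> int:
    """Size the pool so that average utilisation lands near the target.

    All thresholds are hypothetical defaults; tune them for your workload.
    """
    if current < 1 or utilisation_pct < 0 or target_pct <= 0 or minimum > maximum:
        raise ValueError("invalid scaling inputs")
    # Integer ceiling of current * utilisation / target, avoiding float rounding.
    wanted = -((-current * utilisation_pct) // target_pct)
    # Never go below the minimum or above the maximum pool size.
    return max(minimum, min(maximum, wanted))
```

A script would call this periodically and create or remove nodes in your cloud until the pool matches the returned size.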

FAQs

🤔 What do I do if I have no file storage of my own?

This is no problem. There is an additional ED component for the file storage if you do not have a system of your own in place.

🤔 Are there any features in the public ED version that will not be available in the private instance?

No, your private ED instance will be a copy of the public one. However, you can configure it with alternative security settings (see the list above). Changing these settings may make some functionality unavailable. For example, if you do not wish to facilitate link sharing due to security concerns, some link sharing functionality will not be available.

🤔 Will the ED UI be included in my private instance?

Yes, the UI of the app is included.
