A local installation of Ellipsis Drive (ED) allows you to run ED in your own (virtual) environment. You can launch ED as a completely stand-alone solution, or launch it on top of your existing file storage, whether that is S3, a classic file server or another form of cloud storage.
ED can be deployed in a geographically distributed environment. That is to say, you can make your ED deployment perform well even if users are distributed around the world.
🔐 Security configuration
ED has various settings that can be configured to improve security. The following list gives a few examples of these possible settings:
1️⃣ Two-factor authentication required to access an account - yes/no
2️⃣ Direct read access to passive (S3 / file server) storage allowed - yes/no
3️⃣ Public and Link Sharing options - enabled/disabled
4️⃣ Access through internet or intranet only
5️⃣ Possible for users to register their own account - yes/no/only for certain domains
6️⃣ Allow creation of permanent access tokens - yes/no
7️⃣ Maximum duration of validity of a token
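As an illustration, the settings listed above could be captured in a small structure that is checked before deployment. This is only a sketch: the field names, defaults and checks below are hypothetical and do not reflect ED's actual configuration file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SecuritySettings:
    """Hypothetical model of the settings listed above (not ED's real schema)."""
    require_two_factor: bool = True
    allow_direct_passive_read: bool = False
    allow_public_and_link_sharing: bool = False
    intranet_only: bool = True
    # One of "yes", "no" or "domains" (registration limited to allowed_domains).
    self_registration: str = "no"
    allowed_domains: List[str] = field(default_factory=list)
    allow_permanent_tokens: bool = False
    # Maximum token validity in hours; None means no limit is configured.
    max_token_validity_hours: Optional[int] = 24

    def problems(self) -> List[str]:
        """Return a list of human-readable inconsistencies (empty if none)."""
        found = []
        if self.self_registration not in ("yes", "no", "domains"):
            found.append("self_registration must be 'yes', 'no' or 'domains'")
        if self.self_registration == "domains" and not self.allowed_domains:
            found.append("registration is limited to domains, but none are listed")
        if self.max_token_validity_hours is not None and self.max_token_validity_hours <= 0:
            found.append("max_token_validity_hours must be positive")
        return found


if __name__ == "__main__":
    settings = SecuritySettings(self_registration="domains",
                                allowed_domains=["example.org"])
    print(settings.problems())
```

Keeping the choices in one place like this makes it easier to discuss them with your ED consultant before the instance is configured.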
ED works as a layer on top of classic file storage (that we call passive storage). This layer takes care of access management, search, interoperability and scalability.
It is important to note that content in passive storage is not altered. ED needs read and write access to this file storage, but it will not alter the files themselves in any way.
If you have processes that need direct read access to files in passive storage, that is no problem. This can for example be necessary if you need extremely high read speed for some algorithms and want to bypass the API. It can also be needed if you have legacy code that you do not wish to migrate.
ED is compatible with other processes reading the files, but not with other processes having write access to the storage. You will need to add files via the ED API; otherwise, glitches in the system can occur.
ED consists of 11 containerized components, each of which should be launched on a (virtual) server running Ubuntu Linux 16 or higher. Multiple components can run on the same server, but for optimal performance we recommend installing them on different servers.
1️⃣ Container with source code: to launch and run the API. (pool)
2️⃣ Cache database: A container with a database that caches relevant information to ensure scalability under heavy usage (across geographical regions). This component is not required and can be skipped if the system is running in a single geographical location and the expected load is modest. (pool)
3️⃣ Central metadata database: A container with the central database with metadata and some logic to communicate with cache databases. (unique)
4️⃣ Temporary file storage: Container that receives uploads and downloads before passing them on to their permanent location. (unique)
5️⃣ Active vector data storage: Container with log(n) efficient, scalable and flexible vector reading capabilities (pool)
6️⃣ Active raster data storage: Container with scalable and flexible raster storage capabilities. (pool)
7️⃣ Background processes: This container contains some logic that runs at low frequency in the background. (pool)
8️⃣ Main cluster node: This container contains logic that distributes tasks over the cluster. (unique)
9️⃣ Raster activation node: This container is assigned tasks by the main cluster node and activates raster layers on request. (pool)
🔟 Vector activation node: This container is assigned tasks by the main cluster node and activates vector layers on request. (pool)
- S3 storage or file server: This container does not host any ED-specific logic and can be any storage system that a client wishes to run their ED on. (pool)
- The ED UI/app can be hosted at any convenient location: either on the API or, if preferred, at some other location of your choosing.
The components marked with '(pool)' can have multiple instances running. Instances can be added or removed from the system when scaling up or down.
The API and cache database can be placed on any geographic location. They do not need to be near the main location.
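The '(pool)' and '(unique)' markings can be turned into a simple sanity check on a planned deployment. The sketch below assumes a plan is expressed as a mapping from component name to instance count; the short component names are illustrative and are not identifiers used by ED.

```python
# Scaling kind per component, as marked in the list above.
# The keys are illustrative short names, not identifiers used by ED.
COMPONENT_KIND = {
    "api": "pool",
    "cache_db": "pool",
    "central_db": "unique",
    "temp_file_storage": "unique",
    "active_vector_storage": "pool",
    "active_raster_storage": "pool",
    "background_processes": "pool",
    "main_cluster_node": "unique",
    "raster_activation_node": "pool",
    "vector_activation_node": "pool",
    "passive_storage": "pool",
}

# The cache database may be skipped for single-location, modest-load setups.
OPTIONAL = {"cache_db"}


def check_plan(instances):
    """Return a list of problems found in a {component: instance_count} plan."""
    problems = []
    for name in instances:
        if name not in COMPONENT_KIND:
            problems.append(f"unknown component: {name}")
    for name, kind in COMPONENT_KIND.items():
        count = instances.get(name, 0)
        if count == 0 and name not in OPTIONAL:
            problems.append(f"missing component: {name}")
        elif kind == "unique" and count > 1:
            problems.append(f"{name} is unique but has {count} instances")
    return problems


if __name__ == "__main__":
    plan = {name: 1 for name in COMPONENT_KIND if name not in OPTIONAL}
    plan["api"] = 3          # pool components may be scaled out
    plan["central_db"] = 2   # not allowed: the central database is unique
    print(check_plan(plan))
```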
✔️ Each component requires an environment running Ubuntu 16 or higher.
✔️ A component needs at least 16 GB RAM, 32 GB hard disk and 8 CPUs.
✔️ We recommend running components on their own (virtual) server for optimal performance, but this is not required and can also be done at a later stage.
✔️ When launching with more than one API component in the pool, a load balancer is required.
✔️ The main database and the temporary file receiver must be hosted at the same physical location (geographically close).
✔️ Communication directly (bypassing ED) with the passive S3 bucket is allowed in read only mode. When you write to the S3 bucket as well, there is no way to guarantee the working of ED. We highly recommend not to do this and to disable this functionality. Files should always be added to your S3 bucket via ED.
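The hardware minimums and the load balancer rule above lend themselves to a quick pre-flight check. The following is a minimal sketch that assumes you describe each server by hand; the `Server` class and function names are hypothetical, and the minimums are applied per server hosting a component, which is one possible reading of the requirement.

```python
from dataclasses import dataclass

# Minimums taken from the requirements above.
MIN_RAM_GB = 16
MIN_DISK_GB = 32
MIN_CPUS = 8


@dataclass
class Server:
    """Hand-written description of a (virtual) server; a hypothetical helper type."""
    name: str
    ram_gb: int
    disk_gb: int
    cpus: int


def server_problems(server):
    """Return the requirements from the list above that this server fails."""
    problems = []
    if server.ram_gb < MIN_RAM_GB:
        problems.append(f"{server.name}: needs at least {MIN_RAM_GB} GB RAM")
    if server.disk_gb < MIN_DISK_GB:
        problems.append(f"{server.name}: needs at least {MIN_DISK_GB} GB disk")
    if server.cpus < MIN_CPUS:
        problems.append(f"{server.name}: needs at least {MIN_CPUS} CPUs")
    return problems


def needs_load_balancer(api_instances):
    """A load balancer is required as soon as the API pool has more than one instance."""
    return api_instances > 1


if __name__ == "__main__":
    print(server_problems(Server("api-1", ram_gb=8, disk_gb=64, cpus=4)))
    print(needs_load_balancer(api_instances=2))
```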
An ED consultant will be dedicated to your launch, so these steps will not have to be taken by the client alone.
🔘 Download the ED docker image
🔘 Run the docker image
🔘 Edit the configuration file (inside the docker image) to match your needs and connect your file storage
🔘 Run the setup file (inside the docker image)
The component will now be launched on the server.
It is important to launch the central database and file system components first, as they are prerequisites for a few other components.
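The ordering constraint can be sketched as a small helper that moves the prerequisite components to the front of a launch list. The component names are illustrative, and reading "file system components" as the passive storage and the temporary file storage is an assumption; confirm the exact order with your ED consultant.

```python
# Components assumed to be prerequisites for the others. The names are
# illustrative, and treating both storage components as "file system
# components" is an assumption, not something ED documents in this form.
PREREQUISITES_FIRST = ("central_db", "passive_storage", "temp_file_storage")


def launch_order(components):
    """Return the components with prerequisites first, otherwise keeping input order."""
    first = [name for name in PREREQUISITES_FIRST if name in components]
    rest = [name for name in components if name not in PREREQUISITES_FIRST]
    return first + rest


if __name__ == "__main__":
    planned = ["api", "temp_file_storage", "main_cluster_node", "central_db"]
    for step, name in enumerate(launch_order(planned), start=1):
        print(step, name)
```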
Your Ellipsis Drive instance takes care of all internal features. Some general cloud infrastructure features, however, are inherited from the cloud you run in. You will need to configure these in your cloud, outside of your Ellipsis Drive instance. They include:
- Backups. We recommend backing up your passive storage component as well as the active raster and vector components. Since the active storage can be restored from the passive storage, the passive storage backup is the only true requirement.
- Cloud armour. To protect yourself against DDoS attacks, we recommend using some form of cloud armour from your cloud provider.
- Load balancing. If your Ellipsis Drive instance is multi-region, you will need a load balancer from your cloud provider to redirect users to the location nearest to them.
- Auto scaling. If you wish to add additional nodes dynamically based on usage you will need to create scripts for this. Ellipsis Drive will switch resources allocated to it on and off based on need. But it will not create new resources.
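Since auto scaling scripts are left to you, the core decision such a script makes can be sketched as follows. This is an assumption-laden illustration: the load metric, the per-node capacity and the node limits are hypothetical values you would replace with measurements from your own cloud, and actually creating or removing servers is done through your cloud provider's tooling, not through ED.

```python
import math


def desired_node_count(requests_per_second, capacity_per_node,
                       min_nodes=1, max_nodes=10):
    """Return how many nodes a pool should have for the given load.

    All parameters are hypothetical: measure the load and per-node capacity
    in your own environment and choose limits that fit your budget.
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    needed = math.ceil(requests_per_second / capacity_per_node)
    # Clamp to the configured bounds so the pool never shrinks to zero
    # or grows past what you are willing to provision.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    print(desired_node_count(requests_per_second=250, capacity_per_node=100))
```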
Are there any features in the public ED version that will not be available in the private instance?
No, your private ED instance will be a copy of the public one. However, you can configure it with alternative security settings (see the list above). Changing these settings can mean that some functionality may disappear. For example, if you do not wish to facilitate link sharing due to security concerns, some link sharing functionality will not be available.