The Ultimate deployment guide for atoti on AWS with Amazon EC2

How to run and share your atoti BI app

In a previous article, we covered how to deploy a BI dashboard on AWS using Docker, specifically with the atoti Docker image. In case you haven’t heard of atoti, it is a Python library for multidimensional data analysis that comes with dashboarding capabilities.

In this article, we are going to see how to set up JupyterLab along with atoti on a virtual machine using AWS Elastic Compute Cloud, also known as EC2.

Objective

The goal of this article is to give anyone access to the atoti development platform from the cloud, so that they can create their own notebooks and run a BI web application. The recommended solution would be to deploy JupyterHub. We will show a simpler solution that uses a shared instance of JupyterLab. One limitation of this solution is that users sharing the instance must take care not to run the same notebook concurrently or restart the kernels of other users. With a common development platform deployed on AWS using Amazon EC2, your end users can start building dashboards, or even explore the source code of the notebook behind the app.

For more information about atoti and what you can do with it, follow this link.

Prerequisites

We created a free account with the AWS free tier for this article.

We use PuTTY to remotely access the Ubuntu server that we are going to create.

Step 1. Launching Ubuntu instance with Amazon EC2

Amazon EC2 is a web service that allows us to boot an Amazon Machine Image (AMI) and configure it as a virtual machine. The video below takes us through the process of setting up an Ubuntu server with Amazon EC2 and JupyterLab with an atoti tutorial.

Generating Amazon EC2 key pair

A key pair is a set of security credentials that allow us to connect to an Amazon EC2 instance.

Log into AWS and navigate to the Amazon EC2 main page. We can create the key pair from the Amazon EC2 dashboard.

Select the ppk option to download a PuTTY private key file. Be sure to store this file securely. We will use it to configure PuTTY for SSH into our Amazon EC2 instance.

We can also create the key pair before launching our EC2 instance. This will give us a private key file (*.pem file) that can be used with OpenSSH. However, we will have to convert this file to a ppk file before we can use it with PuTTY.

Launching Amazon EC2 instances

From the AWS left menu bar, navigate to Instances. Click on the “Launch instances” button to start creating the virtual machine.

There are 7 steps to launching the server. We are going to focus on:

Step 1. Choose AMI

Step 2. Choose Instance Type

Step 4. Add Storage

Step 6. Configure Security Group

For those steps that are not mentioned, we will keep to the default configuration.

Step 1. Choose AMI

We shall select the free tier eligible Ubuntu server.

Step 2. Choose Instance Type

Go for the t2.micro instance type which gives us 750 free hours per month for the first year.

Step 4. Add Storage

We can only specify the instance store volumes for the EC2 instance at launch, so we should increase the volume size now.

Note that data in an instance store persists only during the lifetime of its associated instance: if the EC2 instance is stopped, hibernated or terminated, the data in the instance store is lost.

Although not covered in this article, it is advisable to use more durable data storage such as Amazon EBS for example. We can attach additional EBS volumes to the EC2 instance anytime.

Step 6. Configure Security Group

Depending on whether SSL is configured, we can open the firewall for HTTP or HTTPS. For this article, we add rules for port 22 (SSH) and port 8888 (the Jupyter server).

In this example we allow all IP addresses to access the EC2 instance by setting the source to Anywhere. In practice, you should restrict access to known IP addresses only.
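
As a rough illustration of restricting the source, the sketch below builds one inbound rule per port for a single CIDR block instead of Anywhere. The helper name `build_ingress_rules` and the example range are hypothetical; the dictionary layout mirrors the `IpPermissions` structure that boto3's `authorize_security_group_ingress` accepts, on the assumption that you would script the rules that way rather than use the console. Nothing here calls AWS.

```python
import ipaddress


def build_ingress_rules(allowed_cidr, ports=(22, 8888)):
    """Return one TCP ingress rule per port, limited to a single CIDR block."""
    # ip_network raises ValueError if the CIDR is malformed.
    network = ipaddress.ip_network(allowed_cidr)
    if network.prefixlen == 0:
        # A /0 prefix is the same as choosing "Anywhere" in the console.
        raise ValueError("refusing to open the ports to every address")
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": str(network)}],
        }
        for port in ports
    ]


# 203.0.113.0/24 is a documentation-only range; substitute your own network.
rules = build_ingress_rules("203.0.113.0/24")
for rule in rules:
    print(rule["FromPort"], rule["IpRanges"][0]["CidrIp"])
```

The same two ports as in the article (22 for SSH, 8888 for Jupyter) are used as defaults; only the source range changes.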

Associating Key pair for EC2 instance connection

In the final step to launch the EC2 instance, we will be prompted to select a key pair for connecting to our instance.

This is where we choose the key pair that we created earlier. Make sure you still have the private key that was downloaded to your machine.

Instance key information

We can quickly access our EC2 instance from the Launch Status page by clicking on the link highlighted below.

It’s a good idea to create billing alerts as suggested above as we don’t want to be charged unknowingly when the free tier usage is exceeded.

On the instance summary page, we look at a few things:

1. Instance status — it should be running for us to be able to connect to it

2. Public IPv4 address — we need this to connect to the EC2 instance and also, to configure our Jupyter server.

3. Connect — This gives us the instructions for connecting to the EC2 instance.

SSH using PuTTY

Using the information from the instance summary page, we can configure PuTTY to connect to our EC2 instance:

  • Enter the instance IP address in the Host Name field.
  • Go to Connection > Data and enter the username obtained from the “Connect to instance” page.
  • Go to Connection > SSH > Auth, then browse to and select the ppk file downloaded when we created the key pair earlier.

Save the instance configuration and click Open to SSH into our EC2 instance.

Step 2. Installing JupyterLab with atoti on Amazon EC2

We will reference the atoti installation guide to install JupyterLab and atoti on the Amazon EC2 instance that we have created:

1 — Install Conda

As recommended, we will install the 64-bit version of Miniconda. We first have to download it to the Amazon EC2 instance. Connect to the instance in PuTTY and run the below commands:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

During the installation, press Enter a few times to go through the license agreement and type yes to agree. Also, press Enter to confirm the location.

Lastly, type yes for the installer to initialize Miniconda3.

We need to close and reopen PuTTY for the changes to take effect.

2 — Set up the conda-forge channel and atoti channel

After reconnecting to the EC2 instance, run the below commands in PuTTY.

conda config --add channels conda-forge
conda config --add channels https://conda.atoti.io

3 — Create a new Conda environment

Run the below command to create a new Conda environment for us to set up atoti and JupyterLab. You can create multiple environments with different conda packages installed for different purposes, so that changes to one don’t affect the others.

conda create --name atoti

Enter y to proceed with the environment creation. Follow the instructions to activate the environment:

conda activate atoti

4— Install atoti and JupyterLab

All it takes to install atoti and its companion packages is the below command:

conda install atoti atoti-jupyterlab python

It will take a while for the installation to finish. Don’t forget to enter y to proceed with the installation when prompted.

atoti web application port

When we create an atoti session, session.url will return the link to access the atoti web application. A random port is generated for each session unless pre-defined in the configuration.

To avoid having to open up a range of ports on the firewall, we are going to make use of the Jupyter Server Proxy. This will allow us to run the atoti web application alongside the notebook, with the URL directing to a proxy port.

Hence we only have to open up the firewall for port 8888 instead of a different port per atoti session.

Let’s install it with the below command:

conda install jupyter-server-proxy

Configuring JupyterLab

There are a few configurations that we want to apply to the JupyterLab:

  • ServerApp.ip: make the notebook server listen on all IP addresses
  • ServerApp.open_browser: do not open a browser after starting
  • ServerApp.password: use password authentication instead of the default token authentication
  • ServerApp.custom_display_url: override the URL shown to users
  • ServerApp.root_dir: set it to the work folder where we will store the atoti tutorial

Password generation

Before we start the configuration, let’s generate a hashed password for the web authentication.

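ipython
# jupyter_server ships with JupyterLab 3+; older setups used `from IPython.lib import passwd`
from jupyter_server.auth import passwd
passwd()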

Enter password: [Create password and press enter]
Verify password: [Re-enter the password and press enter]

The hashed password will be displayed as shown below:

Exit from ipython.

Updating configuration file

Run the below commands in PuTTY to create the config profile:

jupyter notebook --generate-config

To start configuring Jupyter, run the following:

cd ~/.jupyter
nano jupyter_notebook_config.py

Insert the following at the beginning of the configuration file, replacing the placeholders for the hashed password, the instance IPv4 address and the port for the Jupyter server:

c = get_config()
# Notebook config
# Listen on all IP addresses
c.ServerApp.ip = '*'
# Do not open a browser when the server starts
c.ServerApp.open_browser = False
# The hashed password we generated above
c.ServerApp.password = u'<hashed password>'
# The port we opened in the AWS EC2 security group (8888 in this article)
c.ServerApp.port = <port for jupyter server>
# URL displayed to users, including protocol, address and port
c.ServerApp.custom_display_url = 'http://<instance IPv4 address>:<port for jupyter server>'
# Start Jupyter in this directory (a relative path is resolved from where Jupyter is launched)
c.ServerApp.root_dir = 'work'

Exit and save the configuration file.
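
If you would rather not edit the placeholders by hand, a small script can fill them in. This is only a sketch: the function name `render_jupyter_config` and its parameters are hypothetical, and the template simply repeats the settings from the configuration above. The address in the usage line is a documentation-only example.

```python
from string import Template

# Same settings as in the configuration above, with $-style placeholders.
CONFIG_TEMPLATE = Template("""\
c = get_config()
c.ServerApp.ip = '*'
c.ServerApp.open_browser = False
c.ServerApp.password = '$hashed_password'
c.ServerApp.port = $port
c.ServerApp.custom_display_url = 'http://$host:$port'
c.ServerApp.root_dir = '$root_dir'
""")


def render_jupyter_config(hashed_password, host, port=8888, root_dir="work"):
    """Return the configuration text with every placeholder filled in."""
    return CONFIG_TEMPLATE.substitute(
        hashed_password=hashed_password,
        host=host,
        port=int(port),  # fails early if the port is not a number
        root_dir=root_dir,
    )


# Example values only: paste your own hash and public IPv4 address.
config_text = render_jupyter_config("<hashed password>", "203.0.113.10")
print(config_text)
```

The printed text can then be pasted at the top of the configuration file opened in nano.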

Downloading atoti tutorial to work directory

Since we have set the root_dir to the work directory, we need to create this folder. We will create it under the home directory and download the atoti tutorial to it:

mkdir ~/work
python -m atoti.copy_tutorial ~/work/tutorial

atoti configuration

We can configure atoti during session creation to point to the EC2 instance IP address instead:

Instead of providing the configuration for each session, we can also create a global configuration for atoti. Run the following commands in PuTTY:

mkdir --parents ~/.atoti
echo "url_pattern: http://<instance IPv4 address>:<port for jupyter server>/proxy/{port}/" > ~/.atoti/config.yml

Replace the instance IPv4 address and port for the Jupyter server in the url_pattern above. Since we are making use of the Jupyter server proxy, we include `/proxy/{port}/` in the URL pattern. The proxy {port} will be randomly assigned unless specified in the atoti configuration. Check out the configuration options available for atoti.
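
To make the role of `{port}` concrete, the sketch below shows how such a pattern resolves to a session URL once a port is known. It uses plain Python string formatting; whether atoti performs the substitution exactly this way internally is an assumption, and the address and port number are made-up examples.

```python
def session_url(url_pattern, session_port):
    """Fill the {port} placeholder of a URL pattern with a session's port."""
    if "{port}" not in url_pattern:
        raise ValueError("the pattern must contain a {port} placeholder")
    return url_pattern.format(port=session_port)


# Jupyter listens on 8888; the atoti session port only appears in the path.
pattern = "http://203.0.113.10:8888/proxy/{port}/"
print(session_url(pattern, 45231))
# prints: http://203.0.113.10:8888/proxy/45231/
```

Whatever port a session ends up on, the browser only ever talks to port 8888, which is why a single firewall rule is enough.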

Step 3. Launching JupyterLab

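We are now ready to launch JupyterLab! From the home directory (the relative root_dir we configured is resolved from the directory where Jupyter is started), run the below command with `&` to start the process in the background: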

jupyter lab &

This way, JupyterLab will still be accessible after we exit the shell.
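
Before sharing the URL, it can be worth confirming from your own machine that the Jupyter port is reachable through the security group. The helper below is a minimal sketch using only the standard library; the name `is_port_open` is hypothetical, and a successful TCP connection only shows that something is listening, not that it is JupyterLab.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `is_port_open("<instance IPv4 address>", 8888)` should return True once JupyterLab is running and the inbound rule for port 8888 is in place.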

To terminate JupyterLab, run `ps -ef | grep jupyter` to find the running Jupyter processes, then use `kill -9` on their PIDs to kill them.

Step 4. Example of an atoti BI application

Using the atoti — 01. Basics.ipynb, we will create our measures and visualize them in the Jupyter notebook.

We can publish the visualization from the notebook to the web application for use in a dashboard, or we can add new widgets to a dashboard:

We can save the dashboard and share it with our peers!