Getting started

<h4 id="bkmrk-installing-ethoscopy">Installing ethoscopy as a docker container with ethoscope-lab (recommended)</h4>
<p id="bkmrk-the-ethoscope-lab-do">The <a href="https://hub.docker.com/repository/docker/ggilestro/ethoscope-lab">ethoscope-lab docker container</a> is the recommended way to use ethoscopy. A docker container runs from a pre-made image on any computer, independent of the operating system you use. The container is isolated from the rest of the machine and will not interfere with your other Python or R installations. It ships with all of its own dependencies, so no further setup is needed. It also comes with its own multi-user JupyterHub, so lab members can log in directly from their browser and run all their analyses remotely from any computer, at home or at work. In the Gilestro laboratory we use a common workstation with the following hardware configuration.</p>
<pre id="bkmrk-cpu%3A-12x-intel%28r%29-xe"><code>CPU: 12x Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
GPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
Hard drive: 1TB SSD for OS, 1TB SSD for homes and cache, 7.3 TB for ethoscope data
Memory: 64GB
</code></pre>
<p id="bkmrk-the-workstation-is-a">The workstation is accessible via the internet (behind a VPN) so that any lab member can log in to the service and run their analyses remotely. All the computational power is in the workstation itself, so one can analyse ethoscope data even from a tablet if need be. Follow the instructions below to install the ethoscope-lab docker container on your machine.</p>
<h5 id="bkmrk-on-linux">On linux (recommended)</h5>
<p id="bkmrk-the-best-solution-is">The best solution is to install this on the same computer that collects the ethoscope data, so that ethoscopy has direct access to the <code>db</code> files stored on that machine.
For most small installations, this computer could be <a href="https://www.notion.so/The-node-9cd2d45f3eae46c7a0751c500fa11bac">the node</a>.</p>
<p id="bkmrk-to-install-the-docke">To install the docker container you will need the following information:</p>
<ul id="bkmrk-what-is-the-internet">
<li>What is the hostname or IP address of the computer you want to access?</li>
<li>Where are your ethoscope data stored? On a regular node they would be in <code>/ethoscope_data/results</code>.<br></li>
</ul>
<p id="bkmrk-once-these-info-are-">Once you have this information, you can proceed.</p>
<pre id="bkmrk-%23-optional.-update-t"><code class="language-bash"># Optional. Update the system to the latest version. You may want to restart after this.
# (pamac and pacman are the Manjaro/Arch package managers; use your distribution's equivalent)
sudo pamac update

# install docker
sudo pacman -S docker

# start the docker service
sudo systemctl enable --now docker

# and finally download and run the ethoscope-lab docker container
# the :ro flag means you are mounting that destination in read-only
sudo docker run -d -p 8000:8000 \
      --name ethoscope-lab \
      --volume /ethoscope_data/results:/mnt/ethoscope_results:ro \
      --restart=unless-stopped \
      ggilestro/ethoscope-lab
</code></pre>
<p id="bkmrk-installation-on-wind" class="callout info">Installation on Windows or MacOS only makes sense if you have actual ethoscope data on those machines, which is normally not the case. If you go for those OSs, I won't provide detailed instructions or support, as I assume you know what you're doing.</p>
<h5 id="bkmrk-on-macos">On MacOS</h5>
<p id="bkmrk-install-the-docker-s">Install the docker software from <a href="https://docs.docker.com/desktop/install/mac-install/">here</a>.
Open the terminal and run the same command as above, e.g.:</p>
<pre id="bkmrk-%23-download-and-run-t"><code class="language-bash"># download and run the ethoscope-lab docker container
# the :ro flag means you are mounting that destination in read-only
sudo docker run -d -p 8000:8000 \
      --name ethoscope-lab \
      --volume /path/to/ethoscope_data/:/mnt/ethoscope_results:ro \
      --restart=unless-stopped \
      ggilestro/ethoscope-lab
</code></pre>
<h5 id="bkmrk-on-windows">On Windows</h5>
<p id="bkmrk-install-the-docker-s-0">Install the docker software from <a href="https://docs.docker.com/desktop/install/windows-install/">here</a>. After installation, open the Windows terminal and issue the same command as above, only replacing the folder syntax as appropriate. For instance, if your ethoscope data are on <code>z:\ethoscope_data</code> and the user data are on <code>c:\Users\folder</code> use the following:</p>
<pre id="bkmrk-sudo-docker-run--d--"><code class="language-bash"># same port mapping as on linux and MacOS (8000:8000)
sudo docker run -d -p 8000:8000 \
      --name ethoscope-lab \
      --volume /z/ethoscope_data:/mnt/ethoscope_results:ro \
      --restart=unless-stopped \
      ggilestro/ethoscope-lab
</code></pre>
<h4 id="bkmrk-storing-user-data-on">Storing user data on the machine, not on the container (recommended)</h4>
<p id="bkmrk-ethoscopelab-runs-on">ethoscopelab runs on top of a JupyterHub environment, meaning that it supports organised, simultaneous access by multiple users. Each user needs their own credentials and their own home folder. The default user is <code>ethoscopelab</code>, with password <code>ethoscopelab</code>, and this user saves all of their work in the folder <code>/home/ethoscopelab</code>. In the examples above, the users' folders are stored inside the container itself <span style="text-decoration: underline;">which is not ideal</span>. A better solution is to mount the home folders from a local folder on your machine.
In the example below, we use the folder <code>/mnt/my_user_homes</code>.</p>
<pre id="bkmrk-sudo-docker-run--d---0"><code class="language-bash"># --volume takes host_path:container_path, so the local folder comes first
sudo docker run -d -p 8000:8000 \
      --name ethoscope-lab \
      --volume /ethoscope_data/results:/mnt/ethoscope_results:ro \
      --volume /mnt/my_user_homes:/home \
      --restart=unless-stopped \
      ggilestro/ethoscope-lab</code></pre>
<p id="bkmrk-make-sure-that-your-" class="callout info">Make sure that your local home location contains an <code>ethoscopelab</code> folder that can be accessed by the <code>ethoscopelab</code> user! In the example above, you would need to create a folder called <code>/mnt/my_user_homes/ethoscopelab</code>.</p>
<p id="bkmrk-any-folder-in-%2Fmnt%2Fm">Any folder in <code>/mnt/my_user_homes</code> will become accessible to ethoscopelab. In our lab, we sync those using <a href="https://owncloud.com/">owncloud</a> (an open-source Dropbox clone) so that every user has their files automatically synced across all their machines.</p>
<h4 id="bkmrk-creating-new-users">Creating new users</h4>
<p id="bkmrk-if-you-wan-to-add-ne">If you want to add new users, you will have to do it from the command line. On the linux computer running ethoscopelab (normally the node) use the following commands:</p>
<pre id="bkmrk-%23enter-in-a-bash-she"><code class="language-bash"># enter a bash shell inside the container
sudo docker exec -it ethoscope-lab /bin/bash

# create the user (-m also creates the user's home folder)
useradd myusername -m

# set the password for the user you just created
passwd myusername</code></pre>
<p id="bkmrk-you-will-now-be-able">You will now be able to log in to Jupyter with these new credentials. The user's data will be stored in the newly created home folder.</p>
<h5 id="bkmrk-persistent-user-cred">Persistent user credentials</h5>
<p id="bkmrk-in-linux%2C-user-crede">In linux, user credentials are saved inside three files: <code>/etc/passwd</code>, <code>/etc/shadow</code>, <code>/etc/group</code>.
It is possible to store those on the host computer (<em>e.g.</em> the node) and then mount them into the container. This is called a <em>persistent volume</em> because the data will remain on the host computer even if the container is deleted. An example of a container running in this way is the following:</p>
<pre id="bkmrk-sudo-docker-run--d---1"><code class="language-">sudo docker run -d -p 8000:8000 \
      --name ethoscope-lab \
      --volume /mnt/data/results:/mnt/ethoscope_results:ro \
      --volume /mnt/data/ethoscope_metadata:/opt/ethoscope_metadata \
      --volume /mnt/homes:/home \
      --volume /mnt/cache:/home/cache \
      --restart=unless-stopped \
      -e VIRTUAL_HOST="jupyter.lab.gilest.ro" \
      -e VIRTUAL_PORT="8000" \
      -e LETSENCRYPT_HOST="jupyter.lab.gilest.ro" \
      -e LETSENCRYPT_EMAIL="giorgio@gilest.ro" \
      --volume /mnt/secrets/passwd:/etc/passwd:ro \
      --volume /mnt/secrets/group:/etc/group:ro \
      --volume /mnt/secrets/shadow:/etc/shadow:ro \
      --cpus=10 \
      ggilestro/ethoscope-lab.gilest.ro:latest</code></pre>
<p id="bkmrk-lines-12-14-indicate">Lines 12-14 (the three <code>/mnt/secrets</code> volumes) indicate the location of the user credentials. This configuration preserves user information even when upgrading ethoscopelab to newer versions.</p>
<h3 id="bkmrk-troubleshooting">Troubleshooting</h3>
<p id="bkmrk-if-your-jupyter-star">If your Jupyter starts but hangs on the following image</p>
<p id="bkmrk-"><a href="https://bookstack.lab.gilest.ro/uploads/images/gallery/2023-06/3MTimage.png" target="_blank" rel="noopener"><img src="https://bookstack.lab.gilest.ro/uploads/images/gallery/2023-06/scaled-1680-/3MTimage.png" alt="image.png"></a></p>
<p id="bkmrk-it-means-that-the-et">It means that the ethoscopelab user does not have access to its own folder.
This most likely indicates that you are running the container with the home folders mounted from your local machine, but the <code>ethoscopelab</code> home folder is either missing or lacks read and write permissions.</p>
<h4 id="bkmrk-install-ethoscopy-in">Install ethoscopy in your Python space</h4>
<p id="bkmrk-ethoscopy-is-on-pypi">Ethoscopy is on <a href="https://github.com/gilestrolab/ethoscopy">GitHub</a> and on <a href="https://pypi.org/project/ethoscopy/">PyPI</a>. You can install the latest stable version with pip.</p>
<pre id="bkmrk-pip-install-ethoscop"><code class="language-bash">pip install ethoscopy</code></pre>
<p id="bkmrk-as-of-version-2.0.5">As of version 2.0.5, the required dependencies are:</p>
<pre id="bkmrk-python-%3E%3D-3.10-pandas"><code class="language-bash">Python >= 3.10
numpy >= 2.0
pandas >= 2.2.2
plotly >= 5.22
seaborn >= 0.13
hmmlearn >= 0.3.2
pywavelets >= 1.6
astropy >= 6.1
tabulate >= 0.9
colour >= 0.1.5
nbformat >= 5.10</code></pre>
<h4 id="bkmrk-tutorial-data">Tutorial data</h4>
<p id="bkmrk-tutorial-data-intro">Starting with <strong>ethoscopy 2.0.5</strong>, the six tutorial pickle files (~36&nbsp;MB total, dominated by <code>overview_data.pkl</code> at ~31&nbsp;MB) are <strong>no longer bundled with the PyPI wheel</strong>: <code>pip install ethoscopy</code> ships code only. The tutorial notebooks fetch the datasets on demand with a single one-time call:</p>
<pre id="bkmrk-tutorial-data-call"><code class="language-python">import ethoscopy as etho

etho.download_tutorial_data()   # idempotent: skips files already on disk</code></pre>
<p id="bkmrk-tutorial-data-default">By default this populates <code>~/.cache/ethoscopy/tutorial_data/</code>, which is user-writable on every platform (no root or sudo needed).
After that, <code>get_tutorial('overview')</code>, <code>get_tutorial('circadian')</code>, <code>get_HMM('M')</code> and <code>get_HMM('F')</code> work as shown in the tutorial notebooks.</p>
<p id="bkmrk-tutorial-data-lookup"><strong>Lookup order.</strong> At load time, ethoscopy checks three locations in this order:</p>
<ol id="bkmrk-tutorial-data-lookup-list">
<li><code>&lt;site-packages&gt;/ethoscopy/misc/tutorial_data/</code>: dev / editable installs and the <code>ethoscope-lab</code> Docker image (which pre-populates it at build time, see below).</li>
<li><code>$ETHOSCOPY_TUTORIAL_DATA_DIR</code> if set: useful for shared clusters where one admin-maintained copy serves many users.</li>
<li><code>~/.cache/ethoscopy/tutorial_data/</code>: the default for <code>download_tutorial_data()</code>.</li>
</ol>
<p id="bkmrk-tutorial-data-manual">The canonical URLs live at <a href="https://github.com/gilestrolab/ethoscopy/tree/main/src/ethoscopy/misc/tutorial_data">github.com/gilestrolab/ethoscopy/tree/main/src/ethoscopy/misc/tutorial_data</a>. If you prefer to download manually, place the six <code>*.pkl</code> files in any of the three locations above.</p>
<p id="bkmrk-tutorial-data-callout" class="callout success"><strong>ethoscope-lab users don't need to do anything.</strong> The Docker image (<code>ggilestro/ethoscope-lab:1.2</code> and later) runs <code>download_tutorial_data(dest_dir=package_tutorial_data_dir())</code> during build, so the pickles are already present in the image. JupyterHub users can run the tutorials without any download step.</p>
<h5 id="bkmrk-tutorial-data-history">Why this changed</h5>
<p id="bkmrk-tutorial-data-why">Versions 2.0.0–2.0.4 of the PyPI wheel unintentionally shipped without any tutorial pickles because <code>.gitignore</code> contained a blanket <code>*.pkl</code> rule that Hatchling's default VCS plugin honoured when building the wheel. Users running <code>get_tutorial('overview')</code> got <code>FileNotFoundError</code> with no guidance.
Version 2.0.5 makes the omission explicit, ships a stdlib-only download helper with a user-writable default, surfaces an actionable error message when files are missing, and (for the Docker image) pre-populates them during the build so containerised users never see the download step.</p>
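<p>To check by hand where a tutorial file would be picked up, the three-step lookup order described in the Tutorial data section can be sketched as below. This is only an illustration and not ethoscopy's actual implementation: the helper names <code>candidate_dirs</code> and <code>find_tutorial_file</code> are hypothetical, and <code>package_dir</code> stands in for the <code>&lt;site-packages&gt;/ethoscopy/misc/tutorial_data/</code> folder, which you would have to supply yourself.</p>
<pre><code class="language-python">import os
from pathlib import Path


def candidate_dirs(package_dir, env=None):
    """Return the tutorial-data folders in the documented lookup order.

    Hypothetical helper: package_dir stands in for the
    site-packages/ethoscopy/misc/tutorial_data/ folder.
    """
    env = os.environ if env is None else env
    dirs = [Path(package_dir)]                       # 1. package folder
    override = env.get("ETHOSCOPY_TUTORIAL_DATA_DIR")
    if override:                                     # 2. only if the variable is set
        dirs.append(Path(override))
    # 3. the per-user cache, the default target of download_tutorial_data()
    dirs.append(Path.home() / ".cache" / "ethoscopy" / "tutorial_data")
    return dirs


def find_tutorial_file(filename, package_dir, env=None):
    """Return the first existing copy of filename, or None if it is missing."""
    for directory in candidate_dirs(package_dir, env):
        path = directory / filename
        if path.is_file():
            return path
    return None
</code></pre>
<p>For example, <code>find_tutorial_file("overview_data.pkl", "/path/to/site-packages/ethoscopy/misc/tutorial_data")</code> returns the copy that would take precedence, or <code>None</code> when the file has not been downloaded to any of the three locations.</p>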