Loading the data
Setting up
To begin you need three paths saved as variables:
- the path to the metadata .csv file
- the full path (including folder) to the FTP location (e.g. ftp://myethoscopedata.com/results/)
- the path of a local folder to save the downloaded .db files (if your .db files are already on the same machine running ethoscopy, this is the path to the folder containing them)
import ethoscopy as etho
# Replace with your own file/server paths
meta_loc = 'USER/experiment_folder/metadata.csv'
remote = 'ftp://ftpserver/auto_generated_data/ethoscope_results'
local = 'USER/ethoscope_results'
# This will download the data remotely via FTP onto the local machine
# If your ethoscopy is running via ethoscope-lab on the same machine
# where the ethoscope data are, then this step is not necessary
etho.download_from_remote_dir(meta_loc, remote, local)
download_from_remote_dir imports data from the ethoscope node platform to your local directory for later use. The ethoscope files must be stored on a remote FTP server as .db files; see the Ethoscope manual for how to set up a node correctly: https://www.notion.so/giorgiogilestro/Ethoscope-User-Manual-a9739373ae9f4840aa45b277f2f0e3a7
This only needs to be run if, like the Gilestro lab, you have all your data saved to a remote FTP server. If not, you can skip straight to the next part.
Create a modified metadata DataFrame
link_meta_index creates a modified metadata DataFrame with the paths of the saved .db files and generates a unique id for each experimental individual. It only works when the data are saved locally on the computer you are running on, in the nested directory structure created by the ethoscopes, i.e.
'LOCAL_DIR/ethoscope_results/00190f0080e54d9c906f304a8222fa8c/ETHOSCOPE_001/2022-08-23_03-33-59/DATABASE.db'
For this function you only need the path to the metadata file and the path to the top level of your database directories, as seen in the example below. Do not provide a path directly to the folder containing a specific .db file; the function searches all the saved data directories and selects the ones that match the metadata file.
meta_loc = 'USER/experiment_folder/metadata.csv'
local = 'USER/ethoscope_results' # remember to just provide the path up to the directory where the individual ethoscope files are saved
meta = etho.link_meta_index(meta_loc, local)
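Before linking, it can help to confirm that the results folder really has the nested layout described above. The following is a minimal sketch using only the standard library. The helper `find_db_files` is hypothetical (it is not part of ethoscopy), and it assumes the layout `results_dir/<machine_id>/<machine_name>/<date_time>/<file>.db` seen in the example path.

```python
from pathlib import Path
import tempfile


def find_db_files(results_dir):
    """Return all .db files stored three folders below results_dir.

    Hypothetical helper (not part of ethoscopy). Assumes the layout
    results_dir/<machine_id>/<machine_name>/<date_time>/<file>.db
    """
    return sorted(Path(results_dir).glob('*/*/*/*.db'))


# Build a throwaway directory tree that mimics the ethoscope layout
with tempfile.TemporaryDirectory() as tmp:
    run_dir = (Path(tmp) / 'ethoscope_results'
               / '00190f0080e54d9c906f304a8222fa8c'
               / 'ETHOSCOPE_001' / '2022-08-23_03-33-59')
    run_dir.mkdir(parents=True)
    (run_dir / 'DATABASE.db').touch()

    found = find_db_files(Path(tmp) / 'ethoscope_results')
    found_names = [p.name for p in found]
    # The machine name is the folder two levels above the .db file
    found_machines = [p.parts[-3] for p in found]
```

If the list comes back empty for your own folder, the path you passed is probably one level too deep or too shallow.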
Load and modify the ethoscope data
The load function takes the raw ethoscope data from its .db format and converts it into a workable pandas DataFrame, changing the time (in seconds) to be relative to a given hour (usually lights on). Min and max times can be provided to keep only recordings between those hours; here 0 refers to the start of the experiment, not the reference hour.
data = etho.load_ethoscope(meta, min_time = 24, max_time = 48, reference_hour = 9.0)
# You can cache each specimen as the data is loaded for faster load times when run again: just add a file path to a folder of your choice. The first time it will save; the second time it will search the folder and load straight from there
# However, this can take up a lot of memory, so it's recommended to save the whole loaded dataset at the end and load from that each time. See the end of this page
data = etho.load_ethoscope(meta, min_time = 24, max_time = 48, reference_hour = 9.0, cache = 'path/ethoscope_cache/')
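To make the two time conventions concrete, here is a small pandas sketch of the idea. It is not ethoscopy's implementation: the column name `t`, the helper `rereference`, and the `start_hour` argument (the clock time at which recording began, as a decimal hour) are all assumptions for illustration. The sketch also chooses to wrap the offset into the 0 to 24 hour range so that times stay non-negative.

```python
import pandas as pd


def rereference(df, start_hour, reference_hour, min_time=None, max_time=None):
    """Filter by hours since the experiment start, then shift t so that
    t = 0 falls on the most recent reference_hour before recording began.

    Hypothetical helper; t is in seconds, all other arguments in hours.
    """
    keep = pd.Series(True, index=df.index)
    if min_time is not None:
        keep &= df['t'] >= min_time * 3600
    if max_time is not None:
        keep &= df['t'] <= max_time * 3600
    out = df[keep].copy()
    # Seconds elapsed between the reference hour and the start of recording
    offset = ((start_hour - reference_hour) % 24) * 3600
    out['t'] = out['t'] + offset
    return out.reset_index(drop=True)


# One reading every 12 hours over three days, recording started at 03:00
raw = pd.DataFrame({'t': [h * 3600 for h in range(0, 73, 12)]})
shifted = rereference(raw, start_hour=3.0, reference_hour=9.0,
                      min_time=24, max_time=48)
hours = (shifted['t'] / 3600).tolist()
```

In this toy example the filter keeps the readings taken 24, 36 and 48 hours after the start, and the shift then expresses them as hours since 09:00.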
Additionally, an analysing function can also be called to modify the data as it is read. It's recommended that you always call at least the max_velocity_detector or sleep_annotation function when loading the data, as they generate columns that are needed by the analysis and plotting methods.
from functools import partial
data = etho.load_ethoscope(meta, reference_hour = 9.0, FUN = partial(etho.sleep_annotation, time_window_length = 60, min_time_immobile = 300))
# time_window_length is the amount of time each row represents. The ethoscope can record multiple times per second, so you can go as low as 10 seconds for this.
# The default for time_window_length is 10 seconds
# min_time_immobile is your sleep criterion; 300 seconds is 5 mins, the general rule of sleep for flies, see Hendricks et al., 2000.
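The immobility rule can be illustrated independently of ethoscopy. The sketch below is a stand-in under stated assumptions, not the library's code: it takes a boolean `moving` column with one row per time window and marks a row as asleep when it belongs to a run of immobility lasting at least `min_time_immobile` seconds. The helper name `annotate_sleep` and the column names are hypothetical.

```python
import pandas as pd


def annotate_sleep(df, time_window_length=60, min_time_immobile=300):
    """Add a boolean 'asleep' column: True for rows inside a run of
    consecutive non-moving windows lasting >= min_time_immobile seconds.

    Hypothetical helper for illustration; not ethoscopy's implementation.
    """
    out = df.copy()
    # A new run starts whenever 'moving' changes value
    run_id = (out['moving'] != out['moving'].shift()).cumsum()
    # Duration, in seconds, of the run each row belongs to
    run_seconds = (out.groupby(run_id)['moving'].transform('size')
                   * time_window_length)
    out['asleep'] = (~out['moving']) & (run_seconds >= min_time_immobile)
    return out


# Ten 60-second windows: 3 immobile, 1 moving, 5 immobile, 1 moving
moving = [False] * 3 + [True] + [False] * 5 + [True]
windows = pd.DataFrame({'t': range(0, 600, 60), 'moving': moving})
annotated = annotate_sleep(windows)
asleep = annotated['asleep'].tolist()
```

Only the five-window run reaches the 300-second threshold, so the first three immobile windows are not counted as sleep.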
Ethoscopy has 3 general functions that can be called whilst loading:
- max_velocity_detector: Aggregates variables per the given time window, finding their means. sleep_annotation uses this function before finding sleep bouts, so use this when you don't need to know the sleep bouts.
- sleep_annotation: Aggregates per time window and generates a new boolean column of sleep, as given by the time immobile argument.
- isolate_activity_lengths: Finds consecutive runs of inactivity or activity, filter by the intervals column and provide a window to contain the variables from prior to the start of the run.
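Aggregating per time window can be pictured as binning the time column and averaging within each bin. A minimal pandas sketch of that idea follows; the column names `t` and `velocity` and the helper `bin_means` are assumptions for illustration, not ethoscopy's own code.

```python
import pandas as pd


def bin_means(df, time_window_length=10):
    """Floor t to the start of its window, then average every other
    column within each window. Hypothetical helper for illustration.
    """
    out = df.copy()
    out['t'] = (out['t'] // time_window_length) * time_window_length
    return out.groupby('t', as_index=False).mean()


# Irregularly timed readings (t in seconds)
raw = pd.DataFrame({
    't': [0.5, 3.0, 9.5, 10.0, 14.0, 25.0],
    'velocity': [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],
})
binned = bin_means(raw, time_window_length=10)
bin_starts = binned['t'].tolist()
bin_velocity = binned['velocity'].tolist()
```

Each output row then represents one window, which is what the time_window_length argument controls.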
Ethoscopy also has 2 functions for use with mAGO ethoscope module (odour delivery and mechanical stimulation):
- puff_mago: Finds the interaction times and then searches a given window post interaction for movement.
- find_motifs: A modification of puff_mago, the function finds all interaction times and their responses whilst retaining all the previous variables' information in a given time window.
Saving the data
Loading the ethoscope data each time can be a long process depending on the length of the experiment and the number of machines. It's recommended to save the loaded/modified DataFrame as a pickle (.pkl) file. See the pandas documentation for more information about pickle saves. The saved behavpy object can then be loaded instantly at the start of a new session!
# Save any behavpy or pandas object with the method below
import pandas as pd
df.to_pickle('path/behavpy.pkl') # replace string with your file location/path
# Load the saved pickle file like this. It will retain all the metadata information
df = pd.read_pickle('path/behavpy.pkl')