Visualising your data

Once the behavpy object is created, the print function  will just show your data structure. If you want to see your data and the metadata at once, use the built in method .display()

# first load your data and create a behavpy instance of it



You can also get quick summary statistics of your dataset with .summary() 


# an example output of df.summary()
behavpy table with:
    individuals       675
   metavariable         9
      variables        13
   measurements   3370075
# add the argument detailed = True to get information per fly
df.summary(detailed = True)

                               data_points          time_range
2019-08-02_14-21-23_021d6b|01         5756   86400  ->  431940
2019-08-02_14-21-23_021d6b|02         5481   86400  ->  431940

Be careful with the pandas method .groupby()  as this will return a pandas object back and not a behavpy object. Most other common pandas actions will return a behavpy object.

Visualising your data

Whilst summary statistics are good for a basic overview, visualising the variable of interest over time is usually a lot more informative. 


The first port of call when looking at time series data is to create a heatmap to see if there are any obvious irregularities in your experiments.

# To create a heatmap all you need to write is one line of code!
# All plot methods will return the figure, the usual etiquette is to save the variable as fig

fig = df.heatmap('moving') # enter as a string which ever numerical variable you want plotted inside the brackets

# Then all you need to do is the below to generate the figure


Plots over time

For an aggregate view of your variable of interest over time, use the .plot_overtime() method to visualise the mean variable over your given time frame or split it into sub groups using the information in your metadata.

# If wrapped is True each specimens data will be aggregated to one 24 day before being aggregated as a whole. If you want to view each day seperately, keep wrapped False.
# To achieve the smooth plot a moving average is applied, we found averaging over 30 minutes gave the best results
# So if you have your data in rows of 10 seconds you would want the avg_window to be 180 (the default)
# Here the data is rows of 60 seconds, so we only need 30

fig = df.plot_overtime(
variable = 'moving',
wrapped = True,
avg_window = 30
# the plots will show the mean with 95% confidence intervals in a lighter colour around the mean


# You can seperate out your plots by your specimen labels in the metadata. Specify which column you want fromn the metadata with facet_col and then specify which groups you want with facet_args
# What you enter for facet_args must be in a list and be exactly what is in that column in the metadata
# Don't like the label names in the column, rename the graphing labels with the facet_labels parameter. This can only be done if you have a same length list for facet_arg. Also make sure they are the same order

fig = df.plot_overtime(
variable = 'moving',
facet_col = 'species', 
facet_arg = ['D.vir', 'D.ere', 'D.wil', 'D.sec', 'D.yak'],
facet_labels = ['D.virilis', 'D.erecta', 'D.willistoni', 'D.sechellia', 'D.yakuba']

# if you're doing circadian experiments you can specify when night begins with the parameter circadian_night to change the phase bars at the bottom. E.g. circadian_night = 18 for lights off at ZT 18.