User object is composed of a list of records and, optionally, a dictionary of attributes.
A record is stored in the class
|direction||string||whether the user was called/texted (
|correspondent_id||string||identifier of the correspondent|
|datetime||datetime||timestamp of the record|
|call_duration||interaction||duration of the call in seconds or
||a position object
Records are stored as a list, and can be accessed or modified with
User attributes are loaded at the same time as the records. Attributes are
stored in a dictionary that can be access by
>>> user.attributes['age'] = 42 >>> user.attributes['likes_trains'] = True
Object attributes are created by bandicoot when the user's records are loaded:
|has_call||bool||whether call records have been loaded|
|has_text||bool||whether text records have been loaded|
|has_antennas||bool||whether antennas have been loaded|
|has_recharges||bool||whether recharges have been loaded|
|has_gps||bool||whether gps locations have been loaded|
|start_time||datetime||time of the first record|
|end_time||datetime||time of the last record|
|antennas||dict||dictionary of antennas with antenna_id as keys and latlon tuples|
|home||string||the position (antenna id) the user spends the most time at
during the night. Computed using
A lot of the complexity of bandicoot is hidden from the user when writing a new indicator. For example, let's look at the method
from bandicoot.helper.maths import summary_stats @grouping def balance_of_contacts(records): """ Computes the balance of all interactions. For every tie, the balance is the number of outgoing interactions divided by the total number of interactions. """ counter_out = defaultdict(int) counter = defaultdict(int) for r in records: if r.direction == 'out': counter_out[r.correspondent_id] += 1 counter[r.correspondent_id] += 1 balance = [float(counter_out[c]) / float(counter[c]) for c in counter] return summary_stats(balance)
@grouping decorator manages the
groupby keywords for you. It selects the right records (e.g. only calls) and groups them (e.g. by week). By default
interaction=['call','text'] but this can be redefined in the decorator
@grouping(interaction='call'). The function
balance_of_contacts is then called for each group of records and the results are combined.
In this function,
records is thus a subset of
B.records (e.g. only the calls in a specific week).
records is equal to
B.records if the function is called with
The function executes the following operations:
defaultdictfrom the collections module.
forloop then goes over each record passed by the decorator. It counts the total number of interactions and the number of outgoing interactions per contacts.
counter_outis a defaultdict, and
counter_out[c]will return 0 even if
cis not in the dictionary.
summary_stats()which will return the mean and std if
summary=default; the mean, std, median, min, max, kurtosis, skewness if
summary=extended; and the full distribution if
@grouping can return either a number (simply return the value) or a distribution (by calling summary_stats as shown); bandicoot automatically takes both values into account. For example,
number_of_contacts() returns only one number.
A function to compute a new indicator might need to access more than just the list of records. A function might, for example, need to be able to access the GPS coordinate of an antenna or the first record we have available for this user. The method can ask the decorator to also pass the full user object using
@grouping(user_kwd=True). It can then access all the records (user.records), the list of antennas (user.antennas), or other properties (see Object attributes).
@grouping(user_kwd=True) def my_indicator(records, user): pass
First, add it to bandicoot's test suite. bandicoot puts a strong emphasis on the correctness and consistency of its indicators. We thus require the values to be manually computed for the sample users located in
bandicoot/tests/samples/manual/. These manually computed value can then be added to the JSON file also located in
bandicoot/tests/samples/manual/ and tested using:
nosetests -w bandicoot/tests -v
The new metric can be integrated to the default bandicoot pipeline by adding it to