extendedMD.bs

extendedMD.bs.extract_modified_bs_sequence(sax_sequence)

This function extracts the modified Behaviour Subsequence (BS) sequence list, which is the original sax word sequence in which every run of consecutive equal sax words is fused into a single sax word.

Parameters: sax_sequence (list of str) – list of original sax words
Returns:
  • bs_sequence (list of str) - list of modified sax words
  • bs_lengths (list of int) - list of lengths of each modified sax word
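
The fusing step described above can be illustrated with a short run-length sketch. This is not the library's code: the name `fuse_repeated_words` is hypothetical, and returning the two lists as a tuple is an assumption based on the Returns entries listed above.

```python
from itertools import groupby


def fuse_repeated_words(sax_sequence):
    """Hypothetical sketch: fuse runs of equal consecutive sax words.

    Returns (bs_sequence, bs_lengths), where bs_lengths[i] is the number
    of original sax words that were fused into bs_sequence[i].
    """
    bs_sequence = []
    bs_lengths = []
    # groupby yields one group per run of equal consecutive items
    for word, run in groupby(sax_sequence):
        bs_sequence.append(word)
        bs_lengths.append(len(list(run)))
    return bs_sequence, bs_lengths


bs_sequence, bs_lengths = fuse_repeated_words(
    ["aab", "aab", "abc", "abc", "abc", "aab"]
)
print(bs_sequence)  # ['aab', 'abc', 'aab']
print(bs_lengths)   # [2, 3, 1]
```

Note that the last `aab` is not merged with the first one: only consecutive repetitions are fused, so the lengths always sum to the length of the original sequence.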
extendedMD.bs.generate_bs_pointers(bs_lengths, bs_size)

This function generates the pointers (i.e. time indexes) of each modified sax word into the original time-series data.

Parameters:
  • bs_lengths (list of int) – list of lengths of each modified sax word
  • bs_size (int) – window size (in the original time-series) of a single sax word
Returns:

list of pointers to the original time-series

Return type:

list of list of int
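
One plausible way to compute such pointers is sketched below. It is an illustration under stated assumptions, not the library's implementation: it assumes the original sax words come from a sliding window that advances one time step per word, so a run of `n` fused words covers `bs_size + n - 1` consecutive points. The name `sketch_bs_pointers` is hypothetical.

```python
def sketch_bs_pointers(bs_lengths, bs_size):
    """Hypothetical sketch: time indexes covered by each modified sax word.

    Assumes a sliding window of bs_size points with a stride of 1, so the
    i-th original sax word starts at time index i.
    """
    bs_pointers = []
    start = 0  # time index where the current run of fused words begins
    for length in bs_lengths:
        # the last fused word starts at start + length - 1 and spans bs_size points
        end = start + length - 1 + bs_size
        bs_pointers.append(list(range(start, end)))
        start += length  # the next run begins `length` windows later
    return bs_pointers


print(sketch_bs_pointers([2, 3, 1], 4))
# [[0, 1, 2, 3, 4], [2, 3, 4, 5, 6, 7], [5, 6, 7, 8]]
```

Under this assumption the pointer lists of neighbouring words overlap, because neighbouring windows share `bs_size - 1` points.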

extendedMD.bs.get_bs_subsequences_dic_list(ts, bs_seq, bs_pointers, subseq_size)

This function extracts a list of all the fixed-size BS subsequences from a BS sequence.

Parameters:
  • ts (list of float) – original 1-d time-series
  • bs_seq (list of str) – list of modified sax words (i.e. BS sequence)
  • bs_pointers (list of list of int) – list of pointers to the original time-series
  • subseq_size (int) – number of sax words in a single BS subsequence
Returns:

list of dictionaries, where each dictionary represents a single BS subsequence. Each dictionary has 4 entries:

  • pattern - the list of sax words related to that BS subsequence
  • pointers - the pointers list to the original time-series related to that subsequence
  • ts - time-series subsequence related to that BS subsequence (a numpy array)
  • bs_position - list of indexes of the bs_seq list related to that BS subsequence

Return type:

list of dict
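
The shape of the returned dictionaries can be illustrated with the following sketch. It is not the library's code: the name `sketch_bs_subsequences` is hypothetical, and it assumes that the subsequences are taken with a sliding window over `bs_seq` and that the `pointers` entry is the sorted, de-duplicated union of the pointers of the words in the subsequence.

```python
import numpy as np


def sketch_bs_subsequences(ts, bs_seq, bs_pointers, subseq_size):
    """Hypothetical sketch: one dictionary per window of subseq_size words."""
    ts = np.asarray(ts, dtype=float)
    subsequences = []
    for start in range(len(bs_seq) - subseq_size + 1):
        positions = list(range(start, start + subseq_size))
        # assumption: merge the per-word pointers into one sorted list
        pointers = sorted({p for pos in positions for p in bs_pointers[pos]})
        subsequences.append({
            "pattern": [bs_seq[pos] for pos in positions],
            "pointers": pointers,
            "ts": ts[pointers],  # integer-list indexing returns a numpy array
            "bs_position": positions,
        })
    return subsequences


ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bs_seq = ["aab", "abc", "aab"]
bs_pointers = [[0, 1, 2, 3, 4], [2, 3, 4, 5, 6, 7], [5, 6, 7, 8]]

for subseq in sketch_bs_subsequences(ts, bs_seq, bs_pointers, subseq_size=2):
    print(subseq["pattern"], subseq["bs_position"], subseq["pointers"])
```

With three modified sax words and `subseq_size=2`, the sketch yields two dictionaries, one for each pair of adjacent words.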