<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Marco</title>
    <description>The latest articles on DEV Community by Marco (@marcopeix).</description>
    <link>https://dev.to/marcopeix</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2711372%2F2c111140-884a-473d-b4f5-7248e6943f25.png</url>
      <title>DEV Community: Marco</title>
      <link>https://dev.to/marcopeix</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/marcopeix"/>
    <language>en</language>
    <item>
      <title>The Complete Introduction to Time Series Classification in Python</title>
      <dc:creator>Marco</dc:creator>
      <pubDate>Tue, 14 Jan 2025 21:22:28 +0000</pubDate>
      <link>https://dev.to/marcopeix/the-complete-introduction-to-time-series-classification-in-python-108g</link>
      <guid>https://dev.to/marcopeix/the-complete-introduction-to-time-series-classification-in-python-108g</guid>
      <description>&lt;p&gt;Photo by Jordan Whitt on Unsplash&lt;/p&gt;

&lt;p&gt;Time series data is omnipresent in many industries, and while forecasting time series is widely addressed, classifying time series data is often overlooked.&lt;/p&gt;

&lt;p&gt;In this article, we get a complete introduction to the field of time series classification, exploring its real-life applications, getting an overview of the different methods and applying some of them in a small classification project using Python.&lt;/p&gt;

&lt;p&gt;Let’s get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining time series classification
&lt;/h2&gt;

&lt;p&gt;Time series classification is a field of supervised machine learning where one or more features are measured across time and used to assign a category.&lt;/p&gt;

&lt;p&gt;Therefore, the goal in classifying time series is to assign a label rather than predict the future value of the series.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use cases for time series classification
&lt;/h2&gt;

&lt;p&gt;Time series classification is mostly used with sensor data. For example, we can monitor different pieces of equipment and perform predictive maintenance, predicting whether a failure is likely to occur.&lt;/p&gt;

&lt;p&gt;It is also used in healthcare, for example to analyze electrocardiogram (ECG) data. A model can analyze the recorded pattern to determine whether a patient is healthy.&lt;/p&gt;

&lt;p&gt;Furthermore, time series classification is used for speech recognition. Spoken words can be captured as a sound wave over time, and classification models can determine which words were spoken and even identify the speaker.&lt;/p&gt;

&lt;p&gt;Another application is in food spectroscopy, where a classification model is applied to the spectroscopy data to determine the alcohol content of a beverage, or identify different components of food products.&lt;/p&gt;

&lt;p&gt;Finally, it is used in cybersecurity, where a model can identify patterns of abnormal activity, signalling potential fraud or a breach.&lt;/p&gt;

&lt;p&gt;As we can see, the applications of time series classification are significant in many fields and industries, making it an indispensable tool to have for any data scientist.&lt;/p&gt;
&lt;h2&gt;
  
  
  Overview of time series classification models
&lt;/h2&gt;

&lt;p&gt;There are many different approaches to time series classification. In this section, we get an overview of each approach, broadly explaining its inner workings and listing the main models.&lt;/p&gt;

&lt;p&gt;For a more detailed breakdown of each method, including how they work and their speed of inference, consult &lt;a href="https://www.datasciencewithmarco.com/pl/2148619893" rel="noopener noreferrer"&gt;this guide&lt;/a&gt; on time series classification.&lt;/p&gt;
&lt;h3&gt;
  
  
  Distance-based models
&lt;/h3&gt;

&lt;p&gt;These models rely on a distance metric to classify samples. The most common metric is the Euclidean distance. &lt;/p&gt;

&lt;p&gt;Dynamic time warping (DTW) is a more robust distance measure, as it finds the optimal match between each point of two series, allowing it to handle series of different lengths and recognize patterns that are slightly out of phase, as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu37b7edyt2a4zf3f668.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu37b7edyt2a4zf3f668.png" alt="Image description" width="800" height="596"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice that the blue series has more points than the red series, and that the best matches are shown by dashed lines. Image by the author.&lt;/p&gt;

&lt;p&gt;Distance-based models include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;K-nearest neighbors (KNN)&lt;/li&gt;
&lt;li&gt;ShapeDTW&lt;/li&gt;
&lt;/ul&gt;
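&lt;p&gt;To make the idea concrete, below is a minimal sketch of the DTW distance using the textbook dynamic-programming recurrence. This is an illustration only, not &lt;code&gt;sktime&lt;/code&gt;'s optimized implementation, and the function name is mine:&lt;/p&gt;

```python
import numpy as np

def dtw_distance(a, b):
    """Minimal dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    # cost[i, j] holds the best cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # A step can advance in a, in b, or in both (the "warping")
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]
```

&lt;p&gt;Because the warping path can match one point to several, two identically shaped series of different lengths can still get a distance of zero.&lt;/p&gt;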
&lt;h3&gt;
  
  
  Dictionary-based models
&lt;/h3&gt;

&lt;p&gt;These models encode patterns in the series using symbols, and then use the frequency of occurrence of each symbol to classify time series.&lt;/p&gt;

&lt;p&gt;Dictionary-based models include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BOSS&lt;/li&gt;
&lt;li&gt;WEASEL&lt;/li&gt;
&lt;li&gt;TDE&lt;/li&gt;
&lt;li&gt;MUSE&lt;/li&gt;
&lt;/ul&gt;
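&lt;p&gt;As a toy illustration of the symbol-and-count idea (the real BOSS and WEASEL pipelines use a Fourier-based symbolization, and these function names are mine), a series can be discretized into letters and its sliding-window "words" counted:&lt;/p&gt;

```python
import numpy as np
from collections import Counter

def symbolize(series, alphabet="abc"):
    """Map each point to a letter by equal-width binning (toy symbolization)."""
    bins = np.linspace(series.min(), series.max(), len(alphabet) + 1)[1:-1]
    return "".join(alphabet[i] for i in np.digitize(series, bins))

def word_counts(series, word_len=3):
    """Count sliding-window words; a dictionary-based model classifies
    series based on these word frequencies."""
    s = symbolize(np.asarray(series, dtype=float))
    words = [s[i:i + word_len] for i in range(len(s) - word_len + 1)]
    return Counter(words)
```

&lt;p&gt;For example, a series that rises from low to high values produces words like &lt;code&gt;"aab"&lt;/code&gt; and &lt;code&gt;"bcc"&lt;/code&gt;, and their frequencies become the feature vector.&lt;/p&gt;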
&lt;h3&gt;
  
  
  Ensemble methods
&lt;/h3&gt;

&lt;p&gt;These methods are not models, but rather protocols used with other estimators.&lt;/p&gt;

&lt;p&gt;Basically, they involve taking multiple base estimators and combining their predictions to get a final prediction.&lt;/p&gt;

&lt;p&gt;The main advantage of ensemble methods is that they can take a univariate model and apply it to a multivariate dataset.&lt;/p&gt;

&lt;p&gt;With the bagging technique, a univariate model can be trained on each feature of a dataset, and we can then combine the predictions from all estimators, effectively using information from all features.&lt;/p&gt;

&lt;p&gt;Methods include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bagging&lt;/li&gt;
&lt;li&gt;Weighted ensemble&lt;/li&gt;
&lt;li&gt;Time series forest&lt;/li&gt;
&lt;/ul&gt;
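&lt;p&gt;The combination step can be as simple as a majority vote across the base estimators. A minimal sketch (illustrative names, not &lt;code&gt;sktime&lt;/code&gt;'s API):&lt;/p&gt;

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the label predictions of several base estimators
    (one row per estimator, one column per sample) by majority vote."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]
```

&lt;p&gt;With three estimators predicting &lt;code&gt;["a", "b"]&lt;/code&gt;, &lt;code&gt;["a", "b"]&lt;/code&gt; and &lt;code&gt;["b", "b"]&lt;/code&gt;, the final prediction is &lt;code&gt;["a", "b"]&lt;/code&gt;.&lt;/p&gt;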
&lt;h3&gt;
  
  
  Feature-based methods
&lt;/h3&gt;

&lt;p&gt;Once again, this group represents methods rather than models: they extract different features from a time series, and those features are then used to train any machine learning model for classification.&lt;/p&gt;

&lt;p&gt;Feature-based methods include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summary features (min, max, mean, median, etc.)&lt;/li&gt;
&lt;li&gt;Catch22&lt;/li&gt;
&lt;li&gt;Matrix profile&lt;/li&gt;
&lt;li&gt;TSFresh&lt;/li&gt;
&lt;/ul&gt;
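&lt;p&gt;As an illustration, summary features can be computed directly with NumPy on a &lt;code&gt;(num_samples, num_features, num_timesteps)&lt;/code&gt; array and fed to any tabular classifier. This is a sketch of the idea, not &lt;code&gt;sktime&lt;/code&gt;'s transformers, and the function name is mine:&lt;/p&gt;

```python
import numpy as np

def summary_features(X):
    """Collapse a (num_samples, num_features, num_timesteps) panel into
    per-feature summary statistics usable by any tabular classifier."""
    stats = [X.min(axis=2), X.max(axis=2), X.mean(axis=2), np.median(X, axis=2)]
    return np.concatenate(stats, axis=1)  # shape: (num_samples, 4 * num_features)
```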
&lt;h3&gt;
  
  
  Interval-based models
&lt;/h3&gt;

&lt;p&gt;These models extract multiple intervals from time series and compute features, using the methods listed above. These features are then used to train a classifier.&lt;/p&gt;

&lt;p&gt;Such models include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RISE&lt;/li&gt;
&lt;li&gt;CIF&lt;/li&gt;
&lt;li&gt;DrCIF&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Kernel-based models
&lt;/h3&gt;

&lt;p&gt;With kernel-based models, a kernel function is applied to map the series to a higher-dimensional space where the classes are easier to separate. &lt;/p&gt;

&lt;p&gt;Common kernels include the RBF kernel and the convolutional kernel.&lt;/p&gt;

&lt;p&gt;Example models are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Support vector classifier (SVC)&lt;/li&gt;
&lt;li&gt;Rocket&lt;/li&gt;
&lt;li&gt;Arsenal (an ensemble of Rocket)&lt;/li&gt;
&lt;/ul&gt;
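&lt;p&gt;For instance, an RBF-kernel SVC can classify series once each one is flattened into a tabular row. The snippet below is a toy sketch on synthetic data (flat series versus upward-trending series), not part of the project dataset:&lt;/p&gt;

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Toy binary task: noisy flat series vs. noisy upward-trending series,
# each series flattened into one tabular row of 50 time steps
flat = rng.normal(0.0, 0.1, size=(20, 50))
trend = rng.normal(0.0, 0.1, size=(20, 50)) + np.linspace(0.0, 1.0, 50)
X = np.vstack([flat, trend])
y = np.array([0] * 20 + [1] * 20)

# The RBF kernel implicitly maps rows to a space where the classes separate
clf = SVC(kernel="rbf").fit(X, y)
print(clf.score(X, y))
```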
&lt;h3&gt;
  
  
  Shapelet classifier
&lt;/h3&gt;

&lt;p&gt;A shapelet classifier relies on extracting shapelets: the most discriminative subsequences of a time series. &lt;/p&gt;

&lt;p&gt;The distance between the shapelet and a particular series is then used for classification.&lt;/p&gt;
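&lt;p&gt;A sketch of that distance computation, assuming a shapelet has already been extracted (illustrative only, not &lt;code&gt;sktime&lt;/code&gt;'s shapelet transform):&lt;/p&gt;

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between a shapelet and any
    equally long subsequence of the series."""
    shapelet = np.asarray(shapelet, dtype=float)
    # One row per subsequence of the same length as the shapelet
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series, dtype=float), len(shapelet)
    )
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())
```

&lt;p&gt;A series containing the shapelet somewhere gets a distance of zero, which is what makes discriminative subsequences useful for classification.&lt;/p&gt;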
&lt;h3&gt;
  
  
  Meta classifier
&lt;/h3&gt;

&lt;p&gt;Finally, the meta classifier combines different methods listed above, ensembling them to produce a robust classifier that can be used with virtually any series and achieve good performance.&lt;/p&gt;

&lt;p&gt;HIVE-COTE is an example of a meta classifier that combines TDE, Shapelet, DrCIF and Arsenal. &lt;/p&gt;

&lt;p&gt;While this model looks like a universal solution to classification, it is very slow to train, and other methods are worth testing before resorting to HIVE-COTE.&lt;/p&gt;

&lt;p&gt;As you can see, there is a vast array of methods for time series classification. Some are faster than others, and some handle only univariate data. &lt;/p&gt;

&lt;p&gt;Knowing each method’s strength and inner workings is key in building the best classification model for your particular scenario. However, a deep dive into each method is outside the scope of this article, as this is meant to give an introduction to classification and get you some hands-on experience.&lt;/p&gt;

&lt;p&gt;As such, let’s apply some classification models in a small project using Python.&lt;/p&gt;
&lt;h2&gt;
  
  
  Hands-on time series classification project
&lt;/h2&gt;

&lt;p&gt;In this section, we apply some techniques listed above to a classification task.&lt;/p&gt;

&lt;p&gt;Here, we use the BasicMotions dataset, donated by Jack Clements to the UEA archive and publicly accessible under the GPL license. &lt;/p&gt;

&lt;p&gt;This dataset compiles data from four students wearing a smartwatch and performing different activities: standing, walking, running and playing badminton. &lt;/p&gt;

&lt;p&gt;The watch has an accelerometer and a gyroscope that recorded data along three axes (x, y, z), resulting in six features in total. However, the dataset is rather small, with only 40 training samples and 40 test samples.&lt;/p&gt;

&lt;p&gt;The objective is then to classify the activity being performed from the data collected by the accelerometer and gyroscope. Here, we implement the K-nearest neighbor algorithm and use bagging along with WEASEL to see which approach performs best.&lt;/p&gt;

&lt;p&gt;The full source code is available on &lt;a href="https://github.com/marcopeix/time-series-analysis/blob/master/intro_time_series_classification.ipynb" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Initial setup
&lt;/h3&gt;

&lt;p&gt;First, we import the required packages for time series classification. &lt;/p&gt;

&lt;p&gt;For this task, I think that &lt;code&gt;sktime&lt;/code&gt; is the best option, as it implements a comprehensive list of classification methods through a familiar interface that mimics &lt;code&gt;scikit-learn&lt;/code&gt;. It also plays very well with &lt;code&gt;scikit-learn&lt;/code&gt;, making it easy to evaluate our models and use other machine learning models for time series classification.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sktime.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_basic_motions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GridSearchCV&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KFold&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we read the dataset. Since this is a common dataset to get started with time series classification, it is also available through the &lt;code&gt;sktime&lt;/code&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_basic_motions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;numpy3D&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_basic_motions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;numpy3D&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we specify &lt;code&gt;return_type='numpy3D'&lt;/code&gt;. This is the most flexible data format for time series classification.&lt;/p&gt;

&lt;p&gt;We can print out the shape of &lt;code&gt;X_train&lt;/code&gt; and get &lt;code&gt;(40, 6, 100)&lt;/code&gt;. The shape corresponds to (num_samples, num_features, num_timesteps). Thus, we see that &lt;code&gt;X_train&lt;/code&gt; has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;40 samples&lt;/li&gt;
&lt;li&gt;6 features&lt;/li&gt;
&lt;li&gt;Each feature is measured across 100 time steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, we can visualize our data to see the difference in patterns between each activity. Below, we show the difference between walking and playing badminton.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;series_indices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;categories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;standing&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;running&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;walking&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;badminton&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accel_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accel_2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accel_3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gyro_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gyro_2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gyro_3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;selected_series&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;series_indices&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_series&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;selected_series&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Category: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;set_xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Time Steps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;set_ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Values&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tight_layout&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Comparing walking and playing badminton. Notice how the accelerometer data shows bursts when playing badminton, unlike during walking. Image by the author.&lt;/p&gt;

&lt;p&gt;In the figure above, we can see very clear patterns for each activity. For example, the accelerometer displays short bursts when a person is playing badminton, something we do not observe during walking.&lt;/p&gt;

&lt;p&gt;The idea is now to feed those features measured across time to a machine learning model and see if it can correctly classify each activity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Classification with KNN
&lt;/h3&gt;

&lt;p&gt;One of the simplest methods we can use is a distance-based model like K-nearest neighbors (KNN). &lt;/p&gt;

&lt;p&gt;Again, this method uses a distance metric, like the Euclidean distance or dynamic time warping (DTW), and assigns the label of the sample that has the shortest distance to a given series.&lt;/p&gt;

&lt;p&gt;Thus, as a small experiment, let’s tune KNN to determine which distance metric is best to use between Euclidean and DTW.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sktime.classification.distance_based&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KNeighborsTimeSeriesClassifier&lt;/span&gt;

&lt;span class="n"&gt;knn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KNeighborsTimeSeriesClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_neighbors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;euclidean&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;dtw&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;tuned_knn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GridSearchCV&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;knn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;cv&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;KFold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_splits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tuned_knn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_pred_knn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tuned_clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tuned_knn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_params_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, the best distance metric to use is DTW, and we already made predictions using this optimal configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Classification with bagging and WEASEL
&lt;/h3&gt;

&lt;p&gt;Next, let’s use bagging along with WEASEL to classify our dataset. &lt;/p&gt;

&lt;p&gt;WEASEL is a dictionary-based model, meaning that it encodes patterns in the series as a bag of words. &lt;/p&gt;

&lt;p&gt;For example, an increasing trend might be encoded as “aaa” while a decreasing trend can be encoded as “aab”. The frequencies of these words are then used to train a model and make predictions.&lt;/p&gt;

&lt;p&gt;However, WEASEL is a univariate model, meaning that it can only process a single feature. Since our dataset has six features, the model might miss important information, resulting in poor performance.&lt;/p&gt;

&lt;p&gt;To solve that, we can use bagging. With this technique, we can train multiple base estimators and combine them to get a final prediction. &lt;/p&gt;

&lt;p&gt;In this case specifically, we can train six different WEASEL models that will specialize in each feature. We can then combine the predictions of all individual models to get the final label.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sktime.classification.ensemble&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaggingClassifier&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sktime.classification.dictionary_based&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WEASEL&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.preprocessing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LabelEncoder&lt;/span&gt;

&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LabelEncoder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_train_encoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;base_clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WEASEL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alphabet_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;support_probabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BaggingClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_clf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;n_estimators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# there are 6 features in total 
&lt;/span&gt;    &lt;span class="n"&gt;n_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_pred_bagging&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_pred_bagging&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inverse_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_pred_bagging&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the code block above, we first label-encode our target, as it is a requirement for using bagging. &lt;/p&gt;

&lt;p&gt;Then, notice that we specify &lt;code&gt;alphabet_size=3&lt;/code&gt;. This determines how many distinct letters are available to encode patterns. A larger alphabet can encode more complicated patterns, but a value of 3 is a reasonable starting point. &lt;/p&gt;

&lt;p&gt;Also, when used with bagging, we must set &lt;code&gt;support_probabilities=True&lt;/code&gt;. The probabilities of each individual model are then combined to get the final prediction.&lt;/p&gt;

&lt;p&gt;Once the base estimator is defined, we initialize the &lt;code&gt;BaggingClassifier&lt;/code&gt; and specify &lt;code&gt;n_estimators=6&lt;/code&gt;, since there are six features in the dataset and each estimator will consider &lt;code&gt;n_features=1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Using this method, we have successfully applied a univariate model on a multivariate dataset and used all the information to perform classification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Evaluation
&lt;/h3&gt;

&lt;p&gt;Having used two different approaches for classification, let’s evaluate both of them to see which performs best.&lt;/p&gt;

&lt;p&gt;We can display the classification report to get a breakdown of the performance for each class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;classification_report&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f1_score&lt;/span&gt;

&lt;span class="n"&gt;knn_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;classification_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred_knn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zero_division&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;bagging_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;classification_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred_bagging&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
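
&lt;p&gt;Since the comparison below uses weighted F1-scores, we can also compute that single summary number with &lt;code&gt;f1_score&lt;/code&gt;. A small sketch on toy labels (in practice, pass &lt;code&gt;y_test&lt;/code&gt; together with &lt;code&gt;y_pred_knn&lt;/code&gt; or &lt;code&gt;y_pred_bagging&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from sklearn.metrics import f1_score

# Toy labels standing in for y_test and a model's predictions
y_true = ["walking", "walking", "running", "badminton"]
y_pred = ["walking", "running", "running", "badminton"]

# Weighted average: per-class F1 weighted by each class's support
score = f1_score(y_true, y_pred, average="weighted")
# score == 0.75 for these toy labels
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;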



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1xx7nvly6xgka8fyiku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1xx7nvly6xgka8fyiku.png" alt="Image description" width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Classification report of KNN. Notice that the model fails to predict the badminton class.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftr6kw9wwdplbv0sfjzh8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftr6kw9wwdplbv0sfjzh8.png" alt="Image description" width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Classification report for bagging with WEASEL. This model performs best, with an F1-score of 0.92. Image by the author.&lt;/p&gt;

&lt;p&gt;Looking at both reports above, we notice that the KNN model completely fails to predict the badminton class, resulting in poor overall performance.&lt;/p&gt;

&lt;p&gt;However, bagging with WEASEL yields very good results: it perfectly labels the walking activity and achieves an F1-score of 0.92.&lt;/p&gt;

&lt;p&gt;We can optionally visualize both F1-scores in the figure below, which compares the weighted F1-scores of both approaches. Bagging with WEASEL is the best model.&lt;/p&gt;
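
&lt;p&gt;A minimal sketch of such a comparison chart with matplotlib follows. Note that the KNN value here is a hypothetical placeholder (only the 0.92 bagging score is reported above); substitute the scores computed from your own predictions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

# 0.92 is the bagging score reported above; the KNN value is a
# hypothetical placeholder -- replace both with your computed scores
scores = {"KNN": 0.60, "Bagging + WEASEL": 0.92}

fig, ax = plt.subplots()
ax.bar(scores.keys(), scores.values())
ax.set_ylabel("Weighted F1-score")
ax.set_ylim(0, 1)

# Annotate each bar with its value
for i, value in enumerate(scores.values()):
    ax.text(i, value + 0.02, f"{value:.2f}", ha="center")

fig.savefig("f1_comparison.png")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;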

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fauon307ltletv2tvuouq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fauon307ltletv2tvuouq.png" alt="Image description" width="800" height="695"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once again, the figure above shows that bagging with WEASEL yields the best results, with an F1-score of 0.92.&lt;/p&gt;

&lt;p&gt;This highlights the benefit of bagging a univariate model: it lets the model capture information from all features, resulting in strong performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we introduced the field of time series classification. &lt;/p&gt;

&lt;p&gt;We discovered some of its real-life applications in healthcare, cybersecurity and predictive maintenance and got an overview of the different methods used for time series classification.&lt;/p&gt;

&lt;p&gt;We then completed our first classification project using KNN and bagging with WEASEL. Keep in mind that this project is meant to get us started in time series classification, as there is of course much more to discover.&lt;/p&gt;

&lt;p&gt;Thanks for reading and I hope that you learned something new!&lt;/p&gt;

&lt;p&gt;Cheers 🍻&lt;/p&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;To keep learning about time series classification, leave a comment to let me know that you want me to cover more on the subject.&lt;/p&gt;

&lt;p&gt;Also, you can download my &lt;a href="https://www.datasciencewithmarco.com/pl/2148619893" rel="noopener noreferrer"&gt;free guide on time series classification&lt;/a&gt;: a reference covering all available methods, how they work, their inference speed, and data sources for practicing time series classification.&lt;/p&gt;

&lt;p&gt;Finally, if you are serious about mastering time series classification, check out my course: &lt;a href="https://www.datasciencewithmarco.com/store" rel="noopener noreferrer"&gt;Time Series Classification in Python&lt;/a&gt;. This is the most complete course on the subject, covering both machine learning and deep learning methods in detail, along with guided capstone projects with real-life datasets.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;BasicMotions dataset — &lt;a href="http://www.timeseriesclassification.com/description.php?Dataset=BasicMotions" rel="noopener noreferrer"&gt;http://www.timeseriesclassification.com/description.php?Dataset=BasicMotions&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sktime — &lt;a href="https://www.sktime.net/en/stable/" rel="noopener noreferrer"&gt;https://www.sktime.net/en/stable/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>timeseries</category>
      <category>datascience</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
