Feature Extraction of IoT Sensor Data

Feature extraction is an important step in the IoT machine learning process: it transforms the temporal data describing a machine component's state into a format supported by machine learning algorithms. The extracted features need to be informative, i.e., they need to carry information that can contribute to the prediction.

Because IoT sensor data is temporal, there are some common patterns for extracting features from it. This blog post introduces three common types of feature extraction patterns for IoT data:

  • Window-based descriptive statistics
  • Seasonal pattern
  • Trend pattern

  1. Window-based descriptive statistics

A single message from an IoT sensor carries information about the state of a machine component at one point in time. On its own, this single piece of information carries little meaning for a machine learning prediction. However, descriptive statistics of a sequence of sensor data within a time window can offer valuable information for the prediction. For example, the count of exceptions occurring on a machine component in the last 7 days can be an indicator of potential failure of the machine in the next 7 days.
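The exception-count example can be sketched in a few lines of Python. This is only an illustration: the function name `count_events_in_window`, the timestamps, and the choice of a half-open window (start excluded, end included) are assumptions, not part of any particular IoT platform.

```python
from datetime import datetime, timedelta

def count_events_in_window(event_times, as_of, window=timedelta(days=7)):
    """Count events t with as_of - window < t <= as_of."""
    start = as_of - window
    return sum(1 for t in event_times if start < t <= as_of)

# Hypothetical exception timestamps for one machine component.
exception_times = [
    datetime(2024, 1, 1, 8, 0),
    datetime(2024, 1, 5, 14, 30),
    datetime(2024, 1, 7, 9, 15),
    datetime(2024, 1, 9, 23, 45),
]

# Feature value as of midnight on 10 January: exceptions in the last 7 days.
exceptions_last_7_days = count_events_in_window(
    exception_times, as_of=datetime(2024, 1, 10)
)
print(exceptions_last_7_days)
```

Computing the same count at several reference times (for example once per day) would give one feature value per training row.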

Descriptive statistics can describe the shape of the distribution (e.g., skewness and kurtosis), the central tendency (e.g., mean, median, and mode), and the dispersion (e.g., standard deviation, variance, and range) of the sensor measurements within a given time window. A combination of descriptive statistics can often provide more accurate information. For example, having both the mean and the standard deviation of the sensor measurements in a time window gives a more accurate picture of a machine component's state than having only one of them.
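As a rough sketch, the statistics named above can be computed for one window with Python's standard `statistics` module. The function name `window_statistics` and the sample readings are hypothetical, and the skewness and kurtosis here are the simple population (moment-based) versions; other definitions exist.

```python
import statistics

def window_statistics(values):
    """Descriptive statistics for the measurements in one time window."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std > 0:
        # Standardized third and fourth central moments.
        skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
        kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    else:
        # Both are undefined when all values in the window are equal.
        skewness = kurtosis = float("nan")
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": std,
        "variance": std ** 2,
        "range": max(values) - min(values),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }

# Hypothetical temperature readings from one component in one window.
readings = [70.2, 70.8, 71.1, 70.5, 78.9]
features = window_statistics(readings)
print(features)
```

The returned dictionary can be used directly as one row of features for the window.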

Many window-based descriptive statistics are candidate features. Domain knowledge is useful for judging which descriptive statistics, over which window size, contribute most to the prediction. For example, the engineers of an industrial machine will know better which component, under which conditions, is more likely to cause the machine to fail.

  2. Seasonal pattern

IoT sensor data can show seasonal patterns. For example, IoT data monitoring machine usage can show a low usage level on weekends and a high usage level on weekdays. Features representing seasonal patterns can be extracted from the timestamps of the IoT sensor data. Some examples of seasonal pattern features are:

  • IsWorkingHour
  • IsWeekday
  • MonthOfYear
  • DayOfWeek

These features can be very useful for time-series forecasting tasks, e.g., machine usage forecasting and energy consumption forecasting.
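The four features listed above can be derived from a timestamp roughly as follows. This is a minimal sketch: the function name `seasonal_features` is hypothetical, and the working-hours definition (09:00 to 17:00 on weekdays) is an assumption that would need to match the actual site schedule.

```python
from datetime import datetime

def seasonal_features(ts, work_start_hour=9, work_end_hour=17):
    """Derive calendar-based features from one sensor timestamp."""
    is_weekday = ts.weekday() < 5  # Monday=0 ... Sunday=6
    return {
        # Assumed definition: a weekday hour in [work_start_hour, work_end_hour).
        "IsWorkingHour": is_weekday and work_start_hour <= ts.hour < work_end_hour,
        "IsWeekday": is_weekday,
        "MonthOfYear": ts.month,      # 1..12
        "DayOfWeek": ts.weekday(),    # 0..6, Monday=0
    }

# A Wednesday morning and a Saturday morning.
wednesday = seasonal_features(datetime(2024, 1, 3, 10, 0))
saturday = seasonal_features(datetime(2024, 1, 6, 10, 0))
print(wednesday)
print(saturday)
```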

  3. Trend pattern

Like seasonal patterns, features extracted from the trend pattern of IoT sensor data can also be useful for time-series forecasting tasks. The lag operator can be used to extract features representing the trend pattern. A sequence of lagged values for the previous X time periods carries the autocorrelation information of the time series, which can contribute to the prediction.
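A minimal sketch of lag features, assuming the series is evenly spaced in time (one value per period). The function name `lag_features` and the usage numbers are hypothetical; `n_lags` plays the role of X above.

```python
def lag_features(series, n_lags):
    """Build (lagged_values, target) pairs from an evenly spaced series."""
    rows = []
    for t in range(n_lags, len(series)):
        # Most recent value first: lag 1, lag 2, ..., lag n_lags.
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        rows.append((lags, series[t]))
    return rows

# Hypothetical daily machine usage values.
usage = [10, 12, 13, 15, 18]
rows = lag_features(usage, n_lags=2)
for lags, target in rows:
    print(lags, "->", target)
```

The first `n_lags` observations produce no row because they do not have a full history of lagged values.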

