An Ocean of Data
Wind turbines produce a lot of data. A single turbine can contain thousands of sensors, generating gigabytes of data. Scale that single turbine up to an entire farm, and now you are talking terabytes.
“Great!” I hear you say. “We’ll use this data to build impressive machine learning models that will help us detect failures!” Well, you’re only partially right.
Suppose you are a popular streaming service, and you want to recommend which show a user should watch. One such service claims to have 200 million subscribers, each of whom spends about 3.2 hours per day on the service. At roughly 30 minutes per show, that works out to 3.2 hours / 0.5 hours ≈ 6 opportunities to recommend a show per user per day. Across 200 million subscribers, that’s about 1.2 billion examples per day on which to train their models.
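The back-of-the-envelope arithmetic above can be sketched in a few lines. The figures are the illustrative ones from the text, not real service data:

```python
# Rough scale of the streaming recommender problem (illustrative figures only).
subscribers = 200_000_000      # claimed subscriber count
hours_per_day = 3.2            # average viewing time per user per day
show_length_hours = 0.5        # a 30-minute show

# Roughly how many shows (and hence recommendation opportunities) per user per day
recs_per_user = hours_per_day / show_length_hours  # ≈ 6.4, call it 6

# Daily training examples across the whole subscriber base
daily_examples = subscribers * int(recs_per_user)
print(f"{daily_examples:,} examples per day")  # → 1,200,000,000 examples per day
```

Even with conservative rounding, the recommender gets on the order of a billion labelled examples every single day.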
Let’s translate this back to wind turbines. How many examples of main bearing failures do we get? Maybe a handful per year. These are two problems operating at completely different scales.
In the wind industry, we typically have lots of data, but very little of it captures the failures we actually care about. So, when trying to detect faults, we have to rely on other methods.


