Connecting the right data to the right model

An Ocean of Data

Wind turbines produce a lot of data. A single turbine can contain thousands of sensors and generate GBs of data. Now, scale that single turbine up to an entire farm. Well, now you are talking TB of data.

“Great!” I hear you say. “We’ll use this data to build impressive machine learning models that will help us detect failures!” Well, you’re partially right.

Suppose you are a popular streaming service, and you want to recommend which show a user should watch. One such service claims to have 200 million subscribers, each of whom spends about 3.2 hours per day on the service. Rounding down to 3 hours and assuming 30-minute shows, that is 3 hours / 30 minutes = 6 opportunities to recommend a show per user per day. Given their 200 million subscribers, that’s 1.2 billion examples per day on which to train their models.
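For anyone who wants to check the arithmetic, here it is as a few lines of Python. The numbers are the ones quoted above; the 30-minute show length and the rounding of 3.2 hours down to 3 are simplifications made for the estimate.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
subscribers = 200_000_000
hours_per_day = 3          # "around 3 hours", rounded down from 3.2
show_length_hours = 0.5    # assumed 30-minute shows

recommendations_per_user = hours_per_day / show_length_hours
examples_per_day = subscribers * recommendations_per_user

print(f"{recommendations_per_user:.0f} recommendations per user per day")
print(f"{examples_per_day:,.0f} training examples per day")
```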

Let’s translate this back to wind turbines. How many examples of main bearing failures do we get? Maybe a handful per year. These are two problems operating at completely different scales.

In the wind industry, we typically have lots of data, but we are often lacking in useful data. So, when trying to detect faults, we have to rely on other methods.

The Models

One of our most frequent questions from customers is “Do you have a model to solve issue X?” Whilst the answer to this is “technically yes”, the truth is that we probably have about 100 different models which can solve issue X.

It’s not our models we’re worried about, it’s your data.

Allow me to explain.

Every wind turbine is unique. They have different internal layouts, different alarm systems, and different tags available; some have vibration data, others don’t. Even the same make and model of turbine can present very different data availability, depending on the customer’s situation.

In the machine learning and analytics world, however, you usually need a very specific set of inputs to produce a useful result. Think about your own computer; you are (mostly) restricted to inputting data into it via a keyboard or a mouse.

Any specific instance of one of our models expects a specific set of data and tags as input. It’s overwhelmingly likely that your turbine won’t have the right combination of tags and data that will enable this specific model to run. That’s why we have hundreds.

In order to pick which model we run, we effectively rank them; we know certain tag inputs perform the best, but we also know they won’t always be available. We therefore have other, slightly different, models that can do a reasonable job if those tags aren’t available. For example, when monitoring bearing temperatures, we prefer to use the maximum temperature observed over a 10-minute period, but if this is missing, the average temperature will do. In another case, we could fit a power curve using just the power and wind speed, but we can fit a better one by including information from the generator speed, pitch angles, power coefficient, and air density. By picking the best available model each time, we know we’re doing the best we can do with your data.
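To make the ranking idea concrete, here is a minimal sketch of picking the best runnable model from a catalogue. Everything in it is hypothetical: the `ModelSpec` class, the model names, the tag names, and the ranks are invented for illustration and do not describe the real catalogue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    required_tags: frozenset  # tags that must be present for the model to run
    rank: int                 # lower rank = preferred model


# Hypothetical catalogue: two ways of fitting a power curve.
POWER_CURVE_MODELS = [
    ModelSpec(
        "power_curve_full",
        frozenset({"power", "wind_speed", "generator_speed",
                   "pitch_angle", "power_coefficient", "air_density"}),
        rank=1,
    ),
    ModelSpec("power_curve_basic", frozenset({"power", "wind_speed"}), rank=2),
]


def pick_best_model(models, available_tags):
    """Return the best-ranked model whose required tags are all available."""
    available = set(available_tags)
    runnable = [m for m in models if m.required_tags <= available]
    return min(runnable, key=lambda m: m.rank) if runnable else None


# A turbine with only power and wind speed falls back to the basic model.
choice = pick_best_model(POWER_CURVE_MODELS, {"power", "wind_speed"})
print(choice.name if choice else "no model can run")
```

A turbine that exposes every tag would get `power_curve_full`; one with no wind speed at all would get `None`, which is the cue to look for a different kind of model.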

This is what separates us from other companies; we are an engineering company first. We understand and examine your problems from an engineering perspective, not just from a data perspective. We understand that you have an engineering problem you need to solve, and we will provide you with an engineering solution.

Automation

Now, running this kind of analysis by hand every time can be extremely time-consuming.

Let me provide an example:

A common task of mine is to determine “how much lost energy is this issue causing me?” This is a simple question, but let me take you through the steps needed to answer it and some of the decisions I frequently need to make along the way:

Determine which tags are available in the data, so that we can fit a power curve

    • Which fitting method can we use? Do we have pitch angles, rotor speeds, wind speeds and power? These are not always available, and we can do fits with less data, but results are poorer.
    • Can we correct for the air density to reduce the scatter? Yes, if we have the elevation and ambient temperature. The elevation can be identified from the geographical position; ideally we’d like humidity too.
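As an illustration of the air-density question, the sketch below estimates dry-air density from elevation and ambient temperature, then normalises the wind speed to a reference density. It rests on several assumptions: a standard-atmosphere pressure profile, dry air (humidity is ignored), and the cube-root wind speed normalisation commonly used for pitch-regulated turbines. The function names are hypothetical, and this is not necessarily the correction used in practice.

```python
R_DRY_AIR = 287.05        # specific gas constant for dry air, J/(kg K)
RHO_REFERENCE = 1.225     # sea-level standard air density, kg/m^3


def pressure_from_elevation(elevation_m):
    """Standard-atmosphere pressure (Pa) at a given elevation (m)."""
    return 101325.0 * (1.0 - 2.25577e-5 * elevation_m) ** 5.25588


def dry_air_density(temperature_c, elevation_m):
    """Ideal-gas density of dry air (kg/m^3); humidity is ignored."""
    temperature_k = temperature_c + 273.15
    return pressure_from_elevation(elevation_m) / (R_DRY_AIR * temperature_k)


def density_corrected_wind_speed(wind_speed_ms, rho, rho_ref=RHO_REFERENCE):
    """Normalise wind speed to the reference density (cube-root scaling)."""
    return wind_speed_ms * (rho / rho_ref) ** (1.0 / 3.0)


# Example: a site at 800 m elevation on a 5 degC day.
rho_site = dry_air_density(temperature_c=5.0, elevation_m=800.0)
print(round(rho_site, 3), round(density_corrected_wind_speed(10.0, rho_site), 2))
```

With a measured humidity the density estimate could be refined further, which is why that tag is on the wish list.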

Clean and fit a power curve

    • Do we need to remove any data beforehand? When were deratings occurring? Are all turbines healthy during this time or do some have pitch controller errors?
    • Do we trust the anemometers, or do we suspect there may be issues associated with them so we cannot trust the wind speed?
    • Have we adjusted for the nacelle transfer function? Do we believe it is correct, based on the neighbouring turbines?
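A simple way to picture the clean-and-fit step is a binned power curve: drop the records we do not trust, group the rest into wind speed bins, and take a representative power per bin. The sketch below is one such approach under stated assumptions: the record layout, the `is_healthy` flag (standing in for the derating, fault, and anemometer checks above), the 0.5 m/s bin width, and the use of the median are all illustrative choices.

```python
from collections import defaultdict
from statistics import median


def fit_binned_power_curve(records, bin_width=0.5, min_count=3):
    """Fit a binned power curve.

    records: iterable of (wind_speed_ms, power_kw, is_healthy) tuples.
    Returns {bin_centre_ms: median_power_kw} for bins with enough data.
    """
    bins = defaultdict(list)
    for wind_speed, power, is_healthy in records:
        # Cleaning: skip flagged records and missing values.
        if not is_healthy or wind_speed is None or power is None:
            continue
        bins[int(wind_speed // bin_width)].append(power)

    return {
        index * bin_width + bin_width / 2: median(powers)
        for index, powers in sorted(bins.items())
        if len(powers) >= min_count
    }


records = [
    (5.1, 500.0, True),
    (5.2, 510.0, True),
    (5.3, 520.0, True),
    (5.2, 100.0, False),   # derated period, excluded from the fit
    (7.1, 1200.0, True),   # only one sample in this bin, so it is dropped
]
print(fit_binned_power_curve(records))
```

The median is used here because it is less sensitive to any bad records that slip through the cleaning; a mean per bin would be an equally valid choice on well-cleaned data.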

Calculate the lost energy by comparing the actual to the expected power

    • What counts as lost energy? Is a 1 kW underperformance considered lost energy or is it just scatter in our data? At rated power this may be significant, but below the knee of the power curve we expect to be more uncertain.
    • We need to classify different causes of lost energy, i.e., derating vs downtime vs inefficiencies to prevent us conflating one issue with another.
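The comparison of actual to expected power can be sketched as follows. The assumptions here are loud ones: 10-minute samples, a single fixed tolerance below which a shortfall is treated as scatter (in reality the tolerance would vary along the power curve, as noted above), and a cause label already attached to each sample. The function and label names are hypothetical.

```python
from collections import defaultdict


def lost_energy_by_cause(samples, expected_power_kw,
                         tolerance_kw=10.0, interval_hours=1 / 6):
    """Sum lost energy (kWh) per cause.

    samples: iterable of (wind_speed_ms, actual_power_kw, cause) tuples.
    expected_power_kw: callable mapping wind speed to expected power.
    Shortfalls at or below tolerance_kw are treated as scatter and ignored.
    """
    totals = defaultdict(float)
    for wind_speed, actual_kw, cause in samples:
        shortfall_kw = expected_power_kw(wind_speed) - actual_kw
        if shortfall_kw > tolerance_kw:
            totals[cause] += shortfall_kw * interval_hours
    return dict(totals)


def flat_expected_power(wind_speed_ms):
    """Toy stand-in for a fitted power curve: always 1000 kW."""
    return 1000.0


samples = [
    (10.0, 999.0, "inefficiency"),  # 1 kW short: within tolerance, ignored
    (10.0, 0.0, "downtime"),        # full shortfall for one 10-minute sample
    (10.0, 600.0, "derating"),      # 400 kW short for one 10-minute sample
]
print(lost_energy_by_cause(samples, flat_expected_power))
```

Keeping the totals separate per cause is what stops a derating from being mistaken for an inefficiency later on.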

Associate any lost energy to this specific issue we are concerned with

    • Are there SCADA alarms that trigger around the same time? What are their effects? Are they informative or do they shut the turbine down? Is this issue one which causes downtime or just inefficiencies?
    • How do we make sure the lost energy is not from some other issue? Can we confirm that the total lost energy is the sum of all the issues?
    • What if this issue is not causing lost energy, but may do in the future? Think about a main bearing failure for example. What is the potential lost energy from such an event?
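One simple way to tie lost energy to a specific issue is to check which losses fall inside the time windows when that issue's alarm was active. The sketch below does only that, and the data layout is an assumption made for illustration. It also returns the unattributed remainder, so that the two parts can be checked against the total.

```python
from datetime import datetime


def attribute_to_alarm(losses, alarm_windows):
    """Split lost energy into (attributed, unattributed) kWh.

    losses: list of (timestamp, lost_kwh) tuples.
    alarm_windows: list of (start, end) datetimes when the alarm was active;
    a loss is attributed if its timestamp satisfies start <= timestamp < end.
    """
    attributed = 0.0
    unattributed = 0.0
    for timestamp, lost_kwh in losses:
        if any(start <= timestamp < end for start, end in alarm_windows):
            attributed += lost_kwh
        else:
            unattributed += lost_kwh
    return attributed, unattributed


losses = [
    (datetime(2024, 1, 1, 0, 0), 10.0),
    (datetime(2024, 1, 1, 0, 10), 20.0),
    (datetime(2024, 1, 1, 3, 0), 5.0),   # outside any alarm window
]
alarm_windows = [(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 0, 20))]

attributed, unattributed = attribute_to_alarm(losses, alarm_windows)
print(attributed, unattributed)
```

Time overlap alone does not prove cause, which is exactly why the questions above still need an engineer's judgement.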

Understandably, doing all of this can take time… a lot of time.

Now I don’t want to spend my time checking these small variations on turbines or having to make these kinds of decisions every time I want to analyse a wind farm. I want to examine the results and determine what that means for the health of your whole site. I want to know which issues are causing the most lost energy, and what we can do about them.

To facilitate this, we’ve built a context-aware smart system, which automatically assesses what is available in your data and runs the best models we have available. It runs recursively: whatever our models add to your data (this anemometer is broken, here are the power coefficients, these turbines were inefficient between these dates) is fed into each of our other models to provide the best recommendations.
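The idea of models feeding each other can be illustrated with a small loop that keeps running whichever models have their inputs available until nothing new can be run. This is a toy sketch, not the real system: it is written as an iterative loop rather than literal recursion, each model runs at most once, and the model names, keys, and outputs are all invented.

```python
def run_until_stable(models, context):
    """Run every model whose inputs are available until no more can run.

    models: list of (name, required_keys, func); func(context) returns a
    dict of new information to merge into the context.
    Returns the enriched context and the set of model names that ran.
    """
    context = dict(context)
    done = set()
    progress = True
    while progress:
        progress = False
        for name, required_keys, func in models:
            if name in done or not set(required_keys).issubset(context):
                continue
            context.update(func(context))
            done.add(name)
            progress = True
    return context, done


# Hypothetical models, deliberately listed out of dependency order.
models = [
    ("lost_energy", {"power_curve"}, lambda ctx: {"lost_energy_kwh": 0.0}),
    ("power_curve", {"wind_speed", "power", "anemometer_ok"},
     lambda ctx: {"power_curve": "fitted"}),
    ("anemometer_check", {"wind_speed"}, lambda ctx: {"anemometer_ok": True}),
    ("vibration_model", {"vibration"}, lambda ctx: {"bearing_health": "ok"}),
]

context, done = run_until_stable(models, {"wind_speed": [], "power": []})
print(sorted(done))
```

In this toy run the vibration model never executes because the site has no vibration data, while the lost-energy model only becomes runnable once the anemometer check and the power curve fit have added their results.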

Figure 1: Our anemometer model provided early warning of a broken anemometer, which was leading to increased levels of lost energy and distortion of the power curve. Rectifying this issue saved up to $30,000 a year by avoiding unnecessary high-wind shutdowns.

Now, this is all exciting for me, but what does it mean for you?

There are two primary benefits:

  1. We can move extremely quickly. We’ve all but eliminated humans from the tedious work of data collection, analysis, and minor questions by building a smart tool that answers those issues for us. But we have not removed humans entirely because…
  2. Our engineers can now spend even more time on custom solutions for you. We do not remove humans from the loop; in fact, we entrust the most important and critical decisions to those humans, but we let them make those decisions with the best information. We don’t replace humans with AI; we work side-by-side with AI to ensure we can spend most of our time focussing on your issues and delivering solutions to them.
Figure 2: Our context-aware system uses the latest statistical, ML, and engineering knowledge to automatically assess your wind farm for reliability issues and issues causing you lost energy. This leaves our engineers more time to assess the results and do deeper dives into the issues you’re facing.

By automating the search for the causes of lost energy, we save you time, money, and energy.

Want to know more about how we can help increase the efficiency of your O&M operations?

Speak to one of our experts