Month: April 2020

Storm Surge: Planning for the Risk


Mobile home park on the north side of U.S. Highway 98 in Mexico Beach, Florida, washed away by the storm surge and wave impacts of Hurricane Michael. Nov. 2, 2018 (NOAA)

If you were asked to define the word “risk”, what would your response be? What are the first things that come to mind? What questions are you asking? What are some things you would consider to be risky?  Risk can take many forms, be it financial, personal injury, or even simple decisions that we all have to make on a daily basis. Consider two simple questions: 

  • Should I change the tires on my car?
  • Should I change the tires on my car today?

From a risk perspective, how has the question changed? The answer to the first question is a simple one because it’s general in nature. Yes, the tires should be changed somewhat routinely over the time that you own or lease the car. Question two requires more thought, however, because it narrows the task down to a specific time frame. If your tires were changed recently and are in good shape, then it’s most likely business as usual. If you’re overdue, what are the potential consequences of not having your tires changed? Are you risking personal injury to yourself, your family, or others by not changing them? The risk profile changes based on the myriad variables at play. As this example conveys, the idea of risk is relatively simple in itself but can become complex quickly. Let’s define risk for now as the potential of gaining or losing something of value. That is to say:

Risk = Probability x Consequence 

Yes, we threw some math at you, but this concept is relatively easy to grasp. Using the previous example, the overall risk profile is determined by multiplying the probability of a given event by the resulting consequence if that event occurs. So if there is a high chance of an event occurring, or the consequence is severe, then the risk to you would be high.  One way to look at risk is by using a risk matrix, as shown below in Figure 1.  Your risk increases if the probability of the event goes up or if the consequence of the event goes up.

Figure 1.  Risk matrix showing different levels of risk based on the probability of an event and the consequence if that event occurs.

The event in our car example is blowing a tire on the interstate, and a potential consequence would be having a fatal accident due to the blown tire. That consequence is so severe that your risk is quite high.  But let’s take the example a little further.  Risk is further compounded by vulnerability. Let’s consider a new equation:

Risk = Probability x Consequence x Vulnerability

Using the same example, what are variables that could increase the vulnerability, and thus the risk of a fatal accident, in this scenario? Are there kids in the car? What speed is the car travelling? If the tire pops while backing out of the driveway, isn’t that much different than the tire popping while travelling 70 mph down a busy interstate? This is just one of many examples that we all encounter on a daily basis. If there is a consequence to an action you might take, then you are making a risk-based decision.
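The two equations above can be captured in a few lines of code. A minimal sketch in Python, where every number is invented purely to illustrate how consequence and vulnerability scale the overall risk (the 0–1 and 1–10 scales here are assumptions, not an official scoring system):

```python
def risk(probability, consequence, vulnerability=1.0):
    """Toy risk score: probability x consequence, optionally scaled by vulnerability.

    probability is a 0-1 chance, consequence and vulnerability are unitless
    severity multipliers; the scales are invented for illustration only.
    """
    return probability * consequence * vulnerability

# Tire failure while backing out of the driveway: modest consequence,
# low vulnerability (low speed, no traffic).
driveway = risk(probability=0.01, consequence=2, vulnerability=1)

# The same failure at 70 mph on a busy interstate with kids in the car:
# same probability, but far greater consequence and vulnerability.
interstate = risk(probability=0.01, consequence=9, vulnerability=3)

print(driveway, interstate)  # the interstate scenario scores far higher
```

The probability of the event never changed; the risk did, because the consequence and vulnerability terms grew.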

 

Risk Perception and Risk Tolerance

There are two other topics related to risk that we should touch on:  risk perception and risk tolerance.  First, risk perception is the subjective judgement people make about the severity and probability of a risk. Why is this important? Well, there are two main reasons:

  1. Actual risk usually doesn’t equal perceived risk
  2. Perceived risk varies from person to person

Why does this complicate matters? When people have to make decisions, it’s important that how they perceive the risk be equal to the actual risk of the event, or at least as close to equal as possible. This is especially true when there is a desired response or action that needs to be taken to protect life and property, as is often the case with weather. We can use a simple example to further understand this. If a tornado is on the ground moving towards a community, the desired action for people living within that community is to seek shelter. In this scenario, the cost of persons within that community not seeking shelter is very high given that their lives are potentially in peril. If people dismiss what that tornado could do to their community (e.g., “tornadoes always pass our city to the south”), then that can be a recipe for disaster, especially when the cost is human lives.

This leads us nicely into risk tolerance. How an individual responds to risk is governed by their risk tolerance, which is unique to both the situation and the person. Risk tolerance is the amount of risk that an individual is willing to accept with respect to a given event occurring. Turning back to the tornado example, let’s consider two hypothetical people who live in the community threatened by the approaching tornado; we’ll name them Sara and Monika. For the sake of example, let’s assume that Sara has a family with two young children and lives in a mobile home. Monika, on the other hand, lives by herself in a well-built house. If Sara and Monika have the same perception of the impending risk, do these life factors change their respective risk tolerances? In the real world it’s difficult to say, but in this idealized example, let’s assume that they do. Monika does not have anyone in her care and also lives in a home that could withstand stronger winds than Sara’s mobile home. Is Monika’s tolerance for risk higher than Sara’s? It certainly could be, couldn’t it? Again, we understand that assumptions are being made here, but this is simply a hypothetical scenario to demonstrate how risk tolerances may change across individuals and circumstances unique to those individuals. All of this is to say that humans are complicated and risk perceptions and tolerances vary across all of us.

Circling back to the initial equation that we used to define risk, let’s establish a baseline for what the potential consequences are with respect to storm surge by looking at history. Storm surge is the abnormal rise of the ocean produced by a hurricane or tropical storm, and normally dry land near the coast can be flooded by the surge.  Historically, about 50% of lives lost in landfalling tropical cyclones in the United States have been due to storm surge (Figure 2):

Figure 2.  Causes of death in the United States directly attributable to Atlantic tropical cyclones, 1963-2012 (Rappaport 2014).

The mission statement of the National Weather Service charges us to “provide weather, water, and climate data, forecasts and warnings for the protection of life and property and enhancement of the national economy.” There is no clearer way to illustrate what the consequences are in this equation: the loss of human lives. Because the cost here is so high – arguably the highest – our risk tolerance for your safety is extremely low. We have absolutely no appetite for someone losing their life from a weather event. This idea directly informs many of the products that we use to communicate risk prior to and during landfalling storms, and we therefore use near-worst-case scenarios to encapsulate the full envelope of storm surge risk to communities. One life lost during a storm is one life too many. The remainder of this blog post will discuss two such products used by the National Hurricane Center and emergency managers to understand storm surge risk.

MOMs and MEOWs

Can we all agree that “MOMs” are extremely important? Well, yes, those moms are important in our lives, but that holds true for storm surge MOMs as well. Have you ever wondered how officials decide what areas should evacuate ahead of a hurricane? Look no further. MOMs (Maximum of the Maximums) are the rock upon which the nation’s storm surge evacuation zones are built.  MOMs are generated ahead of time. That is to say that these are precomputed maps meant for planning and mitigation purposes well ahead of a landfalling hurricane. In fact, one can view these any time as they are hosted on the National Hurricane Center’s website at https://www.nhc.noaa.gov/nationalsurge/. MOMs are generated by hurricane category (think 1-5) and depict the maximum storm surge height possible across all storm attributes. Attributes include things such as forward speed, storm trajectory, and landfall location, just to name a few. Because this product is designed for planning, you can think of the MOM as a worst-case scenario for a given category of storm. MOMs do have limitations, however. Remember at the beginning of this post we asked the question “should I change the tires on my car?”  MOMs are similar to that question because they are general in nature: they lump together all hurricanes of a given category, regardless of their other attributes.  They can tell you what type of storm surge risk you would have from a category 3 hurricane, for example, but they’re not quite as helpful if you know that the category 3 hurricane will be moving toward the west (and not north or northeast for instance).  
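Conceptually, a MOM is just an elementwise maximum across many simulated storms of the same category. A minimal sketch of that idea (the attribute combinations and surge values below are invented; real MOMs are built from thousands of storm surge model simulations, not three):

```python
# Each simulated category-3 storm is described by its attributes and produces
# a surge height (feet) at each grid cell. All values invented for illustration.
simulations = [
    {"direction": "NW",  "speed_mph": 5,  "surge_ft": [3.0, 6.5, 9.0]},
    {"direction": "N",   "speed_mph": 15, "surge_ft": [4.5, 5.0, 7.5]},
    {"direction": "WNW", "speed_mph": 5,  "surge_ft": [2.5, 7.0, 8.0]},
]

def max_envelope(sims):
    """Elementwise maximum surge across simulations (one value per grid cell)."""
    return [max(cell) for cell in zip(*(s["surge_ft"] for s in sims))]

# MOM: worst case across ALL category-3 simulations, regardless of attributes.
mom = max_envelope(simulations)
print(mom)  # [4.5, 7.0, 9.0]
```

Notice that no single simulated storm produces the whole MOM; each grid cell simply records the worst surge any simulation put there.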

For this reason, the MOMs have a slightly more refined counterpart – MEOWs (Maximum Envelope of Water). MEOWs are like the second question we asked:  “should I change the tires on my car today?”  Since we said “today,” we know a little bit more about the actual situation we’re dealing with to make a better-informed decision.  Similarly, once a storm or hurricane forms and is within 3–4 days of impacting the coast, we have at least some idea of how strong it could get, how fast it’ll be moving, and in what general direction it’s headed. We are able as forecasters to whittle down the worst-case MOM such that we only consider storms moving toward a particular direction at a particular forward speed — not all directions and forward speeds. Similar to the Maximum of Maximums, a MEOW is a worst-case for storms of a certain strength (for example, Category 3 hurricanes), but it’s more representative of what the storm surge at individual locations could be based on the attributes and forecasts of the active tropical cyclone. At 3 days out, there is still considerable forecast uncertainty, so the MEOW is meant to supplement the MOM, not replace it. As some might say, you can’t go wrong if you always trust your mom. The same adage goes for a hurricane storm surge MOM.

Hurricane Florence

To help us better understand these products, let’s look at how they could have been used in practice during a past landfalling hurricane. Hurricane Florence made landfall along the North Carolina coast on Friday, September 14th of 2018 and presented numerous forecast challenges, as many landfalling tropical cyclones typically do. One benefit of using MOMs and MEOWs to plan, especially at longer lead-times, is that they provide stability in a situation where the forecast of the storm itself can often change quickly from advisory to advisory. Let’s take a look at what Florence’s forecast looked like about 5 days out from an expected landfall. Figure 3 is taken from the official forecast from the National Hurricane Center on September 8th at 11 pm AST.

Figure 3.  NHC five-day forecast track and cone of uncertainty issued for Tropical Storm Florence at 11 PM AST September 8, 2018 (Advisory 39).

At this point, the information we know is that a potential major hurricane is roughly 5 days away from impacting some portion of the Mid-Atlantic or southeast coast. This is a good point to begin looking at the MOMs. Florence at this point is forecast to be a category 4 hurricane at landfall, so a good rule-of-thumb to follow is to look at a MOM one category higher than the forecast intensity. Let’s take a look at the Category 5 MOM to get an idea of a worst-case storm surge scenario for a portion of the North Carolina coast. You can find that image below in Figure 4.

Figure 4.  Category 5 storm surge Maximum of Maximums (MOM) for portions of eastern North Carolina.

Given how strong Florence could be, it’s no surprise to see potential inundation that would be catastrophic. Remember what we are looking at here and also that this is still a planning tool. This graphic is showing you the worst-case scenario from a Category 5 hurricane. That is to say that these are the highest possible inundations at each individual location for any given storm attribute. We actually shouldn’t expect to see these types of inundation values across the entire area, but given the uncertainties in the storm, all locations in this area should be prepared for these types of inundation values. It would also be prudent to look at other MOMs for context. For example, viewing the Category 3 and 4 MOMs would give context if Florence were to reach the coast at a lower intensity.

Good. Now let’s fast forward by 2 days. We are now 3 days out from a potential landfall. Forecast confidence has increased, but the fine-scale details are still quite blurry regarding the exact location of landfall and how strong Florence will be. But this is when you can begin to turn to the MEOWs. From this point, we can begin to whittle down the MOMs to generate a more realistic potential scenario based on the information currently available. Below in Figure 5 is the forecast from September 10th at 11 pm AST.

Figure 5.  NHC five-day forecast track and cone of uncertainty issued for Hurricane Florence at 11 PM AST September 10, 2018 (Advisory 47).

As you can see, there are some updates to the forecast track. The official forecast now calls for Florence to slow down significantly as it approaches the North Carolina coast. Let’s now talk about which MEOWs we should be looking at and explore how we select them. This is an important juncture in the forecast because right now we need to evaluate what we do know, what we don’t know, and what we can and cannot rule out. Remember that MEOWs are generated individually for a particular storm category, forward speed, trajectory, and initial tide level. At this point, is there anything that we can rule out in terms of unrealistic directions that Florence could potentially make landfall? It’s ok to acknowledge that there remains some subjectivity here, but it needs to be an informed decision with an understanding that our risk tolerance is low. That being said, let’s go ahead and rule out some storm directions.  Since the forecast track in advisory 47 reflects a northwestward trajectory at landfall, we’ll select that direction, as well as the two surrounding it (west-northwest and north-northwest) to account for uncertainty.  How about the intensity?  The latest forecast still shows Florence reaching the coast as a category 4 hurricane, so we still need to account for the possibility that it makes landfall one category stronger (category 5).  Lastly, let’s consider the speed at which Florence is moving and will be moving near its landfall. The tropical cyclone forecast discussion from advisory 47 explicitly mentions that Florence is expected to decrease in forward speed as it approaches the coast:

“After that time [48 hours], a marked decrease in forward speed is likely as another ridge builds over the Great Lakes to the north of Florence.”

This is reflected in the official forecast, which slows Florence down to less than 10 mph near the coast. While this certainly complicates the forecast, the beauty of using MEOWs is that they allow you to compensate for this forecast uncertainty. In this case, it’s reasonable to eliminate the MEOW forward speeds of 15, 25 and 35 mph, given forecaster confidence in Florence’s slow down. This leaves us with a forward speed of 5 mph (only a certain set of speeds is actually available to select). 

Let’s quickly recap the parameters that we’ve settled on to generate our MEOW:

Intensity: Category 5
Direction/trajectory: Storms that are moving West-Northwest, Northwest, or North-Northwest
Forward Speed: 5 mph
Tide-level: High (this will always be the assumption)
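Selecting a MEOW then amounts to filtering the library of precomputed simulations down to the parameters above and taking the maximum envelope of what remains. A hypothetical sketch (the attribute names, grid, and surge values are invented for illustration; the real products are gridded maps, not three-cell lists):

```python
# Precomputed simulations for one storm category at high tide.
# All attribute combinations and surge heights (feet) are invented.
simulations = [
    {"direction": "NW",  "speed_mph": 5,  "surge_ft": [3.0, 6.5, 9.0]},
    {"direction": "N",   "speed_mph": 15, "surge_ft": [4.5, 5.0, 7.5]},
    {"direction": "WNW", "speed_mph": 5,  "surge_ft": [2.5, 7.0, 8.0]},
    {"direction": "NNW", "speed_mph": 5,  "surge_ft": [3.5, 6.0, 8.5]},
]

def composite_meow(sims, directions, speed_mph):
    """Elementwise max surge over only the simulations matching the chosen attributes."""
    subset = [s for s in sims
              if s["direction"] in directions and s["speed_mph"] == speed_mph]
    return [max(cell) for cell in zip(*(s["surge_ft"] for s in subset))]

# The parameters settled on above: WNW/NW/NNW trajectories at 5 mph.
meow = composite_meow(simulations, directions={"WNW", "NW", "NNW"}, speed_mph=5)
print(meow)  # [3.5, 7.0, 9.0]
```

Ruling out the 15 mph north-moving simulation shaved the first grid cell from 4.5 to 3.5 feet while leaving the others unchanged, which is exactly the kind of modest refinement the Florence comparison below shows.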

Using those parameters, Figure 6 shows the potential storm surge inundation that could occur across eastern North Carolina:

Figure 6.  Composite storm surge Maximum Envelope of Water (MEOW) over portions of eastern North Carolina for a category 5 hurricane moving west-northwest, northwest, or north-northwest at 5 mph at high tide.

To take this one step further, let’s zoom in on the New Bern, North Carolina, area and compare the category 5 MOM that we initially used 5 days out with the composite of MEOWs (Figure 7). 

Figure 7.  Comparison of MOM (left) and composite MEOW (right) from Figures 4 and 6 above, zoomed in on the New Bern, North Carolina, area.

Remember that at this point in the forecast process, we are looking at synthetic or simulated storms to get an idea of what the near-worst case storm surge inundation could be in an environment of very high forecast uncertainty.  What differences do you notice when you compare the two pictures above?  Don’t worry: your eyes aren’t deceiving you.  You probably don’t notice much difference at all.  That’s because, unfortunately, slow-moving storms headed in a generally northwestward direction are likely some of the worst types of storms for the New Bern area.  Essentially, they’re the storms most likely to cause the storm surge heights you see in the MOM.  Our confidence in the hurricane’s forecast has increased since we’re 2 days closer to landfall, but the storm surge risk really hasn’t gone down at all.  While that might be a sobering thought, this process allows emergency managers to be as efficient as possible, appropriately assess their risk, and focus on the most at-risk areas.  This is a powerful and informative process when used properly! 

In the end, while not all of eastern North Carolina experienced the type of storm surge flooding shown in Figure 6 above (nor did we expect it to), some areas did.  Areas around New Bern, for instance, had as much as 9 feet of storm surge inundation above ground level (red areas in Figure 8 below).  Even though Florence’s peak winds decreased while the storm moved closer to the coast, the MOM and MEOW risk maps accounted for Florence’s increasing size and slow movement (which both contribute to more storm surge) and appropriately prepared emergency managers in the area for a severe storm surge event days before Florence even reached the coast.

Figure 8.  Post-storm model simulation of storm surge inundation caused by Hurricane Florence around the New Bern/Neuse River area of North Carolina.

It’s important to note at this point that MOMs and MEOWs are predominantly used during the period before storm surge or wind-related watches and warnings are in effect for the coast (more than 48 hours before wind or surge is expected to begin).  Once we get to within 48 hours when watches or warnings go into effect, another suite of storm surge products–specifically the Potential Storm Surge Flooding Map and the Storm Surge Watch/Warning graphics–become available.  These products refine the storm surge risk profile even further because they are based on the characteristics of the actual storm, not on the simulated storms used in MOMs and MEOWs.  We plan to create another blog post addressing these products in the near future.

To really bring this home, let’s circle all the way back around to the initial discussion of risk. How do risk tolerance and risk perception affect how these products are used? We know that these products are used by a wide range of people and organizations, all of which have varying tolerances of risk. It is unrealistic to assume that we at the National Hurricane Center could know how these tolerances change across our entire user base. That being said, it is our job to gently guide the decision-making in accordance with our own risk tolerance. Said another way, we work with emergency managers and the Hurricane Liaison Team (HLT) to hopefully bring those risk perceptions more in line with the ACTUAL risk for a given storm. Emergency managers have the resources at their disposal to view MOMs and MEOWs to build out their assessment of risk tailored to their local areas. They possess the intricate knowledge specific to their area which makes them invaluable partners to us at the NHC. During a storm, we sometimes provide advice on types of MOMs and MEOWs to consult to ensure that our partners fully capture a reasonable envelope of risk. These decisions can be stressful, especially when they have to be made in line with a risk tolerance that needs to be low by necessity. Remember what the cost is again here: human lives. It’s imperative that we capture the full breadth of the risk during every storm because the cost of not doing so is immense. We are comfortable accepting that our low risk tolerance can result in some areas not experiencing the potential storm surge that was conveyed prior to a hurricane making landfall. That is, by definition, what having a low tolerance for risk means, but it’s also by design.  To us, one life lost is one life too many. 

— Taylor Trogdon and Robbie Berg

 

Reference:

Rappaport, E.N., 2014: Fatalities in the United States from Atlantic Tropical Cyclones: New Data and Interpretation. Bull. Amer. Meteor. Soc., 95, 341–346, https://doi.org/10.1175/BAMS-D-12-00074.1

Skill or Luck?: How NHC’s Hurricane Track Forecasts Beat the Models


Skill or Luck?

There’s one thing that many of us are missing right now while we’re occupying ourselves at home:  sports.  We should have been all set for the playoffs in major league hockey and basketball, and we would be excited about the beginning of the major league baseball and soccer seasons.  We also would have been eagerly anticipating some of this spring and summer’s major sporting events, including the Olympics.  So let’s dream a little…

When we set out to write this blog post for Inside the Eye, we wanted to show how National Hurricane Center (NHC) forecasters use their skill and expertise to predict the future track of a hurricane.  And then it got us thinking, how does luck factor into the equation?  In other words, when meteorologists get a weather forecast right, how much of it is luck, and how much of it is forecasters’ skill in correctly interpreting, or even beating, the weather models available to them?

Investment strategist Michael Mauboussin created a “Skill-Luck Continuum” where individual sports, among other activities in life, are placed on a spectrum somewhere between pure skill and pure luck (Figure 1).   Based on factors such as the number of games in a season, number of players in action, and number of scoring opportunities in a game or match, athletes and their teams in some sports might have to rely on a little more luck than other sports to be successful.  On this spectrum, a sport like basketball would be closest to the skill side (there are a lot of scoring opportunities in a basketball game) whereas a sport like hockey would require a little more luck (there are fewer scoring opportunities in a hockey match, and sometimes you just need the puck to bounce your way).  Fortunately for hockey fans, there are enough games in a season for their favorite team’s “unlucky” games to not matter so much.

The Skill-Luck Continuum for Sports

Figure 1.  The Skill-Luck Continuum in Sports, developed by investment strategist Michael Mauboussin.

 

Where would hurricane forecasting lie on such a continuum?  There’s no doubt that luck plays at least some part in weather forecasting too, particularly in individual forecasts when random or unforeseen circumstances could either play in your favor (and make you look like the best forecaster around) or turn against you (and make you look like you don’t know what you’re doing!).  But luck is much less of a factor when you consider a lot of forecasts over longer periods of time, where the good and bad circumstances should cancel each other out and true skill shines through (just as in sports).  At NHC, we routinely compare our forecasts with weather models over these long periods of time to assess our skill at predicting, for example, the future tracks of hurricanes.

An International Friendly?

From our experience of talking to people about hurricanes and weather models, it seems to be almost common “knowledge” that only two models exist – the U.S. Global Forecast System (GFS) and the European Centre for Medium Range Weather Forecasts (ECMWF) model.  It’s true that those two models are used heavily at NHC and the National Weather Service in general, but there are many more weather models that can simulate a hurricane’s track and general weather across the globe.  (Here’s a comprehensive list showing all of the available weather models that are used at NHC today, if you’re interested:  https://www.nhc.noaa.gov/modelsummary.shtml.)  We’ve also heard and seen people compare the GFS and ECMWF models and talk about which model scenario might be more correct for a given storm.  This blog entry summarizes the performances of those models and discusses how, on the whole, NHC systematically outperforms them on predicting the track of a storm.

Below are the most recent three years of data (2017, 2018, and 2019) of Atlantic basin track forecast skill from NHC and the three best individual track models:  the GFS, ECMWF, and the United Kingdom Meteorological Office model (UKMET) (Figure 2).  Track forecast skill is assessed by comparing NHC’s and each model’s performance to that of a baseline, which in this case is a climatology and persistence model.  This model makes forecasts based on a combination of what past storms with similar characteristics–like location, intensity, forward speed, and the time of year–have done (the climatology part) and a continuation of what the current storm has been doing (the persistence part).  This model contains no information about the current state of the atmosphere and represents a “no-skill” level of accuracy.

Figure 2.  NHC and selected model track forecast skill for the Atlantic basin in 2017, 2018, and 2019.

 

On the skill diagrams above, lines for models or forecasts that are above other lines are considered to be the most skillful.  It can be seen that in each year shown, NHC (black line) outperforms the models and has the greatest skill at most, if not all, forecast times (the black line is above the other colored lines most of the time).  Among the models, the ECMWF (red line) has been the best performer, with the GFS (blue line) and UKMET (green line) trading spots for second place.
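The “skill” plotted on diagrams like these is conventionally the percentage improvement in mean track error over the no-skill climatology-and-persistence baseline. A sketch of that calculation, with made-up error numbers (the real verification uses thousands of matched forecasts):

```python
def skill_pct(baseline_error_nmi, forecast_error_nmi):
    """Percent improvement over the no-skill climatology-and-persistence baseline."""
    return 100.0 * (baseline_error_nmi - forecast_error_nmi) / baseline_error_nmi

# Hypothetical 72-hour mean track errors, in nautical miles.
baseline = 250.0   # climatology-and-persistence model (no-skill benchmark)
model = 110.0      # some dynamical model

print(round(skill_pct(baseline, model), 1))  # 56.0
```

A forecast that merely matches the baseline scores 0% skill, and a line higher on the diagram simply means a larger percentage improvement over that same benchmark.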

Yet another metric to estimate how often NHC outperforms the models is called “frequency of superior performance.”  Based on this metric, over the last 3 years (2017-19), NHC outperformed the GFS 65% of the time, the UKMET 59% of the time, and the ECMWF 56% of the time.   This means that more often than not, NHC is beating these individual models.  So the question is, how do the NHC forecasters beat the models?
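“Frequency of superior performance” is simply the fraction of matched forecasts for which NHC’s track error was smaller than a given model’s. A minimal sketch with invented errors for five forecasts:

```python
def frequency_of_superior_performance(nhc_errors, model_errors):
    """Fraction of matched forecasts where the NHC error beats the model error."""
    wins = sum(1 for n, m in zip(nhc_errors, model_errors) if n < m)
    return wins / len(nhc_errors)

# Hypothetical matched 48-hour track errors (nautical miles), five forecasts.
nhc = [40, 55, 70, 30, 90]
gfs = [50, 50, 85, 45, 95]

print(frequency_of_superior_performance(nhc, gfs))  # 0.8
```

A value above 0.5 means the official forecast beats that model more often than not, which is what the 56–65% figures above convey.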

Keep Your Eyes on the Ball

Forecasters at NHC are quite skilled at assessing weather models and their associated strengths and weaknesses.  It is that experience and a methodology of using averages of model solutions (consensus) that typically help NHC perform best.  If you ever read an NHC forecast discussion and see statements like “the track forecast is near the consensus aids,” or “the track forecast is near the middle of the guidance envelope,” it means the forecaster believed that the best solution was to be near the average of the models.   Although this strategy often works, NHC occasionally abandons this method when something does not seem right in the model solutions.  One recent example of this was Tropical Storm Isaac in 2018.  The figure below (Figure 3) shows the available model guidance, denoted by different colors, at 2 PM EDT (1800 UTC) on September 9 for Isaac, with the red-brown line representing the model consensus (TVCA).

Figure 3. NHC forecast (dashed black line) and selected model tracks at 2 PM EDT (1800 UTC) September 9, 2018 for then-Tropical Storm Isaac.  The solid black line represents the actual track of Isaac and the red-brown line represents the model consensus.

 

Although the models were in fair agreement that the storm would head westward for some time, a few models diverged by the time Isaac was expected to be near the eastern Caribbean Islands, mostly because they disagreed on how fast Isaac would be moving at that time.  Instead of being near the middle of the guidance envelope, NHC placed the forecast on the southern side of the model suite (dashed black line) at the latter forecast times since the forecaster believed that the steering flow would continue to force Isaac westward into the central Caribbean.  Indeed, NHC was correct in this case, and in fact, for the entire storm, NHC had very low track errors.
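At its core, a track consensus aid like TVCA is an average of the member models’ forecast positions at each lead time. A minimal sketch (the member positions are invented, and real consensus aids also handle missing members and longitude wrap-around, which this toy version ignores):

```python
def consensus_position(positions):
    """Arithmetic mean latitude/longitude of the member models at one forecast time.

    A simple mean is adequate for closely clustered tracks; positions are
    (lat, lon) tuples in decimal degrees.
    """
    lats = [lat for lat, lon in positions]
    lons = [lon for lat, lon in positions]
    return sum(lats) / len(lats), sum(lons) / len(lons)

# Hypothetical 48-hour positions (lat, lon) from three member models.
members = [(17.0, -55.0), (17.6, -56.0), (17.3, -54.4)]
print(consensus_position(members))
```

Averaging tends to cancel the individual models’ random errors, which is a large part of why the consensus, and a forecast anchored near it, usually verifies well.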

In some cases all of the models turn out to be wrong, which usually causes the official forecast to suffer as well.  That was the case for a period during Dorian in 2019.  Figure 4 shows many of the available operational models at 8 PM EDT on August 26 (0000 UTC August 27) for then-Tropical Storm Dorian.  As you can see by noting the deviation of the colored lines from the solid black line (Dorian’s actual track), none of the models or the official forecast anticipated that Dorian would turn as sharply as it did over the northeastern Caribbean Sea, and no model showed a direct impact to the Virgin Islands, where Dorian made landfall as a hurricane.

 

Figure 4. NHC forecast (dashed black line) and selected model tracks at 8 PM EDT on August 26 (0000 UTC 27 August), 2019 for then-Tropical Storm Dorian.  The solid black line represents the actual track of Dorian.

 

Figure 5 shows many of the operational models at 2 AM EDT (0600 UTC) on August 30 when Dorian, a major hurricane at the time, was approaching the Bahamas.  You can see that all of the models showed Dorian making landfall in south or central Florida in about four days from the time of the model runs, and none of them captured the catastrophic two-day stall that occurred over Great Abaco and Grand Bahama Islands.  NHC’s forecast followed the consensus of the models in this case and thus did not initially anticipate Dorian’s long, drawn-out battering of the northwestern Bahamas.

Figure 5.  NHC forecast (dashed black line) and selected model tracks at 2 AM EDT (0600 UTC) on August 30, 2019 for Hurricane Dorian.  The solid black line represents the actual track of Dorian.

 

The Undervalued Player?  A Consistently Good Field-Goal Kicker

In American football, probably one of the most undervalued players on the field is the kicker.  They don’t see much action during the majority of the game.  But at the end of close games, who has the best chance to win the game for a team?  A dependably accurate field goal kicker.  In that vein, it’s not just accuracy that can make NHC’s forecasts “better” than the individual models.  Another important factor is how consistent NHC’s predictions are from forecast to forecast compared to those from the models.  We looked at consistency by comparing the average difference in the forecast storm locations between predictions that were made 12 hours apart. For example, by how much did the 96-hour storm position in the current forecast change from the 108-hour position in the forecast that was made 12 hours ago (which was interpolated between the 96- and 120-hour forecast positions)?  Figure 6 shows this 4-day “consistency,” as well as the 4-day error, plotted together for the GFS, ECMWF, UKMET, and NHC forecasts for the Atlantic basin from 2017-19.  It can be seen that NHC is not only more accurate than these models (it’s farthest down on the y-axis), but it is also more consistent (it’s farthest to the left on the x-axis), meaning the official forecast holds steady more than the models do from cycle to cycle.  We like to say that we’re avoiding the model run-to-run “windshield wiper” effect (large shifts in forecast track to the left or right) or “trombone” effect (tracks that speed up or slow down) that are often displayed by even the most accurate models.

Figure 6.  96-hour NHC and model forecast error and consistency (change from cycle to cycle) for 2017-2019 in the Atlantic basin.

 

NHC’s emphasis on consistency is so great that there are times when we knowingly accept that we might sacrifice a little track accuracy to achieve consistency and a better public response to the threat.  An example would be a hurricane that is forecast to move westward and pose a serious threat to the U.S. southeastern states.  Sometimes, such storms “recurve” to the north and then the northeast, moving back out to sea before reaching the coast.  When the models trend toward such a recurvature, NHC’s forecast will sometimes lag the models’ depiction of a lower threat to land.  In these cases, NHC does not want to prematurely take the southeastern states “off the hook”, sending a potentially erroneous signal that the risk of impacts on land has diminished, only to have later forecasts ratchet the threat back up after the public has turned their attention and energies elsewhere if the models, well, “change their mind”.  That would be the kind of windshield wiper effect NHC wants to prevent in its own forecasts.  Of course, there are times when the recurvature does indeed occur.  Then, NHC’s track forecasts, which have hung back a little from the models, could end up having larger errors than the models.  But NHC can accept somewhat larger track forecast errors than the models in such circumstances at longer lead times if in doing so it can provide those at risk with a more effective message, achieved in part through consistency.

The superior accuracy and higher levels of consistency of the NHC forecasts are both important characteristics since emergency managers and other decision makers have to make challenging decisions, such as evacuation orders, based on that information.  It is not surprising to us that NHC’s forecasts are more consistent than the global models, since forecasters here take a conservative approach and usually make gradual changes from the forecast they inherited from the previous forecaster.  Conversely, the models often bounce around more and are not constrained by their previous prediction.  And, unlike human forecasters, the models also bear no responsibility or feel remorse when they are wrong!

Filling Out Your Bracket

Accuracy, consistency, and luck are important factors in a particular favorite sport:  college basketball.  We just passed the time of year when we should have been crowning champions in the men’s and women’s college basketball tournaments.  But before those tournaments would have kicked off, “bracketologists” (no known relation to meteorologists!) would have made predictions about which teams would make it into the tournaments and which teams would be likely to win.

Think of it this way:  a team can be accurate in that they have a spectacular winning record during the regular season, but does that mean they are guaranteed to win the tournament, or even advance far?  Nope.  As is often said, that’s why they play the game.  An inconsistent team—one whose performance varies wildly from game to game—has a higher risk of having a bad game and losing to an underdog in the first few rounds, even if their regular season record by itself suggests they should have no problem winning.  The problem is, they could have been very lucky in the regular season, winning a lot of close games that could easily have swung the other way.  If that luck runs out, the inconsistent team could make an early exit from the tournament.  With a consistent team, on the other hand, you pretty much know what kind of performance you’re going to get—good or bad—and that increases confidence in predicting how far the team will advance in the tournament.  You’d want to hitch your wagon to a good team that is consistent and hasn’t had to rely on too much luck to get where they are.

The same can be said for hurricane forecasts from NHC and the models.  NHC’s track forecasts are more accurate and more consistent than the individual models in the long run, and that fact should increase overall user confidence in the forecasts put out by NHC.  Still, there is always room to improve, and we hope that forecasts will continue to become more accurate and consistent in the future.  It is always a good idea to read the NHC Forecast Discussion to understand the reasons behind the forecast and to gauge the forecaster’s confidence in the prediction.  For more information on NHC forecast and model verification, click the following link:  https://www.nhc.noaa.gov/verification/

— John Cangialosi, Robbie Berg, and Andrew Penny