Mobile home park on the north side of U.S. Highway 98 in Mexico Beach, Florida, washed away by the storm surge and wave impacts of Hurricane Michael. Nov. 2, 2018 (NOAA)
If you were asked to define the word “risk”, what would your response be? What are the first things that come to mind? What questions are you asking? What are some things you would consider to be risky? Risk can take many forms, be it financial, personal injury, or even simple decisions that we all have to make on a daily basis. Consider two simple questions:
- Should I change the tires on my car?
- Should I change the tires on my car today?
From a risk perspective, how has the question changed? The answer to the first question is a simple one because it’s general in nature. Yes, the tires should be changed somewhat routinely over the lifetime that you own or lease the car. Question two requires more thought, however, because it narrows the task down to a specific time frame. If your tires were changed recently and are in good shape, then it’s most likely business as usual. If you’re overdue, what are the potential consequences of not having your tires changed? Are you risking personal injury to yourself, your family, or others by not changing them? The risk profile changes based on myriad variables that exist. As this example conveys, the idea of risk is relatively simple in itself but can become complex quickly. Let’s define risk for now as the potential of gaining or losing something of value. That is to say:
Risk = Probability x Consequence
Yes, we threw some math at you, but this concept is relatively easy to grasp. Using the previous example, the overall risk profile is determined by multiplying the probability of a given event by the resulting consequence if that event occurs. So if there is a high chance of an event occurring, or the consequence is severe, then the risk to you would be high. One way to look at risk is by using a risk matrix, as shown below in Figure 1. Your risk increases if the probability of the event goes up or if the consequence of the event goes up.
Figure 1. Risk matrix showing different levels of risk based on the probability of an event and the consequence if that event occurs.
The event in our car example is blowing a tire on the interstate, and a potential consequence would be having a fatal accident due to the blown tire. That consequence is so severe that your risk is quite high. But let’s take the example a little further. Risk is further compounded by vulnerability. Let’s consider a new equation:
Risk = Probability x Consequence x Vulnerability
Using the same example, what are variables that could increase the vulnerability, and thus the risk of a fatal accident, in this scenario? Are there kids in the car? What speed is the car travelling? If the tire pops while backing out of the driveway, isn’t that much different than the tire popping while travelling 70 mph down a busy interstate? This is just one of many examples that we all encounter on a daily basis. If there is a consequence to an action you might take, then you are making a risk-based decision.
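The two equations above can be sketched in a few lines of Python. This is only an illustration: the 0-to-1 scales, the example numbers, and the function name `risk_score` are assumptions made for this sketch, not an official scoring method.

```python
def risk_score(probability, consequence, vulnerability=1.0):
    """Risk = Probability x Consequence x Vulnerability.

    All inputs are assumed to lie on a 0-to-1 scale (an assumption of this
    sketch). Leaving vulnerability at 1.0 reduces this to the first equation.
    """
    for name, value in (("probability", probability),
                        ("consequence", consequence),
                        ("vulnerability", vulnerability)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return probability * consequence * vulnerability


# Hypothetical numbers for the blown-tire example: same tires, same chance of
# a blowout, same worst-case consequence; only the vulnerability changes.
driveway = risk_score(probability=0.2, consequence=0.9, vulnerability=0.1)
interstate = risk_score(probability=0.2, consequence=0.9, vulnerability=0.9)

print(f"backing out of the driveway: {driveway:.3f}")
print(f"70 mph on the interstate:    {interstate:.3f}")
```

With these made-up inputs, the interstate scenario scores nine times higher than the driveway scenario even though the tires themselves have not changed.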
Risk Perception and Risk Tolerance
There are two other topics related to risk that we should touch on: risk perception and risk tolerance. First, risk perception is the subjective judgement people make about the severity and probability of a risk. Why is this important? Well, there are two main reasons:
- Actual risk usually doesn’t equal perceived risk
- Perceived risk varies from person to person
Why does this complicate matters? When people have to make decisions, it’s important that how they perceive the risk be equal to the actual risk of the event, or at least as close to equal as possible. This is especially true when there is a desired response or action that needs to be taken to protect life and property, as is often the case with weather. We can use a simple example to further understand this. If a tornado is on the ground moving towards a community, the desired action for people living within that community is to seek shelter. In this scenario, the cost of persons within that community not seeking shelter is very high given that their lives are potentially in peril. If people dismiss what that tornado could do to their community (e.g., “tornadoes always pass our city to the south”), then that can be a recipe for disaster, especially when the cost is human lives.
This leads us nicely into risk tolerance. How an individual responds to risk is governed by their risk tolerance, which is unique to both the situation and the person. Risk tolerance is the amount of risk that an individual is willing to accept with respect to a given event occurring. Turning back to the tornado example, let’s consider two hypothetical people that live in the community threatened by the approaching tornado; we’ll name them Sara and Monika. For the sake of example, let’s assume that Sara has a family with two young children and lives in a mobile home. Monika, on the other hand, lives by herself in a well-built house. If Sara and Monika have the same perception of the imposing risk, do these life factors change their respective risk tolerances? In the real world it’s difficult to say, but in this idealized example, let’s assume that it does. Monika does not have anyone in her care and also lives in a home that could withstand stronger winds than Sara’s mobile home. Is Monika’s tolerance for risk higher than Sara’s? It certainly could be, couldn’t it? Again, we understand that assumptions are being made here, but this is simply a hypothetical scenario to demonstrate how risk tolerances may change across individuals and circumstances unique to those individuals. All of this is to say that humans are complicated and risk perceptions and tolerances vary across all of us.
Circling back to the initial equation that we used to define risk, let’s establish a baseline for what the potential consequences are with respect to storm surge by looking at history. Storm surge is the abnormal rise of the ocean produced by a hurricane or tropical storm, and normally dry land near the coast can be flooded by the surge. Historically, about 50% of lives lost in landfalling tropical cyclones in the United States have been due to storm surge (Figure 2):
Figure 2. Causes of death in the United States directly attributable to Atlantic tropical cyclones, 1963-2012 (Rappaport 2014).
The mission statement of the National Weather Service charges us to “provide weather, water, and climate data, forecasts and warnings for the protection of life and property and enhancement of the national economy.” There is no clearer way to illustrate what the consequences are in this equation: the loss of human lives. Because the cost here is so high – arguably the highest – our risk tolerance for your safety is extremely low. We have absolutely no appetite for someone losing their life from a weather event. This idea directly informs many of the products that we use to communicate risk prior to and during landfalling storms, and we therefore use near-worst-case scenarios to encapsulate the full envelope of storm surge risk to communities. One life lost during a storm is one life too many. The remainder of this blog post will discuss two such products used by the National Hurricane Center and emergency managers to understand storm surge risk.
MOMs and MEOWs
Can we all agree that “MOMs” are extremely important? Well, yes, those moms are important in our lives, but that holds true for storm surge MOMs as well. Have you ever wondered how officials decide what areas should evacuate ahead of a hurricane? Look no further. MOMs (Maximum of the Maximums) are the rock upon which the nation’s storm surge evacuation zones are built. MOMs are generated ahead of time. That is to say that these are precomputed maps meant for planning and mitigation purposes well ahead of a landfalling hurricane. In fact, one can view these any time as they are hosted on the National Hurricane Center’s website at https://www.nhc.noaa.gov/nationalsurge/. MOMs are generated by hurricane category (think 1-5) and depict the maximum storm surge height possible across all storm surge attributes. Attributes include things such as forward speed, storm trajectory, and landfall location, just to name a few. Because this product is designed for planning, you can think of the MOM as a worst-case scenario for a given category of a storm. MOMs do have limitations, however. Remember at the beginning of this post we asked the question “should I change the tires on my car?” MOMs are similar to that question because they are general in nature in that they lump all types of hurricanes into a single category. They can tell you what type of storm surge risk you would have from a category 3 hurricane, for example, but they’re not quite as helpful if you know that the category 3 hurricane will be moving toward the west (and not north or northeast for instance).
For this reason, the MOMs have a slightly more refined counterpart – MEOWs (Maximum Envelope of Water). MEOWs are like the second question we asked: “should I change the tires on my car today.” Since we said “today,” we know a little bit more about the actual situation we’re dealing with to make a better-informed decision. Similarly, once a storm or hurricane forms and is within 3–4 days of impacting the coast, we have at least some idea of how strong it could get, how fast it’ll be moving, and in what general direction it’s headed. We are able as forecasters to whittle down the worst-case MOM such that we only consider storms moving toward a particular direction at a particular forward speed — not all directions and forward speeds. Similar to the Maximum of Maximums, a MEOW is a worst-case for storms of a certain strength (for example, Category 3 hurricanes), but it’s more representative of what the storm surge at individual locations could be based on the attributes and forecasts of the active tropical cyclone. At 3 days out, there is still considerable forecast uncertainty, so the MEOW is meant to supplement the MOM, not replace it. As some might say, you can’t go wrong if you always trust your mom. The same adage goes for a hurricane storm surge MOM.
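One way to picture the relationship between the two products is to treat each MEOW as a grid of surge heights for one combination of storm attributes, and the MOM as the cell-by-cell maximum over every MEOW of that category. The sketch below uses made-up 2x3 grids (heights in feet) and a hypothetical helper, `cellwise_max`. It illustrates the idea only and is not how the operational products are generated.

```python
def cellwise_max(grids):
    """Return a grid holding, for each cell, the largest value across grids."""
    if not grids:
        raise ValueError("need at least one grid")
    return [
        [max(cells) for cells in zip(*rows)]  # same column across grids
        for rows in zip(*grids)               # same row across grids
    ]


# Made-up Category 3 MEOWs keyed by (direction, forward speed in mph).
meows_cat3 = {
    ("W", 15): [[2.0, 4.0, 1.0], [0.0, 3.0, 5.0]],
    ("NW", 5): [[6.0, 3.0, 2.0], [1.0, 7.0, 4.0]],
    ("N", 25): [[1.0, 5.0, 3.0], [2.0, 2.0, 6.0]],
}

mom_cat3 = cellwise_max(list(meows_cat3.values()))
print(mom_cat3)  # [[6.0, 5.0, 3.0], [2.0, 7.0, 6.0]]
```

In this toy example no single MEOW matches the MOM in every cell, which is why the MOM should not be read as the flooding expected from any one storm.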
To help us better understand these products, let’s look at how they could have been used in practice during a past landfalling hurricane. Hurricane Florence made landfall along the North Carolina coast on Friday, September 14th of 2018 and presented numerous forecast challenges, as many landfalling tropical cyclones typically do. One benefit of using MOMs and MEOWs to plan, especially at longer lead-times, is that they provide stability in a situation where the forecast of the storm itself can often change quickly from advisory to advisory. Let’s take a look at what Florence’s forecast looked like about 5 days out from an expected landfall. Figure 3 is taken from the official forecast from the National Hurricane Center on September 8th at 11 pm AST.
Figure 3. NHC five-day forecast track and cone of uncertainty issued for Tropical Storm Florence at 11 PM AST September 8, 2018 (Advisory 39).
At this point, the information we know is that a potential major hurricane is roughly 5 days away from impacting some portion of the Mid-Atlantic or southeast coast. This is a good point to begin looking at the MOMs. Florence at this point is forecast to be a category 4 hurricane at landfall, so a good rule-of-thumb to follow is to look at a MOM one category higher than the forecast intensity. Let’s take a look at the Category 5 MOM to get an idea of a worst-case storm surge scenario for a portion of the North Carolina coast. You can find that image below in Figure 4.
Given how strong Florence could be, it’s no surprise to see potential inundation that would be catastrophic. Remember what we are looking at here and also that this is still a planning tool. This graphic is showing you the worst-case scenario from a Category 5 hurricane. That is to say that these are the highest possible inundations at each individual location for any given storm attribute. We actually shouldn’t expect to see these types of inundation values across the entire area, but given the uncertainties in the storm, all locations in this area should be prepared for these types of inundation values. It would also be prudent to look at other MOMs for context. For example, viewing the Category 3 and 4 MOMs shows what to expect if Florence were to reach the coast at a lower intensity.
Good. Now let’s fast forward by 2 days. We are now 3 days out from a potential landfall. Forecast confidence has increased, but the fine-scale details are still quite blurry regarding the exact location of landfall and how strong Florence will be. But this is when you can begin to turn to the MEOWs. From this point, we can begin to whittle down the MOMs to generate a more realistic potential scenario based on the information currently available. Below in Figure 5 is the forecast from September 10th at 11 pm AST.
Figure 5. NHC five-day forecast track and cone of uncertainty issued for Hurricane Florence at 11 PM AST September 10, 2018 (Advisory 47).
As you can see, there are some updates to the forecast track. The official forecast now calls for Florence to slow down significantly as it approaches the North Carolina coast. Let’s now talk about which MEOWs we should be looking at and explore how we select them. This is an important junction in the forecast because right now we need to evaluate what we do know, what we don’t know, and what we can and cannot rule out. Remember that MEOWs are generated individually for a particular storm category, forward speed, trajectory, and initial tide level. At this point, is there anything that we can rule out in terms of unrealistic directions that Florence could potentially make landfall? It’s ok to acknowledge that there remains some subjectivity here, but it needs to be an informed decision with an understanding that our risk tolerance is low. That being said, let’s go ahead and rule out some storm directions. Since the forecast track in advisory 47 reflects a northwestward trajectory at landfall, we’ll select that direction, as well as the two surrounding it (west-northwest and north-northwest) to account for uncertainty. How about the intensity? The latest forecast still shows Florence reaching the coast as a category 4 hurricane, so we still need to account for the possibility that it makes landfall one category stronger (category 5). Lastly, let’s consider the speed at which Florence is moving and will be moving near its landfall. The tropical cyclone forecast discussion from advisory 47 explicitly mentions that Florence is expected to decrease in forward speed as it approaches the coast:
“After that time [48 hours], a marked decrease in forward speed is likely as another ridge builds over the Great Lakes to the north of Florence.”
This is reflected in the official forecast which slows Florence down to less than 10 mph near the coast. While this certainly complicates the forecast, the beauty of using MEOWs is that it allows you to compensate for this forecast uncertainty. In this case, it’s fair that we could eliminate the MEOW forward speeds of 15, 25 and 35 mph, given forecaster confidence in Florence’s slow down. This leaves us with a forward speed of 5 mph (only a certain set of speeds is actually available to select).
Let’s quickly recap the parameters that we’ve settled on to generate our MEOW:
Intensity: Category 5
Direction/trajectory: Storms that are moving West-Northwest, Northwest, or North-Northwest
Forward Speed: 5 mph
Tide-level: High (this will always be the assumption)
Using those parameters, Figure 6 shows the potential storm surge inundation that could occur across eastern North Carolina:
Figure 6. Composite storm surge Maximum Envelope of Water (MEOW) over portions of eastern North Carolina for a category 5 hurricane moving west-northwest, northwest, or north-northwest at 5 mph at high tide.
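The whittling-down described above amounts to filtering a catalog of precomputed MEOWs. The sketch below is a hypothetical illustration: the four forward speeds come from the text, but the direction list, the tide labels, the catalog layout, and the function `select_meows` are assumptions made for this example and do not describe the actual software.

```python
from itertools import product

# Forward speeds are the four mentioned in the text; the direction list and
# tide labels are assumptions made for this sketch.
CATEGORIES = [1, 2, 3, 4, 5]
DIRECTIONS = ["W", "WNW", "NW", "NNW", "N", "NNE", "NE"]
SPEEDS_MPH = [5, 15, 25, 35]
TIDES = ["mean", "high"]

# One entry per precomputed MEOW: (category, direction, speed, tide).
catalog = list(product(CATEGORIES, DIRECTIONS, SPEEDS_MPH, TIDES))


def select_meows(catalog, forecast_category, directions, speeds_mph):
    """Keep MEOWs one category above the forecast intensity, at high tide,
    for the directions and forward speeds that cannot be ruled out."""
    category = min(forecast_category + 1, 5)
    return [
        (cat, direction, speed, tide)
        for cat, direction, speed, tide in catalog
        if cat == category
        and direction in directions
        and speed in speeds_mph
        and tide == "high"
    ]


# Choices made for Florence three days before landfall.
selected = select_meows(
    catalog,
    forecast_category=4,
    directions={"WNW", "NW", "NNW"},
    speeds_mph={5},
)
for meow in selected:
    print(meow)
```

Out of 280 entries in this toy catalog, three MEOWs remain. A composite like the one in Figure 6 would then take the highest value at each location across the selected MEOWs.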
To take this one step further, let’s zoom in around the New Bern, North Carolina, area and do a quick comparison of the category 5 MOM that we initially used 5 days out and compare it to the composite of MEOWs (Figure 7).
Figure 7. Comparison of MOM (left) and composite MEOW (right) from Figures 4 and 6 above, zoomed in on the New Bern, North Carolina, area.
Remember that at this point in the forecast process, we are looking at synthetic or simulated storms to get an idea of what the near-worst case storm surge inundation could be within an environment characterized by forecast uncertainty that’s very high. What differences do you notice when you compare the two pictures above? Don’t worry–your eyes aren’t deceiving you. You probably don’t notice much difference at all. That’s because, unfortunately, slow-moving storms moving in a generally northwestward direction are likely some of the worst types of storms for the New Bern area. Essentially, they’re the storms that are most likely to be causing the storm surge heights you see in the MOM. Our confidence in the hurricane’s forecast has increased since we’re 2 days closer to landfall, but the storm surge risk really hasn’t gone down at all. While that might be a sobering thought, this process allows emergency managers to be as efficient as possible, appropriately assess their risk, and focus on the most at-risk areas. This is a powerful and informative process when used properly!
In the end, while all of eastern North Carolina did not experience the type of storm surge flooding shown in Figure 6 above (which we didn’t expect anyway), some areas did. Areas around New Bern, for instance, had as much as 9 feet of storm surge inundation above ground level (red areas in Figure 8 below). Even though Florence’s peak winds decreased while the storm moved closer to the coast, the MOM and MEOW risk maps accounted for Florence’s increasing size and slow movement (which both contribute to more storm surge) and appropriately prepared emergency managers in the area for a severe storm surge event days before Florence even reached the coast.
Figure 8. Post-storm model simulation of storm surge inundation caused by Hurricane Florence around the New Bern/Neuse River area of North Carolina.
It’s important to note at this point that MOMs and MEOWs are predominantly used during the period before storm surge or wind-related watches and warnings are in effect for the coast (more than 48 hours before wind or surge is expected to begin). Once we get to within 48 hours when watches or warnings go into effect, another suite of storm surge products–specifically the Potential Storm Surge Flooding Map and the Storm Surge Watch/Warning graphics–become available. These products refine the storm surge risk profile even further because they are based on the characteristics of the actual storm, not on the simulated storms used in MOMs and MEOWs. We plan to create another blog post addressing these products in the near future.
To really bring this home, let’s circle all the way back around to the initial discussion of risk. How do risk tolerance and risk perception affect how these products are used? We know that these products are used by a wide range of people and organizations, all of which have varying tolerances of risk. It is unrealistic to assume that we at the National Hurricane Center could know how these tolerances change across our entire user base. That being said, it is our job to gently guide the decision-making in accordance with our own risk tolerance. Said another way, we work with emergency managers and the Hurricane Liaison Team (HLT) to hopefully bring those risk perceptions more in line with the ACTUAL risk for a given storm. Emergency managers have the resources at their disposal to view MOMs and MEOWs to build out their assessment of risk tailored to their local areas. They possess the intricate knowledge specific to their area which makes them invaluable partners to us at the NHC. During a storm, we sometimes provide advice on types of MOMs and MEOWs to consult to ensure that our partners fully capture a reasonable envelope of risk. These decisions can be stressful, especially when they have to be made in line with a risk tolerance that needs to be low by necessity. Remember what the cost is again here: human lives. It’s imperative that we capture the full breadth of the risk during every storm because the cost of not doing so is immense. We are comfortable accepting that our low risk tolerance can result in some areas not experiencing the potential storm surge that was conveyed prior to a hurricane making landfall. That is, by definition, what having a low tolerance for risk means, but it’s also by design. To us, one life lost is one life too many.
— Taylor Trogdon and Robbie Berg
Rappaport, E.N., 2014: Fatalities in the United States from Atlantic Tropical Cyclones: New Data and Interpretation. Bull. Amer. Meteor. Soc., 95, 341–346, https://doi.org/10.1175/BAMS-D-12-00074.1
Skill or Luck?
There’s one thing that many of us are missing right now while we’re occupying ourselves at home: sports. We should have been all set for the playoffs in major league hockey and basketball, and we would be excited about the beginning of the major league baseball and soccer seasons. We also would have been eagerly anticipating some of this spring and summer’s major sporting events, including the Olympics. So let’s dream a little…
When we set out to write this blog post for Inside the Eye, we wanted to show how National Hurricane Center (NHC) forecasters use their skill and expertise to predict the future track of a hurricane. And then it got us thinking, how does luck factor into the equation? In other words, when meteorologists get a weather forecast right, how much of it is luck, and how much of it is forecasters’ skill in correctly interpreting, or even beating, the weather models available to them?
Investment strategist Michael Mauboussin created a “Skill-Luck Continuum” where individual sports, among other activities in life, are placed on a spectrum somewhere between pure skill and pure luck (Figure 1). Based on factors such as the number of games in a season, number of players in action, and number of scoring opportunities in a game or match, athletes and their teams in some sports might have to rely on a little more luck than other sports to be successful. On this spectrum, a sport like basketball would be closest to the skill side (there are a lot of scoring opportunities in a basketball game) whereas a sport like hockey would require a little more luck (there are fewer scoring opportunities in a hockey match, and sometimes you just need the puck to bounce your way). Fortunately for hockey fans, there are enough games in a season for their favorite team’s “unlucky” games to not matter so much.
Figure 1. The Skill-Luck Continuum in Sports, developed by investment strategist Michael Mauboussin.
Where would hurricane forecasting lie on such a continuum? There’s no doubt that luck plays at least some part in weather forecasting too, particularly in individual forecasts when random or unforeseen circumstances could either play in your favor (and make you look like the best forecaster around) or turn against you (and make you look like you don’t know what you’re doing!). But luck is much less of a factor when you consider a lot of forecasts over longer periods of time, where the good and bad circumstances should cancel each other out and true skill shines through (just as in sports). At NHC, we routinely compare our forecasts with weather models over these long periods of time to assess our skill at predicting, for example, the future tracks of hurricanes.
An International Friendly?
From our experience of talking to people about hurricanes and weather models, it seems to be almost common “knowledge” that only two models exist – the U.S. Global Forecast System (GFS) and the European Centre for Medium Range Weather Forecasts (ECMWF) model. It’s true that those two models are used heavily at NHC and the National Weather Service in general, but there are many more weather models that can simulate a hurricane’s track and general weather across the globe. (Here’s a comprehensive list showing all of the available weather models that are used at NHC today, if you’re interested: https://www.nhc.noaa.gov/modelsummary.shtml.) We’ve also heard and seen people compare the GFS and ECMWF models and talk about which model scenario might be more correct for a given storm. This blog entry summarizes the performances of those models and discusses how, on the whole, NHC systematically outperforms them on predicting the track of a storm.
Below are the most recent three years of data (2017, 2018, and 2019) of Atlantic basin track forecast skill from NHC and the three best individual track models: the GFS, ECMWF, and the United Kingdom Meteorological Office model (UKMET) (Figure 2). Track forecast skill is assessed by comparing NHC’s and each model’s performance to that of a baseline, which in this case is a climatology and persistence model. This model makes forecasts based on a combination of what past storms with similar characteristics–like location, intensity, forward speed, and the time of year–have done (the climatology part) and a continuation of what the current storm has been doing (the persistence part). This model contains no information about the current state of the atmosphere and represents a “no-skill” level of accuracy.
Figure 2. NHC and selected model track forecast skill for the Atlantic basin in 2017, 2018, and 2019.
On the skill diagrams above, lines for models or forecasts that are above other lines are considered to be the most skillful. It can be seen that in each year shown, NHC (black line) outperforms the models and has the greatest skill at most, if not all, forecast times (the black line is above the other colored lines most of the time). Among the models, the ECMWF (red line) has been the best performer, with the GFS (blue line) and UKMET (green line) trading spots for second place.
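The text does not spell out how skill is computed, so the following is a sketch that assumes a common convention: skill is the percentage improvement of a forecast's average error over the baseline's average error. The function name and the error values are hypothetical.

```python
def skill_percent(forecast_error, baseline_error):
    """Percentage improvement over the baseline.

    0 means no better than the climatology-and-persistence baseline,
    positive means better, and negative means worse.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - forecast_error) / baseline_error


# Hypothetical mean 72-hour track errors in nautical miles.
baseline = 300.0
errors = [("forecast A", 100.0), ("forecast B", 150.0), ("baseline", 300.0)]
for name, error in errors:
    print(f"{name}: {skill_percent(error, baseline):.1f}% skill")
```

Under this convention, a line that sits higher on a skill diagram corresponds to smaller errors relative to the same baseline.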
Yet another metric to estimate how often NHC outperforms the models is called “frequency of superior performance.” Based on this metric, over the last 3 years (2017-19), NHC outperformed the GFS 65% of the time, the UKMET 59% of the time, and the ECMWF 56% of the time. This means that more often than not, NHC is beating these individual models. So the question is, how do the NHC forecasters beat the models?
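A metric of this kind can be sketched as a head-to-head count over paired forecasts. The function below is an assumption about how such a tally might work, not NHC's verification code. In particular, counting ties as "not superior" is a choice made for this sketch, and the error values are made up.

```python
def frequency_of_superior_performance(errors_a, errors_b):
    """Percent of paired forecasts in which A's error is smaller than B's.

    Each pair must verify at the same time for the same storm. Ties are
    counted as "not superior" (an assumption of this sketch).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error lists must be paired one-to-one")
    if not errors_a:
        raise ValueError("need at least one pair of forecasts")
    wins = sum(1 for a, b in zip(errors_a, errors_b) if a < b)
    return 100.0 * wins / len(errors_a)


# Made-up track errors (nautical miles) for five paired forecasts.
official = [40.0, 95.0, 60.0, 120.0, 30.0]
model = [55.0, 80.0, 75.0, 150.0, 30.0]

print(frequency_of_superior_performance(official, model))  # 60.0
```

Unlike an average error, this tally ignores how large each margin is, so the two measures can tell different stories about the same set of forecasts.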
Keep Your Eyes on the Ball
Forecasters at NHC are quite skilled at assessing weather models and their associated strengths and weaknesses. It is that experience and a methodology of using averages of model solutions (consensus) that typically help NHC perform best. If you ever read an NHC forecast discussion and see statements like “the track forecast is near the consensus aids,” or “the track forecast is near the middle of the guidance envelope,” the forecaster believed that the best solution was to be near the average of the models. Although this strategy often works, NHC occasionally abandons this method when something does not seem right in the model solutions. One recent example of this was Tropical Storm Isaac in 2018. The figure below (Figure 3) shows the available model guidance, denoted by different colors, at 2 PM EDT (1800 UTC) on September 9 for Isaac, with the red-brown line representing the model consensus (TVCA).
Figure 3. NHC forecast (dashed black line) and selected model tracks at 2 PM EDT (1800 UTC) September 9, 2018 for then-Tropical Storm Isaac. The solid black line represents the actual track of Isaac and the red-brown line represents the model consensus.
Although the models were in fair agreement that the storm would head westward for some time, a few models diverged by the time Isaac was expected to be near the eastern Caribbean Islands, mostly because they disagreed on how fast Isaac would be moving at that time. Instead of being near the middle of the guidance envelope, NHC placed the forecast on the southern side of the model suite (dashed black line) at the latter forecast times since the forecaster believed that the steering flow would continue to force Isaac westward into the central Caribbean. Indeed, NHC was correct in this case, and in fact, for the entire storm, NHC had very low track errors.
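The idea of a consensus can be sketched as an equally weighted average of the positions that several models predict for the same forecast time. This is a simplification under stated assumptions: operational consensus aids may be constructed differently, the model names and positions are made up, and averaging latitude and longitude directly is only reasonable when the points are close together.

```python
def consensus_position(positions):
    """Equally weighted mean of (latitude, longitude) pairs in degrees."""
    if not positions:
        raise ValueError("need at least one model position")
    lats = [lat for lat, _ in positions]
    lons = [lon for _, lon in positions]
    return sum(lats) / len(lats), sum(lons) / len(lons)


# Hypothetical 96-hour positions from three models (degrees north, east).
guidance_96h = {
    "model A": (14.0, -60.0),
    "model B": (15.0, -58.0),
    "model C": (16.0, -62.0),
}

lat, lon = consensus_position(list(guidance_96h.values()))
print(f"consensus: {lat:.1f}N, {abs(lon):.1f}W")  # consensus: 15.0N, 60.0W
```

A forecaster who distrusts part of the guidance, as in the Isaac case, is in effect choosing a position away from this average.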
In some cases all of the models turn out to be wrong, which usually causes the official forecast to suffer as well. That was the case for a period during Dorian in 2019. Figure 4 shows many of the available operational models at 8 PM EDT on August 26 (0000 UTC August 27) for then-Tropical Storm Dorian. As you can see by noting the deviation of the forecast lines from the solid black line (Dorian’s actual track), neither the models (colored lines) nor the official forecast (dashed black line) anticipated that Dorian would turn as sharply as it did over the northeastern Caribbean Sea, and no model showed a direct impact to the Virgin Islands, where Dorian made landfall as a hurricane.
Figure 4. NHC forecast (dashed black line) and selected model tracks at 8 PM EDT on August 26 (0000 UTC 27 August), 2019 for then-Tropical Storm Dorian. The solid black line represents the actual track of Dorian.
Figure 5 shows many of the operational models at 2 AM EDT (0600 UTC) on August 30 when Dorian, a major hurricane at the time, was approaching the Bahamas. You can see that all of the models showed Dorian making landfall in south or central Florida in about four days from the time of the model runs, and none of them captured the catastrophic two-day stall that occurred over Great Abaco and Grand Bahama Islands. NHC’s forecast followed the consensus of the models in this case and thus did not initially anticipate Dorian’s long, drawn-out battering of the northwestern Bahamas.
Figure 5. NHC forecast (dashed black line) and selected model tracks at 2 AM EDT (0600 UTC) on August 30, 2019 for Hurricane Dorian. The solid black line represents the actual track of Dorian.
The Undervalued Player? A Consistently Good Field-Goal Kicker
In American football, probably one of the most undervalued players on the field is the kicker. They don’t see much action during the majority of the game. But at the end of close games, who has the best chance to win the game for a team? A dependably accurate field goal kicker. In that vein, it’s not just accuracy that can make NHC’s forecasts “better” than the individual models. Another important factor is how consistent NHC’s predictions are from forecast to forecast compared to those from the models. We looked at consistency by comparing the average difference in the forecast storm locations between predictions that were made 12 hours apart. For example, by how much did the 96-hour storm position in the current forecast change from the 108-hour position in the forecast that was made 12 hours ago (which was interpolated between the 96- and 120-hour forecast positions)? Figure 6 shows this 4-day “consistency,” as well as the 4-day error, plotted together for the GFS, ECMWF, UKMET, and NHC forecasts for the Atlantic basin from 2017-19. It can be seen that NHC is not only more accurate than these models (it’s farthest down on the y-axis), but it is also more consistent (it’s farthest to the left on the x-axis), meaning the official forecast holds steady more than the models do from cycle to cycle. We like to say that we’re avoiding the model run-to-run “windshield wiper” effect (large shifts in forecast track to the left or right) or “trombone” effect (tracks that speed up or slow down) that are often displayed by even the most accurate models.
Figure 6. 96-hour NHC and model forecast error and consistency for 2017-2019 in the Atlantic basin (change from cycle to cycle).
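The consistency comparison described above can be sketched for a single pair of forecasts: interpolate the previous cycle's 108-hour position halfway between its 96- and 120-hour positions, then measure how far the current 96-hour position sits from it. The helper names, the haversine distance, the approximate Earth radius, and the positions are assumptions made for this sketch. Averaging this distance over many forecast pairs would give a season-long consistency value.

```python
import math

EARTH_RADIUS_NMI = 3440.065  # approximate mean Earth radius, nautical miles


def great_circle_nmi(p, q):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NMI * math.asin(math.sqrt(a))


def midpoint(p, q):
    """Linear interpolation halfway between two nearby (lat, lon) points."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def consistency_96h(previous_96h, previous_120h, current_96h):
    """Distance between the current 96-hour position and the previous
    cycle's interpolated 108-hour position, which is valid at the same time."""
    previous_108h = midpoint(previous_96h, previous_120h)
    return great_circle_nmi(previous_108h, current_96h)


# Hypothetical forecast positions from two cycles issued 12 hours apart.
shift = consistency_96h(
    previous_96h=(30.0, -75.0),
    previous_120h=(32.0, -77.0),
    current_96h=(31.5, -76.0),
)
print(f"cycle-to-cycle shift: {shift:.0f} n mi")
```

Here the new forecast lands half a degree of latitude north of where the previous cycle implied, a shift of roughly 30 nautical miles.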
NHC’s emphasis on consistency is so great that there are times when we knowingly accept that we might be sacrificing a little track accuracy to achieve consistency and a better public response to the threat. An example would be a hurricane that is forecast to move westward and pose a serious threat to the U.S. southeastern states. Sometimes, such storms “recurve” to the north and then the northeast and move back out to sea before reaching the coast. When the models trend toward such a recurvature, the NHC’s forecast will sometimes lag the models’ forecast of a lower threat to land. In these cases, NHC does not want to prematurely take the southeastern states “off the hook,” sending a potentially erroneous signal that the risk of impacts on land has diminished, only to have later forecasts ratchet the threat back up, after the public has turned their attention and energies elsewhere, if the models, well, “change their mind.” That would be the kind of windshield wiper effect NHC wants to prevent in its own forecasts. Now, there are times when the recurvature does indeed occur. Then, NHC’s track forecasts, which have hung back a little from the models, could end up having larger errors than the models. But NHC can accept having somewhat larger track forecast errors than the models in such circumstances at longer lead times if in doing so it can provide those at risk with a more effective message, achieved in part through consistency.
The superior accuracy and higher levels of consistency of the NHC forecasts are both important characteristics since emergency managers and other decision makers have to make challenging decisions, such as evacuation orders, based on that information. It is not surprising to us that NHC’s forecasts are more consistent than the global models, since forecasters here take a conservative approach and usually make gradual changes from the forecast they inherited from the previous forecaster. Conversely, the models often bounce around more and are not constrained by their previous prediction. And, unlike human forecasters, the models also bear no responsibility and feel no remorse when they are wrong!
Filling Out Your Bracket
Accuracy, consistency, and luck are important factors in one particularly favorite sport: college basketball. We just passed the time of year when we should have been crowning champions in the men’s and women’s college basketball tournaments. But before those tournaments would have kicked off, “bracketologists” (no known relation to meteorologists!) would have made predictions on which teams would make it into the tournaments and which teams would have been likely to win.
Think of it this way: a team can be accurate in that they have a spectacular winning record during the regular season, but does that mean they are guaranteed to win the tournament, or even advance far? Nope. As is often said, that’s why they play the game. An inconsistent team—one whose performance varies wildly from game to game—has a higher risk of having a bad game and losing to an underdog in the first few rounds, even if their regular season record by itself suggests they should have no problem winning. The problem is, they could have been very lucky in the regular season, winning a lot of close games that could have easily swung the other way. If that luck runs out, the inconsistent team could have an early exit from the tournament. With a consistent team, on the other hand, you pretty much know what kind of performance you’re going to get—good or bad—and that increases confidence in knowing how far in the tournament the team would advance. You’d want to hitch your wagon to a good team that is consistent and hasn’t had to rely on too much luck to get where they are.
The same can be said for hurricane forecasts from NHC and the models. NHC’s track forecasts are more accurate and more consistent than the individual models in the long run, and that fact should increase overall user confidence in the forecasts put out by NHC. Even still, there is always room to improve, and it is hoped that forecasts will continue to become more accurate and consistent in the future. It is always a good idea to read the NHC Forecast Discussion to understand the reasons behind the forecast and to gauge the forecaster’s confidence in the prediction. For more information on NHC forecast and model verification, click the following link: https://www.nhc.noaa.gov/verification/
— John Cangialosi, Robbie Berg, and Andrew Penny
Semper Paratus (Always Ready): A Shared Mission of Watching Over a Vast Blue Ocean
The National Hurricane Center (NHC) has the responsibility for issuing weather forecasts and warnings for a wide expanse of the Atlantic and eastern North Pacific Oceans. Within NHC, the Hurricane Specialist Unit (HSU) issues forecasts for tropical storms and hurricanes in these regions, issues associated U.S. watches and warnings, and provides guidance for the issuance of watches and warnings for international land areas. NHC’s Tropical Analysis and Forecast Branch (TAFB) makes forecasts of wind speeds and wave heights and issues wind warnings year-round for the eastern North Pacific Ocean north of the equator to 30°N, and for the Atlantic Ocean north of the equator to 31°N and west of 35°W (including the Gulf of Mexico and Caribbean Sea). These wind warnings include tropical storms and hurricanes as well as winter storms, tradewind gales, and severe gap-wind events (for example, the “Tehuantepecers” south of Mexico).
The United States Coast Guard (USCG) has areas of responsibility (AORs) that extend well beyond those of NHC, with potential weather hazards affecting the fleet and their missions over the ocean, inland U.S. waterways, and flood-prone U.S. land areas. Although the USCG is responsible for search and rescue missions that may occur due to weather hazards, they are also vulnerable to severe weather and must also protect their own fleet and crews from these hazards.
One of the USCG’s oldest missions and highest priorities is to render aid to save lives and property in the maritime environment. To meet these goals, the United States’ area of search-and-rescue responsibility is divided into internationally recognized inland and maritime regions. There are five Atlantic USCG Search and Rescue Regions (SRRs) (Boston, Norfolk, Miami, New Orleans, and San Juan) and two Pacific USCG SRRs (Alameda and Honolulu) that overlap with NHC’s hurricane and marine areas of responsibility. The other eastern Pacific regions north of the Alameda SRR do not typically, if ever, experience hurricane activity. The multi-million square mile area of the agencies’ overlap allows NHC to provide weather hazard Decision Support Services (DSS) for the USCG.
Building Partnerships with the Districts
The National Weather Service (NWS) signed a Memorandum of Agreement (MOA) with the USCG to provide them with weather support. Over the past couple of years, staff at NHC have had numerous discussions with several of the USCG districts in order to build stronger partnerships. These discussions, primarily involving how NHC can better serve the USCG, established criteria for requiring TAFB to provide weather briefings to key decision makers within the USCG. When criteria are met, TAFB provides the relevant USCG District with once- or twice-a-day briefing packages detailing the weather impacts on their area of responsibility. This information provides the USCG districts with the details necessary to make efficient and effective decisions about potential mobilization of their fleet.
2018 Hurricane Season Briefing Support
During the 2018 hurricane season, TAFB provided 30 briefings to USCG Districts 5 (Norfolk), 7 (Miami), 8 (New Orleans), and 11 (Alameda) for the several tropical storms and hurricanes that affected them. These interactions helped to build the relationships between NHC and the USCG districts and aided the districts in making decisions regarding fleet mobilization, conducting search and rescue missions, and preparation for USCG’s land-based assets and personnel. Some of these briefings occurred during rapidly evolving high impact scenarios, including Hurricane Michael. Michael was forecast to become a hurricane within 72 hours of developing into a tropical depression and was forecast to make landfall within 96 hours of its formation. Ultimately, Michael rapidly intensified into a category 5 hurricane only 3½ days after formation, before making landfall on the Florida Panhandle. Hurricane Michael’s track across the east-central Gulf of Mexico straddled the border of USCG Districts 7 (Miami) and 8 (New Orleans), leading to both Districts taking action in advance of the hurricane.
Support for District 5 (Norfolk)
The NWS’s Ocean Prediction Center, the NHC (through TAFB), and the NWS National Operations Center have worked together to provide weekly high-level coordination briefings to USCG District 5 on upcoming hazards focused on the Atlantic Ocean north of 31°N over the following seven days. Each Monday (or Tuesday if Monday is a holiday) by noon Eastern Time, the NWS provides a briefing that covers the mid-Atlantic region from New Jersey through North Carolina. Typically, the briefing covers the area to roughly 65°W, though the exact area covered can vary based on the week’s expected weather hazards. The USCG, in turn, has been sharing the information with mariners, port partners, and industry groups for situational awareness and critical decision-making.
NHC’s TAFB is ready to provide decision support services to the USCG Districts for the 2019 hurricane season. Plans are being developed to continue this type of support for many years to come.
— Andy Latto
The State of Hurricane Forecasting is . . .
The National Hurricane Center (NHC) has the responsibility for issuing advisories and U.S. watches/warnings for tropical cyclones (TCs), which include tropical depressions, tropical storms, and hurricanes, for the Atlantic and east Pacific basins. NHC has a long history of issuing advisories for TCs, with the first known recorded forecast being in 1954, when 24-hour predictions of a TC’s track were made. Since then, we’ve expanded our forecasts out in time and added predictions of TC intensity, size, and associated hazards, such as wind, storm surge, and rainfall. In addition, the lead times of tropical storm and hurricane watches and warnings have increased to give the public additional time to prepare for these potentially devastating events. Since we’re at the time of year when the U.S. President and state governors have just given their “State of the Union” or “State of the State” speeches, we thought this might be a good time to give our own “State of Hurricane Forecasting” speech. This blog entry takes a look at the accuracy of NHC’s forecasts and quantifies how much more accurate they are today compared to decades ago.
Track Forecasting (a.k.a., Where the Storm Will Go)
We are usually more confident in predicting the path of a TC than in predicting its strength or size. The primary reason is that the track of a TC is governed by forces larger than the tropical system itself, since the surrounding steering currents cover a much larger area than the hurricane. Because these nearby weather patterns are big, we can usually “see” them easily, and the global weather models do a fairly good job in predicting how these steering features might evolve over the course of a few days.
The figure below shows the average NHC track forecast errors for tropical storms and hurricanes by decade beginning in the 1960s. You can see that there has been a steady reduction in the track errors over time, with the average errors in the current decade about 30-40% smaller than they were in the 2000s and about half of the size (or even smaller) than they were in the 1990s.
If that doesn’t seem impressive, let’s look at another example. The next graphic shows two circles centered on a point near Pensacola, Florida, with the blue one representing the average 48-hour track error in 1990 and the red one showing the average 48-hour error today. What it shows is that if NHC had made a forecast for a storm to be over Pensacola in 48 hours back in 1990, the TC would have ended up, on average, not exactly over Pensacola but somewhere on the blue circle. If NHC makes the same forecast today, now the storm ends up, on average, somewhere on the red circle. You can easily see that the NHC forecasts for the path of a TC today are much more accurate, on average, than they were decades ago, and these more accurate forecasts have helped narrow the warning areas, save lives, and make for more efficient and less costly evacuations.
So, you might be wondering why the track forecasts are more accurate today than in the past. Well, the primary reason is the advancements in technology, specifically the improvements in the observing platforms (satellites, for example) and the various modeling systems we use to make forecasts. The amount and quality of data available to the models so they can paint an initial picture of the atmosphere have increased dramatically in the last 20 to 30 years. Also, the resolution and physics in the models we use today are far superior to what forecasters had available in the 1990s or prior decades, in part due to the tremendous improvements in computational capabilities. In addition, NHC has found ways to even beat the individual dynamical models by using a balance of statistical approaches and experience.
We often hear a lot of questions asking which model is the best one. Although some models are usually better than others, no model is perfect, and their performance varies from season to season and from storm to storm. Two of the most well-known models for weather forecasting are the U.S. National Weather Service’s Global Forecast System (GFS) and the European Centre for Medium-Range Weather Forecasts (ECMWF). The figure below shows a comparison of the NHC forecasts (OFCL, black) and forecasts from the GFS (GFSI, blue) and ECMWF (EMXI, red) models for Hurricanes Harvey, Irma, Maria, and Nate in 2017. In all of these cases, except for Hurricane Irma, OFCL performed as well as or better than GFSI and EMXI. Between the two models, EMXI beat GFSI for Harvey, Irma, and Nate, but GFSI beat EMXI for Maria.
Over the past decade, the average track errors of the GFSI and EMXI models have been quite close, so even though EMXI was the best-performing model most of the time in 2017, it does not mean that it will always be the best for every storm. The models that typically have the lowest errors are consensus aids, which blend several models together. Forecasters construct their own forecasts of how the storm will evolve, aided by model simulations and their knowledge of model strengths and weaknesses.
Even though our track forecasts are much more accurate today (in fact, preliminary estimates are that the 2017 Atlantic track forecasts set record low errors at all time periods), typical track errors currently start off at 37 n mi at 24 hours and then increase by about 35 n mi (40 mi) per day of the forecast. This means that our 5-day track error is on average around 180 n mi (210 mi). So, keep that in mind and be sure to account for forecast uncertainty when using NHC forecasts next hurricane season.
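The rule of thumb above is easy to turn into a quick back-of-the-envelope calculation. The sketch below assumes the growth is exactly linear after 24 hours, which is a simplification of the averages quoted in the text; the function names are hypothetical, and the sketch makes no claim about lead times shorter than 24 hours, which the text does not describe.

```python
N_MI_TO_STATUTE_MI = 1.15078  # statute miles per nautical mile


def typical_track_error_n_mi(lead_hours, error_24h=37.0, growth_per_day=35.0):
    """Approximate average track error (n mi) at a lead time of 24 hours or more.

    Assumes a straight-line increase from the 24-hour value, using the
    figures quoted in the text as defaults.
    """
    if lead_hours < 24:
        raise ValueError("this sketch only covers lead times of 24 hours or more")
    return error_24h + growth_per_day * (lead_hours - 24) / 24.0


def n_mi_to_statute_mi(distance_n_mi):
    """Convert nautical miles to statute miles."""
    return distance_n_mi * N_MI_TO_STATUTE_MI


if __name__ == "__main__":
    for hours in (24, 48, 72, 96, 120):
        error = typical_track_error_n_mi(hours)
        print(f"{hours:3d} h: {error:5.1f} n mi ({n_mi_to_statute_mi(error):5.1f} mi)")
```

At 120 hours this straight-line estimate gives 177 n mi, which matches the roughly 180 n mi 5-day error quoted above once it is rounded.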
Intensity Forecasting (a.k.a., How Strong the Storm Will Get)
Predicting the intensity of a tropical storm or hurricane is usually more challenging than forecasting its track. This is because the intensity of these weather systems is affected by factors that are both big and small. On the large scale, vertical wind shear (the change of wind speed and direction with height) and the amount of moisture in the atmosphere greatly affect the amount or organization of the thunderstorm activity that the TC can produce. Ocean temperatures also affect the system’s intensity, with temperatures below 80°F usually being too cool to sustain significant thunderstorm activity. However, smaller-scale features can also be at play. One of the more complex phenomena that affect a TC’s intensity is an eyewall replacement cycle. Initially, when two eyewalls, one inside the other, are present, the hurricane’s wind field will begin to expand, and as the inner eyewall dies, the hurricane’s peak winds start to weaken. However, if the second eyewall contracts, the hurricane can often re-intensify. The radar image below of Hurricane Irma (2017) was taken at the beginning of an eyewall replacement cycle, when the hurricane had a double eyewall structure.
Given these complex factors and the fact that errors in the track can also affect the TC’s future intensity, we have not made as much progress in this area as we have for track forecasting. The next graphic (below) shows NHC average intensity errors for Atlantic tropical storms and hurricanes by decade starting in the 1970s. Note that only small improvements were made in the intensity predictions from the 1970s through the 2000s. A much more significant reduction in error has occurred in the current decade, which could mean that the recent investment in new models and techniques is beginning to pay off. Today’s intensity errors are close to 15 kt (17 mph) from 72 to 120 h. This number is on the order of one Saffir-Simpson category, so we often encourage those who could be affected by a TC to prepare for a storm one category stronger (on the Saffir-Simpson Hurricane Wind Scale) than what we are forecasting.
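To make the “prepare for one category stronger” advice concrete, here is a small sketch that converts a forecast wind speed in knots to its Saffir-Simpson category and then bumps it up by one. The category thresholds in knots are those of the Saffir-Simpson Hurricane Wind Scale; the function names, the use of 0 for anything below hurricane strength, and the cap at category 5 are choices made for this illustration rather than anything NHC prescribes.

```python
KT_TO_MPH = 1.15078  # miles per hour in one knot

# Lower wind-speed bound (kt) of each Saffir-Simpson category, strongest first.
CATEGORY_THRESHOLDS_KT = [(137, 5), (113, 4), (96, 3), (83, 2), (64, 1)]


def kt_to_mph(speed_kt):
    """Convert knots to miles per hour."""
    return speed_kt * KT_TO_MPH


def saffir_simpson_category(wind_kt):
    """Return the category (1-5), or 0 when the wind is below hurricane strength."""
    for lower_bound_kt, category in CATEGORY_THRESHOLDS_KT:
        if wind_kt >= lower_bound_kt:
            return category
    return 0


def preparedness_category(forecast_wind_kt):
    """Category to prepare for: one above the forecast, capped at category 5."""
    return min(saffir_simpson_category(forecast_wind_kt) + 1, 5)
```

For instance, a forecast peak wind of 100 kt falls in category 3, so `preparedness_category(100)` suggests preparing for a category 4 hurricane. The 15-kt typical error quoted above converts to about 17 mph with `kt_to_mph(15)`.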
Although the GFS and ECMWF models are skillful for track forecasting and help us understand the environment around the TC, did you know that these models are typically inadequate to predict how strong a TC might become? Both the GFS and ECMWF are global models, and they cannot “see” sufficient detail within the storm to represent and predict the core winds in the hurricane’s eyewall. Therefore, we use different models to predict intensity, some that are run at high resolution specifically for TCs (e.g., Hurricane Weather Research and Forecasting [HWRF] model, Hurricanes in a Multi-scale Ocean-coupled Non-hydrostatic [HMON] model) and some that are statistical in nature (e.g., Statistical Hurricane Intensity Prediction Scheme [SHIPS], Logistic Growth Equation Model [LGEM]). The statistical models tell the forecaster what typically occurs for a TC in a specific location and environment based on past storm behavior. Even though the intensity models are improving, the gains in these models are much smaller than what has occurred in the models we use for track forecasting.
If you want more information on the models, please visit the following page for details: http://www.nhc.noaa.gov/modelsummary.shtml
Will the errors keep decreasing?
The short answer is they likely won’t forever. At some point the forecasts made by NHC and other forecasting centers will likely reach the limits of predictability. No one knows for sure what those limits are or when they will be reached, but researchers are still providing great information that is helping NHC make steady advancements as discussed above.
For more information on the NHC and model verification please visit the following page: http://www.nhc.noaa.gov/verification/