Pacing Strategy – Can Analytics Help Us Run Faster in Cross-Country? [ARTICLE]

By: Stephen Lane

Originally Published in Techniques Magazine

Over the past decade, data analysis has sparked a revolution across all sports. Well-known "truths" have been gradually - often begrudgingly - overturned by unorthodox heresies that not so long ago were casually dismissed. This dynamic burst into the popular consciousness with the book (and subsequent movie) Moneyball, about the low-budget Oakland A's innovative approaches to competing with behemoths like the Yankees. In the NBA, analytics transformed offensive strategy: 3-point attempts - previously undervalued and under-utilized, now a primary weapon - have increased from 13.7 per game in 2000-01 to 24 in 2015-16 (basketball-reference.com). And in the NFL, recent analyses argue that teams should go for it on 4th down more often than they do (Romer, 2006). The efficacy of data-driven analysis has convinced most of the early doubters. Nearly all major-sport franchises now spend lavishly on analytics; those that don't are derided as cave-dwelling adherents to stone age methods of scouting, analysis, and coaching. The question is no longer whether to invest in data analysis, but which approaches are most useful.

The purpose of this paper is to analyze the pacing strategies of athletes in championship collegiate cross-country races, for which there now exists a great deal of information. Basic statistical analysis suggests a very strong relationship between pacing and finishing time: relatively even pacing predicts faster times. Or, more accurately, athletes with less positive splits run faster and place higher. While at first glance this seems an obvious and not particularly useful statement, it suggests that the vast majority of championship-level collegiate runners employ sub-optimal pacing. This in turn leads to an important question without an obvious answer: Why do highly-trained athletes, guided by experienced and highly-educated coaches - with every incentive to optimize performance - stray so far from good pacing? High-level programs devote significant resources to nutritionists, psychologists, athletic trainers, and massage therapists; yet they appear to pay inadequate attention to a deceptively important aspect of success.

Endurance sports have historically been at the forefront of data analysis. Scientific research on energy production and the causes of fatigue informs coaches' training plans. And of course, coaches collect copious data on pacing from both races and practices. Our understanding of pacing strategy is imperfect, but what we do know suggests that relatively even pacing is better than dramatically uneven pacing (Abbiss and Laursen, 2008; Foster et al, 1993; Gosztyla et al, 2006; van Schenau et al, 1994). Studies of record-setting performances on the track show that in events longer than 800m, pacing strategy is characterized by a fast start, slower middle, and a fast finish (Tucker et al, 2006). Physiology supports the analytical conclusion that in distance events, relatively even pacing yields more successful results than uneven pacing. Yet cross-country results illustrate an inability of runners to apply the lessons that research teaches.

There are several significant obstacles to understanding cross-country pacing. First, terrain varies. To use an extreme example, if a course goes straight uphill for the first half, and straight downhill for the second, even splits would not be optimal. And a course with tight bottlenecks might require getting through the bottleneck first, even if doing so causes dramatic slowing later in the race. (The success of such a strategy, though, hinges on one's ability to get through the bottleneck ahead of all the other runners trying to get there first.) Second, course measurements are imprecise - neither USATF nor IAAF will certify cross-country course distances. Third, coaches may intentionally instruct athletes to race in ways that aren't conducive to achieving the fastest times - perhaps most commonly, employing pack-racing tactics which result in non-optimal pacing for both the fastest and slowest members of the pack.

Despite these obstacles, cross-country racing is ripe for analysis. Part one of this paper details the statistical analyses used, and demonstrates the degree to which pacing correlates with finishing time. Part two addresses the potential impact of better pacing on performance. Part three concludes with a discussion of possible reasons why proper pacing in cross-country appears to be so difficult to achieve. It is important to stress that this is a preliminary investigation. The statistical analysis is rudimentary, and there is much we do not know and cannot yet control for. As such, this is the beginning of a conversation. Finding ways to improve pacing strategy should be a significant training goal for any cross-country runner - indeed, better pacing is essentially free improvement: without increasing speed or fitness, athletes will run faster, and teams will place higher.

Part One - Statistical Analysis

Results from 12 NCAA cross-country races were analyzed. Races were selected on the following criteria: Field size greater than 150; race distance of 10,000m and provision of 2000m splits for all athletes; and regional or national championship races (under the assumption that athletes and coaches are more likely to be racing to optimize finishing time and place at championships than during the regular season). These criteria limited the selection of races to men's NCAA Division I or II Regional or National Championship races. Five National Championship races and seven Regional Championships were used in the analysis. (Table 1 lists the twelve meets analyzed).

Each athlete's final race pace was expressed as a percentage of their 2K split pace (Final RP / 2K Pace). For example, if Final RP = 102.53% of 2K Pace, the athlete's final overall RP was 2.53% slower than their pace early in the race; if Final RP = 98% of 2K Pace, the athlete actually got faster throughout the race. We can then look at the correlation between this number and finishing time. Graphs 1 & 2 show a very clear correlation: for each quintile, pacing grows progressively more positive. The top quintile of finishers had the least positive splits, the 2nd quintile paced more positively, down to the last quintile, which used the most positive pacing.
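As a rough illustration of the metric described above, the following Python sketch computes Final RP / 2K Pace from a finishing time and a 2K split. The function names and the sample runner are hypothetical and are not drawn from the race data; the sketch assumes a 10,000m race and expresses both paces in seconds per kilometer.

```python
def parse_time(text):
    """Convert a clock time such as '31:00' or '6:04.5' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def pacing_ratio(finish_seconds, split_2k_seconds, race_km=10.0):
    """Return Final RP / 2K Pace as a percentage.

    Both paces are in seconds per kilometer, so the ratio is unitless.
    A value above 100 means the runner's overall pace was slower than
    the pace of the first 2K (positive splits).
    """
    final_pace = finish_seconds / race_km
    early_pace = split_2k_seconds / 2.0
    return 100.0 * final_pace / early_pace


# Hypothetical runner (not from the article's data):
# 6:00 through 2K, 31:00 at the finish.
ratio = pacing_ratio(parse_time("31:00"), parse_time("6:00"))
print(f"Final RP / 2K Pace = {ratio:.2f}%")
```

A runner who holds exactly the same pace throughout scores 100%; the further above 100% the number climbs, the more the runner slowed after the first 2K.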

Table 1 shows a more precise look at this relationship. These results suggest very strong correlations and predictive powers (all p-values < 0.0001, adjusted R² values ranging from 38.9% to 82.4%), with predicted finishing time improving between 18.6 and 31.7 seconds for each 1% improvement in Final RP / 2K Pace, depending on the race.
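The article does not say what software produced these regressions, so the following is only a minimal sketch of one way to obtain a slope (seconds of finishing time per 1% of Final RP / 2K Pace) and an adjusted R² for a single race. It assumes a simple one-predictor least-squares fit, and the six data points are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r2, adj_r2). The adjusted R-squared uses
    the one-predictor form: 1 - (1 - r2) * (n - 1) / (n - 2).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adj_r2


# Invented sample: pacing ratio (%) and finishing time (seconds).
ratios = [100.5, 101.0, 102.0, 103.0, 104.5, 106.0]
times = [1805.0, 1822.0, 1840.0, 1870.0, 1901.0, 1940.0]

slope, intercept, r2, adj_r2 = fit_line(ratios, times)
print(f"{slope:.1f} seconds per 1% of Final RP / 2K Pace")
print(f"adjusted R-squared = {adj_r2:.3f}")
```

The slope is the quantity reported in seconds per 1% above; a real analysis would also report a p-value, which this sketch leaves out.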

As a second step in expressing the relationship between pacing and finish time, a virtual race model was created using data from the 12 races. First, finish results from each race were collapsed into 100 data points by taking the average finishing time for each percentile in each race. Then, for each percentile, the average pacing strategy (expressed as Final RP / 2K Pace of the runners in each percentile) was calculated in each race. Finally, these finishing times and pacing strategies were averaged across all 12 races. Essentially the model turns the thousands of runners from the 12 races into 100 virtual "runners" - with each runner's finish time and pacing strategy calculated by averaging the respective percentile results from each race: that is, the finishing time and pacing strategy of the first virtual runner is the average of the finishing times and pacing strategies of the top one percent of finishers in each race, and the 2nd place virtual runner's time and pacing is the average of the times and paces of the second percent of finishers, all the way down to 100th place and the last percent. Graph 3 shows the relationship between pacing and finish time for this model. It is remarkably strong: R² = 0.96. The fastest virtual finishers pace closest to even, and finishing time and pacing progress in a remarkably linear fashion down to the slowest (and most positive-pacing) runners. Granted, turning the multitudes of runners from 12 races into 100 data points has the effect of smoothing out a lot of statistical variation; even so, this is a surprisingly close linear relationship between pacing and finishing time.
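The virtual race model described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the original analysis: the function names are hypothetical, runners are assigned to percentile bins by finish order, and the two tiny "races" are invented (and collapsed into 2 bins rather than 100) so the arithmetic is easy to follow.

```python
from statistics import mean


def percentile_bins(race, bins=100):
    """Collapse one race into per-bin averages of (time, pacing ratio).

    `race` is a list of (finish_seconds, pacing_ratio) tuples. Runners
    are ranked by finishing time and split into `bins` groups by rank.
    """
    ordered = sorted(race)  # fastest finisher first
    n = len(ordered)
    grouped = {}
    for rank, runner in enumerate(ordered):
        grouped.setdefault(rank * bins // n, []).append(runner)
    return {
        b: (mean(t for t, _ in runners), mean(r for _, r in runners))
        for b, runners in grouped.items()
    }


def virtual_race(races, bins=100):
    """Average each bin's time and pacing ratio across all races."""
    per_race = [percentile_bins(race, bins) for race in races]
    model = []
    for b in range(bins):
        present = [pr[b] for pr in per_race if b in pr]
        if present:  # a bin can be empty if a field is smaller than `bins`
            model.append(
                (mean(t for t, _ in present), mean(r for _, r in present))
            )
    return model


# Invented data: (finish seconds, Final RP / 2K Pace in %).
race_a = [(1800.0, 101.0), (1820.0, 102.0), (1900.0, 104.0), (1920.0, 105.0)]
race_b = [(1810.0, 100.0), (1830.0, 101.0), (1910.0, 105.0), (1930.0, 106.0)]

model = virtual_race([race_a, race_b], bins=2)
for place, (time, ratio) in enumerate(model, start=1):
    print(f"virtual runner {place}: {time:.1f} s, {ratio:.1f}%")
```

With fields larger than 150, as in the races analyzed, every one of the 100 bins contains at least one runner from each race.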

To summarize most simply, the top runners are the ones who slow down the least, while rear-pack runners start too fast and slow too much. In some cases (this appears to be one), statistical analysis merely restates the obvious. Although it may be uncharitable to say so, the news that college cross-country runners employ sub-optimal pacing is not earth-shaking. However, the stark truth of this analysis leads to an important and deceptively difficult question: Why do so many athletes employ such sub-optimal pacing? There is no rational reason this should be, no reason why runners with PRs of 33 minutes should pace differently than 30-minute runners. If runner A has a PR 10% slower than runner B, sensible strategy dictates that runner A run the first 2K roughly 10% slower than runner B. Why does runner A race as if his PR is much closer to that of runner B?

Several caveats: First, variables besides pacing exert much greater influence on finishing time - most obviously, running ability. It is fair to assume that the top finishers are more talented than mid- and rear-pack finishers. Pacing is not the causative factor - a 33-minute 10K runner is not going to beat a 30-minute 10K runner just by pacing better. But because there is no obvious way to control for talent, this analysis does not account for running ability - thus, pacing appears to be a stronger predictor than it is. However, it is worth stating again: there is no reason why the optimal pacing strategy for a 33-minute runner should be different from that of a 30-minute runner. Their pacing profiles should be similar, merely adjusted for speed. Secondly, some races (most egregiously, the 2014 Great Lakes Regional) are essentially sit-and-kick affairs with very conservative early paces. In these cases, pacing was probably sub-optimal; however, even eliminating sit-and-kick races, there is still a clear correlation between less-positive pacing and faster times. Further, in cases where we know the final 2K split, overall pacing is much more predictive of final time than finishing speed (time of the last 2K relative to overall race pace).

Part Two - Should Runners Start Slower?

The above analysis is not enough to show that runners could finish significantly faster simply by pacing better. Assuming finishing time reflects near-maximal effort, runners cannot be expected to go out at the same pace and finish faster. To pace more evenly, a runner must start slower. We may not know how much slower a runner should start - this depends on how far over the redline his early pace was. However, we can say how much slower he could start and still run faster if his pacing were better. For this analysis, successful pacing strategy is defined as the average pacing strategy (Final RP / 2K Pace) of the top quintile of finishers in a race. Then, we can ask the following question: if runners outside of the top quintile adopted the successful pacing strategy ("Top Quintile's Final RP/2K"), how much more slowly could they have run the first 2K and still finish at least one second faster than their actual time? This newer, slower 2K split is found by the following equation:

New 2K Split = (Actual Finishing Time - 1) / (5 × Top Quintile's Final RP/2K)

Table 2 gives the results by quintile for each race. Runners in the faster quintiles (who tend to pace better than those in slower quintiles) have less wiggle room - they would need to run relatively closer to their actual 2K split in order to finish higher. Still, on average, those in the 2nd quintile could run the first 2K 7.4 seconds slower, and still run a faster time if they used a better pacing strategy. (On the track, this would be 1.5 seconds per lap, a significant deviation from prescribed pace for elite runners.) For the third and fourth quintiles, their first 2Ks could be 10.8 and 14.7 seconds slower respectively, while for the last quintile, the first 2K could be a remarkable 23.7 seconds slower. In other words, many athletes could probably rein in their early pace quite significantly and still finish in a better spot.
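To make the New 2K Split calculation concrete, here is a small Python sketch of the equation. The runner and the top-quintile ratio below are hypothetical values chosen only for illustration; the ratio is written as a fraction (1.03 means Final RP was 103% of 2K Pace), and the 5 in the equation is the number of 2K segments in a 10,000m race.

```python
def new_2k_split(finish_seconds, top_quintile_ratio, margin=1.0, segments=5):
    """Slowest first-2K split (seconds) that still beats the actual time.

    finish_seconds     -- the runner's actual finishing time
    top_quintile_ratio -- top quintile's Final RP / 2K Pace, as a fraction
    margin             -- seconds by which the new time must be faster
    segments           -- number of 2K segments in the race (5 for 10,000m)
    """
    return (finish_seconds - margin) / (segments * top_quintile_ratio)


# Hypothetical runner: 6:05 (365 s) through 2K, 32:00 (1920 s) at the finish.
actual_split = 365.0
actual_finish = 1920.0
assumed_top_ratio = 1.03  # assumption for illustration, not from Table 2

split = new_2k_split(actual_finish, assumed_top_ratio)
print(f"New 2K split: {split:.1f} s")
print(f"Room to start slower: {split - actual_split:.1f} s")
```

The check on the result is simple: a runner who opens with the new split and then holds the top quintile's ratio finishes in (split × 5 × ratio) seconds, which is exactly one second under the actual time.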

Part Three - Why is Pacing in Race Conditions so Difficult?

Pacing is not easy. Decades of coaching observations at meets of all levels support this conclusion. Whatever mechanism in the body is responsible for regulating pace, humans are not naturally good at it in race conditions. Yet, athletes in organized training programs spend significant amounts of time training at particular paces - including race pace. Assuming athletes are able to pace themselves relatively successfully in weekly interval workouts, something must change in races.

Several possibilities suggest themselves. First, team scoring exigencies may lead coaches to employ sub-optimal pacing in pursuit of lower scores. Coaches may instruct athletes to "run up" - for example, have runners 4 and 5 stick with #3 for as long as possible. Or, coaches may exhort athletes to "get out" early in the race, to the detriment of athletes' ability to maintain that pace throughout. But we are speculating. We do not have data on how many athletes were instructed to get out quickly in a given race, nor do we know whether those are the athletes falling off the pace. Possibly, these strategies are sensible gambles, but if everyone in a race is attempting to get out, most are getting in over their heads without improving their position relative to the field. An overly aggressive start won't work if everyone uses it.

A second possibility is that athletes need earlier external feedback to support their internal pacing cues. On the track, athletes get pacing information no later than 400m - or 4% - into a 10K race. In cross-country, the first data point may come 10% (1K), or 16% (1 mile) into the race - at which point it may be too late to correct. Athletes may need information earlier in cross-country. (It is, however, precisely during this first part of the race that athletes are most often told to get out quickly.)

Finally, the useful information runners receive may be overwhelmed by more powerful external signals. Noakes and others postulate that the central nervous system plays a significant role in regulating exertion in endurance events (Tucker and Noakes, 2009); perhaps, under competitive stress, runners are less able to heed messages from this regulating system. Here, the likely culprit is the field itself, as internal pacing cues are overwhelmed by signals from hundreds of fellow competitors. A herd mentality takes over. Herding, well-studied in the social and cognitive sciences, is defined as "the alignment of thoughts or behaviors of individuals in a group (herd) through local interaction, and without centralized coordination." (Raafat et al, 2009.) It is often an unconscious emotional reaction - individuals respond without realizing that they are doing so. Self-awareness is dramatically impacted by what one researcher describes as an "emotional contagion" (Raafat et al, 2009), which could well interfere with internal regulators. Thus, lacking definitive information either from the coach or the stopwatch, runners rely on the herd, which reinforces an overly emotional and aggressive approach in the early stages of the race. No matter the pacing habits inscribed in practice, the herd drives the runners to a pace they cannot maintain. Eventually, physiological reality overcomes the herding impulse, and early exuberance melts in the face of fatigue. The runner slows down, and is dropped from the herd.

The practical implications of herd mentality for distance running, though well-understood in a general way by coaches, would probably yield greater insight if subjected to greater scientific study. Herding is not entirely negative. A "safe" pace - what the body might naturally choose in isolation - likely minimizes risk of damage to the body, but it is probably not fast enough to maximize competitive potential. Success in competitive distance running probably requires overriding (to some degree) internal regulators. There is evidence that competition - both against oneself and against others - can lead to "supramaximal" performance (Wilmore, 1968). It remains for the coach to find the appropriate balance between using the power of the herd and not surrendering completely to its siren call.

As noted at the top, the analyses presented in this article are preliminary. There are many areas in need of further investigation: how coaches teach pacing in practice, and what instruction is given at races; which athletes set the pace in a crowded field, and how a crowded field impacts runners' exertion and pacing; how fields impede runners who start conservatively and attempt to move up; the degree to which cross-country pacing differs from that in track and road races; whether women pace differently from men - all of these may yield deeper insight. Successful cross-country pacing is at once both simple and difficult to define: to paraphrase Justice Potter Stewart's threshold for obscenity, we know good pacing when we see it. Unfortunately, we also too readily accept bad pacing from our athletes. A better understanding of what good pacing is - and how to coach it in athletes - will help coaches and athletes employ better pacing and enjoy better results.

---------------------------------------
SOURCES

http://www.basketball-reference.com/leagues/NBA_2001.html, and http://www.basketball-reference.com/leagues/NBA_2016.html#all_team_stats.

Abbiss, C., & Laursen, P., (2008). Describing and understanding pacing strategies during athletic competition, Sports Medicine, 38(3), 239-252.

Foster, C., Snyder, A., Thompson, N., Green, M., Foley, M., & Schrager, M., (1993). Effect of pacing strategy on cycle time trial performance, Medicine and Science in Sports and Exercise, 25(3), 383-388.

Gosztyla, A., Edwards, D., Quinn, T., & Kenefick, R., (2006). The impact of different pacing strategies on five kilometer running time trial performance, Journal of Strength and Conditioning Research, 20(4), 882-886.

Raafat, R., Chater, N., & Frith, C., (2009). Herding in humans, Trends in Cognitive Sciences, 13(10), 420-428.

Romer, D., (2006). Do firms maximize? Evidence from professional football, Journal of Political Economy, 114(2), 340-365.

Tucker, R., Lambert, M., & Noakes, T., (2006). An analysis of pacing strategies during men's world-record performances in track athletics, International Journal of Sports Physiology and Performance, 1(3), 233-245.

Tucker, R., & Noakes, T. (2009). The physiological regulation of pacing strategy during exercise: a critical review, British Journal of Sports Medicine, 43(1), 1-9.

van Schenau, G., de Koning, J., & de Groot, G. (1994). Optimisation of sprinting performance in running, cycling and speed skating, Sports Medicine, 17(4), 259-275.

Wilmore, J. (1968). Influence of motivation on physical work capacity and performance, Journal of Applied Physiology, 24(4), 459-463.

SOURCES For Meet Results

http://www.flashresults.com/2010_Meets/xc/NCAAMWRegion/
http://www.gosycamores.com/fls11520011statistics/cross/MenIndividual.htm?DB_OEMID=15200
http://www.deltatiming.com/results/xcresults.aspx?yf=2011&mf=2011-ncaad1-great-lakes-region-xc&ev=1&sp=True
http://www.ustfccca.org/assets/results12011xc/ncaa-dl-xc-2011-MEN.html
http://www.deltatiming.com/results/xcresults.aspx?yf=2012&mf=2012-ncaaxc-south-regional&ev=2&sp=True
http://results.deltatiming.com/xc/2012-ncaa-d1-cross-country-championships/results/2
http://pttiming.com/events/87928395
http://www.onlineraceresults.com/race/view_plain_text.php?race_id=37193, https://www.tfrrs.org/results/xc/9178.html#76508
http://flashresults.com/2016_Meets/xc/NCAASE/index.htm
http://results.flotrack.org12015111-13 -NCAAD1SIVVebITimetable.php?D=1
http://branchsportstech.com/2015_Meets/XCView/#
http://www.ncaa.com/ncaa-cross-country-championship-live-timing0

----------------------------------------
Steve Lane has been the cross-country and track and field coach at Concord-Carlisle High School since 1998. His teams have won seven MIAA titles, 13 Dual County League titles, and the Concord-Carlisle boys cross-country team was named "Team of the Decade" for 2000-2009 by the Mike Mahon XC Poll. He has twice been named Boston Globe Coach of the Year.