Bigger, Stronger, and Faster — but Not Quicker?



There’s some controversial IQ research which suggests that reaction times have slowed and people are getting dumber, not smarter. Here’s Dr. James Thompson’s summary of the hypothesis:

We keep hearing that people are getting brighter, at least as measured by IQ tests. This improvement, called the Flynn Effect, suggests that each generation is brighter than the previous one. This might be due to improved living standards as reflected in better food, better health services, better schools and perhaps, according to some, because of the influence of the internet and computer games. In fact, these improvements in intelligence seem to have been going on for almost a century, and even extend to babies not in school. If this apparent improvement in intelligence is real we should all be much, much brighter than the Victorians.

Although IQ tests are good at picking out the brightest, they are not so good at providing a benchmark of performance. They can show you how you perform relative to people of your age, but because of cultural changes relating to the sorts of problems we have to solve, they are not designed to compare you across different decades with say, your grandparents.

Is there no way to measure changes in intelligence over time on some absolute scale using an instrument that does not change its properties? In the Special Issue on the Flynn Effect of the journal Intelligence Drs Michael Woodley (UK), Jan te Nijenhuis (the Netherlands) and Raegan Murphy (Ireland) have taken a novel approach in answering this question. It has long been known that simple reaction time is faster in brighter people. Reaction times are a reasonable predictor of general intelligence. These researchers have looked back at average reaction times since 1889 and their findings, based on a meta-analysis of 14 studies, are very sobering.

It seems that, far from speeding up, we are slowing down. We now take longer to solve this very simple reaction time “problem”. This straightforward benchmark suggests that we are getting duller, not brighter. The loss is equivalent to about 14 IQ points since Victorian times.

So, we are duller than the Victorians on this unchanging measure of intelligence. Although our living standards have improved, our minds apparently have not. What has gone wrong? [“The Victorians Were Cleverer Than Us!” Psychological Comments, April 29, 2013]

Thompson discusses this and other relevant research in many posts, which you can find by searching his blog for Victorians and Woodley. I’m not going to venture my unqualified opinion of Woodley’s hypothesis, but I am going to offer some (perhaps) relevant analysis based on — you guessed it — baseball statistics.


It seems to me that if Woodley’s hypothesis has merit, it ought to be confirmed by the course of major-league batting averages over the decades. Other things being equal, quicker reaction times ought to produce higher batting averages. Of course, there’s a lot to hold equal, given the many changes in equipment, playing conditions, player conditioning, “style” of the game (e.g., greater emphasis on home runs), and other key variables over the course of more than a century.

Undaunted, I used the Play Index search tool to obtain single-season batting statistics for “regular” American League (AL) players from 1901 through 2016. My definition of a regular player is one who had at least 3 plate appearances (PAs) per scheduled game in a season. That’s a minimum of 420 PAs in a season from 1901 through 1903, when the AL played a 140-game schedule; 462 PAs in the 154-game seasons from 1904 through 1960; and 486 PAs in the 162-game seasons from 1961 through 2016. I found 6,603 qualifying player-seasons, and a long string of batting statistics for each of them: the batter’s age, his batting average, his number of at-bats, his number of PAs, etc.
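The qualifying rule above is easy to state in code. Here is a minimal sketch in Python; the function names are my own, and the schedule lengths are simply the ones given in the preceding paragraph:

```python
def scheduled_games(year: int) -> int:
    """AL schedule length by season, using the cutoffs stated in the post."""
    if 1901 <= year <= 1903:
        return 140
    if 1904 <= year <= 1960:
        return 154
    if 1961 <= year <= 2016:
        return 162
    raise ValueError(f"season {year} is outside the 1901-2016 sample")


def min_plate_appearances(year: int, pa_per_game: int = 3) -> int:
    """Minimum PAs needed to count as a 'regular' in the given season."""
    return pa_per_game * scheduled_games(year)


def is_regular(year: int, plate_appearances: int) -> bool:
    """True if a player-season meets the 3-PAs-per-scheduled-game cutoff."""
    return plate_appearances >= min_plate_appearances(year)


for season in (1902, 1950, 2016):
    print(season, min_plate_appearances(season))  # 420, 462, 486
```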

The raw record of batting averages looks like this, fitted with a 6th-order polynomial to trace the shifts over time:


That’s nice, you might say, but what accounts for the shifts? I considered 21 variables in an effort to account for the shifts, and ended up using 20 of the variables in a three-stage analysis.

In stage 1, I computed the residuals resulting from the application of the 6th-order polynomial. That is, I subtracted from the actual batting averages the estimates produced by the equation displayed in figure 1. For ease of reference, I call this first set of residuals the r1 residuals.
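For readers who want to see the mechanics, stage 1 might look something like the following sketch. It uses NumPy rather than Excel, and the data are synthetic stand-ins (the real inputs were the 6,603 player-seasons), so the numbers it produces mean nothing in themselves:

```python
import numpy as np
from numpy.polynomial import Polynomial


def stage1_residuals(years, batting_avgs, degree=6):
    """Fit a polynomial trend of BA on year and return (trend, r1 residuals)."""
    years = np.asarray(years, dtype=float)
    batting_avgs = np.asarray(batting_avgs, dtype=float)
    # Polynomial.fit rescales the years internally, which keeps a
    # 6th-order fit on values near 2000 numerically stable.
    trend = Polynomial.fit(years, batting_avgs, deg=degree)
    r1 = batting_avgs - trend(years)  # actual minus estimated
    return trend, r1


# Synthetic stand-in data (an assumption, not the post's data set).
rng = np.random.default_rng(0)
years = rng.integers(1901, 2017, size=500)
ba = 0.270 + 0.010 * np.sin((years - 1901) / 20.0) + rng.normal(0.0, 0.030, size=500)

trend, r1 = stage1_residuals(years, ba)
print(len(r1), float(np.abs(r1).max()))
```

Because the fit includes a constant term, the r1 residuals average out to zero; everything that follows is an attempt to explain their scatter.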

I began stage 2 by finding the correlations between each of the 21 candidate variables and the r1 residuals. I then estimated a regression equation with the r1 residuals as the dependent variable and the most highly correlated variable as the explanatory variable. I next found the correlations between the remaining 20 variables and the residuals of that regression equation. I introduced the most highly correlated variable into a new regression equation, as a second explanatory variable. I continued this process in the expectation that I would come across an explanatory variable that was statistically insignificant, at which point I would stop. But I ran through 16 explanatory variables without hitting a stopping point, and that exhausted the number of explanatory variables allowed by the regression function in Excel 2016.
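The stage-2 procedure is a form of forward selection, and can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the Excel work: the function names are mine, the data are synthetic, and I approximate "statistically insignificant" by an absolute t-statistic below 1.96 rather than by an exact p-value. The cap of 16 variables mirrors the limit mentioned above.

```python
import numpy as np


def ols(X, y):
    """OLS with an intercept; returns (coefficients, t-statistics, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return beta, beta / np.sqrt(np.diag(cov)), resid


def forward_select(candidates, y, t_crit=1.96, max_vars=16):
    """Add the candidate most correlated with the current residuals,
    re-estimate, and stop at the first insignificant addition."""
    chosen, remaining = [], list(candidates)
    resid = y - y.mean()
    while remaining and len(chosen) < max_vars:
        best = max(remaining,
                   key=lambda k: abs(np.corrcoef(candidates[k], resid)[0, 1]))
        X = np.column_stack([candidates[k] for k in chosen + [best]])
        _, t, new_resid = ols(X, y)
        if abs(t[-1]) < t_crit:  # newest variable is insignificant: stop
            break
        chosen.append(best)
        remaining.remove(best)
        resid = new_resid
    return chosen, resid


# Synthetic demonstration: x1 and x2 matter, x3 is pure noise.
rng = np.random.default_rng(1)
n = 2000
cands = {name: rng.normal(size=n) for name in ("x1", "x2", "x3")}
y = 0.5 * cands["x1"] - 0.3 * cands["x2"] + rng.normal(0.0, 0.1, size=n)

chosen, resid = forward_select(cands, y)
print(chosen)
```

Note that a procedure like this picks variables by their correlation with what is left over, which is one reason the resulting coefficients can carry unexpected signs when the candidates are correlated with each other.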

The 16th regression on the r1 residuals left me with a set of residuals that I call the r2 residuals. In stage 3, I estimated a new equation with the r2 residuals as the dependent variable, following the same procedure that I used to obtain the 16-variable regression on the r1 residuals. In this case, I used 4 of the remaining explanatory variables; the 5th proved statistically insignificant.

I then combined the estimates obtained in the three stages to obtain the equation that’s discussed later, and at length. For now, I’ll focus on the apparent precision of the equation and its implications for the hypothesis that the general level of intelligence has declined with time.
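In symbols (the notation here is mine, introduced only for illustration), combining the three stages amounts to adding the fitted values from each stage:

```latex
\widehat{BA} \;=\; \underbrace{P_6(\mathrm{YR})}_{\text{stage 1}}
\;+\; \underbrace{c_2 + \sum_{i=1}^{16} b_i x_i}_{\text{stage 2: estimate of } r_1}
\;+\; \underbrace{c_3 + \sum_{j=17}^{20} b_j x_j}_{\text{stage 3: estimate of } r_2}
```

Here P_6 is the 6th-order polynomial in year, the x's are the 20 explanatory variables, the b's are their coefficients, and c_2 and c_3 are the stage-2 and stage-3 intercepts.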


Here’s how well the equation fits the data:


The 6th-order polynomial regression lines (black for actual, purple for estimated) are almost identical.

Here’s how the final estimates (vertical axis) correlate with the actual batting averages (horizontal axis):


I’ve never seen such a tight fit based on more than a few observations, and this one is based on 6,603 observations. I’m showing 6 decimal places in the trendline label so that you can see the 3 significant figures in the constant, which is practically zero.

Year (YR) enters as a significant variable in stage 3, with a coefficient of -0.0000284. (The 95-percent confidence interval is -.0000214 to -.0000355; the p-value is 3.40E-15.) So, everything else being the same (a matter to which I’ll come), batting averages dropped by .00327 between 1901 and 2016 (-.00327 = -.0000284 x 115). (Note: It’s conventional to drop the 0 to the left of the decimal point in baseball statistics. And if you’re unfamiliar with baseball statistics, I can tell you that a difference of .00327 is taken seriously in baseball; many a batting championship race has been decided by a smaller margin.)
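The arithmetic is simple enough to check directly. This short sketch uses only the coefficient and confidence bounds quoted above; the variable names are mine:

```python
YR_COEF = -0.0000284                 # stage-3 coefficient on YR
YR_CI = (-0.0000355, -0.0000214)     # 95-percent confidence bounds
FIRST_SEASON, LAST_SEASON = 1901, 2016

span = LAST_SEASON - FIRST_SEASON    # 115 years
effect = YR_COEF * span
effect_range = tuple(bound * span for bound in YR_CI)

print(f"{effect:.5f}")                                       # -0.00327
print(f"{effect_range[0]:.5f} to {effect_range[1]:.5f}")     # -0.00408 to -0.00246
```

So even taking the coefficient at face value, the implied change over the whole period lies somewhere between about 2.5 and 4 points of batting average.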

If the compound equation resulting from stages 1, 2, and 3 accounts satisfactorily for all changes affecting BA, the estimate of -.00327 might be attributed to the slowing of batters’ reaction times. However, despite the statistical robustness of the coefficient on YR, it’s necessary to ask whether there are factors not properly accounted for that might point to the conclusion that reaction times have remained about the same or improved. To get at that question, I’ll present and discuss in the next section a table that summarizes the complete equation and all 20 of its explanatory variables. As you read and interpret the table, keep these points in mind:

The 6th-order polynomial (stage 1) is a filter. It captures the fluctuations over time that must be accounted for by the 20 “real” variables that are listed in the table (including YR) and discussed below the table. The “year” terms in the 6th-order polynomial are therefore irrelevant to the question of whether reaction times have slowed.

Every p-value in the stage-2 and stage-3 regression equations is smaller than 0.0001, and most of them are far, far below that threshold.

The significance of the explanatory variables notwithstanding, the standard errors of the stage-1 and stage-2 equations are both about .0027. Therefore, the 95-percent confidence interval surrounding estimates of BA derived from those equations is plus or minus .0053. As discussed above, that’s not a small error in the context of baseball statistics. In fact, it’s enough to swamp the effect of YR.

As discussed below, many of the explanatory variables have intuitively incorrect signs and are highly correlated with each other. This casts doubt on the validity of the derived coefficients, including the coefficient on YR.

I don’t mean to say that reaction times have stayed the same or become faster. I simply mean that this analysis is inconclusive about the trend (if any) of reaction times — possibly because there is no trend, in one direction or the other.

The equation, taken as a whole, does an admirable job of accounting for changes in BA over the span of 115 years. But I can’t take any of its parts seriously.

It’s been great fun but it was just one of those things.
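The point about the standard error swamping the effect of YR can be made concrete with a few lines of arithmetic. The sketch below assumes the usual large-sample multiplier of 1.96 for a 95-percent interval; the other figures are the ones quoted in this post:

```python
STD_ERROR = 0.0027                        # approximate standard error of the equations
Z_95 = 1.96                               # normal-approximation multiplier (an assumption)
YR_EFFECT = -0.0000284 * (2016 - 1901)    # cumulative effect of YR, 1901 to 2016

margin = Z_95 * STD_ERROR
print(f"{margin:.4f}")                    # 0.0053
print(margin > abs(YR_EFFECT))            # True
```

In other words, the uncertainty around any single estimate of BA is larger than the entire 115-year change attributed to YR.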


Table 1 gives the coefficients of the explanatory variables, along with the standard errors and 95-percent confidence intervals of those coefficients, and the maxima, minima, and means of the variables themselves. Statistical parameters and estimated values are expressed to three significant figures. For ease of comparison, I use decimal notation rather than scientific notation for the explanatory variables.


Next is table 2, which gives the cross-correlations among the explanatory variables (including the 21st variable that’s not in the equation). Positive correlations above 0.5 are highlighted in green; negative correlations below -0.5 are highlighted in yellow; statistically insignificant correlations are denoted by gray shading.

TABLE 2
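A cross-correlation table of this kind can be sketched in a few lines. The example below is illustrative only: the data are synthetic, the function name is mine, and I take the highlighting thresholds to be plus and minus 0.5. It doesn't attempt the significance shading.

```python
import numpy as np


def correlation_flags(data, names, threshold=0.5):
    """Return the correlation matrix plus a dict of highlighted pairs."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    flags = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = corr[i, j]
            if r > threshold:
                flags[(names[i], names[j])] = "green"    # strongly positive
            elif r < -threshold:
                flags[(names[i], names[j])] = "yellow"   # strongly negative
    return corr, flags


# Synthetic stand-ins: b tracks a, c mirrors a, d is unrelated.
rng = np.random.default_rng(2)
a = rng.normal(size=1000)
b = a + 0.1 * rng.normal(size=1000)
c = -a + 0.1 * rng.normal(size=1000)
d = rng.normal(size=1000)

corr, flags = correlation_flags(np.column_stack([a, b, c, d]), ["a", "b", "c", "d"])
print(flags)
```

When many pairs light up, as they do in table 2, the individual regression coefficients become hard to interpret, because the regression can't cleanly apportion an effect among variables that move together.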

Here’s my explanation and interpretation of the explanatory variables:

Intercept (c) (shown in table 1)

This is the sum of the intercepts derived from the 6th-order polynomial fit and the stage-2 and stage-3 regression analyses.

On-base-plus-slugging percentage minus batting average (OPS – BA)

BA is embedded in both components of on-base-plus-slugging percentage (OPS). By subtracting BA from OPS, I partly decouple that relationship and obtain a rough measure of a batter’s propensity to get on base (mainly) by walking, plus his propensity for hitting doubles, triples, and home runs. But see OBP – BA and SLG – BA, below.
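To see how BA is embedded in OPS, here is a small sketch using the standard definitions of the rates. The stat line is hypothetical, chosen only to make the arithmetic easy to follow:

```python
def batting_rates(ab, h, bb, hbp, sf, tb):
    """Standard rate statistics from counting stats.

    ab = at-bats, h = hits, bb = walks, hbp = hit by pitch,
    sf = sacrifice flies, tb = total bases.
    """
    ba = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = tb / ab
    ops = obp + slg
    return {
        "BA": ba,
        "OBP": obp,
        "SLG": slg,
        "OPS": ops,
        "OPS-BA": ops - ba,
        "OBP-BA": obp - ba,
        "SLG-BA": slg - ba,
    }


# Hypothetical season: 150 hits in 500 at-bats, 50 walks, 250 total bases.
rates = batting_rates(ab=500, h=150, bb=50, hbp=0, sf=0, tb=250)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Hits appear in the numerators of both OBP and SLG, so subtracting BA once from their sum removes only part of the overlap. That is why the two companion variables, OBP – BA and SLG – BA, also show up in the equation.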

Strikeouts per plate appearance (SO/PA)

The positive coefficient on SO/PA is counterintuitive. In any particular at-bat, striving to hit a home run is thought to reduce a batter’s ability to make contact with the ball. The positive coefficient therefore reflects the positive relationship between HR/PA and BA (see below), and the tendency of home-run hitters to strike out more often than other hitters.

On-base percentage minus batting average (OBP – BA)

The negative coefficient on this variable probably means that it’s compensating for the residual component of BA that lingers in OPS – BA. This variable and OPS – BA should be thought of as complementary variables: each is meaningless without the other.

Home runs per plate appearance (HR/PA)

The positive coefficient on this variable seems to capture the positive relationship between HR and BA. For example, most of the great home-run hitters also compiled high batting averages. (Peruse this list.)

Integration (BLK)

I use this variable to approximate the effect of the influx of black players (including non-white Hispanics) since 1947. BLK measures only the fraction of AL teams that had at least one black player for each full season. It begins at 0.25 in 1948 (the Indians and Browns signed Larry Doby and Hank Thompson during the 1947 season) and rises to 1 in 1960, following the signing of Pumpsie Green by the Red Sox during the 1959 season. The positive coefficient on this variable is consistent with the hypothesis that segregation had prevented the use of players superior to many of the whites who occupied roster slots because of their color.

Deadball era (DBALL)

The so-called deadball era lasted from the beginning of major-league baseball in 1871 through 1919 (roughly). It was called the deadball era because the ball stayed in play for a long time (often for an entire game), so that it lost much of its resilience and became hard to see because it accumulated dirt and scuffs. Those difficulties (for batters) were compounded by the spitball, the use of which was officially curtailed beginning with the 1920 season. (See this and this.) Batting averages and the frequency of long hits (especially home runs) rose markedly after 1919. Given the secular trend shown in figure 1, it’s surprising to find a positive coefficient on DBALL, which is a dummy variable (value = 1) assigned to all seasons from 1901 through 1919. So DBALL is probably picking up the net effect of other factors. It should be considered a complementary variable.

Performance-enhancing drugs (DRUG)

Their rampant use seems to have begun in the early 1990s and trailed off in the late 2000s. I assigned a dummy variable of 1 to all seasons from 1994 through 2007 in an effort to capture the effect of PEDs on BA. The resulting coefficient suggests that the effect was (on balance) negative, though slight. Players who used PEDs generally strove for long hits, which may have had the immediate effect of reducing their batting averages.

Slugging percentage minus batting average (SLG – BA)

I consider this variable to be a complement to OPS – BA and OBP – BA.

Number of major-league teams (MLTM)

The standard view is that expansion hurt the quality of play by diluting talent. However, expansion didn’t keep pace with population growth over the long run (see POP/TM, below). In any event, MLTM should be considered another complementary variable.

Night baseball, that is, baseball played under lights (LITE)

It has long been thought that batting is more difficult under artificial lighting than in sunlight. This variable measures the fraction of AL teams equipped with lights, but it doesn’t measure the rise in night games as a fraction of all games. I know from observation that that fraction continued to rise even after all AL stadiums were equipped with lights. The positive coefficient on LITE suggests that it’s yet another complementary variable. It’s very highly correlated with BLK, for example.

Average age of AL pitchers (PAGE)

The r1 residuals rise with PAGE until PAGE = 27.4, then they begin to drop. This variable represents the difference between 27.4 and the average age of AL pitchers during a particular season. The coefficient is multiplied by 27.4 minus the average age of pitchers; that is, by a positive number for ages lower than 27.4, by zero for age 27.4, and by a negative number for ages above 27.4. The positive coefficient suggests that, other things being equal, pitchers younger than 27.4 give up hits at a lower rate than pitchers older than 27.4. I’m agnostic on the issue.

Complete games per AL team (CG/TM)

A higher rate of complete games should mean that starting pitchers stay in games longer, on average, and therefore give up more hits, on average. The positive coefficient seems to contradict that hypothesis. But there are other, related variables (P/TM and IP/P/G), so this one should be thought of as a complementary variable.

Number of pitchers per AL team (P/TM)

It, too, has a surprisingly positive coefficient. One would expect the use of more pitchers to cause BA to drop (see IP/P/G).

World War II (WW2)

A lot of the game’s best batters were in uniform in 1942-1945. That left regular positions open to older, weaker batters, some of whom wouldn’t otherwise have been regulars or even in the major leagues. The negative coefficient on this variable captures the war’s effect on hitting, which suffered despite the fact that a lot of the game’s best pitchers also served.

Bases on balls per plate appearance (BB/PA)

The negative coefficient on this variable suggests that walks are collected predominantly by above-average hitters, who are deprived of chances to hit safely. See, for example, the list of batters who collected the most career bases on balls. Anecdotally, during the many years when I regularly listened to and watched baseball games, announcers often spoke of the “intentional” unintentional walk and “pitching around” a batter. In both cases, a pitcher would aim for the outside edges of the plate, to avoid giving a batter a good pitch to hit. If that meant a walked batter and a chance to pitch to a weaker batter, so be it.

Innings pitched per AL pitcher per game (IP/P/G)

This variable reflects the long-term trend toward the use of more pitchers in a game, which means that batters more often face rested pitchers who come at them with a different delivery and repertoire of pitches than their predecessors. IP/P/G has dropped steadily over the decades, exerting a negative effect on BA. This is reflected in the positive coefficient on the variable, which means that BA rises with IP/P/G. But the effect is slight, and it’s prudent to treat this variable as a complement to CG/TM and P/TM.

AL fielding average (FA)

Fielding averages have risen generally since 1901, which was an especially bad year at .938. The climb from .949 in 1902 to .985 in 2016 was smooth and almost uninterrupted. How would that affect BA? Here’s an example: A line drive that in 1916 bounced off the edge of a fielder’s glove might have been counted as a hit or an error, and if it just missed the glove it would usually be counted as a hit. A century later the same line drive would almost always be caught in the much larger glove worn by a fielder in the same position. It therefore seems to me that the coefficient on this variable should be negative, that is, a higher FA should mean a lower BA. The positive coefficient points to a confounding factor (e.g., BLK).

Year (YR)

This is the crucial variable, and the value of its coefficient, given the inclusion of all the other variables, may say something about the IQ hypothesis. After taking into account the 19 other variables in this equation, the coefficient on YR is slightly negative, which suggests that batters have generally been getting a bit slower. But as discussed throughout this post, there’s much uncertainty about the validity of the equation and, therefore, about the validity of the coefficient on YR.

Maximum distance traveled by AL teams (TRV)

Does travel affect play? Probably, but the mode and speed of travel (airplane vs. train) probably also affect it. The slightly positive coefficient on this variable (which is highly correlated with YR, BLK, MLTM, and several others) is meaningless, except insofar as it combines with all the other variables to account for BA.

U.S. population in millions per major-league team (POP/TM)

POP/TM has been rising almost without pause, despite expansion, and is now at its peak value. The negative coefficient is therefore surprising, and probably reflects the strong correlation of POP/TM with BLK, and perhaps other variables.

Batter’s age (BAGE)

This is the 21st variable, which isn’t in the final equation. The r1 residuals don’t vary with BAGE until BAGE = 37, whereupon the residuals begin to drop. Accordingly, this variable represents the difference between 37 and a player’s age during a particular season.

In sum, there’s no way of knowing whether the negative coefficient on YR is related to reaction time, the (probably) greater speed of today’s pitchers, the greater variety of pitches thrown by today’s pitchers, or anything else that’s not adequately reflected by the 20 variables in the final equation. I rest my case and throw myself on the mercy of the court.


Baseball Stuff



In the past two days I’ve published two long posts about baseball at my other blog. If you like baseball statistics and the interpretation thereof, you should read “Yankee-Killers (and Victims)” and “Great (Baseball) Performances.”

See also “A Rather Normal Distribution.”