In Chapter 16 of my book, “Sandlot Stats: Learning Statistics with Baseball,” I developed a new formula that uses a player’s seasonal batting statistics to assign a probability of that player duplicating any batting streak. I then applied the formula to find which players had the highest probabilities of duplicating special batting streaks. Of course, the most talked-about batting streak is DiMaggio’s 56-game hitting streak. Another interesting streak belongs to Ted Williams, who in 1949 reached base safely in 84 straight games. Which streak was harder to achieve? My formula assigned DiMaggio a probability of .0001 (1/10,000) and Williams a probability of .0935 (935/10,000) of achieving their respective streaks. In other words, using their batting statistics for 1941 and 1949, DiMaggio would duplicate his streak once in every 10,000 seasons, while Williams would duplicate his 935 times in 10,000 seasons. Clearly, DiMaggio’s streak was the harder to achieve.

I also applied my formula to many other batting streaks such as the most consecutive games with at least one home run, the most consecutive games without striking out and many other streaks. If you are interested in seeing the mathematics I used to develop my formula and the players who actually own these records, please read Chapter 16—titled ‘Streaking’— in my book.

Below are the players with the longest hitting streaks in both the Major and Minor Leagues. Observe that Joe is the only player that appears on both lists.

In a recent article Sara Lang looked at the streak by the numbers:

**.408:** DiMaggio hit .408 (91-for-223) during the streak with 15 home runs and 55 RBIs.

**.375:** He entered May 15 (the first game of the streak) with a .306 batting average. That rose to .375 after the July 16 game, the final game of the streak.

**4:** DiMaggio faced four future Hall of Fame pitchers: Lefty Grove, Hal Newhouser, Bob Feller and Ted Lyons.

**10:** DiMaggio extended the streak in his final plate appearance 10 times, as Elias research notes.

**16:** DiMaggio started a 16-game hitting streak the game after the 56-game one ended. So he hit in 72 of 73 games total. In those 73 games, he had 120 hits, 20 home runs and six strikeouts.

**44:** The longest hitting streak since DiMaggio’s is a 44-gamer by Pete Rose in 1978.

**29:** The longest hitting streak by a Yankees player since DiMaggio’s streak ended is a 29-gamer by Hall of Famer Joe Gordon in 1942. Derek Jeter’s longest hitting streak was 25 games in 2006. Don Mattingly’s longest was 24 in 1986. Those are the three longest for the Yankees since DiMaggio.

A discussion of DiMaggio’s 56-game hitting streak resurfaces whenever a player starts approaching the record. The Cleveland Indians’ 20-year-old catcher Francisco Mejia, ranked their number four prospect and playing for the High-A Lynchburg Hillcats, entered August batting .344. On Aug. 4, he extended his hitting streak to 45 games by doubling in the 9th inning after going 0-for-4. As of this writing his hitting streak stands at 47 games, the 7th longest in Minor League history.

Below are the players with the longest hitting streaks in both the Major and Minor Leagues. Many observations can be made from these two tables. DiMaggio is the only player who appears on both lists. In fact, his 61-game streak in the Minors was longer than his 56-game streak in the Majors. Yes, Joe D. was a very special player. Except for Joe, the other players listed in the Minor League table had limited Major League success. In contrast, the Major League players listed are all Hall of Fame caliber players. What conclusions can you draw from this?

In an article by Herm Krabbenhoft which appeared in the Baseball Research Journal, he compares DiMaggio’s 56-game hitting streak to Williams’ 84-game on-base streak. Krabbenhoft gives his answer in terms of approachability. He states, “Since DiMaggio achieved his streak in 1941, the closest any major league player has come to it was the 44-game hitting streak by Pete Rose in 1978. Forty-four is 78.6% of the way to 56. Since Williams achieved his 84-game streak in 1949, the closest any player has come to it were the 58 consecutive game on-base streak by Duke Snider in 1954 and Barry Bonds in 2003. Fifty-eight is 69% of the way to 84. So, with the above approachability considerations in mind, it can be argued that Teddy Ballgame’s 84 game on-base safely streak may be the greatest batting achievement of all.” Since Krabbenhoft’s article was published in 2004, Orlando Cabrera recorded a consecutive game on-base streak of 63 games in 2006. Sixty-three is 75% of the way to 84. This blows a hole in the approachability argument.
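The approachability percentages quoted above are simple ratios of the closest challenger's streak to the record streak. A minimal Python sketch of that arithmetic follows; the function name `approachability` is mine, chosen for illustration, and is not taken from Krabbenhoft's article.

```python
def approachability(closest: int, record: int) -> float:
    """Closest challenger's streak as a percentage of the record streak."""
    return 100.0 * closest / record


# (closest streak, record streak), using the figures quoted in the text
cases = {
    "Rose 44 vs. DiMaggio 56 (hitting)": (44, 56),
    "Snider/Bonds 58 vs. Williams 84 (on-base)": (58, 84),
    "Cabrera 63 vs. Williams 84 (on-base)": (63, 84),
}

for label, (closest, record) in cases.items():
    print(f"{label}: {approachability(closest, record):.1f}%")
```

Once Cabrera's 63 games are included, the two records sit only a few percentage points apart on this measure, which is why approachability alone does not settle the question.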

As a sabermetrician, I give my answer using probability theory. Which player, DiMaggio or Williams, had the smaller probability of achieving his streak, based on his statistics for that year? Using a player’s number of games played, number of plate appearances and number of successes, combined with the length of the streak, I created a formula which gives the probability of any player, based on his season’s batting statistics, duplicating any batting streak. The development of my probability formula for different batting streaks can be found in two books. In my book, *Sandlot Stats: Learning Statistics with Baseball*, published by Johns Hopkins University Press, I devote all of Chapter 16 to comparing different batting streaks. My research on streaks was also published as Chapter 4 in the book *Mathematics and Sports*, published by the Mathematical Association of America.

Applying my probability formula to both players’ streaks, here are the results. For the year 1941, the probability of Joe DiMaggio achieving his 56-game hitting streak was 0.0001 or 0.01%. For the year 1949, the probability of Ted Williams achieving his 84-game on-base streak was 0.0944 or 9.44%. For every 10,000 seasons, we would have expected DiMaggio in 1941 to accomplish his streak once while we would have expected Williams in 1949 to accomplish his streak 944 times. Ted Williams himself said, “I believe there isn’t a record on the books that will be tougher to break than Joe DiMaggio’s 56-game hitting streak.”
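The "once in 10,000 seasons" reading is just the season probability multiplied by the number of seasons. The short Python sketch below reproduces that step only. It takes the two probabilities as given and does not implement the streak formula itself; the helper name `expected_streak_seasons` is illustrative.

```python
def expected_streak_seasons(p: float, seasons: int = 10_000) -> float:
    """Expected number of seasons containing the streak, treating each
    season as a trial with the same success probability p."""
    return p * seasons


# Season probabilities as quoted in the text
probabilities = {
    "DiMaggio, 56-game hitting streak (1941)": 0.0001,
    "Williams, 84-game on-base streak (1949)": 0.0944,
}

for label, p in probabilities.items():
    print(f"{label}: about {expected_streak_seasons(p):.0f} in 10,000 seasons")
```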

Based on the probabilities calculated above, I agree with Williams that DiMaggio’s 56-game hitting streak is the more impressive. What about the probabilities associated with the 2016 streaks of Bradley and Ozuna? For Bradley’s 29-game hitting streak, the probability was 0.00281 or 0.281%.

Ozuna’s probability of a 36-game on-base streak was 0.0125 or 1.25%. Bradley’s streak is the more impressive one.

If you are wondering why Williams’ 84-game streak had such a high probability of occurring in 1949, the lengthy answer is in my book.

The Miami Heat had a 27-game winning streak in the 2012-13 season and finished with a 66-16 record. The Heat went on to defeat the San Antonio Spurs in a series that went the full seven games to repeat as NBA champions. We shall see if the Warriors can duplicate the Lakers and Heat and win the 2015-16 NBA title.

In 2015, the NFL’s Carolina Panthers started the season with 14 straight wins. In the history of the NFL only three other teams have done this: the 1972 Miami Dolphins, the 2007 New England Patriots, and the 2009 Indianapolis Colts. The 2007 Patriots became the first team to finish the regular season undefeated after the NFL expanded its regular season to sixteen games in 1978. They won the divisional and conference playoffs before losing Super Bowl XLII to the NY Giants, giving them a final record of 18-1. The 1972 Dolphins finished the regular season 14-0 and went on to win Super Bowl VII, going undefeated at 17-0 for the entire season. The 2009 Colts lost their last two regular season games, finishing 14-2. The Colts made it to the Super Bowl but lost to the Saints.

This brings me to baseball. In 2015, the Toronto Blue Jays achieved an impressive streak, and then repeated it later in the same season. Surprisingly, these streaks went largely unnoticed by many fans. The Blue Jays had two 11-game winning streaks in 2015. The last MLB team to have two winning streaks of at least 11 games in one season was the 1954 Cleveland Indians. The Blue Jays clinched a playoff berth on September 25, 2015, their first since 1993, ending what was the longest playoff drought in North American professional sports at the time. On September 30, the team clinched the American League East Division. They went on to defeat the Rangers in the Division Series. They were eliminated from the playoffs when they lost to the Royals in game 6 of the ALCS.

This led me to comparing the 1954 Indians to the 2015 Blue Jays. Boy was I surprised at what I found. Here are their statistics:

The formula I created to calculate a team’s Expected Win% is .000673*(RS-RA) + .5, where RS is runs scored and RA is runs allowed.

Using this formula:

The Expected Win% for Indians is .000673*242+.5 = .663.

The Expected Win% for Blue Jays is .000673*221+.5 = .649.

Conclusion: The Indians OVERPERFORMED; the Blue Jays UNDERPERFORMED.

The record for the longest winning streak by a Major League baseball team belongs to the 1916 New York Giants, who won 26 games in a row. What is amazing is that earlier that season they also pulled off a 17-game winning streak. These two streaks gave the Giants a combined record of 43-0. But the most surprising fact about the 1916 Giants is that their overall record was 86-66 with 597 RS and 504 RA. This means that if we take away their two streaks, their record would be a dismal 43-66. Their actual Win% was .566 and their expected Win% was .000673*(597-504) +.5 = .563.
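The Expected Win% calculations above can be checked with a few lines of Python. This is a minimal sketch that uses only the run totals and records quoted in the text; the function name `expected_win_pct` is mine, chosen for illustration.

```python
def expected_win_pct(run_diff: int) -> float:
    """Expected Win% from the linear formula .000673*(RS-RA) + .5."""
    return 0.000673 * run_diff + 0.5


# Run differentials (RS - RA) as quoted in the text
print(f"1954 Indians expected Win%:   {expected_win_pct(242):.3f}")
print(f"2015 Blue Jays expected Win%: {expected_win_pct(221):.3f}")

# 1916 Giants: 86-66 with 597 runs scored and 504 allowed
wins, losses = 86, 66
giants_actual = wins / (wins + losses)
giants_expected = expected_win_pct(597 - 504)
print(f"1916 Giants actual Win%:      {giants_actual:.3f}")
print(f"1916 Giants expected Win%:    {giants_expected:.3f}")

# Record outside the 26-game and 17-game winning streaks
streak_wins = 26 + 17
print(f"Giants outside the streaks:   {wins - streak_wins}-{losses}")
```

A team whose actual Win% is above the formula's value overperformed its run differential, and one below it underperformed. The 1916 Giants land almost exactly on the formula despite the two long streaks.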

These three teams need to be examined using sabermetrics to explain their strange results.

The title of this blog comes from the book *Joe DiMaggio and the Last Magic Number in Sports* by Kostya Kennedy. His book takes the reader through each game of Joe DiMaggio’s 56-game hitting streak at a time when America was preparing for war. Joe’s streak began on May 15, 1941 when he blooped a single to right field in a game against the White Sox. The streak ended two months later at Cleveland’s Municipal Stadium, in front of 67,000 cheering fans. That day Joe had 4 plate appearances: he walked once and hit 3 ground balls. The first ground ball was a rocket down the 3rd base line which was backhanded by Cleveland’s Ken Keltner, who threw Joe out by a step. Joe walked in the 4th inning. In the 7th he ripped another rocket to Keltner, who threw him out again. In his final plate appearance he hit a routine grounder to the shortstop. Joe’s greatness showed when he promptly started a new hitting streak which lasted for 16 games. All told, Joe produced at least one hit in 72 of 73 games. Neither his 56-game hitting streak nor hitting safely in 72 of 73 consecutive games has ever been duplicated. Without Keltner’s great fielding, the consecutive game streak might have reached 73 games.

The year 1941 also marked the last time a Major League hitter batted over .400 when Ted Williams batted .406 for the season. The year 1941 witnessed two remarkable baseball feats that many baseball experts say will never happen again. The baseball writers had a tough choice for the 1941 AL MVP Award. They chose the Yankees’ DiMaggio over the Red Sox’s Williams.

In my book, *Sandlot Stats: Learning Statistics with Baseball*, I devote Chapter 16 to the study of many different types of batting streaks. In that chapter I develop a new probability formula which uses a player’s actual batting statistics for a season to calculate his probability of duplicating any of these batting streaks. These calculated probabilities allow us to compare different batting streaks and see which would be the hardest to duplicate.

The rivalry between DiMaggio and Williams also extended to batting streaks. Ted Williams possesses 2 amazing on-base streaks. He holds the record for getting on-base in 84 consecutive games (1949) and the record for getting on-base in 16 consecutive plate appearances (1957). To be credited with getting on-base a player must either get a hit, a walk or be hit by a pitch. Using my probability formula, I calculated the probability of Joe and Ted achieving their 3 streaks. DiMaggio had a 1 in 10,000 chance of achieving his 56-game hitting streak while Williams had a 1 in 10 chance of achieving his 84-game on-base streak and a 1 in 25 chance of achieving his 16-plate appearance on-base streak. Which streak was the hardest to achieve? From a probability point of view the answer is clear. Yes, Joe DiMaggio’s streak was the hardest to achieve. In fact, Ted Williams said, “I believe there isn’t a record on the books that will be tougher to break than Joe DiMaggio’s 56-game hitting streak.”

In Chapter 16 of my book I provide 4 lists of special baseball and softball players. The lists include the players with the longest hitting streaks in the Major Leagues, the Minor Leagues, the college baseball leagues and the college softball leagues. In the Major Leagues, Pete Rose (1978) and Willie Keeler (1897) are tied for second place with 44-game hitting streaks. In the Minor Leagues, Joe Wilhoit (1919) had a 69-game hitting streak, followed by, would you believe, Joe DiMaggio with a 61-game hitting streak in 1933 for the San Francisco Seals of the PCL.

Considering the thousands of players in the history of professional baseball, for Joe to have 2 of the 3 longest hitting streaks speaks to the greatness of Joe D.

So what feat in baseball do I think about when the Grand Slam in golf is discussed? The words grand slam in baseball refer to hitting a home run with the bases loaded. So the words home run come to mind. Adding the terms batting average and RBI to home run, we are now talking about the Batting Triple Crown in baseball. A batter achieves the Batting Triple Crown when he leads either league in the three statistical categories of batting average (BA), home runs (HR), and runs batted in (RBI) for the same season. These three categories represent a batter’s hitting skill, hitting for power, and creating runs for his team. Most recently, in 2012, Miguel Cabrera earned the Batting Triple Crown, replacing Carl Yastrzemski (1967) as the last player to achieve this. Yastrzemski in 1967 actually tied with Harmon Killebrew for the league lead with 44 home runs. The Career Batting Triple Crown is accomplished when a player wins or ties for the three titles of BA, HR, and RBI but not in the same season.

Since the American League joined the National League in 1901, the list of Batting Triple Crown winners includes the following 12 players: Nap Lajoie (1901), Ty Cobb (1909), Rogers Hornsby (1922, 1925), Jimmie Foxx (1933), Chuck Klein (1933), Lou Gehrig (1934), Joe Medwick (1937), Ted Williams (1942, 1947), Mickey Mantle (1956), Frank Robinson (1966), Carl Yastrzemski (1967), and Miguel Cabrera (2012). Every player on this list except Cabrera (who is not eligible) has been elected to the Baseball Hall of Fame. Unlike the Grand Slam in golf, two players can win the Triple Crown in the same season. The 1933 season actually had two winners, one in each league.

In the years to come will we have our first Grand Slam winner in golf or our next Batting Triple Crown winner? Since we have never had a Grand Slam winner in golf one might vote for the Baseball Triple Crown occurring first. But there are also good arguments for the Grand Slam in golf occurring first. In golf starting with the year 1934 and ending with 2014 there could have been a maximum of 81 Grand Slam winners; whereas, in baseball from 1901 to 2014 there could have been a maximum of 114*2 =228 possible winners. This makes the 14 Batting Triple Crowns to the 0 Grand Slams less impressive. Many baseball writers believe the Batting Triple Crown is much more difficult to win today because today’s batters choose to specialize in batting average or hitting with power. The gap of 45 years between 1967 and 2012 demonstrates this. Further, it is more difficult today in baseball since each league has 15 teams instead of 8 teams. What do you think?
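The opportunity counts in this comparison can be verified with a short Python sketch. It follows the text's simplification that every calendar year offers one chance at a golf Grand Slam and one Triple Crown chance per league. The variable names are illustrative.

```python
# One possible Grand Slam per year, 1934 through 2014 inclusive
golf_chances = len(range(1934, 2014 + 1))

# One possible Triple Crown per league per year, 1901 through 2014 inclusive
leagues = 2
baseball_chances = leagues * len(range(1901, 2014 + 1))

# 12 players, two of whom (Hornsby and Williams) won twice
triple_crowns = 14

print(f"Golf opportunities:     {golf_chances}")
print(f"Baseball opportunities: {baseball_chances}")
print(f"Triple Crown rate:      {triple_crowns / baseball_chances:.1%}")
```

Measured against opportunities, a Triple Crown has occurred in roughly 6% of league-seasons, so it is rare but far from unprecedented.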

What is a streak? It is any consecutive number of successes by a single team or player. Success for a team can mean winning a game, hitting a home run in a game, getting double-digit hits in a game, etc. Success for a player can mean getting at least one hit in a game, hitting at least one home run in a game, getting on-base at least once in a game, etc. A streak can fall within one season or extend across multiple seasons. In what follows we will only consider streaks within one season. By any definition, a streak signifies dominance by a team or player because long streaks do not happen by accident. Yes, usually there is at least one game when luck was necessary for the streak to continue. Many say Kentucky’s victory over Notre Dame this year had a luck component. In what follows, getting on-base means that a plate appearance ends with the batter reaching base by a hit, a walk, or a hit-by-pitch. A plate appearance is any completed turn at the plate, whatever the result. Here are some notable baseball streaks by a player.

- Most consecutive games without striking out (115): Joe Sewell, 1929
- Most consecutive plate appearances with a hit (12): Walt Dropo, 1952
- Most consecutive games with at least two hits (13): Rogers Hornsby, 1923
- Most consecutive games with at least three hits (6): George Brett, 1976
- Most consecutive games with at least one home run (8): Dale Long (1956), Don Mattingly (1987), Ken Griffey Jr. (1993)
- Most consecutive games with at least one base-on-balls (22): Roy Cullenbine, 1947
- Most consecutive games scoring at least one run (18): Red Rolfe (1939), Kenny Lofton (2000)
- Most consecutive games with at least one triple (5): John Wilson, 1912
- Most consecutive games with at least one RBI (17): Ray Grimes, 1922
- Most consecutive plate appearances getting on-base (16): Ted Williams, 1957

The mathematics needed to develop a model for predicting which streak would be the hardest to duplicate, and which player had the highest probability of duplicating it, has always fascinated me. This led me to develop a formula that uses any player’s batting statistics for a given year to assign him a probability of duplicating a particular streak. The formula I developed appears in Ch. 4 of the book *Mathematics and Sports*, published by the Mathematical Association of America. Also, Ch. 16 (Streaking) in my book *Sandlot Stats: Learning Statistics with Baseball* develops this formula and uses it to compare various batting streaks.

Two of the most celebrated players in baseball, Joe DiMaggio and Ted Williams, own two of the most notable consecutive game streaks in baseball. Ted Williams, known for his batting eye, owns the streak of most consecutive games getting on-base (84 in 1949). DiMaggio’s streak of most consecutive games with a hit (56 in 1941) immortalized the number 56.

Which of these two streaks would be harder to duplicate? Applying my formula to both, Williams had a probability of .09444 (1 in 11 chance) of achieving his 84-game streak in 1949, while DiMaggio had a probability of .00010 (1 in 10,000 chance) of achieving his 56-game streak in 1941. As further evidence that the 56-game streak was the tougher to duplicate, DiMaggio’s probability of achieving the 84-game streak in 1941 was .00565 (1 in 177 chance) and in 1949 the probability of Williams achieving the 56-game streak was .000001 (1 in 100,000 chance). The second longest hitting streak is 44 games (Pete Rose 1978).

I wanted to compare the ESPN top ten players to my list of top ten players, as they appeared in Chapter 18 of my book Sandlot Stats. ESPN’s top ten list was based on the subjective opinion of their chosen committee of experts. However, they were encouraged to use advanced metrics. My top 10 was based on nine quantitative statistics: AVG (batting average), OBP (on-base pct.), SLG (slugging pct.), OPS (on-base plus slugging), BRA (OBP*SLG), HRA (home run average), H (number of hits), HR (number of home runs), and Runs Created for their team, [(H+BB)*TB]/[AB+BB]. Also, credit was given for winning a Triple Crown, a Career Triple Crown, and ranking in the top 10 in either Bill James’ Black-Ink or Gray-Ink Test. In Chapter 18, you can read about my 26 finalists and their total points. Like the ESPN list, I only looked at what the players did between the lines. Since my list only considered positional players, I only chose ESPN’s top ten positional players. There was one other difference. The ESPN list considered hitting, fielding, and base-running whereas my list only considered hitting. Therefore, Rickey Henderson, who finished number 11 on the ESPN list, did not make my list of 26 finalists. My list was based on player accomplishments before 2009 (when Chapter 18 was written).

Notice how similar the two lists are. This shows how important hitting is in the evaluation of positional players. Of course, both lists have “The Babe” as number 1. I can understand the difference in rank 2 between the two lists. Willie Mays was a five-tool player in the important position of center field; whereas Ted Williams was an adequate left fielder. In fact, the Yankees turned down a proposed trade of Ted Williams for Joe DiMaggio because they considered the center field position much more valuable than the left field position. Taking into account fielding and running, I can see why ESPN put Mays in front of Williams.

Except for the order of the 21 players on the two lists, the only five players not on both lists are Nap Lajoie, Honus Wagner, Rogers Hornsby, Mickey Mantle, and Albert Pujols. Wagner and Mantle are on the ESPN top 10 list but not on my list. However, Wagner ranks 12th and Mantle 13th on my list of 26 players. The actual difference between the 8th ranked Lajoie and the 13th ranked Mantle is a total of five points in my scoring system. Excluding pitchers, Hornsby ranked 12th, Pujols ranked 15th, and Lajoie ranked 34th on the ESPN list. My major beef with the ESPN list is the extreme difference in rank between Lajoie (rank 34) and Wagner (rank 9). Both players played in the same era (1896-1917) and both were infielders. Lajoie’s career AVG was .338 compared to Wagner’s .327. I gave the edge to Lajoie because in 1901 he was a Triple Crown winner. I guess ESPN liked the fact that Wagner played shortstop while Lajoie played second base. I could have easily called it a tie between the two players. As I mentioned before, my all-time favorite player was Mantle. In my opinion, if it wasn’t for his reckless lifestyle and unfortunate knee injury, he would have been in the top 5 on both lists.

The table below compares the batting, pitching and fielding statistics for the 2013 Yankees to two other Yankee teams. The Yankees of 2009 had a record of 103-59 and won the World Series. The 1990 Yankees had a record of 67-95, the worst Yankee record since 1950. With 87 games still left in 2013, can we Yankee fans even dream of making the playoffs, much less winning a World Series? The fact that this year there are two wild card selections will help. The statistics RS/G and RA/G are the runs scored per game and the runs allowed per game. The WHIP is the number of walks plus hits per inning pitched.

In summarizing the table, it is fair to say that the 2013 Yankee batting statistics are very similar to the 1990 Yankee batting statistics (not good). The pitching and fielding statistics for the 2013 team are better than those of both the 1990 and 2009 teams. What makes the batting statistics even worse is that since the second week in May the offense has really dropped off. Even if Jeter and Rodriguez can return, without a spring training to get back into shape, I expect very little production from these aged veterans. With the trade deadline approaching, GM Cashman has big decisions to make. If he believes his All-Star players cannot return, he must seek out skilled replacement players. However, this becomes difficult since management wants to get below the salary cap next year. This means no new long-term contracts. Right now 5 games separate the five AL East teams in the standings. Clearly, the pitching and fielding have been responsible for the Yankees’ current 2013 record. The pitching staff is solid and Michael Pineda is expected to return after the All-Star game. Gardner and Cano are the only two hitters that other teams fear. Left-handed pitchers dominate the Yankee batters because of a lack of any reliable right-handed batters. In the last few games the first six batters were all left-handed. Unless the veteran positional players drink from the “Fountain of Youth” and the youthful Yankee players overachieve, sadly I must predict that the 2013 Yankees will not play in the post-season.

What does Miguel have going for him? He is protected in the batting order by Prince Fielder, so he should see good pitches to hit. He is a power hitter, and power hitters tend to have a higher IPBA. In my book, I list the 56 players between 2000 and 2007 with IPBA > .400 for an entire season. Of those, 71% were power hitters with at least 30 home runs. What is going against Miguel batting .400? Being slow-footed, he will not be able to get infield hits. Being a power hitter, he tends to strike out more. The term regressing toward the mean definitely applies in baseball. It says that as a player accumulates at-bats, his observed average moves toward his true average. This applies to his BA, SOA, and IPBA. His average SOA from age 27 to today was .154 and his average IPBA was .401. Since his observed SOA of .154 and observed IPBA of .401 are based on close to 2000 at-bats in his prime years, we can assume that his true SOA is close to .154 and his true IPBA is close to .401. Therefore, it would be unrealistic to expect his SOA to stay near its current .103; it should drift back toward .154. Likewise, his current IPBA should dip toward .401 as this year’s at-bats pile up. Therefore, Miguel Cabrera will not join the elite .400 club.
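To see why these two numbers rule out .400, here is a minimal Python sketch. It assumes (the definitions are not given in this post) that SOA is strikeouts per at-bat and IPBA is hits per at-bat that did not end in a strikeout. Under those assumed definitions, BA = (1 - SOA) * IPBA holds as an identity. The function names are illustrative.

```python
def batting_average(soa: float, ipba: float) -> float:
    """BA implied by SOA and IPBA, assuming SOA = SO/AB and
    IPBA = H/(AB - SO), so that BA = (1 - SOA) * IPBA."""
    return (1.0 - soa) * ipba


def ipba_needed(target_ba: float, soa: float) -> float:
    """IPBA required to reach target_ba at a given SOA (same assumptions)."""
    return target_ba / (1.0 - soa)


# Cabrera's prime-years figures quoted in the text
print(f"BA at SOA .154, IPBA .401:      {batting_average(0.154, 0.401):.3f}")

# IPBA he would need to hit .400 at his prime-years and current SOA
print(f"IPBA needed for .400, SOA .154: {ipba_needed(0.400, 0.154):.3f}")
print(f"IPBA needed for .400, SOA .103: {ipba_needed(0.400, 0.103):.3f}")
```

Under these assumptions, the prime-years figures imply an average well short of .400, and even at his current low strikeout rate he would need an IPBA far above his established .401.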

What about Miguel repeating his Triple Crown achievement? After 45 games last year, Miguel had a BA of .306 with 8 HR and 35 RBI. After 45 games this year, he has 14 HR and 55 RBI to go along with his BA of .391. His improvement in each category is amazing. After 45 games, he is in second place in home runs, 1 behind Chris Davis, and leads second-place Davis by 11 RBI. In BA, he is 41 points ahead of second-place James Loney. Leading the league in RBI will be no problem for Cabrera. Currently, the Detroit Tigers lead the league in both BA and OBP. This, along with the fact that Miguel bats third in the Detroit line-up behind two .300 hitters, assures him of batting often with men in scoring position. In fact, in his last 23 at-bats with two outs and men in scoring position he has 20 hits. No wonder he is an RBI machine. With such a large lead, winning the batting title will also be no problem. This leaves one question mark. Can he lead the league in home runs? Last year he won the HR title by just 1 HR and there were 3 players within 3 HR. Currently, he trails Chris Davis by 1 HR and is 2 ahead of Cano and Encarnacion. Right now he is on pace to hit 48 HR, 4 more than in 2012. My feeling is the home run race will go down to the wire. Encarnacion, who hit 42 in 2012, and C. Davis (breakout year) are his biggest threats.

Not only do I believe Miguel Cabrera can repeat his Triple Crown, but I believe he will. This will mark the first time in baseball history that a player has won back-to-back Triple Crowns. News Flash: Cabrera is now on pace for 198 RBI which would break Hack Wilson’s record of 191.
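An "on pace" figure like the 198 RBI above is a straight-line projection of the current rate over a full season. The Python sketch below assumes a 162-game season and that the 55 RBI came in 45 games, as quoted earlier in the post. The function name `on_pace` is illustrative.

```python
def on_pace(count: int, games_played: int, season_games: int = 162) -> float:
    """Project a counting stat over a full season at the current per-game rate."""
    return count * season_games / games_played


# 55 RBI through 45 games, as quoted in the text
print(f"RBI pace: {on_pace(55, 45):.0f}")
```

A projection like this assumes the current rate holds all season, which the regression-toward-the-mean argument made earlier suggests is optimistic.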

