The Differing Pages of the 1999 MLB All-Century Team on Wikipedia

by Jacob Walling

 

The general topic I chose to analyze was the Wikipedia pages of baseball players, specifically the players selected to the Major League Baseball All-Century Team in 1999. To select the team, experts created a list of the 100 greatest baseball players of all time, and fans then voted for the twenty-five-man team: the top two vote getters at each position made the team, except in the outfield and at pitcher, where the top nine and top six were included. Finally, a panel selected five more players who they felt deserved to be on the team, completing the list of thirty. I chose this team instead of a list of the greatest players by some statistic because it was voted on by a large number of people, with every player who made the team receiving at least 140,000 votes, and because it has a mix of players across eras. Several players who were still active in 1999 were voted onto the team alongside others who played in the 1920s and 1930s. I ordered the pages alphabetically by last name because that makes the most sense to the reader: the vote tallies are not attached to the individual pages, only to the overall MLB All-Century Team page.
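The selection rule described above can be checked with a short sketch. This is only an illustrative model, not the actual voting system: the position names in FAN_SLOTS are my assumption (the text says only "each position"), and the function name and ballot layout are hypothetical.

```python
# Fan-vote slots per position, following the rule described in the text.
# Assumption: the five two-slot positions are catcher and the four infield spots.
FAN_SLOTS = {
    "catcher": 2,
    "first base": 2,
    "second base": 2,
    "third base": 2,
    "shortstop": 2,
    "outfield": 9,
    "pitcher": 6,
}
PANEL_PICKS = 5  # players added afterward by the panel


def select_by_fan_vote(ballot):
    """Return the top vote getters per position.

    ballot maps a position name to a dict of nominee -> vote count.
    """
    team = []
    for position, slots in FAN_SLOTS.items():
        nominees = ballot.get(position, {})
        # Rank nominees from most votes to fewest, then keep the allotted slots.
        ranked = sorted(nominees, key=nominees.get, reverse=True)
        team.extend(ranked[:slots])
    return team


fan_total = sum(FAN_SLOTS.values())    # players chosen by the fan vote
roster_size = fan_total + PANEL_PICKS  # full team after the panel picks
```

Under these assumptions the fan vote fills twenty-five spots and the panel picks bring the roster to thirty, which matches the numbers given in the text.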

My first observation concerns the first image used on each page. Every image except one is a photograph. Since these are all real people who lived when photography was possible, I would expect every image to be a photograph rather than a drawing, but Lefty Grove’s first image is a baseball card with a drawing of him on it.


A 1933 Goudey baseball card of Grove.

The fact that Lefty Grove is the only one with a drawing is interesting because the card used is not noteworthy for its value or rarity, and Grove is not the oldest player on the team. He retired in 1941, while Honus Wagner, another member of the team, retired in 1917. Wagner’s card is famously one of the most valuable in existence, yet the first image on his page is a photograph.


An example of the Honus Wagner card.

Honus Wagner's first image on his Wikipedia page.

My only assumption is that Grove was given a drawing instead of a photograph because he is not as well known as the other players. This assumption is based on two facts: his page is the shortest, at slightly over seven thousand words, whereas the next shortest is over ten thousand, and he received the fewest votes and was put on the team by the experts. The other interesting observation about the first images is that most deceased players are shown in uniform from their playing days, while most living players are shown in recent pictures wearing suits, since every player who made the team is now retired. However, this is not true of every page, and the exceptions are interesting. Ernie Banks is the only deceased player pictured out of uniform, but his picture is from his reception of the Presidential Medal of Freedom in 2013, so the image is both a modern one and one marking an important recognition.


Ernie Banks receiving his Presidential Medal of Freedom in 2013.

The living players whose images show them during their careers are Nolan Ryan, Sandy Koufax, Willie Mays, Brooks Robinson, and Ken Griffey Jr. My best guess as to why these five players do not have current pictures is that they were iconic players of their generations. Nolan Ryan was the first pitcher to register a 100-mile-per-hour pitch; he dominated the league in the seventies and eighties and shattered the career strikeout record. Sandy Koufax put together six of the best pitching seasons ever in the sixties and then retired at the peak of his career. Willie Mays dominated throughout his career from 1951 to 1973 and made twenty-four All-Star teams, among the most of all time. Brooks Robinson was a third baseman in the sixties and seventies known for spectacular defense, winning sixteen Gold Gloves, the most for a position player. Ken Griffey Jr. was a phenom in the 1990s and 2000s whose smooth and captivating swing produced 630 home runs, seventh all time. These images also follow a pattern, since three of the five show the player in the last few years of his career.


Ken Griffey Jr. from 2009.

The clusters are also distinctly separated, especially given that there are only thirty pages. The four clusters are pitchers, post-1940 position players, pre-1940 position players, and players with the last name Robinson, which includes only Brooks and Jackie Robinson. I tried to remove the Robinson cluster because it seemed unnecessary, but when I reduced the corpus to three clusters, the program kept the Robinson cluster and merged the pre-1940 cluster into the post-1940 position player cluster. I decided to keep four clusters because I found the pre-1940 cluster interesting. The players in this cluster are Ty Cobb, Lou Gehrig, Rogers Hornsby, Babe Ruth, and Honus Wagner. It is interesting that all the pitchers were kept together while the hitters were split, with one group forming its own cluster, which I determined to be players who retired before 1940. I settled on retirement year as the dividing line because Joe DiMaggio and Ted Williams were in the post-1940 cluster even though they began their careers in 1936 and 1939 respectively, while Lou Gehrig, who retired in 1939, was in the pre-1940 cluster. I suspect the pitchers did not split into two clusters because there are only nine of them compared to twenty-one batters (nineteen without the two Robinsons).


Lou Gehrig in 1923.

Another interesting observation concerns which pages have been translated into the most languages. My hypothesis was that players from the eighties and nineties would have their pages translated into more languages, especially given the introduction of baseball to the Olympics in 1992. However, the pages with the most languages belong to players popular from the thirties through the fifties, and the most recent player in the top ten is Hank Aaron, who retired in 1976. My guess for why this is true involves Major League Baseball’s famous tours of Japan. In 1934 an all-star team that included Babe Ruth and Lou Gehrig toured Japan, where the sport had been played since its introduction in the 1870s and where a professional league took shape soon after the tour. In 1953 Major League Baseball again brought teams to Japan in an effort to repair the two countries’ relationship after World War II. These tours continued and still happen today, but the tours in the fifties and sixties included players like Jackie Robinson, Joe DiMaggio, Hank Aaron, Yogi Berra, Mickey Mantle, and Ted Williams, all of whom are in the top ten for pages translated into the most languages. My guess is that players who went on these tours of Japan and other Asian countries had their pages translated into more languages, especially languages native to Asia. The only exceptions in the top ten are Ty Cobb and Cy Young, both of whom retired before 1930. These two further undercut my hypothesis: I would have expected the exceptions to be Ken Griffey Jr. or Roger Clemens, who played in the 1990s and 2000s, when it was easier to communicate and televise events globally. My guess for Cy Young is that, because he shares his name with the award given to the best pitcher each year, his page gets translated to explain the award. I have no idea, though, why Ty Cobb, a player with a reputation for racism, would have his page translated into more languages than the likes of Pete Rose, the all-time hits king.


A poster of Babe Ruth from the 1934 All-Star tour of Japan.

A final observation involves the pages of Roger Clemens and Mark McGwire and the rise of “steroid” as a top word over the history of those pages. Clemens and McGwire are both players linked to steroid use during their careers in the nineties, an issue that became a bigger deal in the early 2000s during investigations and congressional hearings on steroid use in baseball. Both players had their pages created in 2001, but neither page mentions steroids at first. The first mention comes in 2004 for McGwire, in connection with his admitted use of androstenedione, a steroid not previously prohibited by Major League Baseball, and in 2007 for Clemens, when he was named in the Mitchell Report, a famous investigation into steroid use in baseball. “Steroid” then steadily rose as a top word on both pages until it was the ninth top word on Clemens’s page and the third top word on McGwire’s page. The difference is that McGwire has admitted to using steroids, whereas Clemens has always denied it. Clemens’s denial is also why “McNamee” and “Congress” are top words on his page: McNamee is the trainer who allegedly injected Clemens, and Clemens had to testify before Congress multiple times because of his vehement denial of steroid use.
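To illustrate what tracking a top word across page versions involves, here is a minimal sketch. It is not the tool used for this analysis: the function names and the short stop-word list are hypothetical, there is no stemming (so "steroid" and "steroids" count separately), and the two snapshot strings are invented for illustration rather than taken from actual Wikipedia revisions.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "he", "his",
              "was", "for", "on", "by"}


def ranked_words(text):
    """Return the words of text ordered from most to least frequent."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common()]


def rank_of(word, text):
    """Return the 1-based rank of word in text, or None if it never appears."""
    ranked = ranked_words(text)
    return ranked.index(word) + 1 if word in ranked else None


# Invented stand-ins for an early and a late version of a page.
early_snapshot = "Clemens was a pitcher for the Red Sox"
late_snapshot = ("Clemens denied steroids use. The steroids report named "
                 "Clemens. Steroids hearings followed.")

early_rank = rank_of("steroids", early_snapshot)
late_rank = rank_of("steroids", late_snapshot)
```

Running the same ranking over each saved version of a page, in date order, would show when a word first appears and how its rank changes over time.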


Mark McGwire hitting a home run in 2001.

Roger Clemens in 2001.

Overall, this was a very interesting set of pages to analyze because it spans players from many eras, each with his own history and legacy in baseball.