File Analysis: Use Analytics to Play Your Files Like Moneyball

"Does he get on base?"

"Moneyball" is a movie I can't stop watching. It's one of those where I sit down to watch just one scene and then end up watching the whole thing. It's really good. A lot of people in 2011 agreed, and the concept of data analytics became common parlance. While most people stop thinking about advanced analytics when the conversation turns away from sports though, it quite simply doesn't end there. Your file analysis strategy could and should do the same.

"Moneyball" follows Oakland A's general manager Billy Beane as he seeks to create a contender with an incredibly limited budget. After being eliminated by the New York Yankees (the richest team in baseball), and facing an aging core of stars, Beane determines that Oakland will have to zig while the rest of the league zags. When Assistant GM Peter Brand (a composite of Oakland assistants, including data pioneer Paul DePodesta) suggests a sabermetric approach to scouting, Beane realizes that this is the zig he's been seeking. After applying advanced analytics focused on player efficiency and lowering cost, Oakland rattles off an MLB record-breaking 20 wins in a row with a shoestring budget.

The film ends on a rather melancholy note, though one true to real-life events. Beane is offered the highest salary for a GM in baseball by the Boston Red Sox but turns it down in an attempt to win a World Series with his beloved Oakland A's. As fate would have it, the Red Sox employed his model anyway and went on to win the 2004 World Series. Oakland, meanwhile, has not won a World Series since 1989.

Putting Analytics Back in File Analysis

As the film demonstrates, applying data beyond the eye test is absolutely necessary. With growing calls for privacy and better data management, every file needs to be analyzed well beyond simply saying "well, I know where it is." It's a cliche, but it's a cliche for a reason: you need to work smarter, not harder.

As data privacy legislation expands (and more is almost certainly on the way), businesses need to be more accountable than ever where their files and data are concerned. You need to know the metadata, key phrases, content, and many, many other aspects of every single file your business creates. If you don't, you could be facing massive fines at the least, and possible legal action or business collapse at the worst. Consequently, your business must have some kind of file analysis solution. No human being could process that much data that quickly, so a machine that can learn is the only real answer.

This is where file analysis moneyball comes into play. Your solution needs to be robust enough to analyze all of these aspects, and in turn optimize the way your business deals with data. Which phrases are most likely to generate compliance issues? Which employees are most likely to practice good data security? Which tasks should be cross-functional? Where is your business creating a bottleneck? Who is the optimal customer to attract?

Data can do all these things and more, and when each and every one of your files is properly analyzed, you create a more optimized company. Yes, data is king now, but if you aren't applying that data, you might as well be doing everything by hand.
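
To make that a little more concrete, here is a minimal Python sketch of the most basic version of the idea: walk a folder, record simple metadata for each file, and count how often phrases from a watch list appear. Everything in it is an assumption made for illustration, including the phrase list and the function names analyze_file and analyze_tree. A real file analysis product would go much further, with content classification and machine learning rather than plain substring counts.

```python
import os
from collections import Counter
from datetime import datetime, timezone

# Hypothetical watch list; a real deployment would define its own phrases.
FLAGGED_PHRASES = ("social security number", "date of birth", "credit card")


def analyze_file(path, phrases=FLAGGED_PHRASES):
    """Return basic metadata and flagged-phrase counts for one text file."""
    info = os.stat(path)
    # Read as text and lowercase it so matching is case-insensitive.
    with open(path, encoding="utf-8", errors="ignore") as handle:
        text = handle.read().lower()
    # Keep only the phrases that actually occur in this file.
    hits = Counter({p: text.count(p) for p in phrases if p in text})
    return {
        "path": path,
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "phrase_hits": hits,
    }


def analyze_tree(root, phrases=FLAGGED_PHRASES):
    """Analyze every file under root, most flagged-phrase hits first."""
    reports = [
        analyze_file(os.path.join(folder, name), phrases)
        for folder, _dirs, names in os.walk(root)
        for name in names
    ]
    return sorted(
        reports,
        key=lambda report: sum(report["phrase_hits"].values()),
        reverse=True,
    )
```

Calling analyze_tree on a shared drive folder would return one report per file, with the files that mention the most flagged phrases at the top. That ranked list is the file analysis equivalent of on-base percentage: a simple number that tells you where to look first.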


The best part of sports is that its trends tend to mirror what's going on in the world at large. When sports broke the color barrier, segregation began to crumble. When sports bring about a ceasefire, global conflict gets put on hold. Likewise, with the rise of big data in sports, we're seeing its application become necessary in everyday life. Take that data, apply it, and see what success it can bring to your team.

Tucker Partridge is a graduate of the University of Arkansas, and a newly minted Bay Area resident. He is a professional marketing associate, a semi-professional comedian, and an amateur trivia enthusiast.