Turning Info into Insight at NU



Northwestern’s Fall 2012 magazine features various researchers working in the area of Big Data.  The need to analyze Big Data is a reason that NU started the Master of Science in Analytics program:

“I’m getting calls from firms that see the value in big data, but they don’t know how to extract it,” says analytics expert Diego Klabjan, professor of industrial engineering and management sciences. “It’s definitely a very, very hot area. Everyone’s looking for expertise. We’ve had tremendous interest from companies. These days every company needs analytics. They need to hire a workforce that is capable of analyzing data.”

To that end, McCormick recently developed a master of science program in analytics. The inaugural class of the 15-month program is learning data warehousing techniques, the science behind analytics, and the business aspects of analytics. Directed by Klabjan, the program has its own computing cluster to take on big-data problems, and students will each do a summer internship. They will learn to identify patterns and trends, interpret and gain insights from vast quantities of structured and unstructured data, and communicate their findings in business terms.


It is easy to find articles that mention analytics.  It is harder to find one that gives concrete examples.  A recent article on Fab.com (a fast-rising $140 million design retailer) does a nice job, in one small paragraph, of mentioning what they are doing with analytics:

WSJ: How are you using data to figure out what customers want?

Mr. Goldberg: We do a ton of customer segmentation, a ton of cohort analysis. We’re starting to get smart about putting certain products in front of people based on what they’ve looked at in the past or bought in the past. We’re already one of the leaders in utilizing data for e-commerce when it comes to understanding a wide range of products and how to merchandise them. You’ll see much more personalization and customization come out next year.

(For a good definition of cohort analysis, see here or here)
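To make the idea concrete, here is a minimal sketch of cohort analysis in Python. The order records and field names are made up for illustration; the point is just the mechanic of grouping customers by when they first signed up and tracking each group's behavior over time.

```python
from collections import defaultdict

# Hypothetical order records: (customer_id, signup_month, order_month).
orders = [
    ("a", "2012-01", "2012-01"), ("a", "2012-01", "2012-03"),
    ("b", "2012-01", "2012-01"),
    ("c", "2012-02", "2012-02"), ("c", "2012-02", "2012-03"),
]

cohort = defaultdict(set)   # signup month -> all customers in that cohort
repeat = defaultdict(set)   # signup month -> customers who ordered again later

for cust, signup, order_month in orders:
    cohort[signup].add(cust)
    if order_month > signup:
        repeat[signup].add(cust)

for month in sorted(cohort):
    rate = len(repeat[month]) / len(cohort[month])
    print(f"{month} cohort: {len(cohort[month])} customers, {rate:.0%} repeat")
```

Comparing repeat rates across cohorts is what lets a retailer see whether newer customers behave differently from older ones.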

For a long time, dynamic pricing was most visibly done with airline tickets, gas, and maybe hotels (although I’m not sure consumers realized this as much).

Recently, there have been several articles on this topic.

The Chicago Tribune discussed the Goodman Theater using revenue management techniques to price tickets based on demand (very similar to the airlines).   The Operations Room did a nice commentary on this.

The print version of the Dec 2012 National Geographic magazine had a short article on San Francisco’s pilot program to dynamically change street parking prices based on demand. (You don’t really expect to see an article on dynamic pricing and revenue management in National Geographic!)

The Wall Street Journal ran a front-page story on how on-line retailers were changing prices in real time to match competitors.  Since this is happening in real time, the retailers need good systems for keeping track of competitors’ prices and good algorithms for figuring out what to do.

Since this market moves a lot faster than the airline market, I’m guessing that the algorithms have to have a bit more game theory in their strategies than the airlines (what will competitor X do in real time if I change this price?).
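As a toy illustration of the kind of rule such a system might apply, here is a hypothetical repricer: slightly undercut the cheapest competitor, but never drop below a margin floor. The 10% floor and the one-cent undercut are illustrative assumptions, not any retailer's actual policy.

```python
# Hypothetical real-time repricing rule (illustrative numbers only).
def reprice(our_cost, current_price, competitor_prices, min_margin=0.10):
    floor = our_cost * (1 + min_margin)       # never sell below cost + margin
    target = min(competitor_prices) - 0.01    # undercut the cheapest rival
    return round(max(floor, min(current_price, target)), 2)

print(reprice(10.00, 14.99, [13.49, 12.99]))  # 12.98: undercut the rival
print(reprice(10.00, 14.99, [10.50, 10.25]))  # 11.0: the margin floor binds
```

A game-theoretic version would also model how competitors respond to the new price, rather than treating their prices as fixed inputs.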

To combat the price war, retailers are trying to offer unique items.  Of course, this means that the algorithms will have to figure out when two products are close enough to be substitutes.
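One crude way an algorithm might flag two products as close substitutes is the overlap of their attribute tags (Jaccard similarity). The products, tags, and any threshold you would pick are made-up assumptions; real systems would combine this with price, category, and behavioral signals.

```python
# Jaccard similarity: |intersection| / |union| of two attribute sets.
def jaccard(a, b):
    return len(a & b) / len(a | b)

lamp_a = {"lamp", "desk", "LED", "black", "adjustable"}
lamp_b = {"lamp", "desk", "LED", "white", "adjustable"}
rug    = {"rug", "wool", "4x6", "black"}

print(jaccard(lamp_a, lamp_b))  # 0.666..., plausible substitutes
print(jaccard(lamp_a, rug))     # 0.125, clearly not
```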

I’m guessing that we’ll only see more articles like this as dynamic pricing moves to more product categories.

I found OpenSolver, a free Excel add-in, through Mike Trick’s OR Blog.  It took me about 5 minutes to download it, install it, and get it working with one of my existing spreadsheets that I use in class.

I had used Excel’s built-in solver only because all my students had it installed and it was easy to learn.  The limitations, though, always kept the model size small and I never fully trusted it with integer programs.

One nice feature of OpenSolver is that it shows the model.  OpenSolver automatically added the color you see below to the model.  You can see that cell G208 is red, showing that I’m minimizing this cell.  Column F and the Y i-j matrix below it are colored pink since these are decision variables.  In cell C213 you can see that these decision variables are binary.  And F209:F210 show an example of a constraint; you can see that it is a less-than constraint.

One of the reasons I use Excel is that the students are used to it.  And, it is logical for them to set up problems in Excel.  When OpenSolver adds this extra information to the spreadsheet, it is even better.  The students can see how their model is working and also look for bugs or problems.

I’ve only done a few large scale tests.  In one, I had 200 warehouses serving 200 customers and wanted the best 3 warehouses.  OpenSolver solved this fine, but took about 5-10 minutes to initially build the model.  By contrast, CPLEX Studio 12.4 read data from the same spreadsheet and solved the model in just a few seconds.
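For readers curious what that warehouse-selection model looks like, here is a toy version with made-up costs: choose p warehouses so that the total cost of serving each customer from its cheapest open warehouse is minimized. A handful of sites can be brute-forced as below; the real 200-by-200 instance is why you need an integer-programming solver like OpenSolver or CPLEX.

```python
from itertools import combinations

# cost[w][c] = cost of serving customer c from warehouse w (made-up data).
cost = {
    "W1": [4, 9, 7, 3],
    "W2": [8, 2, 6, 7],
    "W3": [5, 5, 3, 9],
    "W4": [9, 8, 2, 4],
}
p = 2             # number of warehouses to open
n_customers = 4

def total_cost(open_sites):
    # Each customer is served by the cheapest open warehouse.
    return sum(min(cost[w][c] for w in open_sites) for c in range(n_customers))

best = min(combinations(cost, p), key=total_cost)
print(best, total_cost(best))
```

The binary "open this warehouse?" choices are exactly the 0/1 decision variables the OpenSolver screenshot highlights in pink.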

BusinessWeek just published an article on Big Data for the Dairy Industry.

I didn’t naturally think about big data being applied to the dairy industry.  But, this isn’t the first place I’ve seen this industry referenced.  Cows are expensive and produce a lot of milk.  Keeping the herd healthy and productive is important.  With new sensors, new tests, and the ability for dairy farmers to upload data for analysis, the industry is a good target for big data efforts.

I often see articles in the business press that equate “analytics” with “big data.”  That is, these articles imply that the field of analytics is only about working with big data sets.

But, analytics is about much more than this.

Michael Schrage recently wrote that he asked executives what they would do with 100 times more data about their customers.  He said that none of them could say what they would do with that data, and one CEO suggested that costs might actually increase since they couldn’t deal with it.  The article points out that acquiring and analyzing big data is not the issue:

Instead of asking, “How can we get far more value from far more data?” successful big data overseers seek to answer, “What value matters most, and what marriage of data and algorithms gets us there?”

If you extend his key question, you can see that you may not even need big data to get value.  Analytics helps you determine what data you need to collect, how you need to analyze that data, and what actions you need to take as a result.

Depending on what you are trying to achieve, you may not even need a big data set.  We see many companies that already have the data they need to help improve their business.  They just need to use more advanced techniques, like optimization, to get value from that data and take action.

A more extensive definition of analytics breaks the field into three types:  Descriptive Analytics, Predictive Analytics, and Prescriptive Analytics.  Different types of analytics are needed for different types of problems.
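A made-up demand series can illustrate the distinction: descriptive analytics summarizes what happened, predictive analytics forecasts what comes next, and prescriptive analytics recommends an action. The numbers, the naive forecast, and the newsvendor-style stocking rule below are all illustrative assumptions.

```python
import math
import statistics
from fractions import Fraction

demand = [102, 98, 110, 95, 105, 100]   # past weekly demand (hypothetical)

# Descriptive: what happened?
mean = statistics.mean(demand)          # ~101.7 units/week
sd = statistics.stdev(demand)

# Predictive: what will happen? (naive forecast: the historical mean)
forecast = mean

# Prescriptive: what should we do? Newsvendor-style rule: stock at the
# critical-ratio quantile of historical demand.
underage, overage = 5, 1                # cost of a lost sale vs. a leftover unit
critical_ratio = Fraction(underage, underage + overage)   # 5/6
k = math.ceil(critical_ratio * len(demand))  # which order statistic to stock at
stock = sorted(demand)[k - 1]                # 105 units

print(round(mean, 1), round(forecast, 1), stock)
```

Note that the prescriptive step is the one that actually tells you what to do; the first two only describe and project.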

On Oct 30, Northwestern’s Transportation Center is hosting a workshop on dealing with big data in the transportation industry:

Data-Driven Business: Challenges and Best Practices in the Transportation Industry

Tuesday, October 30, 2012 – 2:00-4:45 pm

Transportation Center, Chambers Hall, Lower Level
Northwestern University
600 Foster Street
Evanston, IL

Transportation companies are confronted with growing – some may say exploding – and diverse sources of data. This data may be mined from social media, obtained from customer surveys, collected from environmental sensors, and gleaned from geo-positioning radios, among others. Looking through the lens of the transportation industry, the Northwestern University Transportation Center’s fall Industry Workshop will examine challenges and best practices in data-driven business. In the workshop a keynote speaker and two panel discussions will address questions, such as:

  • Why is “data-driven” a different way of competing?
  • What does it take to unlock opportunity in data?
  • What are the organizational implications of being “data-driven”?

2:00 – 3:15 pm – Panel 1: Marketing and Operations: Social media, Location-based, System-health

3:15 – 3:30 pm – Networking Break

3:30 – 4:45 pm – Panel 2: Freight Management & Logistics

The final program will be posted on October 23.

An article from Forbes reviewing Steve Sashihara’s book, The Optimization Edge, helps make the case for more optimization (and gives the book a good recommendation):

One of his most interesting arguments is that a great deal of the effort spent on information gathering and analysis is wasted — or, at least, used sub-optimally — when it’s used to feed business intelligence systems that produce reports that ultimately wind up being fed into spreadsheets and PowerPoint slides. Managers then sit around in a conference room listening to presentations and debating what the data means and what decisions should be made about it — when, in many cases, good software could make the decision itself. The GPS in your car is optimizing when it says “turn left at Main Street” rather than presenting you with a list of possible routes.

If you stop your analytics efforts short of applying optimization, you may be missing out on a lot of value.

Dan Gilmore at Supply Chain Digest reported on a study that showed the importance of good inventory control:

“…permanently reducing your level of inventories relative to sales and sales growth can have a dramatic impact on a company’s share price.”

Inventory is a very visible way to measure a firm’s supply chain efficiency.  At the same time, inventory is a great way to buffer against variability in the supply chain.  So, if you want to permanently reduce inventory, you need to go after the underlying variability.
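The textbook safety-stock formula makes that link concrete: safety stock scales linearly with demand variability, so cutting the variability is what permanently cuts the inventory. The service level, variability, and lead time below are made-up numbers for illustration.

```python
import math

# Standard safety-stock formula: z * sigma_demand * sqrt(lead time).
def safety_stock(z, sigma_demand, lead_time_weeks):
    return z * sigma_demand * math.sqrt(lead_time_weeks)

z = 1.65  # z-score for roughly a 95% cycle service level
before = safety_stock(z, sigma_demand=40, lead_time_weeks=4)  # ~132 units
after = safety_stock(z, sigma_demand=20, lead_time_weeks=4)   # ~66 units
print(before, after)  # halving the demand variability halves the safety stock
```

Slashing inventory without reducing sigma just lowers the service level instead.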