Tuesday, July 13, 2010

What qualifies as a Simulation Model?

A theme that has been running through my career since my Master's project has been the question of measuring complexity in modelling and simulation. When can one claim to have built a simulation model, and when is one merely glorifying simple analysis?

In the Operations Research ecosystem the tendency is certainly to inflate. Salesmen, curriculum vitae authors, recruiters and consultancies across the spectrum are all motivated to embellish the work that they do and work that is done. Like any scientific individual I seek to slice through the static, inform myself as to who is doing extraordinary work, and to build myself a framework from which I can safely criticize the inflations of others.

I have been working on a set of rules for separating "models" into models, calculations and simulations. I feel like there is a gaping opportunity here for contribution from complexity, chaos, and other disciplines in Computer Science and Mathematics, but here's what I've put together thus far:

Simulations are models, but not all models are simulations. Calculations are not models.

Models
  1. A model is a simplified representation of a system.
  2. All models are wrong, but some models are useful.
Calculations
  1. The result of a calculation can be expressed in a single equation using relatively basic mathematical notation.
  2. Where calculations contain a time element, values at different times can be determined in any order without referring to previous values.
Simulations
  1. A simulation is a calculation in which one parameter is the simulation clock that increments regularly or irregularly.
  2. The outcome of a simulation could not have been determined without the use of the clock.
  3. While an initial state is typically defined, an intermediate state at a given time should be difficult or impossible to determine without having run the simulation to that point.
  4. Almost any model that involves repeated samples of random numbers should be classified as a simulation.
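The difference between the calculation rules and the simulation rules can be sketched in a few lines of Python. This is only an illustration under assumed numbers: the function names, the 5% growth rate, and the stock and demand figures are all hypothetical and not drawn from any real project.

```python
import random

def savings_in_year(year, base_savings=1000.0, growth=0.05):
    """Calculation: one equation, so any year can be evaluated in any order."""
    return base_savings * (1.0 + growth) ** year

def simulate_stock(periods, start_stock=100, reorder_point=40,
                   order_size=80, seed=0):
    """Simulation: the state at each clock tick depends on the previous tick."""
    rng = random.Random(seed)
    stock = start_stock
    history = []
    for _clock in range(periods):        # simulation clock, regular increments
        demand = rng.randint(10, 50)     # repeated random sampling (assumed range)
        stock = max(stock - demand, 0)   # unmet demand is simply lost
        if stock <= reorder_point:       # replenish when stock runs low
            stock += order_size
        history.append(stock)
    return history
```

`savings_in_year(7)` can be asked for without ever computing years 0 to 6, whereas the seventh entry of `simulate_stock(10)` only exists because the six before it were stepped through.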
Consider the following progression of "models" that output an expected total savings:
  1. Inputs: Expected total savings.
  2. Inputs: Annual savings by year, time-frame of analysis.
  3. Inputs: Annual savings per truck per year, number of trucks by year, time-frame of analysis.
  4. Inputs: Annual savings per truck per year, current number of customers, number of trucks per customer, annual increase in customers, time-frame of analysis.
  5. Inputs: Annual savings per truck per year, current number of customers by geographical location, annual increase in customers by geographical location, routing algorithm to determine necessary trucks, time-frame of analysis.
  6. Inputs: Annual savings per truck per year, current number of customers by geographical location, distribution of possible growth in customers by geographical location, routing algorithm to determine necessary trucks, time-frame of analysis.
As you can see, complexity builds and eventually passes a threshold where we would accept it as a model. "Model" 4 is still little more than a back-of-the-envelope calculation, but Model 5 takes a quantum leap in complexity with the introduction of the algorithm. I would still not classify Model 5 as a simulation, however, because any year could be calculated without having calculated the others. Finally, Model 6 introduces a stochastic variable (randomness) that compounds from one year to another and brings us to a proper simulation.
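A rough Python sketch of the two ends of that progression may help. Everything specific here is an assumption made for illustration: the annual increase in "Model" 4 is treated as a fixed number of customers added each year, Model 6 is reduced to a single location, the routing algorithm is replaced by a crude `customers_per_truck` ratio, and the growth distribution is an arbitrary uniform range.

```python
import math
import random

def model4_total(savings_per_truck, customers, trucks_per_customer,
                 annual_increase, years):
    """Closed form: no year has to be computed before any other."""
    # Customers in year t (t = 0 .. years-1) are customers + annual_increase * t.
    customer_years = years * customers + annual_increase * years * (years - 1) / 2
    return savings_per_truck * trucks_per_customer * customer_years

def model6_total(savings_per_truck, customers, customers_per_truck,
                 years, seed=None):
    """One simulated run: random growth compounds, so each year needs the last."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(years):
        # Hypothetical stand-in for a routing algorithm.
        trucks = math.ceil(customers / customers_per_truck)
        total += savings_per_truck * trucks
        growth = rng.uniform(-0.05, 0.15)   # assumed growth distribution
        customers = max(0, round(customers * (1 + growth)))
    return total

def model6_expected(runs=1000, **kwargs):
    """Average many runs to estimate the expected total savings."""
    return sum(model6_total(seed=i, **kwargs) for i in range(runs)) / runs
```

The first function is a single expression in its inputs. The second has no such shortcut: the customer count in year five depends on the random draws of years one to four, and the expected total has to be estimated by repeating the run.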

I've seen calculations masquerading as simulation models at a Fortune 500 company, both internally and externally. The result is the same either way: outcomes determined from data, whose validity is asserted by the author. Still, I know that Operational Research practitioners reading this will appreciate my desire to classify. At the very least it will help us separate what the MBAs do with spreadsheets from our own work.

I welcome input from others on this topic, as I am only just developing my own theories.

2 comments:

Paul A. Rubin said...

In an inventory system that repeatedly reorders the same amount Q of a single item, at regular intervals, the annual cost is given by a weighted sum of 1, Q and 1/Q. The order quantity that minimizes that cost (charitably ignoring any integrality restrictions), known as the Economic Order Quantity, is the square root of a fairly simple expression involving the coefficients.

Does that constitute a model or a calculation in your taxonomy? To me, the first equation is a (very simple) model of the system's cost structure, and the formula for the EOQ is a closed-form solution to the model.
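Written out, the cost function and its minimizer look roughly as follows. The coefficient names a, b and c are hypothetical labels for the three weights, with b and c assumed positive.

```latex
C(Q) = a + bQ + \frac{c}{Q},
\qquad
\frac{dC}{dQ} = b - \frac{c}{Q^{2}} = 0
\;\Longrightarrow\;
Q^{*} = \sqrt{\frac{c}{b}}
```

In the usual textbook setting, b is half the per-unit annual holding cost h and c is the fixed order cost K times annual demand D, which gives the familiar form Q* = sqrt(2KD/h).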

Kundan Sen said...

What the MBAs do with OR is a good test of the value that OR can bring to the table. OR is taught at management schools and the MBAs know all about it. They are paid to do what they are doing and hence will only do what adds value.
An academic OR specialist may develop many theories and algorithms that have the potential to change the world for the better, but they will be valueless if not appreciated by the managers in business or government. The trick is to sell OR to the management. That is the only way OR activity will get funded. We need to keep in focus the fact that OR is an inter-disciplinary field and requires teamwork.

The EOQ is old hat and makes sweeping assumptions that may not be valid for most items in most factories anymore. Each important product or product group requires specific treatment. This may come from the Information System or from OR methods, but ultimately the cost of having that product has to be reduced (not necessarily minimized) without affecting appropriate availability. So, newer theories and models are required not only to handle J-I-T but other emerging requirements. The most critical element in the application of OR or scientific management is constant questioning of the present state of affairs. It also requires understanding of the human factor as much as the theories, an appreciation of the timing of change, and sound economic sense.