Why Doesn't Traditional Evaluation Work?