The diurnal cycle in solar power generation and its dependence on accurate forecasts of cloudiness require a set of performance benchmarks different from those used to evaluate wind power forecasts. While traditional bulk statistical metrics such as bias and correlation with observations describe some important error attributes, there are arguably more meaningful metrics dictated by the constraints of the grid operator (e.g., availability and price of grid balancing reserves). We present results and examples of different risk (or error) tolerances imposed on day-ahead solar power forecasts at several U.S. locations and quantify the resulting improvement that an advanced forecast method delivers over unskilled forecasts, such as persistence and clear-sky.