We examine the effects of crowd characteristics on crowd value, measured as the improvement in the power to predict stock volatility when crowd-generated content is used. Leveraging a natural platform-wide event that changed the crowd composition of S&P 500 stock discussions, we find empirical evidence that content from a larger crowd is associated with higher crowd value. Moreover, the magnitude of the effect of crowd size on crowd value decreases as the diversity of the crowd’s background and the independence of the crowd’s opinions increase. Additionally, we find that crowd values derived using different machine learning algorithms exhibit different sensitivities to these crowd characteristics. Algorithms that can handle interrelated (i.e., non-independent) observations and potential nonlinear relationships among crowd-generated content perform more robustly than algorithms that cannot. We discuss the mechanisms that drive these findings and highlight the implications of crowd diversity and crowd independence for model performance when analyzing crowd-generated content.