Abstract

Using a comprehensive dataset of privacy policies, firm characteristics, consumer tracking, and cybersecurity incidents, we document several stylized facts about the heterogeneity of firms’ data extraction practices and the influence of privacy regulations. Firms do not adopt standardized boilerplate privacy policies; instead, we find substantial within-industry differences that are correlated with firms’ technical sophistication. Firms engaging in data extraction have lengthier policies, seeking to hedge legal risks. Firms with intermediate technical sophistication appear to follow a “collect and share” model, collecting large amounts of consumer data and sharing it with third parties for processing, thus creating cybersecurity risks. Conversely, high-sophistication firms appear to implement a “receive and process” model, consistent with a two-tier data market in which data flow from intermediate- to high-sophistication firms.