Lentis/"Data is the new oil"

From Wikibooks, open books for an open world

"Data is the new oil!" was the title of a talk given by data scientist Clive Humby in 2006 at an Association of National Advertisers conference. In his talk, Humby claimed that raw data, like crude oil, has to be refined before it has value. Since then, the phrase has entered mass circulation to illustrate many other parallels between data and oil. It is no coincidence that the phrase emerged as the digital revolution gathered pace. Just as the Second Industrial Revolution of the late nineteenth and early twentieth centuries was fueled by oil, the digital revolution today is powered by data. Both data and oil have reshaped the modern world and the global economy, and their economic value has given them tremendous sociotechnical impact.


Processing and Refinement

Clive Humby initially made the analogy to show that data and oil, in their raw forms, need an intermediate processing step before they gain value and become sellable products. Just as oligopolistic oil companies almost always own their refineries (allowing them to maintain near-total control of their supply chain), oligopolistic data companies often perform their data processing in-house. These companies can be called data-driven companies, because they rely on their processed data for revenue.


Like the oil industry, the data collection industry is dominated by a few large multinational companies. Countless smaller companies contribute in some way to the supply chains of data and oil, but the large companies dominate in terms of market share and overall sociotechnical influence. Examples of these larger companies in the oil industry include ExxonMobil, Chevron, BP, and Shell; their analogs in the data industry include Amazon, Google, Facebook, and YouTube (itself owned by Google). John D. Rockefeller, widely regarded as the world's first billionaire, led the Standard Oil Company (which was eventually split into the predecessors of companies like ExxonMobil and Chevron), and Jeff Bezos, often described as the world's first centi-billionaire, founded Amazon (currently the world's largest data-driven e-commerce platform). The parallels between the two industries are clear in the amount of wealth and power concentrated in them.

Data-Driven Companies

The business model of these companies is to collect data from their consumers and use it to guide internal decision-making, improving sales by better targeting products to individual consumers' preferences. The Internet enabled a new variant of this business model in which advertising is the sole source of revenue. Such companies, especially social networking and e-commerce platforms, collect data on their consumers in order to tailor the ads on their sites to each individual's preferences. This can be achieved with technologies like tracking pixels, which can monitor a person's browsing and buying history in order to recommend similar products in the future. In exchange for users' data, data-driven companies are able to offer their services to users for free. Facebook, YouTube, and Google are free to use, and while Amazon customers still pay for products and Prime membership, all consumers get free access to Amazon's massive sales and shipping infrastructure as well as a vast array of third-party vendors.
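
The tracking-pixel mechanism described above can be sketched roughly as follows. This is an illustrative toy, not how any particular company implements it: the endpoint `tracker.example`, the query parameter names (`uid`, `page`, `cat`), and the helper functions are all hypothetical. The idea is that each page embeds an invisible image whose URL carries identifying parameters, so every page view produces a request the tracker can log and later use to guess what a user is interested in.

```python
from collections import Counter, defaultdict
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; a real pixel is typically a tiny transparent
# image served from the tracker's own domain.
PIXEL_ENDPOINT = "https://tracker.example/pixel.gif"


def pixel_url(user_id, page, category):
    """Build the URL a page would embed as an invisible <img> tag."""
    query = urlencode({"uid": user_id, "page": page, "cat": category})
    return f"{PIXEL_ENDPOINT}?{query}"


def record_hit(history, url):
    """Tracker side: parse one pixel request and count the viewed category."""
    params = parse_qs(urlparse(url).query)
    uid = params["uid"][0]
    history[uid][params["cat"][0]] += 1


def top_category(history, user_id):
    """Return the category a user viewed most often, or None if unseen."""
    counts = history.get(user_id)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Simulate three page views by one (hypothetical) user.
history = defaultdict(Counter)
for page, cat in [("/tent", "camping"), ("/stove", "camping"), ("/novel", "books")]:
    record_hit(history, pixel_url("u42", page, cat))

print(top_category(history, "u42"))  # prints "camping"
```

In this sketch the "recommendation" is simply the most frequently viewed category; real systems combine many more signals, but the data flow (page view, logged request, per-user profile, targeted suggestion) is the same in outline.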

Data Bounty Hunter Companies

Also known as data brokers, data bounty hunter companies profit almost exclusively by collecting data and selling it to third parties. They usually gather data by buying it, mining public records, and/or creating applications that collect user data[1]. There are three main types of data broker. Companies such as PeopleFinders and Whitepages run online marketplaces where customers can pay for information about individuals. Companies such as Acxiom and Cambridge Analytica sell data to clients that use it for marketing decisions. Finally, companies like ID Analytics sell data to clients that use it for risk mitigation and identity verification[2].

A schematic of the difference in workflow between data-driven companies and data bounty hunter companies that shows how both classes of company generate revenue from their data.

Leaks and Spills

Oil spills cause lasting damage, especially to marine ecosystems, and usually provoke massive backlash from environmentalists and erode public trust in oil companies. Some notable examples in the United States include the 1989 Exxon Valdez and 2010 Deepwater Horizon oil spills. Similarly, data leaks provoke backlash from privacy advocates and erode public trust in the companies collecting consumer data.

Major Data Leaks

Countless data breaches have resulted in the non-consensual publishing of consumer personal data. Reasons for breaches include hacking, accidental uploading, and intentional leaking. Breaches affect both the private and public sectors and encompass a wide swath of data sets ranging from electronic medical records to location data to friends lists on social networking sites.

Facebook and Cambridge Analytica

The Facebook and Cambridge Analytica data scandal is a prime example of the interaction between a large data-driven company and a data bounty hunter company. Cambridge Analytica, a political data firm hired by Donald Trump’s 2016 election campaign, gained access to private information on more than 50 million Facebook users, according to initial reports. In a later company announcement about changes to its privacy practices, Mike Schroepfer, Facebook’s chief technology officer, stated that as many as 87 million users were affected, most of them residing in the United States [3].

Aleksandr Kogan, a researcher at Cambridge University, developed a personality quiz application that thousands of Facebook users installed on their Facebook accounts. The application received information about not just its users but also their entire networks of friends, including details on users’ identities, friend networks, and “likes”. Kogan's application stored the data it received from Facebook in a private database, which held information on the 50 million Facebook users and was given to Cambridge Analytica. With this data, Cambridge Analytica built 30 million “psychographic” profiles of voters. The idea was to map personality traits based on users' “likes” and use the results to target digital ads for the 2016 presidential campaign of Ted Cruz and the 2016 Trump campaign [4].

Before creating a Facebook account, users had to agree to the Terms of Service, which essentially act as a contract between a service provider and the people using the service [5]. Facebook insisted that what happened was not a data breach, since its terms, which users accepted when they created their accounts, allow researchers to use the data for academic purposes. However, Facebook prohibits data from being sold or transferred “to any ad network, data broker or other advertising or monetization-related service,” which is exactly what happened when the information was provided to Cambridge Analytica.


The LocationSmart scandal was another case, in which the real-time location of almost any phone in the United States was exposed without the consent of its owner. LocationSmart worked with the major U.S. wireless carriers, including AT&T, Verizon, Sprint, and T-Mobile, and received users’ location data from them so that it could triangulate their whereabouts more precisely using cell towers from multiple providers [6]. Robert Xiao, a researcher at Carnegie Mellon University, was able to use a free trial of the service to instantly obtain the location of any mobile phone on any of those carriers. Securus, one of LocationSmart’s clients, also used the service to find the location of any phone in the United States. An anonymous hacker gained access to Securus’ website, and with it the location data that Securus provided to law enforcement [7].

While this scandal did not receive as much news coverage as the Cambridge Analytica scandal, it exposed a serious data privacy and security problem: almost anyone could track a smartphone's position in real time, and phone users had no way to opt out.


In the United States, domestic offshore oil drilling and oil pipeline transport are subject to regulation by federal agencies, including the Environmental Protection Agency (EPA). Just as oil spills have prompted public demands for stricter regulation of oil companies, data leaks over the past twenty years have served as the impetus for data privacy regulations that are just beginning to take effect.

California Consumer Privacy Act

The California Consumer Privacy Act was signed into law in June 2018 by Governor Jerry Brown.[8] Under its main tenets, Californians have the right to 1) know what data is collected about them, 2) opt out of the sale of their data, and 3) have their data deleted.[9] The law is one of the first examples of comprehensive state legislation on consumer data protection.

“Data Dividend”

California is again trying to lead the way in data regulation by proposing the concept of the “data dividend.” Similar to how Alaskans receive a Permanent Fund dividend in part for oil drilling in their home state, California officials are proposing that Californians receive an annual lump-sum payment for the consented use and sale of their data online. Entrepreneur and presidential candidate Andrew Yang and California governor Gavin Newsom have both expressed their support for the "data dividend."[10][11] Many argue this proposal is impractical. For example, former Facebook executive Antonio García Martínez contends that Amazon, Google, and Facebook do not believe they owe their consumers anything, because they provide their services for free in exchange for user data that can be used to generate advertising revenue.[12]

General Data Protection Regulation

The General Data Protection Regulation is a European Union law adopted in April 2016; it has applied since May 2018.[13]


A Useful Commodity

How will this be different than talking about how companies use data? Will it focus more on history of when it was first used?

Power and Exploitation

Again, haven't we already talked about this? How will it be different?


Companies are built on collecting data and providing marketing and related services, so they are clearly dependent on it.


Participants who advocate for privacy and related causes.

Accuracy of the Analogy


  1. Pasternack, A., & Melendez, S. (2019, May 28). Here are the data brokers quietly buying and selling your personal information. https://www.fastcompany.com/90310803/here-are-the-data-brokers-quietly-buying-and-selling-your-personal-information (accessed December 2, 2019).
  2. Pasternack, A., & Melendez, S. (2019, May 28). Here are the data brokers quietly buying and selling your personal information. https://www.fastcompany.com/90310803/here-are-the-data-brokers-quietly-buying-and-selling-your-personal-information (accessed December 2, 2019).
  3. Granville, K. (2018, March 19). Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html (accessed November 30, 2019).
  4. Meyer, R. (2018, October 26). The Cambridge Analytica Scandal, in 3 Quick Paragraphs. https://www.theatlantic.com/technology/archive/2018/03/the-cambridge-analytica-scandal-in-three-paragraphs/556046/ (accessed November 30, 2019).
  5. What Are Terms of Service: Everything You Need to Know. (n.d.). https://www.upcounsel.com/what-are-terms-of-service (accessed November 30, 2019).
  6. Oremus, W. (2018, May 21). The Privacy Scandal That Should Be Bigger Than Cambridge Analytica. https://slate.com/technology/2018/05/the-locationsmart-scandal-is-bigger-than-cambridge-analytica-heres-why-no-one-is-talking-about-it.html (accessed November 30, 2019).
  7. The critical security crisis nobody's talking about. (2018, May 22). https://nordvpn.com/blog/securus-locationsmart-phone-tracking/ (accessed November 30, 2019).
  8. California State Legislature. (2018). California Consumer Privacy Act of 2018. https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=201720180AB375 (accessed November 28, 2019).
  9. Californians for Consumer Privacy. (2019). About the California Consumer Privacy Act. https://www.caprivacy.org/about (accessed November 28, 2019).
  10. Clifford, C. (2019). Andrew Yang: You should get a check in the mail from Facebook, Amazon, Google for your data. https://www.cnbc.com/2019/10/17/andrew-yang-facebook-amazon-google-should-pay-for-users-data.html (accessed November 28, 2019).
  11. Daniels, J. (2019). California governor proposes ‘new data dividend’ that could call on Facebook and Google to pay users. https://www.cnbc.com/2019/02/12/california-gov-newsom-calls-for-new-data-dividend-for-consumers.html (accessed November 28, 2019).
  12. García Martínez, A. (2019). No, Data Is Not the New Oil. https://www.wired.com/story/no-data-is-not-the-new-oil/ (accessed November 28, 2019).
  13. European Union. (2016). General Data Protection Regulation (GDPR). https://gdpr-info.eu/ (accessed November 28, 2019).