AI Democratization in the Era of GPT-3

On September 22nd, 2020, Microsoft announced that “Microsoft is teaming up with OpenAI to exclusively license GPT-3”. In the conclusion of the announcement, they state that “we’ll also continue to work with OpenAI to keep looking forward: leveraging and democratizing the power of their cutting-edge AI research as they continue on their mission to build safe artificial general intelligence”. Yet in contrast to this statement, the exclusivity agreement and OpenAI’s prior decision not to open source the GPT-3 code represent troubling developments for the notion of democratizing AI, due to a host of concerns that I’ll address in this piece.

To me, AI democratization means making it possible for everyone to create artificial intelligence systems. This involves

  1. Having access to powerful AI models
  2. Having access to algorithms
  3. Having access to computing resources necessary to use algorithms and models
  4. Being able to use the algorithms and models, potentially without requiring advanced mathematical and computing science skills

Democratization of AI means more people are able to conduct AI research and/or build AI-driven products and services. On the research side, more researchers mean that more diverse solutions to more research challenges can be explored, with the potential for new breakthroughs, and for breakthroughs to happen more often. On the product and service side, democratization means more diverse sets of people working on realizing things of value to their communities that might otherwise be overlooked. It also means more market competition. Democratization means lowered barriers to entry in terms of resources and knowledge.

For the purposes of this piece, I focus primarily on the "having access to powerful AI models" part of democratization, since GPT-3 is exactly such a pre-built AI model. Other relevant things to know about it from the perspective of AI democratization include:

  1. The neural network model is so large that it cannot easily be moved off the cluster of machines it was trained on
  2. It is so large that it cannot be trained without hundreds of thousands of dollars' worth of cloud computing resources
  3. It is accessible to those outside of OpenAI via an API — that is, one can send inputs and receive outputs but cannot see any of the details within the model
  4. OpenAI has only provided access to a small number of external people, currently on a trial basis
  5. OpenAI intends to eventually sell access to the API
  6. OpenAI has now exclusively licensed GPT-3 to Microsoft — Microsoft will be the only company outside of OpenAI to have the ability to directly access the code and the model. Others will still be able to access GPT-3 through the API. It is unclear at this time whether the OpenAI API itself will be discontinued.

GPT-3, and other very large models created at Microsoft and Google, raise serious concerns about the “democratization” of AI. The issue was raised at least as early as 2016 in an Aspen Institute report. Since then, models have only gotten larger and more expensive.

One of the many concerns is the replicability of very large models. From a scientific perspective, it is desirable for outside groups to replicate the results reported in research papers. However, very few groups can expend the resources required to do so, and those groups, being predominantly corporate, don’t necessarily have an incentive to do replication. The development of Grover as a quasi-replication of GPT-2 is an interesting case, the Allen Institute being a well-funded non-profit.

One good thing is that Google, Microsoft, OpenAI, Facebook, Salesforce, and others have made their models publicly available. The average person could not recreate models of this size from scratch, but the released models can run on a single machine with a single GPU. Even this presents some risk to the democratization of AI, because the requirement of having a modern GPU can itself be a barrier. This is somewhat mitigated by online services such as Google Colab (and other cloud services like Google Cloud, Amazon Web Services, and Azure, though these can be costly).

GPT-3 represents a new circumstance. For the first time, a model is so big that it cannot easily be moved to another cloud and certainly cannot run on a single computer with one or a small number of GPUs. Instead, OpenAI is providing an API so that the model can be run on their cloud. Right now there are free trials and academic access to the API, but this may not last, as OpenAI has advertised intentions to monetize access to GPT-3.
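For concreteness, here is a minimal sketch of what API-mediated access looks like, assuming the openai Python client and an API key granted by OpenAI; the prompt, engine name, and sampling parameters are purely illustrative. The point is that only inputs and outputs cross the wire.

```python
import os
import openai

# Access is gated by an API key issued by OpenAI; there is no model to download.
openai.api_key = os.environ["OPENAI_API_KEY"]

# Send a prompt, get generated text back. The weights and internals remain on
# OpenAI's cloud, so this access can be re-priced or revoked at any time.
response = openai.Completion.create(
    engine="davinci",  # illustrative engine name for the full GPT-3 model
    prompt="Explain in one sentence what it means to democratize AI.",
    max_tokens=60,
    temperature=0.7,
)

print(response.choices[0].text.strip())
```

Nothing in this exchange gives the caller the weights, the training data, or any guarantee of continued access.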

This creates a risk to democratization in a number of ways. First, OpenAI will have control over who has access. If we believe that the road to better AI is in fact a function of larger models, then OpenAI becomes a gatekeeper of who can have good AI and who cannot. They will be in a position to exert influence (explicitly or implicitly) over wide swaths of the economy. Unlike GPT-2, BERT, and other models, which can be downloaded without any terms-of-service agreements, GPT-3 sits behind an API that OpenAI could simply turn off.

Further, depending on the pricing model, certain groups would be more severely impacted. Would academics be able to afford access? Individual hobbyists? Startups that don’t yet have revenue streams? Non-profit humanitarian organizations? Activists? (Activism is a particularly interesting case. Suppose there are ways to use AI to help promote pro-democracy causes. Denying or pricing out access would put OpenAI in a position of potentially harming pro-democracy movements around the world. What about activism that promotes regulation of the tech industry, or anything else that might be arguably beneficial to Americans but unfavorable to industry?)

Other questions arise about the power that OpenAI could choose to assume (or that will be thrust upon it):

  • If OpenAI decides to limit uses, what are the governing principles they use to decide who can use GPT-3 and who gets cut off?
  • Will OpenAI look at the outputs and try to make judgement calls about whether their technology is being used appropriately? This seems to be a critical question given OpenAI’s mission statement and how it is at odds with their new for-profit mode. Can they even monitor at scale?
  • If they sell access to GPT-3 to another company that uses it for malicious (intentional or unintentional) purposes, will they provide recourse to those who are adversely affected? There is some precedent with other cloud services that have tried to prohibit certain uses, such as trying to identify and stop bitcoin mining on AWS.
  • Will people negatively affected by something generated by GPT-3 even be able to tell that the content came from GPT-3, so that they can seek recourse from OpenAI?
  • Would OpenAI, now in a for-profit mode, be tempted to monitor how GPT-3 is used and take ideas for new services that they can monetize, thus gaining an unfair advantage over competitors? They will have the prompts and the outputs, so they would not have to reinvent anything to know what will work well.

Up until now, the big 4 have furthered the democratization of AI. Interestingly, fine-tuning BERT is relatively easy and can be done with small datasets, meaning that users of the released models can train good models with fewer resources and in less time. GPT-3 presents a potentially huge shift in the accessibility of state-of-the-art AI technologies. Part of this is that GPT-3 actually looks useful for a lot of things that are not just research. In contrast, Dota 2–playing and StarCraft-playing agents are also large and unwieldy, but their use is rather limited, so there hasn’t been a large demand for them.
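As a rough illustration of how low that barrier is, here is a minimal sketch of fine-tuning BERT on a small labeled dataset, assuming the Hugging Face transformers and datasets libraries; the dataset, subset sizes, and hyperparameters are illustrative rather than a prescription.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# The pretrained BERT weights are publicly downloadable; a single modern GPU suffices.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A small labeled dataset (here, a slice of IMDB movie reviews) is often enough.
dataset = load_dataset("imdb")
encoded = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-finetuned", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=encoded["train"].shuffle(seed=42).select(range(2000)),  # a few thousand examples
    eval_dataset=encoded["test"].select(range(500)),
)
trainer.train()
```

Everything here runs locally or on a modest cloud instance, and the resulting model belongs to whoever trained it; the contrast with API-gated access to GPT-3 is the point.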

From the research side of things, we continue to see a strong correlation between the size of a model and its performance on many popular benchmarks. Universities can no longer compete on this front. Academics can still make valuable research contributions by showing new ways to do things that work on smaller, publicly available models. However, it does put a number of research goals out of reach without collaboration with large tech firms. Universities are a major source of democratization of AI because research results are generally made publicly available through research papers (and, increasingly, code repositories).

Companies are under no such incentive to make their research publicly available. Up until now they have largely done so (to our knowledge, of course): they keep their engineering systems proprietary, but the general knowledge has been put out. At some point it is possible that economic pressures will make it so that the major tech firms no longer write papers or release models. In that sense, GPT-3 and its exclusive licensing by Microsoft could be seen as a sign of things to come.

Further, if one were to believe the “windfall” theory of the singularity (I do not subscribe to it), then the first to develop true artificial general intelligence would basically dominate almost all sectors of the world economy and become a single global monopoly. If that were to become the prevailing belief, then it would be incumbent upon a tech firm to cut off all access to AI research and development produced in-house.

There is a parallel with the challenges that Facebook and Twitter face as they have become de facto public services while still being for-profit companies. They are in a position to decide what is normative for public discourse, who gets punished, and who is allowed to break the rules. What is ethical and moral? The answer to that is often in conflict with what is good for a company’s financial bottom line. OpenAI may soon find itself in a similar situation, where it is the de facto arbiter of ethics and morality with regard to the deployment of AI services. Are they prepared? Do we trust them to take on this role? And if not, how can academics and practitioners fight for the continued democratization of AI, as some of its most important techniques become as hard to replicate as GPT-3?

This is an updated version of a piece originally released on Medium.


Author Bio
Mark Riedl is an Associate Professor in the College of Computing and Associate Director of the Machine Learning Center at the Georgia Institute of Technology. Dr. Riedl's research is on human-centered artificial intelligence and machine learning, specifically focusing on ethics, safety, explainability, storytelling, and computer games. Dr. Riedl earned a PhD degree in 2004 from North Carolina State University. From 2004 to 2007, Dr. Riedl was a Research Scientist at the University of Southern California Institute for Creative Technologies. Dr. Riedl joined the Georgia Tech College of Computing in 2007.

Acknowledgments
The main image of this piece originated from this OpenAI blog post.

Citation
For attribution in academic contexts or books, please cite this work as

Mark Riedl, "AI Democratization in the Era of GPT-3", The Gradient, 2020.

BibTeX citation:

@article{riedl2020democratizationgpt3,
 author = {Riedl, Mark},
 title = {AI Democratization in the Era of GPT-3},
 journal = {The Gradient},
 year = {2020},
 howpublished = {\url{https://thegradient.pub/ai-democratization-in-the-era-of-gpt-3/}},
}

If you enjoyed this piece and want to hear more, subscribe to the Gradient and follow us on Twitter.