OpenAI

See also: Organizations

Introduction

OpenAI is an artificial intelligence (AI) research company founded in 2015. Originally a non-profit intended to be free from the need to generate financial return, it has since also founded OpenAI LP, a for-profit corporation [1] [2] [3]. Its stated mission is to promote and develop friendly AI that benefits all of humanity [1] [2]. It received financial support from its founding members Sam Altman, Greg Brockman and Elon Musk, and also from Jessica Livingston, Peter Thiel, Amazon Web Services, Infosys and YC Research, with a $1 billion investment in total [3]. In 2020, Microsoft, another investor in OpenAI, announced an exclusive license agreement for GPT-3 [2].

In August 2022, the company announced that it would reduce the price of its GPT-3 API by more than 50%, effective September 1. The price drop makes access to AI base models more affordable than with CohereAI, although still more expensive than with GooseAI [4] [5].

Principles

To achieve its goal of ensuring that artificial general intelligence (AGI) benefits all of humanity, the company states several principles on its official website [6]:

  1. Broadly Distributed Benefits: OpenAI commits to using its influence over AGI's deployment to ensure it benefits all, avoiding applications of AI and AGI that harm humanity or lead to a concentration of power, while balancing its need for resources against its mission;
  2. Long-Term Safety: OpenAI commits to researching AGI safety and driving broad adoption of such research across the AI community. It also commits to assisting other value-aligned, safety-conscious projects that come close to building AGI before OpenAI does;
  3. Technical Leadership: OpenAI must be on the cutting edge of AI capabilities in order to effectively address AGI's impact on society;
  4. Cooperative Orientation: to address AGI's global challenges, OpenAI wants to actively cooperate with other research and policy institutions while also providing public goods that help society navigate the path to AGI, by publishing AI, safety, policy and standards research.

Applications

OpenAI Gym

OpenAI Gym is a toolkit for developing and comparing reinforcement learning (RL) algorithms. While best suited to RL research, it is not restricted to it and can support alternative methods such as hard-coded game solvers or other deep learning strategies [7] [8] [9]. The environments are written in Python, with the stated intention of making them easy to use from any language in the future. OpenAI Gym is compatible with algorithms written in any framework, such as TensorFlow and Theano [9]. The platform includes a collection of benchmark problems that share a common interface, and a website where the community can share results and compare the performance of algorithms. Users are also encouraged to share the source code and instructions needed to reproduce their results [7].
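
The common interface mentioned above can be illustrated with a minimal sketch. The code below does not import Gym. It defines a hypothetical toy environment, CorridorEnv, that imitates the classic reset/step convention, in which step returns an observation, a reward, a done flag and an info dictionary. The class, the helper run_episode and the policy always_right are illustrative assumptions, and newer releases of the library may use a different return signature.

```python
class CorridorEnv:
    """Hypothetical toy environment (not part of Gym).

    The agent starts at the left end of a corridor and is rewarded
    for reaching the right end. Actions: 0 = move left, 1 = move right.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps_taken = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps_taken = 0
        return self.position

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps_taken += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps_taken >= self.max_steps
        return self.position, reward, done, {}


def run_episode(env, policy):
    """Run one episode with any object exposing reset() and step()."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        observation, reward, done, _info = env.step(policy(observation))
        total_reward += reward
    return total_reward


def always_right(observation):
    """Trivial policy that ignores the observation and moves right."""
    return 1


print(run_episode(CorridorEnv(), always_right))  # 1.0
```

Because run_episode relies only on reset and step, the same loop can drive any environment that follows this convention, which is what makes results comparable across benchmark problems.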

DALL-E

Figure 1. Example of Outpainting of August Kamp's "Girl With Earring". Source: DPReview.

According to OpenAI's official website, DALL-E is an "AI system that can create realistic images and art from a description in natural language." It can combine concepts, attributes and styles [10]. In July 2022, the current version, DALL-E 2, entered public beta. This version has a new feature called Outpainting that allows users to continue an image beyond its original borders (figure 1). This new system expands on the "Inpainting" feature, where edits could be made to generated or uploaded images, although confined to preexisting borders. For example, Outpainting can extend original images and create larger ones in different aspect ratios while also considering existing shadows, reflections and textures [11].

DALL-E 2 also allows the editing of images with human features, something that was previously unavailable due to concerns over abuse. Users can upload a photograph of a person and edit specific features like clothing or hairstyle. The implementation of this feature came after OpenAI improved filters to remove images containing sexual, political and violent content [12].

OpenAI has granted users "full usage rights to commercialize the images they create with DALL-E, including the right to reprint, sell, and merchandise [13]."

GPT-1 and GPT-2

GPT, or Generative Pre-trained Transformer, was introduced in the paper Improving Language Understanding by Generative Pre-Training in June 2018. GPT-1 combined the Transformer architecture with unsupervised learning to create a model with 117 million parameters, trained on 7,000 books. GPT-2, released in February 2019 with the paper Language Models are Unsupervised Multitask Learners, had 1.5 billion parameters and was trained on 40 GB of web text from 8 million documents.

GPT-3

GPT-3 (Generative Pre-trained Transformer 3) is the third generation of a computational system that generates text, code or other data, starting from a source input, the prompt. This system uses deep learning to produce human-like text [2] [14]. According to Zhang & Li (2021), GPT-3 is the "language model with the most parameters, the largest scale, and the strongest capabilities. Using a large amount of Internet text data and thousands of books for model training, GPT-3 can imitate the natural language patterns of humans nearly perfectly. This language model is extremely realistic and is considered the most impressive model as of today [15]."

To produce relevant results, the system is trained on an unlabeled dataset with texts from Wikipedia and other sites, in English and other languages. In 2018, the first version of GPT used 110 million learning parameters, that is, the values that a neural network tries to optimize during training. Version 2 used 1.5 billion and GPT-3 uses 175 billion parameters. This training is costly (around $12 million) and runs on Microsoft's Azure AI supercomputer [2].
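
To make the notion of a parameter concrete, the sketch below counts the trainable values in a small fully connected network. This is only an illustration of what is being counted. It is not the GPT architecture, and the function name count_dense_parameters and the layer sizes are assumptions chosen for the example.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each pair of adjacent layers contributes inputs * outputs weights
    plus one bias per output unit.
    """
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs + outputs
    return total


# A tiny network: 4 inputs, 8 hidden units, 2 outputs.
print(count_dense_parameters([4, 8, 2]))  # (4*8 + 8) + (8*2 + 2) = 58

# Ratio between the GPT-3 and GPT-2 parameter counts quoted above.
print(round(175e9 / 1.5e9))  # 117
```

Training adjusts every one of these values, which is one reason why cost grows so quickly as models scale from millions to billions of parameters.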

In 2021, MIT Technology Review selected GPT-3 as one of the "Top 10 Breakthrough Technologies" due to its "extensive parameter scale, exceptional modeling ability, multi-task generalization performance, and few-shot learning ability [15]."

Universe

OpenAI Universe is a software platform for measuring and training AI's general intelligence with video games, applications and websites. With this software, an AI agent can use a computer like a human, looking at screen pixels and using a virtual keyboard and mouse, allowing the training of a single agent in tasks that a human can complete with a computer [16] [17]. According to OpenAI, the goal is to "develop a single AI agent that can flexibly apply its past experience on Universe environments to quickly master unfamiliar, difficult environments, which would be a major step towards general intelligence [16]." To achieve that, it was released with Atari 2600 games, 1000 flash games and 80 browser environments [17].

Copilot

GitHub Copilot is an AI tool developed by OpenAI and GitHub that suggests code and entire functions in real time for programmers [18] [19]. It has been described as an autocomplete for software development. Released during the summer of 2021, the tool has been used by tens of thousands of programmers. While it makes errors, feedback from professionals has been positive, with users observing that it accelerates their pace. According to data provided by GitHub and OpenAI, Copilot writes 35 percent of its users' newly posted code [19].

Whisper

OpenAI Whisper is an open-source automatic speech recognition (ASR) system that enables transcription in multiple languages and translation into English [20] [21] [22]. OpenAI stated that the system has been "trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language [22]." The primary users of Whisper are intended to be AI researchers "studying robustness, generalization, capabilities, biases and constraints of the current model." Several downloadable versions of Whisper are available on GitHub and, according to OpenAI, have shown strong ASR results in about 10 languages [20].

ChatGPT

ChatGPT is a large language model (LLM) trained to generate human-like text based on user input. It is built on an upgraded version of GPT-3 called GPT-3.5. In addition to text, ChatGPT was trained on programming code, which is said to improve its reasoning and the logical consistency of its responses. It was also fine-tuned with supervised learning on labeled content, such as labeled dialogue and human-ranked answers in Q&A, to give it conversational abilities. ChatGPT was introduced in November 2022 and quickly became extremely popular, with millions of people signing up to try it out. Its distinctive features include the ability to generate text in response to a prompt, interact in a conversational way, provide information on a wide range of topics, and be embedded in various applications. The potential applications of ChatGPT are vast and could revolutionize many commercial and non-commercial fields, such as content creation and programming [23].

AI Text Classifier

AI Text Classifier is an AI content detector trained to distinguish between text written by a human and text written by artificial intelligence from various sources. The classifier correctly identifies 26% of AI-written texts as "likely AI-written" while incorrectly labeling 9% of human-written texts as such. OpenAI made it available to gather feedback on whether imperfect tools like this one are useful. Its limitations include unreliability on short inputs (below 1,000 characters), occasional confident but incorrect labels, reliable operation only in English, an inability to reliably identify very predictable text, and the fact that edited AI-written text can evade detection. The training data includes pairs of human and AI-generated responses to prompts submitted to InstructGPT, along with pretraining data sets collected from multiple sources [24].
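
The two rates quoted above do not directly say how often a "likely AI-written" label is correct, because that also depends on how much of the checked text is AI-written in the first place. The following worked sketch applies Bayes' rule to the 26% and 9% figures. The shares of AI-written text (10% and 50%) are purely illustrative assumptions, and the function name precision_of_flag is hypothetical.

```python
def precision_of_flag(true_positive_rate, false_positive_rate, ai_share):
    """Probability that a flagged text is really AI-written.

    true_positive_rate: fraction of AI-written texts that get flagged.
    false_positive_rate: fraction of human-written texts that get flagged.
    ai_share: assumed fraction of all checked texts that are AI-written.
    """
    flagged_ai = true_positive_rate * ai_share
    flagged_human = false_positive_rate * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


# Rates from the text above; the AI shares are assumptions.
for ai_share in (0.1, 0.5):
    print(ai_share, round(precision_of_flag(0.26, 0.09, ai_share), 3))
# 0.1 0.243
# 0.5 0.743
```

Under these assumptions, if only one text in ten is AI-written, roughly three out of four flags would fall on human-written text, which illustrates why the tool was presented as imperfect.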

References

  1. OpenAI. About. OpenAI. https://openai.com/about/
  2. Floridi, L and Chiriatti, M (2020). GPT-3: Its Nature, Scope, Limits, and Consequences. Minds and Machines 30:681-694.
  3. Olanoff, D (2015). Artificial Intelligence Nonprofit OpenAI Launches With Backing From Elon Musk and Sam Altman. TechCrunch. https://techcrunch.com/2015/12/11/non-profit-openai-launches-with-backing-from-elon-musk-and-sam-altman/
  4. Pandey, M (2022). OpenAI Slashes its Prices by 50%. Analytics India Magazine. https://analyticsindiamag.com/openai-slashes-its-prices-by-50/
  5. Dickson, B (2022). OpenAI is reducing the price of the GPT-3 API — here’s why it matters. Venture Beat. https://venturebeat.com/ai/openai-is-reducing-the-price-of-the-gpt-3-api-heres-why-it-matters/
  6. OpenAI. OpenAI Charter. OpenAI. https://openai.com/charter/
  7. Brockman et al. (2016). OpenAI Gym. arXiv:1606.01540v1.
  8. Sonawane, B (2018). Getting started with OpenAI Gym. Towards Data Science. https://towardsdatascience.com/getting-started-with-openai-gym-d2ac911f5cbc
  9. OpenAI (2016). OpenAI Gym Beta. OpenAI. https://openai.com/blog/openai-gym-beta/#rl
  10. OpenAI. DALL-E 2. OpenAI. https://openai.com/dall-e-2/
  11. Gray, J (2022). OpenAI adds 'Outpainting' Feature to its AI System, DALL-E, Allowing Users to Make AI Images of Any Size. Digital Photography Review. https://www.dpreview.com/news/8850471712/openai-adds-outpainting-feature-dall-e-allowing-users-to-make-ai-images-of-any-size
  12. Vincent, J (2022). OpenAI’s Image Generator DALL-E can Now Edit Human Faces. The Verge. https://www.theverge.com/2022/9/20/23362631/openai-dall-e-ai-art-generator-edit-realistic-faces-safety
  13. Rizo, J (2022). Who Will Own the Art of the Future? Wired. https://www.wired.com/story/openai-dalle-copyright-intellectual-property-art/
  14. Wilhelm, A (2021). Ok, the GPT-3 Hype Seems Pretty Reasonable. TechCrunch. https://techcrunch.com/2021/03/17/okay-the-gpt-3-hype-seems-pretty-reasonable/
  15. Zhang, M and Li, J (2021). A Commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6):831-833.
  16. OpenAI (2016). Universe. OpenAI. https://openai.com/blog/universe/
  17. Mannes, J (2016). OpenAI's Universe is the Fun Parent Every Artificial Intelligence Deserves. TechCrunch. https://techcrunch.com/2016/12/05/openais-universe-is-the-fun-parent-every-artificial-intelligence-deserves/
  18. GitHub. Your AI Pair Programmer. GitHub. https://github.com/features/copilot
  19. Thompson, C (2022). It’s Like GPT-3 but for Code—Fun, Fast, and Full of Flaws. Wired. https://www.wired.com/story/openai-copilot-autocomplete-for-code/
  20. Wiggers, K (2022). OpenAI open-sources Whisper, a Multilingual Speech Recognition System. TechCrunch. https://techcrunch.com/2022/09/21/openai-open-sources-whisper-a-multilingual-speech-recognition-system/
  21. Bureau, ET (2022). OpenAI Open-Sources Multilingual Speech Recognition System, Dubbed Whisper. EnterpriseTalk. https://enterprisetalk.com/quick-bytes/openai-open-sources-multilingual-speech-recognition-system-dubbed-whisper/
  22. OpenAI (2022). Introducing Whisper. OpenAI. https://openai.com/blog/whisper/
  23. OpenAI (2022). ChatGPT: Optimizing Language Models for Dialogue. OpenAI. https://openai.com/blog/chatgpt/
  24. Hendrik, J (2023). New AI classifier for indicating AI-written text. OpenAI. https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/