AI Chatbot Has All the Answers
An AI chatbot chatting with the elderly at a care service center in Hefei, capital of east China's Anhui province. (PHOTO: XINHUA)
By Staff Reporters
The AI chatbot ChatGPT has become an Internet sensation since its recent launch. It is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response, according to its developer, the AI company OpenAI, which was co-founded by Elon Musk and others.
ChatGPT is described as an alternative to Google's search engine, as it can provide thoughtful and thorough responses to questions and prompts, no matter how easy or complex. For example, it can write lyrics or a poem, write computer code and fix broken code, gather highly specific information, provide recommendations, as well as create automated chatbots.
ChatGPT has already impressed many technologists with its ability to mimic human language and speaking styles while also providing coherent and topical information, according to NBC News. For example, a user asked ChatGPT to "explain zero point energy but in the style of a cat". It responded, "Meow, meow, meow, meow! Zero point energy is like the purr-fect amount of energy ..."
The OpenAI team trained the model with Reinforcement Learning from Human Feedback (RLHF), following the same methods used for InstructGPT. They first trained an initial model using supervised fine-tuning: human AI trainers provided conversations in which they played both sides, the user and an AI assistant. The team gave the trainers access to model-written suggestions to help them compose their responses.
To create a reward model for reinforcement learning, the team collected comparison data, which consisted of two or more model responses ranked by quality. Using the resulting reward model, the team then fine-tuned the chatbot with Proximal Policy Optimization (PPO).
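For readers curious how ranked comparisons can become a training signal, the two steps above can be sketched roughly as follows. This is an illustration only, not OpenAI's code: the article does not specify the loss functions, so the pairwise logistic loss and the standard PPO clipped objective shown here are assumptions, and names such as `toy_reward` are hypothetical stand-ins for a learned neural network.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps input order, so the first item of each pair
    # is always the one ranked higher.
    return list(combinations(ranked_responses, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Pairwise logistic loss: -log(sigmoid(preferred - rejected)).

    Assumed loss; it is small when the preferred response scores higher.
    """
    diff = score_preferred - score_rejected
    return math.log1p(math.exp(-diff))


def toy_reward(response):
    """Hypothetical stand-in for a learned reward model (scores by length)."""
    return 0.1 * len(response.split())


def mean_ranking_loss(ranked_responses, reward_fn):
    """Average the pairwise loss over every pair implied by one ranking."""
    pairs = ranking_to_pairs(ranked_responses)
    losses = [pairwise_loss(reward_fn(a), reward_fn(b)) for a, b in pairs]
    return sum(losses) / len(losses)


def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for a single sample.

    ratio is new_policy_prob / old_policy_prob for the sampled action;
    advantage would be derived from the reward model's score.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    ranking = [
        "a detailed and helpful answer to the question",
        "a short answer",
        "no",
    ]
    print(mean_ranking_loss(ranking, toy_reward))
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
```

In a real system the reward function would be a neural network trained to minimize the ranking loss, and the clipped objective would be maximized over many sampled responses; the clipping keeps each update from moving the chatbot too far from its previous behavior.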