According to a research blog post, OpenAI today released GPT-4, the next-generation AI language model that can interpret photographs and describe what's in them.
The world has been captivated by ChatGPT, but until now the underlying deep learning language model has supported only text inputs. GPT-4 also accepts image prompts.
“It generates text outputs given inputs consisting of interspersed text and images,” OpenAI wrote today. “Over a range of domains — including documents with text and photographs, diagrams, or screenshots — GPT-4 exhibits similar capabilities as it does on text-only inputs.”
In practice, this means that the AI chatbot will be able to assess the contents of an image. For example, it can explain to the user what is odd about an image of a man ironing his clothes while attached to a taxi.
Andreas Braun, chief technical officer for Microsoft Germany, stated last week that GPT-4 will "offer completely different possibilities — for example, videos."
However, today's release makes no mention of video, and the only multimodal component is image input, which is far less than anticipated.
Microsoft has already introduced Kosmos-1, a multimodal language model that works with several formats.
In the Kosmos-1 demonstration, the AI can interpret the contents of images. For instance, the AI is asked, "What time is it now?" after receiving a picture of a clock showing 10:10. The AI responds, "10:10 on a gigantic clock."
It can also identify a specific hairstyle that a woman is wearing, or recognize a movie poster and tell the user when the movie will be released.
The "iPhone Moment"
The CEO of Microsoft Germany, Marianne Janik, joined Braun at the "AI in Focus — Digital Kickoff" event in Germany. Janik calls ChatGPT "an iPhone moment."
According to Heise, she says the goal is not to eliminate jobs but to perform repetitive work in a new way.
She asserts that "disruption does not inevitably mean job losses." Many professionals would be needed to make the use of AI valuable.
With 100 million users, ChatGPT is the fastest-growing consumer app in history.
Elon Musk co-founded OpenAI, the company that also runs DALL-E. He left it in 2018 and has since criticized it.
On February 17, he wrote, "OpenAI was created as an open source (which is why I named it "Open" AI), non-profit company to serve as a counterweight to Google, but now it has become a closed source, maximum-profit company effectively controlled by Microsoft. Not what I intended at all."
Update 14/3: This post has been revised in light of OpenAI's GPT-4 release, which made clear that the model has no video capability and that images can only be used as input, rather than generated as originally believed.
Source: https://petapixel.com/2023/03/14/chat-gpt-4-will-let-you-turn-text-into-video-and-is-coming-next-week/