
Google’s Gemini AI demo turned out to be “fictionalized”

Google recently introduced the Gemini AI model as its response to ChatGPT and the GPT models behind it. Claiming that Gemini surpasses GPT-4 in almost every area, Google also shared several demo videos. One six-minute video showed off Gemini’s multimodal capabilities (e.g. spoken prompts combined with image recognition). However, it turns out that this video was not entirely real.

Did Google deceive people with its Gemini video?

In the six-minute demo video, Gemini appeared to recognize images, respond within seconds, accurately track a ball of paper hidden under a cup in a cup-shuffling trick, and more. But the video was a little too good to be true, and in fact Google admits as much. While the video spread quickly around the world, the disclaimer beneath it seems to have gone unnoticed: “For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.” No such note appears on Google’s other videos.

That said, this is not exactly cause for outrage, since companies often do this sort of thing in their demo videos. “All user commands and output in the video are real and have been truncated for brevity,” Oriol Vinyals, vice president of research and deep learning lead at Google DeepMind, said in a post on X. So, according to Google, the capabilities shown in the video are real; the model is just not that responsive. Vinyals adds that they made the video to show what multimodal user experiences built with Gemini could look like and to inspire developers.

Additionally, it was stated that Gemini was given images and text and asked to respond by predicting what would happen next. Google said: “We created the demo by shooting footage to test Gemini’s capabilities on a wide range of challenges. We then guided Gemini using frames from the footage and through text.” So while Gemini appeared to do the things shown in the video, with immediate responsiveness, it did not, and perhaps could not, do them live in the way the video implied.

Google calls Gemini its most advanced AI model, and perhaps it really is; we don’t know yet. The most important point is that the model is natively “multimodal”: it can process inputs such as photos, video, audio, and text. ChatGPT and others do this through plugins, so at their core they are not truly multimodal. Beyond that, it might be better for Google to launch a small beta so that the true potential of Gemini can be understood. That way, people could challenge the model under real-world conditions and experience how powerful it really is.
