{"id":89,"date":"2024-05-14T20:42:05","date_gmt":"2024-05-14T20:42:05","guid":{"rendered":"https:\/\/artificial-intelligence.news\/?p=89"},"modified":"2024-05-14T20:42:05","modified_gmt":"2024-05-14T20:42:05","slug":"gpt-4o-the-new-gpt-on-the-block","status":"publish","type":"post","link":"https:\/\/artificial-intelligence.news\/?p=89","title":{"rendered":"GPT-4o &#8211; The New GPT on the Block"},"content":{"rendered":"\n<p>OpenAI released, GTP-4o, their latest GPT model yesterday. The &#8220;o&#8221; stands for omni, and hints at the models ability to process audio, image, video and text seamlessly. The demo was performed live, where the GPT responded using audio and with a rich personality. Amongst the showcased talents were voice acting, singing, real time translation and recognising hand written notes.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Live demo of GPT-4o vision capabilities\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/RI-BxtCx32s?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>One of the focal points was when it solved a linear algebra problem in real time, as it was written down on paper. This contrasts with the Google Gemini demo where it recognised a hand drawn duck, but was later controversially revealed to have been staged.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI Companion<\/h2>\n\n\n\n<p>The main strength of the model appears to lie in it&#8217;s holistic abilities of combining video, audio images and text. Where it can seamlessly switch between any of these as a source of input. 
This is similar to what the Humane AI Pin tried to achieve, which Marques Brownlee famously roasted as &#8220;bad at almost everything&#8221;:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"The Worst Product I&#039;ve Ever Reviewed... For Now\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/TitZV6k8zfA?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>However, GPT-4o avoids many of the pitfalls of the Humane AI Pin: it is packaged as a convenient mobile and desktop app, is accurate and has quick response times, which could make it a viable real-life assistant for people in their day-to-day activities.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Performance<\/h2>\n\n\n\n<p>The new model is capable of conversing with response times matching human levels. It matches GPT-4 performance for English text and code, improves on it for non-English text, is faster and its API is 50% cheaper.<\/p>\n\n\n\n<p>On the 0-shot CoT MMLU benchmark of general-knowledge text questions, GPT-4o comes out ahead of the Claude (Anthropic), Gemini (Google) and Llama (Meta) models, with a record-breaking 88.7% accuracy.<\/p>\n\n\n\n<p>It also sets new high scores for audio translation, standardised exam answering and visual recognition. 
It lags behind Whisper-v3 for audio recognition, but as that is also an OpenAI model, it may be integrated into the API.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Availability<\/h2>\n\n\n\n<p>The new video and voice capabilities of the model have been recognised as a potential security risk, so for now they will only be rolled out to select red teams for vulnerability analysis.<\/p>\n\n\n\n<p>Text and image capabilities are being made available as part of the free and paid tiers, with plans to make voice available to ChatGPT Plus subscribers in the coming weeks.<\/p>\n\n\n\n<p>For developers, text and vision are currently available via the API, boasting a 5x higher rate limit, 2x the speed and 1\/2 the price of GPT-4. The video and audio abilities will be rolled out more gradually, starting with a small group of trusted users.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI released GPT-4o, their latest GPT model, yesterday. The &#8220;o&#8221; stands for omni, hinting at the model&#8217;s ability to process audio, image, video and text seamlessly. The demo was performed live, with the model responding in audio with a rich personality. 
Amongst the showcased talents were voice acting, singing, real time translation and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":92,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-container-style":"default","site-container-layout":"default","site-sidebar-layout":"default","disable-article-header":"default","disable-site-header":"default","disable-site-footer":"default","disable-content-area-spacing":"default","footnotes":""},"categories":[1],"tags":[],"class_list":["post-89","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/posts\/89","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=89"}],"version-history":[{"count":1,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/posts\/89\/revisions"}],"predecessor-version":[{"id":93,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/posts\/89\/revisions\/93"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=\/wp\/v2\/media\/92"}],"wp:attachment":[{"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=89"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/artificial-intelligence.news\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=89"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/
\/artificial-intelligence.news\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=89"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}