{"id":1775,"date":"2024-03-14T11:39:47","date_gmt":"2024-03-14T11:39:47","guid":{"rendered":"https:\/\/geneea.com\/news\/?p=1775"},"modified":"2026-01-27T20:47:22","modified_gmt":"2026-01-27T20:47:22","slug":"geneeas-ai-spotlight-9","status":"publish","type":"post","link":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9","title":{"rendered":"Geneea&#8217;s AI Spotlight #9"},"content":{"rendered":"\n<p>The ninth edition of our newsletter on Large Language Models is here.&nbsp;<\/p>\n\n\n\n<p>In this edition, we look at<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>the just-approved European AI Act,<\/li>\n\n\n\n<li>challenges encountered in the wild real world,<\/li>\n\n\n\n<li>the adoption of AI in newsrooms,<\/li>\n\n\n\n<li>new models and their emerging uses,<\/li>\n\n\n\n<li>model evaluation, and<\/li>\n\n\n\n<li>improved training and prompting methods.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">European regulation<\/h2>\n\n\n\n<p>The <strong>European AI Act (<\/strong><a href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-3\/#aia\">see Spotlight #3<\/a><strong>) <\/strong>was <strong>approved<\/strong> unanimously by the <a href=\"https:\/\/www.euractiv.com\/section\/artificial-intelligence\/news\/eu-countries-give-crucial-nod-to-first-of-a-kind-artificial-intelligence-law\/\">Council of the EU<\/a> (= the ministers from all member states) in February and by the <a href=\"https:\/\/www.euractiv.com\/section\/artificial-intelligence\/news\/europes-landmark-ai-act-passes-parliament-vote\/\">parliament yesterday<\/a> (85% of MEPs voted for it). 
The newly created <a href=\"https:\/\/digital-strategy.ec.europa.eu\/en\/policies\/ai-office\">European AI Office<\/a> started to hire experts and should soon publish tools, methodologies, and benchmarks for the evaluation of AI systems.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Challenges<\/h2>\n\n\n\n<p><strong>We cannot trust the LLMs<\/strong><\/p>\n\n\n\n<p>Both Google and Anthropic <a href=\"https:\/\/www.wsj.com\/articles\/google-and-anthropic-are-selling-generative-ai-to-businesses-even-as-they-address-its-shortcomings-ff90d83d\">acknowledged to WSJ<\/a> that <strong>hallucinations<\/strong> are a serious <strong>problem in the adoption<\/strong> of LLMs. According to Eli Collins, a VP at Google DeepMind: &#8220;We\u2019re not in a situation where you can just trust the model output.&#8221; But as Jared Kaplan, a co-founder of Anthropic, says, we cannot simply make the models more cautious because they would always answer: &#8220;I don\u2019t know the context.&#8221; Both Collins and Kaplan talked about the<strong> importance of<\/strong> users <strong>validating<\/strong> any LLM <strong>response<\/strong>, and said providers should make this easier by <strong>identifying<\/strong> the <strong>sources<\/strong> of any answer.&nbsp;<\/p>\n\n\n\n<p><strong>Chatbot responsibility<\/strong><\/p>\n\n\n\n<p>However, <a href=\"https:\/\/www.bbc.com\/travel\/article\/20240222-air-canada-chatbot-misinformation-what-travellers-should-know\">a court in Canada disagrees with Collins and Kaplan<\/a>, at least in the case of customer-service chatbots. Air Canada&#8217;s <strong>chatbot provided<\/strong> a passenger with <strong>incorrect<\/strong> instructions for a <strong>discount<\/strong>. It did not matter that the bot provided a link to the correct policy. The <strong>airline<\/strong> was <strong>ordered to pay<\/strong> anyway. According to the court, the airline &#8220;is <strong>responsible for<\/strong> all the <strong>information on its website<\/strong>. 
It makes no difference whether the information comes from a static page or a chatbot.&#8221;<\/p>\n\n\n\n<p><strong>Microsoft AI Bot might not be worth the money<\/strong><\/p>\n\n\n\n<p><strong>Microsoft<\/strong> is <strong>pushing AI into<\/strong> all of its <strong>products<\/strong>, even Notepad. It has been testing AI Copilot in MS Office (now called Microsoft 365): it summarizes emails, creates presentations, writes memos, and so on. Selected companies were able to test the tool, and <a href=\"https:\/\/www.wsj.com\/articles\/early-adopters-of-microsofts-ai-bot-wonder-if-its-worth-the-money-2e74e3a2\">according to the Wall Street Journal<\/a>, they are not really convinced. The <strong>employees<\/strong> were <strong>eager to test it<\/strong>, but their enthusiasm waned quickly as the tool made <strong>frequent mistakes<\/strong>. The expected price of $30\/month\/user does <strong>not<\/strong> seem <strong>worth it<\/strong>.<\/p>\n\n\n\n<p><strong>Don&#8217;t let AI do your taxes yet<\/strong><\/p>\n\n\n\n<p>TurboTax and H&amp;R Block, two major <strong>tax service<\/strong> providers in the U.S., offer their users a <strong>chatbot<\/strong>-based tax <strong>expert<\/strong>. Or at least that&#8217;s what they claim. The <strong>Washington Post warns<\/strong> its readers to <strong>avoid the chatbots<\/strong> unless they want to get into trouble. In their tests, the <a href=\"https:\/\/www.washingtonpost.com\/technology\/2024\/03\/04\/ai-taxes-turbotax-hrblock-chatbot\/\">AI &#8220;experts&#8221; were wrong<\/a> in more than 50% of answers for TurboTax and 30% for H&amp;R Block.<\/p>\n\n\n\n<p><strong>ASCII art<\/strong><\/p>\n\n\n\n<p>Even though AI still cannot help you with taxes, it can help you build a bomb.<\/p>\n\n\n\n<p>LLMs are trained on all kinds of data, so if left unchecked, they can advise on all kinds of questionable activities. 
Various <strong>techniques<\/strong>, such as data filtering and supervised fine-tuning, have been designed <strong>to prevent<\/strong> the models from doing so. However, all these methods <strong>have <\/strong>their <strong>limitations<\/strong>.&nbsp;<\/p>\n\n\n\n<p>Researchers from the University of Washington <a href=\"https:\/\/arxiv.org\/abs\/2402.11753\">managed to use ASCII art<\/a> to <strong>get past the defenses<\/strong> of GPT-3.5, GPT-4, Gemini, Claude, and Llama2.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI and news media<\/h2>\n\n\n\n<p><strong>NYT\u2019s Zach Seward on AI-powered journalism&nbsp;<\/strong><\/p>\n\n\n\n<p>Zach Seward, the new editorial director of AI initiatives at The New York Times, recently gave a <a href=\"https:\/\/www.niemanlab.org\/2024\/03\/ai-news-thats-fit-to-print-the-new-york-times-editorial-ai-director-on-the-current-state-of-ai-powered-journalism\/\">talk about AI<\/a> at the SXSW Conference in Austin, Texas. He gave concrete examples of how AI is used in journalism, both bad (such as Sports Illustrated publishing awful articles automatically written by made-up journalists; see <a href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-7\/#news&amp;media\">Spotlight #7<\/a>) and good (e.g., searching large troves of documents with embeddings, rephrasing prison policies and audits for public consumption).&nbsp;<\/p>\n\n\n\n<p><strong>Columbia Journalism Review \u2013 Report on AI in the news<\/strong><\/p>\n\n\n\n<p>Columbia Journalism Review, a magazine for journalists published by Columbia University, published a detailed report by <a href=\"https:\/\/twitter.com\/_FelixSimon_\">Felix Simon<\/a> on the use of AI in the news. From July 2021 to September 2023, he <strong>interviewed<\/strong> more than 130 news workers, including journalists, data scientists, and product managers at media outlets in the US, UK, and Germany, including The Guardian, Bayerischer Rundfunk, the Washington Post, The Sun, and the Financial Times. 
He also interviewed 36 independent American and European experts.<\/p>\n\n\n\n<p>In general, newsrooms have become <strong>more open to AI<\/strong>. The adoption is driven mostly by <strong>economic pressure<\/strong>, <strong>technology readiness<\/strong>, and <strong>hype<\/strong>.<\/p>\n\n\n\n<p>For now, AI has brought <strong>no fundamental change<\/strong> and instead only helped make some old approaches more effective. Many of the most beneficial AI applications are quite mundane (transcription, search, content categorization, etc.).<\/p>\n\n\n\n<p>There are some important challenges:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reputational risks due to the <strong>unreliability of AI output<\/strong><\/li>\n\n\n\n<li>Increasing <strong>dependence on major tech companies<\/strong> (Google, Amazon, and Microsoft), especially for smaller publishers that cannot afford in-house AI development<\/li>\n\n\n\n<li>Increasing <strong>inequality among news organizations<\/strong>, with large international publishers having an advantage<\/li>\n<\/ul>\n\n\n\n<p>You might also want to look at NiemanLab\u2019s <a href=\"https:\/\/www.niemanlab.org\/2024\/02\/ai-adoption-in-newsrooms-presents-a-familiar-power-imbalance-between-publishers-and-platforms-new-report-finds\/\">summary<\/a> or the <a href=\"https:\/\/www.cjr.org\/tow_center_reports\/artificial-intelligence-in-the-news.php\">full report<\/a>.<\/p>\n\n\n\n<p><strong>BBC Generative AI pilots<\/strong><\/p>\n\n\n\n<p>The BBC announced they are <a href=\"https:\/\/www.bbc.co.uk\/mediacentre\/articles\/2024\/update-generative-ai-and-ai-tools-bbc\">starting 12 generative AI pilots<\/a>. Unlike Air Canada or H&amp;R Block, the BBC is more <strong>cautious<\/strong>: most of the <strong>pilots<\/strong> are <strong>internal<\/strong> only, and the resulting content won\u2019t be made public. 
The pilots can be organized into <strong>three groups<\/strong>:<\/p>\n\n\n\n<p>1) <strong>Maximizing<\/strong> the <strong>value<\/strong> of <strong>existing<\/strong> content, for example, translating or reformatting it, such as writing an article based on a live sports radio commentary.<\/p>\n\n\n\n<p>2) <strong>New audience experiences<\/strong>, such as a chatbot providing personalized learning to students.<\/p>\n\n\n\n<p>3) <strong>Simplifying and speeding up <\/strong>processes, for example, suggesting headlines and summaries.<\/p>\n\n\n\n<p><strong>AI at Ippen Digital<\/strong><\/p>\n\n\n\n<p><a href=\"https:\/\/www.linkedin.com\/in\/nikitaroy\/\">Nikita Roy<\/a> from <a href=\"https:\/\/www.newsroomrobots.com\/p\/how-germanys-ippen-digital-is-fine\">Newsroom Robots interviewed<\/a> <a href=\"https:\/\/www.linkedin.com\/in\/alessandro-alviani\/\">Alessandro Alviani<\/a>, the product lead for AI at Ippen Digital, a part of the German Ippen Media Group. In the second part of the interview, they discuss <strong>fine-tuning<\/strong> language models <strong>on<\/strong> the corpus of Ippen\u2019s <strong>local<\/strong> German <strong>news<\/strong> content so that they can <strong>assist journalists<\/strong> with writing headlines, lead paragraphs, summaries, etc. Another interesting topic they discuss is the role of AI in the context of <strong>modular journalism<\/strong>. It can break a <strong>story into modules<\/strong> that can be selected and mixed to create <strong>personalized content<\/strong>.<\/p>\n\n\n\n<p><strong>Training Data &amp; Intellectual Property<\/strong><\/p>\n\n\n\n<p>News publishers take different positions regarding their articles being used by AI companies to train their models. 
While <a href=\"https:\/\/www.theverge.com\/2023\/12\/27\/24016212\/new-york-times-openai-microsoft-lawsuit-copyright-infringement\">The New York Times<\/a> has <strong>sued<\/strong> OpenAI and Microsoft (<a href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-8\/#Business\">see Spotlight #8<\/a>), <a href=\"https:\/\/www.theguardian.com\/media\/2024\/feb\/08\/news-corp-in-advanced-negotiations-with-ai-companies-over-access-to-content-ceo-says\">News Corp<\/a> (the parent of Dow Jones, The Times, and The Sun) is <strong>negotiating<\/strong>,<strong> <\/strong>and <a href=\"https:\/\/www.axelspringer.com\/en\/ax-press-release\/axel-springer-and-openai-partner-to-deepen-beneficial-use-of-ai-in-journalism\">Axel Springer<\/a> has already closed a <strong>partnership<\/strong> with OpenAI.&nbsp;<\/p>\n\n\n\n<p>In the meantime, more and more newspapers are <strong>blocking AI crawlers<\/strong>. <a href=\"https:\/\/reutersinstitute.politics.ox.ac.uk\/how-many-news-websites-block-ai-crawlers\">According to the Reuters Institute<\/a>, 48% of 150 top news sites across 10 countries block OpenAI. Compare that with 33% of the top 1000 websites, as <a href=\"https:\/\/originality.ai\/ai-bot-blocking\">reported by <\/a><a href=\"http:\/\/originality.ai\/\">Originality.ai<\/a> (<a href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-5\/#trainingdata\">see also Spotlight #5<\/a>). Fewer websites block Common Crawl (18% of the top websites), Google AI crawler (24% of the top news sites and 10% of the top websites), and Anthropic (4% of the top websites).<\/p>\n\n\n\n<p><strong>In short<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The web is now <a href=\"https:\/\/www.vice.com\/en\/article\/y3w4gw\/a-shocking-amount-of-the-web-is-already-ai-translated-trash-scientists-determine?\">flooded with low-quality<\/a> <strong>translated<\/strong> and generated clickbait <strong>articles<\/strong>. 
However, <a href=\"https:\/\/searchengineland.com\/human-vs-ai-generated-content-survey-437062\">a survey found<\/a> that AI-generated text may be <strong>good<\/strong> enough for <strong>marketing copy<\/strong>, and people liked how quickly it got to the point.&nbsp;<\/li>\n\n\n\n<li>Newsquest uses a few <a href=\"https:\/\/www.theguardian.com\/technology\/2023\/dec\/28\/how-one-of-the-worlds-oldest-newspapers-is-using-ai-to-reinvent-journalism\">AI-assisted reporters<\/a> to report on \u201cmundane but necessary\u201d content, <strong>freeing reporters<\/strong> to go into the field. That&#8217;s something we know well from our <a href=\"https:\/\/geneea.com\/case-studies\/ctk\">collaboration with the Czech News Agency<\/a>. Hopefully, this approach can help with the ongoing <a href=\"https:\/\/europeanjournalists.org\/blog\/2024\/02\/28\/news-deserts-on-the-rise-a-first-comparative-study-indicates-the-fragile-situation-for-local-media-across-the-eu\/\">local news problems<\/a>.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">New models and assistants<\/h2>\n\n\n\n<p>The <strong>competition<\/strong> among AI leaders <strong>intensifies<\/strong>. Google <a href=\"https:\/\/techcrunch.com\/2024\/02\/15\/googles-new-gemini-model-can-analyze-an-hour-long-video-but-few-people-can-use-it\/\">unveils Gemini 1.5<\/a> with a huge 1-million-token context window and Gemini Advanced. Anthropic <a href=\"https:\/\/tech.co\/news\/chatgpt-vs-claude-3\">debuts Claude 3<\/a>, Mistral <a href=\"https:\/\/techcrunch.com\/2024\/02\/26\/mistral-ai-releases-new-model-to-rival-gpt-4-and-its-own-chat-assistant\/\">introduces Mistral Large<\/a>, and Inflection AI <a href=\"https:\/\/www.marktechpost.com\/2024\/03\/09\/inflection-ai-presents-inflection-2-5-an-upgraded-ai-model-that-is-competitive-with-all-the-worlds-leading-llms-like-gpt-4-and-gemini\/\">launches Inflection-2.5<\/a>. All are said to be <strong>GPT-4 class models<\/strong>. 
OpenAI counters this surge of models with its <strong>text-to-video<\/strong> <a href=\"https:\/\/techcrunch.com\/2024\/02\/15\/openais-newest-model-can-generate-videos-and-they-look-decent\/\">Sora model<\/a>, overshadowing Google&#8217;s text-to-video <a href=\"https:\/\/pub.towardsai.net\/lumiere-googles-amazing-video-breakthrough-158587d68dfe\">Lumiere model<\/a> and the other LLM releases. On the other hand, Gemini Advanced attracts unwanted attention for <a href=\"https:\/\/www.theverge.com\/2024\/2\/22\/24079876\/google-gemini-ai-photos-people-pause\">generating overly inclusive images<\/a>.<\/p>\n\n\n\n<p>We recommend reading <a href=\"https:\/\/www.linkedin.com\/in\/emollick\/\">Ethan Mollick<\/a>&#8216;s <a href=\"https:\/\/www.oneusefulthing.org\/p\/google-gemini-advanced-tasting-notes\">post about Gemini Advanced<\/a>. He dismisses the value of standard benchmarks and offers a high-level comparison with GPT-4. He also feels the model exhibits so-called &#8220;sparks&#8221; or &#8220;ghosts&#8221; of general intelligence, such as when he played a Dungeons-and-Dragons-style game with it.<\/p>\n\n\n\n<p><strong>Search and assistants<\/strong><\/p>\n\n\n\n<p>A popular use for LLMs is <strong>AI-powered multimodal search<\/strong>, as seen with platforms like <a href=\"http:\/\/you.com\/\">You.com<\/a>, <a href=\"http:\/\/perplexity.ai\/\">Perplexity.ai<\/a>, and the AI <a href=\"https:\/\/www.theverge.com\/2024\/1\/28\/24053882\/arc-search-browser-web-app-ios\">browser Arc<\/a>. Google responds to this by <a href=\"https:\/\/nymag.com\/intelligencer\/2024\/01\/new-ai-powered-google-chrome-browser-end-of-human-internet.html\">integrating <\/a>an LLM <strong>writing assistant into the Chrome <\/strong>browser. 
Meanwhile, OpenAI is <a href=\"https:\/\/www.reuters.com\/technology\/openai-developing-software-that-operates-devices-automates-tasks-information-2024-02-07\/\">developing an AI agent<\/a> to <strong>automate tasks<\/strong> on user devices, while HuggingFace is <a href=\"https:\/\/venturebeat.com\/ai\/hugging-face-launches-open-source-ai-assistant-maker-to-rival-openais-custom-gpts\/\">challenging OpenAI&#8217;s <\/a>GPTs with Assistants, an open-source alternative for creating <strong>customized chatbots<\/strong>. Microsoft aims higher, <a href=\"https:\/\/arxiv.org\/pdf\/2402.05929.pdf\">constructing an Agent Foundation Model<\/a> encompassing language, image\/video encoders, and an <strong>action encoder<\/strong> trained on robotics and games.<\/p>\n\n\n\n<p>Following OpenAI&#8217;s example, <a href=\"https:\/\/www.theverge.com\/2024\/2\/26\/24083510\/microsoft-mistral-partnership-deal-azure-ai\">Mistral partners with Microsoft<\/a> to launch its <strong>Le Chat<\/strong> chatbot. Nvidia <a href=\"https:\/\/www.computerworld.com\/article\/3712921\/nvidia-unveils-chat-with-rtx-a-personal-ai-chatbot-for-windows.html\">releases<\/a> a <strong>locally run chatbot<\/strong>, Chat with RTX, free from privacy concerns.&nbsp;<\/p>\n\n\n\n<p>Amazon <a href=\"https:\/\/medium.com\/ai-frontier-x\/amazon-launches-rufus-genai-shopping-a901dbc3633b\">debuts its shopping assistant<\/a> <strong>Rufus<\/strong>. 
It will be interesting to see how it handles the <a href=\"https:\/\/arstechnica.com\/ai\/2024\/01\/lazy-use-of-ai-leads-to-amazon-products-called-i-cannot-fulfill-that-request\/\">faulty AI-generated product descriptions<\/a>.&nbsp;<\/p>\n\n\n\n<p><strong>Chips<\/strong><\/p>\n\n\n\n<p>Meta <a href=\"https:\/\/www.reuters.com\/technology\/meta-deploy-in-house-custom-chips-this-year-power-ai-drive-memo-2024-02-01\/\">purchases 350,000 H100 graphics cards<\/a> while developing its <strong>own \u201cArtemis\u201d chip<\/strong>, and OpenAI has ambitions to <a href=\"https:\/\/beebom.com\/openai-sam-altman-raising-money-ai-chip-factories\/\">open its own chip<\/a> <strong>factories<\/strong>. Meanwhile, <strong>Groq <\/strong><a href=\"https:\/\/blog.cubed.run\/groq-inference-engine-18x-faster-than-gpus-25a4319b6984\">unveils Language Processing Units<\/a><strong> <\/strong>that enable Mixtral to <a href=\"https:\/\/analyticsindiamag.com\/groqs-lpu-demonstrates-remarkable-speed-running-mixtral-at-nearly-500-tok-s\/\">operate at a speed<\/a> of <strong>500 tokens per second<\/strong>.<\/p>\n\n\n\n<p><strong>Open source<\/strong><\/p>\n\n\n\n<p>Significant developments also occurred in the open-source domain, as the Allen Institute for Artificial Intelligence <a href=\"https:\/\/arxiv.org\/pdf\/2402.00838.pdf\">introduced OLMo<\/a>, <strong>a genuine open-source model<\/strong>. The release includes it all: weights, inference code, training data, and evaluation code. Google <a href=\"https:\/\/blogs.vreamer.space\/googles-gemma-open-models-a-paradigm-shift-in-the-ai-landscape-c73d55b8d0a2\">contributed the Gemma<\/a> family of models, while the RWKV Project <a href=\"https:\/\/blog.rwkv.com\/p\/eagle-7b-soaring-past-transformers\">released the Eagle 7B<\/a>,<strong> <\/strong>an <strong>attention-free model<\/strong>. 
<a href=\"http:\/\/abacus.ai\/\">Abacus.ai<\/a> <a href=\"https:\/\/medium.com\/@ignacio.de.gregorio.noblejas\/behold-the-power-of-smaug-the-new-open-source-king-6987e6da2606\">launched Smaug<\/a>, holding the <strong>top spot<\/strong> on HugginFace&#8217;s Open LLM Leaderboard.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Model evaluation<\/h2>\n\n\n\n<p><strong>Testing LLM can be fun<\/strong><\/p>\n\n\n\n<p>Every AI developer would tell you how <strong>important evaluation<\/strong> is, but typically it is quite boring. Sometimes, however, the results are <strong>unexpected<\/strong>\u2026<\/p>\n\n\n\n<p>A common approach to testing LLM is the so-called <strong>needle-in-the-haystack<\/strong> evaluation: You put a very specific statement (the needle) into a random context (the haystack) and ask a question that can only be answered using the information in the needle.&nbsp;<\/p>\n\n\n\n<p>When Anthropic researchers <a href=\"https:\/\/www.anthropic.com\/news\/claude-3-family\">tested Claude 3 Opus<\/a> using a needle about pizza toppings, it answered correctly, but it also added: &#8220;This <strong>sentence<\/strong> seems very <strong>out of place<\/strong> and unrelated to the rest [..]. I suspect this pizza topping &#8220;fact&#8221; may have been <strong>inserted as a joke or<\/strong> to <strong>test if I was paying attention<\/strong>.&#8221; (see <a href=\"https:\/\/twitter.com\/alexalbert__\/status\/1764722513014329620\">Alex Albert&#8217;s tweet<\/a>).&nbsp;<\/p>\n\n\n\n<p>When a model detects it is being tested, we have a problem. But it&#8217;s not the only challenge we face. Additional issues include inadequately annotated benchmarks (<a href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-7\/#gemini\">see Spotlight #7<\/a>), and <a href=\"https:\/\/ehudreiter.com\/2024\/03\/12\/data-contamination-worries\/\">contamination during LLM training<\/a>. 
For a deeper analysis, check out <a href=\"https:\/\/www.linkedin.com\/in\/rkris\/\">Rohit Krishnan<\/a>&#8216;s <a href=\"https:\/\/www.strangeloopcanon.com\/p\/evaluations-are-all-we-need\">comprehensive post<\/a>. We do need to design better evaluations.\u00a0<\/p>\n\n\n\n<p><strong>Evaluation and leaderboards<\/strong><\/p>\n\n\n\n<p>HuggingFace introduced several <strong>new leaderboards<\/strong>, including the <a href=\"https:\/\/huggingface.co\/blog\/leaderboard-patronus\">Enterprise Scenarios Leaderboard<\/a> <strong>for real-world applications<\/strong>, the <a href=\"https:\/\/huggingface.co\/blog\/leaderboard-decodingtrust\">LLM Safety Leaderboard<\/a>, and the <a href=\"https:\/\/huggingface.co\/blog\/leaderboard-hallucinations\">Hallucinations Leaderboard<\/a>. To <strong>battle contamination,<\/strong> they released a dynamic benchmark, the <a href=\"https:\/\/huggingface.co\/blog\/leaderboard-nphardeval\">NPHardEval Leaderboard<\/a>.<\/p>\n\n\n\n<p>UC Berkeley <a href=\"https:\/\/gorilla.cs.berkeley.edu\/leaderboard.html\">unveiled a Function-Calling Leaderboard<\/a>, together with their Gorilla OpenFunctions-v2 model. It is particularly useful given the <strong>growing integration of function calls<\/strong> into LLMs. Finally, there is an <a href=\"https:\/\/arxiv.org\/pdf\/2401.13178.pdf\">AgentBoard benchmark<\/a> for LLM <strong>agents<\/strong> that isn&#8217;t only about the final success rate but also evaluates the <strong>intermediate steps<\/strong>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">New training methods<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Stanford University researchers have <strong>simplified the alignment of LLMs<\/strong> through Reinforcement Learning with Human Feedback (RLHF) using <a href=\"https:\/\/arxiv.org\/abs\/2305.18290\">Direct Preference Optimization<\/a> (DPO). The method employs a simple classification loss. 
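That classification loss is a logistic loss over log-probability ratios between the trained policy and a frozen reference model. A minimal numeric sketch, with made-up illustrative log-probabilities (a real implementation works on batched tensors from both models):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # DPO loss: -log(sigmoid(beta * margin)), where the margin measures how
    # much the policy prefers the chosen answer over the rejected one,
    # relative to the reference model. All inputs are log-probabilities.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Illustrative numbers: a policy that leans toward the chosen answer
# gets a lower loss than one that leans toward the rejected answer.
loss_good = dpo_loss(pi_chosen=-5.0, pi_rejected=-9.0,
                     ref_chosen=-6.0, ref_rejected=-6.0)
loss_bad = dpo_loss(pi_chosen=-9.0, pi_rejected=-5.0,
                    ref_chosen=-6.0, ref_rejected=-6.0)
assert loss_good < loss_bad
```

Minimizing this loss directly updates the policy, which is why no separate reward model or RL loop is needed.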
This enhances training stability and <strong>reduces<\/strong> the need for extensive <strong>hyperparameter tuning<\/strong> while <strong>improving<\/strong> summarization <strong>quality<\/strong> and sentiment control, and enhancing the quality of single-turn dialogue. Traditionally, Proximal Policy Optimization (PPO) uses human-labeled answer preferences to define a preference loss, trains a reward model on it, and then uses it to train a policy. In contrast, <strong>DPO<\/strong> <strong>skips<\/strong> the <strong>reinforcement learning loop<\/strong> and explicit reward fitting by <strong>transforming<\/strong> the <strong>reward loss <\/strong>function <strong>into a policy loss function<\/strong>. The DPO policy network then represents both the language model and implicit reward.<\/li>\n\n\n\n<li>At the University of California, researchers <strong>surpassed DPO<\/strong> with <a href=\"https:\/\/arxiv.org\/abs\/2401.01335\">Self-Play Fine-Tuning<\/a> (SPIN) across various benchmarks and <strong>reduced the need for human preference data<\/strong> or advanced LLM feedback. SPIN operates similarly to Generative Adversarial Networks: the trained <strong>model<\/strong> (generator) <strong>generates responses<\/strong>, while its previous version (discriminator) <strong>distinguishes between LLM and human<\/strong> responses, and training pushes the generator to <strong>create less distinguishable responses<\/strong>. This iterative process yields comparable results to DPO in the initial steps and surpasses it with additional iterations, albeit with diminishing improvements. They also showed that the global optimum in training is reached when the LLM policy aligns with the target data distribution.<\/li>\n\n\n\n<li>DeepMind&#8217;s <a href=\"https:\/\/arxiv.org\/pdf\/2312.06585.pdf\"><strong>ReSTEM<\/strong> <strong>self-training<\/strong><\/a><strong> <\/strong>method <strong>reduces reliance on human<\/strong>-labeled data. 
The model <strong>generates multiple output<\/strong> samples (solutions) for each input; those are <strong>filtered<\/strong> with a binary reward function to <strong>create new training data<\/strong> on which the model is iteratively trained. The researchers demonstrated <strong>favorable scaling for larger<\/strong> (PaLM-2) <strong>models<\/strong>, contrasting with Alibaba DAMO Academy&#8217;s observations of <a href=\"https:\/\/arxiv.org\/abs\/2308.01825\">diminishing returns for larger models<\/a> with increased training data. The use of synthetic data likely contributes to significant performance gains and enhanced results on related held-out benchmarks.<\/li>\n\n\n\n<li>Similarly, Microsoft showed the <strong>usefulness of synthetic data<\/strong> by fine-tuning the <a href=\"https:\/\/arxiv.org\/pdf\/2401.00368.pdf\">Mistral 7B text embedding model<\/a>, <strong>achieving state-of-the-art<\/strong> results in <strong>under 1k training steps<\/strong>. They showed that Mistral&#8217;s pre-training produced robust text representations, requiring minimal fine-tuning of the embedding model. Using GPT-4, they generated a <strong>diverse dataset<\/strong> of 500k examples with 150k unique instructions, covering <strong>symmetric<\/strong> (pairs with similar semantic meanings but different surface forms) and <strong>asymmetric tasks<\/strong> (semantically related pairs that are not paraphrases of each other, such as a query and a matching passage). 
Then they employed standard contrastive loss (bringing similar examples closer together) and a mixture of synthetic and labeled data.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Improvements without training<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In a collaborative study on the <strong>cost-effectiveness of post-training methods<\/strong>, researchers introduced the <a href=\"https:\/\/arxiv.org\/pdf\/2312.07413.pdf\"><strong>Compute-Equivalent Gain<\/strong><\/a> (CEG) metric to assess <strong>additional training<\/strong> needs for models <strong>to match post-training<\/strong> results. Overall, post-training has proven highly beneficial, with <strong>tool-based<\/strong> methods offering substantial gains at minimal added cost (tens of CEG). <strong>Prompting<\/strong> and <strong>solution selection<\/strong> methods (best of n) are similarly effective but introduce extra inference costs. <strong>Fine-tuning<\/strong> shows a diverse range of usefulness from no gain to thousands of CEG, and evaluating <strong>scaffolding enhancements,<\/strong> like <a href=\"https:\/\/arxiv.org\/abs\/2305.10601\">Tree-of-Thought<\/a> prompting and agents, proved too challenging.<\/li>\n\n\n\n<li>Researchers from the Mohamed bin Zayed University of AI <a href=\"https:\/\/arxiv.org\/abs\/2312.16171\">evaluated 26 prompt engineering principles<\/a> across GPT and LLaMA models. <strong>Larger<\/strong> scale <strong>models<\/strong> generally <strong>benefit more<\/strong> from these principles in terms of <strong>correctness<\/strong>. Especially <strong>useful<\/strong> principles include instructing the model to <strong>request more details<\/strong>, providing a <strong>style sample<\/strong> or specifying the <strong>intended audience,<\/strong> and using <strong>output priming<\/strong>. 
Tipping and using delimiters offer lesser advantages, but they matter more in longer and more complex prompts, according to Singapore&#8217;s <a href=\"https:\/\/towardsdatascience.com\/how-i-won-singapores-gpt-4-prompt-engineering-competition-34c195a93d41\">GPT-4 prompting competition<\/a> winner <a href=\"https:\/\/www.linkedin.com\/in\/sheila-teo\/\">Sheila Teo<\/a>. She also recommends <strong>segmenting complex tasks<\/strong> into smaller steps &#8211; an approach the study also found to enhance correctness.<\/li>\n\n\n\n<li>Similar observations apply to<strong> Chain-of-Thought<\/strong> prompting, as shown in a paper exploring the <a href=\"https:\/\/arxiv.org\/pdf\/2401.04925.pdf\">impact of reasoning length<\/a>. <strong>More steps<\/strong> generally (adjusted for task complexity) <strong>enhance correctness<\/strong>, while shortened reasoning decreases it, despite identical input information. Intriguingly, longer reasoning tends to yield correct outcomes <strong>even with minor mistakes<\/strong>. Useful additional steps may involve adding context, summarizing previous reasoning, or self-verification.<\/li>\n\n\n\n<li>As prompts lengthen, Microsoft researchers develop the <a href=\"https:\/\/arxiv.org\/pdf\/2310.05736.pdf\">LLMLingua technique<\/a> to achieve up to <strong>20x prompt compression<\/strong> with minimal performance loss. This builds on the <a href=\"https:\/\/arxiv.org\/pdf\/2304.12102.pdf\">Selective-Context algorithm<\/a>, which uses a <strong>smaller model<\/strong> to <strong>remove<\/strong> low-perplexity <strong>tokens with a minor impact<\/strong> on LLM comprehension. To better <strong>preserve token inter-dependencies<\/strong>, they introduce a budget controller that allocates varying compression ratios for different prompt parts, with a token-level iterative algorithm for fine-grained compression. 
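The core Selective-Context idea &#8211; keep the tokens a small model finds most surprising and drop the predictable ones &#8211; can be sketched as follows. The surprisal scores here are made up for illustration; a real implementation computes them as -log p(token | prefix) under a small language model and adds LLMLingua's budgeting on top.

```python
def compress(tokens, surprisal, keep_ratio=0.5):
    # Keep the tokens with the highest self-information (surprisal);
    # low-surprisal tokens are predictable and carry little extra meaning.
    n_keep = max(1, int(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: surprisal[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore original token order
    return [tokens[i] for i in kept]

tokens = ['Please', 'note', 'that', 'the', 'meeting', 'is', 'at', 'noon']
# Made-up surprisal scores; function words score low, content words high.
surprisal = [1.2, 1.0, 0.3, 0.2, 4.5, 0.4, 1.1, 5.0]
print(compress(tokens, surprisal, keep_ratio=0.5))
# -> ['Please', 'meeting', 'at', 'noon']
```

The compressed prompt preserves the informative words while halving the token count, which is the trade-off LLMLingua then tunes per prompt section.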
They also address the discrepancy between the two models through an instruction-tuning-based method <strong>aligning<\/strong> their <strong>token distributions<\/strong>.<\/li>\n\n\n\n<li>Meta researchers <strong>enhanced<\/strong> LLaMA2 70B&#8217;s <strong>factuality<\/strong> and <strong>objectivity<\/strong> using the <a href=\"https:\/\/arxiv.org\/pdf\/2311.11829.pdf\">System 2 Attention prompt<\/a>. This prompt regenerates key parts of the input, <strong>excluding<\/strong> <strong>user biases and opinions<\/strong>, and then responds to the debiased prompt.<\/li>\n<\/ul>\n\n\n\n<p>Please <a href=\"https:\/\/www.linkedin.com\/newsletters\/geneea-s-ai-spotlight-7064632112443727873\/\">subscribe<\/a> and stay tuned for the next issue of Geneea\u2019s AI Spotlight newsletter!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The ninth edition of our newsletter on Large Language Models is here. <\/p>\n<p>In this edition, we look at<br \/>\n\u25aa the just-approved European AI Act,<br \/>\n\u25aa challenges encountered in the wild real world,<br \/>\n\u25aa the adoption of AI in newsrooms,<br \/>\n\u25aa new models and their emerging uses,<br \/>\n\u25aa model evaluation, and<br \/>\n\u25aa improved training and prompting methods.<\/p>\n","protected":false},"author":15,"featured_media":1776,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[378,374],"tags":[244,240,242],"class_list":["post-1775","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-large-language-models","category-newsletter","tag-ai","tag-generativeai","tag-newsletter"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Geneea&#039;s AI Spotlight #9 - Geneea News<\/title>\n<meta name=\"description\" content=\"LLM newsletter #9: European AI Act, adopting 
AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting methods.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Geneea&#039;s AI Spotlight #9 - Geneea News\" \/>\n<meta property=\"og:description\" content=\"LLM newsletter #9: European AI Act, adopting AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting methods.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9\" \/>\n<meta property=\"og:site_name\" content=\"Geneea News\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-14T11:39:47+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-01-27T20:47:22+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web-1024x575.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"575\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Marcela Soukupova\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marcela Soukupova\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9\"},\"author\":{\"name\":\"Marcela Soukupova\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#\\\/schema\\\/person\\\/69c8751a4c026723f4bac2e892f52cd8\"},\"headline\":\"Geneea&#8217;s AI Spotlight #9\",\"datePublished\":\"2024-03-14T11:39:47+00:00\",\"dateModified\":\"2026-01-27T20:47:22+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9\"},\"wordCount\":2663,\"publisher\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/Newsletter-9-robot-na-web.png\",\"keywords\":[\"AI\",\"generativeAI\",\"newsletter\"],\"articleSection\":[\"Large language models\",\"Newsletter\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9\",\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9\",\"name\":\"Geneea's AI Spotlight #9 - Geneea News\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/Newsletter-9-robot-na-web.png\",\"datePublished\":\"2024-03-14T11:39:47+00:00\",\"dateModified\":\"2026-01-27T20:47:22+00:00\",\"description\":\"LLM newsletter #9: European 
AI Act, adopting AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting methods.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#primaryimage\",\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/Newsletter-9-robot-na-web.png\",\"contentUrl\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/Newsletter-9-robot-na-web.png\",\"width\":2561,\"height\":1439},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/geneeas-ai-spotlight-9#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/geneea.com\\\/news\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Geneea&#8217;s AI Spotlight #9\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#website\",\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/\",\"name\":\"Geneea News\",\"description\":\"Learn more about what&#039;s happening at Geneea: new NLP features, newest case studies, tutoring projects, conferences we attended, etc.\",\"publisher\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/geneea.com\\\/news\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#organization\",\"name\":\"Geneea 
News\",\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/cropped-geneea-logo-50pc.png\",\"contentUrl\":\"https:\\\/\\\/geneea.com\\\/news\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/cropped-geneea-logo-50pc.png\",\"width\":242,\"height\":64,\"caption\":\"Geneea News\"},\"image\":{\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/geneea.com\\\/news\\\/#\\\/schema\\\/person\\\/69c8751a4c026723f4bac2e892f52cd8\",\"name\":\"Marcela Soukupova\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g\",\"caption\":\"Marcela Soukupova\"},\"sameAs\":[\"http:\\\/\\\/Marcela%20Soukupova\"],\"url\":\"https:\\\/\\\/geneea.com\\\/news\\\/author\\\/marcela-soukupova\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Geneea's AI Spotlight #9 - Geneea News","description":"LLM newsletter #9: European AI Act, adopting AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting methods.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9","og_locale":"en_US","og_type":"article","og_title":"Geneea's AI Spotlight #9 - Geneea News","og_description":"LLM newsletter #9: European AI Act, adopting AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting methods.","og_url":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9","og_site_name":"Geneea News","article_published_time":"2024-03-14T11:39:47+00:00","article_modified_time":"2026-01-27T20:47:22+00:00","og_image":[{"width":1024,"height":575,"url":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web-1024x575.png","type":"image\/png"}],"author":"Marcela Soukupova","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Marcela Soukupova","Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#article","isPartOf":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9"},"author":{"name":"Marcela Soukupova","@id":"https:\/\/geneea.com\/news\/#\/schema\/person\/69c8751a4c026723f4bac2e892f52cd8"},"headline":"Geneea&#8217;s AI Spotlight #9","datePublished":"2024-03-14T11:39:47+00:00","dateModified":"2026-01-27T20:47:22+00:00","mainEntityOfPage":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9"},"wordCount":2663,"publisher":{"@id":"https:\/\/geneea.com\/news\/#organization"},"image":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#primaryimage"},"thumbnailUrl":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web.png","keywords":["AI","generativeAI","newsletter"],"articleSection":["Large language models","Newsletter"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9","url":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9","name":"Geneea's AI Spotlight #9 - Geneea News","isPartOf":{"@id":"https:\/\/geneea.com\/news\/#website"},"primaryImageOfPage":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#primaryimage"},"image":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#primaryimage"},"thumbnailUrl":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web.png","datePublished":"2024-03-14T11:39:47+00:00","dateModified":"2026-01-27T20:47:22+00:00","description":"LLM newsletter #9: European AI Act, adopting AI in newsrooms, new models and their emerging uses, model evaluation, training and prompting 
methods.","breadcrumb":{"@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#primaryimage","url":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web.png","contentUrl":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2024\/03\/Newsletter-9-robot-na-web.png","width":2561,"height":1439},{"@type":"BreadcrumbList","@id":"https:\/\/geneea.com\/news\/geneeas-ai-spotlight-9#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/geneea.com\/news"},{"@type":"ListItem","position":2,"name":"Geneea&#8217;s AI Spotlight #9"}]},{"@type":"WebSite","@id":"https:\/\/geneea.com\/news\/#website","url":"https:\/\/geneea.com\/news\/","name":"Geneea News","description":"Learn more about what&#039;s happening at Geneea: new NLP features, newest case studies, tutoring projects, conferences we attended, etc.","publisher":{"@id":"https:\/\/geneea.com\/news\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/geneea.com\/news\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/geneea.com\/news\/#organization","name":"Geneea News","url":"https:\/\/geneea.com\/news\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/geneea.com\/news\/#\/schema\/logo\/image\/","url":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2022\/02\/cropped-geneea-logo-50pc.png","contentUrl":"https:\/\/geneea.com\/news\/wp-content\/uploads\/2022\/02\/cropped-geneea-logo-50pc.png","width":242,"height":64,"caption":"Geneea 
News"},"image":{"@id":"https:\/\/geneea.com\/news\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/geneea.com\/news\/#\/schema\/person\/69c8751a4c026723f4bac2e892f52cd8","name":"Marcela Soukupova","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/44f35824640c6a5b31bfef2f478d704874dc3d81bfad511c158ab12274072e16?s=96&d=mm&r=g","caption":"Marcela Soukupova"},"sameAs":["http:\/\/Marcela%20Soukupova"],"url":"https:\/\/geneea.com\/news\/author\/marcela-soukupova"}]}},"_links":{"self":[{"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/posts\/1775","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/comments?post=1775"}],"version-history":[{"count":2,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/posts\/1775\/revisions"}],"predecessor-version":[{"id":1778,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/posts\/1775\/revisions\/1778"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/media\/1776"}],"wp:attachment":[{"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/media?parent=1775"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/categories?post=1775"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/geneea.com\/news\/wp-json\/wp\/v2\/tags?post=1775"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":
true}]}}