Geneea News

RAG over Large Archives

Posted on 2026-04-30, updated 2026-05-01

Last week at the FIBEP Tech Day conference, our Jiri Hana spoke about the practical challenges of building RAG systems for news publishers and media monitoring companies.

Jiri Hana at FIBEP 2026 Tech Day

RAG for news monitoring is a different beast than the typical tutorial examples. We deal with datasets of hundreds of millions or even billions of articles.

A few things we’ve learned at Geneea:

  • Smart Filtering: Using filters over metadata and named entities to narrow down the dataset improves quality tremendously. But users rarely do this manually, so we derive the filters automatically.
  • Accuracy over prose: For power users, correctness and clear source attribution are far more important than how nicely the answer is written.
  • Attribution: It’s not just about what was said, but who said it — distinguishing between the journalist, the publication, and a quoted politician.
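As a rough illustration of the filtering idea in the first point, here is a minimal Python sketch that derives entity and year filters from the query text and narrows the candidate set before any retrieval or ranking runs. It is not Geneea's implementation: the `Article` and `Filters` types, the `derive_filters` and `apply_filters` functions, and the toy `KNOWN_ENTITIES` gazetteer are all hypothetical names invented for this example, and a real system would use a proper entity linker and date parser instead of regular expressions.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Article:
    id: str
    source: str
    published: date
    entities: Set[str]  # named entities attached at indexing time
    text: str


@dataclass
class Filters:
    entities: Set[str] = field(default_factory=set)
    year: Optional[int] = None


# Toy gazetteer standing in for a real entity linker (hypothetical data).
KNOWN_ENTITIES = {
    "european central bank": "ECB",
    "ecb": "ECB",
    "teide": "Teide",
}


def derive_filters(query: str) -> Filters:
    """Derive metadata and entity filters from the query text itself."""
    q = query.lower()
    entities = {
        canonical
        for alias, canonical in KNOWN_ENTITIES.items()
        if re.search(rf"\b{re.escape(alias)}\b", q)
    }
    year_match = re.search(r"\b(?:19|20)\d{2}\b", q)
    year = int(year_match.group(0)) if year_match else None
    return Filters(entities=entities, year=year)


def apply_filters(articles: List[Article], filters: Filters) -> List[Article]:
    """Keep only the articles that satisfy every derived filter."""
    return [
        a
        for a in articles
        if filters.entities <= a.entities
        and (filters.year is None or a.published.year == filters.year)
    ]


if __name__ == "__main__":
    archive = [
        Article("a1", "Example Daily", date(2025, 3, 6), {"ECB"}, "ECB cuts rates."),
        Article("a2", "Example Daily", date(2023, 7, 27), {"ECB"}, "ECB raises rates."),
        Article("a3", "Example Travel", date(2025, 5, 1), {"Teide"}, "Hiking Teide."),
    ]
    filters = derive_filters("What did the ECB say about rates in 2025?")
    print([a.id for a in apply_filters(archive, filters)])
```

In this sketch the user never writes a filter by hand: the query alone yields an entity constraint and a year constraint, and only the surviving articles would be passed on to the (much more expensive) retrieval and answer-generation steps.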

For those interested in the technical details, here are the slides. They are only a summary, though; the most interesting parts were the live discussions.

2026-04 FIBEP RAG (download)

Hiking in Tenerife

Tenerife is also a great place for hiking: one can climb Teide (the highest mountain in Spain), explore the moon-like landscape around it, walk through the rainforest in the north, or visit Masca, the “Machu Picchu” of Tenerife, in the northwest.

Tenerife: Masca
Teide
Tenerife: Mirador Los Catalanes

©2022 Geneea