AI Tools for Digital Libraries: Enhancing User Experience and Trust
Petr Žabička1, Jan Rychtář2, Martin Lhoták3, Filip Jebavý1, Filip Kersch3
1Moravian Library in Brno, Czech Republic; 2Trinera s.r.o., Czech Republic; 3Library of the Academy of Sciences, Czech Republic
Abstract
As generative AI and semantic technologies mature, digital libraries face growing expectations to provide intuitive LLM-based features alongside traditional search tools. This presentation explores how large language models (LLMs), multimodal AI, and semantic search can transform user-facing services in digital libraries — and the crucial design decisions that determine their usefulness, reliability, and trustworthiness.
Czech Digital Library
In the Czech Republic, most digital libraries are based on the open-source Kramerius digital library system. A wide array of digital libraries exists, established by large libraries directly under the Ministry of Culture, such as the National Library and the Moravian Library, as well as by specialized libraries, university libraries, regional libraries, and even some smaller institutions. With nearly 50 separate installations, the landscape remains fragmented. The Czech Digital Library was conceived as a common index and user-centric front-end, with the aim of providing a single point of entry for users. Currently, the Czech Digital Library provides access to 350,000 documents comprising 150 million pages (82% of which are copyright protected). Since 2019, it has also served as the official national aggregator for modern library documents, forwarding data to the Europeana Digital Library.
Using LLMs in Czech Digital Library
For over a year, the Czech Digital Library has enhanced its public-domain content by integrating external AI services for translation, page summaries, and text-to-speech. These features work for both scanned documents (JPG/JPEG2000) and born-digital PDFs, allowing users to translate content into more than ten languages, listen to pages, or quickly review short summaries. Although limited in scope, these tools help users navigate materials in unfamiliar languages or access content through alternative modalities, which is especially valuable for users with special needs.
In addition to these production features, we are testing an LLM-based interface for querying either whole documents or individual pages. This includes open-ended questions about page content, summarizing entire books or periodical issues, and comparing outputs across different external AI models. These enhanced options are not yet publicly available, but the testing environment allows us to evaluate accuracy, user experience, and model behaviour before integrating them into the production system.
For querying, we have been testing a range of models from OpenAI, Anthropic, and Google to get a sense of how different (and differently priced) models respond to user queries. For translation, Google Translate and DeepL were tested first, but as DeepL does not support Latin, we decided to use Google Translate exclusively, even though it has some disadvantages as well. Early in testing, we discovered that creating summaries in a language different from the original is more effective when the document is first translated. This approach prevents models from inadvertently switching back to the original language partway through the summary, ensuring consistency and accuracy. For text-to-speech, we have tested services from Google, OpenAI, and ElevenLabs. As each of these services has its own advantages and disadvantages, we allow the user to pick a model and a voice for each target language. During the presentation, we will discuss our findings and experience in greater detail.
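The translate-first summarization workflow can be sketched as a simple two-step pipeline. This is a minimal illustration, not the production code: `summarize_in_language` and the stub services are hypothetical names standing in for the external AI calls.

```python
def summarize_in_language(text, target_lang, translate, summarize):
    """Summarize `text` in `target_lang` by translating it first.

    Translating before summarizing keeps the model from drifting back
    to the source language partway through the summary.
    """
    translated = translate(text, target_lang)
    return summarize(translated)

# Stubs standing in for the external translation and LLM services.
calls = []

def stub_translate(text, lang):
    calls.append("translate")
    return f"[{lang}] {text}"

def stub_summarize(text):
    calls.append("summarize")
    return text.split(".")[0]

summary = summarize_in_language(
    "Stará kniha. Mnoho stran.", "en", stub_translate, stub_summarize)
```

The point of the composition is the fixed ordering: the summarizer only ever sees text already in the target language.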
Currently, all these services are implemented solely in the digital library front-end to accelerate user testing and interface improvements; they have not yet been integrated into the backend systems where some of these features might ultimately belong. Since all significant online AI services require payment for extensive use, we require user authentication and route all AI service requests through a common proxy, monitoring token usage and setting usage limits. This gives us valuable data on the real use of the AI services and also protects us from the numerous LLM crawlers that ignore robots.txt settings.
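The accounting logic behind such a proxy can be reduced to a per-user token budget. The sketch below assumes an in-memory counter and a single flat limit; the actual proxy, its limits, and its persistence are implementation details not described in detail here.

```python
class TokenBudgetProxy:
    """Minimal sketch of per-user token accounting for an AI-service proxy.

    A request is forwarded to the external AI service only when the
    authenticated user is still under their token limit; actual usage
    is recorded after the service responds.
    """

    def __init__(self, limit_per_user):
        self.limit = limit_per_user
        self.used = {}  # user id -> tokens consumed so far

    def allow(self, user, estimated_tokens):
        # Check the estimate against the remaining budget before forwarding.
        return self.used.get(user, 0) + estimated_tokens <= self.limit

    def record(self, user, tokens):
        # Charge the tokens actually reported by the service.
        self.used[user] = self.used.get(user, 0) + tokens

proxy = TokenBudgetProxy(limit_per_user=1000)
proxy.record("alice", 900)
ok_small = proxy.allow("alice", 50)    # still within the limit
ok_large = proxy.allow("alice", 200)   # would exceed the limit
```

Requiring authentication before any budget exists is also what keeps anonymous LLM crawlers from consuming paid API calls.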
Retrieval-Augmented Generation – the Newspaper Memory
The presentation will then concentrate on the recent Newspaper Memory project. We indexed 25 newspaper titles dating from 1880 to 1914, totaling approximately 500,000 pages. Because precise page segmentation was unavailable, we divided the text into roughly 10 million chunks using heuristic methods and generated vector representations for each chunk. Users can then ask questions in natural language, and an LLM produces consistent answers based on the most relevant texts, with references to the original articles so that users can check the sources themselves.
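The retrieval side of this pipeline can be illustrated end to end with toy components. This is a deliberately simplified sketch: the bag-of-words `embed` stands in for the real embedding model, and the chunking heuristic is an assumption, not the one actually used on the newspaper corpus.

```python
import math
import re

def chunk(text, max_words=60):
    """Heuristic chunking: split on paragraph breaks, then cap chunk length."""
    chunks = []
    for para in re.split(r"\n\s*\n", text):
        words = para.split()
        for i in range(0, len(words), max_words):
            piece = " ".join(words[i:i + max_words]).strip()
            if piece:
                chunks.append(piece)
    return chunks

def embed(text):
    """Stand-in embedding: a term-frequency vector instead of a model call."""
    vec = {}
    for tok in re.findall(r"\w+", text.lower()):
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=3):
    """Return the k most relevant chunks, keeping indices as references."""
    qv = embed(query)
    scored = sorted(((cosine(qv, embed(c)), i, c) for i, c in enumerate(chunks)),
                    reverse=True)
    return [(i, c) for score, i, c in scored[:k] if score > 0]

corpus = ("The flood of 1890 destroyed the bridge in Brno.\n\n"
          "A new theatre opened in Prague in 1883.")
chunks = chunk(corpus)
hits = retrieve("When did the flood destroy the bridge?", chunks, k=1)
```

In the full system, the retrieved chunks and their references are packed into the LLM prompt, so the generated answer can cite the original articles.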
Semantic Metadata Search
Another experiment involved bibliographic data aggregated by the Moravian Library for its Central Portal for Libraries. We enriched MARC records of monographs with publisher and additional annotations, testing the effectiveness of natural language searches within library catalogues. Early testing indicates a clear need for AI-generated summaries for digitized books that currently lack annotations.
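The enrichment step can be pictured as flattening selected MARC fields, including the added annotation, into one searchable text per record. The field tags follow MARC 21 conventions; the keyword-overlap scoring below is only a placeholder for the actual embedding-based matching, and the sample records are invented.

```python
def searchable_text(record):
    """Concatenate descriptive MARC fields into one string for indexing.

    245 = title statement, 260 = publication info, 520 = summary/annotation
    (the field populated by the enrichment step when an annotation exists).
    """
    parts = [record.get("245", ""), record.get("260", ""), record.get("520", "")]
    return " ".join(p for p in parts if p)

def score(query, record):
    # Placeholder relevance: count shared terms between query and record text.
    q = set(query.lower().split())
    doc = set(searchable_text(record).lower().split())
    return len(q & doc)

records = [
    {"245": "Dějiny města Brna", "520": "A history of the city of Brno"},
    {"245": "Atlas hub", "520": "An illustrated guide to mushrooms"},
]
best = max(records, key=lambda r: score("history of brno", r))
```

Without the 520 annotation, the first record would expose only its Czech title to an English query, which is exactly why AI-generated summaries for unannotated digitized books matter.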
Preprocessing using AI
Beyond user-facing features, several AI-based preprocessing workflows are being developed to streamline internal digitization and cataloguing processes. One of them is an AI-assisted cataloguing tool that analyzes key document pages and generates a draft bibliographic record, including fields such as title, author, publisher, year of publication, and extent. The system also proposes potential author matches from VIAF. All suggestions are presented to catalogers, who verify or adjust them before finalization.
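The hand-off from model to cataloguer can be sketched as parsing the model's structured answer into a draft record with explicit gaps. The field set, the JSON shape, and the identifier values below are all illustrative assumptions, not the tool's actual schema.

```python
import json

REQUIRED = ("title", "author", "publisher", "year", "extent")

def draft_record(model_output):
    """Parse a model's JSON answer into a draft bibliographic record.

    Missing fields stay None so the cataloguer can see at a glance what
    still needs manual attention; nothing is finalized without review.
    """
    data = json.loads(model_output)
    record = {field: data.get(field) for field in REQUIRED}
    record["viaf_candidates"] = data.get("viaf_candidates", [])
    return record

# Illustrative model reply; the VIAF identifier is a made-up placeholder.
reply = ('{"title": "Babička", "author": "Božena Němcová", '
         '"year": 1855, "viaf_candidates": ["12345678"]}')
record = draft_record(reply)
```

Keeping the proposed VIAF matches as a separate candidate list, rather than writing an authority link directly, preserves the verify-before-finalize workflow.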
We are also experimenting with computer vision models to automate the cropping of scanned pages. Currently, this work is done manually, and our goal is to replace these repetitive tasks with AI-driven automation requiring only light supervision.
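The core of automated cropping is finding the bounding box of page content. The sketch below uses a fixed intensity threshold on a toy grayscale image; a production pipeline would operate on real scans, typically with a learned model rather than this naive heuristic.

```python
def content_bbox(image, threshold=128):
    """Bounding box of dark (content) pixels in a grayscale page.

    `image` is a row-major list of pixel intensities (0 = black,
    255 = white). Returns (top, left, bottom, right), or None for a
    blank page.
    """
    rows = [r for r, row in enumerate(image) if any(p < threshold for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] < threshold for row in image)]
    if not rows or not cols:
        return None
    return (min(rows), min(cols), max(rows), max(cols))

# Toy 4x4 page: white border around a small dark content region.
page = [
    [255, 255, 255, 255],
    [255,  10,  30, 255],
    [255,  20, 255, 255],
    [255, 255, 255, 255],
]
box = content_bbox(page)
```

Human supervision then reduces to reviewing proposed boxes instead of drawing each crop by hand.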
A similar approach is applied to the generation of structural metadata. AI models are being tested to detect page numbers, classify page types, and extract publication dates for newspapers and serials—tasks that traditionally require substantial manual effort.
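For a sense of what this metadata extraction involves, here is a rule-based sketch that pulls a page number and an issue date out of OCR text, assuming the day.month.year format common on Czech mastheads and "Str." as the page-number abbreviation. The AI models under test replace exactly this kind of brittle regex.

```python
import re

def extract_structure(ocr_text):
    """Heuristic sketch: find a page number and an issue date in OCR text."""
    # A line consisting only of "Str. <n>" (Czech abbreviation for page).
    page = re.search(r"(?m)^\s*[Ss]tr\.?\s*(\d+)\s*$", ocr_text)
    # A date in day.month.year form, e.g. "12. 3. 1905".
    date = re.search(r"\b(\d{1,2})\.\s*(\d{1,2})\.\s*(18\d{2}|19\d{2})\b",
                     ocr_text)
    return {
        "page_number": int(page.group(1)) if page else None,
        "date": "{:04d}-{:02d}-{:02d}".format(
            int(date.group(3)), int(date.group(2)), int(date.group(1))
        ) if date else None,
    }

sample = "Lidové noviny, 12. 3. 1905\nStr. 4\n..."
meta = extract_structure(sample)
```

Rules like these break on OCR noise and layout variation, which is what motivates testing model-based detection instead.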
Finally, multimodal models support the identification of images and other non-textual elements within documents and generate concise descriptive captions. These enrichments aim to improve discoverability of visual materials and to support future accessibility-focused services.
Hybrid Interface
All the above-mentioned experiments have influenced the design of a new user interface for the Czech Digital Library. Our goal is to combine the precision of traditional keyword search with the semantic flexibility provided by modern AI models. The hybrid interface will also support multimodal search, enabling queries based on textual descriptions of images and allowing users to explore image similarity across the collection. To improve accessibility and navigation, we plan to identify named entities and generate document-level summaries in advance.
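One common way to combine keyword and semantic results, assumed here rather than confirmed as our final design, is reciprocal rank fusion: each result list contributes a score based only on rank, so the two retrieval methods need no comparable scoring scales.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists into one by classic RRF.

    `rankings` is a list of ranked lists of document ids (e.g. one from
    keyword search, one from vector search); k=60 is the customary
    damping constant. Each document scores sum(1 / (k + rank)).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents ranked well by both methods ("doc_b" above) rise to the top, while documents found by only one method still remain in the fused list.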
A major challenge lies in applying retrieval-augmented search effectively. Our experience shows that RAG-based querying works best within coherent, well-defined subsets of the collection—rather than across the entire digital library. Introducing a conversational interaction mode is expected to further simplify user workflows, provided that system behaviour remains predictable and transparent.
During development we must also account for legal constraints on copyrighted materials. These restrictions affect not only direct access to documents but also limit how extensively they may be translated, summarized, or processed by external AI services. In some cases, AI-generated summaries or other derived outputs may be used solely to support search and retrieval, without being displayed directly to users. Depending on copyright status, we may also need to rely on locally deployed models or restrict the scope of generated excerpts.
Our next steps include comprehensive testing of the new interface and integrating advanced AI functionalities into production. At the same time, we aim to collect usage data that will help us refine these features and align them with the evolving expectations of digital library users.