Language and Personalization

LAP

The purpose of this work package is to develop personalization techniques and Scandinavian language processing capabilities that support personalized content generation. Its goals are to:

Develop truly explainable, fair and transparent personalization techniques
Enable proactivity in customer relations
Provide an individualized experience that provably respects privacy concerns
Develop individualized content
Develop large-scale Scandinavian language models
Enable human-like content creation and conversations

Personalization and contextualization have been successfully employed in diverse applications over the past decade and are now seeing wider use, for instance in proactive interaction with customers and individualization of news stories. LAP will contribute to developing such systems while ensuring that they are used ethically and respect users’ requirements for privacy, fairness and accountability.

Building Scandinavian language models requires the compilation of large-scale reusable language resources, including general-purpose corpora from public sources (e.g., news and social media) as well as industry- and domain-specific text collections. We will address the scarcity of the latter by pre-training on the former and developing transfer learning methods. These large-scale language models will then be applied in real-life scenarios by formulating a number of specific summarization, explanation, and conversational tasks based on our partners’ use cases. LAP will develop appropriate evaluation methodology with user-oriented evaluation measures and objectives. It will thereby help quantify how much domain-specific training material is needed to provide a language service of sufficiently high quality.

Projects & Results

LAP projects

NorLLM Language Models

NorLLM Norwegian Large Language Models

The models are available for testing to representatives of organizations based in the Nordic countries and to students at Nordic universities. Apply for access to the NorLLM models on Hugging Face: https://huggingface.co/NorwAI

Associated projects

TrustLLM - funded by Horizon Europe, coordinated by Linköping University

TrustLLM - Democratize Trustworthy and Efficient Large Language Model Technology for Europe.

TrustLLM brings together leading European research institutions to advance European development in NLP and AI, and to lay the foundation for a broader European collaboration effort on LLMs and large-scale AI. The project aims to build a series of LLMs that represent specific language families, in particular the Germanic language family.

Aifal - AI for Allmennleger (AI for General Practitioners)

The Aifal project is a research initiative that focuses on integrating generative AI models into the primary healthcare service to improve general practitioners' everyday work. The goal is to develop AI services that free up time for patient meetings, improve decision-making processes, and speed up diagnosis. The project tests prototypes for summarizing patient records, knowledge support, transcribing consultations, and medical coding. Collaboration partners include the Antibiotic Center at UiO and NorwAI at NTNU. For more information, see the Aifal project page.

The Mimir Project

The Mímir project, led by the National Library of Norway, aims to assess the impact of copyrighted material on the performance of large generative language models for Norwegian languages. It involves collaboration with the University of Oslo, NTNU/NorwAI, and Sigma2, focusing on training models with both copyrighted and non-copyrighted content. The findings suggest that models trained with a mix of these materials generally perform better, emphasizing the importance of high-quality, curated content.

Press release from the National Library: "Forskningsprosjekt viser: Rettighetsbelagt innhold gir norske språkmodeller høy kvalitet" (Research project shows: copyrighted content gives Norwegian language models high quality), Nasjonalbiblioteket (nb.no)

Technical report from the Mimir project

People

WP Leader

Krisztian Balog
Professor, UiS

Researchers

Stories

A call for Nordic collaboration

Leading linguistic researchers and data engineers from the greater Nordic region gathered to share expertise and challenges. To keep up with the pace of technology, even more cross-border cooperation is needed.

Women presenting on stage in front of an audience - Divvun works on Sami language technology at the Arctic University in Tromsø together with Giellatekno. Photo: NorwAI

2024-11-29

Where does the road ahead lead for NorwAI and language models? A small roadmap for NorLLM

Demand for and curiosity about Norwegian generative language models have been notable. The six models NorwAI published this summer have been downloaded more than 10 000 times. Plans for what comes next are taking shape.

The NorLLM logo

2024-11-26

Nordic cooperation to protect minority languages in the age of AI

Nordic Language Technology Get-together to Include Minority Languages

As society becomes increasingly digitized, professionals in the Nordics want to ensure that the new solutions offered by language technology and artificial intelligence are available for all languages in the Nordics.

The organizers of the Get-together in Trondheim on November 5th and 6th hope to foster cooperation across national borders and languages. They will also launch a new language technology platform for small languages and host a poster session.

The conference is a cooperation between ASTIN (Arbetsgruppen för språkteknologi i Norden), whose members include UiT The Arctic University of Norway, Språkrådet vid Institutet för språk och folkminnen in Sweden, Dansk Sprognævn in Denmark and Språkrådet in Norway, and NorwAI, the Norwegian Research Center for AI Innovation.

2024-10-29

Medbric to assist doctors in the primary healthcare service

Jon Espen Ingvaldsen of NorwAI and Jorunn Thaulow of the University of Oslo have joined forces to bring state-of-the-art AI technology into medical practice. The test period, in which more than 100 GPs participated, showed very good results.

Jon Espen Ingvaldsen and Jorunn Thaulow on stage — With the healthcare sector facing a growing demand for manpower and resources, innovative technologies like Medbric are highly sought after to enhance efficiency.
Photo: Kai T. Dragland, NTNU

2024-09-30

Learn with NorwAI's experts

Portrait of Jon Atle Gulla - Jon Atle Gulla, head of NorwAI, will be the academic lead.

NorwAI is preparing a special course on language models for key decision-makers and developers who want to use the technology in their own applications. In the course "Innovation with generative language models", NorwAI's experts will share their knowledge and skills with those who want to lead the way in Norwegian AI utilization. (Full article in Norwegian)

2024-08-29

Introducing IAI MovieBot

IAI MovieBot is an ongoing project by the IAI group at the University of Stavanger. It is a conversational recommender system for movies.

Throughout a conversation, IAI MovieBot asks you questions about your preferences, such as the genre and the release year of the movie you are looking for. Based on your answers, IAI MovieBot tries to recommend a movie that matches your preferences and to answer your questions about it.
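The ask-then-recommend interaction described above can be sketched as a simple slot-filling loop. This is only an illustration under stated assumptions, not IAI MovieBot's actual implementation: the toy catalogue, the `Preferences` class, and the functions `next_question` and `recommend` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy catalogue (assumption); a real system would query a movie database.
MOVIES = [
    {"title": "Alien", "genre": "sci-fi", "year": 1979},
    {"title": "Arrival", "genre": "sci-fi", "year": 2016},
    {"title": "Paddington", "genre": "comedy", "year": 2014},
]


@dataclass
class Preferences:
    """Preferences collected so far; None means the slot has not been filled."""
    genre: Optional[str] = None
    min_year: Optional[int] = None


def next_question(prefs: Preferences) -> Optional[str]:
    """Return the next question to ask, or None when every slot is filled."""
    if prefs.genre is None:
        return "Which genre are you in the mood for?"
    if prefs.min_year is None:
        return "What is the earliest release year you would accept?"
    return None


def recommend(prefs: Preferences) -> Optional[dict]:
    """Return the first catalogue entry that matches every stated preference."""
    for movie in MOVIES:
        if prefs.genre is not None and movie["genre"] != prefs.genre:
            continue
        if prefs.min_year is not None and movie["year"] < prefs.min_year:
            continue
        return movie
    return None


if __name__ == "__main__":
    prefs = Preferences()
    # Simulated user answers, consumed in the order the questions are asked.
    answers = iter(["sci-fi", "2000"])
    while (question := next_question(prefs)) is not None:
        answer = next(answers)
        print(f"Bot: {question}\nUser: {answer}")
        if prefs.genre is None:
            prefs.genre = answer
        else:
            prefs.min_year = int(answer)
    movie = recommend(prefs)
    print("Bot: How about", movie["title"] if movie else "something else?")
```

A real conversational recommender would also need natural language understanding to extract preferences from free text and a way to revise them mid-conversation; the sketch keeps only the question-and-filter core.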

Screenshot of the MovieBot

2024-08-05

Award at ICTIR '24

Best Paper Honorable Mention Award at ICTIR '24

We are thrilled to announce that the paper, "Towards a Formal Characterization of User Simulation Objectives in Conversational Information Access" by Nolwenn Bernard and Krisztian Balog, has received the Best Paper Honorable Mention Award at the 14th International Conference on the Theory of Information Retrieval (ICTIR '24)!

Portrait Nolwenn Bernard — Nolwenn Bernard, University of Stavanger

Portrait of Krisztian Balog — Krisztian Balog, University of Stavanger

2024-08-16

How can (Norw)AI protect personal data?

Protecting personal information is challenging with complex AI models that are hungry for data. NorwAI’s pledge to provide an individualized AI experience that provably respects privacy concerns is therefore more important than ever.

Portrait of Anders Løland, NR — Anders Løland, Research Director, Norwegian Computing Center (NR)

The project “MIMIR” on copyrighted content

At the end of 2023, an initiative emerged that brought together the three most active environments in Norway with expertise in language models for closer collaboration. The “Mimir” project united the National Library of Norway, the University of Oslo, and NorwAI in a joint effort.

Portrait picture of Aslak Sira Myhre — Aslak Sira Myhre

Large language models at University of Oslo

Many have declared 2023 the year of Large Language Models (LLMs), and it’s hard to disagree. In the Language Technology Group (LTG) at the University of Oslo (UiO), developing language models for Norwegian has been an important priority for several years. While also a NorwAI partner, LTG has not been involved in the LLM efforts of the center. Nonetheless, language modeling has defined our activities in several other collaborations.

- We need generative Norwegian language models

- Do we need generative language models in Norwegian? The answer is a resounding yes! The models must be able to convey Norwegian values and attitudes, and they must convey good Norwegian, both Bokmål and Nynorsk.

So said Åse Wetås, director of Språkrådet (the Language Council of Norway), when she gave the opening talk at the Trondheim Tech Port and NorwAI innovation breakfast on language models and innovation on 14 February 2024.

Åse Wetås on stage during a talk - Åse Wetås, Språkrådet. Photo: Kai T. Dragland

2024-02-27

New language model for public use this winter

- NorwAI will meet the immense interest in working with language models by releasing an open model that is smaller than the research model NorGPT-23 but still fully operational, says Jon Atle Gulla, professor and director of NorwAI.

Portrait Jon Atle Gulla — Jon Atle Gulla, Director of NorwAI

2023-10-06

Upgrading infrastructure is critical to meet AI demands

Upgrading national infrastructure is a critical stepping stone in preparing for the quantum leap that technology is now facing.

Researcher by the Idun cluster - The Idun cluster has been upgraded to meet the new demands of AI research. Here, NorwAI researcher Lemei Zhang.
Photo: Kai T. Dragland, NTNU

2023-08-11

American language models influence ChatGPT. That is problematic.

- Several problems arise with artificial intelligence. Now we need to take control of the infrastructure, says Sven Størmer Thaulow, EVP and Chief Data and Technology Officer at Schibsted ASA.

Portrait of Sven Størmer Thaulow

2023-07-05

Schibsted reports on their AI results

Sven Størmer Thaulow on stage — Schibsted EVP, Chief Technology and Data Officer, Sven Størmer Thaulow was invited to the main media executive meeting in Norway to report on their initiatives in AI.

2023-05-15

A national call for cooperation - Media to contribute to NorwAI’s LLM

2023-05-15

Large language models as public goods

Large language models have huge potential for value creation - but there is a strong need to address issues of control and risk mitigation.

Portrait of Eivind Throndsen - Eivind Throndsen
Academic coordinator
Schibsted Products & Technology

We are now moving towards a huge change in intellectual value creation, driven by the weird and surprisingly sophisticated mimicry of intelligence that large language models (LLMs) provide.

These models have unleashed a wave of creativity. They, and their model cousins that can process, transform and generate sound, images and any digitizable data, have enabled previously impossible products and services along with a torrent of hype.

Due to the enormous amounts of data, compute and brain power required, these important platforms are now mostly developed and controlled by a few very large private technology companies in the US. This is problematic, because along with all the interesting new functionality, large language models also suffer from serious and complicated challenges such as bias, hallucinations and toxicity. Private companies will invariably balance mitigating these issues with the need for profit. They are likely to do the bare minimum required to avoid regulatory retribution and public relations backlash.

2023-03-28

ChatGPT and its inner workings

Media insiders from seven countries got a lecture from NorwAI researcher Benjamin Kille, as interest in learning more about the new language models dominated the discussions at a media lab day in Hamburg, Germany.

Portrait of Benjamin Kille - Benjamin Kille, Postdoctoral fellow, Department of Computer Science, NTNU

2023-01-31

The Kahoot Test of the AI Summary

Participants at the NxtMedia Conference 2022 were able to test journalistically written articles against summaries written by a language robot.

- The Kahoot game came in handy for choosing the winners in the three examples, says adjunct associate professor Jon Espen Ingvaldsen, who ran the test.

Jon Espen Ingvaldsen holding a presentation — Photo: Kai T. Dragland, NTNU

2022-12-13

Norwegian GPT model to be introduced

NorwAI to introduce large Norwegian GPT model

The NorwAI GPT Language Modeling Project is currently building its own large Norwegian model. The model will go into training this spring and will be ready for demonstrations for interested partners, says NorwAI head, Professor Jon Atle Gulla.

Jon Atle Gulla speaking on stage — - This will be a major result for NorwAI so far, says Professor Jon Atle Gulla.
Photo: Kai T. Dragland, NTNU

2023-01-31

A new team of research assistants has started at NorwAI

A new team of research assistants will continue our work with Kaia-The Social Robot

NorwAI will continue the research on Social Robotics. This semester three new research assistants have joined us and will develop new features and conduct extensive benchmarks to test Kaia against the state of the art.

Group picture of the research assistants — The team consists of Håkon Høgset (left), Alexander Gerlach (center), and Marte Eggen (right). PostDoc Benjamin Kille and Professor Jon Atle Gulla will guide their work.

2022-09-02

Language experts to report on speech and text to the Storting

Portrait of Jonas Engestøl Wettre in the Technology Council - Jonas Engestøl Wettre, project manager at the Norwegian Board of Technology

The Norwegian Board of Technology (Teknologirådet) advises lawmakers and the government. Starting with speech and later continuing with large language models, expert groups will explain complex language technology step by step. NorwAI's director Jon Atle Gulla is part of the expert group.

2022-05-31

ECIR Conference

Norway may take a world-leading AI role

Kjetil Nørvåg and Krisztian Balog at the ECIR 2022 conference - General chairs: professors Kjetil Nørvåg, NTNU (left), and Krisztian Balog, UiS, headed the ECIR forum for Information Retrieval in April. Photo: NorwAI

- One particular area where Norwegian AI stands out is its genuine interest in fairness, transparency and explainability, which aligns with societal values in Norway. Therefore, I can see Norwegian AI research taking a world-leading role in these areas, says professor Krisztian Balog at the University of Stavanger and Staff Research Scientist at Google.

Krisztian Balog heads NorwAI's work package for language technologies. He cooperates with NorwAI's research director, professor Kjetil Nørvåg. The two professors combined their skills as general chairs of the successful ECIR conference in Stavanger during the week before Easter, giving an international audience insight into new research results in the broadly conceived area of Information Retrieval.

2022-04-28

A silent challenge

The Language Council of Norway has contacted NorwAI about current research on sign languages. There is ongoing research in Europe on AI-driven sign language processing, and NorwAI is considering looking into the use of machine learning for interpreting Norwegian Sign Language. This visual and silent language is an official minority language in Norway. Research will face some very special challenges if a project materializes.

Logo Språkrådet

2022-03-30

Tailoring news content: How Scandinavian media houses have tested recommender systems

Scandinavian newspapers were early adopters of online services 25 years ago. Gradually, some of them explored how recommender systems would enable individually tailored news streams. In a recent article in AI Magazine, NorwAI associates, headed by Center director Jon Atle Gulla (pictured), explore how Scandinavian media organizations are coping with these new technological opportunities.

2021-12-20

New Language Models in NorwAI

Photo of Jon Atle Gulla

- The NorwAI center is determined to provide new Norwegian language models that are significantly larger and better than what is available today and that can easily be employed in advanced Norwegian NLP applications for industrial use, says center director professor Jon Atle Gulla.

2021-04-20

LAP - Language and Personalization