ARC@ORU: Alignment of large language models through the lens of data and algorithms

28 May 2025, 09:00 – 10:00, Visual Lab in ARC, Örebro University, or online via Zoom

The ARC@ORU seminar series aims to raise awareness of the breadth of perspectives on AI, robotics, and cybersecurity found at Örebro University, and to facilitate and inspire new collaborations. Today's guest is Dr. Radha Poovendran, Professor at the University of Washington, Seattle.

In English:

ARC@ORU research seminar series: “Alignment of large language models through the lens of data and algorithms”

Speaker: Dr. Radha Poovendran, Professor in the Department of Electrical and Computer Engineering at the University of Washington, Seattle

Host and moderator: Alberto Giaretta, Associate Senior Lecturer in Computer Science, Örebro University

We have the great pleasure of welcoming Professor Radha Poovendran, who is visiting Sweden and Örebro University this week.

About the seminar

As large language models (LLMs) become increasingly integrated into real-world applications (e.g., code generation and chatbot assistants), it is crucial to align these models with human values. This talk will focus on the alignment of LLMs, with particular emphasis on their safety and robustness, the identification of new vulnerabilities, and the scalable generation of synthetic alignment data. We will describe our attack-agnostic defenses, SafeDecoding and CleanGen, which are new decoding strategies that enhance the safety of LLMs at inference time. We will also demonstrate that rich information beyond semantics embedded in text unveils new vulnerabilities that make LLMs susceptible to jailbreak attacks. We further investigate the vulnerabilities of emerging large reasoning models such as DeepSeek-R1.

We will finally present our method, named Magpie, to generate large-scale synthetic data to improve LLM alignment. A highlight of all these efforts is that they do not require re-training or modifying LLM parameters, making them easily deployable with minimal overhead. Our research will help ensure that LLMs are better aligned with human values, thereby providing enhanced quality-of-service to users.

About the speaker

Dr. Radha Poovendran is a Professor in the Department of Electrical and Computer Engineering at the University of Washington, Seattle, where he has directed the Network Security Lab (NSL@UW) since 2001. He received a B.S. degree in Electrical Engineering and an M.S. degree in Electrical and Computer Engineering from the Indian Institute of Technology Bombay and the University of Michigan, Ann Arbor, respectively. He received a Ph.D. in Electrical and Computer Engineering from the University of Maryland, College Park. His research interests are in the areas of Adversarial Modeling, Resilient Cyber-Physical Systems, Safety and Security of LLMs and LRMs, and Synthetic Data for LLMs, Coding, and Math. Recent public Synthetic Data contributions on GitHub include Magpie Align, Kodcode, and SafeDecoding.

The ARC@ORU research seminar series

The aim of the ARC@ORU research seminar series is to raise awareness of the breadth of perspectives on AI found at Örebro University, and to facilitate and inspire new collaborations. The seminars are primarily aimed at researchers from all disciplines interested in research and collaborations related to the AI field. However, anyone who finds the topic of interest is welcome to attend!

Welcome!
