Russian AI Learns to Sense Emotions: How HSE Is Setting a New Global Standard

Linguists at HSE University in St. Petersburg have created a pioneering emotional dictionary to train artificial intelligence, offering a new benchmark for the future of empathetic AI.
Inside the “Emotional Dictionary”
For artificial intelligence to communicate effectively with people, it must recognize human emotions. Without that ability, progress in conversational technology remains limited.
Researchers at HSE University’s Laboratory for Language Convergence in St. Petersburg have unveiled a groundbreaking multimodal “emotional dictionary.” This dataset is designed as a reference point for AI systems tasked with recognizing emotional states. It addresses the longstanding shortage of high-quality Russian-language data and represents a significant milestone in Russia’s development of next-generation AI.

The resource contains 909 video recordings totaling nearly 173 minutes. Each fragment is labeled according to six core human emotions: joy, surprise, anger, fear, sadness, and disgust. The dataset’s uniqueness lies in its multimodality: every emotional expression is captured in four forms — full video, audio, transcript, and silent video. This enables researchers to test models across data types and evaluate the performance of multimodal systems.
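To make the cross-modality testing described above concrete, here is a minimal sketch of how a model's accuracy might be compared across the four forms. The six emotion labels and four modalities come from the description above; everything else (the `Prediction` record, the `clip_id` values, the modality key names, and the `accuracy_by_modality` helper) is hypothetical and does not reflect the dataset's actual file layout or any official evaluation tooling.

```python
from collections import defaultdict
from dataclasses import dataclass

# Labels and modalities as listed in the article; the key spellings are assumptions.
EMOTIONS = ("joy", "surprise", "anger", "fear", "sadness", "disgust")
MODALITIES = ("full_video", "audio", "transcript", "silent_video")


@dataclass(frozen=True)
class Prediction:
    """One model output for one clip in one modality (hypothetical schema)."""
    clip_id: str
    modality: str
    gold: str       # annotated emotion
    predicted: str  # emotion returned by the model under test


def accuracy_by_modality(predictions):
    """Return the share of correct predictions for each modality seen."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p in predictions:
        if p.modality not in MODALITIES:
            raise ValueError(f"unknown modality: {p.modality}")
        if p.gold not in EMOTIONS or p.predicted not in EMOTIONS:
            raise ValueError(f"unknown emotion label in clip {p.clip_id}")
        total[p.modality] += 1
        correct[p.modality] += int(p.gold == p.predicted)
    return {m: correct[m] / total[m] for m in total}


# Toy predictions, invented purely for illustration.
sample = [
    Prediction("clip_001", "audio", "joy", "joy"),
    Prediction("clip_001", "transcript", "joy", "surprise"),
    Prediction("clip_002", "audio", "anger", "anger"),
    Prediction("clip_002", "transcript", "anger", "anger"),
]
scores = accuracy_by_modality(sample)
```

On this toy input the audio predictions are all correct while the transcript predictions are correct half the time, which is the kind of per-modality gap such a benchmark is meant to expose.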
Why This Matters for Russia and Beyond
Until now, high-quality multimodal datasets for the Russian language have been extremely rare. Western analogues, such as CMU-MOSEI, focus primarily on English and so overlook cultural differences in how emotions are expressed. For Russia to advance its own AI technologies, resources tailored to its language and cultural context are indispensable.
The new dictionary not only fills this gap but also lays the foundation for building empathetic AI—systems capable of grasping the nuances of Russian speech and nonverbal communication. This advancement highlights Russia’s growing independence in technological innovation and positions the country to shape global standards in emotional intelligence research.
From Research to Real-World Applications
This initiative builds on earlier projects, including the bimodal Dusha corpus (the largest open dataset for emotion recognition in Russian speech) and the multimodal RAMAS dataset. However, the new dataset was conceived from the start as a tool for evaluation and standardization.

Its potential is already being demonstrated. Pilot projects in the cultural sector, for instance, use the dictionary to power interactive solutions at the Hermitage Museum. There, chatbots adapt their responses based on the visitor’s emotional state, making cultural experiences more engaging and personalized.
The Road Ahead: Toward Truly Empathetic AI
Looking forward, researchers plan to expand the dataset with more complex combinations of emotions, as well as regional, age-based, and social variations. Such diversity is critical to training AI that can interact naturally across different contexts. Scholars are confident that the resource will attract global attention, as demand for emotionally aware AI grows in fields such as education, healthcare, and customer service.

The creation of this emotional dictionary is more than a scientific breakthrough — it is a declaration of technological sovereignty. By capturing the rich emotional spectrum of the Russian language, this project ensures Russia a prominent role in shaping the cultural and technological future of artificial intelligence.