09:44, 16 April 2026

Zhanna Flags the “Workarounds”: Sirius Lets AI Judge Hackathon Projects

At the Ctrl+Vibe hackathon at Sirius University, students were evaluated not only by human experts but also by a neural network trained on large language models.

The AI agent has a very human name – Zhanna. While students slept ahead of their final presentations, Zhanna spent the night inside their projects: it analyzed code, reviewed documentation, inspected pipelines and generated “cold” recommendations. The final score combined machine analysis with human judgment. This is a full-scale pilot of embedding generative AI into project-based learning.

“Results Were Dramatic”

Students were tasked with solving three applied challenges from partner companies. One team worked on an internal task distribution platform for the university, another built an AI assistant for cooking, while a third developed an AI-driven project management platform. The experiment was supported by the Yandex School of Data Analysis, the YandexCloud Center for Technologies for Society and the university itself.

The most interesting aspect of this format is the division of roles. Typically, AI writes code and humans review it. This time, the roles were reversed: students wrote the code, while both humans and the machine evaluated it. Zhanna’s assessments were strict and pragmatic. The agent drilled into each project, examining code quality, the accuracy of workflow visualization and the rigor of documentation. It could not be impressed by polished presentations alone; instead, it looked for bugs and technical flaws. In practice, Zhanna acted like a detective, identifying “generated workarounds” – questionable code segments that students tried to hide. “I ran tests, and the results were dramatic: some solutions failed, while others proved solid,” the AI agent said.
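The evaluation described here, running tests and penalizing suspicious code, can be reduced to a simple scoring heuristic. The sketch below is purely illustrative: the markers, weights and data fields are assumptions, not details of the actual Zhanna system.

```python
from dataclasses import dataclass, field

@dataclass
class MachineReview:
    tests_passed: int
    tests_total: int
    flagged_segments: list = field(default_factory=list)  # suspected "workarounds"

def flag_workarounds(source: str) -> list:
    """Naive heuristic: flag lines that hardcode an expected answer
    or silently swallow failures instead of handling them."""
    suspicious = ("return 42", "except: pass", "# TODO: replace stub")
    return [line.strip() for line in source.splitlines()
            if any(marker in line for marker in suspicious)]

def machine_score(review: MachineReview) -> float:
    """Score in [0, 1]: test pass rate minus a fixed penalty per flagged segment."""
    pass_rate = (review.tests_passed / review.tests_total
                 if review.tests_total else 0.0)
    penalty = 0.1 * len(review.flagged_segments)
    return max(0.0, pass_rate - penalty)
```

A final grade in this scheme would then blend `machine_score` with the human judges' marks, which is consistent with the article's description of a combined score.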

What Is Catching Up Now

Human judges were left with the most nuanced part of the evaluation. They assessed presentations, decision-making logic and teamwork. Equally important was the ability to justify architectural choices. Students had to explain why their architecture worked better and respond to challenging questions.

The hackathon is built around the concept of vibe coding, where developers actively use AI tools to generate solutions quickly. Students operated in this mode for a full week. They delegated code writing to neural networks and then defended their work. As one participant, second-year student Varvara Oleynikova, noted: “We often reject new technologies simply because we do not understand them. But it is important to keep pace with change and accept reality as it is. This hackathon is about working with what is already catching up with us.”

Millions of AI Code Reviews

AI has been tested as a judge before. In 2024, the Advanced Engineering School at Innopolis University ran a similar hackathon focused on building new features for the Agniya AI agent. That marked a shift: Russian universities were no longer just discussing AI but building applied competitions around it. The key difference is that at Innopolis, AI was the development goal, while at Sirius it serves as the evaluation tool.

While Russian students were competing, GitHub published research showing that code written with the Copilot AI assistant demonstrates better functionality, readability and a higher approval rate in reviews. By 2026, GitHub had recorded tens of millions of AI-driven code reviews. Copilot now analyzes entire project contexts rather than isolated lines of code. That makes the Sirius experiment a reflection of real-world practice. What students are testing today is already becoming the default in the industry.

Control Framework

The educational model itself is shifting. Previously, students were trained in manual operations: writing loops and adjusting style. Now, at Sirius University, with support from YandexCloud, teaching will focus on development within a “human + AI + agent-based quality control” framework.

The impact will be tangible, even if not immediate. The market will gain specialists who can build digital services using AI. The quality of applied products will improve faster, from university platforms to mobile food delivery apps. This will shorten time to market, as AI will catch errors in code and documentation at early stages, before they reach end users.

These practices are likely to scale, first in project intensives, then across formal education. Automated code and pipeline checks before thesis defense could become a new standard. At the same time, it is critical not to overestimate the machine. The winners will not be universities that simply allow ChatGPT to solve tasks, but those that design a robust control framework: a hybrid model where AI accelerates routine work, while humans remain responsible for meaning, architecture and product maturity.

We chose to focus on multi-agent systems. At Yandex today, administrators manually review all submitted requests, assessing risks and potential benefits. Our solution automates this process through a network of AI agents. They independently evaluate technical feasibility, timelines and budgets. If a request is approved, they analyze research sources, check intellectual property, determine the optimal architecture and identify the required professionals. The task was complex, but our team succeeded because of our knowledge base and our understanding of system architecture.
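The two-phase flow the team describes, gating checks first, deeper analysis only for approved requests, can be sketched as a chain of agent functions. Everything here is a hypothetical illustration: the agent names, request fields and thresholds are assumptions, not the team's actual code.

```python
from typing import Callable, Dict, List, Tuple

# Each "agent" inspects a request and returns (approved, note).
Agent = Callable[[dict], Tuple[bool, str]]

def feasibility_agent(request: dict) -> Tuple[bool, str]:
    # Illustrative check: does the estimated effort fit the deadline?
    ok = request.get("estimated_months", 99) <= request.get("deadline_months", 0)
    return ok, "fits timeline" if ok else "exceeds timeline"

def budget_agent(request: dict) -> Tuple[bool, str]:
    # Illustrative check: is the projected cost within budget?
    ok = request.get("cost", 0) <= request.get("budget", 0)
    return ok, "within budget" if ok else "over budget"

def triage(request: dict,
           gate_agents: List[Agent],
           followup_agents: List[Agent]) -> Dict:
    """Run gating agents first; only approved requests reach the
    deeper analysis stage (sources, IP, architecture, staffing)."""
    notes = []
    for agent in gate_agents:
        approved, note = agent(request)
        notes.append(note)
        if not approved:
            return {"approved": False, "notes": notes}
    for agent in followup_agents:
        _, note = agent(request)
        notes.append(note)
    return {"approved": True, "notes": notes}
```

For example, a request with `estimated_months=3`, `deadline_months=6`, `cost=100`, `budget=200` passes both gates and proceeds to follow-up analysis; raising `cost` above `budget` stops it at the gate.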