Specialists from the HSE Institute of Education Confirm GigaChat’s Erudition in Social Sciences

GigaChat, a multimodal neural network model by Sber, has passed the Unified State Exam in social studies under the supervision of an expert commission from HSE University. The model completed all exam tasks and scored 67 points.

This result exceeds the minimum score for applying to university (45 points) and the average score for the subject in 2023 (56.4 points).

Denis Filippov, Vice President of Salyut Digital Surfaces, Sberbank, announced the result at the AIJ 2023 conference.

The tests were held to check the capabilities of the updated version of GigaChat, which is based on a highly advanced Russian-language model with 29 billion parameters.

The subject chosen for the experiment was social studies, a school discipline that covers economics, law, and social sciences. The model's ability to pass the exam therefore indicates a mature awareness of social norms, as well as of economic and legal principles.

To test GigaChat’s knowledge, the team used only current test tasks for 2024, which were posted on the website of the Federal Institute of Pedagogical Measurements. Before the experiment, the team made sure that these tasks had not been used to pre-train the model. GigaChat’s answers were checked first by an independent expert from HSE University, and then by an expert commission from HSE University’s Institute of Education. The reviewers assessed whether the tasks had been posed correctly, whether GigaChat’s factual answers were reliable, and how well it performed on creative (open-ended) tasks.

Denis Filippov, Vice President of Salyut Digital Surfaces, Sberbank

‘It is important for us to evaluate the effectiveness of GigaChat not only by technical metrics, but also from the viewpoint of an ordinary person: whether the service is able to help in any particular area of knowledge and how smart and creative the model is. Tests used in the educational system, including the Unified State Exam, are well suited for such an assessment. The exam results show that GigaChat is “well-read” in the field of social sciences. This means that our artificial intelligence “understands” the basic laws of society and has a good sense of morality issues. This is further evidence that our service can be used to solve real-life fact-related problems: just ask a question in a natural way, and GigaChat will give you an accurate answer or help you understand a complex topic.’

Evgeniy Terentev, Director of HSE University’s Institute of Education

‘Our experts assessed GigaChat’s knowledge independently of Sber research and engineering teams. We checked the answers in the same way as if they were given by an ordinary high school graduate. The results show that the neural network model not only has a sufficient level of factual knowledge, but is also capable of thinking logically and choosing the best solution possible.’

Soon, anyone interested will be able to run an experiment similar to the one conducted jointly with HSE University: GigaChat developers are preparing a special script for publication on GitHub. It will allow users to test with ‘one button’ how Sber’s neural network model performs on the Unified State Exam, without the need to enter the texts of the tasks manually.