DeepSeek is working on a self-improving artificial intelligence model.
The system developed by DeepSeek achieves better results by running multiple evaluations in parallel rather than by relying on ever-larger models.
The Chinese AI company DeepSeek has introduced a new method to enhance the reasoning capabilities of large language models (LLMs). With this method, the company claims it can provide faster and more accurate answers to general questions than its competitors. DeepSeek drew significant attention with R1, the AI model and chatbot it launched in January, asserting that it matched the performance of OpenAI's ChatGPT at a much lower cost.

In collaboration with Tsinghua University, one of China's most prestigious institutions, DeepSeek announced in an academic paper published on Friday that it has developed a technique that allows AI models to improve themselves. The technique is called 'self-principled critique tuning' (SPCT). With this method, the AI creates its own rules for evaluating content and then produces detailed feedback (critiques) based on those rules. The system achieves better results by running multiple evaluations in parallel rather than by relying on ever-larger models, an approach known as 'generative reward modeling' (GRM). The GRM system evaluates the content the AI generates and, using SPCT, checks how well it aligns with user expectations.

So how does this system work? Improving an AI model generally requires training larger models, which demands significant computational power and considerable human labor. Instead, DeepSeek has built an integrated 'judge' into the AI. This judge evaluates the AI's answers in real time: when a user asks a question, the internal judge compares the answer both with the model's own rules and with what an ideal answer should look like. If the answer matches closely enough, the system gives the AI positive feedback, and the model improves itself over time. DeepSeek has named this self-improving system 'DeepSeek-GRM.'
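The judge loop described above can be sketched in miniature. This is only an illustrative toy, not DeepSeek's implementation: every function here (`generate_principles`, `critique`, `grm_reward`) is a hypothetical stand-in, and the scoring is simulated with random noise rather than a real judge model.

```python
# Toy sketch of a self-principled critique / generative-reward loop.
# Assumption: this mirrors the SPCT idea only at a high level; all
# helpers are hypothetical stand-ins, not DeepSeek's actual method.
import random
import statistics

def generate_principles(question):
    # Step 1: the judge writes its own evaluation principles for this
    # question (hardcoded here for illustration).
    return ["answers the question directly",
            "states facts without contradiction",
            "is concise"]

def critique(answer, principle, rng):
    # Step 2: produce a critique-style score per principle.
    # A real judge model reasons in text; we simulate a noisy 1-10 score.
    base = 8 if principle.split()[0] in answer.lower() else 6
    return base + rng.uniform(-1, 1)

def judge_once(question, answer, rng):
    # One full evaluation: derive principles, score against each, average.
    principles = generate_principles(question)
    return statistics.mean(critique(answer, p, rng) for p in principles)

def grm_reward(question, answer, samples=8, seed=0):
    # Inference-time scaling: sample several independent evaluations
    # and aggregate them, instead of relying on one bigger model.
    rng = random.Random(seed)
    return statistics.mean(judge_once(question, answer, rng)
                           for _ in range(samples))

reward = grm_reward("What is 2+2?", "The answer is 4.")
print(round(reward, 2))
```

The aggregated reward would then serve as the positive or negative feedback signal that lets the model refine its own answers over time.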
The researchers claim that this method outperforms competing models such as Google's Gemini, Meta's Llama, and OpenAI's GPT-4o. The company plans to release these advanced AI models as open source, but it has given no specific date. The paper's publication has fueled speculation that DeepSeek is preparing to introduce its next-generation chatbot, R2, though the company has made no official statement on the matter so far.