Close Menu

    Subscribe to Updates

    Get the latest Tech news from SynapseFlow

    What's Hot

    Laptop performance and FPS drop after BIOS update

    March 14, 2026

    How to upgrade your car’s old audio system to work with Android Auto and Apple CarPlay

    March 14, 2026

    US Destroys All Military Targets on Kharg Island Which Is Iran’s Oil Export Hub

    March 14, 2026
    Facebook X (Twitter) Instagram
    • Homepage
    • About Us
    • Contact Us
    • Privacy Policy
    Facebook X (Twitter) Instagram YouTube
    synapseflow.co.uksynapseflow.co.uk
    • AI News & Updates
    • Cybersecurity
    • Future Tech
    • Reviews
    • Software & Apps
    • Tech Gadgets
    synapseflow.co.uksynapseflow.co.uk
    Home»AI News & Updates»Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale
    Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale
    AI News & Updates

    Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

    The Tech GuyBy The Tech GuyOctober 25, 2025No Comments4 Mins Read0 Views
    Share
    Facebook Twitter LinkedIn Pinterest Email
    Advertisement



    Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

    China’s Ant Group, an affiliate of Alibaba, detailed technical information around its new model, Ring-1T, which the company said is “the first open-source reasoning model with one trillion total parameters.”

    Advertisement

    Ring-1T aims to compete with other reasoning models like GPT-5 and the o-series from OpenAI, as well as Google’s Gemini 2.5. With the new release of the latest model, Ant extends the geopolitical debate over who will dominate the AI race: China or the US. 

    Ant Group said Ring-1T is optimized for mathematical and logical problems, code generation and scientific problem-solving. 

    “With approximately 50 billion activated parameters per token, Ring-1T achieves state-of-the-art performance across multiple challenging benchmarks — despite relying solely on natural language reasoning capabilities,” Ant said in a paper.

    Ring-1T, which was first released on preview in September, adopts the same architecture as Ling 2.0 and trained on the Ling-1T-base model the company released earlier this month. Ant said this allows the model to support up to 128,000 tokens.

    To train a model as large as Ring-1T, researchers had to develop new methods to scale reinforcement learning (RL).

    New methods of training

    Ant Group developed three “interconnected innovations” to support the RL and training of Ring-1T, a challenge given the model's size and the typically large compute requirements it entails. These three are IcePop, C3PO++ and ASystem.

    IcePop removes noisy gradient updates to stabilize training without slowing inference. It helps eliminate catastrophic training-inference misalignment in RL. The researchers noted that when training models, particularly those using a mixture-of-experts (MoE) architecture like Ring-1T, there can often be a discrepancy in probability calculations. 

    “This problem is particularly pronounced in the training of MoE models with RL due to the inherent usage of the dynamic routing mechanism. Additionally, in long CoT settings, these discrepancies can gradually accumulate across iterations and become further amplified,” the researchers said. 

    IcePop “suppresses unstable training updates through double-sided masking calibration.”

    The next new method the researchers had to develop is C3PO++, an improved version of the C3PO system that Ant previously established. The method manages how Ring-1T and other extra-large parameter models generate and process training examples, or what they call rollouts, so GPUs don’t sit idle. 

    The way it works would break work in rollouts into pieces to process in parallel. One group is the inference pool, which generates new data, and the other is the training pool, which collects results to update the model. C3PO++ creates a token budget to control how much data is processed, ensuring GPUs are used efficiently.

    The last new method, ASystem, adopts a SingleController+SPMD (Single Program, Multiple Data) architecture to enable asynchronous operations.  

    Benchmark results

    Ant pointed Ring-1T to benchmarks measuring performance in mathematics, coding, logical reasoning and general tasks. They tested it against models such as DeepSeek-V3.1-Terminus-Thinking, Qwen-35B-A22B-Thinking-2507, Gemini 2.5 Pro and GPT-5 Thinking. 

    In benchmark testing, Ring-1T performed strongly, coming in second to OpenAI’s GPT-5 across most benchmarks. Ant said that Ring-1T showed the best performance among all the open-weight models it tested. 

    The model posted a 93.4% score on the AIME 25 leaderboard, second only to GPT-5. In coding, Ring-1T outperformed both DeepSeek and Qwen.

    “It indicates that our carefully synthesized dataset shapes Ring-1T’s robust performance on programming applications, which forms a strong foundation for future endeavors on agentic applications,” the company said. 

    Ring-1T shows how much Chinese companies are investing in models 

    Ring-1T is just the latest model from China aiming to dethrone GPT-5 and Gemini. 

    Chinese companies have been releasing impressive models at a quick pace since the surprise launch of DeepSeek in January. Ant's parent company, Alibaba, recently released Qwen3-Omni, a multimodal model that natively unifies text, image, audio and video. DeepSeek has also continued to improve its models and earlier this month, launched DeepSeek-OCR. This new model reimagines how models process information. 

    With Ring-1T and Ant’s development of new methods to train and scale extra-large models, the battle for AI dominance between the US and China continues to heat up.   

    Advertisement
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    The Tech Guy
    • Website

    Related Posts

    Railway secures $100 million to challenge AWS with AI-native cloud infrastructure

    January 22, 2026

    Claude Code costs up to $200 a month. Goose does the same thing for free.

    January 20, 2026

    Listen Labs raises $69M after viral billboard hiring stunt to scale AI customer interviews

    January 16, 2026

    Salesforce rolls out new Slackbot AI agent as it battles Microsoft and Google in workplace AI

    January 13, 2026

    Converge Bio raises $25M, backed by Bessemer and execs from Meta, OpenAI, Wiz

    January 13, 2026

    Anthropic launches Cowork, a Claude Desktop agent that works in your files — no coding required

    January 13, 2026
    Leave A Reply Cancel Reply

    Advertisement
    Top Posts

    The iPad Air brand makes no sense – it needs a rethink

    October 12, 202516 Views

    ChatGPT Group Chats are here … but not for everyone (yet)

    November 14, 20258 Views

    Facebook updates its algorithm to give users more control over which videos they see

    October 8, 20258 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Advertisement
    About Us
    About Us

    SynapseFlow brings you the latest updates in Technology, AI, and Gadgets from innovations and reviews to future trends. Stay smart, stay updated with the tech world every day!

    Our Picks

    Laptop performance and FPS drop after BIOS update

    March 14, 2026

    How to upgrade your car’s old audio system to work with Android Auto and Apple CarPlay

    March 14, 2026

    US Destroys All Military Targets on Kharg Island Which Is Iran’s Oil Export Hub

    March 14, 2026
    categories
    • AI News & Updates
    • Cybersecurity
    • Future Tech
    • Reviews
    • Software & Apps
    • Tech Gadgets
    Facebook X (Twitter) Instagram Pinterest YouTube Dribbble
    • Homepage
    • About Us
    • Contact Us
    • Privacy Policy
    © 2026 SynapseFlow All Rights Reserved.

    Type above and press Enter to search. Press Esc to cancel.

    Ad Blocker Enabled!
    Ad Blocker Enabled!
    Our website is made possible by displaying online advertisements to our visitors. Please support us by disabling your Ad Blocker.