1.5 trillion parameters: xAI's Grok V9 Medium completes training, and the AI programming race gains a heavyweight new player

2026-05-28

Global competition among large AI models continues to heat up. On May 25, 2026, Elon Musk announced that xAI's next-generation foundation model, Grok V9 Medium, has completed pre-training and is about to enter fine-tuning ahead of its launch.

This ultra-large model, positioned as a professional programming model, has three core features: 1.5 trillion parameters, specialized training on Cursor code data, and deep optimization for NVIDIA's Blackwell architecture. It moves directly into the code assistant and developer tools market and is expected to reshape the current AI programming landscape.

Core parameters and positioning: A threefold scale-up aimed directly at complex programming scenarios

Grok V9 Medium is a mid-to-high-end flagship model that xAI is building for production environments. Its core parameters and positioning are as follows.

Model Name: Grok V9 Medium

Parameter scale: 1.5 trillion (1.5T)

Previous generation comparison: Grok V8 Small, the main version currently in production, has 0.5 trillion parameters, so V9 Medium is three times its size

Core scenarios: Complex code understanding, engineering-grade programming, multi-file project reasoning, automatic bug localization and fixing

Current stage: Pre-training completed; supervised fine-tuning (SFT) under way; reinforcement learning (RL) to start within a few days

Release time: Expected to open to the public within 2 to 3 weeks, most likely with an official launch in mid-June
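
As a quick check of the figures above, the following minimal Python sketch computes the scale ratio and the release window they imply. It assumes the 2-to-3-week window is counted from the May 25, 2026 announcement, which the article does not state explicitly; the variable names are illustrative.

```python
from datetime import date, timedelta

# Figures quoted in the article, in trillions of parameters.
V8_SMALL_PARAMS_T = 0.5
V9_MEDIUM_PARAMS_T = 1.5

# Assumption: the 2-to-3-week window starts at the announcement date.
ANNOUNCED = date(2026, 5, 25)

scale_ratio = V9_MEDIUM_PARAMS_T / V8_SMALL_PARAMS_T
earliest = ANNOUNCED + timedelta(weeks=2)
latest = ANNOUNCED + timedelta(weeks=3)

print(f"Scale ratio: {scale_ratio:.1f}x")
print(f"Implied release window: {earliest} to {latest}")
```

Under that assumption the window runs from June 8 to June 15, 2026, which is consistent with a mid-June launch.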

Musk has said publicly that V8 Small, which currently serves all production traffic, has clear shortcomings in the quality, diversity, and mix of its code training data. V9 Medium is an engineering-focused large model designed to address that gap.

Technological breakthrough one: Specialized training on Cursor code data to create an AI-native engineer

The most disruptive design choice in Grok V9 Medium is the large-scale use of real engineering data from Cursor during the supplementary training phase, which sets the capabilities of top programming tools as its benchmark.

Why choose Cursor data? Cursor is currently a mainstream AI programming editor and has accumulated a large body of real-world engineering usage patterns, debugging paths, and refactoring logic. The training data covers code completion, syntax across multiple programming languages, understanding of complex project structures, automated testing, and error fixing, so the model learns directly from how professional developers actually work rather than from general corpora alone.

This means that Grok V9 Medium is no longer just a large model that can write code. It is meant to be an AI assistant with practical engineering ability: one that can deeply understand large code repositories, cross-file dependencies, and complex business logic, and that better fits real R&D scenarios.

Technological breakthrough two: Deep adaptation to the NVIDIA Blackwell architecture, significantly improving computing efficiency

To train and serve 1.5T parameters efficiently, xAI has deeply optimized the underlying architecture of Grok V9 Medium and customized it for NVIDIA Blackwell GPU systems.
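
To give a sense of why hardware-level optimization matters at this scale, here is a back-of-the-envelope Python sketch of the memory needed just to hold 1.5T parameters. The numeric precisions and the per-GPU memory figure are illustrative assumptions, not details from the announcement, and the estimate ignores activations, optimizer state, and KV cache.

```python
import math

PARAMS = 1.5e12  # 1.5 trillion parameters, as quoted in the article

# Assumption: common storage precisions; the article does not say which is used.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

# Hypothetical per-GPU memory in GB, used only for illustration.
GPU_MEMORY_GB = 192


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def min_gpus_for_weights(params: float, bytes_per_param: int, gpu_gb: float) -> int:
    """Lower bound on GPUs needed to hold the weights, ignoring all overhead."""
    return math.ceil(weight_memory_gb(params, bytes_per_param) / gpu_gb)


for name, nbytes in BYTES_PER_PARAM.items():
    gb = weight_memory_gb(PARAMS, nbytes)
    gpus = min_gpus_for_weights(PARAMS, nbytes, GPU_MEMORY_GB)
    print(f"{name}: {gb:,.0f} GB of weights, at least {gpus} GPUs")
```

Even at one byte per parameter, the weights alone span several accelerators under these assumptions, which is why sharding and interconnect efficiency matter so much for a model of this size.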

The core improvements brought by this hardware co-design:

Computational efficiency is significantly higher. The Blackwell architecture surpasses Hopper in throughput, energy efficiency, and distributed training speed.

Inference cost is significantly lower. The per-token cost is reported to be about 90% lower than the previous generation, and the cost per million tokens is reported to fall to 1/35; the baseline for the 1/35 figure is not stated.

Distributed training is more stable. It supports parallel training at the scale of 10,000 GPUs, which suits fast iteration on ultra-large models.

Device and cloud work together. This lays the compute foundation for future deployment in Tesla vehicles, on the X platform, and in other scenarios.

This optimization targets a familiar industry problem: the more parameters a large model has, the more it costs and the slower it runs. Addressing it is what makes a 1.5T-class model viable for large-scale commercial use.
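
The two cost figures quoted in this section can be compared with a little arithmetic. The Python sketch below converts between a percentage reduction and a remaining-cost fraction. The function names are illustrative, and it assumes both figures are measured against the same baseline, which the article does not confirm.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the original cost left after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def reduction_pct(remaining: float) -> float:
    """Percentage reduction implied by a remaining-cost fraction."""
    return (1.0 - remaining) * 100.0


# "About 90% lower" leaves roughly one tenth of the original cost.
after_90 = remaining_fraction(90.0)

# "Decreased to 1/35" corresponds to a reduction of roughly 97%.
pct_for_1_35 = reduction_pct(1.0 / 35.0)

print(f"90% reduction -> {after_90:.3f} of original cost (about 1/{1 / after_90:.0f})")
print(f"1/35 of original cost -> {pct_for_1_35:.1f}% reduction")
```

Against a shared baseline, the two figures would describe different reductions (about 10x versus 35x), so they presumably refer to different baselines or metrics.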

Technological breakthrough three: An end-to-end training pipeline with a clear cadence from pre-training to deployment

Grok V9 Medium follows a rigorous, industrial-grade training process intended to keep its capabilities stable and consistent.

Pre-training: Full training of the 1.5T-parameter model is complete, with good evaluation results.

Supplementary training: Cursor code data is added to strengthen programming and engineering skills.

Supervised fine-tuning (SFT): Currently under way, aligning the model with human instructions and engineering conventions.

Reinforcement learning (RL): Starts within a few days, to further improve logical rigor and code reliability.

Public launch: Once SFT and RL are complete, the model is expected to be available within 2 to 3 weeks.

The process is highly transparent and follows a clear cadence, which makes it a typical delivery path for industrial-grade large models.
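
The stage sequence above can be written down as a small ordered structure. This is only an illustrative Python sketch: the names are hypothetical, and marking supplementary training as done is an assumption based on SFT already being under way, since the article gives no explicit status for that stage.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Status(Enum):
    DONE = "done"
    IN_PROGRESS = "in progress"
    PENDING = "pending"


Stage = Tuple[str, Status]

# Stage order as listed in the article; statuses as of the announcement.
# Assumption: supplementary training is treated as done because SFT has begun.
PIPELINE: List[Stage] = [
    ("pre-training", Status.DONE),
    ("supplementary training on Cursor code data", Status.DONE),
    ("supervised fine-tuning (SFT)", Status.IN_PROGRESS),
    ("reinforcement learning (RL)", Status.PENDING),
    ("public launch", Status.PENDING),
]


def current_stage(pipeline: List[Stage]) -> Optional[str]:
    """Return the first stage that is not yet done, or None if all are done."""
    for name, status in pipeline:
        if status is not Status.DONE:
            return name
    return None


def remaining_stages(pipeline: List[Stage]) -> List[str]:
    """Return every stage that is in progress or still pending, in order."""
    return [name for name, status in pipeline if status is not Status.DONE]


print("Current stage:", current_stage(PIPELINE))
print("Remaining:", ", ".join(remaining_stages(PIPELINE)))
```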

Industry impact: A reshuffle of the AI programming field and direct competition with mainstream code assistants

With the launch of Grok V9 Medium approaching, the AI code assistant market is set to enter a new round of intense competition.

Core differences from the previous model:

In terms of parameters, Grok V8 Small has 0.5T and Grok V9 Medium has 1.5T, three times as many.

In terms of training data, V8 Small relies mainly on general corpora, while V9 Medium combines general corpora with Cursor engineering code.

In terms of core competencies, V8 Small focuses on general Q&A and simple code, while V9 Medium focuses on complex programming, engineering understanding, and project-level reasoning.

In terms of hardware optimization, V8 Small uses generic hardware adaptation, while V9 Medium is deeply customized for the Blackwell architecture.

In terms of positioning, V8 Small is a general-purpose large model, while V9 Medium is a production-grade large model specialized in programming.

Impact on the industry landscape:

It competes directly with mainstream coding models and assistants such as Claude, DeepSeek, and GitHub Copilot.

It aims to break into the high-end programming market on the strength of three advantages: 1.5T parameters, engineering data, and efficient compute.

It pushes large models away from general capabilities and toward deep specialization in vertical scenarios.

It lowers the barrier and the cost for enterprises and developers to use large-scale code models.

The completion of Grok V9 Medium's training marks xAI's formal shift from general-purpose large models toward vertical engineering domains. Its three core strengths, 1.5 trillion parameters, training on Cursor engineering data, and deep Blackwell adaptation, make it a highly competitive entrant in the current AI programming field.

For developers, this means more powerful, more stable, and more practical AI-assisted tools are on the way. For the industry, competition among large models is shifting from raw parameter counts toward scenario-specific capability and engineering efficiency.