India bets big on homegrown LLMs: the ambition to build a “Made in India” ChatGPT

  • India is pursuing its ambition to build homegrown large language models (LLMs) under the IndiaAI Mission, an initiative worth INR 10,037 crore (about USD 1.2 billion), with the goal of protecting data sovereignty and serving India’s specific languages and cultures.

  • The first four startups selected are Sarvam AI, Soket AI, Gnani.ai, and Gan.AI. More than 500 other proposals are awaiting approval, depending on the GPU compute capacity available.

  • Sarvam AI has launched Sarvam-1 (2B parameters) and Sarvam-M (24B), and is developing variants for mobile devices and real-time interactive applications.

  • Soket AI is building a 120B-parameter open-source model; a 7B version is due in six months. Gnani.ai is developing a 16B Voice AI model. Gan.AI is focusing on a 70B multilingual model for “superhuman” text-to-speech.

  • The government has committed GPU support; capacity now stands at 34,333 GPUs (15,916 of them newly added), but many of these GPUs are not yet operational.

  • Founders warn that high training costs and the shortage of advanced GPUs are the main barriers. Meanwhile, venture capital is flowing mostly into AI applications rather than foundation models.

  • Pranav Mistry (TWO.AI) argues that the issue is not just GPUs but the supporting infrastructure: fast data pipelines, low-level GPU access, and flexible training configurations.

  • Rather than competing head-on with ChatGPT or LLaMA, India should focus on vertical models: specialised AI for healthcare, agriculture, education, and public administration, built with local context and languages.


📌 India is entering the AI race with the goal of building sovereign LLMs that serve its own languages and needs. Although the GPU shortage remains a major barrier, four leading startups have begun developing models ranging from 2B to 120B parameters. If it succeeds, India will not just use AI but shape AI on its own terms, serving everything from rural hospitals to government systems.

https://inc42.com/features/inside-indias-high-stakes-bet-to-build-its-own-gpt/

Inside India’s High-Stakes Bet To Build Its Own GPT

22 Jun'25 | By Tapanjana Rudra
 
SUMMARY
• In India’s bid to build the country’s Sovereign LLMs, four AI startups have received approvals of their proposals from the Indian government, while more than 500 other proposals are yet to receive a nod.
• The Indian government is still on the fence when it comes to deciding the full scope of its collaborations with AI startups. However, it’s clear that each of these companies will get computational support, or access to graphics processing units (GPUs), for training their models.
• Many startup founders agree that the availability of GPUs, particularly the latest GPUs, is the biggest challenge India faces right now, and the Centre is trying to bridge this demand-supply gap.

With a collaborative push from the Centre and four generative AI startups (Sarvam AI, Soket AI, Gnani.ai, and Gan.AI), India is not far from launching its frontier AI models.
Until last year, India’s tech community had been debating whether the country should develop its very own foundational large language models (LLMs). 
Many voices, from technocrat Nandan Nilekani to tech startup founders such as CRED’s Kunal Shah and several VCs, have questioned the viability of splurging on building desi foundational LLMs.
“Let the big boys in the (Silicon) Valley do it, spending billions of dollars. We will use it to create synthetic data, build small language models quickly, and train them using appropriate data…” Nilekani said last year.  
However, the emergence of DeepSeek-R1, a foundational model developed by the Chinese company DeepSeek for under $6 Mn, challenged this notion in January this year. 
With DeepSeek suggesting that cost need no longer be a stumbling block, industry experts changed their pitch and now call homegrown LLMs a pressing requirement.
The Centre, too, saw an opportunity and finally warmed to the idea of building Sovereign AI models to maintain data sovereignty, cater to the country’s diverse languages and cultures, and make India part of the global AI revolution.
It announced plans to build the country’s own LLM as part of the INR 10,037 Cr IndiaAI Mission towards the end of January. More recently, it shortlisted Soket AI, Gnani.ai, and Gan.AI to build India-specific foundational LLMs.
Now that the country has selected its AI cavalry, two questions come to mind: what are we developing, and how far have we come in living our Indic LLM dream? This is precisely what we will try to understand today.

So, What’s Being Served At India’s Big AI Feast?

While Soket AI is building a 120 Bn parameter open-source text model (the first iteration expected to be ready in 12 months, after it launches a 7 Bn parameter model in six months), Gnani.ai is working on a 16 Bn parameter Voice AI foundational model (expected to be ready in six to eight months). 
 
Similarly, Gan.AI is creating a 70 Bn parameter multilingual foundation model targeting ‘Superhuman TTS (text-to-speech)’.
Sarvam AI, which was the first startup to be selected by the IndiaAI Mission in April, has launched Sarvam-1, a 2 Bn parameter model, and Sarvam-M, a 24 Bn parameter model. Sarvam-M is a hybrid model built on Mistral Small (a versatile model designed to handle a wide range of generative AI tasks) and designed with a focus on Indian languages and advanced reasoning capabilities.
As part of its LLM building process, the Peak XV-backed startup is developing three model variants:
  • Sarvam-Large for advanced reasoning and generation
  • Sarvam-Small for real-time interactive applications
  • Sarvam-Edge for compact on-device tasks 
While industry leaders claim to have made significant progress in the last few months, the Indian government is still on the fence when it comes to deciding the full scope of its collaborations with AI startups.
However, one major aspect is clear: each of these companies will get computational support, or access to graphics processing units (GPUs), for training their models. This would lower the cost of building LLMs.
Soket AI founder and CEO Abhishek Upperwal told Inc42 that its proposal to the government had two key facets — GPU support and a grant request of INR 14.5 Cr.
“GPU support is one thing, but we will still require a lot of money for curating, clearing, and training datasets. That’s why we asked for a small cash component. We don’t yet have a proper sanction letter stating that we will get the cash. My assumption is that the Centre will allow us GPU support… The cash part is uncertain,” he said.
Gnani.ai founder and CEO Ganesh Gopalan, too, is keeping his expectations in check; half a loaf is better than none, after all. However, he is confident about receiving computational support.
While Gopalan did not reveal what else the company’s proposal to the IndiaAI Mission entailed, he said that the government is looking to solve two major problems as part of these collaborations — making GPUs more accessible and bringing down the cost of building LLMs.

India And GPUs: Are We Still Playing Catch Up?

Along with inviting applications from startups to build foundational LLMs, the Centre had also floated tenders for companies to provide GPU support.
On May 30, when the government announced the names of the three beneficiaries — Soket AI, Gnani.ai and Gan.AI — Union minister Ashwini Vaishnaw said that the country’s compute capacity had crossed 34,000 GPUs. 
At least 15,916 GPUs were added to the existing 18,417 empanelled GPUs. However, these new GPUs are yet to go live.
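The capacity figures above can be checked with a quick tally. The snippet below is only an illustrative sketch: the function name `compute_capacity` is hypothetical, and treating all 18,417 previously empanelled GPUs as live is an assumption, since the article only says that the newly added ones are yet to go live.

```python
# GPU counts quoted in the article.
EXISTING_EMPANELLED_GPUS = 18_417
NEWLY_ADDED_GPUS = 15_916  # yet to go live, per the article


def compute_capacity(existing: int, added: int) -> dict:
    """Return the total GPU count and the share that is not yet live.

    Assumption: every previously empanelled GPU is counted as live.
    """
    total = existing + added
    return {
        "total": total,
        "live": existing,
        "pending_share": added / total,
    }


capacity = compute_capacity(EXISTING_EMPANELLED_GPUS, NEWLY_ADDED_GPUS)
print(f"Total GPUs: {capacity['total']:,}")
print(f"Share not yet live: {capacity['pending_share']:.1%}")
```

Under these assumptions the total comes to 34,333 GPUs, consistent with the minister's "crossed 34,000" statement, with a little under half of that capacity still waiting to go live.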
In the current scheme of things, computational support becomes imperative to keep the development costs low, especially when VCs are more interested in funding AI applications and not foundational models due to their longer turnaround times and hefty investments.  
According to an Inc42 report, VCs have so far invested $1.2 Bn in AI applications compared to a mere $120 Mn in foundational models. 
Now, with the government allowing access to GPUs, the costs may significantly reduce, giving private funding a much-needed shot in the arm.
Despite this rosy AI narrative, the ground reality is messier. Why?
By April 30, the Centre had received 506 proposals, and the number of companies it finally selects will depend on the GPU capacity India is able to provide.
Aakrit Vaish, former Haptik CEO and ex-advisor to the IndiaAI Mission, is wary of a situation where the Centre approves proposals from 12-13 companies and the existing compute capacity gets exhausted. 
“The compute capacity, too, has to come in on time to support proposals,” he said, adding that six to 10 more startups are likely to receive approvals by the end of 2025.
Besides, he said that allocations within the overall INR 10,000 Cr+ IndiaAI Mission budget depend on multiple factors, which will ultimately decide how much the Centre spends on GPUs and how much on building LLMs this year.

Why Are GPUs Still The Bottleneck For India?

Traditionally, compute makes up around 90% of the total cost of training LLMs. With compute requirements rising exponentially, training a single frontier model could cost over $1 Bn by 2027, according to projections by Epoch AI. This is one of the biggest challenges for India when it comes to building frontier models. 
However, despite the high cost, there is no dearth of demand, which has created a supply shortage of GPUs.
“GPUs are as valuable as gold. With some top global companies like OpenAI taking up the largest chunks of GPU clusters, there has been a struggle to get GPUs in markets like India, particularly, the latest GPUs,” said Gnani.ai’s Gopalan.
According to him, over the next 10 years or so, the number of AI components a nation has or produces will determine its standing on the global stage.
“Many companies are looking at alternatives to GPUs, but you can’t build anything great in AI without these chips,” he said.
However, former IndiaAI Mission advisor Vaish views it through a demand-and-supply lens.
“For a long time, there were many GPUs available in the market, but not enough companies to use them. Now, there has been a sudden spike in demand for building LLMs. Right now, the Indian government is trying to bridge that supply gap,” he said.
But expanding its GPU stack may not be the only challenge before India, believes Pranav Mistry, founder and CEO of TWO.AI. TWO has built an Indic-language model and owns a GPU cluster. It is also one of the 500+ startups that have submitted a proposal to IndiaAI for building models for India.
“I don’t think compute is the limiting factor anymore. With new training techniques, you can now train state-of-the-art models with a tenth of the compute that was needed two years ago. We don’t necessarily need more compute, rather we need better supporting infrastructure: fast data pipelines, low-level GPU access, flexible training configurations. That’s where government support can make a bigger impact,” Mistry said.
[Chart: US leads the show in frontier model building]

What’s Next For India’s Sovereign AI Efforts?

Given that India is already late to the AI race, what it can do now is remain steadfast in rolling out its first model. Once that model is launched, bigger developments will follow. Experts believe India must focus on building smaller, vertical models.
“If companies try to compete with the likes of Llama or ChatGPT, it would be very expensive, while smaller, vertical LLMs would comparatively be cheaper. So, the stakeholders need to think along those lines,” Vaish said. 
TWO.AI’s Mistry sees this as a once-in-a-lifetime opportunity for India to shape the future of fundamental technologies tailored for India.
For him, Sovereign AI should mean more than just owning infrastructure. It should be about shaping the intelligence that powers the future of Indians — from small clinics in rural India to the government.
“We can’t afford to be passive users of someone else’s models; we need to lead. But leading doesn’t mean building just another massive LLM. Now, the real next opportunity lies in building the next generation of models — models that go beyond just text and voice. Think world models, systems that understand and reason, models aligned with sectors like agriculture, healthcare, education, governance — built for our realities, in our languages, with our context.”
All in all, India’s AI moment is no longer a distant dream, even though the road towards an AI-ready India is still under construction. Building its own models allows India to create AI that understands not just language but also the complex realities of the country.
If India’s LLM dream is realised, we won’t be playing by the rules of the West, but rather setting our own AI feast with the menu of our choice.
