DeepSeek, a relatively new AI company from China, has been making waves in the tech world, challenging the dominance of established US giants like OpenAI, Google and Meta. Founded in May 2023 by Liang Wenfeng, who also founded the quantitative hedge fund High-Flyer, DeepSeek is funded entirely by High-Flyer. This unusual arrangement frees it from the pressures of external investors and has let it pursue ambitious, long-term research projects.
DeepSeek’s team is primarily made up of young, talented graduates from top Chinese universities, and the company prioritises technical skills over traditional work experience. This approach has fostered a culture of innovation, allowing DeepSeek to develop cutting-edge AI models with remarkable efficiency.
Key Innovations and Models
DeepSeek began its journey with the release of DeepSeek Coder in November 2023, an open-source model for coding tasks. This was followed by DeepSeek LLM, and then DeepSeek-V2, which gained attention for its strong performance and low cost, sparking a price war in the Chinese AI market.
DeepSeek’s models include:
DeepSeek Coder: An open-source model designed for coding tasks.
DeepSeek LLM: A general-purpose large language model, released in 7 billion and 67 billion parameter versions, built to compete with leading open models of the time.
DeepSeek-V2: A 236 billion parameter mixture-of-experts model activating roughly 21 billion parameters per token, known for strong performance at low cost.
DeepSeek-Coder-V2: An advanced coding model with 236 billion parameters and a 128,000-token context window, designed for complex coding challenges.
DeepSeek-V3: A 671 billion parameter mixture-of-experts model (roughly 37 billion parameters active per token), noted for its impressive performance while requiring far fewer training resources than its peers.
DeepSeek-R1: Released in January 2025, this model focuses on reasoning tasks and challenges OpenAI’s o1 on maths, coding and reasoning benchmarks.
DeepSeek-R1-Distill: A family of smaller models (1.5 billion to 70 billion parameters) based on open-weight Llama and Qwen checkpoints, fine-tuned on synthetic reasoning data generated by R1, offering different trade-offs between performance and efficiency.
DeepSeek uses several innovative techniques in their model development:
Reinforcement Learning: Rather than relying on supervised fine-tuning, DeepSeek trains reasoning behaviour through trial-and-error reinforcement learning; DeepSeek-R1-Zero was trained this way with no supervised stage at all, and the released DeepSeek-R1 adds only a small supervised ‘cold start’ before RL. A sketch of the advantage computation behind its GRPO algorithm follows this list.
Mixture-of-Experts (MoE) Architecture: This allows models to activate only a small fraction of their parameters for each token, significantly reducing computational cost while keeping total model capacity high (see the routing sketch below).
Multi-Head Latent Attention: Introduced in DeepSeek-V2 and carried into DeepSeek-V3, this compresses the attention keys and values into a small per-token latent vector, sharply shrinking the key-value cache that dominates memory use during inference (illustrated in the third sketch below).
Distillation: DeepSeek transfers the knowledge of larger models into smaller ones by fine-tuning them on the larger model’s outputs, making powerful AI accessible to more users and devices (the final sketch below shows the pattern).
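To make the reinforcement-learning point concrete, here is a minimal sketch of the group-relative advantage computation at the heart of GRPO, the algorithm DeepSeek describes for R1’s training. The group size and reward values are invented for illustration; a real pipeline would feed these advantages into a clipped policy-gradient update over the model’s token probabilities.

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: each sampled answer is
    scored against the mean of its own group, so no separate learned
    value network (critic) is needed."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)

# Hypothetical example: four answers sampled for one maths prompt,
# rewarded 1.0 if the final answer is correct and 0.0 otherwise.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # ≈ [+1, -1, -1, +1]
```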
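The mixture-of-experts idea fits in a short sketch. The toy router below is not DeepSeek’s implementation: the dimensions, expert count and k=2 are invented, and DeepSeek-V3’s production MoE adds shared experts and an auxiliary-loss-free load-balancing scheme that this omits. The point is structural: only the selected experts run for each token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy mixture-of-experts layer: a router scores every expert per
    token, but only the top-k experts actually execute, so most of the
    layer's parameters sit idle for any given token."""
    def __init__(self, dim=64, num_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):                      # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalise over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):             # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(TopKMoE()(x).shape)  # torch.Size([10, 64]); 2 of 8 experts used per token
```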
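Multi-head latent attention is easiest to see as a compression step, shown in the heavily simplified sketch below with invented dimensions. Only the small latent vector would be cached during generation; the real mechanism in DeepSeek-V2 and V3 also involves per-head projections and rotary position embeddings, which are omitted here.

```python
import torch
import torch.nn as nn

class LatentKV(nn.Module):
    """Toy version of the core MLA idea: keys and values are jointly
    compressed into one small latent per token, and only that latent
    needs to be kept in the KV cache."""
    def __init__(self, dim=64, latent_dim=16):
        super().__init__()
        self.down = nn.Linear(dim, latent_dim)  # compress once per token
        self.up_k = nn.Linear(latent_dim, dim)  # recover keys at attention time
        self.up_v = nn.Linear(latent_dim, dim)  # recover values at attention time

    def forward(self, h):             # h: (seq_len, dim) hidden states
        latent = self.down(h)         # (seq_len, 16): the only tensor cached
        return latent, self.up_k(latent), self.up_v(latent)

h = torch.randn(128, 64)
latent, k, v = LatentKV()(h)
print(latent.shape)  # 16 floats cached per token instead of 2 * 64 for K and V
```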
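Finally, the R1-Distill recipe as DeepSeek describes it is plain supervised fine-tuning: the large model generates reasoning traces, and the smaller student is trained on them with the ordinary next-token loss, rather than the classic logit-matching form of distillation. The sketch below shows that pattern in miniature; the two-layer ‘student’, the vocabulary size and the random ‘teacher-generated’ tokens are all invented stand-ins.

```python
import torch
import torch.nn.functional as F

def distill_step(student, optimizer, teacher_tokens):
    """One distillation-by-fine-tuning step: train the student with
    next-token cross-entropy on a sequence the teacher generated."""
    inputs, targets = teacher_tokens[:-1], teacher_tokens[1:]
    loss = F.cross_entropy(student(inputs), targets)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Toy stand-ins: in practice the teacher would be DeepSeek-R1 and the
# student a Llama or Qwen checkpoint, not a two-layer model.
vocab, dim = 100, 32
student = torch.nn.Sequential(torch.nn.Embedding(vocab, dim),
                              torch.nn.Linear(dim, vocab))
optimizer = torch.optim.SGD(student.parameters(), lr=0.1)
teacher_tokens = torch.randint(0, vocab, (64,))
print(distill_step(student, optimizer, teacher_tokens))  # initial loss ≈ ln(100)
```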
Cost-Efficiency and Open-Source Approach
DeepSeek is committed to cost-efficiency. By using reinforcement learning and efficient architectures like MoE, they have significantly reduced training costs. For example, DeepSeek reported training DeepSeek-V3 on roughly 2.8 million H800 GPU-hours, around $5.6 million at its quoted rental rate, a fraction of the cost of comparable models from Meta. DeepSeek’s API pricing is also substantially lower than its competitors’, making its models more accessible. The open-source nature of many of its models further increases cost-efficiency by removing licensing fees and encouraging community-driven development.
Impact on the AI Landscape
DeepSeek’s entry into the AI market has created significant competitive pressure on established companies, forcing them to lower prices and enhance their offerings. The company’s open-source approach is democratising access to advanced AI tools, encouraging innovation and collaboration. DeepSeek’s focus on efficiency highlights the importance of resource optimisation in AI, suggesting that high performance doesn’t always require massive resources.
The timing of DeepSeek’s recent product launches, particularly DeepSeek-R1, appears to be strategic, coinciding with geopolitical events, perhaps to challenge the perceived dominance of the US in AI. Hugging Face has launched the Open R1 project to replicate the DeepSeek-R1 training pipeline, further democratising access to AI development techniques.
Challenges Ahead
Despite its achievements, DeepSeek faces challenges, including a compute disadvantage compared to its US counterparts, especially given export controls on advanced chips. The company needs to build trust and recognition, and must maintain a rapid pace of development to stay ahead in a competitive landscape. Additionally, its models’ censorship of politically sensitive topics could limit global adoption.
The Future of DeepSeek
DeepSeek is undoubtedly a disruptive force in the AI landscape. Its innovative techniques, cost-efficient approach and focus on open-source collaboration have the potential to reshape the AI industry. The company’s progress will be closely watched as the AI race continues.