Is it really DeepSeek FTW?

So, DeepSeek just dropped their latest AI models, and while it’s exciting, there are some cautions to consider. Because of US export controls on advanced hardware, DeepSeek has been operating under a unique set of constraints that have forced them to get creative in their approach. This creativity seems to have yielded real progress in reducing the amount of hardware required both to train high-end models in reasonable timeframes and to run inference on those same models. If reality bears out the claims, this could be a sea change in the monetary and environmental costs of training and hosting LLMs.

In addition to the increased efficiency, DeepSeek’s R1 model continues to push the innovation curve for reasoning models. Models that follow this emerging chain-of-thought paradigm in their responses, explaining their thinking first and then summarizing it into an answer, are delivering a step change in response quality. Baking this pattern into the models instead of including it in the prompt is a serious innovation, especially when paired with RAG and a library of tools or actions in an agentic framework. We’re going to see even more open-source model vendors follow OpenAI and DeepSeek in this.
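For anyone wiring a reasoning model into an application, the "thinking first, answer second" shape usually means separating the two parts before showing anything to a user. Here is a minimal sketch in Python. It assumes the model wraps its reasoning in `<think>...</think>` tags ahead of the final answer; that tag format and the `split_reasoning` helper name are assumptions for illustration, so check what your serving stack actually returns.

```python
import re

# Assumption: reasoning arrives wrapped in <think>...</think> before the answer.
# Verify the delimiter against the output of your own deployment.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw model response.

    If no reasoning block is found, reasoning is an empty string and the
    whole response is treated as the answer.
    """
    match = THINK_BLOCK.search(raw)
    if match is None:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first reasoning block is treated as the answer.
    answer = (raw[: match.start()] + raw[match.end():]).strip()
    return reasoning, answer


sample = "<think>The user wants 2 + 2. That is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the reasoning separate lets you log it for review while passing only the answer downstream, for example to a tool call in an agentic workflow.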

Key Considerations

One of the key factors in considering the adoption of DeepSeek models will be your business’s data residency requirements. For now, self-managed private hosting is the only option for maintaining full US, EU, or UK data residency (the most common needs for our clients) with these new DeepSeek models. The same export restrictions limiting the hardware available to DeepSeek have also prevented OpenAI from offering their full services with comprehensive Chinese data residency. This makes DeepSeek a compelling offering for businesses needing an option within China. It remains to be seen whether the hyperscalers or other providers will offer DeepSeek models on their platforms. (Before I managed to get this published, Microsoft made a move and is now offering DeepSeek-R1 in Azure AI Foundry.) The good news is that the models are highly efficient, so self-hosting is feasible and not overly expensive for inference. The downside is managing provisioned capacity when workloads are uneven, which is why pay-per-token pricing is often the most cost-efficient.
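To make the provisioned-versus-pay-per-token trade-off concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (GPU hourly rate, GPU count, per-token price) is a placeholder chosen for easy arithmetic, not a real quote from any provider, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730.0  # average hours in a month


def provisioned_monthly_cost(gpu_hourly_rate: float, gpu_count: int,
                             hours: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping self-hosted capacity running, used or idle."""
    return gpu_hourly_rate * gpu_count * hours


def per_token_monthly_cost(tokens_per_month: float,
                           price_per_million: float) -> float:
    """Cost of a pay-per-token service for a given monthly volume."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(gpu_hourly_rate: float, gpu_count: int,
                      price_per_million: float) -> float:
    """Monthly token volume at which both options cost the same."""
    fixed = provisioned_monthly_cost(gpu_hourly_rate, gpu_count)
    return fixed / price_per_million * 1_000_000


# Placeholder inputs: 8 GPUs at 2.00 per hour vs. 2.00 per million tokens.
fixed_cost = provisioned_monthly_cost(gpu_hourly_rate=2.0, gpu_count=8)
threshold = break_even_tokens(gpu_hourly_rate=2.0, gpu_count=8,
                              price_per_million=2.0)
print(f"Provisioned cost per month: {fixed_cost:,.0f}")
print(f"Break-even volume: {threshold:,.0f} tokens per month")
```

Below the break-even volume, the provisioned hardware sits partly idle and pay-per-token comes out cheaper; above it, self-hosting starts to pay off. Uneven workloads tend to pull average usage below that line, which is the point made above.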

We expect these new models and their reduced prices to put serious downward pressure on per-token costs for other models hosted by the hyperscalers. We’ll be paying particular attention to Microsoft as they continue to diversify their offerings beyond OpenAI, especially given their decision to make DeepSeek-R1 available. We also expect to see US-based firms replicate DeepSeek’s successes, especially given that Hugging Face has already started work in their Open R1 project to take the research behind DeepSeek’s announcements and make it fully open source.

What to Do Now

This is a definite leap forward, and it moves in the direction of what we have long said is the destination: more and smaller models targeted at specific use cases. For now, when looking at our clients, we advise a healthy dose of “wait and see.” As has been the case for the last three years, this technology is evolving rapidly, and we expect further developments from other vendors in the near future. Our perpetual reminder to our clients is that security and privacy always outweigh marginal cost savings in the long run.

The comprehensive FAQ from Stratechery is a great resource for more information.
