Reasons Smaller Generative Models Are Gaining Adoption


Summary

More companies are adopting smaller generative AI models due to their cost efficiency, faster processing times, and ability to handle targeted tasks effectively. These models, while less complex than their larger counterparts, are proving capable of meeting enterprise needs without breaking the bank.

  • Focus on cost control: Smaller models significantly reduce expenses, making them an attractive choice for businesses aiming to manage AI-related costs while still achieving strong performance.
  • Prioritize speed: These models deliver lower latency, providing quicker results and an improved user experience compared to larger models.
  • Customize for precision: Smaller models can be fine-tuned for specific tasks or domains, offering tailored solutions that enhance productivity in areas like customer service or content generation.
Summarized by AI based on LinkedIn member posts
  • Matthew Lynley

    Data analyst and writer with 15 years of experience across multiple industries.

    5,356 followers

    I’ve been asking the dozens of people I’ve talked to in the past few months the same thing: how have the questions enterprises are asking about AI changed in the last 6-9 months? As it turns out, they have: instead of just jumping on the hype train, they’re being more careful and deliberate about what they are looking for.

    In the last nine months we’ve seen an incredible race to the bottom on API pricing alongside ever-better top-end model performance. And now the enterprises I speak with have determined they can’t just slap a wrapper on OpenAI and call it a day. Instead they are now thinking on multi-year horizons: figuring out what the infrastructure they need looks like for more advanced use cases like RAG and, most importantly, how to keep costs under control. In many cases, these companies saw their first bill from OpenAI at the front of the wave and immediately got sticker shock. That’s made them, for example, evaluate smaller models with a focus on making them work with proprietary data. Even the difficulty of putting these smaller models together now looks much more manageable compared with the ease of use of the APIs, including OpenAI’s and Gemini’s fine-tuning products. (Though, to be clear, those products very much have the eyes of enterprises.)

    Part of all this is also a recognition that the fastest pathway to a return on investment is saving costs by automating much smaller, contained tasks, and doing that a lot. It means customizing a smaller or cheaper model through fine-tuning or RAG, for example, to just summarize sales calls and extract information. Another example that comes up often is managing the flow of customer service tickets and deciding when to escalate.

    Based on all these conversations, it feels like the era of thinking you could wave an OpenAI-shaped wand and generate completely new lines of business or completely replace an entire class of workers is effectively gone. But at the same time, there’s enough signal that companies aren’t just throwing the idea of deploying a generative AI tool into the trash or writing it off as an experiment. #ai #bigdata #openai https://lnkd.in/gfrmrjmT
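The ticket-triage pattern the post describes can be sketched in a few lines. This is a minimal illustration only: `classify_ticket` is a hypothetical stand-in for a fine-tuned small model (in production it would be an actual model call), and the confidence threshold is an assumed value you would tune per workload.

```python
from dataclasses import dataclass

@dataclass
class Classification:
    category: str
    confidence: float

def classify_ticket(text: str) -> Classification:
    # Stubbed for illustration: a fine-tuned small model would
    # return a predicted category plus a confidence score here.
    lowered = text.lower()
    if "refund" in lowered:
        return Classification("billing", 0.92)
    if "crash" in lowered or "error" in lowered:
        return Classification("technical", 0.88)
    return Classification("general", 0.40)

ESCALATION_THRESHOLD = 0.75  # assumed cutoff; tune per workload

def route_ticket(text: str) -> str:
    # Low-confidence predictions escalate to a human agent;
    # confident ones flow into an automated queue.
    result = classify_ticket(text)
    if result.confidence < ESCALATION_THRESHOLD:
        return "escalate-to-human"
    return f"auto-queue:{result.category}"
```

The point is the shape of the decision, not the stubbed rules: a small, cheap, task-specific classifier handles the routine volume, and anything it is unsure about falls through to a person.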

  • Tomasz Tunguz (Influencer)
    402,360 followers

    77% of enterprise AI usage is built on small models, those with fewer than 13 billion parameters. Databricks, in its annual State of Data + AI report, published this survey, which among other interesting findings indicated that large models, those with 100 billion parameters or more, now represent about 15% of implementations.

    In August, we asked enterprise buyers What Has Your GPU Done for You Today? They expressed concern about the ROI of some of the larger models, particularly in production applications. Pricing from a popular inference provider shows a geometric increase in price as a function of a model’s parameter count.

    But there are other reasons aside from cost to use smaller models. First, their performance has improved markedly, with some of the smaller models nearing their big brothers’ success. The delta in cost means smaller models can be run several times to verify, like an AI Mechanical Turk. Second, the latencies of smaller models are half those of the medium-sized models & 70% less than the mega models. Higher latency is an inferior user experience. Users don’t like to wait.

    Smaller models represent a significant innovation for enterprises: similar performance at two orders of magnitude less expense and half the latency. No wonder builders view them as small but mighty.

    Note: I’ve abstracted away the additional dimension of mixture-of-experts models to make the point clearer. There are different ways of measuring latency, whether it’s time to first token or inter-token latency.
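The "run it several times to verify" idea mentioned above is essentially a majority vote over repeated samples. A minimal sketch, assuming a hypothetical `ask_model` callable standing in for a cheap small-model API (the noisy stub below is invented for illustration):

```python
from collections import Counter
from typing import Callable

def majority_answer(ask_model: Callable[[str], str],
                    prompt: str, runs: int = 5) -> str:
    # Query the cheap model several times and keep the most
    # common answer; affordable only because each call is cheap.
    answers = [ask_model(prompt) for _ in range(runs)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Hypothetical small model: right most of the time, but noisy.
_responses = iter(["4", "4", "5", "4", "4"])
def noisy_small_model(prompt: str) -> str:
    return next(_responses)

print(majority_answer(noisy_small_model, "What is 2 + 2?"))  # prints 4
```

If a small model costs two orders of magnitude less per call, even five verification runs stay roughly 20x cheaper than a single call to a frontier model.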

  • Vin Vashishta (Influencer)

    AI Strategist | Monetizing Data & AI For The Global 2K Since 2012 | 3X Founder | Best-Selling Author

    204,276 followers

    Microsoft’s small model AI strategy is paying off, putting it miles ahead of other hyperscalers. The company is slowing the pace of its data center build-outs and will reap the benefits of faster deployments and higher AI product margins for years.

    Smaller generative AI models have between 250 million and 8 billion parameters, while larger models like GPT and Claude have hundreds of billions or even trillions. The size difference creates an equally significant cost difference.

    Small models target a skill or domain. By chaining these models together, the AI platform can support a range of user intents, processes, and workflows. It also better aligns with the agentic AI model. Skill models support tasks like generating marketing content or recommending products and solutions during sales calls. Domain models target more granular expertise, like building marketing content specifically for social media.

    Last year, Microsoft acqui-hired Inflection’s founder, Mustafa Suleyman, and put him in charge of its advanced AI division to guide the pivot from OpenAI’s one-massive-model approach to a more efficient small model AI strategy. It’s an excellent case study in the benefits of revenue-centric AI. Microsoft realized earlier than most that the costs of a one-massive-model approach scale faster than its returns. It followed the foundational tenets of technical strategy and pivoted to more revenue-centric AI product implementations.

    Salesforce has also embraced the small model AI strategy. When I talked to its Chief Scientist, Silvio Savarese, he indicated that Agentforce is also built on smaller, domain- and skill-specific AI models. Other hyperscalers are still stuck in a revenue-agnostic AI strategy, putting them at a massive disadvantage.
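The chaining pattern described above, routing each user intent to a skill-specific model, can be sketched simply. Everything here is hypothetical: the two "skill models" are plain stub functions, and the keyword-based `detect_intent` stands in for what would more likely be another small router model.

```python
from typing import Callable, Dict

# Hypothetical skill-specific small models, stubbed as functions.
# In production each would be a separately fine-tuned 0.25B-8B model.
def marketing_model(prompt: str) -> str:
    return f"[marketing copy for: {prompt}]"

def sales_model(prompt: str) -> str:
    return f"[product recommendation for: {prompt}]"

SKILL_MODELS: Dict[str, Callable[[str], str]] = {
    "marketing": marketing_model,
    "sales": sales_model,
}

def detect_intent(prompt: str) -> str:
    # Stubbed intent detection; a real platform might use a small
    # router model here rather than keyword matching.
    return "sales" if "recommend" in prompt.lower() else "marketing"

def handle(prompt: str) -> str:
    # Chain: route the user intent to the matching skill model.
    return SKILL_MODELS[detect_intent(prompt)](prompt)
```

The registry-plus-router shape is what makes the approach composable: adding a new skill means fine-tuning one more small model and registering it, not retraining a giant general-purpose model.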
