The Role of Memory in Artificial Intelligence


Summary

Memory plays a vital role in advancing artificial intelligence by enabling AI systems to retain, retrieve, and adapt to past information, much like humans. It allows AI to maintain context across sessions, make informed decisions, and improve over time, thereby unlocking more personalized and efficient user interactions.

  • Focus on long-term memory: Incorporate systems that can store and retrieve historical data, user preferences, or domain knowledge to improve continuity and coherence in AI interactions over extended periods.
  • Adopt memory optimization techniques: Use tools like vector databases, semantic search, and context window management to ensure that AI can process and access relevant information quickly and accurately.
  • Implement memory security: Prioritize the protection of sensitive user data stored in AI memory layers by establishing robust security measures and considering on-premises storage solutions for enterprise use.
Summarized by AI based on LinkedIn member posts
  • View profile for Sohrab Rahimi

    Partner at McKinsey & Company | Head of Data Science Guild in North America

    20,419 followers

    The biggest limitation in today’s AI agents is not their fluency. It is memory. Most LLM-based systems forget what happened in the last session, cannot improve over time, and fail to reason across multiple steps. This makes them unreliable in real workflows. They respond well in the moment but do not build lasting context, retain task history, or learn from repeated use. A recent paper, “Rethinking Memory in AI,” introduces four categories of memory, each tied to specific operations AI agents need to perform reliably:

    𝗟𝗼𝗻𝗴-𝘁𝗲𝗿𝗺 𝗺𝗲𝗺𝗼𝗿𝘆 focuses on building persistent knowledge. This includes consolidation of recent interactions into summaries, indexing for efficient access, updating older content when facts change, and forgetting irrelevant or outdated data. These operations allow agents to evolve with users, retain institutional knowledge, and maintain coherence across long timelines.

    𝗟𝗼𝗻𝗴-𝗰𝗼𝗻𝘁𝗲𝘅𝘁 𝗺𝗲𝗺𝗼𝗿𝘆 refers to techniques that help models manage large context windows during inference. These include pruning attention key-value caches, selecting which past tokens to retain, and compressing history so that models can focus on what matters. These strategies are essential for agents handling extended documents or multi-turn dialogues.

    𝗣𝗮𝗿𝗮𝗺𝗲𝘁𝗿𝗶𝗰 𝗺𝗼𝗱𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 addresses how knowledge inside a model’s weights can be edited, updated, or removed. This includes fine-grained editing methods, adapter tuning, meta-learning, and unlearning. In continual learning, agents must integrate new knowledge without forgetting old capabilities. These capabilities allow models to adapt quickly without full retraining or versioning.

    𝗠𝘂𝗹𝘁𝗶-𝘀𝗼𝘂𝗿𝗰𝗲 𝗺𝗲𝗺𝗼𝗿𝘆 focuses on how agents coordinate knowledge across formats and systems. It includes reasoning over multiple documents, merging structured and unstructured data, and aligning information across modalities like text and images. This is especially relevant in enterprise settings, where context is fragmented across tools and sources.

    Looking ahead, the future of memory in AI will focus on:
    • 𝗦𝗽𝗮𝘁𝗶𝗼-𝘁𝗲𝗺𝗽𝗼𝗿𝗮𝗹 𝗺𝗲𝗺𝗼𝗿𝘆: Agents will track when and where information was learned to reason more accurately and manage relevance over time.
    • 𝗨𝗻𝗶𝗳𝗶𝗲𝗱 𝗺𝗲𝗺𝗼𝗿𝘆: Parametric (in-model) and non-parametric (external) memory will be integrated, allowing agents to fluidly switch between what they “know” and what they retrieve.
    • 𝗟𝗶𝗳𝗲𝗹𝗼𝗻𝗴 𝗹𝗲𝗮𝗿𝗻𝗶𝗻𝗴: Agents will be expected to learn continuously from interaction without retraining, while avoiding catastrophic forgetting.
    • 𝗠𝘂𝗹𝘁𝗶-𝗮𝗴𝗲𝗻𝘁 𝗺𝗲𝗺𝗼𝗿𝘆: In environments with multiple agents, memory will need to be sharable, consistent, and dynamically synchronized across agents.

    Memory is not just infrastructure. It defines how your agents reason, adapt, and persist!
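The four long-term memory operations the post names (consolidate, index, update, forget) can be sketched as a tiny in-memory store. This is only an illustration of the concepts, not the paper's implementation; all class and method names are invented.

```python
import time

class LongTermMemory:
    """Minimal sketch of the four long-term memory operations:
    consolidate, retrieve (index), update, and forget."""

    def __init__(self):
        self.items = {}  # index key -> (value, timestamp)

    def consolidate(self, key, value):
        # Store a summarized fact under an index key for later access.
        self.items[key] = (value, time.time())

    def retrieve(self, key):
        entry = self.items.get(key)
        return entry[0] if entry else None

    def update(self, key, value):
        # Overwrite older content when facts change.
        if key in self.items:
            self.items[key] = (value, time.time())

    def forget(self, max_age_seconds):
        # Drop entries older than the cutoff (outdated data).
        cutoff = time.time() - max_age_seconds
        self.items = {k: v for k, v in self.items.items() if v[1] >= cutoff}

memory = LongTermMemory()
memory.consolidate("user:airline", "prefers Delta")
memory.update("user:airline", "prefers United")
print(memory.retrieve("user:airline"))  # prefers United
```

A production system would back this with a database and use an LLM to produce the consolidation summaries; the operations themselves stay the same.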

  • View profile for Nir Diamant

    Gen AI Consultant | Public Speaker | Building an Open Source Knowledge Hub + Community | 60K+ GitHub stars | 30K+ Newsletter Subscribers | Open to Sponsorships

    18,707 followers

    Building AI agents that actually remember things 🧠 Got this excellent tutorial from Redis in my "Agents Towards Production" repo that tackles a real problem: how to give AI agents proper memory so they don't forget everything between conversations. The tutorial uses a travel agent as an example, but the memory concepts apply to any AI agent you want to build. It shows how to create agents that remember:
    - User preferences
    - Past interactions
    - Important context
    - Domain-specific knowledge

    Two types of memory: short-term memory handles the current conversation, while long-term memory stores things across sessions. They use Redis for the storage layer with vector search for semantic retrieval. The travel agent example shows the agent learning someone prefers Delta airlines, remembering their wife's shellfish allergy, and recalling a family trip to Singapore from years back - but you could apply this same approach to customer service bots, coding assistants, or any other agent type.

    Tech stack covered:
    - Redis for memory storage
    - LangGraph (Harrison Chase) for agent workflows
    - RedisVL for vector search
    - OpenAI for the LLM

    Includes working code, error handling, and conversation summarization to keep context windows manageable. Part of the collection of practical guides for building production-ready AI systems. Check it out and give it a ⭐ if you find it useful: https://lnkd.in/dkjGZGiw What approaches have you found work well for agent memory? Always interested in different solutions. ♻️ Repost to let your network learn about this too! Credit to Tyler Hutcherson for creating this wonderful tutorial!
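The short-term/long-term split the tutorial describes can be shown with a toy two-tier store. The real tutorial uses Redis and RedisVL with actual vector search; this in-memory stand-in (all names invented) only illustrates the division between session context and cross-session facts.

```python
class AgentMemory:
    """Toy two-tier agent memory: a bounded short-term buffer for the
    current conversation, and a long-term dict that persists facts."""

    def __init__(self, session_limit=10):
        self.short_term = []   # recent (role, text) turns
        self.long_term = {}    # facts kept across sessions
        self.session_limit = session_limit

    def add_turn(self, role, text):
        self.short_term.append((role, text))
        # Evict the oldest turn once the buffer is full; a real system
        # would summarize evicted turns instead of dropping them.
        if len(self.short_term) > self.session_limit:
            self.short_term.pop(0)

    def remember(self, key, fact):
        self.long_term[key] = fact

    def recall(self, key):
        return self.long_term.get(key)

mem = AgentMemory()
mem.add_turn("user", "Book me a flight")
mem.remember("airline_preference", "Delta")
mem.remember("allergy", "wife is allergic to shellfish")
print(mem.recall("airline_preference"))  # Delta
```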

  • View profile for Armand Ruiz

    building AI systems

    202,065 followers

    “The main function of memory is to predict the future.” Let's learn how to handle AI agents' memory and why it matters. It's tempting to think of agents as “one-and-done” responders, but the real power comes from their ability to remember past interactions, understand long-term objectives, and adapt to changing conditions. By getting memory right, you'll create agents that feel more coherent, purposeful, and valuable to users over time. Here is a list of good techniques for effective memory management:

    1/ Short-Term (Session) Memory: Tracks recent queries, user intents, and context within the current session.

    2/ Long-Term Memory: Stores historical data, user preferences, and domain knowledge that persists across sessions and reboots, ensuring continuity over days, weeks, or months.

    3/ Vector Databases and Semantic Search: By converting text data into vector embeddings, agents can quickly search through large knowledge bases for relevant information. This semantic search capability helps the agent find the most contextually similar data points, supporting more nuanced and accurate responses.

    4/ Chunking and Context Windows: For large inputs (like long documents or conversation histories), agents break the data into smaller “chunks.” This approach ensures that the agent can handle complex inputs without getting lost, enabling it to zero in on the most relevant pieces of information.

    5/ Metadata and Tagging: Storing metadata (like timestamps, user IDs, or categories) helps the agent quickly filter what it needs. Instead of sifting through all past data, the agent can jump straight to relevant tags, speeding up retrieval and reducing the risk of inaccurate or stale information.

    6/ Retrieval-Augmented Generation (RAG): RAG techniques involve querying a knowledge store for relevant context before the agent formulates its response. This grounds the agent's output in up-to-date, accurate information, making it more reliable and consistent.

    Modern AI tools and frameworks such as CrewAI, Bee, LangGraph, or LangFlow simplify memory management, letting you focus on user experience and strategy.
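The semantic-search step (technique 3, which RAG in technique 6 builds on) can be sketched end to end with a deliberately crude embedding. The letter-count "embedding" below is a stand-in so the example runs without a model; a real system would use a sentence-embedding model and a vector database, and the function names here are invented.

```python
import math

def embed(text):
    # Toy bag-of-letters vector; a real embedding model goes here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Rank stored chunks by similarity to the query and return the top k,
    # which is the retrieval half of a RAG pipeline.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "user prefers aisle seats",
    "invoice due on the 5th",
    "aisle seat requested again",
]
print(retrieve("seat preference aisle", chunks))
```

In a full RAG loop, the retrieved chunks would be pasted into the LLM prompt as context before generating the response.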

  • View profile for Ravit Jain

    Founder & Host of "The Ravit Show" | Influencer & Creator | LinkedIn Top Voice | Startups Advisor | Gartner Ambassador | Data & AI Community Builder | Influencer Marketing B2B | Marketing & Media | (Mumbai/San Francisco)

    166,154 followers

    What does it really take to build an AI agent with memory? After speaking with countless AI practitioners, one thing is clear: building AI agents that retain context and improve over time is a game changer. But the challenge is not just in the AI itself. It is about designing a system that can remember, retrieve, and refine information efficiently. This roadmap outlines the key steps that go into building an AI agent with memory, and it aligns with the insights I have gathered from experts working on real-world implementations:

    - Define the Objective: A great AI agent starts with a clear purpose. Whether it's a personal assistant or a customer support bot, defining the goal is crucial.
    - Plan Memory Needs: Short-term vs. long-term memory. Knowing what to retain and retrieve is what makes an agent truly smart.
    - Set Up and Connect LangChain or Pinecone: AI agents need efficient ways to store and access past interactions. Vector databases like Pinecone or frameworks like LangChain make this seamless.
    - Create User Profiles & Sessions: Personalized experiences rely on tracking and continuity, ensuring users feel like they are interacting with an intelligent system, not starting from scratch every time.
    - Establish a Workflow: Memory management is more than just storage. Defining structured processes helps optimize AI behavior.
    - Leverage Knowledge Graphs: Context matters. By structuring knowledge in a graph format, AI can make deeper connections between past interactions.
    - Design and Test Prompts: Memory retrieval is useless without the right prompts. The best AI agents adapt their responses based on stored knowledge.
    - Protect & Secure Data: AI agents deal with sensitive information. Security should be a priority, not an afterthought.
    - Monitor, Enhance, and Scale: AI is not a one-time build. Continuous learning and improvement are what make a system truly intelligent.

    I have seen firsthand how AI agents with memory transform user experiences, making interactions smoother, smarter, and more intuitive. The difference between a basic chatbot and a real AI agent comes down to how well it can retain, retrieve, and refine knowledge! If you are working on AI, what challenges have you faced in building memory into your systems? Join The Ravit Show Newsletter — https://lnkd.in/dCpqgbSN #data #ai #agents #theravitshow
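The "Create User Profiles & Sessions" step above can be sketched as a small profile store that keeps preferences and session history per user. All field and class names are invented for illustration; a production version would persist this to a database rather than a dict.

```python
class UserProfileStore:
    """Toy per-user profile store: preferences persist across sessions,
    so each new session starts with the user's accumulated context."""

    def __init__(self):
        self.profiles = {}

    def get_or_create(self, user_id):
        # New users get an empty profile with no preferences or sessions.
        return self.profiles.setdefault(
            user_id, {"preferences": {}, "sessions": []}
        )

    def start_session(self, user_id):
        # Each session is tracked against the same persistent profile,
        # so the agent never starts from scratch.
        profile = self.get_or_create(user_id)
        session = {"id": len(profile["sessions"]) + 1, "turns": []}
        profile["sessions"].append(session)
        return session

    def set_preference(self, user_id, key, value):
        self.get_or_create(user_id)["preferences"][key] = value

store = UserProfileStore()
store.set_preference("u1", "tone", "concise")
session = store.start_session("u1")
session["turns"].append(("user", "hello again"))
print(store.get_or_create("u1")["preferences"]["tone"])  # concise
```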

  • View profile for Harish Santhanalakshmi Ganesan

    Security Engineer at Cisco | LLM threat intelligence analyst | MS in Cyber Security @UTD | Speaker at BSides Nashville 2024

    16,975 followers

    Hey folks! Hope you are doing well! In this post I am going to share my recent research work: building a memory layer for LLMs based on Jeff Hawkins' Thousand Brains theory, which outperforms vector-DB and embedding-based approaches in terms of recall efficacy, cost, and temporal reasoning. Based on my previous experiments, I believe we cannot build super advanced AI agents until we solve the memory problems of LLMs, i.e., that LLMs cannot update their weights during inference. To solve this problem I read various papers on brain science and neuroscience, and books such as Jeff Hawkins' "A Thousand Brains". This gave me the initial idea to build a memory layer that mimics the human brain's neocortex. HawkinsDB supports semantic, episodic, and procedural memory.

    HawkinsDB uses cortical columns: just like your brain processes information from multiple perspectives (visual, tactile, conceptual), the system stores knowledge in different "columns." This means an object isn't just stored as a single definition; it's understood from multiple angles. Reference frames are like smart containers for information that capture what something is, its properties, relationships, and context. This enables natural handling of complex queries like "Find kitchen items related to coffee brewing." Imagine "Cup" as a reference frame: HawkinsDB keeps all of the cup's properties in that frame, and if "Cup" is associated with another reference frame such as "Tea", that frame holds everything associated with tea, such as how it tastes. If the user enables the autoenrich flag, HawkinsDB enriches reference frames with common-sense knowledge from ConceptNet. This lets an LLM using HawkinsDB as its memory layer give very comprehensive answers to user queries. In the attached screenshot, a RAG application built with HawkinsDB gave a more comprehensive answer than a RAG application built with a vector DB.

    PyPI: https://lnkd.in/g6xNcwXd
    Repo: https://lnkd.in/gRJsxg9x
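The reference-frame idea (an object stored with its properties plus links to related frames that a query can traverse) can be sketched in a few lines. This is a toy reading of the concept described in the post; HawkinsDB's actual API differs, and every name below is invented.

```python
class ReferenceFrame:
    """Toy reference frame: a named object with its own properties and
    labeled links to other frames, so queries can follow associations."""

    def __init__(self, name):
        self.name = name
        self.properties = {}
        self.links = {}  # relation label -> ReferenceFrame

    def link(self, relation, other):
        self.links[relation] = other

# "Cup" is one frame holding the cup's own properties...
cup = ReferenceFrame("Cup")
cup.properties["material"] = "ceramic"
cup.properties["location"] = "kitchen"

# ...linked to a "Tea" frame that holds everything about tea.
tea = ReferenceFrame("Tea")
tea.properties["taste"] = "bitter, aromatic"
cup.link("used_for", tea)

# A query can hop across links: what is the cup used for, and what
# does that thing taste like?
related = cup.links["used_for"]
print(cup.name, "->", related.name, "->", related.properties["taste"])
```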

  • View profile for Brij kishore Pandey

    AI Architect | Strategist | Generative AI | Agentic AI

    690,001 followers

    We often think of AI agents as black boxes: you give a prompt, and it replies. But 𝗯𝗲𝗵𝗶𝗻𝗱 𝘁𝗵𝗲 𝘀𝗰𝗲𝗻𝗲𝘀, there’s a complex, multi-layered orchestration of memory, reasoning, tool use, and learning. This visual captures the 𝗳𝘂𝗹𝗹 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗟𝗶𝗳𝗲𝗰𝘆𝗰𝗹𝗲 — 𝗳𝗿𝗼𝗺 𝗣𝗿𝗼𝗺𝗽𝘁 𝘁𝗼 𝗔𝗰𝘁𝗶𝗼𝗻 𝘁𝗼 𝗙𝗲𝗲𝗱𝗯𝗮𝗰𝗸, across 𝗺𝗮𝗻𝘆 𝗶𝗻𝘁𝗲𝗿𝗰𝗼𝗻𝗻𝗲𝗰𝘁𝗲𝗱 𝘀𝘁𝗮𝗴𝗲𝘀.

    1. 𝗜𝘁 𝘀𝘁𝗮𝗿𝘁𝘀 𝘄𝗶𝘁𝗵 𝗻𝗮𝘁𝘂𝗿𝗮𝗹 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 – but that’s just the trigger. The agent immediately cleans, tokenizes, and checks readiness before doing anything else.
    2. 𝗜𝗻𝘁𝗲𝗻𝘁 𝗖𝗹𝗮𝘀𝘀𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗶𝘀 𝗰𝗿𝗶𝘁𝗶𝗰𝗮𝗹. Without knowing 𝘸𝘩𝘢𝘵 the user actually wants (search vs summarize vs act), the agent can’t plan effectively.
    3. 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 + 𝗠𝗲𝗺𝗼𝗿𝘆 = 𝗣𝗲𝗿𝘀𝗼𝗻𝗮𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻. Episodic, long-term, and semantic memory shape the agent’s decision-making to feel more human.
    4. 𝗥𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 𝗶𝘀 𝘁𝗵𝗲 𝗯𝗿𝗮𝗶𝗻 𝗼𝗳 𝘁𝗵𝗲 𝗼𝗽𝗲𝗿𝗮𝘁𝗶𝗼𝗻. Using techniques like ReAct and CoT, the agent creates a plan before touching any tools.
    5. 𝗧𝗼𝗼𝗹 𝗨𝘀𝗲 𝗶𝘀 𝗻𝗼𝘁 𝗼𝗽𝘁𝗶𝗼𝗻𝗮𝗹 𝗮𝗻𝘆𝗺𝗼𝗿𝗲. Search, APIs, bots, file systems: agent capabilities are tightly coupled with external execution layers.
    6. 𝗙𝗲𝗲𝗱𝗯𝗮𝗰𝗸 𝗰𝗹𝗼𝘀𝗲𝘀 𝘁𝗵𝗲 𝗹𝗼𝗼𝗽. Real-time signals and user feedback aren't just for metrics: they 𝘶𝘱𝘥𝘢𝘵𝘦 𝘮𝘦𝘮𝘰𝘳𝘺 𝘢𝘯𝘥 𝘰𝘱𝘵𝘪𝘮𝘪𝘻𝘦 𝘣𝘦𝘩𝘢𝘷𝘪𝘰𝘳.

    This architecture isn’t theoretical. It’s what powers real-world agentic systems today, using frameworks like LangGraph, CrewAI, AutoGen, AgentOps, and custom LLM orchestration stacks. The future of AI isn’t just bigger models: it’s better agents. Agents that 𝗼𝗯𝘀𝗲𝗿𝘃𝗲, 𝗿𝗲𝗮𝘀𝗼𝗻, 𝗮𝗰𝘁, 𝗮𝗻𝗱 𝗹𝗲𝗮𝗿𝗻. I'd love to hear—how are you using agents in your work?
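The front half of that lifecycle (clean the prompt, classify intent, route to a tool) fits in a few lines. The keyword classifier below is a stand-in for a learned intent model, and all function names and the tool registry are invented for illustration.

```python
def classify_intent(prompt):
    # Crude keyword heuristic standing in for a real intent classifier.
    p = prompt.lower()
    if "summarize" in p:
        return "summarize"
    if any(word in p for word in ("find", "search", "look up")):
        return "search"
    return "act"

def run_agent(prompt, tools):
    # Minimal pipeline: preprocess, classify intent, dispatch to the
    # matching tool. Memory lookup and feedback are omitted for brevity.
    intent = classify_intent(prompt.strip())
    result = tools[intent](prompt.strip())
    return intent, result

# Toy "external execution layer": one callable per intent.
tools = {
    "search": lambda q: f"search results for: {q}",
    "summarize": lambda q: f"summary of: {q}",
    "act": lambda q: f"executed: {q}",
}
print(run_agent("Find flights to NYC", tools))
```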

  • View profile for Gajen Kandiah

    Chief Executive Officer Rackspace Technology

    21,870 followers

    I finally had the chance to dive into "Titans: Learning to Memorize at Test Time"—and wow, what a complex, yet inspiring, read. This paper from Google Research introduces a groundbreaking approach to AI memory, and while it’s deeply technical, it raises important questions for business leaders navigating the #AI revolution. Here’s why this innovation is exciting—and why we should approach it thoughtfully:

    • Solving AI’s memory problem: Current models like Transformers struggle with handling vast amounts of sequential data efficiently. Titans, with their long-term neural memory module, handle over 2 million tokens effortlessly.
    • Human-like memory: Inspired by how we remember important moments, Titans focus on "surprising" or key information, ensuring the system prioritizes what matters most.
    • Practical breakthroughs: Titans excel in tasks like language modeling, long-term reasoning, and massive data analysis. Think medical histories, legal case analysis, or market trend prediction—all processed more effectively than ever before.

    Yet, alongside the excitement comes healthy skepticism. As the machine learning community debates Titans’ potential, there’s a call for more real-world testing and comparative analysis. How will Titans perform outside the lab? Can they consistently deliver measurable value at scale? For business leaders, this is where the opportunity lies: balancing optimism about new capabilities with clear-eyed evaluations of ROI and feasibility. Titans represent a bold step forward, but like any innovation, their true impact will only emerge with time and rigorous testing. Paper: https://lnkd.in/e_CHRyCa Video: https://lnkd.in/e-Q-9uDz

  • View profile for Jing Xie

    Building the missing piece in AI apps: Real memory.

    10,897 followers

    Last week I gave a talk at AICAMP NYC and had a really long line of questions around AI memory. It seemed like many founders and developers are struggling to have meaningful conversations about memory, because there is a lot of fundamental misunderstanding about memory architecture. There are actually three distinct layers of memory in generative AI:

    𝗟𝗮𝘆𝗲𝗿 𝟭 - 𝗧𝗵𝗲 𝗙𝗼𝘂𝗻𝗱𝗮𝘁𝗶𝗼𝗻 𝗠𝗼𝗱𝗲𝗹 𝗟𝗮𝘆𝗲𝗿: This is the lowest level: model parameters stored in server DRAM that define how an LLM behaves and what it "remembers" from training.

    𝗟𝗮𝘆𝗲𝗿 𝟮 - 𝗞𝗩 𝗖𝗮𝗰𝗵𝗲: The middle layer, automatically generated during inference. The KV cache helps LLMs respond faster to follow-up questions. It's stored in GPU HBM (high-bandwidth memory) and CPU DRAM, but it is rapidly expanding in size and creating new hardware challenges, as there is not enough memory capacity on these two tiers. This is also creating a need for projects like NVIDIA Dynamo that have distributed, shared multi-node memory architectures.

    𝗟𝗮𝘆𝗲𝗿 𝟯 - 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗠𝗲𝗺𝗼𝗿𝘆: The top layer, and the one users experience most directly. Context memory is your conversation history, context windows, and persistent memory. You see it on the left-hand side of an app like ChatGPT, in the form of historical conversations that let you pick up where you last left off. If you haven't tried it yet, ask ChatGPT what it knows about you; you'll be amazed. This context memory layer is separate and distinct from the KV cache and the LLMs themselves.

    𝗞𝗘𝗬 𝗧𝗔𝗞𝗘𝗔𝗪𝗔𝗬: Layer 3 is also where your sensitive data lives and where data portability and privacy concerns matter most, especially for the enterprise. When you use ChatGPT, all your sensitive information gets stored in ChatGPT's memory layer. Even OpenAI's new standalone "ChatGPT Memory" still runs on OpenAI's servers, not under your control. The Context Memory layer is where I see some enterprises and financial services firms being most trusting of third parties to own and store sensitive trade secrets. I might even characterize some approaches as borderline careless or reckless, because process knowledge and even IP in the form of code snippets and sensitive enterprise data are being shared with these services. I think this is happening because most people don't know how to build and manage their own AI memory and context layer. When you're building your next AI product, make sure you're making decisions that protect your enterprise's edge in today's AI race.
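Owning your own Layer 3 can be as simple as persisting conversation history to storage you control instead of a provider's servers. This is a minimal sketch under invented names (file path, JSON schema, class name are all illustrative), not a production design.

```python
import json
import os
import tempfile

class LocalContextMemory:
    """Sketch of a self-hosted context memory layer: conversation
    history saved as JSON on disk you control."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # Return the full conversation history, or [] on first run.
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return []

    def append(self, role, text):
        # Read-modify-write keeps the example simple; a real system
        # would use a database with locking and encryption at rest.
        history = self.load()
        history.append({"role": role, "text": text})
        with open(self.path, "w") as f:
            json.dump(history, f)

path = os.path.join(tempfile.gettempdir(), "context_memory_demo.json")
if os.path.exists(path):
    os.remove(path)  # start the demo fresh

mem = LocalContextMemory(path)
mem.append("user", "our roadmap is confidential")
mem.append("assistant", "noted")
print(len(mem.load()))  # 2
```

Because the file lives on infrastructure you control, the sensitive context never leaves your environment, which is exactly the enterprise concern raised above.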

  • View profile for Aishwarya Naresh Reganti

    Founder @ LevelUp Labs | Ex-AWS | Consulting, Training & Investing in AI

    113,608 followers

    😵 Woah, there’s a full-blown paper on how you could build a memory OS for LLMs. Memory in AI systems has only started getting serious attention recently, mainly because people realized that LLM context lengths are limited and passing everything every time for complex tasks just doesn’t scale. This is a forward-looking paper that treats memory as a first-class citizen, almost like an operating system layer for LLMs. It’s a long and dense read, but here are some highlights:

    ⛳ The authors define three types of memory in AI systems:
    - Parametric: Knowledge baked into the model weights
    - Activation: Temporary, runtime memory (like the KV cache)
    - Plaintext: External editable memory (docs, notes, examples)
    The idea is to orchestrate and evolve these memory types together, not treat them as isolated hacks.

    ⛳ MemOS introduces a unified system to manage memory: representation, organization, access, and governance.

    ⛳ At the heart of it is MemCube, a core abstraction that enables tracking, fusion, versioning, and migration of memory across tasks. It makes memory reusable and traceable, even across agents.

    The vision here isn’t just "memory"; it’s to let agents adapt over time, personalize responses, and coordinate memory across platforms and workflows. I definitely think memory is one of the biggest blockers to building more human-like agents. This looks super well thought out; it gives you an abstraction to actually build with. Not totally sure the same abstractions will work across all use cases, but very excited to see more work in this direction! Link: https://lnkd.in/gtxC7kXj
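The tracking-and-versioning idea behind MemCube can be illustrated with a toy memory record that carries provenance and keeps its version history. This is a loose reading of the abstraction as the post describes it; the paper's actual design is richer and all names here are invented.

```python
class MemCube:
    """Toy versioned memory item: typed by the paper's three memory
    kinds, tagged with its source, and keeping every past version so
    the memory stays traceable as it migrates between tasks."""

    KINDS = ("parametric", "activation", "plaintext")

    def __init__(self, kind, content, source):
        if kind not in self.KINDS:
            raise ValueError(f"unknown memory kind: {kind}")
        self.kind = kind
        self.source = source      # provenance: where this memory came from
        self.versions = [content]

    @property
    def content(self):
        # The current value is always the latest version.
        return self.versions[-1]

    def update(self, new_content):
        # Append rather than overwrite, so older versions remain
        # inspectable and the change history is auditable.
        self.versions.append(new_content)

cube = MemCube("plaintext", "user timezone: UTC", source="onboarding form")
cube.update("user timezone: US/Pacific")
print(cube.content, "| versions:", len(cube.versions))
```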
