There's been a lot of talk about making LLM outputs more deterministic, especially around agents. What's often overlooked in the push for deterministic outputs is the input itself: context.

In most enterprise AI systems, "context" is still treated as raw data. But to answer complex, multi-hop questions like "How is engineering project Y tracking against its OKRs?", agents need a deeper understanding of cross-system relationships, enterprise-specific language, and how work actually gets done. LLMs aren't built to infer this on their own. They need a machine-readable map of enterprise knowledge, something consumer search systems have long relied on: the knowledge graph.

But applying that in the enterprise brings a new set of challenges: the graph must enforce data privacy, reason over small or fragmented datasets without manual review, and do so using scalable algorithms.

At Glean, we've built a knowledge graph with thousands of edges, recently expanded into a personal graph that captures not just enterprise data, but how individuals work. This foundation sets the stage for personalized, context-aware agents that can anticipate needs, adapt to organizational norms, and guide employees toward their goals, far beyond the limits of chat session history.

We break this down in more detail in our latest engineering blog on how knowledge graphs ground enterprise AI and why they're foundational to the future of agentic reasoning. https://lnkd.in/g-rVJPri
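As a toy illustration only (the edge names and data below are invented, not Glean's actual schema), here is the shape of the multi-hop traversal a knowledge graph makes cheap for a question like "How is project Y tracking against its OKRs?":

```python
# Toy knowledge graph: (node, edge) -> neighbor nodes.
# Invented for illustration; not Glean's schema.
GRAPH = {
    ("project:Y", "has_okr"): ["okr:launch-beta"],
    ("okr:launch-beta", "tracked_by"): ["ticket:ENG-101", "ticket:ENG-102"],
    ("ticket:ENG-101", "status"): ["done"],
    ("ticket:ENG-102", "status"): ["in_progress"],
}

def hop(node: str, edge: str) -> list[str]:
    return GRAPH.get((node, edge), [])

# Multi-hop answer: project -> OKRs -> tracked work -> statuses.
for okr in hop("project:Y", "has_okr"):
    statuses = [s for t in hop(okr, "tracked_by") for s in hop(t, "status")]
    print(f"{okr}: {statuses.count('done')}/{len(statuses)} tracked items done")
```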
Understanding Graph Technologies
-
Enterprises will need knowledge graphs for agentic AI quality and autonomy, no matter how good models get at context window processing.

To be clear: it's fall-off-chair exciting watching our latest (shhh) Palmyra models with NO RAG beat previous generations of models WITH RAG for use cases where there's a lot of data / context to sift through (millions of words) to get an accurate and high-quality response. Anyone who's followed Writer for a while knows we're all about that graph-based RAG, but even better models don't mean that "RAG is dead," as I have seen bandied about a bit online by generative AI companies waking up to the limitations of vector-based RAG.

Why? Because we've seen that a graph-based RAG approach is really efficient at helping self-orchestrating agentic AI get good at making the same kinds of decisions a human would. Enterprise knowledge is embedded in the complex, dynamic network of people, processes, and tools that make organizations function. It's not just SharePoint pages and data warehouses where "knowledge" sits; it's everywhere. Agentic AI must learn, reason, and operate within this complexity.

For truly successful and self-orchestrating agentic AI in the enterprise, we need:
- Continuous and autonomous knowledge integration, where relationships are understood, and knowledge here is data +++
- Process blueprints
- LLMs with continuous learning + memory (large context window + self-evolving)
- Supervision and control

What's the best way to build knowledge graphs that you can use with agentic AI? There's a shortcut: we've trained a specialized LLM that can build graphs (i.e., the topic nodes and relationship edges) in real time on customer data, and it can be enriched with a customer's own ontologies.
-
I added a Knowledge Graph to Cursor using MCP. You gotta see this working! Knowledge graphs are a game-changer for AI Agents, and this is one example of how you can take advantage of them.

How this works:
1. Cursor connects to Graphiti's MCP Server. Graphiti is a very popular open-source Knowledge Graph library for AI agents.
2. Graphiti connects to Neo4j running locally.

Now, every time I interact with Cursor, the information is synthesized and stored in the knowledge graph. In short, Cursor now "remembers" everything about our project. Huge!

Here is the video I recorded. To get this working on your computer, follow the instructions on this link: https://lnkd.in/eeZ_4dkb

Something super cool about using Graphiti's MCP server: you can use one model to develop the requirements and a completely different model to implement the code. This is a huge plus because you could use the stronger model at each stage.

Also, Graphiti supports custom entities, which you can use when running the MCP server. You can use these custom entities to structure and recall domain-specific information, which can increase the accuracy of your results tenfold. Here is an example of what these look like: https://lnkd.in/efv7kTaH (see also the sketch after this post).

By the way, knowledge graphs for agents are a big thing. A few ridiculous and eye-opening benchmarks comparing an AI Agent using knowledge graphs with state-of-the-art methods:
• 94.8% accuracy versus 93.4% on the Deep Memory Retrieval (DMR) benchmark.
• 71.2% accuracy versus 60.2% on conversations simulating real-world enterprise use cases.
• 2.58s of latency versus 28.9s.
• 38.4% improvement in temporal reasoning.

You'll find these benchmarks in this paper: https://fnf.dev/3CLQjBK
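For a feel of what custom entities look like, here is a minimal sketch of Graphiti-style entity types defined as Pydantic models. The Requirement/Preference fields below are illustrative; check the linked docs for the exact definitions your Graphiti version expects:

```python
# Minimal sketch of Graphiti-style custom entity types as Pydantic models.
# Field names here are invented examples, not Graphiti's shipped schema.
from pydantic import BaseModel, Field

class Requirement(BaseModel):
    """A project requirement captured during a coding session."""
    project_name: str = Field(..., description="Project the requirement belongs to")
    description: str = Field(..., description="What the requirement specifies")

class Preference(BaseModel):
    """A user preference, such as a preferred library or code style."""
    category: str = Field(..., description="Preference category, e.g. 'tooling'")
    description: str = Field(..., description="The stated preference")

# Hypothetical registration: Graphiti accepts entity types when ingesting
# episodes, e.g. entity_types={"Requirement": Requirement, "Preference": Preference}.
```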
-
🌟 Microsoft's open-source GraphRAG repo is gaining significant traction. If you're wondering whether it suits your needs, here's an overview.

⛳ What? GraphRAG uses LLMs to construct knowledge graphs from data and answer user queries over private datasets. Unlike traditional RAG methods that rely on vector similarity, GraphRAG leverages LLM-generated knowledge graphs to significantly enhance question-and-answer performance, especially for complex document analysis.

⛳ Why graphs? Graphs excel at connecting different pieces of information through their relationships, making them ideal for synthesizing insights and performing complex analytical tasks, unlike flat, unconnected data.

⛳ When should you use it?
👉 GraphRAG could be ideal for scenarios requiring deep information discovery and analysis across multiple noisy documents, or datasets containing mis/disinformation.
👉 GraphRAG can also be beneficial in scenarios where a straightforward semantic search based solely on the query may not suffice: the answer may reside in related information rather than in the query terms alone.

⛳ When to skip it?
👉 GraphRAG's effectiveness depends on well-structured indexing, particularly for unique datasets with domain-specific concepts. Indexing can be resource-intensive, requiring careful setup and testing before extensive use.
👉 GraphRAG's effectiveness also hinges on its ability to accurately identify the nodes related to a query, a step that is far more straightforward in semantic search. You can always combine semantic and graph-based methods to leverage the strengths of each approach (see the sketch after this post).

Link: https://lnkd.in/ee6G7-dS
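On that last point, here is a self-contained toy sketch of a hybrid retriever that merges vector-similarity hits with graph-neighborhood expansion. Everything below (the in-memory chunks, edges, and scoring) is invented for illustration and is not GraphRAG's actual API:

```python
# Toy hybrid retrieval: vector search for seeds, graph expansion for context.
from math import sqrt

# Toy corpus: id -> (embedding, text).
CHUNKS = {
    "a": ([1.0, 0.0], "Q3 revenue grew 12%."),
    "b": ([0.9, 0.1], "Revenue growth was driven by the EU launch."),
    "c": ([0.0, 1.0], "The EU launch was led by the Berlin team."),
}
# Toy knowledge graph: id -> related chunk ids.
EDGES = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(x * x for x in v)))

def hybrid_retrieve(query_vec, k=1, hops=1):
    # Step 1: vector search picks the semantically closest chunks.
    ranked = sorted(CHUNKS, key=lambda i: -cosine(CHUNKS[i][0], query_vec))
    hits = set(ranked[:k])
    # Step 2: graph expansion pulls in related chunks the query never
    # mentioned directly (the "answer lives in related info" case).
    frontier = set(hits)
    for _ in range(hops):
        frontier = {n for i in frontier for n in EDGES.get(i, [])}
        hits |= frontier
    return [CHUNKS[i][1] for i in hits]

print(hybrid_retrieve([1.0, 0.05], k=1, hops=2))
```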
-
One year ago today, Dean Allemang, Bryon Jacob, and I released our paper "A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases" and WOW!

In early 2023, everyone was experimenting with LLMs to do text-to-SQL. Examples were "cute" questions on "cute" data. Our work provided the first piece of evidence (to the best of our knowledge) that investing in knowledge graphs provides higher accuracy for LLM-powered question-answering systems on SQL databases. The result: using a knowledge graph representation of a SQL database achieves 3x the accuracy on question-answering tasks compared to using LLMs directly on the SQL database.

The release of our work sparked industry-wide follow-up:
- The folks at dbt, led by Jason Ganz, replicated our findings, generating excitement across the semantic layer space
- Semantic layer companies began citing our research, using it to advocate for the role of semantics
- We continuously get folks thanking us for the work because they have been using it as supporting evidence for why their organizations should invest in knowledge graphs
- RAG got extended with knowledge graphs: GraphRAG
- This research has also driven internal innovation at data.world, forming the foundation of our AI Context Engine, where you can build AI apps to chat with data and metadata

Over the past year, I've observed two trends:

1) Semantics is moving from "nice-to-have" toward foundational: organizations are realizing that semantics is fundamental for effective enterprise AI. Major cloud data vendors are incorporating these principles, broadening the adoption of semantics. While approaches vary (not always strictly using ontologies and knowledge graphs), the message is clear: semantics provides your unique business context, which LLMs don't necessarily have. Heck, "ontology" isn't a frowned-upon word anymore 😀

2) Knowledge graphs as the 'enterprise brain': our work pushed to combine knowledge graphs with RAG, i.e. GraphRAG, in order to have semantically structured data that represents the enterprise brain of your organization. Incredibly honored to see the Neo4j GraphRAG Manifesto citing our research as critical evidence for why knowledge graphs drive improved LLM accuracy.

It's really exciting that the one-year anniversary of our work falls while Dean and I are at the International Semantic Web Conference, where we are sharing our work on how ontologies come to the rescue to further increase the accuracy to 4x (we released that paper in May). This image is an overview of how it's achieved. It's pretty simple, and that is a good thing!

I've dedicated my entire career (close to two decades) to figuring out how to manage data and knowledge at scale, and this GenAI boom has been the catalyst we needed to incentivize organizations to invest in foundations so they can truly speed up and innovate. There are so many people to thank! Here's to more innovation and impact!
-
GraphRAG: Teaching LLMs to Connect the Dots 📚

Ever felt like your AI assistant just doesn't get the big picture? Traditional RAG systems are like that friend who remembers random facts but can't quite piece them together. Meet GraphRAG, Microsoft's clever solution to help LLMs see the forest, not just the trees.

Imagine trying to solve a puzzle with pieces scattered across different rooms. That's what traditional RAG does: it finds individual pieces but struggles to put them together. GraphRAG creates a map of how all the information fits together. This means LLMs can now understand connections and context in ways they never could before.

What can GraphRAG do?

1. Uncover Hidden Connections
GraphRAG is like a detective, finding links between facts even when they're spread out. It helps LLMs tackle complex questions that require understanding how different pieces of information relate to each other.

2. Pinpoint Accuracy
GraphRAG uses its knowledge map to find answers that are spot-on and make sense in context. Plus, you can trace each part of an answer back to its source.

3. Unlock Meaningful Insights
GraphRAG doesn't just fetch facts; it sees the big picture. It can spot trends, identify themes, and offer insights that would be near impossible to find otherwise.

Why does this matter for you? Think about how often you've asked an AI a question and gotten a response that's... close, but not quite right. Or worse, an answer that's just plain wrong. GraphRAG could change all that. It's about making AI assistants that truly understand what you're asking and can give you answers that actually help.

What's next? As GraphRAG-like developments mature, we might see:
• More intuitive AI assistants that can handle complex, multi-step questions
• Better automated research tools that can draw insights from vast databases
• AI systems that can explain their reasoning, making them more trustworthy and useful in fields like medicine or law
-
TL;DR: There has been a dramatic uptick in interest in Knowledge Graphs (KGs). Combined with LLMs, KGs can provide better insights into organizational data while reducing or even eliminating hallucinations, much like some ideas in 𝗡𝗲𝘂𝗿𝗼-𝗦𝘆𝗺𝗯𝗼𝗹𝗶𝗰 𝗔𝗜.

A long time ago I wrote about how Symbolic AI and Neural AI will come together to unlock new value while lowering enterprise risk (https://bit.ly/3WZQ11q). We are definitely headed down that path, with some interesting startups like Elemental Cognition (https://lnkd.in/eFUhFYEZ) and Amazon Web Services (AWS) using symbolic techniques for security scanning of LLM-generated code in Q Developer (https://lnkd.in/ecJTSSaS).

Another variant, albeit not Neuro-Symbolic AI, is the 𝗶𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻 𝗼𝗳 𝗞𝗚𝘀 𝗮𝗻𝗱 𝗟𝗟𝗠𝘀. KGs are inherently symbolic, and integrating them with LLMs is a no-brainer for specific use cases. A great writeup of the 𝗯𝗲𝗻𝗲𝗳𝗶𝘁𝘀 by the excellent Neo4j team (Philip Rathle, Emil Eifrem): https://lnkd.in/ebR6tMD8, which itself builds on some great work by the Microsoft GraphRAG team (https://lnkd.in/enRpA6Y7).

Benefits summary:
1. 𝗛𝗶𝗴𝗵𝗲𝗿 𝗔𝗰𝗰𝘂𝗿𝗮𝗰𝘆 & More Useful Answers
• A KG combined with an LLM improved accuracy by 3x
• LinkedIn showed that KG-integrated LLMs outperform the baseline by 77.6% (https://lnkd.in/eNvvQaeq)
2. 𝗜𝗺𝗽𝗿𝗼𝘃𝗲𝗱 𝗗𝗮𝘁𝗮 𝗨𝗻𝗱𝗲𝗿𝘀𝘁𝗮𝗻𝗱𝗶𝗻𝗴, 𝗙𝗮𝘀𝘁𝗲𝗿 𝗜𝘁𝗲𝗿𝗮𝘁𝗶𝗼𝗻
3. 𝗚𝗼𝘃𝗲𝗿𝗻𝗮𝗻𝗰𝗲: 𝗘𝘅𝗽𝗹𝗮𝗶𝗻𝗮𝗯𝗶𝗹𝗶𝘁𝘆, 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆, and More

𝗔𝗻𝗱 𝗵𝗲𝗿𝗲 𝗶𝘀 𝘁𝗵𝗲 𝗶𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝘁𝘄𝗶𝘀𝘁: KGs and ontologies have historically been hard to create and maintain. It turns out you can use LLMs to simplify that process!! Great research work here: https://lnkd.in/eTyGjSe5, and an actual implementation by the Neo4j team (https://bit.ly/3WIJxmd). If you want to try this using AWS services, give it a whirl here: https://go.aws/3T8FK0L (a minimal sketch of the idea follows after this post).

𝗔𝗰𝘁𝗶𝗼𝗻 𝗳𝗼𝗿 𝗖𝘅𝗢𝘀: Consider adding Knowledge Graphs to your enterprise Data and GenAI strategy.
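To make the twist concrete, here is a minimal sketch of LLM-driven triple extraction. It assumes the OpenAI Python SDK; the model name, prompt, and output shape are placeholders rather than anything from the linked papers:

```python
# Minimal sketch of LLM-assisted knowledge-graph construction: ask a model to
# extract (subject, predicate, object) triples from free text. The model name
# and prompt are placeholders; any JSON-mode-capable LLM API would do.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_triples(text: str) -> list[dict]:
    prompt = (
        "Extract knowledge-graph triples from the text below. Respond with a "
        'JSON object of the form {"triples": [{"subject": ..., '
        '"predicate": ..., "object": ...}]}.\n\nText: ' + text
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)["triples"]

# e.g. extract_triples("Acme acquired Beta Corp in 2021.") might yield
# [{"subject": "Acme", "predicate": "acquired", "object": "Beta Corp"}, ...]
```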
-
Knowledge graphs to teach LLMs how to reason like doctors!

Many medical LLMs can give you the right answer, but not the right reasoning, which is a problem for clinical trust.

𝗠𝗲𝗱𝗥𝗲𝗮𝘀𝗼𝗻 𝗶𝘀 𝘁𝗵𝗲 𝗳𝗶𝗿𝘀𝘁 𝗳𝗮𝗰𝘁𝘂𝗮𝗹𝗹𝘆-𝗴𝘂𝗶𝗱𝗲𝗱 𝗱𝗮𝘁𝗮𝘀𝗲𝘁 𝘁𝗼 𝘁𝗲𝗮𝗰𝗵 𝗟𝗟𝗠𝘀 𝗰𝗹𝗶𝗻𝗶𝗰𝗮𝗹 𝗖𝗵𝗮𝗶𝗻-𝗼𝗳-𝗧𝗵𝗼𝘂𝗴𝗵𝘁 (𝗖𝗼𝗧) 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 𝘂𝘀𝗶𝗻𝗴 𝗺𝗲𝗱𝗶𝗰𝗮𝗹 𝗸𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 𝗴𝗿𝗮𝗽𝗵𝘀.

1. Created 32,682 clinically validated QA explanations by linking symptoms, findings, and diagnoses through PrimeKG.
2. Generated CoT reasoning paths using GPT-4o, but retained only those that produced correct answers during post-hoc verification (a sketch of this filter follows after this post).
3. Validated with physicians across 7 specialties, with experts preferring MedReason's reasoning in 80–100% of cases.
4. Enabled interpretable, step-by-step answers, like linking difficulty walking to medulloblastoma via ataxia, preserving clinical fidelity throughout.

A couple of thoughts:
• Introducing dynamic KG updates (e.g., weekly ingests of new clinical trial data) could keep reasoning current with evolving medical knowledge.
• Could integrating visual KGs derived from DICOM metadata also help support coherent reasoning across text and imaging inputs? We don't use DICOM metadata enough, tbh.
• Testing with adversarial probing (like edge-case clinical scenarios) and continuous alignment checks against updated evidence-based guidelines might also improve model performance.

Here's the awesome work: https://lnkd.in/g42-PKMG

Congrats to Juncheng Wu, Wenlong Deng, Xiaoxiao Li, Yuyin Zhou and co!

I post my takes on the latest developments in health AI – 𝗰𝗼𝗻𝗻𝗲𝗰𝘁 𝘄𝗶𝘁𝗵 𝗺𝗲 𝘁𝗼 𝘀𝘁𝗮𝘆 𝘂𝗽𝗱𝗮𝘁𝗲𝗱! Also, check out my health AI blog here: https://lnkd.in/g3nrQFxW
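To make step 2 concrete, here is a hedged sketch of that keep-only-verified-chains filter. generate_cot is a toy stand-in for the paper's GPT-4o call, not MedReason's actual code:

```python
# Sketch of MedReason-style post-hoc verification: generate a reasoning chain
# per example and keep it only if it reaches the gold answer.
def generate_cot(question: str, kg_path: list[str]) -> tuple[str, str]:
    # Toy stand-in for the GPT-4o call: echo the knowledge-graph path as the
    # "reasoning" and treat its last node as the final answer.
    return " -> ".join(kg_path), kg_path[-1]

def build_verified_dataset(examples: list[dict]) -> list[dict]:
    kept = []
    for ex in examples:  # each ex has keys: question, answer, kg_path
        reasoning, predicted = generate_cot(ex["question"], ex["kg_path"])
        # Post-hoc verification: discard chains that miss the gold answer.
        if predicted.strip().lower() == ex["answer"].strip().lower():
            kept.append({**ex, "reasoning": reasoning})
    return kept

demo = [{"question": "Difficulty walking points to which tumor?",
         "answer": "medulloblastoma",
         "kg_path": ["difficulty walking", "ataxia", "medulloblastoma"]}]
print(build_verified_dataset(demo))
```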
-
Good post by @microsoft on using graphs for RAG. This probably could have solved Air Canada's problem.

Baseline RAG struggles to connect the dots. This happens when answering a question requires traversing disparate pieces of information through their shared attributes in order to provide new, synthesized insights. Baseline RAG also performs poorly when asked to holistically understand summarized semantic concepts over large data collections, or even over a single large document.

Baseline RAG struggles with queries that require aggregating information across the dataset to compose an answer. Queries such as "What are the top 5 themes in the data?" perform terribly, because baseline RAG relies on a vector search for semantically similar text within the dataset; there is nothing in the query to direct it to the correct information.

With GraphRAG, however, we can answer such questions, because the structure of the LLM-generated knowledge graph tells us about the structure (and thus the themes) of the dataset as a whole. This allows the private dataset to be organized into meaningful semantic clusters that are pre-summarized. The LLM uses these clusters to summarize themes when responding to a user query.

The LLM processes the entire private dataset, creating references to all entities and relationships within the source data, which are then used to create an LLM-generated knowledge graph. This graph is used to build a bottom-up clustering that organizes the data hierarchically into semantic clusters (indicated by color in Figure 3 of Microsoft's post). This partitioning allows for pre-summarization of semantic concepts and themes, which aids holistic understanding of the dataset. At query time, both of these structures are used to provide material for the LLM context window when answering a question.

https://lnkd.in/gAj4z7ED
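For a feel of the entities-to-communities-to-summaries pipeline described above, here is a rough, self-contained sketch using networkx. The example graph is invented, greedy modularity stands in for GraphRAG's hierarchical Leiden clustering, and the "summary" is a plain string join rather than an LLM call:

```python
# Rough sketch: build a graph of extracted entities, cluster it into
# communities, and pre-summarize each community.
import networkx as nx
from networkx.algorithms import community

G = nx.Graph()
G.add_edges_from([
    ("Acme", "Beta Corp"), ("Acme", "2021 acquisition"),
    ("Beta Corp", "Berlin office"), ("Berlin office", "EU launch"),
    ("EU launch", "Q3 revenue"), ("Q3 revenue", "growth drivers"),
])

# Bottom-up clustering into semantic communities (greedy modularity here;
# GraphRAG itself uses hierarchical Leiden).
communities = community.greedy_modularity_communities(G)

# Pre-summarize each community; in GraphRAG this is an LLM call, stubbed
# here as a sorted list of member entities.
for i, nodes in enumerate(communities):
    print(f"community {i}: {', '.join(sorted(nodes))}")
```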