Building
Generative AI with
Google
Abhi
Lead Customer Engineering Data Analytics and AI, MERC
AuNZ
Proprietary + Confidential
A deep history of research and innovation at Google
Responsible AI at the foundation
Built & Tested
for Safety
Privacy in design
Upholds high scientific
standards
Accountable to People
Socially Beneficial
Avoid creating unfair
bias
Google Cloud
Enhance Employee
Productivity
Modernise Customer
Service
Employee & Developer Productivity
Document, Email & Analysis Assist | Improve code development |
Simplify DevOps | Automate Non-Coding Processes
Streamline
& Automate Business
Processes
Customer Service Modernisation
Boost Agent & Employee Productivity | Improve Self-Service & Deflection Rates |
Enhance customer insights & predictions
Back Office
of the Future
Procurement Contract
Management & Compliance |
HR Help Desk & Internal Travel
Bookings | Sales and Marketing
& Accounts Payable
Digital Commerce &
Website Modernisation
Enrich Catalogs & Streamline Content
Generation | Conversation Commerce &
Enhanced Web Navigation | Improve
Self-Service & Deflection Rates
Marketing
Creative & Content Generation |
Personalisation & Media
Performance | Insights &
Measurement
We see 3 productivity pillars
driven by ML and GenAI
The Stack
Vertex AI
Gemini Models
AI Hypercomputer
Gemini for
Google Cloud
Your Agents
Gemini for
Workspace
AI Hypercomputer:
next generation AI
supercomputing
architecture
Flexible Consumption
Dynamic Workload Scheduler On Demand CUD Spot
Open Software
JAX, TensorFlow, PyTorch
Multislice Training, Multihost Inference, XLA
Google Kubernetes Engine & Compute Engine
Performance-Optimized Hardware
Compute
(GPUs, TPUs)
Storage
(Block, File, Object)
Networking
(OCS, Jupiter)
Since 2015 Google has been rapidly enhancing its TPUs
Roadmap timeline, 2023 Q3 to 2024 Q4 (Jul 2023 to Dec 2024). Key: SW, GPU, TPU. Milestones shown:
● TPU GKE/GCE Integration GA
● A3 GA
● TPU v5p GA
● HPC Toolkit Support for A3 & NeMo
● TPU v5e GA
● A3 Mega Private Preview
● Single Host Inference GA
● Multislice Training GA
● TPU v5e Public Preview
● Single Host Inference Public Preview
● Multislice Training Public Preview
● A3 Mega GA
● A4 Private Preview (Q1'25)
● Multi Host Inference Public Preview
● QRM support for GPUs in Public Preview
● Future Reservations in Public Preview
● Ops Agent monitoring for GPUs on GCE GA
● TPU v5p Public Preview
● TPU v6e Public Preview
● TPU v6e GA
● gSC Foundations Preview
● DWS Public Preview
● MLPerf 4.0 Inference
● MLPerf 4.0 Training
● DWS GA
● A3 Ultra (H200) Private Preview
● A3 Edge (H100) Seoul
Gemini offers the world’s largest context window
Models compared: Gemini 1.0 Pro, GPT-4 Turbo, Claude 3.5 Sonnet, Gemini 1.5 Pro
Gemini 1.5 Pro: 2M-token context window, enough for
● 2 hours of video
● 22 hours of audio
● >60k lines of code
● >1.4M words
Gemini on Vertex AI Gemma Open Models
Now available
GA
Now
GA
Now
Gemini 1.5 Flash
Fastest and most
cost-efficient model yet
Multimodality
Low Latency
Quality comparable to 1.5 Pro
(on common tasks)
Gemini 1.5 Pro
Native reasoning over enormous
amounts of data
2M Context Window
Multimodality
Versatile & top-tier quality
As of August 2024, the number of languages Gemini supports jumped from nearly 40 to over
100. This matters for us in APAC, and Gemini can be paired with our Translation tools to
cover a much larger set of languages.
Mistral
Small | Large | Codestral
Claude 3.5 Sonnet
Open ecosystem that gives customers choice
Meta Llama 3.1
405B Model
GA
Now
State of the art 3rd party and open source models
are first-class citizens on Vertex
GA
July
Preview
July
Higher quality
Imagen 3 quality exceeds all leading
external competitors in aesthetics,
defect rate, prompt adherence,
and text on images
Aspect ratios: 1:1, 9:16, 16:9, 3:4, 4:3
Safety built in
Digital watermark and safety
framework built in
Guardrails to limit reproduction
of people, scenes and much more
Prompt: a family of four sitting at the couch watching tv with their dog
Imagen 3 Fast
Imagen 3
Imagen 3: our latest image generation foundation model
Two new higher-quality model variants to help customers
optimise around quality and latency goals
AIOps represents a suite of technologies across the data lifecycle; however, most customers
don’t “see” this end-to-end view and are building their stacks ad hoc
AIOps Capability Map
Lifecycle stages: Prepare, Develop, Validate, Prompt, Deploy, Infer, Automate, Monitor
Capabilities, in the order they appear on the slide:
● Data Collection; Model Selection; Benchmarking; Prompt Deconstruction; Model Hosting (Inference / Serving); RLHF Tooling; Agent Design & Orchestration; Logging & Analytics
● Data Preprocessing (e.g., Chunking); Model Pre-training; Performance Evaluation; Prompt Libraries & Templates; Model Caching; Prompt Reconstruction; Connector Tooling (Tool Aggregation); Error & Usage Analysis
● Data Retrieval (incl. RAG tooling); Model Fine-Tuning; Model Resilience Testing; Prompt Chaining; Model Orchestration; Infrastructure Provisioning; LLM Chaining; App / Model Debugging
● Data Labeling & Annotation; Hyperparameter Tuning; Model Efficiency Tracking; Prompt Embedding & Context Aug.; Distributed Computing; Human-in-the-Loop Tooling; Agent Memory Management; Performance Monitoring
● Data Versioning & Auditing; Model Hub (Registry) & Version Control; Experiment Tracking; Automated Prompt Testing; API & Service Integrations; Agent Self-Eval Tooling; Output & Drift Monitoring
● Model Distillation & Quantization; Model Explainability; Prompt A/B Testing (Comparison, Merge); Load Balancing; CI/CD Pipelines
● Feature Store; Grounding; Autoscaling; Real-time Agent Debugging
Govern: Security, Compliance, Data Privacy, Bias Detection, Transparency, Guardrails, Sustainability, Disaster Recovery
Legend: Model Building, Model Monitoring, Model Deployment, Native to LLMOps
Open Framework Support on Vertex AI
Ray on Vertex AI
Scale AI & Data with Ray
Developers face several major challenges when
scaling AI/ML workloads, such as:
1. Access to a sufficient number of CPUs/GPUs
2. Diverse patterns and programming interfaces
3. Running the workload securely in production
With Ray on Vertex AI, OSS Ray users can run
securely on Vertex AI while enjoying both Ray’s
ergonomic APIs and Vertex’s scalable, secure, and
elastic infra.
& Saxml
Multi-host TPU with Saxml
● Saxml pre-built container
● Serve Llama 3 open models using multi-host
Cloud TPUs
PyTorch
● Co-host PyTorch models on the same VM
● Multiple endpoints can be deployed on the
same VM within a DeploymentResourcePool
Google Cloud
Model
“What is
a Pixel Tablet?”
“The Pixel Tablet was designed
by Google and contains a
Google Tensor G2 chip...”
With the latest
external knowledge
Fewer hallucinations
Vector DBs
Query: Pixel Tablet
A Brief History of LLM Applications
In the early days Retrieval Augmented Generation (RAG) fueled GenAI
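The RAG flow on this slide (query, vector lookup, answer built from the retrieved text) can be sketched in a few lines. This is a toy illustration only: `embed` is a bag-of-words stand-in for a real embedding model, the in-memory `DOCS` list stands in for a vector DB, and all function names are hypothetical rather than any Vertex AI API.

```python
import math
from collections import Counter

# Toy corpus standing in for documents stored in a vector DB.
DOCS = [
    "The Pixel Tablet was designed by Google and contains a Google Tensor G2 chip.",
    "Cloud Bigtable is a low latency, scalable wide column store.",
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Augment the question with retrieved context before it is sent to a model."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

if __name__ == "__main__":
    print(build_prompt("What is a Pixel Tablet?", DOCS))
```

The model call itself is deliberately left out; the point is that the model sees retrieved, up-to-date text instead of relying only on what it memorised in training.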
Context caching
First provider to offer
context caching API
75%
Lower input price with
context caching*
Take advantage of millions-of-tokens
context windows
Available across both 1.5 Pro and 1.5 Flash
*with >=32K context window
Context
Prompt
Input Prefill
Response
Generation
Output
Input
Prompt
Without
caching
With
caching
Input Prefill
Response
Generation
Output
$$$$
$
Cache
Context
Q/A and Summarisation
Vertex
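As a rough worked example of the pricing claim, the sketch below compares input cost for many questions asked over one large shared context. Only the 75% discount and the 32K-token threshold come from the slide; the per-token price passed in is a placeholder, the threshold is read as 32,000 tokens, and cache creation or storage charges are ignored. All of these are assumptions.

```python
CACHE_DISCOUNT = 0.75        # from the slide: 75% lower input price with caching
MIN_CACHED_TOKENS = 32_000   # from the slide: ">=32K context window" (assumed to mean 32,000 tokens)

def input_cost(context_tokens, prompt_tokens, n_requests, price_per_token, use_cache):
    """Total input cost for n_requests that all share the same context.

    Assumption: when caching applies, every request pays the discounted
    rate for the shared context and the full rate for its own prompt.
    """
    eligible = use_cache and context_tokens >= MIN_CACHED_TOKENS
    context_price = price_per_token * (1 - CACHE_DISCOUNT) if eligible else price_per_token
    per_request = context_tokens * context_price + prompt_tokens * price_per_token
    return n_requests * per_request

if __name__ == "__main__":
    # Placeholder price of 1.0 per token, chosen only to keep the arithmetic readable.
    without = input_cost(100_000, 100, 10, 1.0, use_cache=False)
    cached = input_cost(100_000, 100, 10, 1.0, use_cache=True)
    print(f"without caching: {without}, with caching: {cached}")
```

The saving grows with the ratio of shared context to per-request prompt, which is why the slide pairs caching with very long context windows.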
Grounding
with Google
Search
GENERALLY
AVAILABLE
Only provider to offer
grounding with Google
Search (with Gemini)
Grounding
with 3P data
Coming Soon
Currently working with
premier providers such
as
Grounding
on your data
GENERALLY
AVAILABLE
Ground on private
documents and data in
Vertex AI Search
Provide context to
Grounding API directly
Grounding
with
high-fidelity
Experimental
Ensures high levels of
factuality in response
Dynamic
retrieval
Coming Soon
Smartly decide if retrieval is
needed
Optimizes cost while ensuring
factuality
Q/A and Summarisation
Vertex
Grounding brings the world’s knowledge to find
the relevant information for GenAI
The provided sources only contain financial
information for Alphabet Inc. for Q3 2024 and
previous quarters, but do not include any
information about Google's revenue for Q4 2024.
Grounding Score: 3%
Grounding with High Fidelity:
Introducing grounding scores and
sourcing from provided context
Prompt:
What was Google’s Q1 2024 revenue?
What was YoY growth?
Google's revenue in Q1 2024 was $80.5 billion,
which represents a 15% year-over-year growth.
Grounding Score: 99.2%,
Source: 2024q1-alphabet-earnings-release-pdf
(Page 1)
Prompt:
What was Google’s Q4 2024 revenue?
Given context/Input:
Alphabet quarterly
and annual reports
Q/A and Summarisation
Vertex
Google Cloud
“What is a
Pixel Tablet?”
“The Pixel Tablet was
designed by Google
and contains a Google
Tensor G2 chip…”
Reasoning and orchestration with the tools
Letting the LLM call the functions of the tools
Search Vector DB
Wikipedia Other APIs
Deployment
Model: Let's query with Wikipedia...
Tool: Query "Pixel Tablet" on Wikipedia
Tool: Let's query with Wikipedia...
Model: Summarize the relevant part...
Tools
Model
Orchestration
A Brief History of LLM Applications
…and evolved to Generative AI Agents with reasoning and orchestration
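The model/tool exchange on this slide can be sketched as a small orchestration loop. Everything here is hypothetical: `fake_model` is a scripted stand-in for an LLM, `wikipedia_lookup` returns canned text instead of calling a real API, and the decision format is invented for illustration.

```python
def wikipedia_lookup(query):
    """Hypothetical tool: a canned lookup standing in for a real Wikipedia call."""
    pages = {
        "Pixel Tablet": "The Pixel Tablet was designed by Google and contains a Google Tensor G2 chip.",
    }
    return pages.get(query, "No article found.")

# Tool registry: the orchestrator only runs tools that are listed here.
TOOLS = {"wikipedia": wikipedia_lookup}

def fake_model(question, observations):
    """Scripted stand-in for an LLM: request a tool first, then answer from its result."""
    if not observations:
        topic = question.replace("What is a ", "").rstrip("?")
        return {"action": "call_tool", "tool": "wikipedia", "input": topic}
    return {"action": "answer", "text": observations[-1]}

def run_agent(question, max_steps=3):
    """Alternate between model decisions and tool calls until the model answers."""
    observations = []
    for _ in range(max_steps):
        decision = fake_model(question, observations)
        if decision["action"] == "answer":
            return decision["text"]
        tool = TOOLS[decision["tool"]]
        observations.append(tool(decision["input"]))
    return "No answer within the step limit."

if __name__ == "__main__":
    print(run_agent("What is a Pixel Tablet?"))
```

The step limit is a design choice worth keeping in any real agent loop, since it bounds cost if the model keeps requesting tools.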
Building GenAI Agents on Vertex AI
Model & Grounding
Orchestrate & Plan
Create, launch, and manage
your agents at scale
Google, 3rd Party & Open
Source Models
Taking Action (Tools)
Ground with Google
Search to access fresh, high
quality information
Ground on your own enterprise
data quickly with out-of-the-box
RAG in Vertex AI Search
Build DIY RAG providers with
LlamaIndex on Vertex
High-Fidelity / 3rd Party
Grounding
Connect LLMs to external tools;
call APIs and Services
Build at any level: no code, low
code, or full code options in
Vertex AI Agent Builder &
Agent API
Create your own actions with
Function Calling
accessing custom or private APIs
Deploy and orchestrate
custom agents with
LangChain on Vertex
Access pre-built reusable modules
with Extensions
Agentic Solutions
Vertex
Transparency and Trust
with your GenAI solutions
● Side-by-side Evaluation
● Prompt Evaluation
● Explainability & Inspection
Google ShieldGemma
ShieldGemma is a suite of tools designed to
detect and mitigate harmful content in AI model
inputs and outputs.
ShieldGemma specifically targets hate speech,
harassment, sexually explicit, and dangerous
content.
Google provides rich tools to build safety and
trust into your experiences
For creators (UI), app developers (API), and AI practitioners (fine-tuning)
Search across Cloud
View data and machine learning artifacts across Google
Cloud products in a single place.
Discover key ML assets
Find champion models and golden datasets across projects
and regions, while still respecting IAM boundaries.
Augment with business metadata
Use Dataplex to document asset owners and additional
business metadata.
At a low price
Propagation and storage of Vertex AI technical metadata in
Dataplex is free. Pay only for Dataplex API usage and any
business metadata added via Dataplex.
Preview
Data & ML Discovery with
Google Dataplex
Neo4j is a huge
unlock for RAG
Neo4j is Google’s Graph Database
DBaaS options by data model:
● Document: Firestore (serverless, scalable document store)
● Key-value: Cloud Bigtable (low latency, scalable wide column store)
● In-memory: Memorystore (managed Redis)
● Wide Column: Cloud Bigtable (low latency, scalable wide column store)
● Graph: Fastest Path to Graph
● Time-series: Cloud Bigtable (low latency, scalable wide column store)
● Relational: Cloud SQL (managed MySQL, PostgreSQL, & SQL Server); Cloud Spanner (scalable relational database)
Neo4j Graph Database fills a gap in Google Cloud Platform.
Graph Augmented LLMs in action
Benefits of Using Graph Databases in RAG
● Enhanced Contextual Understanding: Graph databases
excel at capturing complex relationships between entities,
allowing the RAG system to better understand the context of
a query.
● Improved Retrieval Accuracy: Graph traversal algorithms
can efficiently traverse the knowledge graph to retrieve
relevant information, leading to more accurate responses.
● Explainable AI: The knowledge graph provides a
transparent and interpretable representation of the
information used to generate responses, making the AI
system more explainable.
Don’t rely on documents, bring relationships
between entities into the context
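To make "bring relationships between entities into the context" concrete, here is a minimal sketch that walks a tiny in-memory knowledge graph and turns nearby relationships into context lines for a prompt. The entities, relationship names, and helper functions are all made up for illustration; a real deployment would query a graph database such as Neo4j rather than a Python list.

```python
from collections import deque

# Hypothetical knowledge graph as (subject, relationship, object) triples.
TRIPLES = [
    ("Acme Holdings", "OWNS", "Acme Logistics"),
    ("Acme Logistics", "OPERATES", "Melbourne Depot"),
    ("Melbourne Depot", "SHIPS_TO", "Sydney Depot"),
    ("Other Corp", "OWNS", "Other Depot"),
]

def neighbours(entity):
    """Yield (triple, other_entity) for every relationship touching the entity."""
    for s, r, o in TRIPLES:
        if s == entity:
            yield (s, r, o), o
        elif o == entity:
            yield (s, r, o), s

def graph_context(start, max_hops=2):
    """Breadth-first traversal collecting relationships within max_hops of start."""
    seen = {start}
    facts = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for fact, other in neighbours(node):
            if fact not in facts:
                facts.append(fact)
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return [f"{s} {r} {o}" for s, r, o in facts]

if __name__ == "__main__":
    for line in graph_context("Acme Logistics"):
        print(line)
```

The returned lines can be placed in the prompt alongside retrieved documents, and they double as a readable trace of which relationships informed the answer.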
RAG layer
Graph DB
Applications for
knowledge
consumption
Knowledge
extraction and
ingestion
Structured
Unstructured
Ontologies
Data sources GenAI layer
Customer Service
Ticket Triaging
Recommendations
News Content & Discovery
Enterprise Knowledge
Search
Patient Prioritization
Clinical Decision Support
Systems
Pharmacovigilance
Health Assistants
FAQ Bots
Bloom
APIs
VertexAI
with Generative AI
Neo4j Aura
VertexAI
with Generative AI
Knowledge Graph with Semantic Search
Vector DB
Prompt
Engineering
Solution and Benefits
● Provider of commercial data, analytics and insights
for businesses spanning various sizes and sectors
internationally
● Uses Neo4j to support its identity insights
business, including linked and matched data
● Answers questions that span connected data in
real-time, including Ultimate Beneficial Ownership
(UBO) information
Why Neo4j
● Needed a graph solution that aligned with their
cloud strategy
● Helps D&B focus on client needs rather than
database management
Large logistics network in
Australia
Solution and Benefits
● They needed a solution to better understand the
complex relationships within their logistics
network. They knew details about network
endpoints, but getting visibility across the network
of what happened in between was not possible.
● The lack of visibility meant they could not make
real-time decisions about asset flows, and their
ability to make strategic decisions about the
network was constrained by a lack of
understanding of where bottlenecks were occurring.
Why Neo4j
● Needed a solution that could scale up to 32TB
and be mission critical for the organisation.
● Needed a graph solution that aligned with their
cloud strategy on GCP
Select Neo4j and Google Cloud Joint Customers
How will AI help you
run your business?
Abhi
Lead Customer Engineering Data Analytics and AI, MERC
AuNZ

Google Cloud Presentation GraphSummit Melbourne 2024: Building Generative AI on Google Cloud
