Deep
Learning
A subset of machine learning that
uses neural networks with many
layers (deep) and excels at
processing complex data like
images, audio and text.
Gen(erative) AI
A part of deep learning focused
on creating new content like text,
images or music. In other words
it “generates” stuff.
LLM
Large Language Model
Type of model trained on
massive amounts of text data. It
can answer questions and write
text.
Gemini is an LLM developed by
Google.
Model
A system that has been trained
to recognise patterns and make
predictions or decisions based
on data.
A tool that learns from examples
and uses that knowledge to
solve problems or perform a
task.
Model
Models can be tailored.
“Sit, stay, down” - your
average dog
But there are firemen’s
dogs, K-9s, guide dogs
etc.
Multimodal
Model
A type of model that can
understand various inputs (e.g.
audio, image, video, text) and
generate various outputs (e.g.
audio, image and text)
gemini-2.0-flash-001 is a
multimodal model. (*coming soon)
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash
Transformer
Transformer architecture is a
super-smart text processor.
It uses an attention mechanism
with which it can figure out how
words relate to each other.
“The cat chased the mouse, and
it ran away”
Attention
Attention assigns a weight to
each word based on its
relevance to others, scoring their
importance.
This helps the system to
determine relationships between
words and use that context to
determine meaning.
https://www.youtube.com/watch?v=KJtZARuO3JY
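To make the weighting idea concrete, here is a minimal sketch of attention weights for the example sentence. The two-number word vectors are made up for illustration, and a real transformer would first pass the vectors through learned projections, so treat this as a toy rather than how any particular model does it.

```javascript
// Turn raw scores into weights that add up to 1.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Score every word against the query word (scaled dot product), then softmax.
function attentionWeights(query, keys) {
  const scale = Math.sqrt(query.length);
  const scores = keys.map(
    (key) => key.reduce((acc, k, i) => acc + k * query[i], 0) / scale
  );
  return softmax(scores);
}

const words = ['cat', 'chased', 'mouse', 'it'];
// Hypothetical word vectors, chosen so that "it" sits close to "mouse".
const vectors = [
  [0.2, 0.9], // cat
  [0.5, 0.5], // chased
  [0.9, 0.1], // mouse
  [0.8, 0.2], // it
];

// How much does "it" attend to each word in the sentence?
const weights = attentionWeights(vectors[3], vectors);
words.forEach((word, i) => console.log(word, weights[i].toFixed(2)));
```

With these toy vectors "it" gives more weight to "mouse" than to "cat", which is the kind of signal the model uses to work out what "it" refers to.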
Hallucination
Models can generate information
that is incorrect, irrelevant or
completely made up even
though it might sound plausible.
Always do fact-checking.
Transactional
These interactions generate an
answer based on an input but
they are one-off. They do not
“remember” the conversation.
There’s no actual conversation
here.
Transactional
import { GoogleGenerativeAI } from '@google/generative-ai';

// The API key is read from the environment
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });

// One prompt in, one answer out: nothing is remembered between calls
const prompt = 'What is Star Wars?';
const result = await model.generateContent(prompt);
console.log(result.response.text());
Temperature
Parameter that controls the
randomness (creativity) of a
model’s responses.
Different models have different
temperature control parameters.
gemini-1.5-flash: 0.0 - 2.0 (default 1.0)
gemini-2.0-flash: 0.0 - 2.0 (default 0.7)
Temperature
What is the capital of Italy?
T0.2: “Rome”
T0.8: “Rome, the Eternal City”
T1.2: “Rome is the capital of Italy,
known for its ancient history and
landmarks like the Colosseum”
T1.6: “Rome, though Florence could claim
the title for its art and culture”
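One common way to think about temperature is as a divisor applied to the model's raw token scores before they are turned into probabilities. The sketch below assumes that view; the scores are hypothetical and the exact mechanics inside Gemini are not something this deck documents.

```javascript
// Divide the raw scores (logits) by the temperature, then apply softmax.
// Note: a temperature of exactly 0 would divide by zero; APIs that accept 0
// typically treat it as "always pick the most probable token".
function softmaxWithTemperature(logits, temperature) {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical scores for three candidate next tokens.
const logits = [2.0, 1.0, 0.5];

const cold = softmaxWithTemperature(logits, 0.2);
const hot = softmaxWithTemperature(logits, 1.6);

console.log('T0.2:', cold.map((p) => p.toFixed(3)));
console.log('T1.6:', hot.map((p) => p.toFixed(3)));
```

At the low temperature nearly all probability lands on the top token, while the high temperature spreads it out, which is why higher settings produce more varied answers.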
Chat
History can be maintained by
passing a history array to the
startChat method.
const history = [];
let result = await model
.startChat({ history })
.sendMessageStream(input);
Chat
Content needs to be associated
with a role. Gemini supports 2
roles:
user and model.
User is the role which provides
the prompts.
Model is the role that provides
the responses.
Chat
Note that each entry in the history
array must conform to the following
schema.
{
  role: 'user',
  parts: [{
    text: userInput
  }]
}
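As a small sketch of how that schema might be used, the helpers below build history entries and record a full turn. The names toContent and addTurn are hypothetical, not part of the SDK; the resulting array is what would be passed to startChat({ history }).

```javascript
// Build one history entry in the { role, parts: [{ text }] } shape.
function toContent(role, text) {
  if (role !== 'user' && role !== 'model') {
    throw new Error(`Unsupported role: ${role}`);
  }
  return { role, parts: [{ text }] };
}

// Record one full turn: the user's prompt followed by the model's reply.
function addTurn(history, userInput, modelReply) {
  history.push(toContent('user', userInput));
  history.push(toContent('model', modelReply));
  return history;
}

const history = [];
addTurn(history, 'What is Star Wars?', 'A space opera film series.');
console.log(JSON.stringify(history, null, 2));
```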
Try it out
npm run chat
npm run chat-stream
npm run chat-with-history
Token limits
Models have token limits on both
input and output.
For example gemini-2.0-flash-001
has a 1,048,576 input token limit
and an 8,192 output token limit.
https://ai.google.dev/gemini-api/docs/models/gemini#gemini-2.0-flash
Model
parameters
Models take parameter values
that control how they generate a
response.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
Max output
tokens
The maximum number of tokens
(pieces of words) that can be
returned. Generally 100 tokens is
about 60-80 words.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
Top-K
Top-K controls how a model
selects tokens by limiting the
choices to the K most probable
tokens.
Top-K = 1 means that the model
always selects the most probable
token.
Top-K = 3 means that the model
picks the next token from the 3
most probable tokens.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
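A minimal sketch of the Top-K idea, assuming we already have a probability for each candidate token. The token list and probabilities are invented for illustration.

```javascript
// Keep only the k most probable tokens, then renormalise so they sum to 1.
function topK(distribution, k) {
  const kept = Object.entries(distribution)
    .sort(([, a], [, b]) => b - a)
    .slice(0, k);
  const total = kept.reduce((sum, [, p]) => sum + p, 0);
  return Object.fromEntries(kept.map(([token, p]) => [token, p / total]));
}

// Hypothetical next-token probabilities.
const distribution = { Rome: 0.6, Milan: 0.2, Florence: 0.15, Paris: 0.05 };

console.log(topK(distribution, 1)); // only "Rome" is left to choose from
console.log(topK(distribution, 3)); // "Paris" is no longer a candidate
```

The model then samples from whatever survives the filter, so Top-K = 1 leaves no randomness at all.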
Top-P
Top-P dynamically limits token
selection to the smallest set of
tokens whose cumulative
probability meets a threshold.
Lower values make responses
more predictable. Higher values
also consider less probable tokens.
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values
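The same toy setup can illustrate Top-P. This is a sketch under the assumption that tokens are added from most to least probable until the threshold is reached; the probabilities are invented.

```javascript
// Keep the smallest set of most-probable tokens whose cumulative
// probability reaches p, then renormalise the survivors.
function topP(distribution, p) {
  const sorted = Object.entries(distribution).sort(([, a], [, b]) => b - a);
  const kept = [];
  let cumulative = 0;
  for (const [token, prob] of sorted) {
    kept.push([token, prob]);
    cumulative += prob;
    if (cumulative >= p) break;
  }
  return Object.fromEntries(kept.map(([token, prob]) => [token, prob / cumulative]));
}

// Hypothetical next-token probabilities.
const distribution = { Rome: 0.6, Milan: 0.2, Florence: 0.15, Paris: 0.05 };

console.log(Object.keys(topP(distribution, 0.5)));
console.log(Object.keys(topP(distribution, 0.9)));
```

Unlike Top-K, the number of surviving tokens changes with the shape of the distribution: a confident model keeps few candidates, an unsure one keeps many.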
Safety
Block harmful content such as
harassment, hate, sexually
explicit language, dangerous
content, and content that could
jeopardise civic integrity.
https://ai.google.dev/gemini-api/docs/safety-settings
Context
What happens to the context in
multi-turn conversational
systems?
Often there’s a cutoff. At that point
we ask the LLM to generate a
summary and store it, so that the
model stays aware of what was
discussed so far without running
out of context.
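The cutoff-and-summary approach could be sketched roughly as follows. The helper names are hypothetical, and the summary string is a placeholder: in practice it would come from a separate request asking the LLM to summarise the older messages. Storing the summary as a user/model pair is an assumption made here to keep the roles alternating.

```javascript
const text = (role, value) => ({ role, parts: [{ text: value }] });

// Split the history at the cutoff: older messages get summarised,
// the most recent ones are kept verbatim.
function splitAtCutoff(history, keepRecent) {
  const cut = Math.max(0, history.length - keepRecent);
  return { older: history.slice(0, cut), recent: history.slice(cut) };
}

// Rebuild a shorter history from the stored summary plus the recent messages.
function withSummary(summary, recent) {
  return [
    text('user', 'Summarise our conversation so far.'),
    text('model', summary),
    ...recent,
  ];
}

const history = [
  text('user', 'What is Star Wars?'),
  text('model', 'A space opera film series.'),
  text('user', 'Who directed the first film?'),
  text('model', 'George Lucas.'),
  text('user', 'When was it released?'),
  text('model', 'In 1977.'),
];

const { older, recent } = splitAtCutoff(history, 2);
// Placeholder summary of `older` (would normally be generated by the LLM).
const summary = 'We discussed Star Wars and its director, George Lucas.';
const compacted = withSummary(summary, recent);

console.log(`${history.length} messages reduced to ${compacted.length}`);
```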
Structured
Output
Instruct the model to respond
using a specific format. The
format can be embedded in the
prompt or defined as a schema
built programmatically.
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node
Structured
Output
import { SchemaType } from '@google/generative-ai';

// Each array item describes one film
const schema = {
  description: 'Array of original Star Wars films with details',
  type: SchemaType.ARRAY,
  items: {
    type: SchemaType.OBJECT,
    properties: {
      title: {
        type: SchemaType.STRING,
        description: 'Title of the film',
        nullable: false,
      },
      released: {
        type: SchemaType.STRING,
        description: 'Release date of the film',
        nullable: false,
      },
      characters: {
        type: SchemaType.STRING,
        description: 'Notable characters in the film',
        nullable: false,
      },
      plot: {
        type: SchemaType.STRING,
        description: "Short summary of the film's plot",
        nullable: false,
      },
    },
    required: ['title', 'released', 'characters', 'plot'],
  },
};
https://ai.google.dev/gemini-api/docs/structured-output?hl=en&lang=node
Live
Information
Models are trained on vast
amounts of data, and they have a
training cut-off date. In other
words, information that is inside
a model may be outdated.
Also, models cannot retrieve live
information on their own.
Function
Calling
Allows AI to call predefined
functions to perform specific
actions. These actions can be for
data retrieval or anything else.
A function is just (in our case) a
JavaScript function.
Function
Calling
const model = genAI.getGenerativeModel({
  model: 'gemini-2.0-flash-exp',
  tools: [
    {
      functionDeclarations: [
        {
          name: 'getWeather',
          description:
            'gets the weather for a city and returns the forecast using the metric system.',
          parameters: {
            type: 'object',
            properties: {
              city: {
                type: 'string',
                description: 'the city for which the weather is requested',
              },
            },
            required: ['city'],
          },
        },
      ],
    },
  ],
  toolConfig: { functionCallingConfig: { mode: 'AUTO' } },
});
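The declaration only tells the model that getWeather exists; your own code still has to run it. A minimal sketch of that dispatch step is shown below. It assumes the model's request arrives as an object with a name and args, and the weather data is a hard-coded stub rather than a real service call.

```javascript
// Local implementations, keyed by the name used in functionDeclarations.
const functions = {
  // Stub: a real implementation would call a weather service.
  getWeather: ({ city }) => ({ city, forecast: 'sunny', temperatureCelsius: 30 }),
};

// Run the function the model asked for and return its result.
function dispatch(functionCall) {
  const fn = functions[functionCall.name];
  if (!fn) throw new Error(`Unknown function: ${functionCall.name}`);
  return fn(functionCall.args);
}

// Assumed shape of a function call requested by the model.
const call = { name: 'getWeather', args: { city: 'Singapore' } };
console.log(dispatch(call));
```

The result would then be sent back to the model so that it can phrase the final answer in natural language.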
Vector
Vectorisation is the process of
turning words into vectors (lists of
numbers).
The attention mechanism uses
the context to assign meaning to
a word allowing the model to
understand and process words
accurately.
Vector
Remember, context matters.
Words like “deposit” and “money” are
a strong signal that bank is a
financial institution.
“Sat”, “river” and “by” suggest
bank means riverbank.
Embedding
An embedding is a way of
representing words as numerical
vectors.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]
Distance
The distance between the
vectors determines how
“close” they are to one another.
For example
“king” [0.8, 0.1, 0.6]
“queen” [0.8, 0.1, 0.8]
“dog” [0.3, 0.8, 0.2]
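As a quick worked example, the straight-line (Euclidean) distance between the three vectors above can be computed like this. Euclidean distance is just one possible measure, chosen here for simplicity.

```javascript
// Straight-line distance between two vectors of the same length.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0));
}

const king = [0.8, 0.1, 0.6];
const queen = [0.8, 0.1, 0.8];
const dog = [0.3, 0.8, 0.2];

console.log(euclideanDistance(king, queen).toFixed(2)); // 0.20
console.log(euclideanDistance(king, dog).toFixed(2)); // 0.95
```

“king” and “queen” are much closer to each other than either is to “dog”.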
Distance
Think about a product
recommendation. And let’s do
some super-simple maths.
Let’s say that Kiasu Karen
bought an apple 🍎 and a banana
🍌.
We take sweetness and texture
into consideration.
Embedding
Embeddings are important
because they allow for use cases
that revolve around QA
systems, recommendation
systems and similarity searches.
Note that searches can only be run
against vector indices.
Embedding
1. Ingest data to BigQuery
2. Create a model
3. Create embeddings
4. Create index
5. Query
-- Step 2: create a remote model that points at the embedding endpoint
CREATE OR REPLACE MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`
REMOTE WITH CONNECTION `ai-workshop-448608.asia-southeast1.ai_workshop_connection`
OPTIONS(
  endpoint = 'text-embedding-005'
);

-- Step 3: create embeddings from each film's title and overview
CREATE OR REPLACE TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings` AS (
  SELECT *
  FROM ML.GENERATE_EMBEDDING(
    MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
    (
      SELECT *, CONCAT(title, '', overview) AS content
      FROM `ai-workshop-448608.ai_workshop_films.films`
    )
  )
);

-- Step 4: create a vector index on the embedding column
CREATE OR REPLACE VECTOR INDEX `idx`
ON `ai-workshop-448608.ai_workshop_films.films_with_embeddings`
(ml_generate_embedding_result)
OPTIONS(
  distance_type = 'COSINE', index_type = 'IVF', ivf_options = '{"num_lists": 10}'
);
https://www.youtube.com/watch?v=eztSNAZ0f_4
https://www.youtube.com/watch?v=eztSNAZ0f_4
SELECT * FROM VECTOR_SEARCH(
  TABLE `ai-workshop-448608.ai_workshop_films.films_with_embeddings`,
  'ml_generate_embedding_result',
  (
    SELECT ml_generate_embedding_result AS embedding_col
    FROM ML.GENERATE_EMBEDDING(
      MODEL `ai-workshop-448608.ai_workshop_films.film_embedding`,
      (SELECT "Star Wars" AS content),
      STRUCT(TRUE AS flatten_json_output)
    )
  ),
  top_k => 5
);
Try it out
npm run recommendation “”
npm run recommendation-advanced
Similarity
Search
1. Embed the content
2. Embed the question/query
3. Return results based on
similarity (distance) matches.
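The steps above can be sketched in a few lines, using cosine similarity as the measure. The content vectors reuse the toy numbers from the Embedding slide and the query vector is made up; in a real system both would come from an embedding model.

```javascript
// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Score every item against the query and return the best matches first.
function search(queryVector, items, limit) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Step 1: the embedded content (toy vectors).
const items = [
  { text: 'king', vector: [0.8, 0.1, 0.6] },
  { text: 'queen', vector: [0.8, 0.1, 0.8] },
  { text: 'dog', vector: [0.3, 0.8, 0.2] },
];

// Step 2: the embedded query (hypothetical vector).
const query = [0.7, 0.2, 0.7];

// Step 3: return the closest matches.
const results = search(query, items, 2);
console.log(results.map((r) => `${r.text} ${r.score.toFixed(2)}`));
```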
Similarity
Search
User reads and likes “Harry Potter
and the Prisoner of Azkaban”
Book                                Similarity Score
Harry Potter and the Cursed Child   0.98
Percy Jackson & The Olympians       0.85
The Magicians by Lev Grossman       0.82
The other two results have lower similarity
scores but are more relevant to what the user
originally read.
https://www.youtube.com/watch?v=o5_t6Ai--ws
Fine Tuning
Process of adapting a pre-trained
model for specific tasks
or use cases. Done by providing
examples to the model. (This is
also called supervised
fine-tuning)
https://ai.google.dev/gemini-api/docs/model-tuning
RAG
Retrieval-Augmented Generation
is a technique where information is
first retrieved from source(s)
specified by the user, and the model
then uses that information to
generate an answer in natural
language.
https://cloud.google.com/use-cases/retrieval-augmented-generation?hl=en
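A rough sketch of the retrieve-then-generate flow is shown below. Retrieval here is a deliberately naive word-overlap ranking (real systems would normally use embeddings and a vector search), the sources are invented, and the final prompt is only printed rather than sent to a model.

```javascript
const wordsOf = (text) => text.toLowerCase().match(/\w+/g) || [];

// Toy retrieval: rank sources by how many words they share with the question.
function retrieve(question, sources, limit) {
  const questionWords = new Set(wordsOf(question));
  return sources
    .map((source) => ({
      source,
      overlap: wordsOf(source).filter((w) => questionWords.has(w)).length,
    }))
    .filter((r) => r.overlap > 0)
    .sort((a, b) => b.overlap - a.overlap)
    .slice(0, limit)
    .map((r) => r.source);
}

// Augment the prompt with the retrieved passages.
function buildPrompt(question, passages) {
  return [
    'Answer the question using only the sources below.',
    ...passages.map((p, i) => `Source ${i + 1}: ${p}`),
    `Question: ${question}`,
  ].join('\n');
}

// Hypothetical sources specified by the user.
const sources = [
  'Our store opens at 9am and closes at 6pm on weekdays.',
  'Returns are accepted within 30 days with a receipt.',
];

const question = 'When are returns accepted?';
const prompt = buildPrompt(question, retrieve(question, sources, 1));
console.log(prompt);
```

The generation step would pass this prompt to the model, for example with generateContent as in the Transactional slide.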
Agents
An AI agent is a smart assistant
that can observe a situation,
think about what to do and act to
achieve the desired outcome/goal.
Agents
There are many types of agents -
some just call functions, and
some systems can also be
multi-agent.
Agents
Agents in a multi-agent system can
pass work to each other and
communicate with each other
effectively to complete a task.
Agentic
RAG
Agentic RAG combines AI agents
with RAG.
It initially retrieves answers from
specified data sources but can
autonomously expand its search
via function calling if no relevant
information is found.
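The fallback behaviour could be sketched like this. Both search functions are hypothetical stand-ins, and the decision is a plain if statement here, whereas in an agentic system the model itself would decide when to call the extra function.

```javascript
// Try the specified sources first; fall back to a tool when nothing is found.
function gatherContext(question, searchSources, fallbackTool) {
  const passages = searchSources(question);
  if (passages.length > 0) return { via: 'sources', passages };
  return { via: 'tool', passages: fallbackTool(question) };
}

// Stand-in for a search over the specified data sources.
const searchSources = (question) =>
  question.toLowerCase().includes('returns')
    ? ['Returns are accepted within 30 days with a receipt.']
    : [];

// Stand-in for a function the agent may call, e.g. a web search.
const webSearch = (question) => [`(web result for "${question}")`];

console.log(gatherContext('When are returns accepted?', searchSources, webSearch).via);
console.log(gatherContext('What is the weather today?', searchSources, webSearch).via);
```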
Generative
Fill
Uses a method called inpainting,
where AI looks at the colours,
textures and patterns around the
selected area and generates new
content that blends in with the
rest of the image.
Try it out
npm run ecommerce
https://videoapi.cloudinary.com/video-demo/video-smart-cropping
C2PA
C2PA (Coalition for Content
Provenance and Authenticity) is a
standard developed to provide a
framework for verifying the
authenticity and provenance of
digital content.
Evals
Evals are a method for testing AI
systems to assess their
performance and reliability. They
are crucial in production to
measure how well a model
functions and responds to
different inputs.
https://cloud.google.com/vertex-ai/generative-ai/docs/models/determine-eval
Evals
A basic eval could be: “Did the AI
call the function when it had to?”
Score 1 - yes, Score 0 - no
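That basic eval could look something like the sketch below. The test cases and the recorded function calls are invented for illustration; in practice the calls would be captured from real model responses.

```javascript
// Score 1 if the expected function was called, otherwise 0.
function scoreFunctionCall(expectedName, functionCalls) {
  return functionCalls.some((call) => call.name === expectedName) ? 1 : 0;
}

// Hypothetical eval cases: which function the model called for each prompt.
const cases = [
  {
    prompt: 'What is the weather in Rome?',
    expected: 'getWeather',
    calls: [{ name: 'getWeather', args: { city: 'Rome' } }],
  },
  {
    prompt: 'Is it raining in Paris?',
    expected: 'getWeather',
    calls: [], // the model answered without calling the function
  },
];

const scores = cases.map((c) => scoreFunctionCall(c.expected, c.calls));
const average = scores.reduce((a, b) => a + b, 0) / scores.length;

console.log(scores, average);
```

Averaging the scores over many cases gives a single number that can be tracked as prompts or models change.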
Learn more
https://github.com/braintrustdata/autoevals