AI is not magic. It is structured pattern learning, statistical reasoning, and probabilistic prediction at scale.
1️⃣ Artificial Intelligence (AI) – Overview
What is AI?
Artificial Intelligence is the broader field focused on building systems that can:
- Perceive
- Reason
- Learn
- Act
- Make decisions
AI includes:
- Machine Learning
- Deep Learning
- Natural Language Processing
- Computer Vision
- Reinforcement Learning
- Generative AI
2️⃣ Machine Learning (ML)
What is ML?
Machine Learning is a subset of AI where systems learn patterns from data instead of being explicitly programmed.
Instead of:
If A → then B
ML does:
Learn pattern(A, B) from data
Types of Machine Learning
1️⃣ Supervised Learning
Model learns from labeled data.
Example:
- Input: Email text
- Label: Spam / Not spam
Used for:
- Classification
- Regression
Examples:
- Linear regression
- Logistic regression
- Decision trees
- Neural networks
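The spam example above can be sketched as logistic regression trained with plain gradient descent. This is a minimal illustration, not a production pipeline: the two features (link count, all-caps word count) and the tiny hand-made dataset are hypothetical.

```python
import numpy as np

# Hypothetical labeled data: features = [num_links, num_caps_words],
# label = 1 (spam) / 0 (not spam).
X = np.array([[5, 8], [4, 6], [0, 1], [1, 0], [6, 7], [0, 0]], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression: learn weights that map features to a spam probability.
w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(2000):
    p = sigmoid(X @ w + b)           # predicted probability of spam
    grad_w = X.T @ (p - y) / len(y)  # gradient of log-loss w.r.t. weights
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

def predict(features):
    return int(sigmoid(np.array(features) @ w + b) > 0.5)

print(predict([7, 9]))  # link-heavy email -> 1 (spam)
print(predict([0, 0]))  # plain email -> 0 (not spam)
```

The model was never given an explicit rule like "many links means spam"; it learned the pattern from the labeled examples.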
2️⃣ Unsupervised Learning
Model learns patterns without labels.
Used for:
- Clustering
- Dimensionality reduction
- Anomaly detection
Examples:
- K-means
- DBSCAN
- PCA
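K-means from the list above can be sketched in a few lines, assuming two synthetic, well-separated blobs of points with no labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic blobs; the algorithm must discover the grouping on its own.
points = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(50, 2)),
])

def kmeans(X, k, iters=20):
    # Initialize centroids from random data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

labels, centroids = kmeans(points, k=2)
```

With well-separated blobs, each blob ends up with its own consistent cluster label, recovered without any supervision.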
3️⃣ Semi-Supervised Learning
Combination of:
- Small labeled dataset
- Large unlabeled dataset
Useful when labeling is expensive.
4️⃣ Reinforcement Learning (distinct from supervised learning)
Agent:
- Takes actions
- Receives reward
- Learns policy to maximize long-term reward
Used in:
- Robotics
- Game AI
- LLM RLHF (Reinforcement Learning from Human Feedback)
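The act / reward / learn loop above can be sketched with a two-armed bandit and an epsilon-greedy policy, one of the simplest RL settings. The reward probabilities here are hypothetical:

```python
import random

random.seed(42)

# Two-armed bandit: action 1 pays off more often (hypothetical probabilities).
reward_prob = {0: 0.2, 1: 0.8}

q = {0: 0.0, 1: 0.0}    # estimated value of each action
counts = {0: 0, 1: 0}
epsilon = 0.1            # exploration rate

for step in range(5000):
    # Explore occasionally; otherwise exploit the best-known action.
    if random.random() < epsilon:
        action = random.choice([0, 1])
    else:
        action = max(q, key=q.get)
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    q[action] += (reward - q[action]) / counts[action]

print(max(q, key=q.get))  # the agent learns to prefer the better arm
```

No one told the agent which arm was better; the policy emerged from rewards alone, which is the core idea behind RLHF as well (with human preference scores as the reward signal).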
3️⃣ Deep Learning
What is Deep Learning?
Deep Learning is a subset of ML using multi-layer neural networks to learn hierarchical patterns.
Key characteristics:
- Large datasets
- Large models
- Automatic feature extraction
- High compute
Neural Networks
Basic structure:
Input → Hidden layers → Output
Each layer computes:
- Weighted sum
- Activation function
Weights are adjusted via backpropagation during training.
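A forward pass through that structure can be sketched with NumPy. The weights here are random stand-ins; training would tune them via backpropagation:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 4))   # input (3 features) -> hidden (4 units)
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 2))   # hidden -> output (2 units)
b2 = np.zeros(2)

def forward(x):
    hidden = relu(x @ W1 + b1)  # weighted sum + activation
    return hidden @ W2 + b2     # output layer: raw scores

out = forward(np.array([0.5, -0.2, 1.0]))
print(out.shape)  # (2,)
```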
Transformers (Modern AI Backbone)
Most modern LLMs use:
Transformer architecture
Key innovations:
- Self-attention mechanism
- Parallel processing
- Contextual token relationships
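Scaled dot-product self-attention, the core of those innovations, can be sketched as follows. The projection matrices are random here; a trained transformer would learn them:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project the same token sequence into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    # Every token scores its relationship to every other token, in parallel.
    scores = Q @ K.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # context-weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))             # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one context-aware vector per token
```

Note that all token pairs are scored in one matrix multiply, which is what makes the architecture so parallelizable compared with recurrent models.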
4️⃣ Large Language Models (LLMs)
What is an LLM?
A large neural network trained on massive text datasets to:
- Predict next token
- Generate human-like text
- Perform reasoning-like tasks
Important:
LLMs are probabilistic token predictors, not reasoning engines.
Key Properties
- Context window (token limit)
- Temperature (randomness control)
- Top-K / Top-P sampling
- Non-deterministic outputs
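Temperature and top-K sampling can be sketched as operations on a vector of logits. This is a minimal illustration of the decoding step, not any particular API:

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, rng=None):
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float)
    # Temperature rescales logits: <1 sharpens, >1 flattens the distribution.
    scaled = logits / temperature
    if top_k is not None:
        # Keep only the k highest-scoring tokens; mask out the rest.
        cutoff = np.sort(scaled)[-top_k]
        scaled = np.where(scaled >= cutoff, scaled, -np.inf)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

logits = [2.0, 1.0, 0.2, -1.0]
rng = np.random.default_rng(0)
# Near-zero temperature is effectively greedy (deterministic) decoding.
print(sample_token(logits, temperature=0.01, rng=rng))
```

At higher temperatures the same call spreads probability across more tokens, which is where the non-determinism comes from.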
Why LLMs Hallucinate
LLMs:
- Always try to produce an answer
- Fill gaps probabilistically
- May fabricate information if context is weak
Hallucination = confident but incorrect generation.
5️⃣ Embeddings
What Are Embeddings?
Embeddings convert text into numerical vectors that capture semantic meaning.
Example:
"doctor" and "physician"
→ Similar vector representations.
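Similarity between embeddings is usually measured with cosine similarity. The 4-dimensional vectors below are toy stand-ins; real embedding models emit hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy "embeddings" chosen to illustrate the idea.
vectors = {
    "doctor":    [0.90, 0.80, 0.10, 0.00],
    "physician": [0.88, 0.82, 0.12, 0.05],
    "banana":    [0.00, 0.10, 0.90, 0.80],
}

print(cosine_similarity(vectors["doctor"], vectors["physician"]))  # close to 1.0
print(cosine_similarity(vectors["doctor"], vectors["banana"]))     # much lower
```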
Why Embeddings Matter
They enable:
- Semantic search
- Vector databases
- RAG systems
- Similarity comparison
Embedding Models vs Chat Models
Embedding model:
- Converts text → vector
- Used for retrieval
Chat/completion model:
- Generates text
- Used for response creation
6️⃣ Retrieval-Augmented Generation (RAG)
What Problem Does RAG Solve?
LLMs:
- Have static training data
- Cannot access private enterprise documents
- May hallucinate
RAG solves:
Grounding LLMs with external knowledge during inference.
Classic RAG Flow
- Data ingestion
- Chunking
- Embedding generation
- Vector storage
- User query embedding
- Top-K retrieval
- Prompt augmentation
- LLM generation
- Post-processing
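The retrieval half of that flow can be sketched end to end. As a stand-in for a real embedding model, this sketch uses bag-of-words counts over a tiny vocabulary; the documents and query are hypothetical:

```python
import numpy as np

# Stand-in "embedding": bag-of-words counts over a tiny vocabulary.
def embed(text, vocab):
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

docs = [
    "the refund policy allows returns within 30 days",
    "our office is open monday to friday",
    "refund requests require the original receipt",
]
vocab = sorted({w for d in docs for w in d.lower().split()})

# Ingestion: embed every chunk and keep vectors in an in-memory "store".
index = [(d, embed(d, vocab)) for d in docs]

def retrieve(query, k=2):
    # Query time: embed the query and rank chunks by cosine similarity.
    q = embed(query, vocab)
    scored = []
    for doc, vec in index:
        denom = np.linalg.norm(q) * np.linalg.norm(vec)
        scored.append((float(q @ vec / denom) if denom else 0.0, doc))
    return [doc for _, doc in sorted(scored, reverse=True)[:k]]

# Prompt augmentation: splice the top-K chunks into the LLM prompt.
context = retrieve("what is the refund policy")
prompt = "Answer using only this context:\n" + "\n".join(context)
print(context[0])
```

A production system would swap the bag-of-words function for an embedding model and the in-memory list for a vector database, but the flow (ingest, embed, store, retrieve, augment) is the same.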
Why Chunking Matters
Too small:
- Fragmented context
- Weak retrieval
Too large:
- Noise
- Context overflow
Balanced chunking:
- ~800–1200 tokens with overlap
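Overlapping chunking can be sketched as a sliding window over a token list. The sizes below are tiny for illustration; in practice you would use the token counts above:

```python
def chunk(tokens, size=5, overlap=2):
    # Slide a fixed-size window; each step advances by (size - overlap),
    # so consecutive chunks share `overlap` tokens of context.
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

tokens = list(range(12))  # stand-in for a tokenized document
for c in chunk(tokens):
    print(c)
```

The shared overlap means a sentence that straddles a chunk boundary is still fully contained in at least one chunk, which helps retrieval.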
Hybrid Search
Pure vector search:
- Similarity only
Hybrid search:
- Keyword + semantic + vector
Improves:
- Recall
- Precision
- Robustness
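One common way to combine keyword and vector result lists is reciprocal rank fusion (RRF), used by several search engines. A minimal sketch, with hypothetical doc IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    # Each ranking is a list of doc IDs, best first. A document's fused score
    # is the sum of 1 / (k + rank) over every ranking it appears in.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_ranking = ["doc3", "doc1", "doc7"]   # e.g. BM25 results
vector_ranking  = ["doc1", "doc5", "doc3"]   # e.g. cosine-similarity results
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # doc1 first: it ranks well in both lists
```

Documents that score well in both rankings rise to the top, which is exactly the robustness benefit hybrid search is after.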
Common RAG Failure Modes
- Poor chunking
- Weak retrieval recall
- Misconfigured hybrid search
- Prompt grounding errors
- Context window overflow
- Stale index
Most common failure:
Retrieval quality degradation.
7️⃣ Fine-Tuning vs RAG
Fine-Tuning
- Retrains model on domain data
- Static knowledge
- Higher cost
- Requires training pipeline
Best for:
- Style adaptation
- Structured task specialization
RAG
- No retraining required
- Dynamic knowledge
- Lower cost
- Uses external data
Best for:
- Enterprise knowledge
- Frequently updated content
Key Distinction
Fine-tuning changes the model. RAG augments the model.
8️⃣ Prompt Engineering
What Is Prompt Engineering?
Designing input instructions to guide LLM behavior.
Includes:
- System role definition
- Output format constraints
- Few-shot examples
- Guardrails
Chain of Thought (CoT)
Technique where the model generates intermediate reasoning steps before the final answer.
Improves reasoning quality.
Structured Output
Using:
- JSON schema
- Function calling
- Tool invocation
Improves determinism.
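Even with structured output, the model's reply should be parsed and validated before anything downstream trusts it. A minimal sketch, with a hypothetical reply and key set:

```python
import json

# Hypothetical raw LLM reply that was instructed to emit strict JSON.
raw_reply = '{"intent": "refund", "confidence": 0.92}'

REQUIRED_KEYS = {"intent", "confidence"}

def parse_structured(reply):
    # Parse, then validate shape before trusting the output downstream.
    data = json.loads(reply)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not isinstance(data["confidence"], (int, float)):
        raise ValueError("confidence must be numeric")
    return data

print(parse_structured(raw_reply)["intent"])  # refund
```

A real system would typically use a full JSON Schema validator, but the principle is the same: treat model output as untrusted input.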
9️⃣ Agentic AI
What Is Agentic AI?
LLM system that:
- Reasons
- Decides
- Calls tools
- Executes multi-step workflows
Difference from RAG:
RAG = retrieve + generate
Agent = plan + act + observe + refine
When to Use Agentic Pattern
Use when:
- Multi-step reasoning required
- Workflow is non-deterministic
- Tool orchestration needed
Do NOT use for simple FAQ bots.
🔟 Responsible AI
Key Principles
- Transparency
- Fairness
- Privacy
- Security
- Accountability
Common Risks
- Prompt injection
- Data leakage
- Hallucination
- Bias
- Unauthorized data access
Enterprise Best Practices
- Retrieval-level RBAC
- Strict system prompts
- Content filtering
- Human-in-the-loop for write actions
- Audit logging
- Evaluation pipelines
1️⃣1️⃣ Evaluation of RAG Systems
Measure separately:
Retrieval Quality
- Recall@K
- Precision
- MRR
Generation Quality
- Faithfulness
- Groundedness
- Hallucination rate
- Completeness
Always evaluate retrieval and generation independently.
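The retrieval metrics above are simple to compute directly. A minimal sketch of Recall@K and MRR, with hypothetical retrieval results:

```python
def recall_at_k(retrieved, relevant, k):
    # Fraction of the relevant documents that appear in the top-k results.
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

def mrr(queries):
    # Mean Reciprocal Rank: average of 1/rank of the first relevant result.
    total = 0.0
    for retrieved, relevant in queries:
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(queries)

retrieved = ["d2", "d9", "d1", "d4"]
relevant = {"d1", "d4"}
print(recall_at_k(retrieved, relevant, k=3))  # 0.5: d1 found, d4 missed

queries = [(["d2", "d1"], {"d1"}), (["d5"], {"d9"})]
print(mrr(queries))  # (1/2 + 0) / 2 = 0.25
```

Generation metrics like faithfulness usually need an LLM-as-judge or human review, which is another reason to score the two stages separately.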
1️⃣2️⃣ Cost, Performance, Observability
Performance
- Separate ingestion & query pipelines
- Use hybrid search
- Control Top-K
- Monitor latency
Cost Optimization
- Reduce MaxOutputTokens
- Cache embeddings
- Batch ingestion
- Use smaller models where possible
Observability
Track:
- Token usage
- Latency (retrieval vs generation)
- Error rates
- Search scores
- Model deployment name
- Correlation IDs
Use:
- Application Insights
- Structured logging
- Distributed tracing
🔥 Core Mental Models to Remember
LLMs are probabilistic generators, not databases.
Security must be enforced before generation.
Retrieval quality determines answer quality.
Ingestion and query pipelines must be separated.
Observability is mandatory in enterprise AI.
