Vivek Kaushik
🧠 AI Fundamentals

Created: Feb 14, 2026 07:37 AM
AI is not magic. It is structured pattern learning, statistical reasoning, and probabilistic prediction at scale.

1️⃣ Artificial Intelligence (AI) – Overview

What is AI?

Artificial Intelligence is the broader field focused on building systems that can:
  • Perceive
  • Reason
  • Learn
  • Act
  • Make decisions
AI includes:
  • Machine Learning
  • Deep Learning
  • Natural Language Processing
  • Computer Vision
  • Reinforcement Learning
  • Generative AI

2️⃣ Machine Learning (ML)

What is ML?

Machine Learning is a subset of AI where systems learn patterns from data instead of being explicitly programmed.
Instead of:
If A → then B
ML does:
Learn pattern(A, B) from data
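The contrast can be sketched in a few lines of Python: instead of hard-coding the rule y = 2x, we estimate it from noisy (input, output) pairs. The data values below are invented for illustration.

```python
# Instead of coding the rule y = 2x, estimate it from data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # noisy observations of y ≈ 2x

# Closed-form slope for a no-intercept linear fit: w = Σxy / Σx²
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print(round(w, 2))  # close to 2.0 — the "rule" was learned, not written
```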

Types of Machine Learning

1️⃣ Supervised Learning

Model learns from labeled data.
Example:
  • Input: Email text
  • Label: Spam / Not spam
Used for:
  • Classification
  • Regression
Examples:
  • Linear regression
  • Logistic regression
  • Decision trees
  • Neural networks
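A minimal supervised-learning sketch, using hypothetical spam data: a 1-nearest-neighbour classifier over bag-of-words counts. The training examples are made up; a production system would use a proper model and far more data.

```python
# Tiny labeled dataset: word counts → spam/ham label.
train = [
    ({"win": 2, "prize": 1}, "spam"),
    ({"meeting": 1, "agenda": 1}, "ham"),
]

def distance(a, b):
    # Squared Euclidean distance over the union of word keys.
    keys = set(a) | set(b)
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys)

def predict(features):
    # Label of the closest training example wins (1-NN).
    return min(train, key=lambda ex: distance(ex[0], features))[1]

print(predict({"win": 1, "prize": 2}))  # → spam
```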

2️⃣ Unsupervised Learning

Model learns patterns without labels.
Used for:
  • Clustering
  • Dimensionality reduction
  • Anomaly detection
Examples:
  • K-means
  • DBSCAN
  • PCA
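K-means, the first algorithm listed, fits in a few lines in one dimension. This is a simplified sketch (fixed k = 2, naive initialisation); real libraries handle initialisation and convergence far more carefully.

```python
# Minimal 1-D k-means (k = 2): alternate point assignment and
# centroid update for a fixed number of iterations.
def kmeans_1d(points, iters=10):
    c = [min(points), max(points)]            # crude initialisation
    for _ in range(iters):
        clusters = ([], [])
        for p in points:
            # Assign each point to its nearer centroid.
            clusters[abs(p - c[0]) > abs(p - c[1])].append(p)
        c = [sum(cl) / len(cl) if cl else c[i] for i, cl in enumerate(clusters)]
    return sorted(c)

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]         # two obvious groups
print(kmeans_1d(data))                        # centroids near 1.0 and 9.07
```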

3️⃣ Semi-Supervised Learning

Combination of:
  • Small labeled dataset
  • Large unlabeled dataset
Useful when labeling is expensive.

4️⃣ Reinforcement Learning (not a form of supervised learning)

Agent:
  • Takes actions
  • Receives reward
  • Learns policy to maximize long-term reward
Used in:
  • Robotics
  • Game AI
  • LLM RLHF (Reinforcement Learning from Human Feedback)
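The act → reward → learn loop above can be sketched with the simplest RL setting, a two-armed bandit. The payout probabilities are invented for the example, and the epsilon-greedy policy is one of many possible strategies.

```python
import random

# Epsilon-greedy agent learning which of two slot machines pays better.
random.seed(0)
true_payout = [0.3, 0.8]                  # hidden from the agent
estimates, counts = [0.0, 0.0], [0, 0]

for _ in range(2000):
    # Explore 10% of the time, otherwise exploit the current best estimate.
    if random.random() < 0.1:
        arm = random.randrange(2)
    else:
        arm = estimates.index(max(estimates))
    reward = 1.0 if random.random() < true_payout[arm] else 0.0
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean

print(estimates.index(max(estimates)))    # arm the agent now prefers
```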

3️⃣ Deep Learning

What is Deep Learning?

Deep Learning is a subset of ML using multi-layer neural networks to learn hierarchical patterns.
Key characteristics:
  • Large datasets
  • Large models
  • Automatic feature extraction
  • High compute

Neural Networks

Basic structure:
Input → Hidden layers → Output
Each layer:
  • Weighted sum
  • Activation function
  • Backpropagation during training
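The per-layer computation above (weighted sum, then activation) is a one-liner in practice. A forward pass for a single neuron, with arbitrary illustration weights:

```python
import math

# One dense-layer neuron: weighted sum of inputs plus bias,
# passed through a sigmoid activation.
def dense_forward(inputs, weights, bias):
    z = sum(i * w for i, w in zip(inputs, weights)) + bias  # weighted sum
    return 1.0 / (1.0 + math.exp(-z))                       # sigmoid activation

out = dense_forward([0.5, -1.0], [0.8, 0.2], bias=0.1)
print(round(out, 3))  # → 0.574
```

Backpropagation then adjusts `weights` and `bias` by the gradient of the loss with respect to each; that step is omitted here.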

Transformers (Modern AI Backbone)

Most modern LLMs use:
Transformer architecture
Key innovations:
  • Self-attention mechanism
  • Parallel processing
  • Contextual token relationships
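Self-attention reduces to a small computation per query: dot-product scores against every key, a softmax, then a weighted sum of values. A toy example with made-up 2-D vectors:

```python
import math

# Scaled dot-product attention for one query over three keys.
def attention(q, keys, values):
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]    # softmax over key scores
    # Weighted sum of values: keys similar to the query dominate.
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]

out = attention(q=[1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                values=[[1.0], [2.0], [3.0]])
print(round(out[0], 2))  # → 2.0
```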

4️⃣ Large Language Models (LLMs)

What is an LLM?

A large neural network trained on massive text datasets to:
  • Predict next token
  • Generate human-like text
  • Perform reasoning-like tasks
Important:
LLMs are probabilistic token predictors, not reasoning engines.

Key Properties

  • Context window (token limit)
  • Temperature (randomness control)
  • Top-K / Top-P sampling
  • Non-deterministic outputs
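Temperature and top-K can be sketched directly over toy token logits (the tokens and scores below are invented). Lower temperature sharpens the distribution; top-K cuts the tail entirely.

```python
import math
import random

# Temperature + top-K sampling over a toy logit table.
def sample(logits, temperature=1.0, top_k=2):
    # Keep only the top-K tokens, scale by temperature, softmax, draw.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    exps = {tok: math.exp(score / temperature) for tok, score in top}
    total = sum(exps.values())
    r, acc = random.random(), 0.0
    for tok, e in exps.items():
        acc += e / total
        if r <= acc:
            return tok
    return tok                       # guard against float rounding

random.seed(1)
logits = {"the": 2.0, "a": 1.0, "banana": -3.0}
print(sample(logits, temperature=0.7))  # "banana" can never appear (top_k=2)
```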

Why LLMs Hallucinate

LLMs:
  • Always try to produce an answer
  • Fill gaps probabilistically
  • May fabricate information if context is weak
Hallucination = confident but incorrect generation.

5️⃣ Embeddings

What Are Embeddings?

Embeddings convert text into numerical vectors that capture semantic meaning.
Example:
"doctor" and "physician"
→ Similar vector representations.
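Similarity between embeddings is usually measured with cosine similarity. The three vectors below are invented illustration values, not the output of a real embedding model:

```python
import math

# Cosine similarity: 1.0 for identical directions, ~0 for unrelated.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

doctor    = [0.9, 0.1, 0.3]   # made-up embeddings
physician = [0.8, 0.2, 0.4]
banana    = [0.1, 0.9, 0.2]

print(cosine(doctor, physician) > cosine(doctor, banana))  # → True
```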

Why Embeddings Matter

They enable:
  • Semantic search
  • Vector databases
  • RAG systems
  • Similarity comparison

Embedding Models vs Chat Models

Embedding model:
  • Converts text → vector
  • Used for retrieval
Chat/completion model:
  • Generates text
  • Used for response creation

6️⃣ Retrieval-Augmented Generation (RAG)

What Problem Does RAG Solve?

LLMs:
  • Have static training data
  • Cannot access private enterprise documents
  • May hallucinate
RAG solves:
Grounding LLMs with external knowledge during inference.

Classic RAG Flow

  1. Data ingestion
  2. Chunking
  3. Embedding generation
  4. Vector storage
  5. User query embedding
  6. Top-K retrieval
  7. Prompt augmentation
  8. LLM generation
  9. Post-processing
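The retrieval and prompt-augmentation steps of that flow can be sketched end to end. Word overlap stands in for real embedding similarity, and the documents are invented:

```python
# Toy RAG retrieval: score documents against the query, take the
# best match, and splice it into the prompt as grounding context.
docs = [
    "Refunds are processed within 5 business days.",
    "Shipping is free for orders over $50.",
]

def score(query, doc):
    # Stand-in for vector similarity: count of shared lowercase words.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, k=1):
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

query = "How long do refunds take?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would now be sent to the chat/completion model.
print("Refunds" in context)  # → True
```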

Why Chunking Matters

Too small:
  • Fragmented context
  • Weak retrieval
Too large:
  • Noise
  • Context overflow
Balanced chunking:
  • ~800–1200 tokens with overlap
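Overlapping chunking is mechanically simple. The sketch below uses a whitespace "tokenizer" and toy sizes so the overlap is visible; a real pipeline would count tokens with the model's tokenizer and use sizes in the ~800–1200 range.

```python
# Sliding-window chunking: consecutive chunks share `overlap` tokens
# so sentences straddling a boundary stay retrievable.
def chunk(tokens, size=5, overlap=2):
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = "a b c d e f g h i".split()
chunks = chunk(tokens)
for c in chunks:
    print(c)
```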

Hybrid Search

Pure vector search:
  • Semantic similarity only
Hybrid search:
  • Keyword (lexical) + vector (semantic) matching
Improves:
  • Recall
  • Precision
  • Robustness
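One common way to merge the keyword and vector result lists is Reciprocal Rank Fusion (RRF). The document IDs and rankings below are made up; k = 60 is a conventional smoothing constant.

```python
# Reciprocal Rank Fusion: a document's fused score is the sum of
# 1/(k + rank) across every ranking it appears in.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc2", "doc1", "doc3"]
vector_hits  = ["doc1", "doc3", "doc2"]
print(rrf([keyword_hits, vector_hits])[0])  # → doc1 (strong in both lists)
```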

Common RAG Failure Modes

  1. Poor chunking
  2. Weak retrieval recall
  3. Misconfigured hybrid search
  4. Prompt grounding errors
  5. Context window overflow
  6. Stale index
Most common failure:
Retrieval quality degradation.

7️⃣ Fine-Tuning vs RAG

Fine-Tuning

  • Retrains model on domain data
  • Static knowledge
  • Higher cost
  • Requires training pipeline
Best for:
  • Style adaptation
  • Structured task specialization

RAG

  • No retraining required
  • Dynamic knowledge
  • Lower cost
  • Uses external data
Best for:
  • Enterprise knowledge
  • Frequently updated content

Key Distinction

Fine-tuning changes the model.
RAG augments the model.

8️⃣ Prompt Engineering

What Is Prompt Engineering?

Designing input instructions to guide LLM behavior.
Includes:
  • System role definition
  • Output format constraints
  • Few-shot examples
  • Guardrails

Chain of Thought (CoT)

Technique where model:
  • Generates reasoning steps before final answer.
Improves reasoning quality.

Structured Output

Using:
  • JSON schema
  • Function calling
  • Tool invocation
Improves determinism.
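Structured output only helps if you validate what comes back. A minimal sketch, where the reply string stands in for a real LLM response and the schema is a hypothetical sentiment task:

```python
import json

# Parse a model reply and reject anything outside the agreed schema,
# so downstream code never sees malformed output.
def parse_reply(text):
    data = json.loads(text)
    if data.get("sentiment") not in {"positive", "negative", "neutral"}:
        raise ValueError("unexpected sentiment value")
    if not (0.0 <= data.get("confidence", -1.0) <= 1.0):
        raise ValueError("confidence out of range")
    return data

reply = '{"sentiment": "positive", "confidence": 0.92}'
print(parse_reply(reply)["sentiment"])  # → positive
```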

9️⃣ Agentic AI

What Is Agentic AI?

LLM system that:
  • Reasons
  • Decides
  • Calls tools
  • Executes multi-step workflows
Difference from RAG:
RAG = Retrieve + generate
Agent = Plan + act + observe + refine
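The plan → act → observe → refine loop can be sketched with one hypothetical tool. The "planner" here is a hard-coded stub; in a real agent the LLM decides which tool to call and with what arguments.

```python
# Toy agent loop: plan an action, execute a tool, observe, answer.
def calculator_tool(expression):
    # Toy tool for trusted input only; never eval untrusted text.
    return eval(expression, {"__builtins__": {}})

def agent(goal):
    observations = []
    action = ("calculator", goal)                    # plan (stubbed)
    observations.append(calculator_tool(action[1]))  # act + observe
    return observations[-1]                          # refine into an answer

print(agent("2 + 3 * 4"))  # → 14
```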

When to Use Agentic Pattern

Use when:
  • Multi-step reasoning required
  • Workflow is non-deterministic
  • Tool orchestration needed
Do NOT use for simple FAQ bots.

🔟 Responsible AI

Key Principles

  • Transparency
  • Fairness
  • Privacy
  • Security
  • Accountability

Common Risks

  • Prompt injection
  • Data leakage
  • Hallucination
  • Bias
  • Unauthorized data access

Enterprise Best Practices

  • Retrieval-level RBAC
  • Strict system prompts
  • Content filtering
  • Human-in-the-loop for write actions
  • Audit logging
  • Evaluation pipelines

1️⃣1️⃣ Evaluation of RAG Systems

Measure separately:

Retrieval Quality

  • Recall@K
  • Precision
  • MRR
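Recall@K and MRR are simple to compute once you have per-query relevance judgments. A toy computation over made-up query results:

```python
# Recall@K: fraction of relevant documents found in the top K.
def recall_at_k(retrieved, relevant, k):
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# MRR: mean of 1/rank of the first relevant hit across queries.
def mrr(results):
    # results: list of (retrieved_list, relevant_set) pairs, one per query
    total = 0.0
    for retrieved, relevant in results:
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(results)

queries = [(["d3", "d1", "d7"], {"d1"}), (["d2", "d9"], {"d9"})]
print(recall_at_k(["d3", "d1", "d7"], {"d1"}, k=2))  # → 1.0
print(round(mrr(queries), 2))  # → 0.5
```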

Generation Quality

  • Faithfulness
  • Groundedness
  • Hallucination rate
  • Completeness
Always evaluate retrieval and generation independently.

1️⃣2️⃣ Cost, Performance, Observability

Performance

  • Separate ingestion & query pipelines
  • Use hybrid search
  • Control Top-K
  • Monitor latency

Cost Optimization

  • Reduce MaxOutputTokens
  • Cache embeddings
  • Batch ingestion
  • Use smaller models where possible

Observability

Track:
  • Token usage
  • Latency (retrieval vs generation)
  • Error rates
  • Search scores
  • Model deployment name
  • Correlation IDs
Use:
  • Application Insights
  • Structured logging
  • Distributed tracing
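A structured-logging sketch covering the fields listed above: one JSON object per stage, tied together by a correlation ID. The field names are illustrative, not a specific platform's schema.

```python
import json
import time
import uuid

# Emit one machine-parseable JSON log line per pipeline stage.
def log_request(stage, correlation_id, latency_ms, tokens):
    record = {
        "ts": time.time(),
        "stage": stage,                    # "retrieval" or "generation"
        "correlation_id": correlation_id,  # ties both stages together
        "latency_ms": latency_ms,
        "tokens": tokens,
    }
    print(json.dumps(record))
    return record

cid = str(uuid.uuid4())
r1 = log_request("retrieval", cid, 42, 0)
r2 = log_request("generation", cid, 310, 180)
```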

🔥 Core Mental Models to Remember

LLMs are probabilistic generators, not databases.
Security must be enforced before generation.
Retrieval quality determines answer quality.
Ingestion and query pipelines must be separated.
Observability is mandatory in enterprise AI.
Copyright 2026 Vivek Kaushik