Understanding AI Language Processing After Transformers: Learning AI’s Working Principles Through 6 Practical Exercises

📌 Lecture Overview

Title: A lecture that changes your understanding of GPT and Gemini
Instructor: Yang Sil-jang of Vibecoding University
Date: May 22, 2026
Views: 17,675
URL: https://youtu.be/Z_zR-WanGuQ?si=Fd3TCay4WgTnoNMK

1. Tokenizing and Language Processing

graph LR
    A[Original Text] --> B[Tokenizer]
    B --> C[Token Sequence]
    C --> D[Embedding Conversion]
    D --> E[Attention Processing]

Tokenization Phenomenon:
- “Hello” (as a Korean greeting) → split into 8 tokens
- English “How are you?” → 6 tokens
- Korean text requires 2~3 times more tokens than equivalent English text
- As a result, less Korean content fits in the same context window

Tokenization Characteristics:
- Text is split into subword units rather than whole words
- The resulting token sequence is what the model actually processes
- Token count directly affects cost and how much text can be processed
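
One way to see why Korean tends to cost more tokens: byte-level BPE tokenizers (the kind used by GPT-2 and similar models) start from UTF-8 bytes and then merge frequent byte sequences. The sketch below is not a real tokenizer. It only counts UTF-8 bytes, which is the token count such a tokenizer starts from before any merges, and estimates how much content fits in a context window. The sample Korean greeting, the 8,192-token window, and the function names are illustrative assumptions, not values from the lecture.

```python
def utf8_byte_count(text: str) -> int:
    """Number of UTF-8 bytes: the starting token count of a byte-level BPE before merging."""
    return len(text.encode("utf-8"))


def content_capacity(context_tokens: int, token_ratio: float) -> int:
    """Rough amount of English-equivalent content that fits when the same
    content costs `token_ratio` times more tokens."""
    if token_ratio <= 0:
        raise ValueError("token_ratio must be positive")
    return int(context_tokens / token_ratio)


if __name__ == "__main__":
    # Hangul syllables take 3 bytes each in UTF-8; ASCII letters take 1.
    for sample in ["Hello", "안녕하세요"]:
        print(sample, "->", len(sample), "chars,", utf8_byte_count(sample), "bytes")

    # The notes cite a 2~3x token overhead for Korean; 8192 is a hypothetical window size.
    for ratio in (1.0, 2.0, 3.0):
        print(f"ratio {ratio}: ~{content_capacity(8192, ratio)} English-equivalent tokens")
```

Actual token counts depend on the merges a specific tokenizer has learned, so the byte count is only an upper bound for this family of tokenizers.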

2. Embedding and Meaning Representation

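# Embedding similarity measurement example
# Assumption: `model` is an embedding model whose encode(text) returns a 1-D vector
# (e.g. a sentence-transformers model); the lecture notes do not name it.
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def calculate_similarity(word1, word2):
    embedding1 = model.encode(word1)
    embedding2 = model.encode(word2)
    return cosine_similarity(embedding1, embedding2)

# Example results (scores reported in the lecture; they depend on the model)
print(calculate_similarity("apple", "pear"))  # 0.34
print(calculate_similarity("apple", "car"))  # 0.26
print(calculate_similarity("king", "queen"))  # 0.60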

Embedding Characteristics:
- Each word is represented as a vector of numbers that encodes its semantic position
- Semantically close words lie close to each other in vector space
- Embedding results vary depending on the model version
- The embedding of the same word varies depending on its context
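
The idea that semantically close words sit close together in vector space can be illustrated with a nearest-neighbor ranking. This is a minimal sketch using hand-made 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and `TOY_EMBEDDINGS` and `most_similar` are hypothetical names introduced only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors for illustration only; not produced by any real model.
TOY_EMBEDDINGS = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}


def most_similar(word, embeddings):
    """Rank every other word by cosine similarity to `word`, highest first."""
    query = embeddings[word]
    scores = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("apple", TOY_EMBEDDINGS):
        print(f"apple vs {other}: {score:.2f}")
```

With these toy vectors, "pear" ranks above "car" for the query "apple", mirroring the ordering of the scores in the exercise.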

3. Attention Mechanism

graph TD
    A[Query] --> B[Attention Calculation]
    C[Key] --> B
    D[Value] --> B
    B --> E[Final Output]

Attention Role:
- Analyzes the relationships between the words in a sentence
- Derives Q (Query), K (Key), and V (Value) vectors, each playing a distinct role
- Applies an importance weight to each word based on how well its Key matches the Query
- The model examined has a complex structure with 12 transformer blocks and 12 attention heads
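
The Q/K/V weighting described above can be sketched as scaled dot-product attention. This is a simplified single-head version in plain Python, assuming the Query, Key, and Value vectors are already given. It leaves out the learned projection matrices, causal masking, and the multi-head, multi-block stacking that a real transformer uses, and the toy vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, weight every value by softmax(q . k / sqrt(d_k))."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # Three toy tokens with 2-dimensional vectors (illustrative values only).
    q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    outputs, weights = scaled_dot_product_attention(q, k, v)
    for row in weights:
        print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence; the rows always sum to 1.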

4. Generation Strategies and Output Control

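# Generation strategy comparison
# Assumption: `model` is a text-generation client exposing generate(); the notes do not name it.
def generate_text(prompt, temperature=0.7, top_p=0.9):
    response = model.generate(
        prompt,
        temperature=temperature,  # lower = more predictable, higher = more varied
        top_p=top_p,              # cumulative-probability cutoff for candidate tokens
        max_tokens=12             # cap on the number of generated tokens
    )
    return response

# Result comparison
print(generate_text("Today's lunch", temperature=0.1))  # Nearly deterministic output
print(generate_text("Today's lunch", temperature=1.8))  # More varied, less predictable output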

Generation Strategies:
- Temperature: Controls output diversity (the exercise compares values from 0.1 to 1.8)
- Top-P: Samples only from the most probable candidates whose cumulative probability reaches P
- Together, these settings determine the balance between diversity and consistency in the output
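
How temperature and Top-P act on the next-token distribution can be sketched as follows. This is a minimal illustration in plain Python, assuming the model has already produced one raw score (logit) per candidate token. The logits and function names are hypothetical, and production libraries may differ in details such as tie-breaking.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalize into probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative probability
    reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Draw one token index after temperature scaling and Top-P filtering."""
    probs = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for four candidate tokens
    for t in (0.1, 0.7, 1.8):
        print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At temperature 0.1 almost all probability lands on the top candidate, while at 1.8 the distribution flattens, which is why the two settings in the exercise behave so differently.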

5. Model Limitations and Practical Applications

# Hallucination example
response = model.generate(
    "Summarize the main content and author of the paper 'Hierarchical Contextual Embedding Model for Korean Sentiment Analysis' published in the 2019 Korean AI Conference Journal",
    temperature=1.2
)
print(response)
# Output: "The paper focuses on..."

# Knowledge cutoff example
response = model.generate("Tell me today's date and the current dollar exchange rate")
print(response)
# Output: "My knowledge cutoff is October 3, 2023..."

Model Limitations:
- Hallucination: Generating plausible-sounding information that does not exist
- Knowledge cutoff: No access to information from after the model was trained
- Context window constraints: A limit on the number of tokens that can be processed at once

6. API-Based Conversational Systems

# API call example
import requests

API_URL = "https://api.example.com/v1/chat"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

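def chat_with_model(messages):
    # POST the conversation so far and return the parsed JSON reply
    response = requests.post(
        API_URL,
        headers=headers,
        json={"messages": messages},
        timeout=30  # avoid hanging indefinitely on a stalled connection
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()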

# System prompt example
messages = [
    {"role": "system", "content": "You are a kind science teacher."},
    {"role": "user", "content": "What is a black hole?"},
    {"role": "assistant", "content": "A black hole is a region in space..."},
    {"role": "user", "content": "Please give me an example"}
]

response = chat_with_model(messages)
print(response)

API System Characteristics:
- Messages are distinguished by system, user, and assistant roles
- The conversation history is accumulated and sent along with each request
- Because the history keeps growing, context window management becomes important
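
One simple way to manage the context window is to drop the oldest turns once the history exceeds a token budget. The sketch below assumes that strategy; the lecture does not prescribe one. `estimate_tokens` is a crude word-count stand-in for a real tokenizer, and `trim_history` is a hypothetical helper name.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(messages, max_tokens, count_tokens=estimate_tokens):
    """Keep all system messages plus the most recent turns that fit in max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))  # restore chronological order


if __name__ == "__main__":
    history = [
        {"role": "system", "content": "You are a kind science teacher."},
        {"role": "user", "content": "What is a black hole?"},
        {"role": "assistant", "content": "A black hole is a region in space..."},
        {"role": "user", "content": "Please give me an example"},
    ]
    for message in trim_history(history, max_tokens=20):
        print(message["role"], ":", message["content"])
```

Passing a real tokenizer's counting function as `count_tokens` would make the budget match what the API actually charges and enforces.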

🎯 Conclusion

This lecture systematically explains how AI language processing works in the transformer era through 6 practical exercises. It helps build a visual understanding of the entire process, from tokenization to API-based systems. It also explains the model's limitations and the considerations that matter when applying these models in practice.

📚 References

Vibecoding University website: https://bit.ly/4sh1g2d
Next lecture: RAG·Tool Use·Agent