Week 7–8 · AI & Frontend

AI & LLM Integration

This is what makes you stand out. Integrating AI into your Java application turns a standard CRUD app into an intelligent product. These skills are the highest-demand in the market.

🤖 OpenAI / Azure OpenAI 🦜 LangChain4j 📚 RAG
🧠
Concept
What is an LLM?
A Large Language Model (LLM) is like a very well-read person who has read almost every book, article, and webpage on the internet. You can ask them questions and they give thoughtful, human-like answers. They're not "thinking" the way humans do — they're predicting the most likely next word, over and over, extremely well.

Popular LLMs: GPT-4o (OpenAI), Claude (Anthropic), Gemini (Google), Llama (Meta). As developers, we call these models via APIs.
Term | Meaning
LLM | Large Language Model — the AI model (GPT-4o, Claude, etc.)
Prompt | The text you send to the model
Completion | The model's response
Token | ~4 characters or ~3/4 of a word. Pricing is per token.
Context window | Max tokens the model can "see" at once (GPT-4o: 128k tokens)
Temperature | Randomness: 0 = deterministic/factual, 1 = creative/varied
Embedding | A numeric vector representation of text (used for semantic search)
🔌
Step 1
Calling OpenAI API Directly
Understanding the raw API before using LangChain4j
pom.xml β€” add dependency
<dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>0.8.0</version>
</dependency>
application.properties
# Never hardcode the key! Load it from an environment variable.
# (In .properties files a # must start the line; an inline # becomes part of the value.)
openai.api.key=${OPENAI_API_KEY}
Java β€” Simple chat request
// Note: in the official com.openai:openai-java SDK the client is built via
// OpenAIOkHttpClient, which returns an OpenAIClient.
@Service
public class AiService {

    @Value("${openai.api.key}")
    private String apiKey;

    public String chat(String userMessage) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(apiKey)
                .build();

        ChatCompletion response = client.chat().completions().create(
                ChatCompletionCreateParams.builder()
                        .model("gpt-4o-mini")
                        .addUserMessage(userMessage)
                        .temperature(0.7)
                        .maxTokens(500)
                        .build());

        return response.choices().get(0).message().content().orElse("");
    }
}

// Usage in a controller
@GetMapping("/ask")
public String ask(@RequestParam String question) {
    return aiService.chat(question);
}
API key security: set OPENAI_API_KEY as an environment variable. NEVER commit it to Git, and keep any local secrets file (e.g. a .env file) in .gitignore. In production, use Azure Key Vault or App Service environment variables.
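To make the "never hardcode" rule concrete, here is a minimal sketch of reading a secret from the environment with a clear failure mode. The helper names (`requireEnv`, `envOrDefault`) are our own, not a Spring or OpenAI API:

```java
// Minimal sketch: fail fast when a required secret is missing.
public class EnvConfig {

    // Returns the environment variable's value, or throws with a clear message.
    public static String requireEnv(String name) {
        String value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(
                    "Missing required environment variable: " + name);
        }
        return value;
    }

    // Convenience variant with a fallback, useful for local development only.
    public static String envOrDefault(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.isBlank()) ? fallback : value;
    }
}
```

Failing fast at startup beats a confusing 401 from the OpenAI API at the first request.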
🦜
Step 2 — Recommended
LangChain4j with Spring Boot
The best way to build AI apps in Java
LangChain4j is like Spring Boot for AI. Just as Spring Boot auto-configures Tomcat, database connections, and JSON serialization, LangChain4j auto-configures LLM clients, memory, tools, and vector stores. Much less boilerplate.
pom.xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>0.32.0</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>0.32.0</version>
</dependency>
application.properties
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini
langchain4j.open-ai.chat-model.temperature=0.7
Create an AI Service interface β€” Spring magic!
// LangChain4j generates the implementation automatically
@AiService
public interface StudentAssistant {

    @SystemMessage("You are a helpful assistant for CampusToAI Academy students. " +
            "Help them with Java, Spring Boot, and cloud topics. " +
            "Be concise and give code examples when helpful.")
    String chat(@UserMessage String message);
}

// Inject and use it like any Spring bean!
@RestController
public class ChatController {

    private final StudentAssistant assistant;

    public ChatController(StudentAssistant assistant) {
        this.assistant = assistant;
    }

    @PostMapping("/api/chat")
    public String chat(@RequestBody String message) {
        return assistant.chat(message);
    }
}
This is it — that's your AI chatbot endpoint! Send a POST to /api/chat with a question and get an AI answer. LangChain4j handles the API call, serialization, and response parsing.
✍️
Skill
Prompt Engineering
How to write effective prompts

Prompt engineering is the skill of crafting inputs that get the best output from an LLM. The quality of your prompt directly determines the quality of the response.

Key Techniques

Technique | Example
Be specific | "Explain Java generics" → "Explain Java generics to a student who knows basic OOP but has never seen <T>, with one code example"
Assign a role | "You are a senior Java developer reviewing code for a fresher"
Provide context | "Given this Spring Boot controller: [code], what is wrong with it?"
Chain of thought | "Think step by step before answering"
Few-shot | Provide 1–2 examples of input→output before asking your question
System vs User prompt in code
@AiService
public interface CodeReviewer {

    @SystemMessage(
            "You are an expert Java code reviewer. " +
            "Review the provided code and identify: bugs, security issues, " +
            "performance problems, and style violations. " +
            "Format your response as: [Issue Type]: description. Fix: suggestion.")
    String reviewCode(@UserMessage String code);
}
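The few-shot technique from the table above is, at its core, just string assembly: examples first, then the real question. A minimal sketch in plain Java (an illustrative helper, not a LangChain4j API — with @AiService you would put the examples into the @SystemMessage instead):

```java
import java.util.List;

// Illustrative few-shot prompt assembly: worked examples first, then the real input.
public class FewShotPrompt {

    public record Example(String input, String output) {}

    public static String build(String instruction, List<Example> examples, String question) {
        StringBuilder sb = new StringBuilder(instruction).append("\n\n");
        for (Example ex : examples) {
            sb.append("Input: ").append(ex.input()).append("\n")
              .append("Output: ").append(ex.output()).append("\n\n");
        }
        // The model completes the pattern after the final "Output:" marker.
        sb.append("Input: ").append(question).append("\nOutput:");
        return sb.toString();
    }
}
```

Because the model continues the pattern it sees, two or three well-chosen examples often control the output format better than a paragraph of instructions.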
📚
Advanced Pattern
RAG — Retrieval Augmented Generation
Problem: LLMs don't know about YOUR data (your course materials, your company docs, your database). Their knowledge has a cutoff date.

RAG solution: Before asking the AI a question, search your own database for relevant documents and include them in the prompt. "Here is some context from our docs: [relevant text]. Now answer this question: [user question]"

RAG Flow

User asks: "What are the payment options for CampusToAI?"
1. Convert the question to an embedding (vector)
2. Search the vector database for similar content → finds: course-brochure.md, payment-terms.md
3. Build the augmented prompt: "Context: [content from found documents] Question: What are the payment options for CampusToAI?"
4. Send to the LLM → accurate, grounded answer
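Step 3 of the flow — stitching the retrieved snippets into the prompt — is plain string building. A minimal sketch (a hypothetical helper of our own; in LangChain4j the retrieval augmentor does this for you):

```java
import java.util.List;

// Builds the augmented prompt from retrieved document snippets plus the user question.
public class RagPrompt {

    public static String augment(List<String> contextSnippets, String question) {
        StringBuilder sb = new StringBuilder("Context:\n");
        for (String snippet : contextSnippets) {
            sb.append("- ").append(snippet).append("\n");
        }
        // Instructing the model to stay within the context reduces hallucinations.
        sb.append("\nAnswer using ONLY the context above.\n")
          .append("Question: ").append(question);
        return sb.toString();
    }
}
```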
LangChain4j RAG with document ingestion
@Bean
public EmbeddingStoreIngestor ingestor(EmbeddingModel model,
                                       EmbeddingStore<TextSegment> store) {
    return EmbeddingStoreIngestor.builder()
            .embeddingModel(model)
            .embeddingStore(store)
            .build();
}

// Ingest your documents (once, at startup or via an API)
public void ingestDocuments() {
    Document doc = FileSystemDocumentLoader.loadDocument("course-faq.txt");
    ingestor.ingest(doc);
}

// AI Service with RAG — a content retriever auto-injects context
@AiService
public interface CourseAssistant {
    @SystemMessage("You are a CampusToAI Academy assistant. Use only the provided context to answer.")
    String answer(@UserMessage String question);
}
🎯
Interview Prep
Common Interview Questions
Q: What is an LLM and how is it different from traditional programming?

An LLM (Large Language Model) is a deep learning model trained on massive text datasets. It generates responses by predicting the next token based on patterns learned during training.

Traditional programming: explicit rules and logic coded by the developer → deterministic output.

LLM: trained on examples → statistical pattern matching → probabilistic output. They're better at open-ended tasks (summarisation, Q&A, code generation) where explicit rules are impractical.

Q: What is a token in LLM context?

A token is the basic unit of text that an LLM processes. Approximately 1 token ≈ 4 characters ≈ 3/4 of a word in English.

"Hello world" β‰ˆ 2 tokens. A 500-word essay β‰ˆ 375 tokens.

Why it matters: API pricing is per token (input + output). Context window limits are in tokens. GPT-4o has a 128,000-token context window — roughly a 96,000-word book.
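The ≈4-characters-per-token rule of thumb above can be turned into a quick cost estimator. This is a rough heuristic only — real tokenizers (e.g. OpenAI's tiktoken) split text differently, and the per-token price below is a made-up placeholder:

```java
// Rough token and cost estimate using the ~4 characters per token heuristic.
public class TokenEstimator {

    public static int estimateTokens(String text) {
        // Round up: even a single character costs at least one token.
        return (text.length() + 3) / 4;
    }

    // Estimated cost in USD for a given per-1M-token price (hypothetical rate).
    public static double estimateCostUsd(String text, double pricePerMillionTokens) {
        return estimateTokens(text) * pricePerMillionTokens / 1_000_000.0;
    }
}
```

Useful for sanity checks before sending large documents to the API; for billing-accurate counts, use the provider's tokenizer.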

Q: What is RAG (Retrieval Augmented Generation)?

RAG is a technique that improves LLM responses by injecting relevant external knowledge into the prompt before querying the model.

Why: LLMs have a training cutoff and don't know your proprietary data. RAG lets you: ask questions about your own documents, get up-to-date information, reduce hallucinations (making up answers), and keep sensitive data in your own database rather than training the model on it.

Flow: user question → embed question → search vector DB → retrieve relevant docs → augment prompt with docs → LLM gives grounded answer.

Q: What is an embedding?

An embedding is a numeric vector (list of floating-point numbers) that represents the semantic meaning of text. Texts with similar meanings produce similar vectors.

Example: "Hello" and "Hi" have similar embeddings. "Hello" and "Database" have very different embeddings.

Used in RAG: convert documents to embeddings and store in a vector database. When a user asks a question, convert the question to an embedding and find the most similar document embeddings (semantic search).
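"Most similar embeddings" is usually measured with cosine similarity. A minimal sketch over tiny hand-made vectors (real embeddings have hundreds or thousands of dimensions, but the math is the same):

```java
// Cosine similarity: 1.0 = same direction (similar meaning), 0.0 = unrelated.
public class Cosine {

    public static double similarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Normalizing by the vector lengths makes the score independent of magnitude.
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A vector database does exactly this comparison, just at scale and with indexing so it doesn't have to scan every stored vector.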

Q: What is prompt engineering?

Prompt engineering is the practice of crafting effective inputs (prompts) to get desired outputs from LLMs. Key techniques:

  • System message — set the AI's role and behaviour
  • Few-shot prompting — provide examples of input→output format
  • Chain of thought — ask the model to "think step by step"
  • Specificity — vague prompts get vague answers
  • Output format — ask for JSON, markdown, bullet points