Java-Powered Intelligence: Exploring AI Integration in Java Applications

Patroclos Lemoniatis
6 min read · May 28, 2024


Image by GrumpyBeere from Pixabay

Spring Boot

Spring Boot is an open-source Java framework used for programming standalone, production-grade Spring-based applications with minimal effort. Spring Boot is a convention-over-configuration extension for the Spring Java platform intended to help minimize configuration concerns while creating Spring-based applications.¹

Artificial Intelligence³

AI, short for Artificial Intelligence, refers to the simulation of human intelligence processes by machines, particularly computer systems. These processes include learning, reasoning, and self-correction.

AI encompasses a wide range of techniques and approaches, including:

1. Machine Learning: A subset of AI that enables machines to learn from data without being explicitly programmed. Machine learning algorithms iteratively learn from data, identify patterns, and make decisions or predictions.

2. Deep Learning: A type of machine learning that uses artificial neural networks with many layers (deep neural networks) to model and process complex patterns in large amounts of data. Deep learning has been particularly successful in areas such as image and speech recognition.

3. Natural Language Processing (NLP): The ability of computers to understand, interpret, and generate human language. NLP enables applications such as virtual assistants, language translation, and sentiment analysis.

4. Computer Vision: The field of AI focused on enabling computers to interpret and understand visual information from the real world, such as images and videos. Computer vision is used in applications ranging from facial recognition to autonomous vehicles.

5. Robotics: The integration of AI techniques into robotic systems to enable them to perceive their environment, make decisions, and perform tasks autonomously. Robotics has applications in manufacturing, healthcare, agriculture, and many other fields.
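To make the "learning from data" idea in item 1 concrete, here is a minimal sketch in plain Java that fits a straight line to a handful of points by gradient descent. The class name, the toy data, the learning rate, and the epoch count are all illustrative assumptions, not part of any library discussed in this article.

```java
// Illustrative sketch: learn w and b in y = w * x + b from example points.
class LinearRegressionDemo {

    /** Fits y = w * x + b by batch gradient descent on mean squared error; returns {w, b}. */
    static double[] fit(double[] xs, double[] ys, double learningRate, int epochs) {
        double w = 0.0;
        double b = 0.0;
        int n = xs.length;
        for (int epoch = 0; epoch < epochs; epoch++) {
            double gradW = 0.0;
            double gradB = 0.0;
            for (int i = 0; i < n; i++) {
                double error = (w * xs[i] + b) - ys[i]; // prediction minus target
                gradW += 2.0 * error * xs[i] / n;
                gradB += 2.0 * error / n;
            }
            // Step against the gradient to reduce the error.
            w -= learningRate * gradW;
            b -= learningRate * gradB;
        }
        return new double[] {w, b};
    }

    public static void main(String[] args) {
        double[] xs = {0, 1, 2, 3, 4};
        double[] ys = {1, 3, 5, 7, 9}; // toy data generated from y = 2x + 1
        double[] model = fit(xs, ys, 0.05, 2000);
        System.out.printf("w = %.3f, b = %.3f%n", model[0], model[1]);
    }
}
```

Nothing in the code states the rule y = 2x + 1; the parameters are recovered from the examples alone, which is the sense in which the program was not "explicitly programmed".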

AI technologies have a wide range of applications across various industries, including:

  • healthcare
  • finance
  • transportation
  • education
  • entertainment

Some common applications of AI include:³

  • Personalized recommendations in e-commerce and content streaming platforms.
  • Fraud detection and risk assessment in banking and finance.
  • Autonomous vehicles and drones for transportation and delivery.
  • Medical image analysis for disease diagnosis and treatment planning.
  • Chatbots and virtual assistants for customer service and support.

How do machines translate real world information into data they understand?

Vector embeddings³

Vector embeddings, often simply called embeddings (or word embeddings when they represent words), are numerical representations of words or phrases in a continuous vector space. These representations are designed to capture semantic and syntactic similarities between words: texts with related meanings map to vectors that lie close together. This makes them useful for many natural language processing (NLP) tasks, and gives machine learning models a practical way to represent, compare, and generate natural language text.
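A common way to measure how close two embeddings are is cosine similarity. The sketch below is a minimal, self-contained illustration. The three-dimensional vectors are hand-made for the example (real models produce hundreds of dimensions), and the class and method names are assumptions.

```java
// Toy illustration only: the vectors below are invented, not produced by a real model.
class EmbeddingSimilarityDemo {

    /** Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean the vectors point the same way. */
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("a zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] cat = {0.9, 0.8, 0.1};
        double[] kitten = {0.85, 0.75, 0.2};
        double[] car = {0.1, 0.2, 0.95};
        System.out.println("cat vs kitten: " + cosineSimilarity(cat, kitten));
        System.out.println("cat vs car:    " + cosineSimilarity(cat, car));
    }
}
```

With these made-up vectors, "cat" scores much closer to "kitten" than to "car", which is the behaviour an embedding model is trained to produce for real text.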

Vector databases³

Vector databases, also known as vectorized databases or vector stores, are specialized databases designed to efficiently store, index, and query high-dimensional vector data. Unlike traditional relational databases, which primarily deal with structured data such as numbers and strings, vector databases are optimized for handling vectors: numerical representations of objects or entities in a multi-dimensional space.
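The core operation of a vector store is "find the stored vectors nearest to this query vector". The sketch below shows that operation in its simplest form: an in-memory list scanned by brute force. It is an assumption-laden toy (hypothetical class and method names, squared Euclidean distance, no index). Real vector databases add approximate indexes so the search stays fast over millions of vectors.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical toy store: a linear scan, not how a production vector database is implemented.
class InMemoryVectorStoreDemo {

    private static final class Entry {
        final String id;
        final double[] vector;

        Entry(String id, double[] vector) {
            this.id = id;
            this.vector = vector;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String id, double[] vector) {
        entries.add(new Entry(id, vector.clone())); // defensive copy
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the ids of the k stored vectors closest to the query, nearest first. */
    List<String> nearest(double[] query, int k) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> squaredDistance(e.vector, query)))
                .limit(k)
                .map(e -> e.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        InMemoryVectorStoreDemo store = new InMemoryVectorStoreDemo();
        store.add("a", new double[] {0.0, 0.0});
        store.add("b", new double[] {1.0, 1.0});
        store.add("c", new double[] {5.0, 5.0});
        System.out.println(store.nearest(new double[] {0.9, 0.9}, 2));
    }
}
```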

Why are vector databases important?

One of the primary values a database brings to application development is the ability to organize and categorize data efficiently for use by those applications. Vector databases are at the foundation of building generative AI applications because they enable vector search capabilities. When machine learning was in its infancy, the data used by LLMs was typically small and finite, but as generative AI has become mainstream, the amount of data used to train and augment learning has grown exponentially. This is why vector databases are so important: they simplify fundamental operations for generative AI apps by storing large volumes of data in the structure that generative AI applications need for optimized operations.²

“[…] once you understand this ML multitool (embedding), you’ll be able to build everything from search engines to recommendation systems to chatbots and a whole lot more. You don’t have to be a data scientist with ML expertise to use them, nor do you need a huge labeled dataset.” — Dale Markowitz, Google Cloud.

LLM — Large Language Models

These are sophisticated AI models, like GPT-3, that are trained on vast amounts of text data to understand and generate human-like language. Large Language Models have various applications, including text generation, language translation, question answering, and more.

We can run and experiment with open-source models locally using the Ollama tool.

Llama2 LLM

Llama 2 was released by Meta Platforms, Inc. The model is trained on 2 trillion tokens and by default supports a context length of 4096 tokens. The Llama 2 Chat models are fine-tuned on over 1 million human annotations and are made for chat.
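Once Ollama is running, a model such as Llama 2 can be queried over Ollama's local HTTP API from plain Java, before any Spring wiring is added. The sketch below assumes a local Ollama server on its default port (11434) with the `llama2` model already pulled. The class and helper names are hypothetical, and the JSON is built by hand only to keep the example dependency-free.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class OllamaGenerateDemo {

    // Assumption: Ollama is running locally on its default port.
    static final String ENDPOINT = "http://localhost:11434/api/generate";

    /** Minimal escaping for this sketch; a real application would use a JSON library such as Jackson. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the request body; "stream": false asks for one complete JSON answer instead of chunks. */
    static String requestBody(String model, String prompt) {
        return "{\"model\":\"" + escapeJson(model) + "\",\"prompt\":\"" + escapeJson(prompt)
                + "\",\"stream\":false}";
    }

    static HttpRequest buildRequest(String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(buildRequest("llama2", "Why is the sky blue?"), HttpResponse.BodyHandlers.ofString());
            // The generated text is in the "response" field of the returned JSON document.
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Could not reach Ollama at " + ENDPOINT + ": " + e.getMessage());
        }
    }
}
```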

Let’s explore a sample Java implementation …

Using the Spring Boot framework, we will create the app’s configuration, controller, and service to handle user chat questions and convert text to embeddings …


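import java.util.Map;

import javax.sql.DataSource;

// Spring AI package names below assume the 0.8.x release line.
import org.springframework.ai.document.MetadataMode;
import org.springframework.ai.postgresml.PostgresMlEmbeddingClient;
import org.springframework.ai.postgresml.PostgresMlEmbeddingClient.VectorType;
import org.springframework.ai.postgresml.PostgresMlEmbeddingOptions;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.AsyncTaskExecutor;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class AppConfiguration {

    @Autowired
    private DataSource dataSource;

    // Exposes an embedding client backed by the PostgresML extension.
    @Bean
    PostgresMlEmbeddingClient embeddingClient() {
        PostgresMlEmbeddingClient embeddingClient = new PostgresMlEmbeddingClient(
                new JdbcTemplate(dataSource),
                PostgresMlEmbeddingOptions.builder()
                        .withTransformer("distilbert-base-uncased") // Hugging Face transformer model name.
                        .withVectorType(VectorType.PG_VECTOR) // vector type in PostgreSQL.
                        .withKwargs(Map.of("device", "cpu")) // optional arguments.
                        .withMetadataMode(MetadataMode.EMBED) // Document metadata mode.
                        .build());

        return embeddingClient;
    }
}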


import java.io.IOException;
import java.util.List;

import org.slf4j.Logger;
import org.springframework.ai.chat.messages.Media; // package as of Spring AI 0.8.x (assumed version)
import org.springframework.core.io.ClassPathResource;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.util.MimeTypeUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// NOTE: the mapping annotations were lost from this listing; the paths and HTTP methods below are assumptions.
@RestController
@RequestMapping("/api/v1/ai")
public class LlamaRestController {

    private final Logger log = org.slf4j.LoggerFactory.getLogger(LlamaRestController.class);

    private final LlamaAiService llamaAiService;

    public LlamaRestController(LlamaAiService llamaAiService) {
        this.llamaAiService = llamaAiService;
    }

    // Answers a free-text chat question.
    @GetMapping("/chat") // assumed path
    public ResponseEntity<LlamaResponse> chat(
            @RequestParam(value = "text", defaultValue = "") String text) throws IOException {
        if (text.isBlank()) {
            return ResponseEntity.status(HttpStatus.BAD_REQUEST).build();
        }
        log.info("generating response for text ... {}", text);
        final LlamaResponse aiResponse = llamaAiService.generateMessage(text);
        log.info("got response for text {}", aiResponse.getMessage());
        return ResponseEntity.status(HttpStatus.OK).body(aiResponse);
    }

    // Wraps the input in a summarization instruction before sending it to the model.
    @GetMapping("/summarize") // assumed path
    public ResponseEntity<LlamaResponse> summarize(
            @RequestParam(value = "text", defaultValue = "") String text) throws IOException {
        if (text.isBlank()) {
            return ResponseEntity.status(HttpStatus.BAD_REQUEST).build();
        }
        text = String.format("please summarize this text: %s", text);
        log.info("generating response for text ... {}", text);
        final LlamaResponse aiResponse = llamaAiService.generateMessage(text);
        log.info("got response for text {}", aiResponse.getMessage());
        return ResponseEntity.status(HttpStatus.OK).body(aiResponse);
    }

    // Sends a prompt, optionally together with a .png image bundled under /static/files.
    @GetMapping("/generate") // assumed path
    public ResponseEntity<LlamaResponse> generate(
            @RequestParam(value = "promptMessage") String promptMessage,
            @RequestParam(value = "filename", defaultValue = "") String filename) throws IOException {

        if (promptMessage.isBlank()) {
            return ResponseEntity.status(HttpStatus.BAD_REQUEST).build();
        }

        List<Media> media = null;
        LlamaResponse aiResponse = null;

        if (filename != null && filename.length() > 0) {
            if (!filename.toLowerCase().endsWith(".png")) {
                aiResponse = new LlamaResponse();
                aiResponse.setMessage("only .png files allowed");
                return ResponseEntity.badRequest().body(aiResponse);
            }

            // Strip a stray leading '=' from the parameter value, then load the image from the classpath.
            filename = filename.startsWith("=") ? filename.substring(1) : filename;
            byte[] fileData = new ClassPathResource("/static/files/" + filename).getContentAsByteArray();
            media = List.of(new Media(MimeTypeUtils.IMAGE_PNG, fileData));
            log.info("generating response for text [{}] and file [{}] ...", promptMessage, filename);
            aiResponse = llamaAiService.generateMessage(promptMessage, media);
        } else {
            log.info("generating response for text ... {}", promptMessage);
            aiResponse = llamaAiService.generateMessage(promptMessage);
        }
        log.info("got response for text {}", aiResponse.getMessage());

        return ResponseEntity.status(HttpStatus.OK).body(aiResponse);
    }
}


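import java.io.IOException;
import java.util.List;

// Spring AI package names below assume the 0.8.x release line.
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.messages.Media;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.stereotype.Service;

@Service
public class LlamaAiService {

    private final OllamaChatClient chatClient;

    public LlamaAiService(OllamaChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public LlamaResponse generateMessage(String promptMessage) throws IOException {
        return generateMessage(promptMessage, null);
    }

    public LlamaResponse generateMessage(String promptMessage, List<Media> media) throws IOException {
        String llamaMessage = null;

        if (media == null) {
            // The right-hand side was lost in the listing; chatClient.call(...) is the presumed original.
            llamaMessage = chatClient.call(promptMessage);
        } else {
            // Image input needs a multimodal model (for example llava); "mistral" is kept from the original.
            var userMessage = new UserMessage(promptMessage, media);
            ChatResponse response = chatClient.call(
                    new Prompt(List.of(userMessage), OllamaOptions.create().withModel("mistral")));
            llamaMessage = response.getResult().getOutput().getContent();
        }

        LlamaResponse resp = new LlamaResponse();
        resp.setMessage(llamaMessage); // presumed: the listing created the response but never filled it
        return resp;
    }
}
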
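The controller and service above rely on a `LlamaResponse` type that the article does not show. A minimal sketch, assuming it is a plain bean with a single `message` property (the only accessors the listings call are `getMessage()` and `setMessage(String)`), could look like this. It is a hypothetical reconstruction, not the author's class.

```java
// Hypothetical reconstruction: only getMessage()/setMessage(String) are implied by the listings above.
class LlamaResponse {

    private String message;

    public String getMessage() {
        return message;
    }

    public void setMessage(String message) {
        this.message = message;
    }
}
```

Because the bean exposes a standard getter, Spring's default JSON conversion would typically serialize it as an object with a single `message` field.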
[Screenshot: interaction with the Llama 2 LLM]
[Screenshot: interaction with the PostgresML vector database]

Feel free to download a copy of the code…

[1] Spring Boot.

[2] What is a Vector Database?: A Comprehensive Guide.

[3] Information provided by OpenAI’s GPT model, accessed on May 28, 2024.