
On-premise AI knowledge engine for real estate, built in 90 days
United States
Real estate
3 months
MVP delivered
Embedded AI team
1 Solution architect, 1 Full-stack engineer
2021-now
Work ongoing
Business context
A US company has worked in property management for 20+ years – a tough, competitive market.
As rivals began rolling out AI features, the leadership team knew they had to keep pace.
Solution overview
Together with Radency, they started with the most urgent use case for AI – building an internal knowledge engine.
We helped deliver it as an on-premise MVP in just 3 months.
AI challenge:
parsing, structuring, and protecting knowledge
"Do you have AI?”
That was the question our client kept hearing at industry roadshows. With competitors already showcasing it, they decided to dive in, starting from a practical business use case.
After years of collaboration, the CEO approached us with new ideas. We had to pick one that was both high-impact and quick to validate through R&D.
We landed on a RAG-powered assistant that parses, cleans, and structures a vast pool of documents to give business users instant answers to their queries.
What once took managers 30 minutes now takes seconds.
“The CEO had many ideas for how to integrate AI in their business. We started with a RAG-powered knowledge engine to help their managers find needed operational data quickly.”

Max Honcharuk
Partner & Solution Architect at Radency
We solved three core problems:
Privacy.
All data had to stay on-premise, with strict expiration rules. No cloud APIs, no external LLMs.
Parsing didn’t work.
Documents came in PDFs, DOCX, or JSON, some with tables, others as scanned images full of OCR noise.
Answer quality.
Answers had to be fast, reliable, and tailored to each business user’s settings, terminology, and internal communication standards.


Radency assembled a dedicated AI team
We kicked off this project with our Solution Architect and a certified full-stack engineer from Radency’s certification program.
1 →
Discovery in 1 week
A joint workshop with the CEO helped narrow dozens of ideas to one high-impact use case: a knowledge engine powered by retrieval augmented generation (RAG).
2 →
Proof of Concept in 40 days
Our Solution Architect and full-stack engineer ran R&D on parsing pipelines and models, testing different approaches to clean messy documents and retrieve accurate answers.
3 →
MVP in 1.5 months
With the concept validated, the team built an MVP that could process and query PDFs, which became the foundation of the client’s on-premise Knowledge Engine.
4 →
Expansion to financial data
We continued training the system to handle financial reports and other ops data. Today it processes a variety of reports, from balance sheets to liability breakdowns.
5 →
Prepping for production at scale
Our engineer continues supporting this project under the coordination of our Solution Architect. They’re improving accuracy, expanding coverage, and prepping the product for full rollout.
Results achieved with Radency’s AI team:
3 months for PoC and MVP
A small team delivered a working on-premise knowledge engine in just three months of active development + R&D.

~99% faster data lookups
Finding a figure in a 100-page report could take up to 30 minutes. Now semantic search retrieves it instantly, with the local LLM returning answers in seconds.
100% privacy compliance
All models, embeddings, and rerankers run on the client’s own GPU server, with no data leaving their infrastructure. Expiration rules automatically clear old records.
10 report types covered
Beyond PDFs, the engine now processes ~10 financial report formats, giving managers instant access to data once buried in manual reviews.

“With the RAG engine, managers can get answers in seconds instead of spending half an hour scrolling through reports.”

Max Honcharuk
Partner & Solution Architect at Radency

MVP of an on-premise AI knowledge engine, shipped in 3 months
The knowledge engine was built as a fully on-premise RAG system, keeping every document inside the client’s infrastructure. It connects to the client's platform via standard REST APIs.
The entire AI stack runs locally on a GPU server. This setup:
- Meets the client’s data residency and privacy policies
- Lets the company avoid the recurring costs of cloud LLMs
Our team continues to refine the engine’s accuracy by improving parsing pipelines and fine-tuning the model.
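For illustration, here is a minimal sketch of how such an engine can be exposed to the client’s platform over REST with Flask. The route name, payload fields, and the stubbed answer function are assumptions for this sketch, not the production API.

```python
# Illustrative sketch only: a Flask endpoint the client's platform could call.
# The /query route, payload fields, and the stub below are assumptions.
from flask import Flask, request, jsonify

app = Flask(__name__)

def answer_question(question: str) -> str:
    # In the real system this would call the on-prem RAG pipeline
    # (see the stack sketch in the features below).
    return "stubbed answer"

@app.post("/query")
def query():
    payload = request.get_json(force=True)
    question = payload.get("question", "")
    if not question:
        return jsonify({"error": "question is required"}), 400
    return jsonify({"answer": answer_question(question)})

if __name__ == "__main__":
    # Served only inside the client's network; nothing is exposed externally.
    app.run(host="0.0.0.0", port=8080)
```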
Key features shipped with Radency’s engineers:
01
FEATURE
On-premise AI stack
Everything runs locally: embeddings and reranking run on self-hosted models, while generation runs on a self-hosted Llama 3.3. No data leaves the servers, which keeps it compliant and cheaper than running cloud LLMs.
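A minimal sketch of how this kind of local stack can be assembled with LlamaIndex, a BGE embedding model, a reranker, and a self-hosted Llama 3.3. The exact model names, the Ollama serving layer, and the Milvus connection details are assumptions for illustration, not the client’s exact configuration.

```python
# Illustrative on-prem RAG stack; model names and connection details are assumptions.
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.milvus import MilvusVectorStore

# Everything below runs on the local GPU server; nothing calls a cloud API.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
Settings.llm = Ollama(model="llama3.3", request_timeout=120.0)

vector_store = MilvusVectorStore(
    uri="http://localhost:19530",    # on-prem Milvus instance
    collection_name="knowledge_engine",
    dim=1024,                        # embedding size of bge-large-en-v1.5
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Parsed report chunks are indexed into the local vector store.
documents = SimpleDirectoryReader("./parsed_reports").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# A local reranker keeps only the most relevant chunks before generation.
reranker = SentenceTransformerRerank(model="BAAI/bge-reranker-large", top_n=4)
query_engine = index.as_query_engine(node_postprocessors=[reranker])
```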
02
FEATURE
Document parsing pipelines
Different processing pipelines handle different file types: PDFs, tables, scans, multi-report packets. OCR cleans up messy scans. Metadata routing sends files to the right parser.
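A simplified sketch of how metadata routing can dispatch files to per-format pipelines. The parser stubs and the “scanned” flag are illustrative assumptions; in the real pipelines, scanned documents additionally pass through OCR cleanup.

```python
# Illustrative routing sketch: file type plus a "scanned" flag pick the pipeline.
from pathlib import Path
from typing import Callable

def parse_text_pdf(path: Path) -> str:
    # Real pipeline: extract the text layer and normalize tables.
    return f"text extracted from {path.name}"

def parse_scanned_pdf(path: Path) -> str:
    # Real pipeline: OCR each page, then strip recurring OCR noise.
    return f"OCR output for {path.name}"

def parse_docx(path: Path) -> str:
    return f"paragraphs and tables from {path.name}"

def parse_json_report(path: Path) -> str:
    return path.read_text(encoding="utf-8")

PARSERS: dict[tuple[str, bool], Callable[[Path], str]] = {
    (".pdf", False): parse_text_pdf,
    (".pdf", True): parse_scanned_pdf,
    (".docx", False): parse_docx,
    (".json", False): parse_json_report,
}

def route(path: Path, scanned: bool = False) -> str:
    parser = PARSERS.get((path.suffix.lower(), scanned))
    if parser is None:
        raise ValueError(f"No pipeline registered for {path.name}")
    return parser(path)
```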
03
FEATURE
Semantic search & chat
Users can just ask questions instead of scrolling through 100-page reports. A query like “Compare assets with liabilities this quarter” gets an instant, context-aware answer.
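Continuing the stack sketch above, a question runs through retrieval, reranking, and generation in a single call (assuming the `query_engine` defined earlier).

```python
# Illustrative query against the sketched on-prem stack.
response = query_engine.query("Compare assets with liabilities this quarter")
print(response)                          # the generated answer
for source in response.source_nodes:     # the report chunks it was grounded in
    print(source.node.metadata.get("file_name"), source.score)
```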
04
FEATURE
Role-based access
Access is locked down by role. Right now only managers and operators can use it. Rollout starts with internal ops teams before opening up further.
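A minimal sketch of how role gating can sit in the Flask layer. The header name and the allowed-role set are assumptions; the real system ties access to the client’s own user accounts.

```python
# Illustrative role check for the API layer; header name and roles are assumptions.
from functools import wraps
from flask import request, jsonify

ALLOWED_ROLES = {"manager", "operator"}  # rollout starts with internal ops teams

def require_role(view):
    @wraps(view)
    def wrapper(*args, **kwargs):
        role = request.headers.get("X-User-Role", "")
        if role not in ALLOWED_ROLES:
            return jsonify({"error": "access denied"}), 403
        return view(*args, **kwargs)
    return wrapper

# Usage: stack it under the route decorator of the /query endpoint sketched earlier, e.g.
#   @app.post("/query")
#   @require_role
#   def query(): ...
```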
05
FEATURE
Data lifecycle manager
Handles history and cleanup. Chat sessions are stored, collections can be resynced, and old data is automatically removed on schedule.
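A minimal sketch of automatic expiration, assuming chat sessions live in MongoDB: a TTL index removes records once their `expires_at` timestamp passes, while vector collections are resynced separately. Collection names and the retention window are assumptions.

```python
# Illustrative expiration sketch; names and the 30-day window are assumptions.
from datetime import datetime, timedelta, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # on-prem MongoDB
sessions = client["knowledge_engine"]["chat_sessions"]

# MongoDB removes each document shortly after its `expires_at` time is reached.
sessions.create_index("expires_at", expireAfterSeconds=0)

sessions.insert_one({
    "user": "manager-42",
    "messages": [],
    "expires_at": datetime.now(timezone.utc) + timedelta(days=30),
})
```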
06
FEATURE
Financial report support
Covers around 10 report types: balance sheets, income statements, liability breakdowns. New data is added as business needs evolve.
Solution architecture:

Technologies used
→ AI / NLP
Llama 3.3 (self-hosted)
BGE Embedding Model
→ Backend
REST API (Flask)
LlamaIndex
→ Storage
Milvus
MongoDB
MS SQL Server
→ Infrastructure
On-prem GPU server
Docker
From zero AI to Knowledge Engine in 90 days
We keep improving the Knowledge Engine and preparing it for full rollout. Meanwhile, the client plans to scale the AI team as they move beyond RAG into AI agents and more advanced models.
Before
No AI, risk of falling behind competitors
Up to 30 min to search through reports
Privacy concerns blocked cloud adoption
After
On-premise knowledge engine in 3 months
Answers in seconds (~99% faster)
100% local, self-hosted AI stack
