Kentucky AI.

Research & Development

An independent AI research-and-development lab in Shelbyville, Kentucky.
We fine-tune, quantize, and serve domain-adapted model stacks — and the inference infrastructure beneath them — for operators across Kentucky.

A trusted pillar of Kentucky's applied-AI community · read by people and agents alike

Open source · release

What we ship

Open-source takeoff, in the browser.

OpenTakeoff is an open-core takeoff canvas for flooring — the measuring engine, open-sourced under Apache-2.0, with the trained models and estimate engines kept proprietary. Open a plan, set the scale, trace the rooms, export a clean per-finish report. It runs entirely in your browser — nothing leaves your computer.

Open source · Apache-2.0

Open release

OpenTakeoff

A free, browser-only PDF takeoff canvas built for flooring contractors. One-Click room detection traces an enclosed space and snaps it to the corners; conditions, waste, and supporting materials roll up into a clean per-finish report you can export. The canvas is open — the trained models and estimate engines stay proprietary.
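
The trace-then-report flow above comes down to polygon-area arithmetic. The following is a minimal sketch, not OpenTakeoff's code: it assumes a room is traced as a list of (x, y) vertices in PDF points, that the sheet scale is given as points per foot, and that waste is a flat percentage. The function names and the sample room are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    if len(points) < 3:
        return 0.0
    twice_area = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def order_quantity_sqft(points: List[Point], points_per_foot: float, waste_pct: float) -> float:
    """Traced area converted to square feet, then grossed up by a flat waste percentage."""
    net_sqft = polygon_area(points) / points_per_foot ** 2
    return net_sqft * (1.0 + waste_pct / 100.0)


# Hypothetical room traced on a 1/4" = 1'-0" sheet. PDF user space is 72 points
# per inch, so one foot of floor spans 18 points on the page.
room = [(0, 0), (360, 0), (360, 270), (0, 270)]
net = polygon_area(room) / 18 ** 2
gross = order_quantity_sqft(room, points_per_foot=18, waste_pct=10)
print(f"net {net:.1f} sq ft, with waste {gross:.1f} sq ft")
```

The sample room is 20 ft by 15 ft, so the net area is 300 sq ft and a 10% waste allowance brings the order quantity to about 330 sq ft. A per-finish report would sum these quantities over every room assigned to the same finish.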

Stack: React · pdf.js · in-browser
License: Apache-2.0
Runs: Client-only · no upload
Cost: Free & open

Try it live →

Research & publications

The record

What we're working on.

An independent lab's public record. Abstracts are open; full papers are shared on request while the work is under review — the proprietary methods, weights, and data stay in the building. The thesis is plain: capable, sovereign, locally-served models in the hands of people doing real work.

Field note + whitepaper
published 2026

The Turbo Framework: Operational Exhaust as a Supervisory Signal

The frontier of applied AI doesn't run on new fuel — it runs on operational exhaust: the high-fidelity data your operation already discards in its daily work. A closed-loop architecture that captures real, reviewed work first (the anchor against model collapse), builds the executable verifier next (the as-built closeout, which doubles as the eval set), filters early-trajectory candidates before spending teacher tokens, supervises on-policy with a mode-seeking reverse-KL objective, and specializes per-vertical with LoRA — training is the last, smallest move. A hook-driven field note over a formal whitepaper, read against a real Division 9 takeoff.
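
To make "mode-seeking reverse-KL" concrete: reverse KL is KL(student ‖ teacher), an expectation taken under the student's own distribution, which is why it pairs naturally with on-policy sampling. The toy below is a sketch with made-up three-outcome distributions, not the framework's training code. It shows that reverse KL scores a student that commits to one teacher mode better than one that spreads its mass, while forward KL ranks them the other way.

```python
import math
from typing import Sequence


def kl(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical teacher with two strong modes and a low-probability valley between them.
teacher = [0.49, 0.02, 0.49]

# Two hypothetical students.
student_mode = [0.96, 0.02, 0.02]   # commits to one teacher mode
student_cover = [1 / 3, 1 / 3, 1 / 3]  # spreads mass, including into the valley

for name, student in [("mode", student_mode), ("cover", student_cover)]:
    reverse = kl(student, teacher)  # KL(student || teacher): mode-seeking
    forward = kl(teacher, student)  # KL(teacher || student): mass-covering
    print(f"{name:5s} reverse={reverse:.3f} forward={forward:.3f}")
```

Reverse KL penalizes the student for putting probability where the teacher puts little, so the committed student wins under it. Forward KL penalizes the student for missing anything the teacher does, so the spread-out student wins under that objective instead.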

Read the article →

Visual field map
educational

The Turbo Framework, Drawn

A ten-frame visual field map of the Turbo Framework — the closed loop, the exhaust manifold, the verifier valve, the wastegate, and the twin-scroll turbine, each mechanism drawn against a real Division 9 takeoff. A swipeable deck made to teach the architecture, not just state it.

Open the deck →

Visual field map
educational

Distillation & Fine-Tuning, Drawn

An eleven-frame visual field map — distillation, LoRA, on-policy training, quantization, and evaluation, each mechanism drawn against a real Division 9 takeoff. A swipeable deck made to teach the method, not just state it. The full written field note is right below.

Open the deck →

Field note
published 2026

Forging Local Models: Distillation, LoRA & the 128 GB Envelope

A field method for putting capable models inside the building: distil a domain expert's judgment into a structured knowledge graph, adapt a reasoning-capable base with per-vertical LoRA, and serve it quantized on Apple Silicon. The hard part isn't the training loop — it's picking a base that can actually reason, budgeting tokens so it can, and knowing when the right parser beats the bigger model.
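
Per-vertical LoRA stays cheap because each adapted weight matrix gains only two thin low-rank factors. The arithmetic below is a rough sketch; the hidden size and rank are illustrative assumptions, not figures from the field note.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA learns the update to a (d_out x d_in) weight as B @ A,
    with B of shape (d_out x rank) and A of shape (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical square projection matrix and adapter rank.
d_model = 4096
rank = 16

full = d_model * d_model                       # parameters in the frozen weight
adapter = lora_trainable_params(d_model, d_model, rank)
print(f"full: {full:,}  adapter: {adapter:,}  ratio: {adapter / full:.2%}")
```

At these sizes the adapter trains 131,072 parameters against 16,777,216 in the frozen matrix, under 1%. That is what makes it practical to keep one small adapter per vertical on top of a single shared base.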

Read the article →

Working paper
under review

The Floor as a Natural Experiment

Hospital floors are a documented reservoir for healthcare-associated pathogens, a modifiable factor in patient falls, and a long-lived asset whose lowest-bid selection is rarely its lowest lifetime cost — yet the evidence stays correlational, because floors cannot be randomized in occupied units. We argue the missing experiment already exists: every flooring replacement is an exogenous change to a defined surface, in a known unit, on a known date. A specialty contractor's installation calendar becomes a registry of natural experiments supporting quasi-experimental causal identification — interrupted time-series, difference-in-differences, target-trial emulation — anchored by a standardized Floor Health Index.
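
Of the designs named above, difference-in-differences is the simplest to show in a few lines. The sketch below uses synthetic numbers invented purely for illustration (they are not results from the paper) and assumes parallel trends: without the replacement, the treated unit's outcome would have moved in step with the comparison unit's.

```python
from statistics import mean
from typing import Sequence


def diff_in_diff(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Change in the treated unit minus change in the comparison unit."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# SYNTHETIC monthly outcome rates, made up for this example only.
# "Treated" is a unit whose floor was replaced on a known date;
# "control" is a comparable unit with no replacement in the window.
treated_pre = [4.0, 4.2, 3.8]
treated_post = [3.0, 3.2, 2.8]
control_pre = [4.1, 3.9, 4.0]
control_post = [3.8, 3.6, 3.7]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"difference-in-differences estimate: {effect:+.2f}")
```

In this toy the treated unit drops by 1.0 while the comparison unit drops by 0.3, so the estimate attributes a change of about -0.7 to the replacement. A real analysis would add uncertainty estimates and check the pre-period trends before trusting the number.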

Request full paper →

Working paper

Spline · Document-Grounded Quantity Takeoff

A fine-tuned, tool-using model for commercial preconstruction: it parses plansets and specifications, runs quantity takeoff and scope extraction, prices against a historical cost index, and flags substrate and moisture risk — every figure traced to a source line by span-level attribution. The paper documents the document-grounded estimating method and its evaluation against as-built outcomes.

Request full paper →

Field note
open

Toward AI-First Estimating

Estimating is becoming an AI-first discipline — not a plugin bolted onto the old workflow, but a unified data-intelligence layer that connects plans, field reality, and historical cost. A practitioner reading of the 2026 evidence: where the rigorous signal is, and where the round-numbered vendor benchmarks aren't. Sources in the feed below.

Read in Following ↓

Following · field notes

The feed

What we're reading.

Model releases, papers, and reports we track at the edge of AI-first estimating and the spatial intelligence of buildings — each with a one-line read on why it matters. External links are open; the take is ours. We cite the independent numbers and flag the vendor-reported ones.

Jun 2026

Z.ai · GLM-5.2 (open weights)

The open-weight frontier moved again: GLM-5.2 ships MIT-licensed, with a 1M-token context window and a 743B-parameter mixture-of-experts core (256 experts, DeepSeek-V3.2-style sparse attention), posting frontier-class agentic scores — Terminal-Bench 2.1 81.0, SWE-bench Pro 62.1%, as reported. A 4-bit MLX build for Apple Silicon already exists. The honest read on the 128 GB envelope: a 743B MoE is ~400 GB even at 4-bit and wants 256–512 GB of unified memory — open-weight is not the same as fits-on-the-desk. So the play isn't to host it locally; it's to use a model like this as a teacher — distil its judgment into a student that does fit the building — and watch expert-pruning (REAP) close the rest of the gap.
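
The envelope arithmetic behind that read is short enough to write out. This is a back-of-envelope sketch for weights only: it ignores KV cache and runtime overhead, and the 4.5 bits-per-weight figure is an assumption standing in for the scales and zero-points that group-wise quantization stores alongside the 4-bit values. Note that a mixture-of-experts model needs every expert resident in memory even though only a few are active per token.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * bits_per_weight / 8.0


def fits(params_billion: float, bits_per_weight: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given unified-memory budget."""
    return weight_memory_gb(params_billion, bits_per_weight) <= memory_gb


ideal = weight_memory_gb(743, 4.0)      # pure 4-bit weights
effective = weight_memory_gb(743, 4.5)  # assumed overhead for quantization metadata
print(f"743B at 4.0 bits: {ideal:.1f} GB")
print(f"743B at 4.5 bits: {effective:.1f} GB")
print("fits in 128 GB:", fits(743, 4.5, 128))
```

Pure 4-bit weights come to 371.5 GB and the assumed 4.5-bit effective rate to roughly 418 GB, which brackets the ~400 GB figure above and leaves a 128 GB machine well short.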

Z.ai · open weights · on the bench

Jun 2026

A Survey of On-Policy Distillation for LLMs

Song & Zheng reframe distillation as iterative correction, not single-pass imitation: instead of copying a teacher's flawless prefixes, the student generates and the teacher grades what the student actually produced — closing the exposure-bias gap that compounds over long outputs. The read for us: this is exactly how you transfer a frontier model's judgment into a small, locally-served student that has to hold up over a full planset, not a one-line prompt.

Song & Zheng · arXiv survey

Nov 2025

Exploring LLMs with MLX and the M5 Neural Accelerators

Apple's own measurements: the M5's GPU neural accelerators cut time-to-first-token by up to ~3.6× over the M4 under MLX, and 4-bit quantization fits a 30B-class model inside 24 GB of unified memory. Confirmation of the bet we build on — serious, reasoning-capable models now run sovereign on the desk, not only in someone else's datacenter.

Apple Machine Learning Research · note

Feb 2026

Memory Caching: RNNs with Growing Memory

Behrouz et al. give recurrent models a memory that grows with sequence length — caching checkpoints of the hidden state instead of crushing the entire past into one fixed vector, a tunable dial between an RNN's flat memory cost and a Transformer's growing one. Their variants land close to Transformers on in-context recall. The inference-infrastructure read for us: reading a full planset and its specs is a long-context recall problem, and a model that recalls without Transformer-scale KV blowup is exactly what locally-served, sovereign serving on our hardware needs.

Google / Cornell / USC · paper

Mar 2026

Digital Twin Implementation Status — A Systematic Literature Review

Raza, Bilberg, Malik & Ribeiro da Silva map digital-twin implementation across manufacturing systems — integration levels, which components actually close the loop. Directly relevant to the spatial-knowledge-graph thread: a building, like a shop floor, is only as smart as the twin that stays current.

ASTM SS&MS · SLR

Mar 2026

AI in Construction Project Management: A Systematic Literature Review

A PRISMA review of 392 studies mapping AI across cost, time, and safety management. The academic baseline for AI-first estimating — and a useful map of where the literature is thin (causal claims, field validation).

MDPI Buildings · SLR

2026

AI for Construction · Industry Report 2026

Zacua Ventures' read matches ours from the other side of the table: AI compounds only when paired with standardized data, repeatable process, and engineering oversight. 67% of ConTech investors are increasing AI exposure — the capital is following the data layer, not the demo.

Zacua Ventures · report

Dec 2025

AEC Technology Outlook 2026 — Adoption Survey

Bluebeam's survey of 1,000+ AEC professionals (via ASCE): only 27% use AI day-to-day, but 94% of those are scaling up in 2026, and 68% of early adopters have saved at least $50k. Adoption is early and accelerating — exactly the window to set the standard.

Bluebeam / ASCE · survey

Research areas

The practice

If it's worth building, we build it.

An independent R&D lab for applied AI. We architect, train, and operate models — and the unglamorous infrastructure that turns a clever demo into something that survives production.

Foundry

Model training & fine-tuning.

We forge open-weight foundations — Llama, Qwen, Mistral, GLM — into domain-native models: continued pretraining, LoRA and full-parameter fine-tunes, distillation, quantization, shipped to inference.

Inference

Private, on-prem deployment.

Tuned open models served behind your firewall — air-gapped and sovereign. Weights, latency, and corpus stay in the building, under your control.

Compute

Training compute, by the run.

Reserve local GPU capacity for fine-tunes and large jobs — orchestration, checkpointing, and the eval harness handled, with an architect on the line.

Corpora

Data acquisition & curation.

Web-scale acquisition, extraction, deduplication, and structuring — turning the open web and your archives into clean, model-ready corpora.

Architecture

AI-readiness assessments.

We map your stack and return a blueprint: where agents earn their keep, where they don't, and the infrastructure to run them safely.

Research

Commissioned research & systems.

Bespoke agentic pipelines, retrieval, and evaluation — original work built for a single question or a single operator.

We architect, train, and operate — then hand over the methodology and the checkpoint. A trusted pillar of Kentucky's applied-AI community: we support other builders and help them ship.

About

We ship weights, not decks.

Kentucky AI is an independent AI research-and-development lab in Shelbyville, Kentucky. We fine-tune, distill, quantize, and serve domain-adapted model stacks — and the inference infrastructure beneath them — for a small roster of operators and for our own research. The thesis is plain: capable, sovereign, locally-served models in the hands of people doing real work.

We architect, train, quantize, deploy, and operate — then hand over the methodology and the checkpoint, not a login. No SaaS seat. No consultancy deck. A lab that ships weights.

We're a trusted pillar of Kentucky's applied-AI community: we support other builders and help them ship. You're welcome to email to compare notes, build on the research, deploy a model, or collaborate on a study. Agents are welcome to read this and act on it.

Email me →