The Hidden Costs of Gen AI on Our Environment
- The Deledao Team
- 8 min read
AI is everywhere in K12. Vendors promise smarter threat detection, better classroom support, and faster filtering. But few talk about what these tools cost. Not just in dollars, but in power consumption, water usage, and operational complexity.
You might ask: how does the bill break down? It depends largely on which type of AI model you choose. In this article, we'll break down the real environmental and practical differences between generative AI and real-time AI for content filtering, and why bigger isn't always better when it comes to protecting students online.
The difference between generative AI and deterministic AI
How does generative AI work?
Generative AI aims to understand patterns in order to generate content (think ChatGPT, image-generation tools, etc.). These general-purpose, compute-hungry Large Language Models (LLMs) can write human-like text, generate images or code, answer questions, and perform a wide variety of tasks requiring broad “world knowledge.”
To satisfy this ambition, AI companies have constructed huge models, often taking up hundreds of gigabytes (GB) or even terabytes (TB) of storage. They rely on powerful, state-of-the-art cloud computers and often run in tightly controlled data centers with advanced cooling, networking, and power infrastructure.
Training GPT-3 was estimated to consume 1,287 megawatt-hours (MWh), roughly the energy use of ~120 U.S. homes for a year. This releases hundreds of tons of CO₂, even before counting inference. (csail.mit.edu)
How does deterministic AI work?
In contrast, deterministic AI (also called predictive AI), such as Deledao's InstantAI™, aims to understand patterns in order to recognize content. Content filtering is narrow in scope: rather than learning to generate creative output, the model focuses on the behaviors and signs that make a page distracting or inappropriate, much as a human reviewer would. It answers questions like:
How “safe or risky” is this website?
Is this in the “allowed vs blocked” category?
How “educational or distracting” is this video?
Unlike generative AI, these models require less storage; they are optimized for speed, low resource use, and deterministic behavior. They can often run on modest hardware: CPUs, low-end or mid-range GPUs, or edge-deployment appliances.
For a school environment, that’s significant, because web filtering is not a generation task but a classification job at its core. The domain is narrow, consisting of web pages, video metadata, text, and images, which makes it fairly predictable. In that context, specialized, task-specific models are often the right tool.

What parameter counts tell us about AI energy usage
For IT directors, here's how model size translates into real-world infrastructure demands:
Parameter count is a proxy for how large a model is, and how much memory and compute it requires.
VRAM (or memory footprint) determines what kind of hardware you need to host the model (multiple GPUs with 80 GB VRAM each vs. a simple server or edge device).
Compute footprint determines energy consumption, latency, and scaling cost.
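A quick back-of-the-envelope sketch makes the connection concrete. The model sizes below are hypothetical, and the fp16 (2 bytes per parameter) assumption covers weights only, not activations or caches:

```python
# Back-of-the-envelope VRAM estimate: parameters x bytes per parameter.
# Hypothetical model sizes; fp16 weights assumed (2 bytes each). Real
# deployments need extra headroom for activations, caches, and batching.

def weight_vram_gb(params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return params * bytes_per_param / 1e9

print(f"70B-parameter LLM:        ~{weight_vram_gb(70e9):.0f} GB")  # ~140 GB -> multiple 80 GB GPUs
print(f"50M-parameter classifier: ~{weight_vram_gb(50e6):.1f} GB")  # ~0.1 GB -> fits on a CPU
```

That gap is what separates a GPU farm from a commodity server.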
For generative AI models
Large models, with parameter counts often ranging from 20B to 750B or more, demand multiple high-end GPUs with advanced cooling and data center infrastructure. In practice, this translates into complex cloud hosting or dedicated GPU farms, which can cost tens of thousands of dollars every month on top of high operational overhead.
For deterministic AI models
Smaller models in the 1M–100M parameter range fit comfortably on modest hardware, sometimes even running on CPUs or inexpensive GPUs. They require far less power, draw minimal VRAM, and avoid complex parallelization or massive inference pipelines.
Using lighter, task-appropriate models can reduce energy consumption by as much as 98% compared to indiscriminately using large models. (arXiv)
In other words, the gains from throwing huge models at simple tasks rapidly diminish, but the resource costs don’t.

The environmental costs of running generative AI
Data center energy demand is already huge (and growing)
Global data center electricity consumption in 2022 was estimated at 460 terawatt-hours (TWh). That's enough to put data centers among the top 20 electricity consumers globally. (csail.mit.edu)
In North America alone, data center power requirements rose from ~2,688 megawatts at the end of 2022 to ~5,341 megawatts by the end of 2023, driven in part by demand for generative AI workloads. (MIT News)
Data center electricity demand globally is projected to more than double by 2030, with AI as a primary driver. (IEA)
We can see these aren’t marginal trends, but long-term infrastructure commitments. Bigger data centers mean larger cooling systems, heavier water usage, and greater demand on electrical grids.
Material and water costs from training large AI models
It’s an understatement to say that training a model like GPT-3 is resource-intensive. Training alone consumed ~1,287 MWh and emitted hundreds of tons of CO₂. (csail.mit.edu)
But energy and carbon are only part of the story. A recent study analyzing the environmental impact of building large AI models found that, beyond energy and water, the material footprint (the raw metals and rare earths used in GPUs) is nontrivial. Training a large model may require hundreds or thousands of GPUs, with a combined weight measured in tons of heavy metals. (arXiv)
In addition, high-density data centers demand significant water for cooling. According to one industry report, inference workloads for some LLMs draw cooling water at non-trivial rates. All told, the sheer volume of water involved is a real sustainability concern. (Devera)
The issue behind inference: It’s not “free”
Even after training, inference consumes power and resources. A recent benchmark study (2025) showed that some of the most energy-intensive models consume over 33 watt-hours (Wh) per long prompt, orders of magnitude more than efficient models. (arXiv)
Scale this to hundreds of thousands or millions of queries per day, as you might see in a large school district, and energy use becomes comparable to powering dozens or hundreds of homes.
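Here's that arithmetic as a short sketch. The 33 Wh figure comes from the benchmark above; the query volume and the ~30 kWh/day average U.S. household figure are illustrative assumptions:

```python
# Scale per-prompt inference energy to district-wide daily usage.
WH_PER_PROMPT = 33          # energy-intensive LLM, long prompt (benchmark above)
QUERIES_PER_DAY = 200_000   # hypothetical large-district daily volume
HOME_KWH_PER_DAY = 30       # assumed average U.S. household consumption

daily_kwh = WH_PER_PROMPT * QUERIES_PER_DAY / 1000
homes = daily_kwh / HOME_KWH_PER_DAY
print(f"{daily_kwh:,.0f} kWh/day, roughly {homes:,.0f} homes' daily electricity")
# -> 6,600 kWh/day, roughly 220 homes
```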
A survey of “small-model vs big-model” for real-world tasks found that by selecting the right model for the right job, global AI energy consumption could be cut by nearly 28% in 2025 alone. (arXiv)
Do web filtering systems benefit from using generative AI models?
Filtering ≠ Generation
Web filtering is about decision-making on structured inputs: URLs, HTML/text content, images, and video thumbnails. The goal is to digest context and determine whether content is safe, allowed, or educational. There's no need for long-form output or generative behavior.
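For illustration, here is a minimal sketch of filtering-as-classification using scikit-learn, with toy pages and labels (not Deledao's actual model or data). Structured input goes in, a category comes out, and nothing is generated:

```python
# A toy content classifier: the shape of the filtering problem.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pages = [
    "khan academy introduction to algebra practice problems",
    "watch free movies online no signup streaming",
    "photosynthesis explained for middle school biology",
    "play unblocked games at school arcade",
]
labels = ["educational", "distracting", "educational", "distracting"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(pages, labels)

print(clf.predict(["middle school algebra practice lesson"]))  # -> ['educational']
```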
LLMs are built for flexible, generative output. That strength doesn't translate into better accuracy or reliability for understanding context, and can even lead to undesirable behavior such as hallucinations or mis-categorizations. On top of that, large language models cost hundreds of times more to maintain and serve, for gains that are marginal at best.

Specialized models often outperform large generative models in niches
Generative AI knows a bit about everything, while real-time AI knows a lot about content filtering. When a model is fine-tuned on a well-defined domain (school web content, video metadata, common platforms), it becomes extremely efficient and accurate for that domain.
The overhead of extra parameters used by LLMs, pre-trained on the entire internet, yields little to no added value for classroom filtering.
Moreover, specialized models tend to be more deterministic and explainable: easier to debug, audit, and secure. That’s critical when you need to justify decisions to school administrators, parents, or compliance officers.
A cascaded, multi-layer filtering architecture delivers efficiency at scale
Deledao ActiveScan’s real-time filtering system, for example, is not a single monolithic model, but a pipeline of specialized layers that:
Identifies attempts to bypass restrictions - whether through anonymizing tools, unusual activity patterns, or other evasive behaviors.
Analyzes user inputs and interactions - to detect inappropriate or off-task content across text, visuals, and behavior.
Interprets context - to catch disguised or coded language and ensure harmful or unsuitable material is filtered out.
This multi-layer architecture demonstrates that small, specialized models can be both efficient and effective.
It’s like mail screening in a school building: most letters are safe and pass through quickly; only a few need detailed inspection. You don’t need a full security sweep for every envelope.
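Here is a minimal sketch of that cascade, with hypothetical stage functions standing in for the real layers. Cheap checks settle most traffic, and only the hard cases reach the costlier contextual model:

```python
# A toy three-stage filtering cascade (stage logic is illustrative only).

def allowlist_verdict(url: str) -> str | None:
    trusted = ("khanacademy.org", "wikipedia.org")
    return "allow" if any(domain in url for domain in trusted) else None

def bypass_check(url: str) -> str | None:
    # Stand-in for proxy/anonymizer and evasion-pattern detection.
    return "block" if "proxy" in url or "vpn" in url else None

def contextual_model(url: str, text: str) -> str:
    # Stand-in for the heavier, context-aware classifier.
    return "block" if "casino" in text else "allow"

def filter_request(url: str, text: str) -> str:
    # Stage 1: trusted-domain allowlist settles most traffic instantly.
    verdict = allowlist_verdict(url)
    if verdict:
        return verdict
    # Stage 2: cheap bypass/evasion heuristics.
    verdict = bypass_check(url)
    if verdict:
        return verdict
    # Stage 3: only the remaining hard cases pay for model inference.
    return contextual_model(url, text)

print(filter_request("https://en.wikipedia.org/wiki/Photosynthesis", ""))  # allow
```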
Why specialized AI is modern, not outdated
Some mistakenly assume that “small models are old-school, outdated AI.” In reality, modern AI research is embracing efficiency and specialization, not just brute force.
Advanced: Specialized models use the same state-of-the-art transformer architectures and attention mechanisms as their larger cousins, fine-tuned to get the job done.
Efficient: Researchers increasingly push for “model selection,” choosing the smallest, most efficient model appropriate for a given task. This is a core principle of green AI. (arXiv)
Lightweight: Quantization, distillation, and edge deployment make small models not only strong, but resource-friendly.
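As one concrete example of these techniques, here is a minimal post-training dynamic quantization sketch in PyTorch. The tiny model is a stand-in for a real classifier, not an actual filtering model:

```python
# Post-training dynamic quantization: shrink Linear weights to int8.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 2))

# Weights stored as int8 (~4x smaller); activations quantized on the fly.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x))  # same interface, smaller and faster on CPU
```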
From a school’s perspective, specialized AI models:
Avoid massive hardware expenses.
Support on-premises or hybrid deployment (good for privacy, compliance, and network resilience).
Are easier to maintain, scale, and audit.
Reduce risk of vendor lock-in and spiraling costs.
Real-time AI is a lightweight yet powerful way to put AI to work.

Questions K12 IT directors should ask (and insist on) when evaluating AI filter solutions
Ask these questions when a vendor pitches you AI-powered filtering/monitoring/analysis:
Model size & parameter count: Is the product built on a generative LLM, or on a small, domain-specific classifier?
Hosting & hardware requirements: Does the system need high-end GPUs or cloud clusters, or can it run on modest hardware or edge devices?
Inference latency and throughput: Does the system support real-time filtering (sub-second), or does it rely on slow, cloud-based, round-trip inference?
Energy, water, and carbon transparency: Does the vendor disclose approximate power draw, cooling/water use, and environmental footprint per request or per student?
Pipeline architecture: Is it a cascaded, multi-layer approach, or a monolithic generative model doing everything?
Privacy and data locality: Is data processed locally (on-prem) or sent off to cloud data centers with unknown jurisdictional, compliance, or security implications?
These questions help reveal whether an “AI solution” is truly optimized for K12, or simply a repackaged generative model riding AI hype.
Why deterministic AI filtering makes sense for schools
Let’s imagine two scenarios in a typical medium-sized school district.
Option A: Vendor uses generative AI in their filtering stack
Some vendors use generative AI in several of their features. This approach can introduce challenges when generative models run inline with browsing activity:
Higher environmental impact - Greater energy and water demands
Higher compute requirements* - Generative models (LLMs, transformer-based generators, etc.) generally require more GPU/TPU resources than specialized classifiers.
Less predictable cost profiles - If the vendor relies on large-model inference, costs can run several times higher than even traditional filters.
Potential latency - Depends on how frequently the model is queried and whether inference happens on local hardware or in an external cloud.
Difficult to maintain - Generative AI is still a young field, and its reliability is harder to guarantee than that of traditional systems.
*Most vendors don’t run a full LLM on every page load. Generative AI tends to be used for specific detection problems.
Option B: Vendor uses a layered, deterministic AI architecture
Some vendors, such as Deledao with its ActiveScan™, use specialized real-time models for fast heuristics.
Lower environmental impact - Small-scale inference consumes significantly less energy and water than LLM-class workloads
Real-time response - Filtering decisions happen instantly
Reliable at scale - Well-suited for districts with thousands of devices under load
Predictable, stable pricing - No reliance on expensive generative model inference
For K12 schools where both reliability and sustainability matter, Option B often aligns better with operational realities.
Use the right tool for the right job
In K12, AI is infrastructure. And like all infrastructure, it needs to be chosen with care. Generative AI has its place in creative writing labs, adaptive tutoring, and research tools. But for content filtering, supervision, and safety enforcement, it is often overkill. It's like using a jet engine to power a school bus: effective, but wasteful, expensive, and unsustainable.
As you vet AI solutions for your district: place emphasis not on buzzwords (“AI,” “LLM,” “deep learning”) but on fit, efficiency, transparency, and long-term sustainability. The future of K12 AI doesn’t require massive data centers sweating for every student click. It requires smart engineering, respect for resources, and tools built for the job.