Tony Lo · February 17, 2026 · 14 min read

Building a Production-Grade AI Inspection Pipeline: From PatchCore to Factory Floor

Edge AI · Computer Vision · Manufacturing · Anomaly Detection · NVIDIA Jetson · PatchCore

Most vision AI companies talk about "detecting defects." Very few explain how to build a system that actually runs in a factory—on real hardware, at real speed, with real operator workflows.

This article explains how IntelFactor engineers a production-grade AI inspection platform: from patch-based anomaly detection to deterministic edge control, continuous learning, and enterprise governance. We compare product-level inspection with process monitoring, detail the technical architecture, and provide a phased buildout plan with deployment guidance on NVIDIA Jetson Orin Nano hardware.

Product Inspection vs Process Monitoring

Not all factory vision is the same. There are two fundamentally different approaches:

  • Process-Level Monitoring watches whether the production process (conveyor cycles, machinery rhythm) is running normally using overhead cameras. It alerts on jams, spills, missing items, or line-level anomalies. Strength: broad situational awareness. Limitation: coarse insight, not tuned for individual part defects.
  • Product-Level Inspection examines each unit at the region-of-interest level (surface, dimensions, assembly) using controlled industrial cameras. It learns "good" product appearance and flags deviations. Strength: precise defect detection (scratches, dents, misalignments). Limitation: narrower scope (specific station), but that is intentional.

In practice, a factory might use both: process monitoring ensures the line runs, product inspection ensures what comes off the line meets quality. IntelFactor's value lies in fine-grained quality control and root-cause insights.

Technical Architecture

IntelFactor's pipeline runs entirely on edge hardware. Here's how the components connect:

Camera → Edge Station (Jetson Orin Nano) → ROI Cropping & Preprocessing → CNN Backbone (ResNet-18/ViT) → Patch Embeddings

From embeddings, two parallel paths:

  • Anomaly Model (PatchCore/PaDiM) — detects unknown defects by comparing against a "normal" embedding distribution
  • Supervised Detector (YOLO) — classifies known defect types once enough labeled examples exist

Both outputs feed into a Deterministic PASS/FAIL Gate — a rule layer that produces a hard verdict. That verdict drives:

  • Actuator Output (GPIO/Modbus) for reject mechanisms
  • Local Evidence Buffer for frames, metadata, and operator dispositions

The evidence buffer syncs asynchronously to the Cloud Dashboard for model registry, retraining jobs, drift monitoring, and the advisory RCA assistant.

Key Design Principles

  • CNN Backbone → Embeddings: A pretrained CNN (ResNet, EfficientNet, or ViT) extracts mid-level features for each image patch. These embeddings capture textures and shapes relevant to defects.
  • Anomaly Model: Stores the "normal" distribution of embeddings. PatchCore uses a memory bank with nearest-neighbor scoring. PaDiM fits a Gaussian per patch position.
  • Supervised Detector: Optional YOLO model to classify common defects once enough labels exist. Complements the anomaly detector (which catches unknown defects).
  • Deterministic Gate: Takes the anomaly score and supervised outputs to produce PASS or FAIL. Enforces minimum anomaly area, temporal persistence, and SOP thresholds.
  • Edge Station: All inference and gating happen on the Jetson. No cloud in the loop. The PASS/FAIL signal goes directly to the PLC or reject mechanism.
  • Local Buffer: Raw frames and metrics are buffered on-device (48h default). They sync to cloud when connectivity is available.
  • Cloud Services: Model registry, retraining jobs, drift analysis, and reporting. The cloud does not make real-time decisions—it provides dashboards and orchestrates updates.
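To make the deterministic gate concrete, here is a minimal sketch of the rule layer described above. The class names, field names, and default values are illustrative assumptions, not IntelFactor's actual implementation; the point is that the verdict comes from explicit, configurable rules (score threshold, minimum anomaly area, temporal persistence) rather than from the model alone.

```python
from dataclasses import dataclass
from collections import deque

@dataclass
class GateConfig:
    score_threshold: float = 2.5   # anomaly score above which a frame counts as suspect
    min_anomaly_area: int = 50     # minimum anomalous pixels to count as a real defect
    persistence: int = 2           # consecutive suspect frames required before FAIL

class DeterministicGate:
    """Turns model outputs into a hard PASS/FAIL verdict."""

    def __init__(self, cfg: GateConfig):
        self.cfg = cfg
        self.history = deque(maxlen=cfg.persistence)

    def decide(self, anomaly_score: float, anomaly_area: int) -> str:
        frame_is_bad = (anomaly_score >= self.cfg.score_threshold
                        and anomaly_area >= self.cfg.min_anomaly_area)
        self.history.append(frame_is_bad)
        # FAIL only when the defect persists across N consecutive frames,
        # which filters out single-frame noise (glare, motion blur)
        if len(self.history) == self.cfg.persistence and all(self.history):
            return "FAIL"
        return "PASS"
```

Because the gate is deterministic, the same inputs always produce the same verdict, which is what makes the downstream actuator behavior auditable.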

This architecture prioritizes edge-first intelligence and deterministic action.

Phase-by-Phase Buildout Plan (1–16 Weeks)

Phase 1 (Weeks 1–4): Anomaly Engine Deployment

  • Integrate camera stream and ROI cropping on the Jetson
  • Export CNN backbone (e.g. ResNet-18) to TensorRT for FP16 inference
  • Implement PatchCore (memory bank) and/or PaDiM (Gaussian) enrollment from good images
  • Compute anomaly heatmap and score each frame
  • Target: Inference latency <30ms on Jetson Orin Nano
  • Done when: On-device demo where known-good images yield PASS and injected defects yield FAIL, with heatmap visualization
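The enrollment and scoring steps above can be sketched in a few lines. This is a simplified stand-in for PatchCore, not the full algorithm: real PatchCore uses greedy coreset selection and a locally-aware reweighting of the nearest-neighbor distance, while this sketch uses random subsampling and a plain nearest-neighbor score.

```python
import numpy as np

class PatchMemoryBank:
    """Minimal PatchCore-style scorer: store embeddings of normal
    patches, score new patches by nearest-neighbor distance."""

    def __init__(self, coreset_size: int = 1000, seed: int = 0):
        self.coreset_size = coreset_size
        self.rng = np.random.default_rng(seed)
        self.bank = None

    def enroll(self, normal_embeddings: np.ndarray) -> None:
        # Random subsampling stands in for greedy coreset selection here
        n = min(self.coreset_size, len(normal_embeddings))
        idx = self.rng.choice(len(normal_embeddings), size=n, replace=False)
        self.bank = normal_embeddings[idx]

    def score(self, patch_embeddings: np.ndarray) -> np.ndarray:
        # Distance from each patch to its closest "normal" neighbor;
        # large distance = the patch looks unlike anything seen at enrollment
        d = np.linalg.norm(
            patch_embeddings[:, None, :] - self.bank[None, :, :], axis=-1)
        return d.min(axis=1)
```

Reshaping the per-patch scores back onto the image grid gives the anomaly heatmap; the frame score is typically the max over patches.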

Phase 2 (Weeks 4–6): Stability & Drift Monitoring

  • Log statistics: anomaly-score quantiles, feature centroid shifts, FP overrides, lighting levels
  • Compute a single "Stability Index" that triggers when the normal distribution shifts
  • Implement guided recalibration: prompt user to recapture normal data and rebuild baseline
  • Done when: Dashboard shows stability index; alert appears after significant scene change; recalibration workflow updates the model

Phase 3 (Weeks 6–10): Hybrid Learning Loop

  • Queue top anomalies for operator review (borderline and novel scores)
  • Operators choose: "Confirm Defect", "False Positive", or "Assign Defect Class"
  • Automatically label confirmed anomalies and add to defect classes
  • Schedule YOLO retraining when 20+ examples accumulate per class
  • Done when: New defect class can be labeled and trained in <1 day; updated YOLO model deployed to station after retrain
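The retraining trigger from the list above is deliberately simple: count confirmed labels per class and schedule a job once a class crosses the threshold. The function name and the 20-example threshold below mirror the plan but are otherwise illustrative.

```python
from collections import Counter

MIN_EXAMPLES_PER_CLASS = 20  # threshold from the buildout plan

def classes_ready_for_training(labels):
    """Return the defect classes that have accumulated enough
    operator-confirmed examples to schedule a YOLO fine-tuning job."""
    counts = Counter(labels)
    return sorted(c for c, n in counts.items() if n >= MIN_EXAMPLES_PER_CLASS)
```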

Phase 4 (Weeks 8–12): Deterministic Control & Metrics

  • Implement PASS/FAIL logic with configurable delay and pulse width
  • Expose GPIO or Modbus outputs to the PLC or reject system
  • Log action latencies (edge inference vs PLC response) and frame drops
  • Integrate TensorBoard/Grafana for latency KPIs
  • Target: End-to-end decision latency (camera→GPIO) <50ms; cold-start to decision <100ms
  • Done when: On FAIL, the hardware actuator triggers reliably at the configured timing; logs show latency metrics
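The "configurable delay and pulse width" logic is worth spelling out, since getting the timing wrong means rejecting the wrong part. A minimal sketch, with the actual GPIO or Modbus write abstracted behind a hypothetical `set_output` callback so the timing logic stays hardware-agnostic and testable:

```python
import time

def fire_reject_pulse(set_output, delay_ms: float, pulse_ms: float) -> None:
    """Drive a reject actuator on FAIL: wait for the part to travel from
    the camera to the ejector, hold the output high for the pulse width
    the valve/solenoid needs, then release.

    `set_output` is a hypothetical callback wrapping the GPIO/Modbus write.
    """
    time.sleep(delay_ms / 1000.0)   # camera-to-ejector travel time
    set_output(True)
    time.sleep(pulse_ms / 1000.0)   # pulse width for the actuator
    set_output(False)
```

In production this would run on a dedicated thread or timer so the inference loop never blocks on actuator timing.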

Phase 5 (Weeks 12–14): Temporal Layer (Optional)

  • Maintain a sliding window of embedding statistics or scores
  • Detect periodicity breaks (using auto-correlation) or missing items
  • Done when: An alert is generated whenever the production rhythm breaks (e.g. the belt holds a bottle for >N seconds)
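Periodicity-break detection via autocorrelation can be sketched as follows. This is one plausible formulation, assuming the per-frame signal (e.g. a rolling anomaly score or motion metric) is sampled at a fixed frame rate; the tolerance parameter and the flat-signal rule are illustrative choices.

```python
import numpy as np

def detect_rhythm_break(signal, expected_period, tolerance=0.3):
    """Return True if a sliding window of per-frame values no longer
    shows the expected production periodicity (in frames)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Autocorrelation at lags 0..n-1
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    if ac[0] == 0:
        return True  # perfectly flat signal: no rhythm at all
    ac = ac / ac[0]
    # Dominant lag = strongest autocorrelation peak past lag 0
    lag = 1 + int(np.argmax(ac[1:]))
    return abs(lag - expected_period) > tolerance * expected_period
```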

Phase 6 (Weeks 10–14): Governance & Hardening

  • Implement a Model Registry: track dataset hash, model binary version, threshold config, deployed stations
  • Offline resilience: local buffer (48h), sync queue, and UI indicator
  • Explainability: show heatmaps, nearest-neighbor examples, top-3 normal patches
  • Done when: System passes internal security review; offline buffer performs under network cuts
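A registry record ties each deployed model to the exact data and configuration that produced it. A minimal sketch of what such a record might contain, with the field names as assumptions (the actual registry schema is not specified in this article):

```python
import hashlib
import time

def register_model(dataset_files, model_version, threshold_config, stations):
    """Build a model-registry record.

    dataset_files: {filename: bytes} mapping of the training set, hashed
    in sorted order so the same data always yields the same hash.
    """
    h = hashlib.sha256()
    for path, blob in sorted(dataset_files.items()):
        h.update(path.encode())
        h.update(blob)
    return {
        "dataset_hash": h.hexdigest(),
        "model_version": model_version,
        "threshold_config": threshold_config,
        "stations": stations,
        "registered_at": time.time(),
    }
```

The dataset hash is what makes "which data produced this model?" answerable during an audit, even months later.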

Phase 7 (Weeks 14–16+): UX & Performance Polishing

  • Refine operator interface (single-key review, touchscreen ready)
  • Performance tune: quantify Jetson utilization, optimize TensorRT FP16 engines
  • Write automated deployment scripts and integration tests
  • Target: Operator review <5 seconds; CPU/GPU utilization <80% at target FPS
  • Done when: Operators can onboard a new station in <2 hours following the setup guide

Implementation Guidance: Jetson Orin Nano

Hardware Overview

The 8GB Jetson Orin Nano Super features a 1024-core Ampere GPU and delivers up to 67 TOPS of AI performance. In practice, it runs two medium CNNs (e.g. YOLO + ResNet-18) in parallel at real-time speed. YOLOv11n inference in TensorRT FP16 runs at ~4–10ms on Orin Nano. Combined with the patch model (~5ms), total pipeline latency stays <30ms.

Backbone Choices: ResNet-18 (~11.7M params) and EfficientNet-B0 (~5.3M) are great starting points: a few milliseconds of latency and proven transfer learning. For stronger pretrained features, self-supervised ViTs such as DINOv2 achieve state-of-the-art embeddings, but converting a ViT to TensorRT on Jetson typically requires extra tooling (e.g. NVIDIA's TAO workflow). We recommend ResNet-18 for initial deployment.

TensorRT Export

  • ONNX Export: Use PyTorch's torch.onnx.export to dump the backbone (including intermediate patch outputs) to ONNX
  • TensorRT Build: Convert ONNX to a TRT engine on the Jetson (trtexec with --fp16). Ensure workspace size is large enough
  • Verification: Benchmark latency with trtexec. Expect ~5–10ms for the CNN backbone alone
  • Optimization Tips: Use INT8 calibration only if needed (many patch models degrade). FP16 usually suffices on Orin. Profile to avoid unnecessary ops

PatchCore vs PaDiM vs EfficientAD

  • PatchCore: Highest accuracy (~99.6% image-level AUROC on MVTec AD), handles most anomalies with no supervised labels. Requires storing patch descriptors (keep memory in check with coreset sampling). Implementation: use Anomalib as a reference pipeline
  • PaDiM: Gaussian per-patch is faster and low-memory. Good if Jetson RAM is limited. ~99% performance on benchmarks
  • EfficientAD (Teacher-Student): Fastest inference (millisecond-level), but adds training complexity. Consider in later phases if latency is critical
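To make the PaDiM option concrete: for each patch position, fit a Gaussian over the normal embeddings and score new embeddings by Mahalanobis distance. This sketch covers a single patch position; the regularization constant follows the spirit of the PaDiM paper but the exact value is an assumption.

```python
import numpy as np

class PaDiMPatch:
    """Minimal PaDiM-style scorer for one patch position: fit a Gaussian
    over normal embeddings, score new embeddings by Mahalanobis distance."""

    def fit(self, embeddings: np.ndarray) -> None:
        self.mean = embeddings.mean(axis=0)
        cov = np.cov(embeddings, rowvar=False)
        # Small ridge term keeps the covariance invertible
        cov += 0.01 * np.eye(cov.shape[0])
        self.cov_inv = np.linalg.inv(cov)

    def score(self, e: np.ndarray) -> float:
        d = e - self.mean
        return float(np.sqrt(d @ self.cov_inv @ d))
```

Compared with PatchCore's memory bank, this stores only a mean and covariance per patch position, which is why it is the low-memory option on a RAM-constrained Jetson.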

Model Governance & Drift Handling

Drift Detection Metrics

  • Feature Shift: Distance between incoming embedding centroids and baseline centroids
  • Score Drift: Rolling percentile of anomaly scores (p95 drift signals issues)
  • False-Positive Rate: Count FP overrides in reviews (rising FPR suggests drift)
  • Lighting/Color: Mean luminance and color temperature changes

Stability Score & Alerts

Combine metrics into a Stability Score. If it exceeds a threshold, flag the station as needing attention.
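One simple way to combine the four drift metrics above is a weighted sum over normalized inputs. The weights and the 0.5 alert threshold are illustrative assumptions to be tuned per station; the only structural requirement is that each metric is pre-scaled to [0, 1] before combining.

```python
def stability_score(centroid_shift, p95_drift, fp_rate, luminance_shift,
                    weights=(0.35, 0.35, 0.2, 0.1)):
    """Combine normalized drift metrics (each pre-scaled to [0, 1]) into
    one score; higher means less stable."""
    metrics = (centroid_shift, p95_drift, fp_rate, luminance_shift)
    clipped = [min(max(m, 0.0), 1.0) for m in metrics]  # guard against bad inputs
    return sum(w * m for w, m in zip(weights, clipped))

def needs_attention(score, threshold=0.5):
    """Flag the station when the stability score crosses the threshold."""
    return score >= threshold
```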

Guided Recalibration

When drift is detected:

  • Show a prompt or email alert to the technician
  • Guide them to capture ~10 minutes of current normal production via the UI
  • Automatically rebuild the patch model using both old + new normals (or incremental update)
  • Show comparison: histograms of anomaly scores before and after
  • Log the event in the registry (who, when, how much data)

The Hybrid Anomaly→Supervised Loop

IntelFactor uses human-in-the-loop to expand the model over time:

  • Review Queue: Show top-K anomalous crops for operator labeling. Only borderline cases are sent (uncertainty sampling)
  • Feedback Actions: For each flagged image, the operator can Confirm Defect (new positive label), False Alarm (add to normals), or Assign Defect Class (if known)
  • Data Accumulation: Collect images/patches of each defect class with labels until enough examples exist
  • Retraining: Schedule a cloud-based training job to fine-tune or train a YOLO model with new data
  • Deployment: Test and deploy new model artifacts via the registry
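The uncertainty-sampling step of the review queue can be sketched as below: only frames whose scores fall in a band around the PASS/FAIL threshold are queued, sorted so the most ambiguous (and hence most informative) cases come first. The band width and top-K default are illustrative.

```python
def review_queue(frames, threshold, band=0.2, top_k=5):
    """Select borderline frames for operator review.

    frames: list of (frame_id, anomaly_score) tuples.
    Only scores within +/- band of the threshold are candidates;
    clear PASSes and clear FAILs need no human label.
    """
    lo, hi = threshold * (1 - band), threshold * (1 + band)
    borderline = [(fid, s) for fid, s in frames if lo <= s <= hi]
    # Closest to the threshold first: the most ambiguous labels
    # teach the model the most
    borderline.sort(key=lambda t: abs(t[1] - threshold))
    return borderline[:top_k]
```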

Operator UX and Setup Checklist

Review UX

A minimal interface with large PASS/FAIL indicators. Review mode shows 3–5 "suspect" thumbnails with hotkeys (P=pass, F=fail) and quick tagging. Target: <5 seconds per review decision.

Station Setup Checklist

  • Camera Setup: Mount camera, verify focus and lighting
  • ROI Selection: Draw region(s) covering product. Mask irrelevant areas (background, conveyor)
  • Baseline Capture: Collect 50–200 normal images under current conditions
  • Initial Enrollment: Build baseline model and set a conservative anomaly threshold
  • Signal Test: Run a known defect through; ensure FAIL is triggered
  • Drift Safeguards: Enable stability monitoring. Confirm offline buffer works
  • Operator Training: Show staff how to interpret PASS/FAIL and use the review page
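For the "conservative anomaly threshold" step in the checklist, one common heuristic is to take a high percentile of the enrollment scores and add a safety margin, so that nearly all known-good product passes on day one. The percentile and margin values below are illustrative defaults, not prescribed by the article.

```python
import numpy as np

def initial_threshold(normal_scores, percentile=99.0, margin=1.2):
    """Set a conservative anomaly threshold from enrollment scores:
    a high percentile of the normal scores times a safety margin."""
    return float(np.percentile(normal_scores, percentile)) * margin
```

Starting conservative biases the station toward false negatives rather than nuisance rejects; the threshold is then tightened as operator feedback accumulates.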

Enterprise Hardening

  • Audit Trail: All decisions (scores, user overrides, deployments) are logged with timestamps and user IDs
  • Version Control: The model registry logs which data and code made each model
  • Offline First: The system never "turns off" if cloud is unreachable
  • Security & Privacy: Camera feed stays local. Data at rest is encrypted

Conclusion

IntelFactor is building the Datadog for Vision QC. By focusing on product-level anomalies, enforcing deterministic edge decisions, and enabling continuous human-in-the-loop learning, it addresses key pain points that process-only monitoring does not.

The ultimate goal: A deterministic QC station that learns from your line, not another disconnected dashboard.


Book a Demo to see how edge-first inspection works for your production line.