A YOLOv-Based Graduation Project Web App: Building an Object Detection Service from Scratch

1. Pain points: why a model that "just runs" falls over the moment it goes on the Web

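A week before my thesis defense I watched the guy in the next dorm room spin his laptop fan up like a helicopter. The cause was simple: YOLOv5 flew in PyCharm, but the moment it moved behind a browser it threw one error after another: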

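- GPU memory leak: each request adds 200 MB of GPU memory, and after 10 requests CUDA runs out of memory (OOM).
- Hardcoded image size: the front end sends a landscape image, the back end's letterbox swaps height and width, and every box drifts off into the sky.
- Synchronous blocking: with a single synchronous Flask worker, three concurrent requests turn inference into a slideshow.
- Hardcoded paths: /home/lxy/yolov5s.pt does not exist on the server, so the live demo dies on stage.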

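At root, all of these come from treating a research script as a production service. The rest of this post upgrades the script, at minimal cost, into a Web service that can be demoed and deployed, leaving the thesis committee nothing to pick at.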

2. Technology choices: decided in three minutes, no agonizing

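| Dimension | Flask | FastAPI |
|---|---|---|
| Learning curve | Low, plenty of docs | Low, though the async concepts take about five minutes |
| Throughput (single worker, author's figures) | 30 QPS | 180 QPS (async + Starlette) |
| Auto-generated docs | None | `/docs`, generated automatically |
| Code volume | Routes plus hand-written Swagger | Type annotations define the interface |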

Verdict: pick FastAPI for the defense demo and skip writing the API documentation in Word.

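| Inference backend | Latency (RTX 3060 / YOLOv5s / 640) | Migration cost |
|---|---|---|
| PyTorch native | 22 ms | 0 |
| ONNX Runtime-GPU | 15 ms | One call to `torch.onnx.export` |
| TensorRT fp16 | 8 ms | Requires `tensorrt` + `pycuda`; the image grows by 1 GB |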

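Verdict: ONNX Runtime is the sweet spot for a graduation project. It cuts latency by about 30% (22 ms → 15 ms) for a modest dependency footprint (about 50 MB by the author's count); leave TensorRT to those aiming for an "outstanding thesis" award.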
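To make the "about 30%" figure concrete, the arithmetic behind it can be sketched as follows. The latencies are the ones in the table above; the helper name `compare` and the derived metrics are illustrative choices, and the serial FPS value assumes one image is processed at a time with no other overhead.

```python
# Latencies in milliseconds, copied from the backend comparison table.
LATENCY_MS = {
    "PyTorch native": 22.0,
    "ONNX Runtime-GPU": 15.0,
    "TensorRT fp16": 8.0,
}


def compare(baseline: str = "PyTorch native") -> dict:
    """Derive latency reduction, speedup, and serial FPS relative to a baseline."""
    base = LATENCY_MS[baseline]
    report = {}
    for name, ms in LATENCY_MS.items():
        report[name] = {
            "latency_cut_pct": round((base - ms) / base * 100, 1),
            "speedup_x": round(base / ms, 2),
            "max_fps_serial": round(1000 / ms, 1),
        }
    return report


if __name__ == "__main__":
    for name, row in compare().items():
        print(name, row)
```

Under these numbers ONNX Runtime is roughly 1.47× faster than native PyTorch, while TensorRT fp16 reaches 2.75×, which is the trade-off against its heavier install.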

3. Core implementation: a four-step pipeline broken down to function level

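Model loading: singleton plus lazy loading
Wrap YOLOv5 in a Predictor class that runs inference in __call__ and builds the session exactly once in its constructor: self.model = ort.InferenceSession(onnx_path, providers=['CUDAExecutionProvider']).
Use functools.lru_cache (or an equivalent singleton) so the whole process holds a single Session and never pays the cold-start cost twice.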
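A minimal sketch of the lru_cache idea, assuming a stand-in `FakeSession` class in place of `ort.InferenceSession` so that it runs without a model file or GPU. `FakeSession` and `get_session` are hypothetical names introduced only for this illustration.

```python
from functools import lru_cache


class FakeSession:
    """Stand-in for ort.InferenceSession; counts how often it is constructed."""

    loads = 0

    def __init__(self, path: str):
        FakeSession.loads += 1
        self.path = path


@lru_cache(maxsize=None)
def get_session(path: str) -> FakeSession:
    # The first call for a given path builds the session (the slow part);
    # later calls with the same path return the cached object.
    return FakeSession(path)


first = get_session("weights/yolov5s.onnx")
second = get_session("weights/yolov5s.onnx")
print(first is second, FakeSession.loads)
```

In a real service the body of `get_session` would construct the ONNX Runtime session instead. Calling it once at startup also sidesteps the case where two concurrent first requests both trigger a load.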
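Image preprocessing: pixel-level parity with training
Bundle letterbox, BGR→RGB, HWC→CHW via img.transpose(2,0,1), and np.ascontiguousarray into one function that returns an np.float32 array scaled by 1/255.0.
Move input_size into the config rather than hardcoding it.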
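Inference post-processing: NMS plus coordinate restoration
The ONNX output has shape (1,25200,85) for a 640×640 input and 80 classes; post-processing takes three steps:
- Filter by confidence ≥ conf_thres (0.25 in config.py; 0.001 is the mAP-evaluation setting and would flood a demo with boxes)
- Run cv2.dnn.NMSBoxes with nms_thresh=0.45
- Convert x_center,y_center,w,h → x1,y1,x2,y2, then map back to original-image coordinates using the letterbox scale factor and padding offsets.
  Return a List[Dict] with fields "cls", "conf", and "bbox":[x1,y1,x2,y2] so the front end can draw boxes directly.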
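The coordinate restoration in the last step is where the "boxes drift off into the sky" bug usually hides, so a worked example may help. The sketch below uses plain Python and assumes symmetric letterbox padding; `letterbox_params` and `restore_box` are hypothetical helper names, not part of any library.

```python
def letterbox_params(w0: int, h0: int, size: int = 640):
    """Scale factor and left/top padding used when fitting (w0, h0) into size x size."""
    r = min(size / h0, size / w0)
    new_w, new_h = int(round(w0 * r)), int(round(h0 * r))
    pad_left = (size - new_w) // 2
    pad_top = (size - new_h) // 2
    return r, pad_left, pad_top


def restore_box(cx, cy, w, h, r, pad_left, pad_top):
    """Center-format box in letterboxed pixels -> [x1, y1, x2, y2] in original pixels."""
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    # Undo the padding first, then undo the resize.
    return [
        round((x1 - pad_left) / r, 2),
        round((y1 - pad_top) / r, 2),
        round((x2 - pad_left) / r, 2),
        round((y2 - pad_top) / r, 2),
    ]


# A 1280x720 landscape image: r = 0.5, resized to 640x360, padded 140 px top and bottom.
r, pad_left, pad_top = letterbox_params(1280, 720)
box = restore_box(320, 320, 100, 50, r, pad_left, pad_top)
print(r, pad_left, pad_top, box)  # 0.5 0 140 [540.0, 310.0, 740.0, 410.0]
```

The order matters: subtracting the padding after dividing by the scale factor would shift every box by the wrong amount.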
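JSON serialization: truncating float precision
Use json.dumps(result, ensure_ascii=False, separators=(',', ':'), default=str); convert NumPy float32 values to plain float first, because default=str would otherwise emit them as strings.
Keep coordinates to two decimal places, which saves bandwidth and keeps boxes from jittering.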
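One way to apply the two-decimal rule uniformly is to round every float in the result before serializing. This is a sketch using only the standard library; `round_floats` is a hypothetical helper and the sample detection is made up for illustration.

```python
import json


def round_floats(obj, ndigits: int = 2):
    """Recursively round every float inside nested dicts, lists, and tuples."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {k: round_floats(v, ndigits) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(v, ndigits) for v in obj]
    return obj


# Made-up detection, only to show the effect on the payload.
detections = [{"cls": 0, "conf": 0.91234, "bbox": [540.12345, 310.0, 740.98765, 410.5]}]
payload = json.dumps(round_floats(detections), ensure_ascii=False, separators=(",", ":"))
print(payload)  # [{"cls":0,"conf":0.91,"bbox":[540.12,310.0,740.99,410.5]}]
```

When the endpoint returns a dict, FastAPI serializes it on its own, so rounding before the return is enough and an explicit json.dumps call is only needed elsewhere (logs, caches).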

4. Hands-on code: a minimal runnable repository

Directory structure:

```
yolo-web/
├─ main.py            # FastAPI entry point
├─ predictor.py       # ONNX inference wrapper
├─ config.py          # global hyperparameters
├─ static/            # static page
└─ requirements.txt
```

4.1 Backend (FastAPI)

```python
# config.py
MODEL_ONNX = "weights/yolov5s.onnx"
INPUT_SIZE = 640
CONF_THRES = 0.25
NMS_THRES = 0.45
MAX_BATCH = 1  # a single image is enough for a graduation project
```

```python
# predictor.py
import cv2
import numpy as np
import onnxruntime as ort

from config import CONF_THRES, INPUT_SIZE, MODEL_ONNX, NMS_THRES


class Predictor:
    """Process-wide singleton wrapping one ONNX Runtime session."""

    _instance = None

    def __new__(cls, *args, **kw):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._init()
        return cls._instance

    def _init(self):
        # Prefer CUDA; fall back to CPU so the demo still runs without a GPU.
        self.session = ort.InferenceSession(
            MODEL_ONNX,
            providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
        )
        self.input_name = self.session.get_inputs()[0].name

    def preprocess(self, img0):
        # Letterbox: resize with the aspect ratio preserved, then pad to a square.
        h0, w0 = img0.shape[:2]
        r = min(INPUT_SIZE / h0, INPUT_SIZE / w0)
        h, w = int(round(h0 * r)), int(round(w0 * r))
        img = cv2.resize(img0, (w, h))
        delta_h, delta_w = INPUT_SIZE - h, INPUT_SIZE - w
        top, bottom = delta_h // 2, delta_h - delta_h // 2
        left, right = delta_w // 2, delta_w - delta_w // 2
        img = cv2.copyMakeBorder(img, top, bottom, left, right,
                                 cv2.BORDER_CONSTANT, value=(114, 114, 114))
        # BGR -> RGB, HWC -> CHW, contiguous float32 scaled to [0, 1]
        img = img[:, :, ::-1].transpose(2, 0, 1)
        img = np.ascontiguousarray(img, dtype=np.float32) / 255.0
        return np.expand_dims(img, 0), (r, left, top)

    def postprocess(self, outs, meta):
        r, dw, dh = meta
        outs = np.squeeze(outs, axis=0)            # (25200, 85)
        outs = outs[outs[:, 4] > CONF_THRES]       # objectness filter
        if len(outs) == 0:
            return []
        cls_ids = outs[:, 5:].argmax(axis=1)
        # Final confidence = objectness * best class probability
        scores = outs[:, 4] * outs[np.arange(len(outs)), 5 + cls_ids]
        # (cx, cy, w, h) in letterboxed pixels -> top-left (x, y, w, h)
        # in original-image pixels, the layout cv2.dnn.NMSBoxes expects.
        xywh = outs[:, :4].copy()
        xywh[:, 0] = (xywh[:, 0] - xywh[:, 2] / 2 - dw) / r
        xywh[:, 1] = (xywh[:, 1] - xywh[:, 3] / 2 - dh) / r
        xywh[:, 2:] /= r
        keep = cv2.dnn.NMSBoxes(xywh.tolist(), scores.tolist(),
                                CONF_THRES, NMS_THRES)
        final = []
        # Older OpenCV builds return an (N, 1) array, newer ones a flat one.
        for i in np.asarray(keep, dtype=int).reshape(-1):
            x, y, w, h = map(float, xywh[i])
            final.append({"cls": int(cls_ids[i]),
                          "conf": round(float(scores[i]), 2),
                          "bbox": [round(x, 2), round(y, 2),
                                   round(x + w, 2), round(y + h, 2)]})
        return final

    def __call__(self, img0):
        blob, meta = self.preprocess(img0)
        out = self.session.run(None, {self.input_name: blob})[0]
        return self.postprocess(out, meta)
```

```python
# main.py
from pathlib import Path

import cv2
import numpy as np
import uvicorn
from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.responses import HTMLResponse

from predictor import Predictor

app = FastAPI(title="YOLOv5-ONNX-Service")


@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    # content_type may be None when the client omits it
    if not (file.content_type or "").startswith("image"):
        raise HTTPException(400, "image required")
    buf = await file.read()
    nparr = np.frombuffer(buf, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    if img is None:
        raise HTTPException(400, "imdecode failed")
    # Inference blocks the event loop here; acceptable for a single-user demo.
    result = Predictor()(img)
    return {"filename": file.filename, "detections": result}


@app.get("/")
async def index():
    return HTMLResponse(Path("static/index.html").read_text(encoding="utf-8"))


if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=False)
```

4.2 Frontend (about 30 lines of HTML)

```html
<!-- static/index.html -->
<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <title>YOLOv5 Upload</title>
</head>
<body>
  <h2>Pick an image → run inference → draw boxes</h2>
  <input type="file" id="file" accept="image/*">
  <button onclick="go()">Detect</button>
  <canvas id="can"></canvas>
  <script>
    async function go() {
      const f = document.getElementById('file').files[0];
      if (!f) return;  // nothing selected yet
      const fd = new FormData();
      fd.append('file', f);
      const r = await fetch('/predict', {method: 'POST', body: fd}).then(x => x.json());
      const img = new Image();
      img.src = URL.createObjectURL(f);
      await img.decode();
      const c = document.getElementById('can');
      c.width = img.width;
      c.height = img.height;
      const ctx = c.getContext('2d');
      ctx.drawImage(img, 0, 0);
      ctx.strokeStyle = 'lime';
      ctx.lineWidth = 2;
      r.detections.forEach(x => {
        const [x1, y1, x2, y2] = x.bbox;
        ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);
        ctx.fillText(`${x.cls} ${x.conf}`, x1, y1 - 5);
      });
    }
  </script>
</body>
</html>
```

One-command start:

```bash
# requirements.txt: opencv-python, numpy, fastapi, uvicorn, python-multipart, onnxruntime-gpu
# (FastAPI needs python-multipart to parse file uploads)
pip install -r requirements.txt
python main.py
```

Open http://localhost:8000 in a browser, pick a campus street photo, and the boxes appear within a second. Demo passed.

5. Performance and security: keeping the service up while the committee watches

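Rate limiting
Decorate /predict with slowapi's @limiter.limit("5/minute") (the endpoint must then also accept a request: Request parameter) so curious classmates cannot hammer it.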
Input validation
Beyond the file's MIME type, also require len(buf) < 4 MB so nobody can blow up memory by uploading a DSLR RAW file.
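The size check can be paired with a look at the file's leading bytes, since the MIME type is supplied by the client and is easy to fake. The sketch below is one possible shape for such a check; the magic-byte test and the set of accepted formats are additions for illustration, not something the original service does, and `validate_upload` is a hypothetical helper.

```python
MAX_UPLOAD_BYTES = 4 * 1024 * 1024  # the 4 MB cap from the text

# Leading "magic bytes" of the formats this sketch accepts (illustrative choice).
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def validate_upload(buf: bytes) -> str:
    """Return the detected format, or raise ValueError describing the problem."""
    if len(buf) >= MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    for fmt, sig in SIGNATURES.items():
        if buf.startswith(sig):
            return fmt
    raise ValueError("unsupported or corrupt image")
```

Inside the endpoint, a ValueError from this helper would typically be translated into an HTTP 400 response before the bytes ever reach cv2.imdecode.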
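GPU memory management
ONNX Runtime's CUDA memory arena grows greedily by default; cap it through provider options, e.g. providers=[("CUDAExecutionProvider", {"device_id": 0, "gpu_mem_limit": 2 * 1024**3})] (the limit is given in bytes, here 2 GB).
If PyTorch runs in the same process, calling torch.cuda.empty_cache() after inference can ease fragmentation.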
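Cold-start optimization
Warm up at startup: run one all-zero dummy image through the model in the constructor so that CUDA initialization is finished before the first user request arrives.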
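Exception fallback
Wrap cv2.imdecode and model inference in try/except, return HTTP 400 for bad input and 500 for inference failures, and log both, so that an unexpected input never surfaces as an unhandled traceback.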
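The 400-versus-500 split can be sketched independently of the web framework. Everything named here (`BadImage`, `safe_predict`, and the three demo callables) is hypothetical and stands in for cv2.imdecode and the predictor; in the FastAPI handler the returned status would become an HTTPException or a JSON response.

```python
import logging

logger = logging.getLogger("yolo-web")


class BadImage(ValueError):
    """Raised when uploaded bytes cannot be decoded into an image."""


def safe_predict(decode, infer, payload: bytes):
    """Return (status_code, body) and never let an exception escape."""
    try:
        img = decode(payload)
    except BadImage as exc:
        # Client error: the upload itself is unusable.
        logger.warning("bad input: %s", exc)
        return 400, {"error": str(exc)}
    try:
        return 200, {"detections": infer(img)}
    except Exception:
        # Server error: log the full traceback, hide details from the client.
        logger.exception("inference failed")
        return 500, {"error": "internal error"}


# Demo callables standing in for the real decoder and model.
def decode_ok(payload):
    return "image"


def decode_bad(payload):
    raise BadImage("imdecode failed")


def infer_boom(img):
    raise RuntimeError("CUDA error")
```

Keeping the two try blocks separate is the point of the sketch: a decoding failure is the client's problem, while anything raised during inference is logged with its traceback and reported as a generic server error.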

6. Production pitfalls checklist

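- Hardcoded model path → use os.getenv("MODEL_PATH", "weights/yolov5s.onnx") and mount the weights as a volume when starting Docker.
- Thread safety of the singleton → concurrent calls to an ONNX Runtime session's run are thread-safe; the author additionally recommends guarding cv2.dnn.NMSBoxes with an external lock, or switching to torchvision.ops.nms.
- Concurrent writes to a temp file → never call cv2.imwrite("/tmp/tmp.jpg"); do everything in memory with io.BytesIO.
- Logs not persisted → keep three days of logs with logging.handlers.TimedRotatingFileHandler (RotatingFileHandler rotates by size, not by age), so there is evidence on hand when a committee member suddenly asks to see an error.
- Forgetting to turn off reload → in production uvicorn must run with reload=False; otherwise editing a single comment restarts the server and drops half the in-flight requests.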
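Two of the items above, the environment-driven model path and the three-day log retention, need only the standard library. The following is a minimal sketch; the logger name, log directory, file name, and format string are assumptions chosen for illustration.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler

# Falls back to the repository default when MODEL_PATH is not set.
MODEL_PATH = os.getenv("MODEL_PATH", "weights/yolov5s.onnx")


def build_logger(log_dir: str = "logs") -> logging.Logger:
    """Create a logger that rotates at midnight and keeps three old files."""
    logger = logging.getLogger("yolo-web")
    logger.setLevel(logging.INFO)
    if logger.handlers:  # already configured, e.g. after a re-import
        return logger
    os.makedirs(log_dir, exist_ok=True)
    handler = TimedRotatingFileHandler(
        os.path.join(log_dir, "service.log"),
        when="midnight",
        backupCount=3,  # three rotated files, i.e. three days of history
        encoding="utf-8",
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With Docker, the same code picks up a mounted weight file through something like `-e MODEL_PATH=/models/yolov5s.onnx`, where the container path is whatever the volume is mounted at.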

7. Ideas for further extension

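- Multi-model serving: add a query parameter such as ?model=yolov8n to the /predict route and keep a Dict[str, Predictor] on the back end, routing by key. The graduation project instantly looks like an "industrial platform".
- User authentication: plug in the OAuth2 Password Flow and issue each committee member a token, so it is perfectly clear who uploaded the cat pictures during the defense.
- Result caching: key on the MD5 of the uploaded image and store the result with redis.setex(key, 60, json_result). Repeat uploads return instantly, which the author estimates can triple QPS again for duplicate-heavy traffic.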
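The caching idea can be tried without a Redis server. The sketch below replaces Redis with an in-memory dictionary that mimics SETEX-style expiry; `TTLCache` and `cached_detect` are hypothetical names, and the single-process store is an assumption that would not hold across multiple workers.

```python
import hashlib
import time


class TTLCache:
    """In-memory stand-in for Redis SETEX/GET, valid within one process only."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        expires_at, value = hit
        if self.clock() >= expires_at:
            del self._store[key]  # expired, behave like a miss
            return None
        return value

    def setex(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def cached_detect(buf: bytes, detect, cache: TTLCache):
    """Run detect(buf) only when the same bytes were not seen within the TTL."""
    key = hashlib.md5(buf).hexdigest()
    result = cache.get(key)
    if result is None:
        result = detect(buf)
        cache.setex(key, result)
    return result
```

MD5 is used here only as a fast fingerprint of the upload, not for security. With real Redis the stored value would have to be serialized (for example as JSON) rather than kept as a Python object.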

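Pull the code, change the weight path in a couple of lines, push it to a cloud server, and you have an object detection site you can demo live from a phone browser. No more hauling a laptop to the defense: the committee scans a QR code and tries it themselves. Get it running first, tune later, and spend the remaining time writing the thesis in peace. Good luck graduating!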