“Latency–Throughput Tradeoffs of ONNX Runtime, TensorRT-LLM, vLLM, and Triton: An Empirical Comparison on 1B–3B Parameter LLM Inference” (2026) Journal of Global Engineering Review, 4(1), pp. 173–182. doi:10.66372/JGER.v4i1.12.