“Latency–Throughput Tradeoffs of ONNX Runtime, TensorRT-LLM, VLLM, and Triton: An Empirical Comparison on 1B–3B Parameter LLM Inference.” Journal of Global Engineering Review 4, no. 1 (February 10, 2026): 173–182. Accessed May 31, 2026. https://gereview.com/index.php/jger/article/view/46.