Latency–Throughput Tradeoffs of ONNX Runtime, TensorRT-LLM, vLLM, and Triton: An Empirical Comparison on 1B–3B Parameter LLM Inference. Journal of Global Engineering Review, [S. l.], v. 4, n. 1, p. 173–182, 2026. DOI: 10.66372/JGER.v4i1.12. Available at: https://gereview.com/index.php/jger/article/view/46. Accessed: 31 May 2026.