[1] 2026. Latency–Throughput Tradeoffs of ONNX Runtime, TensorRT-LLM, vLLM, and Triton: An Empirical Comparison on 1B–3B Parameter LLM Inference. Journal of Global Engineering Review. 4, 1 (Feb. 2026), 173–182. DOI:https://doi.org/10.66372/JGER.v4i1.12.