Latency–Throughput Tradeoffs of ONNX Runtime, TensorRT-LLM, vLLM, and Triton: An Empirical Comparison on 1B–3B Parameter LLM Inference. jger [Internet]. 2026 Feb. 10 [cited 2026 May 31];4(1):173-82. Available from: https://gereview.com/index.php/jger/article/view/46