LLM Inference Bench Dashboard

Instructions

Configuration: Use the configuration dropdown to load a different configuration to plot.
Selectors: These controls are required for a graph to be displayed. Select an attribute for each of the X-axis, the Y-axis, and the color.
Filters: The controls in this section help fine-tune the plots, for example by restricting them to specific models, hardware, or frameworks.
Options: This section offers a wide range of features for polishing the plot, such as setting a meaningful title, the axis labels, or the position of the legend.

GitHub

Matrix of Evaluated Frameworks and Hardware:

Framework / Hardware	NVIDIA A100	NVIDIA H100	NVIDIA GH200	AMD MI250	AMD MI300X	Intel Max1550	Habana Gaudi2	Sambanova SN40L
vLLM	Yes	Yes	Yes	Yes	Yes	Yes	No	N/A
llama.cpp	Yes	Yes	Yes	Yes	Yes	Yes	N/A	N/A
TensorRT-LLM	Yes	Yes	Yes	N/A	N/A	N/A	N/A	N/A
DeepSpeed-MII	Yes	No	No	No	No	No	Yes	N/A
Sambaflow	N/A	N/A	N/A	N/A	N/A	N/A	N/A	Yes
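
The matrix above can also be queried programmatically. The following is a minimal Python sketch, not part of the dashboard itself: the names `HARDWARE`, `SUPPORT`, `frameworks_for`, and `hardware_for` are hypothetical, and the cell values are transcribed verbatim from the table. The table does not define the difference between "No" and "N/A", so the sketch assumes only "Yes" marks an evaluated combination.

```python
# Column order matches the table header above.
HARDWARE = [
    "NVIDIA A100", "NVIDIA H100", "NVIDIA GH200", "AMD MI250",
    "AMD MI300X", "Intel Max1550", "Habana Gaudi2", "Sambanova SN40L",
]

# One row per framework, transcribed from the table (hypothetical structure).
SUPPORT = {
    "vLLM":          ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "N/A"],
    "llama.cpp":     ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "N/A", "N/A"],
    "TensorRT-LLM":  ["Yes", "Yes", "Yes", "N/A", "N/A", "N/A", "N/A", "N/A"],
    "DeepSpeed-MII": ["Yes", "No", "No", "No", "No", "No", "Yes", "N/A"],
    "Sambaflow":     ["N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "Yes"],
}


def frameworks_for(hardware):
    """Return the frameworks marked "Yes" for the given hardware column."""
    col = HARDWARE.index(hardware)
    return [fw for fw, row in SUPPORT.items() if row[col] == "Yes"]


def hardware_for(framework):
    """Return the hardware platforms marked "Yes" for the given framework."""
    return [hw for hw, cell in zip(HARDWARE, SUPPORT[framework]) if cell == "Yes"]


if __name__ == "__main__":
    print(frameworks_for("Habana Gaudi2"))
    print(hardware_for("TensorRT-LLM"))
```

Keeping the rows as lists in header order mirrors the table layout, which makes the transcription easy to check against the source by eye.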

Cite this work:

        @INPROCEEDINGS{####, 
        author={Krishna Teja Chitty-Venkata and Siddhisanket Raskar and Bharat Kale and Farah Ferdaus and Aditya Tanikanti and Ken Raffenetti and Valerie Taylor and Murali Emani and Venkatram Vishwanath},
        booktitle={2024 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)}, 
        title={LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators}, 
        year={2024},
        volume={},
        number={},
        pages={},
        keywords={Large Language Models, AI Accelerators, Performance Evaluation, Benchmarking },
        doi={}}

Acknowledgements

This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.
