LLM Inference Graphs
Configuration
Throughput
KV Cache Block Size
KV Cache dtype
Parallelism
Power
Quantization
KV Cache
Perplexity
Speculative Decoding
Best Hardware and Framework
Get Started:
Click on the "Configuration" dropdown to select a configuration to plot.
Click on a filter such as Hardware, Framework, or Model, and check the options you want to include.
Select the X and Y axes.
Click on "Color By" to choose the field used to color the filtered data.
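
For readers who want to reproduce the same workflow outside the page, the steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the rows, the field names (hardware, framework, model, batch_size, throughput), and the helper functions apply_filters and build_series are all hypothetical and are not taken from the tool or its data.

```python
from collections import defaultdict

# Hypothetical benchmark rows; every field name and value here is made up.
ROWS = [
    {"hardware": "gpu-a", "framework": "fw-1", "model": "model-x", "batch_size": 1, "throughput": 100.0},
    {"hardware": "gpu-a", "framework": "fw-1", "model": "model-x", "batch_size": 8, "throughput": 640.0},
    {"hardware": "gpu-b", "framework": "fw-1", "model": "model-x", "batch_size": 1, "throughput": 80.0},
    {"hardware": "gpu-b", "framework": "fw-2", "model": "model-x", "batch_size": 8, "throughput": 500.0},
    {"hardware": "gpu-c", "framework": "fw-1", "model": "model-x", "batch_size": 1, "throughput": 60.0},
    {"hardware": "gpu-a", "framework": "fw-1", "model": "model-y", "batch_size": 1, "throughput": 40.0},
]


def apply_filters(rows, filters):
    """Keep rows whose value for each filtered field is among the checked options."""
    return [
        row for row in rows
        if all(row[field] in allowed for field, allowed in filters.items())
    ]


def build_series(rows, x, y, color_by):
    """Group (x, y) points by the color_by field: one series per distinct value."""
    series = defaultdict(list)
    for row in rows:
        series[row[color_by]].append((row[x], row[y]))
    # Sort each series by x so it can be drawn as a line.
    return {key: sorted(points) for key, points in series.items()}


# Step 2: check options under the Hardware and Model filters.
filters = {"hardware": {"gpu-a", "gpu-b"}, "model": {"model-x"}}
selected = apply_filters(ROWS, filters)

# Steps 3 and 4: pick the axes and the field to color by.
series = build_series(selected, x="batch_size", y="throughput", color_by="hardware")

for name, points in series.items():
    print(name, points)
```

Each key of series corresponds to one color in the plot, and each list holds the points for that color, ordered along the X axis.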