openbench provides standardized, reproducible benchmarking for LLMs across 30+ evaluation suites (and growing) spanning knowledge, math, reasoning, coding, science, reading comprehension, health, long ...
An open-source Python library for simplifying local testing of Databricks workflows using PySpark and Delta tables. This library enables seamless testing of PySpark processing logic outside Databricks ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results