FriendliAI — founded by the researcher behind continuous batching, the technique at the core of vLLM — is launching InferenceSense, a platform that fills idle neocloud GPU capacity with paid AI ...
Whether you are looking for an LLM with more safety guardrails or one completely without them, someone has probably built it.
If you run LLMs locally, these are the settings you should know about.