the blog of Ben Browning
2025
Properly configuring inference servers for tool calling
3119 words · 15 mins
Tags: ai, vllm
Llama Stack and why it matters
679 words · 4 mins
Tags: ai, llama stack