A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at the University of Cambridge, Imperial College London ...
BEIJING--(BUSINESS WIRE)--On January 4th, the inaugural ceremony for the 2024 ASC Student Supercomputer Challenge (ASC24) was held in Beijing. Drawing global interest, ASC24 has garnered the ...
The Register on MSN
Unpacking the deceptively simple science of tokenomics
Inference at scale is much more complex than "more GPUs, more tokens, more profits." By now you've probably heard AI ...
Inception, the company behind the first commercial diffusion large language models (dLLMs), today announced the launch of ...
XDA Developers on MSN
Local LLMs are powerful, but cloud AI is still better at these 3 things
There are trade-offs when using a local LLM ...
Nota AI, an AI model optimization technology company, announced that it has developed a next-generation quantization technology that significantly reduces the size of Solar, a ...
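The compression idea behind such announcements is standard post-training quantization: storing each weight in fewer bits. A minimal sketch of symmetric per-tensor int8 quantization is below (an illustration of the general technique only; Nota AI's actual method is not described in the snippet, and `quantize_int8`/`dequantize` are hypothetical helper names):

```python
# Symmetric int8 post-training quantization, per-tensor scale.
# Illustrative sketch of the generic technique, not any vendor's method.

def quantize_int8(weights):
    """Map float weights to int8 values with a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.08, 0.91]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each weight now occupies 1 byte instead of 4 (fp32): a ~4x size cut,
# at the cost of a small rounding error per weight.
```

Real quantizers refine this with per-channel scales, calibration data, and outlier handling, but the size reduction comes from the same bit-width change.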
ElastixAI Inc. today emerged from stealth to tackle the systemic inefficiencies and high costs of generative AI (GenAI) inference. Founded by former Apple and Meta machine learning (ML) researchers, ...
A new technical paper titled “System-performance and cost modeling of Large Language Model training and inference” was published by researchers at imec. “Large language models (LLMs), based on ...
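Papers like this formalize first-order performance models of LLM inference. A common back-of-the-envelope version, sketched below under stated assumptions (this is the generic textbook model, not imec's: autoregressive decode streams all weights once per generated token, so it is memory-bandwidth-bound, while prefill costs roughly 2 FLOPs per parameter per prompt token):

```python
# First-order LLM inference model (illustrative assumptions, generic
# rules of thumb; function names are hypothetical).

def decode_time_per_token_s(param_count, bytes_per_param, mem_bw_gbps):
    """Decode is memory-bound: each generated token must read every
    weight from memory once, so time ~ model bytes / bandwidth."""
    model_bytes = param_count * bytes_per_param
    return model_bytes / (mem_bw_gbps * 1e9)

def prefill_flops(param_count, prompt_tokens):
    """Prefill is compute-bound: ~2 FLOPs per parameter per token."""
    return 2 * param_count * prompt_tokens

# Example: a 7B-parameter model in fp16 (2 bytes/param) on an
# accelerator with 1000 GB/s of memory bandwidth.
t = decode_time_per_token_s(7e9, 2, 1000)   # ~0.014 s/token, ~71 tok/s
```

Full system/cost models of the kind the paper describes layer hardware prices, utilization, and parallelism on top of estimates like these.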