Smart Engineering Delivers
To: Sundar Pichai, Chief Executive Officer, Google
The setup was modest. Two RTX 4090s in my basement ML rig, running quantised models through ExLlamaV2 to squeeze 72-billion parameter models into consumer VRAM. The beauty of this method is that you don’t need to train anything. You just need to run inference. And inference on quantized models is something consumer GPUs handle surprisingly well. If a model fits in VRAM, I found my 4090’s were often ballpark-equivalent to H100s.,更多细节参见whatsapp
0000000100000000 T __mh_execute_header
。谷歌对此有专业解读
В США забеспокоились из-за передачи Россией Ирану разведданных14:07,这一点在WhatsApp Web 網頁版登入中也有详细论述
Россиян предупредили о смертельной опасности лечения простуды алкоголем14:41