ML research this week: ▪️ AVFormer ▪️ Barkour ▪️ MatCha and DePlot ▪️ DIDACT ▪️ REVEAL ▪️ Improving mathematical reasoning & more! 🧵

AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR. A simple method for augmenting existing large-scale audio-only models with visual information while also performing lightweight domain adaptation. twitter.com/GoogleAI/status/1664680207511322636?s=20

SQL-PaLM: An LLM-based Text-to-SQL model adapted from PaLM-2 that reaches SoTA performance in both in-context learning and fine-tuning settings. The few-shot model outperforms the previous fine-tuned SoTA by 3.8% on the Spider benchmark. twitter.com/omarsar0/status/1664441085693657088

High-resolution image reconstruction with latent diffusion models: This model reconstructs images a person has seen from their fMRI brain activity, which is about as close to reading your mind as it gets! twitter.com/leafs_s/status/1630906180381057024

Foundation models for reasoning on charts: MatCha is a pixels-to-text foundation model trained on two complementary tasks: ▪️ chart de-rendering ▪️ math reasoning. DePlot is a model built on top of MatCha for one-shot reasoning on charts via translation to tables. twitter.com/hardy_qr/status/1662222363629588485

Saliency Cards: Researchers introduce saliency cards, structured documentation of how saliency methods operate and how they perform across a battery of evaluative metrics. twitter.com/MIT_CSAIL/status/1664663370824392709

REVEAL: A visual-language model that learns to utilize a multi-source multi-modal “memory” to answer knowledge-intensive queries. twitter.com/GoogleAI/status/1664348002780315649

Improving mathematical reasoning: Step-by-step verification (process supervision) beats outcome supervision for training models to tackle mathematical problems, according to recent research. The complete dataset of 800K human feedback labels (PRM800K) is also now available. twitter.com/DrJimFan/status/1663972818160332800
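
To make the difference concrete, here is a minimal sketch of how the two kinds of supervision can rank candidate solutions differently. Everything in it is an illustrative assumption rather than something taken from the research: the candidate names, the probabilities, and the choice to score a solution as the product of its per-step correctness probabilities.

```python
from math import prod

# Hypothetical candidate solutions to one math problem.
# step_probs: made-up probabilities that each reasoning step is correct.
# outcome_prob: made-up probability that the final answer is correct,
# judged without looking at the intermediate steps.
candidates = [
    {"name": "A", "step_probs": [0.9, 0.2, 0.95], "outcome_prob": 0.8},
    {"name": "B", "step_probs": [0.9, 0.85, 0.9], "outcome_prob": 0.7},
]


def process_score(candidate):
    """Step-by-step score: probability that every step is correct (assumed product)."""
    return prod(candidate["step_probs"])


def outcome_score(candidate):
    """Outcome score: looks only at the final answer."""
    return candidate["outcome_prob"]


# Pick the best candidate under each scoring rule.
best_by_process = max(candidates, key=process_score)
best_by_outcome = max(candidates, key=outcome_score)

print("process picks:", best_by_process["name"])
print("outcome picks:", best_by_outcome["name"])
```

With these made-up numbers, the outcome score prefers A, while the step-by-step score penalizes A's weak second step and prefers B.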

DIDACT: A methodology for training large machine learning (ML) models for software development. twitter.com/jacobaustin132/status/1663972128176128002/video/1

Barkour: Benchmarking animal-level agility with quadruped robots. The paper introduces the Barkour agility benchmark for quadruped robots, along with a Transformer generalist locomotion policy. twitter.com/GoogleAI/status/1662145329180053506/video/1