Commit Graph

  • 728d4db582 Support destination arg in tree flatten/unflatten (#2450) Luca Vivona 2025-08-06 18:34:59 -04:00
  • 99d8de8445 Fix cudnn routing Jagrit Digani 2025-08-06 15:05:58 -07:00
  • c66b76a8c8 Update routing Jagrit Digani 2025-08-06 15:01:15 -07:00
  • f81edd184f Complete 2 pass sdpav Jagrit Digani 2025-08-06 13:57:40 -07:00
  • 7f8ba2a003 [WIP] 2 pass sdpav Jagrit Digani 2025-08-06 09:54:41 -07:00
  • c28249b81a Add more nvtx range for debug Jagrit Digani 2025-08-01 12:49:24 -07:00
  • e74bcdc5e3 Add sdpa file Jagrit Digani 2025-07-25 12:30:50 -07:00
  • d8ed6c1aa3 Add base cudnn attention support Jagrit Digani 2025-07-25 12:30:22 -07:00
  • db5c7efcf6 revert default cuda install (#2465) Awni Hannun 2025-08-06 06:19:12 -07:00
  • 7bb96e4249 fix cublas on h100 (#2466) Awni Hannun 2025-08-06 06:18:58 -07:00
  • fa89f0b150 faster gather qmm sorted test (#2463) Awni Hannun 2025-08-05 06:27:40 -07:00
  • ca973d1e83 fix install tags (#2464) Awni Hannun 2025-08-04 20:01:23 -07:00
  • 828c5f1137 Use SmallVector for shapes and strides (#2454) Cheng 2025-08-05 09:41:03 +09:00
  • 7d86a5c108 Feat: add USE_SYSTEM_FMT CMake option (#2219) Gaétan Lepage 2025-08-05 01:36:11 +02:00
  • 0b807893a7 fix wraps compile (#2461) Awni Hannun 2025-08-04 16:14:18 -07:00
  • 6ad0889c8a default install cuda on linux (#2462) Awni Hannun 2025-08-04 15:33:05 -07:00
  • 737dd6d1ac Add missing <algorithm> header to jit_compiler.cpp (#2460) Zamderax 2025-08-04 14:00:46 -07:00
  • aaf78f4c6b Use LRU cache for cuda graph (#2448) Cheng 2025-08-02 21:28:57 +09:00
  • 8831064493 Fix arctan2 grads (#2453) Angelos Katharopoulos 2025-08-01 21:06:04 -07:00
  • be9bc96da4 [CUDA] Matmul utils initial commit (#2441) Angelos Katharopoulos 2025-08-01 14:22:25 -07:00
  • 86258f292f [CUDA] Vectorize generated kernels (#2444) Angelos Katharopoulos 2025-07-31 18:18:57 -07:00
  • b26d88591c [CUDA] Save primitive inputs faster (#2449) Cheng 2025-08-01 10:16:06 +09:00
  • 86c6a15571 [CUDA] Backward convolution (#2431) Cheng 2025-08-01 09:54:05 +09:00
  • 8b25ce62d5 Add tests for export including control flow models and quantized models (#2430) junpeiz 2025-07-31 11:06:26 -07:00
  • da5912e4f2 fix custom metal extension (#2446) Awni Hannun 2025-07-31 06:25:36 -07:00
  • daafee676f Fix wrong graph key when using concurrent context (#2447) Cheng 2025-07-31 22:01:05 +09:00
  • d32519c8ee fix gemv regression (#2445) Awni Hannun 2025-07-30 14:23:01 -07:00
  • b405591249 fix circular reference (#2443) Awni Hannun 2025-07-30 09:37:44 -07:00
  • 3bf81ed1bd [CUDA] Quantized refactoring (#2442) Angelos Katharopoulos 2025-07-30 08:27:20 -07:00
  • 2204182bba Make CI faster (#2440) Cheng 2025-07-30 18:26:36 +09:00
  • 3628e5d497 Use load_vector in arg_reduce (#2439) Cheng 2025-07-30 17:40:26 +09:00
  • a0ae49d397 Move arange to its own file (#2438) Cheng 2025-07-30 13:05:51 +09:00
  • 254476718b Remove the kernel arg from get_launch_args (#2437) Cheng 2025-07-30 11:43:02 +09:00
  • 3adba92ebe Cuda faster softmax (#2435) Awni Hannun 2025-07-29 17:18:12 -07:00
  • ef631d63af faster rms norm (#2433) Awni Hannun 2025-07-29 13:12:00 -07:00
  • 970dbe8e25 Use ccache in CI (#2414) Cheng 2025-07-29 08:43:22 +09:00
  • 641be9463b Add more CUDA architectures for PyPi package (#2427) Awni Hannun 2025-07-28 12:35:15 -07:00
  • ab0e608862 [CUDA] More sizes for gemv (#2429) Awni Hannun 2025-07-28 12:35:01 -07:00
  • 1588659062 no occupancy query for launch params (#2426) Awni Hannun 2025-07-28 09:09:41 -07:00
  • b9e88fb976 [CUDA] Fix segfault on exit (#2424) Awni Hannun 2025-07-27 08:08:13 -07:00
  • 4ad53414dd fix cuda pypi package (#2423) (tag: v0.27.1) Awni Hannun 2025-07-25 15:20:29 -07:00
  • d1165b215e version (#2420) Awni Hannun 2025-07-25 13:29:28 -07:00
  • dcb8319f3d update install docs and requirements (#2419) Awni Hannun 2025-07-25 12:13:19 -07:00
  • 5597fa089c Fix qvm splitk (#2415) Awni Hannun 2025-07-25 11:50:24 -07:00
  • 9acec364c2 [CUDA] Always use batched matmul (#2404) Awni Hannun 2025-07-24 20:46:02 -07:00
  • 7d9d6ef456 docs: fix adam and adamw eps placement (#2416) Skonor 2025-07-24 16:40:45 -07:00
  • 6f5874a2f2 [CUDA] Initial implementation of Convolution with cuDNN (#2385) Cheng 2025-07-25 08:12:10 +09:00
  • 70dc336785 Test on cuda 12.2 and 12.9 (#2413) Awni Hannun 2025-07-24 06:06:15 -07:00
  • 4e504039f5 [Metal] Release metal events (#2412) Awni Hannun 2025-07-23 19:53:42 -07:00
  • d1f4d291e8 Fix uv install and add dev release (#2411) Awni Hannun 2025-07-23 16:54:19 -07:00
  • e1840853ce full row mask in sdpa consistently gives nan (#2406) Awni Hannun 2025-07-23 16:37:03 -07:00
  • 0f5ce173da [CUDA] --compress-mode requires CUDA 12.8 (#2407) Cheng 2025-07-23 22:11:11 +09:00
  • 588854195f Remove unused code in Convolution::vjp (#2408) Cheng 2025-07-23 22:11:00 +09:00
  • 28d068bce6 Fix an error in the comment for mx.dequantize (#2409) Fangjun Kuang 2025-07-23 21:10:50 +08:00
  • 8269c9d02d Support unaligned M qmm Angelos Katharopoulos 2025-07-23 00:40:27 -07:00
  • 903b40627c Add dynamic shared memory and improve qmm Angelos Katharopoulos 2025-07-22 23:36:53 -07:00
  • d107d8d495 add cuda gemv (#2400) Awni Hannun 2025-07-22 08:24:13 -07:00
  • 1e496ddb82 [CUDA] Simplify allocator (#2392) Awni Hannun 2025-07-22 08:24:01 -07:00
  • 74eccbf3fa use size option in binary (#2399) Awni Hannun 2025-07-22 07:00:53 -07:00
  • 08638223ca Fix including stubs in wheel (#2398) Awni Hannun 2025-07-22 06:30:17 -07:00
  • 700f7dcf01 Refactor the matmul a bit Angelos Katharopoulos 2025-07-21 23:38:21 -07:00
  • 56cc858af9 Add contiguous_copy_cpu util for copying array (#2397) Cheng 2025-07-21 23:30:35 +09:00
  • f55c4ed1d6 Remove thrust iterators (#2396) Cheng 2025-07-21 23:30:27 +09:00
  • 6c60bd1cbf Fixed mma and working dequant Angelos Katharopoulos 2025-07-21 04:39:27 -07:00
  • a64cc02a0c Somewhat working matmul primitives Angelos Katharopoulos 2025-07-21 02:22:25 -07:00
  • 346ae5fdb5 Refactor quantized Angelos Katharopoulos 2025-07-16 16:22:25 -07:00
  • 93d70419e7 [CUDA] speedup handling scalars (#2389) Awni Hannun 2025-07-18 21:47:31 -07:00
  • 63f663d9c6 fix cuda manylinux version to match others (#2388) Awni Hannun 2025-07-18 21:02:16 -07:00
  • 84b4d96efa fix release build + patch bump (#2387) (tag: v0.26.5) Awni Hannun 2025-07-18 14:47:37 -07:00
  • aec67f2fa6 patch bump (#2386) Awni Hannun 2025-07-18 12:25:48 -07:00
  • deee214a95 Adding support for the Muon Optimizer (#1914) Gökdeniz Gülmez 2025-07-18 21:25:28 +02:00
  • 45adec102c Add contiguous_copy_gpu util for copying array (#2379) Cheng 2025-07-18 22:44:25 +09:00
  • 31fc530c76 [CUDA] Add more ways finding CCCL headers in JIT (#2382) Cheng 2025-07-18 07:25:34 +09:00
  • fbb3f65a1a fix resource leaks in matmul and graph (#2383) Awni Hannun 2025-07-17 06:50:15 -07:00
  • 6b1b8ea91b [CUDA] Add work per thread to compile (#2368) Angelos Katharopoulos 2025-07-17 06:47:52 -07:00
  • b2273733ea Test with CUDA 12.2 (#2375) Awni Hannun 2025-07-16 13:00:37 -07:00
  • f409b229a4 fix ring distributed test (#2380) Awni Hannun 2025-07-16 11:25:24 -07:00
  • 30571e2326 Rename the copy util in cpu/copy.h to copy_cpu (#2378) Cheng 2025-07-16 23:34:24 +09:00
  • d7734edd9f fix complex reduce + nan propagation in min and max (#2377) Awni Hannun 2025-07-15 18:19:47 -07:00
  • 2ba69bc8fa lower memory uniform sampling (#2361) Awni Hannun 2025-07-15 14:22:07 -07:00
  • cb349a291c [CUDA] Use cuda::std::complex in place of cuComplex (#2372) Cheng 2025-07-15 16:36:13 +09:00
  • f0a0b077a0 Install linux with mlx[cuda] and mlx[cpu] (#2356) Awni Hannun 2025-07-14 17:17:33 -07:00
  • 49114f28ab fix flaky test (#2371) Awni Hannun 2025-07-14 17:16:18 -07:00
  • e7d2ebadd2 [CUDA] Affine quantize (#2354) Awni Hannun 2025-07-14 15:45:44 -07:00
  • e569803d7c update linux build (#2370) Awni Hannun 2025-07-14 15:13:56 -07:00
  • d34f887abc Add Primitive::name and remove Primitive::print (#2365) Cheng 2025-07-15 06:06:35 +09:00
  • 5201df5030 Fix imag() vjp (#2367) Angelos Katharopoulos 2025-07-14 13:11:16 -07:00
  • 2d3c26c565 [CUDA] Do not put kernels in annoymous namespace (#2362) Cheng 2025-07-13 06:24:45 +09:00
  • 6325f60d52 [CUDA] Bundle CCCL for JIT compilation (#2357) Cheng 2025-07-12 10:45:37 +09:00
  • a9c720e8cd Improve the ring backend initialization (ring-init) Angelos Katharopoulos 2025-07-11 15:31:28 -07:00
  • 42cc9cfbc7 fix copy dispatch (#2360) Awni Hannun 2025-07-11 10:59:35 -07:00
  • 8347575ba1 [CUDA] Implement Scan kernel (#2347) Cheng 2025-07-11 08:54:12 +09:00
  • b6eec20260 Fix edge check in qmm_n QuantizedLoader (#2355) Angelos Katharopoulos 2025-07-10 16:28:50 -07:00
  • 0eb035b4b1 Fix type promotion in Adam with bias correction (#2350) Angelos Katharopoulos 2025-07-10 11:14:42 -07:00
  • afb9817599 [CUDA] Put version in ptx cache dir path (#2352) Cheng 2025-07-10 23:24:21 +09:00
  • 8fb3e7a26c [CUDA] Set current device before cudaGraphLaunch (#2351) Cheng 2025-07-10 23:24:02 +09:00
  • 8c7bc30ce4 Align mlx::core::min op nan propagation with NumPy (#2346) jhavukainen 2025-07-10 06:20:43 -07:00
  • 85873cb162 [CUDA] Do vectorized store/load in contiguous elementwise ops (#2342) Cheng 2025-07-10 10:48:43 +09:00
  • e14ee12491 add zero for argsort vjp (#2345) Awni Hannun 2025-07-09 14:37:14 -07:00
  • 8b9a3f3cea Align mlx::core::max op nan propagation with NumPy (#2339) jhavukainen 2025-07-09 11:26:27 -07:00
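A listing in this shape (abbreviated hash, subject, author, ISO date, newest first) can be reproduced with plain `git log` format placeholders. The snippet below is a self-contained sketch: it builds a throwaway repository with one commit so it runs anywhere; on a real clone of the project you would run only the final `git log` line. The repository path and author identity here are hypothetical.

```shell
# Self-contained demo repo so the final command has something to show.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name='Jane Doe' -c user.email='jane@example.com' \
    commit -q --allow-empty -m 'Add sdpa file'

# One commit per line: abbreviated hash, subject, author name, ISO-8601 date.
git log --pretty=format:'%h %s %an %ad' --date=iso
```

Adding `--decorate` would also print tag and branch names (e.g. `(tag: v0.27.1)`) next to the commits that carry them.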