Commit Graph

  • 9b83004631 Faster sampling with mx.compile (#937) Awni Hannun 2024-08-15 11:29:09 -07:00
  • 95840f32e2 Fix whisper conversion for safetensors models (#935) Awni Hannun 2024-08-14 10:22:04 -07:00
  • 33905447f9 Whisper updates to allow HF models (#923) Awni Hannun 2024-08-09 11:11:58 -07:00
  • df744c98e6 Predict stop sequence matches during streaming (#541) tidely 2024-08-07 01:24:15 +03:00
  • 8fa12b0058 Adapters loading (#902) Khush Gupta 2024-08-01 16:18:18 -07:00
  • 85dc76f6e0 Server: support stream_options (#913) madroid 2024-07-26 23:58:52 +08:00
  • 46da74fea2 Unify attention mask in LLMs (#911) otriscon 2024-07-25 19:45:22 -04:00
  • 7a3ab1620a support load model by custom get_model_classes (#899) Anchen 2024-07-26 04:01:17 +10:00
  • cd8efc7fbc Add support for Llama-3.1 (#907) Alex Cheema 2024-07-23 13:21:32 -07:00
  • 47060a8130 refactor: add force_download parameter to get_model_path function (#800) M. Ali Bayram 2024-07-23 23:10:20 +03:00
  • 3f337e0f0a Add Mistral NeMo (fix) (#895) Prince Canuma 2024-07-22 15:09:24 +02:00
  • 3d365b612a Add support for InternLM-2.5 (#871) Prince Canuma 2024-07-18 01:38:22 +02:00
  • 561dcf5643 Add support for deepseek coder v2 lite (#882) Anchen 2024-07-18 00:23:28 +10:00
  • f0c6c6e226 keep the server in a valid state (#889) Awni Hannun 2024-07-15 18:35:36 -07:00
  • bfc1f2763b longrope (#886) JosefAlbers 2024-07-12 23:19:11 +09:00
  • 8bf397e450 Pass use_dora parameter to linear_to_lora_layers (#885) Chime Ogbuji 2024-07-11 17:34:34 -04:00
  • fbe3247772 Add GPT-neox model (#863) nicolov 2024-07-11 15:13:17 +02:00
  • 9717307ff0 Validation with full data set, results in NaN validation score (#879) James A Capozzoli 2024-07-10 11:36:11 -04:00
  • 63800c8feb Example of response generation with optional arguments (#853) Alex Wozniakowski 2024-07-09 06:49:59 -07:00
  • 68e88d42fb Fix server for openai package (#877) Awni Hannun 2024-07-08 12:34:31 -07:00
  • 20e221f7f7 Add recurrent gemma (#856) Awni Hannun 2024-07-07 12:10:04 -07:00
  • 1e05aef344 Add logit soft capping to gemma, and fix precision issues (#857) n8programs 2024-07-02 10:52:39 -04:00
  • f212b770d8 Server loads the model on demand from the request (#851) Angelos Katharopoulos 2024-06-27 11:37:57 -07:00
  • 538339b599 gemma2 (#855) Awni Hannun 2024-06-27 10:06:28 -07:00
  • 9f10728145 fix yi (#852) Awni Hannun 2024-06-27 06:38:19 -07:00
  • 7979b84a9e transformer_lm: add --dataset enwik8 (#838) Volodymyr Kyrylov 2024-06-26 20:59:01 +02:00
  • df6bc09d74 Configuration-based use of HF hub-hosted datasets for training (#701) Chime Ogbuji 2024-06-26 13:20:50 -04:00
  • 1d701a1831 Logprobs info to completion API (#806) Chime Ogbuji 2024-06-23 13:35:13 -04:00
  • a7598e9456 Fix mypy errors with models/{qwen2,qwen2_moe,starcoder2}.py (#835) Yi Wang 2024-06-14 09:44:50 -07:00
  • 97939cc86e nits openlm Awni Hannun 2024-06-13 07:47:56 -07:00
  • 7c6ced183d openlm Awni Hannun 2024-06-13 07:47:16 -07:00
  • d8b073e3a7 Add eos token to lora fine-tunes (#818) Awni Hannun 2024-06-12 07:44:21 -07:00
  • 3cc58e17fb Tweaks to run dspy-produced calls to the server, with gemma template. (#810) Nada Amin 2024-06-12 10:17:06 -04:00
  • 6da07fb1b0 make models/phi3.py and models/phi3small.py compatible with mypy (#833) Yi Wang 2024-06-12 06:53:55 -07:00
  • fda41545a6 Su-RoPE(Rotary Position Embedding) for Phi-3 (#813) JosefAlbers 2024-06-11 22:20:04 +09:00
  • a54dfd698e Correct the type annotation of cache in llama.py (#828) Yi Wang 2024-06-10 15:18:34 -07:00
  • bb8227f181 Correct type annotation of llama.ModelArgs.num_key_value_heads (#827) Yi Wang 2024-06-10 14:47:31 -07:00
  • c5da302fc4 gpu featurization (#824) Awni Hannun 2024-06-07 08:59:44 -07:00
  • 4872727f14 Fixing "NameError: name 'resume_adapter_file' is not defined" (#817) Robin Glauser 2024-06-05 19:07:31 +02:00
  • 43d6deb3c1 mlx_lm: Add Streaming Capability to Generate Function (#807) Michał Kurc 2024-06-03 18:04:39 +02:00
  • 8353bbbf93 Segment Anything Model (#552) Shiyu 2024-06-03 07:45:51 +08:00
  • 89b0b75250 GPT2 Support (#798) Derek Lewis 2024-06-02 16:33:20 -07:00
  • c457a3f88b LoRA: Extract small function (#614) madroid 2024-06-02 21:38:42 +08:00
  • 81318ad4a8 Port of phi3small (#794) Awni Hannun 2024-05-31 12:54:14 -07:00
  • 09aaeac72c fix moe conversion (#802) Awni Hannun 2024-05-31 12:36:05 -07:00
  • f49c5f2829 fixed the requirements (#803) Behnam Moh 2024-05-29 09:14:19 -04:00
  • aac98ca6f4 support internlm2 (#797) Chen Xin 2024-05-27 21:22:21 +08:00
  • ca7ce60c91 Rename block sparse to gather (#793) Awni Hannun 2024-05-23 19:47:35 -07:00
  • 69700d8431 Add support for Phi-3 Medium (#790) Prince Canuma 2024-05-23 01:47:06 +02:00
  • b044ce2acf Add support for ibm granite (#758) Prince Canuma 2024-05-22 05:16:31 +02:00
  • 9fc6efbd90 version bump + some fixes (#792) Awni Hannun 2024-05-21 20:09:35 -07:00
  • 9f671228cd Block sparse MM MoEs (#782) Angelos Katharopoulos 2024-05-21 15:58:08 -07:00
  • 199df9e110 fix: Added dedicated error handling to load and get_model_path (#775) AtakanTekparmak 2024-05-20 15:39:05 +02:00
  • e92de216fd rid warning (#789) Awni Hannun 2024-05-20 06:05:33 -07:00
  • 42458914c8 support dora finetune in mlx-examples/llms/mlx_lm (#779) alexC-nonsense4k 2024-05-16 23:21:26 +08:00
  • 69181e0058 Support non incremental kv cache growth (#766) Awni Hannun 2024-05-15 12:56:24 -07:00
  • 1a86d985d9 Support --add_eos_token argument within Lora training (#760) Jinwu Zhan 2024-05-14 08:17:42 +08:00
  • 10853b57d9 Add model_config parameter to load() and load_model() (#770) JosefAlbers 2024-05-11 02:13:34 +09:00
  • 6f0a69e682 fix lora for openelm (#773) Awni Hannun 2024-05-10 09:51:41 -07:00
  • fad9598372 Fix llama cache check (#763) Awni Hannun 2024-05-08 08:35:54 -07:00
  • ee60e2a9d5 Kv cache (#643) Awni Hannun 2024-05-08 08:18:13 -07:00
  • bfbc0e434a Add optional EOS token for llava example (#753) Albert Avetisian 2024-05-08 09:04:36 -04:00
  • c0019c4908 Pad mask with zeros for non-square attention matrices (#715) Kevin Wang 2024-05-04 19:32:25 -04:00
  • f30413b63c chore(mlx-lm): fix the number of validation batches configuration. (#752) Anchen 2024-05-04 23:52:42 +10:00
  • 2bf11c4633 Use stable url for MNIST (#749) Awni Hannun 2024-05-03 17:13:05 -07:00
  • d1c35fa684 Add MLX Cache Limit setting for mlx_lm.generate and mlx_lm.server CLI (#744) Konstantin Kerekovski 2024-05-03 15:42:48 -04:00
  • b468091f7f Add model management functionality for local caches (#736) Ivan Fioravanti 2024-05-03 21:20:13 +02:00
  • 92430df0a0 Fix lora for qwen moe (#743) Awni Hannun 2024-05-02 21:55:09 -07:00
  • 5079af62db Update model card describe (#654) madroid 2024-05-03 12:22:04 +08:00
  • 6775d6cb3f Whisper: Add pip distribution configuration to support pip installations. (#739) madroid 2024-05-02 00:00:02 +08:00
  • 4bf2eb17f2 Validate server params & fix logit bias bug (#731) Karim Elmaaroufi 2024-04-30 07:27:40 -07:00
  • 7c0962f4e2 Add Supported Quantized Phi-3-mini-4k-instruct gguf Weight (#717) Jaward Sesay 2024-04-30 11:11:32 +08:00
  • 5513c4e57d Fixes Typo in Starcoder2 (#740) Thomas Lazarus 2024-04-29 15:14:45 -05:00
  • 510d2bde49 Force multi_commits when uploading to HF (#729) Javier de la Rosa 2024-04-29 04:07:17 +02:00
  • 699de35b03 Update lora_config.yaml (#735) 锦此 2024-04-29 01:24:34 +08:00
  • c012eb173f Add support for OpenELM (#719) Prince Canuma 2024-04-26 01:49:28 +02:00
  • 2c1c9e9024 MiniCPM implementation (#685) Gökdeniz Gülmez 2024-04-26 00:29:28 +02:00
  • 685012c2ad Couple fixes for LoRA (#711) Awni Hannun 2024-04-25 14:16:13 -07:00
  • 109ee2f2f8 Use CORS headers for streaming for MLX Server (#716) Kristian Muñiz 2024-04-25 10:26:04 -04:00
  • 8a265f0d54 Fix incorrect type annotation (#720) Kevin Wang 2024-04-24 18:52:43 -04:00
  • abcd891851 Add support for phi-3 (#712) Prince Canuma 2024-04-23 18:20:00 +02:00
  • ecbc6ff1e3 one more quant fix (#708) Awni Hannun 2024-04-22 18:12:52 -07:00
  • 8d5cf5b0c8 use logging in mlx server (#705) Aaron Ng 2024-04-22 07:50:06 -07:00
  • f20e68fcc0 Load fused model with transformers (#703) AlexandrosChrtn 2024-04-21 19:04:44 +03:00
  • 749cabf299 fix: unicode decoding (#702) Anchen 2024-04-22 01:58:23 +10:00
  • 1484598de1 Add support for logit bias (#697) Karim Elmaaroufi 2024-04-21 06:53:56 -07:00
  • 6abdbe3be8 Fix quant in gguf (#698) Awni Hannun 2024-04-19 20:07:11 -07:00
  • 574ad7f6fe fix dequantization (#693) Awni Hannun 2024-04-19 10:46:59 -07:00
  • 2146bcd7ee Quantize embedding / Update quantize API (#680) Awni Hannun 2024-04-18 18:16:10 -07:00
  • f5f189e48a fix(mlx-lm): broken server.py (#690) Anchen 2024-04-19 07:26:18 +10:00
  • 35206806ac Create executables for generate, lora, server, merge, convert (#682) Phúc H. Lê Khắc 2024-04-17 00:08:49 +01:00
  • 7d7e236061 - Removed unused Python imports (#683) dmdaksh 2024-04-16 10:50:32 -04:00
  • e55a9e8cb4 Add an SPM detokenizer that doesn't trim initial space (#681) Angelos Katharopoulos 2024-04-15 14:15:25 -07:00
  • d3f8e4aee9 Fix argpartition call in Mixtral and other MOES (#676) Awni Hannun 2024-04-12 11:00:56 -07:00
  • 9c5554d8ee Use async eval (#670) Awni Hannun 2024-04-11 13:18:23 -07:00
  • 0250f6f38e feat: Update black-pre-commit-mirror to version 24.3.0 (#675) Nripesh Niketan 2024-04-11 18:28:26 +04:00
  • 9f472dc985 Update transformers for ⌘-R+ (#668) devonthomas35 2024-04-11 07:28:12 -07:00
  • 5a4cad34ef Always resume downloads (#674) da-z 2024-04-11 15:52:32 +02:00
  • eff6690952 Fix CFG for SDXL (#667) Angelos Katharopoulos 2024-04-09 06:06:41 -07:00
  • 1278994b56 Add streaming detokenizers (#651) Angelos Katharopoulos 2024-04-08 22:36:01 -07:00