* start port of phi3small
* fix phi3
* use block sparsity
* compile activation
* nits in readme / mlx lm version