.. _linear_regression:

Linear Regression
=================

Let's implement a basic linear regression model as a starting point to
learn MLX. First import the core package and set up some problem metadata:

.. code-block:: python

   import mlx.core as mx

   num_features = 100
   num_examples = 1_000
   num_iters = 10_000  # iterations of SGD
   lr = 0.01  # learning rate for SGD

We'll generate a synthetic dataset by:
1. Sampling the design matrix ``X``.
2. Sampling a ground truth parameter vector ``w_star``.
3. Computing the dependent values ``y`` by adding Gaussian noise to
   ``X @ w_star``.

.. code-block:: python

   # True parameters
   w_star = mx.random.normal((num_features,))

   # Input examples (design matrix)
   X = mx.random.normal((num_examples, num_features))

   # Noisy labels
   eps = 1e-2 * mx.random.normal((num_examples,))
   y = X @ w_star + eps
We will use SGD to find the optimal weights. To start, define the squared loss
and get the gradient function of the loss with respect to the parameters.

.. code-block:: python

   def loss_fn(w):
       return 0.5 * mx.mean(mx.square(X @ w - y))

   grad_fn = mx.grad(loss_fn)
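As an aside: if you also want the loss value at each step, MLX provides
``mx.value_and_grad``. A minimal sketch (not part of the original example)
that builds a function returning the loss and the gradient together:

.. code-block:: python

   # Returns a function computing (loss, gradient) in one call,
   # so monitoring the loss needs no extra forward pass.
   loss_and_grad_fn = mx.value_and_grad(loss_fn)

We'll stick with ``grad_fn`` below.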
Start the optimization by initializing the parameters ``w`` randomly. Then
repeatedly update the parameters for ``num_iters`` iterations. Note that MLX
evaluates lazily; the call to ``mx.eval`` forces the updated parameters to
actually be computed at each step.

.. code-block:: python

   w = 1e-2 * mx.random.normal((num_features,))

   for _ in range(num_iters):
       grad = grad_fn(w)
       w = w - lr * grad
       mx.eval(w)
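Here is a variant of the loop (a sketch, using ``loss_and_grad_fn`` from the
aside above) that also prints the loss periodically to monitor convergence:

.. code-block:: python

   loss_and_grad_fn = mx.value_and_grad(loss_fn)

   w = 1e-2 * mx.random.normal((num_features,))
   for it in range(num_iters):
       loss, grad = loss_and_grad_fn(w)
       w = w - lr * grad
       mx.eval(w)
       if (it + 1) % 1000 == 0:
           # .item() forces evaluation of the loss as well
           print(f"iter {it + 1}: loss = {loss.item():.5f}")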
Finally, compute the loss of the learned parameters and verify that they are
close to the ground truth parameters.

.. code-block:: python

   loss = loss_fn(w)
   error_norm = mx.sum(mx.square(w - w_star)).item() ** 0.5

   print(
       f"Loss {loss.item():.5f}, |w-w*| = {error_norm:.5f}, "
   )
   # Should print something close to: Loss 0.00005, |w-w*| = 0.00364
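As an optional sanity check (not part of the original example), you can
compare the SGD solution against the closed-form least-squares fit. A sketch
using NumPy, assuming the MLX arrays convert via ``np.array``:

.. code-block:: python

   import numpy as np

   # Closed-form least-squares solution of y ≈ X w
   w_closed, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

   # Distance from the SGD solution; should be small, like |w - w*| above
   print(np.linalg.norm(np.array(w) - w_closed))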
Complete `linear regression
<https://github.com/ml-explore/mlx/tree/main/examples/python/linear_regression.py>`_
and `logistic regression
<https://github.com/ml-explore/mlx/tree/main/examples/python/logistic_regression.py>`_
examples are available in the MLX GitHub repo.