<h1>Logging for ML Model Deployments</h1>
<p>Brian Schmidt, 2023-04-20</p>
<p>In previous blog posts we <a href="https://www.tekhnoal.com/ml-model-decorators.html">introduced the decorator pattern</a> for ML model deployments and then showed how to use the pattern to build extensions for an ML model deployment. For example, in <a href="https://www.tekhnoal.com/data-enrichment-for-ml-models.html">this blog post</a> we did data enrichment using a PostgreSQL database. The extensions were added without modifying the machine learning model code at all; the decorator pattern made that possible. In this blog post we'll add logging to a model deployment, again using a decorator and without modifying the model code.</p>
<p>This blog post is written in a Jupyter notebook, and we'll be switching between Python code and shell commands; the formatting will reflect this.</p>
<h2>Introduction</h2>
<p>As software systems become more and more complex, the people that build and operate these systems are finding that they are very hard to debug and inspect. To be able to solve this issue, a software system needs to be observable. An observable system is a system that allows an outside observer to infer the internal state of the system based purely on the data that it generates. The quality of "observability" helps the operators of a system to understand the inner workings of the system and to solve issues that may come up, even when the issues may be unprecedented.</p>
<p>Observability is a non-functional requirement (NFR) of a system. An NFR is a requirement that is placed on the operation of a system that has nothing to do with the specific functions of the system. Rather, it is a cross-cutting concern that needs to be addressed within the whole system design. Logging is a way that we can implement observability in a software system. </p>
<p>In the world of software systems, a "log" is a record of events that happen as software runs. A log is made up of individual records called log records that each represent a single event in the software system. Logs are useful for debugging the system, keeping a permanent record of its activities, and many other purposes. In general, log records are designed for debugging, alerting, and auditing the activities of the system.</p>
<p>Just like any other software component, machine learning models need to create a log of events that may be useful later on. For example, we may want to know how many predictions the model made, how many errors occurred, and any other interesting events that we may want to keep track of. In this blog post we'll create a decorator that creates a log for a machine learning model.</p>
<p>This post is not meant to be a full guide for doing logging in Python, but we'll include some background information to make it easier to understand. Logging in Python can get complicated and there are other places that cover it more thoroughly. <a href="https://realpython.com/python-logging/">Here</a> is a good place to learn more about Python logging.</p>
<p>All of the code is available in <a href="https://github.com/schmidtbri/logging-for-ml-models">this github repository</a>.</p>
<h2>Software Architecture</h2>
<p>The logging decorator will operate within the model service, but it requires outside services to handle the logs that it produces. This makes the software architecture more complicated and requires that we add several more services to the mix. </p>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/software_architecture_lfmlm.png" width="100%"></p>
<p>The logging decorator executes right after the prediction request is received from the client and a prediction is made by the model; it then sends the logs to be handled by other services. The other services are:</p>
<ul>
<li>Log Forwarder: a service that runs on each cluster node and forwards logs from the local hard drive to the log storage service.</li>
<li>Log Storage: a service that can store logs and also query them.</li>
<li>Log User Interface: a service with a web interface that provides access to the logs stored in the log storage service.</li>
</ul>
<p>The specific services that we'll use will be detailed later in the blog post.</p>
<h2>Logging Best Practices</h2>
<p>There are certain things we can do when creating a log for an application that make it more useful, especially in production settings. For example, attaching a "level" to each log record makes it easy to filter the log according to the severity of the events. A log record is at the "INFO" level when it communicates a simple action that the system has taken, while a "WARNING" record indicates a possible problem that the system can nevertheless continue running through. A good description of the common log levels is <a href="https://sematext.com/blog/logging-levels/">here</a>.</p>
<p>Another good practice for logs is to include contextual information that can help to debug any problems that may arise in the execution of the code. For example, we can include the location in the codebase where the log record was generated. This information is very helpful during debugging and helps to quickly find the code that caused the event to happen. The information is often presented as the function name, code file name, and line number where the log record was generated. Another piece of useful contextual information is the hostname of the machine where the log was generated.</p>
<p>Logs should be easy to interpret for both humans and machines. Log records are often written as free-form text strings, which humans can read easily but machines find complicated to parse. A good middle ground is JSON formatting: JSON-formatted logs are easy to parse programmatically, while still allowing a human to quickly read and understand a log message.</p>
<p>Unique identifiers are useful to include in logs because they allow us to correlate many different log records together into a cohesive picture. For example, a correlation id is a unique ID that is generated to identify a specific transaction or query in a system. Adding unique identifiers to each log record can make it possible to debug complex problems that happen across system boundaries. A good description of correlation ids is <a href="https://hilton.org.uk/blog/microservices-correlation-id">here</a>.</p>
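<p>As a minimal sketch of this idea, the snippet below generates a correlation id with Python's uuid module and attaches it to each log record through the logging module's "extra" mechanism (both are covered in more detail below). The logger and field names here are made up for illustration:</p>

```python
import logging
import sys
import uuid

# a logger whose format includes the correlation id of each record
logger = logging.getLogger("correlation_example")
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(correlation_id)s : %(message)s"))
logger.addHandler(handler)

# one correlation id is generated per transaction and attached to every
# log record emitted while handling that transaction
correlation_id = str(uuid.uuid4())
logger.warning("Prediction requested.", extra={"correlation_id": correlation_id})
logger.warning("Prediction returned.", extra={"correlation_id": correlation_id})
```

<p>In a real service, the correlation id would typically be read from an incoming request header, or generated if it is absent, so that records from every service that handled the same transaction can be joined together.</p>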
<h2>Logging in Python</h2>
<p>The Python standard library has a module that simplifies logging. The logging module is imported and used like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">logging</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">()</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Warning message.
</code></pre></div>
<p>To start logging, we instantiated a logger object using the logging.getLogger() function. Then we used the logger object to log a WARNING message.</p>
<p>The log records are being sent to the stderr output of the process by default. We'll change that by instantiating a StreamHandler and pointing it at the stdout stream:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">stream_handler</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">StreamHandler</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">addHandler</span><span class="p">(</span><span class="n">stream_handler</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Warning message.
</code></pre></div>
<p>We just replaced the default behavior, which logs to stderr, with a handler that logs to stdout. A log handler is a software component that sends log records to destinations outside of the running process.</p>
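<p>Because handlers are independent of the logger, we can attach several of them and send the same records to different destinations at once. Here is a small sketch (the logger name and file path are arbitrary) that logs to stdout and to a file at the same time:</p>

```python
import logging
import os
import sys
import tempfile

logger = logging.getLogger("handler_example")
logger.addHandler(logging.StreamHandler(sys.stdout))  # first destination: stdout

# second destination: a log file on the local disk
log_path = os.path.join(tempfile.gettempdir(), "handler_example.log")
file_handler = logging.FileHandler(log_path, mode="w")
logger.addHandler(file_handler)

logger.warning("Warning message.")  # sent to both handlers
file_handler.flush()

with open(log_path) as log_file:
    contents = log_file.read()
print(contents)
```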
<p>We can also log messages at other levels; here are a WARNING and a DEBUG message:</p>
<div class="highlight"><pre><span></span><code><span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Warning message.
</code></pre></div>
<p>When the code above executed, only the WARNING message was printed because, by default, the logger only emits messages at the WARNING level or above. This filtering is helpful when you are only interested in logs above a certain level. We can change the threshold by configuring the logger:</p>
<div class="highlight"><pre><span></span><code><span class="n">logger</span><span class="o">.</span><span class="n">setLevel</span><span class="p">(</span><span class="n">logging</span><span class="o">.</span><span class="n">DEBUG</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Warning message.
Debug message.
</code></pre></div>
<p>Now we can see the debug message. </p>
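<p>Note that handlers also have a level of their own, and a record must pass both the logger's level and the handler's level before it is emitted. Here is a small sketch of that behavior, using an in-memory stream so the output is easy to inspect:</p>

```python
import io
import logging

logger = logging.getLogger("level_example")
logger.setLevel(logging.DEBUG)      # the logger lets everything through

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setLevel(logging.WARNING)   # but this handler drops records below WARNING
logger.addHandler(handler)

logger.debug("Debug message.")      # dropped by the handler
logger.warning("Warning message.")  # written to the buffer
print(buffer.getvalue())
```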
<p>We can add more information to each log record by attaching a formatter to the log handler:</p>
<div class="highlight"><pre><span></span><code><span class="n">formatter</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">Formatter</span><span class="p">(</span><span class="s1">'</span><span class="si">%(asctime)s</span><span class="s1">:</span><span class="si">%(name)s</span><span class="s1">:</span><span class="si">%(levelname)s</span><span class="s1">: </span><span class="si">%(message)s</span><span class="s1">'</span><span class="p">)</span>
<span class="n">stream_handler</span><span class="o">.</span><span class="n">setFormatter</span><span class="p">(</span><span class="n">formatter</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">2023</span><span class="o">-</span><span class="mf">04</span><span class="o">-</span><span class="mf">23</span><span class="w"> </span><span class="mf">21</span><span class="p">:</span><span class="mf">28</span><span class="p">:</span><span class="mf">47</span><span class="p">,</span><span class="mf">875</span><span class="p">:</span><span class="n">root</span><span class="p">:</span><span class="n">WARNING</span><span class="p">:</span><span class="w"> </span><span class="n">Warning</span><span class="w"> </span><span class="n">message</span><span class="mf">.</span><span class="w"></span>
<span class="mf">2023</span><span class="o">-</span><span class="mf">04</span><span class="o">-</span><span class="mf">23</span><span class="w"> </span><span class="mf">21</span><span class="p">:</span><span class="mf">28</span><span class="p">:</span><span class="mf">47</span><span class="p">,</span><span class="mf">876</span><span class="p">:</span><span class="n">root</span><span class="p">:</span><span class="n">DEBUG</span><span class="p">:</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="n">message</span><span class="mf">.</span><span class="w"></span>
</code></pre></div>
<p>A formatter is a software component that can format log messages according to a desired format. The log record now contains the date and time of the event, the name of the logger that generated the message, the level of the log, and the log message. These are all standard fields that are attached to log messages when they are created, more information about these fields can be found in the Python documentation <a href="https://docs.python.org/3/library/logging.html#logrecord-attributes">here</a>.</p>
<p>Each logger has a name attached to it when it is created; the name of the current logger is "root" because we created the logger without specifying a name. We can create a new logger with a name like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="s2">"test_logger"</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">2023</span><span class="o">-</span><span class="mf">04</span><span class="o">-</span><span class="mf">23</span><span class="w"> </span><span class="mf">21</span><span class="p">:</span><span class="mf">28</span><span class="p">:</span><span class="mf">47</span><span class="p">,</span><span class="mf">881</span><span class="p">:</span><span class="n">test_logger</span><span class="p">:</span><span class="n">DEBUG</span><span class="p">:</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="n">message</span><span class="mf">.</span><span class="w"></span>
</code></pre></div>
<p>The log record now shows the name of the new logger rather than the root logger that we were using before.</p>
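<p>Logger names also form a dot-separated hierarchy, and records propagate up that hierarchy to the handlers attached to ancestor loggers. Here is a sketch with made-up logger names, again using an in-memory stream for the output:</p>

```python
import io
import logging

buffer = io.StringIO()

# a handler attached to a parent logger also receives records from its
# children, because records propagate up the dot-separated name hierarchy
parent = logging.getLogger("ml_model")
parent.setLevel(logging.DEBUG)
parent.addHandler(logging.StreamHandler(buffer))

child = logging.getLogger("ml_model.predict")
child.debug("Prediction made.")  # handled by the parent's handler

print(buffer.getvalue())
```

<p>This is why libraries conventionally call logging.getLogger(__name__): every module gets its own logger, while handlers and levels can be configured once on a common ancestor.</p>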
<h3>Logging Environment Variables</h3>
<p>To log extra information that is not available by default within each log record, we have to extend the logging module by creating Filter classes. A Filter is simply a class that filters log records and can also modify them. In our case, the extra information will come from the environment variables of the process in which the logger is running. </p>
<p>To do this we'll create a Filter that is able to pick up information from the environment variables and add it to each log record. </p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span>
<span class="kn">from</span> <span class="nn">logging</span> <span class="kn">import</span> <span class="n">Filter</span>
<span class="k">class</span> <span class="nc">EnvironmentInfoFilter</span><span class="p">(</span><span class="n">Filter</span><span class="p">):</span>
    <span class="sd">"""Logging filter that adds information to log records from environment variables."""</span>

    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">env_variables</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_env_variables</span> <span class="o">=</span> <span class="n">env_variables</span>

    <span class="k">def</span> <span class="nf">filter</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">record</span><span class="p">):</span>
        <span class="k">for</span> <span class="n">env_variable</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_env_variables</span><span class="p">:</span>
            <span class="nb">setattr</span><span class="p">(</span><span class="n">record</span><span class="p">,</span> <span class="n">env_variable</span><span class="o">.</span><span class="n">lower</span><span class="p">(),</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">env_variable</span><span class="p">,</span> <span class="s2">"N/A"</span><span class="p">))</span>
        <span class="k">return</span> <span class="kc">True</span>
</code></pre></div>
<p>To try it out we'll have to add an environment variable that will be logged:</p>
<div class="highlight"><pre><span></span><code><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"NODE_IP"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"198.197.196.195"</span>
</code></pre></div>
<p>Next, we'll instantiate the Filter class and add it to a logger instance to see how it works.</p>
<div class="highlight"><pre><span></span><code><span class="n">environment_info_filter</span> <span class="o">=</span> <span class="n">EnvironmentInfoFilter</span><span class="p">(</span><span class="n">env_variables</span><span class="o">=</span><span class="p">[</span><span class="s2">"NODE_IP"</span><span class="p">])</span>
<span class="n">logger</span><span class="o">.</span><span class="n">addFilter</span><span class="p">(</span><span class="n">environment_info_filter</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">formatter</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">Formatter</span><span class="p">(</span><span class="s1">'</span><span class="si">%(asctime)s</span><span class="s1"> : </span><span class="si">%(name)s</span><span class="s1"> : </span><span class="si">%(levelname)s</span><span class="s1"> : </span><span class="si">%(node_ip)s</span><span class="s1"> : </span><span class="si">%(message)s</span><span class="s1">'</span><span class="p">)</span>
<span class="n">stream_handler</span><span class="o">.</span><span class="n">setFormatter</span><span class="p">(</span><span class="n">formatter</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Warning message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">2023</span><span class="o">-</span><span class="mf">04</span><span class="o">-</span><span class="mf">23</span><span class="w"> </span><span class="mf">21</span><span class="p">:</span><span class="mf">28</span><span class="p">:</span><span class="mf">47</span><span class="p">,</span><span class="mf">910</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">test_logger</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">WARNING</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="mf">198.197.196.195</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">Warning</span><span class="w"> </span><span class="n">message</span><span class="mf">.</span><span class="w"></span>
<span class="mf">2023</span><span class="o">-</span><span class="mf">04</span><span class="o">-</span><span class="mf">23</span><span class="w"> </span><span class="mf">21</span><span class="p">:</span><span class="mf">28</span><span class="p">:</span><span class="mf">47</span><span class="p">,</span><span class="mf">911</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">test_logger</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">DEBUG</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="mf">198.197.196.195</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="n">message</span><span class="mf">.</span><span class="w"></span>
</code></pre></div>
<p>The log record now contains the IP address that we set in the environment variables.</p>
<h3>Logging in JSON</h3>
<p>So far, the logs we've generated have been in a slightly structured format that we came up with, using colons to separate the different sections of each log record. If we want to easily parse the logs to extract information from them, we should instead use JSON records. In this section we'll use the python-json-logger package to format the log records as JSON strings. </p>
<p>First, we'll install the package:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">python</span><span class="o">-</span><span class="n">json</span><span class="o">-</span><span class="n">logger</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll instantiate a JsonFormatter object that will convert the logs to JSON:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pythonjsonlogger</span> <span class="kn">import</span> <span class="n">jsonlogger</span>
<span class="n">json_formatter</span> <span class="o">=</span> <span class="n">jsonlogger</span><span class="o">.</span><span class="n">JsonFormatter</span><span class="p">(</span><span class="s2">"</span><span class="si">%(asctime)s</span><span class="s2"> </span><span class="si">%(name)s</span><span class="s2"> </span><span class="si">%(levelname)s</span><span class="s2"> </span><span class="si">%(node_ip)s</span><span class="s2"> </span><span class="si">%(message)s</span><span class="s2">"</span><span class="p">)</span>
</code></pre></div>
<p>We'll add the formatter to the stream handler that we created above like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">stream_handler</span><span class="o">.</span><span class="n">setFormatter</span><span class="p">(</span><span class="n">json_formatter</span><span class="p">)</span>
</code></pre></div>
<p>Now when we log, the output will be a JSON string:</p>
<div class="highlight"><pre><span></span><code><span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Error message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"asctime": "2023-04-23 21:28:50,037", "name": "test_logger", "levelname": "ERROR", "node_ip": "198.197.196.195", "message": "Error message."}
</code></pre></div>
<p>We can easily add more fields from the log record to make it more comprehensive:</p>
<div class="highlight"><pre><span></span><code><span class="n">json_formatter</span> <span class="o">=</span> <span class="n">jsonlogger</span><span class="o">.</span><span class="n">JsonFormatter</span><span class="p">(</span><span class="s2">"</span><span class="si">%(asctime)s</span><span class="s2"> </span><span class="si">%(node_ip)s</span><span class="s2"> </span><span class="si">%(process)s</span><span class="s2"> </span><span class="si">%(thread)s</span><span class="s2"> </span><span class="si">%(pathname)s</span><span class="s2"> </span><span class="si">%(lineno)s</span><span class="s2"> </span><span class="si">%(levelname)s</span><span class="s2"> </span><span class="si">%(message)s</span><span class="s2">"</span><span class="p">)</span>
<span class="n">stream_handler</span><span class="o">.</span><span class="n">setFormatter</span><span class="p">(</span><span class="n">json_formatter</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Error message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:50,047"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"process"</span><span class="p">:</span><span class="w"> </span><span class="mi">793</span><span class="p">,</span><span class="w"> </span><span class="s2">"thread"</span><span class="p">:</span><span class="w"> </span><span class="mi">140704422703936</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/2505421541.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ERROR"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Error message."</span><span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>Some of these fields were added by the Filter that we built above, other fields are <a href="https://docs.python.org/3/library/logging.html#logrecord-attributes">default fields</a> provided by the Python logging module.</p>
<p>The JSON formatter can also add extra fields and values to the log record by using the "extra" parameter:</p>
<div class="highlight"><pre><span></span><code><span class="n">extra</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"action"</span><span class="p">:</span> <span class="s2">"predict"</span><span class="p">,</span>
<span class="s2">"model_qualified_name"</span><span class="p">:</span> <span class="s2">"model_qualified_name"</span><span class="p">,</span>
<span class="s2">"model_version"</span><span class="p">:</span> <span class="s2">"model_version"</span><span class="p">,</span>
<span class="s2">"status"</span><span class="p">:</span><span class="s2">"error"</span><span class="p">,</span>
<span class="s2">"error_info"</span><span class="p">:</span> <span class="s2">"error_info"</span>
<span class="p">}</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"message"</span><span class="p">,</span> <span class="n">extra</span><span class="o">=</span><span class="n">extra</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:50,057"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"process"</span><span class="p">:</span><span class="w"> </span><span class="mi">793</span><span class="p">,</span><span class="w"> </span><span class="s2">"thread"</span><span class="p">:</span><span class="w"> </span><span class="mi">140704422703936</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/1433050719.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">9</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ERROR"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"message"</span><span class="p">,</span><span class="w"> </span><span class="s2">"action"</span><span class="p">:</span><span class="w"> </span><span class="s2">"predict"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">,</span><span class="w"> </span><span 
class="s2">"status"</span><span class="p">:</span><span class="w"> </span><span class="s2">"error"</span><span class="p">,</span><span class="w"> </span><span class="s2">"error_info"</span><span class="p">:</span><span class="w"> </span><span class="s2">"error_info"</span><span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>The extra fields are:</p>
<ul>
<li>action: the method called on the MLModel instance</li>
<li>model_qualified_name: the qualified name of the model</li>
<li>model_version: the version of the model</li>
<li>status: whether the action succeeded or not, can be "success" or "error"</li>
<li>error_info: extra error information, only present if an error occurred</li>
</ul>
<p>This information would normally be included in the "message" field of the log record as unstructured text, but by breaking it out and putting it into individual fields in the JSON log record we'll be able to parse it later.</p>
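<p>To see how simple the parsing becomes, here is a small sketch that loads a JSON log line (abbreviated from the record shown above) back into a Python dictionary:</p>

```python
import json

# a JSON-formatted log line parses directly into a dictionary, so any
# field can be read without writing a custom text parser
log_line = (
    '{"asctime": "2023-04-23 21:28:50,057", "levelname": "ERROR", '
    '"message": "message", "action": "predict", "status": "error"}'
)
record = json.loads(log_line)
print(record["action"], record["status"])
```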
<h3>Putting It All Together</h3>
<p>We've configured several pieces of the logging module individually; now we'll combine them into a single configuration that sets up the logger the way we want it.</p>
<p>The logging.config.dictConfig() function accepts the configuration for the loggers, formatters, handlers, and filters and sets them all up with a single function call.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">logging.config</span>
<span class="n">logging_config</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"version"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
<span class="s2">"disable_existing_loggers"</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span>
<span class="s2">"loggers"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"root"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"level"</span><span class="p">:</span> <span class="s2">"INFO"</span><span class="p">,</span>
<span class="s2">"handlers"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"stdout"</span><span class="p">],</span>
<span class="s2">"propagate"</span><span class="p">:</span> <span class="kc">False</span>
<span class="p">}</span>
<span class="p">},</span>
<span class="s2">"filters"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"environment_info_filter"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"()"</span><span class="p">:</span> <span class="s2">"__main__.EnvironmentInfoFilter"</span><span class="p">,</span>
<span class="s2">"env_variables"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"NODE_IP"</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">},</span>
<span class="s2">"formatters"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"json_formatter"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"class"</span><span class="p">:</span> <span class="s2">"pythonjsonlogger.jsonlogger.JsonFormatter"</span><span class="p">,</span>
<span class="s2">"format"</span><span class="p">:</span> <span class="s2">"</span><span class="si">%(asctime)s</span><span class="s2"> </span><span class="si">%(node_ip)s</span><span class="s2"> </span><span class="si">%(name)s</span><span class="s2"> </span><span class="si">%(pathname)s</span><span class="s2"> </span><span class="si">%(lineno)s</span><span class="s2"> </span><span class="si">%(levelname)s</span><span class="s2"> </span><span class="si">%(message)s</span><span class="s2">"</span>
<span class="p">}</span>
<span class="p">},</span>
<span class="s2">"handlers"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"stdout"</span><span class="p">:{</span>
<span class="s2">"level"</span><span class="p">:</span><span class="s2">"INFO"</span><span class="p">,</span>
<span class="s2">"class"</span><span class="p">:</span><span class="s2">"logging.StreamHandler"</span><span class="p">,</span>
<span class="s2">"stream"</span><span class="p">:</span> <span class="s2">"ext://sys.stdout"</span><span class="p">,</span>
<span class="s2">"formatter"</span><span class="p">:</span> <span class="s2">"json_formatter"</span><span class="p">,</span>
<span class="s2">"filters"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"environment_info_filter"</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">logging</span><span class="o">.</span><span class="n">config</span><span class="o">.</span><span class="n">dictConfig</span><span class="p">(</span><span class="n">logging_config</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">()</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">"Debug message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Info message."</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Error message."</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:50,074"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"root"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/4067465749.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">4</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"INFO"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Info message."</span><span class="p">}</span><span class="w"></span>
<span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:50,076"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"root"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/4067465749.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ERROR"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Error message."</span><span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>The logger behaved in the same way as when we created it programmatically.</p>
<h2>Installing a Model</h2>
<p>We won't be training an ML model from scratch in this blog post because it would take up too much space. Instead, we'll reuse a model that we built in a <a href="https://www.tekhnoal.com/health-checks-for-ml-model-deployments.html">previous blog post</a>. The model's code is hosted in <a href="https://github.com/schmidtbri/health-checks-for-ml-model-deployments">this GitHub repository</a>. The model is used to predict credit risk.</p>
<p>The model itself can be installed as a normal Python package, using the pip command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">health</span><span class="o">-</span><span class="n">checks</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ml</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployments</span><span class="c1">#egg=credit_risk_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Making a prediction with the model is done through the CreditRiskModel class, which we'll import like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">credit_risk_model.prediction.model</span> <span class="kn">import</span> <span class="n">CreditRiskModel</span>
</code></pre></div>
<p>Now we'll instantiate the model class in order to make a prediction.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">CreditRiskModel</span><span class="p">()</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>In order to make a prediction with the model instance, we'll need to instantiate the input:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">credit_risk_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">CreditRiskModelInput</span><span class="p">,</span> <span class="n">EmploymentLength</span><span class="p">,</span> <span class="n">HomeOwnership</span><span class="p">,</span> \
<span class="n">LoanPurpose</span><span class="p">,</span> <span class="n">Term</span><span class="p">,</span> <span class="n">VerificationStatus</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">CreditRiskModelInput</span><span class="p">(</span>
<span class="n">annual_income</span><span class="o">=</span><span class="mi">273000</span><span class="p">,</span>
<span class="n">collections_in_last_12_months</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">delinquencies_in_last_2_years</span><span class="o">=</span><span class="mi">39</span><span class="p">,</span>
<span class="n">debt_to_income_ratio</span><span class="o">=</span><span class="mf">42.64</span><span class="p">,</span>
<span class="n">employment_length</span><span class="o">=</span><span class="n">EmploymentLength</span><span class="o">.</span><span class="n">less_than_1_year</span><span class="p">,</span>
<span class="n">home_ownership</span><span class="o">=</span><span class="n">HomeOwnership</span><span class="o">.</span><span class="n">MORTGAGE</span><span class="p">,</span>
<span class="n">number_of_delinquent_accounts</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span>
<span class="n">interest_rate</span><span class="o">=</span><span class="mf">28.99</span><span class="p">,</span>
<span class="n">last_payment_amount</span><span class="o">=</span><span class="mf">36475.59</span><span class="p">,</span>
<span class="n">loan_amount</span><span class="o">=</span><span class="mi">35000</span><span class="p">,</span>
<span class="n">derogatory_public_record_count</span><span class="o">=</span><span class="mi">86</span><span class="p">,</span>
<span class="n">loan_purpose</span><span class="o">=</span><span class="n">LoanPurpose</span><span class="o">.</span><span class="n">debt_consolidation</span><span class="p">,</span>
<span class="n">revolving_line_utilization_rate</span><span class="o">=</span><span class="mf">892.3</span><span class="p">,</span>
<span class="n">term</span><span class="o">=</span><span class="n">Term</span><span class="o">.</span><span class="n">thirty_six_months</span><span class="p">,</span>
<span class="n">total_payments_to_date</span><span class="o">=</span><span class="mf">57777.58</span><span class="p">,</span>
<span class="n">verification_status</span><span class="o">=</span><span class="n">VerificationStatus</span><span class="o">.</span><span class="n">source_verified</span>
<span class="p">)</span>
</code></pre></div>
<p>The model's input schema is called CreditRiskModelInput and it holds all of the features required by the model to make a prediction.</p>
<p>Now we can make a prediction with the model by calling the predict() method with an instance of the CreditRiskModelInput class.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CreditRiskModelOutput(credit_risk=<CreditRisk.safe: 'safe'>)
</code></pre></div>
<p>The model predicts that the client's risk is safe.</p>
<p>The output is also provided as an object, and because the model is a classification model, the output is an Enum. We can view the schema of the model output by requesting the JSON schema from the object:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'CreditRiskModelOutput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Credit risk model output schema.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'credit_risk'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether or not the loan is risky.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/CreditRisk'</span><span class="p">}]}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'required'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'credit_risk'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'CreditRisk'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'CreditRisk'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Indicates if loan is risky.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'safe'</span><span class="p">,</span><span class="w"> </span><span class="s1">'risky'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>The two possible outputs of the model are "safe" and "risky".</p>
<h2>Creating the Logging Decorator</h2>
<p>As you saw above, the model did not produce any logs. To emit logs about the model's activity, we'll create a decorator that performs logging around an MLModel instance. </p>
<p>In order to build an MLModel decorator class, we'll need to inherit from the MLModelDecorator class and add some functionality.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>
<span class="kn">import</span> <span class="nn">logging</span>
<span class="kn">from</span> <span class="nn">ml_base.decorator</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
<span class="kn">from</span> <span class="nn">ml_base.ml_model</span> <span class="kn">import</span> <span class="n">MLModelSchemaValidationException</span>
<span class="k">class</span> <span class="nc">LoggingDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="sd">"""Decorator to do logging around an MLModel instance."""</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">input_fields</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">output_fields</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">input_fields</span><span class="o">=</span><span class="n">input_fields</span><span class="p">,</span> <span class="n">output_fields</span><span class="o">=</span><span class="n">output_fields</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="s2">"</span><span class="si">{}</span><span class="s2">_</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span> <span class="s2">"logger"</span><span class="p">))</span>
<span class="c1"># extra fields to be added to the log record</span>
<span class="n">extra</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"action"</span><span class="p">:</span> <span class="s2">"predict"</span><span class="p">,</span>
<span class="s2">"model_qualified_name"</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
<span class="s2">"model_version"</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span>
<span class="p">}</span>
<span class="c1"># adding model input fields to the extra fields to be logged</span>
<span class="n">new_extra</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">extra</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"input_fields"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">for</span> <span class="n">input_field</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"input_fields"</span><span class="p">]:</span>
<span class="n">new_extra</span><span class="p">[</span><span class="n">input_field</span><span class="p">]</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">input_field</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Prediction requested."</span><span class="p">,</span> <span class="n">extra</span><span class="o">=</span><span class="n">new_extra</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">extra</span><span class="p">[</span><span class="s2">"status"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"success"</span>
<span class="c1"># adding model output fields to the extra fields to be logged</span>
<span class="n">new_extra</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">extra</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"output_fields"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">for</span> <span class="n">output_field</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"output_fields"</span><span class="p">]:</span>
<span class="n">new_extra</span><span class="p">[</span><span class="n">output_field</span><span class="p">]</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">prediction</span><span class="p">,</span> <span class="n">output_field</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Prediction created."</span><span class="p">,</span> <span class="n">extra</span><span class="o">=</span><span class="n">new_extra</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">extra</span><span class="p">[</span><span class="s2">"status"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"error"</span>
<span class="n">extra</span><span class="p">[</span><span class="s2">"error_info"</span><span class="p">]</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_logger"</span><span class="p">]</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Prediction exception."</span><span class="p">,</span> <span class="n">extra</span><span class="o">=</span><span class="n">extra</span><span class="p">)</span>
<span class="k">raise</span> <span class="n">e</span>
</code></pre></div>
<p>The LoggingDecorator class keeps most of its logic in the predict() method. The method lazily instantiates a logger object and emits a message before the prediction is made, after it completes, and when an exception is raised. Notice that the exception information is logged, but the exception is immediately re-raised; we don't want to prevent the exception from being handled by whatever code is using the model, we only need to record the event.</p>
<p>The decorator also adds a few fields to the log message:</p>
<ul>
<li>action: the action that the model is performing, in this case "predict"</li>
<li>model_qualified_name: the qualified name of the model performing the action</li>
<li>model_version: the version of the model performing the action</li>
<li>status: the result of the action, can be either "success" or "error"</li>
<li>error_info: an optional field that adds error information when an exception is raised</li>
</ul>
<p>These fields are added on top of all the regular fields that the logging package provides. The extra information should allow us to easily filter logs later.</p>
<h2>Decorating the Model</h2>
<p>To test out the decorator we’ll first instantiate the model object that we want to use with the decorator.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">CreditRiskModel</span><span class="p">()</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Next, we’ll instantiate the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">logging_decorator</span> <span class="o">=</span> <span class="n">LoggingDecorator</span><span class="p">()</span>
</code></pre></div>
<p>We can add the model instance to the decorator after it’s been instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span> <span class="o">=</span> <span class="n">logging_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>We can see the decorator and the model objects by printing the reference to the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>LoggingDecorator(CreditRiskModel)
</code></pre></div>
<p>The decorator object is printing out its own type along with the type of the model that it is decorating.</p>
<p>Now we can try out the logging decorator by making a prediction:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:57,431"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model_logger"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/3804123212.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">33</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"INFO"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Prediction requested."</span><span class="p">,</span><span class="w"> </span><span class="s2">"action"</span><span class="p">:</span><span class="w"> </span><span class="s2">"predict"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0.1.0"</span><span class="p">}</span><span class="w"></span>
<span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:57,452"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model_logger"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/3804123212.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">44</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"INFO"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Prediction created."</span><span class="p">,</span><span class="w"> </span><span class="s2">"action"</span><span class="p">:</span><span class="w"> </span><span class="s2">"predict"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0.1.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"status"</span><span class="p">:</span><span class="w"> </span><span class="s2">"success"</span><span class="p">}</span><span class="w"></span>
<span class="n">CreditRiskModelOutput</span><span class="p">(</span><span class="n">credit_risk</span><span class="o">=<</span><span class="n">CreditRisk</span><span class="o">.</span><span class="n">safe</span><span class="p">:</span><span class="w"> </span><span class="s1">'safe'</span><span class="o">></span><span class="p">)</span><span class="w"></span>
</code></pre></div>
<p>Calling the predict() method on the decorated model now emits two log messages. The first, "Prediction requested.", is emitted before the model's predict method is called. The second, "Prediction created.", is emitted after the model returns the prediction to the decorator. The decorator can also log any exceptions raised by the model.</p>
<p>The logging decorator is also able to grab fields from the model's input and output and log those alongside the other fields. Here is how to configure the logging decorator to do this:</p>
<div class="highlight"><pre><span></span><code><span class="n">logging_decorator</span> <span class="o">=</span> <span class="n">LoggingDecorator</span><span class="p">(</span><span class="n">input_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"collections_in_last_12_months"</span><span class="p">,</span> <span class="s2">"debt_to_income_ratio"</span><span class="p">],</span>
<span class="n">output_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"credit_risk"</span><span class="p">])</span>
<span class="n">decorated_model</span> <span class="o">=</span> <span class="n">logging_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:57,461"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model_logger"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/3804123212.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">33</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"INFO"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Prediction requested."</span><span class="p">,</span><span class="w"> </span><span class="s2">"action"</span><span class="p">:</span><span class="w"> </span><span class="s2">"predict"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0.1.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"collections_in_last_12_months"</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span><span class="w"> 
</span><span class="s2">"debt_to_income_ratio"</span><span class="p">:</span><span class="w"> </span><span class="mf">42.64</span><span class="p">}</span><span class="w"></span>
<span class="p">{</span><span class="s2">"asctime"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-04-23 21:28:57,480"</span><span class="p">,</span><span class="w"> </span><span class="s2">"node_ip"</span><span class="p">:</span><span class="w"> </span><span class="s2">"198.197.196.195"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model_logger"</span><span class="p">,</span><span class="w"> </span><span class="s2">"pathname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/var/folders/vb/ym0r3p412kg598rdky_lb5_w0000gn/T/ipykernel_793/3804123212.py"</span><span class="p">,</span><span class="w"> </span><span class="s2">"lineno"</span><span class="p">:</span><span class="w"> </span><span class="mi">44</span><span class="p">,</span><span class="w"> </span><span class="s2">"levelname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"INFO"</span><span class="p">,</span><span class="w"> </span><span class="s2">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Prediction created."</span><span class="p">,</span><span class="w"> </span><span class="s2">"action"</span><span class="p">:</span><span class="w"> </span><span class="s2">"predict"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_qualified_name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"credit_risk_model"</span><span class="p">,</span><span class="w"> </span><span class="s2">"model_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0.1.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"status"</span><span class="p">:</span><span class="w"> </span><span class="s2">"success"</span><span class="p">,</span><span class="w"> </span><span class="s2">"credit_risk"</span><span 
class="p">:</span><span class="w"> </span><span class="s2">"safe"</span><span class="p">}</span><span class="w"></span>
<span class="n">CreditRiskModelOutput</span><span class="p">(</span><span class="n">credit_risk</span><span class="o">=<</span><span class="n">CreditRisk</span><span class="o">.</span><span class="n">safe</span><span class="p">:</span><span class="w"> </span><span class="s1">'safe'</span><span class="o">></span><span class="p">)</span><span class="w"></span>
</code></pre></div>
<p>The "Prediction requested." log message now has two extra fields, the "collections_in_last_12_months" field and the "debt_to_income_ratio" field which were directly copied from the model input. The "Prediction created." log message also has the "credit_risk" field, which is the prediction returned by the model.</p>
<p>We now have a working logging decorator that can add logging around a model that does not log for itself.</p>
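<p>The core of such a decorator can be sketched in a few lines. This is an illustrative simplification, not the actual LoggingDecorator implementation from the package; it assumes the model exposes <code>qualified_name</code>, <code>version</code>, and <code>predict()</code> attributes, as in the examples above.</p>

```python
import logging


class SimpleLoggingDecorator:
    """Illustrative sketch of a logging decorator for an ML model."""

    def __init__(self, input_fields=None, output_fields=None):
        self._input_fields = input_fields or []
        self._output_fields = output_fields or []
        self._model = None
        self._logger = None

    def set_model(self, model):
        self._model = model
        self._logger = logging.getLogger("{}_logger".format(model.qualified_name))
        return self

    def predict(self, data):
        extra = {
            "action": "predict",
            "model_qualified_name": self._model.qualified_name,
            "model_version": self._model.version,
        }
        # copy the configured fields out of the model input
        extra.update({f: getattr(data, f) for f in self._input_fields})
        self._logger.info("Prediction requested.", extra=extra)
        try:
            prediction = self._model.predict(data)
        except Exception:
            # exceptions raised by the model are logged and re-raised
            self._logger.exception("Prediction failed.",
                                   extra={**extra, "status": "error"})
            raise
        # copy the configured fields out of the model output
        extra.update({f: getattr(prediction, f) for f in self._output_fields})
        self._logger.info("Prediction created.", extra={**extra, "status": "success"})
        return prediction
```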
<h2>Adding the Decorator to a Deployed Model</h2>
<p>Now that we have a decorator that works locally, we can deploy it with a model inside of a service. The <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> is able to host ML models and create a RESTful API for each individual model. We don't need to write any code to do this, because the service can wrap the models that it hosts with decorators that we provide. You can learn more about the package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>, and about configuring it to add decorators to a model in <a href="https://www.tekhnoal.com/ml-model-decorators.html">this blog post</a>.</p>
<p>To install the service package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="s2">&quot;rest_model_service&gt;=0.3.0&quot;</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The configuration for our model and decorator looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Credit Risk Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">credit_risk_model.prediction.model.CreditRiskModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ml_model_logging.logging_decorator.LoggingDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">input_fields</span><span class="p">:</span><span class="w"> </span><span class="p p-Indicator">[</span><span class="s">"collections_in_last_12_months"</span><span class="p p-Indicator">,</span><span class="w"> </span><span class="s">"debt_to_income_ratio"</span><span class="p p-Indicator">]</span><span class="w"></span>
<span class="w"> </span><span class="nt">output_fields</span><span class="p">:</span><span class="w"> </span><span class="p p-Indicator">[</span><span class="s">"credit_risk"</span><span class="p p-Indicator">]</span><span class="w"></span>
<span class="nt">logging</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="w"> </span><span class="nt">disable_existing_loggers</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">false</span><span class="w"></span>
<span class="w"> </span><span class="nt">formatters</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">json_formatter</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">class</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">pythonjsonlogger.jsonlogger.JsonFormatter</span><span class="w"></span>
<span class="w"> </span><span class="nt">format</span><span class="p">:</span><span class="w"> </span><span class="s">"%(asctime)s</span><span class="nv"> </span><span class="s">%(node_ip)s</span><span class="nv"> </span><span class="s">%(name)s</span><span class="nv"> </span><span class="s">%(levelname)s</span><span class="nv"> </span><span class="s">%(message)s"</span><span class="w"></span>
<span class="w"> </span><span class="nt">filters</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">environment_info_filter</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="s">"()"</span><span class="p p-Indicator">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ml_model_logging.filters.EnvironmentInfoFilter</span><span class="w"></span>
<span class="w"> </span><span class="nt">env_variables</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">NODE_IP</span><span class="w"></span>
<span class="w"> </span><span class="nt">handlers</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">stdout</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">level</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">INFO</span><span class="w"></span>
<span class="w"> </span><span class="nt">class</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">logging.StreamHandler</span><span class="w"></span>
<span class="w"> </span><span class="nt">stream</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ext://sys.stdout</span><span class="w"></span>
<span class="w"> </span><span class="nt">formatter</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">json_formatter</span><span class="w"></span>
<span class="w"> </span><span class="nt">filters</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">environment_info_filter</span><span class="w"></span>
<span class="w"> </span><span class="nt">loggers</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">root</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">level</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">INFO</span><span class="w"></span>
<span class="w"> </span><span class="nt">handlers</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">stdout</span><span class="w"></span>
<span class="w"> </span><span class="nt">propagate</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>The two main sections in the file are the "models" section and the "logging" section. The models section lists the CreditRiskModel along with the LoggingDecorator and its configuration; when the service starts up, it adds an instance of the LoggingDecorator to the CreditRiskModel.</p>
<p>The logging configuration is set up exactly like we set it up in the examples above except that it is in YAML format. The YAML is converted to a dictionary and passed directly into the logging.config.dictConfig() function.</p>
<p>To run the service locally, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">NODE_IP</span><span class="o">=</span><span class="m">123</span>.123.123.123
<span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/rest_configuration.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The NODE_IP environment variable is set so that the value can be added to the log messages through the filter we built above. The service should come up and can be accessed in a web browser at http://127.0.0.1:8000. When you access that URL you will be redirected to the documentation page that is generated by the FastAPI package:</p>
<p><img alt="Service Documentation" src="https://www.tekhnoal.com/service_documentation_lfmlm.png" width="100%"></p>
<p>The documentation allows you to make requests against the API in order to try it out. Here's a prediction request against the credit risk model:</p>
<p><img alt="Prediction Request" src="https://www.tekhnoal.com/prediction_request_lfmlm.png" width="100%"></p>
<p>And the prediction result:</p>
<p><img alt="Prediction Response" src="https://www.tekhnoal.com/prediction_response_lfmlm.png" width="100%"></p>
<p>The prediction made by the model had to go through the logging decorator that we configured into the service, so we got these two log records from the process:</p>
<p><img alt="Prediction Log" src="https://www.tekhnoal.com/prediction_log_lfmlm.png" width="100%"></p>
<p>The local web service process emits the logs to stdout just as we configured it.</p>
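<p>The node IP value reaches the log records through the environment-info filter that the logging configuration references. A minimal version of such a filter (a sketch of the one built earlier in the post) subclasses logging.Filter and copies environment variables onto each record:</p>

```python
import logging
import os


class EnvironmentInfoFilter(logging.Filter):
    """Sketch of a filter that adds environment variables to log records.

    Each configured variable name becomes a lower-cased attribute on the
    record, so NODE_IP appears as the "node_ip" field in the output.
    """

    def __init__(self, env_variables=None):
        super().__init__()
        self._env_variables = env_variables or []

    def filter(self, record):
        for name in self._env_variables:
            setattr(record, name.lower(), os.environ.get(name, ""))
        return True  # never drop the record, only annotate it
```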
<h2>Deploying the Model Service</h2>
<p>Now that we have a working service that is running locally, we can work on deploying it to Kubernetes.</p>
<h3>Creating a Docker Image</h3>
<p>Kubernetes needs a Docker image in order to deploy anything, so we'll build an image using this Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="c"># syntax=docker/dockerfile:1</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="s">base</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/dependencies</span>
<span class="c"># installing git because we need to install the model package from the github repository</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update -y <span class="o">&&</span> <span class="se">\</span>
apt-get install -y --no-install-recommends git
<span class="c"># creating and activating a virtual environment</span>
<span class="k">ENV</span><span class="w"> </span><span class="nv">VIRTUAL_ENV</span><span class="o">=</span>/opt/venv
<span class="k">RUN</span><span class="w"> </span>python3 -m venv <span class="nv">$VIRTUAL_ENV</span>
<span class="k">ENV</span><span class="w"> </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$VIRTUAL_ENV</span><span class="s2">/bin:</span><span class="nv">$PATH</span><span class="s2">"</span>
<span class="c"># installing dependencies</span>
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install --no-cache -r service_requirements.txt
<span class="k">FROM</span><span class="w"> </span><span class="s">base</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="s">runtime</span>
<span class="k">ARG</span><span class="w"> </span>DATE_CREATED
<span class="k">ARG</span><span class="w"> </span>REVISION
<span class="k">ARG</span><span class="w"> </span>VERSION
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Logging for ML Models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Logging for machine learning models."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$DATE_CREATED</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/logging-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="nv">$VERSION</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.revision<span class="o">=</span><span class="nv">$REVISION</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/service</span>
<span class="c"># install packages</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update -y <span class="o">&&</span> <span class="se">\</span>
apt-get install -y --no-install-recommends libgomp1 <span class="o">&&</span> <span class="se">\</span>
apt-get clean <span class="o">&&</span> <span class="se">\</span>
rm -rf /var/lib/apt/lists/*
<span class="k">COPY</span><span class="w"> </span>--from<span class="o">=</span>base /opt/venv ./venv
<span class="k">COPY</span><span class="w"> </span>./ml_model_logging ./ml_model_logging
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">ENV</span><span class="w"> </span>PATH /service/venv/bin:<span class="nv">$PATH</span>
<span class="k">ENV</span><span class="w"> </span><span class="nv">PYTHONPATH</span><span class="o">=</span><span class="s2">"</span><span class="si">${</span><span class="nv">PYTHONPATH</span><span class="si">}</span><span class="s2">:/service"</span>
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
</code></pre></div>
<p>The Dockerfile includes a set of labels from the <a href="https://github.com/opencontainers/image-spec/blob/main/annotations.md">Open Containers annotations specification</a>. Most of the labels are hardcoded in the Dockerfile, but there are three that we need to add from the outside: the date created, the version, and the revision. To do this we'll pull some information into environment variables:</p>
<div class="highlight"><pre><span></span><code><span class="n">DATE_CREATED</span><span class="o">=</span><span class="err">!</span><span class="n">date</span> <span class="o">+</span><span class="s2">"%Y-%m-</span><span class="si">%d</span><span class="s2"> %T"</span>
<span class="n">REVISION</span><span class="o">=</span><span class="err">!</span><span class="n">git</span> <span class="n">rev</span><span class="o">-</span><span class="n">parse</span> <span class="n">HEAD</span>
<span class="err">!</span><span class="n">echo</span> <span class="s2">"$DATE_CREATED"</span>
<span class="err">!</span><span class="n">echo</span> <span class="s2">"$REVISION"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="k">['2023-04-23 21:30:31']</span><span class="w"></span>
<span class="k">['88a78deb3ed38e5bff5f0633fa4a4bf6202b868f']</span><span class="w"></span>
</code></pre></div>
<p>Now we can use the values to build the image. We'll also provide the version as a build argument.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> \
<span class="o">--</span><span class="n">build</span><span class="o">-</span><span class="n">arg</span> <span class="n">DATE_CREATED</span><span class="o">=</span><span class="s2">"$DATE_CREATED"</span> \
<span class="o">--</span><span class="n">build</span><span class="o">-</span><span class="n">arg</span> <span class="n">VERSION</span><span class="o">=</span><span class="s2">"0.1.0"</span> \
<span class="o">--</span><span class="n">build</span><span class="o">-</span><span class="n">arg</span> <span class="n">REVISION</span><span class="o">=</span><span class="s2">"$REVISION"</span> \
<span class="o">-</span><span class="n">t</span> <span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="o">..</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To find the image we just built, we'll search through the local docker images:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">images</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>credit_risk_model_service 0.1.0 10985e3d96bd 9 seconds ago 922MB
</code></pre></div>
<p>Next, we'll start the image to see if everything is working as expected.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">rest_configuration</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">NODE_IP</span><span class="o">=</span><span class="s2">"123.123.123.123"</span> \
<span class="o">-</span><span class="n">v</span> <span class="err">$</span><span class="p">(</span><span class="n">pwd</span><span class="p">)</span><span class="o">/../</span><span class="n">configuration</span><span class="p">:</span><span class="o">/</span><span class="n">service</span><span class="o">/</span><span class="n">configuration</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">credit_risk_model_service</span> \
<span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">265</span><span class="n">c3f15cae7c9b9788f0c1c96c66dcf28e7bba7b48f002671dc674cf1982f19</span><span class="w"></span>
</code></pre></div>
<p>Notice that we added an environment variable called NODE_IP. This is just so we have a value to pull into the logs later; it's not the real node IP address.</p>
<p>The service is up and running in the docker container. To view the logs coming out of the process, we'll use the docker logs command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">logs</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s2">"</span><span class="s">asctime</span><span class="s2">"</span>: <span class="s2">"</span><span class="s">2023-04-24 01:31:03,901</span><span class="s2">"</span>, <span class="s2">"</span><span class="s">node_ip</span><span class="s2">"</span>: <span class="s2">"</span><span class="s">123.123.123.123</span><span class="s2">"</span>, <span class="s2">"</span><span class="s">name</span><span class="s2">"</span>: <span class="s2">"</span><span class="s">rest_model_service.helpers</span><span class="s2">"</span>, <span class="s2">"</span><span class="s">levelname</span><span class="s2">"</span>: <span class="s2">"</span><span class="s">INFO</span><span class="s2">"</span>, <span class="s2">"</span><span class="s">message</span><span class="s2">"</span>: <span class="s2">"</span><span class="s">Creating FastAPI app for: 'Credit Risk Model Service'.</span><span class="s2">"</span>}
<span class="nv">INFO</span>: <span class="nv">Started</span> <span class="nv">server</span> <span class="nv">process</span> [<span class="mi">1</span>]
<span class="nv">INFO</span>: <span class="nv">Waiting</span> <span class="k">for</span> <span class="nv">application</span> <span class="nv">startup</span>.
<span class="nv">INFO</span>: <span class="nv">Application</span> <span class="nv">startup</span> <span class="nv">complete</span>.
<span class="nv">INFO</span>: <span class="nv">Uvicorn</span> <span class="nv">running</span> <span class="nv">on</span> <span class="nv">http</span>:<span class="o">//</span><span class="mi">0</span>.<span class="mi">0</span>.<span class="mi">0</span>.<span class="mi">0</span>:<span class="mi">8000</span> <span class="ss">(</span><span class="nv">Press</span> <span class="nv">CTRL</span><span class="o">+</span><span class="nv">C</span> <span class="nv">to</span> <span class="nv">quit</span><span class="ss">)</span>
</code></pre></div>
<p>As we expected, most of the logs are coming out in JSON format, although some are not. The non-JSON logs are emitted by logger objects that were initialized before the rest_model_service package had a chance to configure logging.</p>
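<p>One benefit of JSON-formatted logs is that downstream tools can parse each line into structured fields instead of scraping text with regular expressions. For example (a sketch using an abbreviated log line like the ones above):</p>

```python
import json

# an abbreviated log line like the ones emitted by the service
line = ('{"asctime": "2023-04-24 01:31:03,901", "node_ip": "123.123.123.123", '
        '"levelname": "INFO", "message": "Prediction created.", '
        '"model_qualified_name": "credit_risk_model", "status": "success"}')

record = json.loads(line)
# structured fields can be queried directly
if record["levelname"] == "INFO" and record.get("status") == "success":
    print(record["model_qualified_name"], record["message"])
```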
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction using the curl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/credit_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{ </span><span class="se">\</span>
<span class="s1"> "annual_income": 273000, </span><span class="se">\</span>
<span class="s1"> "collections_in_last_12_months": 20, </span><span class="se">\</span>
<span class="s1"> "delinquencies_in_last_2_years": 39, </span><span class="se">\</span>
<span class="s1"> "debt_to_income_ratio": 42.64, </span><span class="se">\</span>
<span class="s1"> "employment_length": "< 1 year", </span><span class="se">\</span>
<span class="s1"> "home_ownership": "MORTGAGE", </span><span class="se">\</span>
<span class="s1"> "number_of_delinquent_accounts": 6, </span><span class="se">\</span>
<span class="s1"> "interest_rate": 28.99, </span><span class="se">\</span>
<span class="s1"> "last_payment_amount": 36475.59, </span><span class="se">\</span>
<span class="s1"> "loan_amount": 35000, </span><span class="se">\</span>
<span class="s1"> "derogatory_public_record_count": 86, </span><span class="se">\</span>
<span class="s1"> "loan_purpose": "debt_consolidation", </span><span class="se">\</span>
<span class="s1"> "revolving_line_utilization_rate": 892.3, </span><span class="se">\</span>
<span class="s1"> "term": " 36 months", </span><span class="se">\</span>
<span class="s1"> "total_payments_to_date": 57777.58, </span><span class="se">\</span>
<span class="s1"> "verification_status": "Source Verified" </span><span class="se">\</span>
<span class="s1">}'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"credit_risk":"safe"}
</code></pre></div>
<p>We're done with the docker container, so we'll stop it and remove it.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">credit_risk_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>credit_risk_model_service
credit_risk_model_service
</code></pre></div>
<h2>Creating a Kubernetes Cluster</h2>
<p>To show the system in action, we’ll deploy the model service to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span> <span class="o">--</span><span class="n">memory</span> <span class="mi">4196</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>😄 <span class="nv">minikube</span> <span class="nv">v1</span>.<span class="mi">30</span>.<span class="mi">1</span> <span class="nv">on</span> <span class="nv">Darwin</span> <span class="mi">13</span>.<span class="mi">3</span>.<span class="mi">1</span>
✨ <span class="nv">Using</span> <span class="nv">the</span> <span class="nv">docker</span> <span class="nv">driver</span> <span class="nv">based</span> <span class="nv">on</span> <span class="nv">existing</span> <span class="nv">profile</span>
👍 <span class="nv">Starting</span> <span class="nv">control</span> <span class="nv">plane</span> <span class="nv">node</span> <span class="nv">minikube</span> <span class="nv">in</span> <span class="nv">cluster</span> <span class="nv">minikube</span>
🚜 <span class="nv">Pulling</span> <span class="nv">base</span> <span class="nv">image</span> ...
🔄 <span class="nv">Restarting</span> <span class="nv">existing</span> <span class="nv">docker</span> <span class="nv">container</span> <span class="k">for</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> ...
🐳 <span class="nv">Preparing</span> <span class="nv">Kubernetes</span> <span class="nv">v1</span>.<span class="mi">26</span>.<span class="mi">3</span> <span class="nv">on</span> <span class="nv">Docker</span> <span class="mi">23</span>.<span class="mi">0</span>.<span class="mi">2</span> ...
🔗 <span class="nv">Configuring</span> <span class="nv">bridge</span> <span class="nv">CNI</span> <span class="ss">(</span><span class="nv">Container</span> <span class="nv">Networking</span> <span class="nv">Interface</span><span class="ss">)</span> ...
🔎 <span class="nv">Verifying</span> <span class="nv">Kubernetes</span> <span class="nv">components</span>...
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">k8s</span><span class="o">-</span><span class="nv">minikube</span><span class="o">/</span><span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>:<span class="nv">v5</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">docker</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">dashboard</span>:<span class="nv">v2</span>.<span class="mi">7</span>.<span class="mi">0</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">docker</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">scraper</span>:<span class="nv">v1</span>.<span class="mi">0</span>.<span class="mi">8</span>
💡 <span class="nv">Some</span> <span class="nv">dashboard</span> <span class="nv">features</span> <span class="nv">require</span> <span class="nv">the</span> <span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span> <span class="nv">addon</span>. <span class="nv">To</span> <span class="nv">enable</span> <span class="nv">all</span> <span class="nv">features</span> <span class="nv">please</span> <span class="nv">run</span>:
<span class="nv">minikube</span> <span class="nv">addons</span> <span class="nv">enable</span> <span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span>
🌟 <span class="nv">Enabled</span> <span class="nv">addons</span>: <span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>, <span class="nv">default</span><span class="o">-</span><span class="nv">storageclass</span>, <span class="nv">dashboard</span>
🏄 <span class="nv">Done</span><span class="o">!</span> <span class="nv">kubectl</span> <span class="nv">is</span> <span class="nv">now</span> <span class="nv">configured</span> <span class="nv">to</span> <span class="nv">use</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> <span class="nv">cluster</span> <span class="nv">and</span> <span class="s2">"</span><span class="s">default</span><span class="s2">"</span> <span class="nv">namespace</span> <span class="nv">by</span> <span class="nv">default</span>
</code></pre></div>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect to it using the kubectl command.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE              NAME                                        READY   STATUS    RESTARTS       AGE
kube-system            coredns-787d4945fb-48bzx                    1/1     Running   2 (3d5h ago)   4d23h
kube-system            etcd-minikube                               1/1     Running   2 (3d5h ago)   4d23h
kube-system            kube-apiserver-minikube                     1/1     Running   2 (3d5h ago)   4d23h
kube-system            kube-controller-manager-minikube            1/1     Running   2 (3d5h ago)   4d23h
kube-system            kube-proxy-jj4pz                            1/1     Running   2 (3d5h ago)   4d23h
kube-system            kube-scheduler-minikube                     1/1     Running   2 (3d5h ago)   4d23h
kube-system            storage-provisioner                         1/1     Running   6 (33s ago)    4d23h
kubernetes-dashboard   dashboard-metrics-scraper-5c6664855-fgpqq   1/1     Running   2 (3d5h ago)   4d23h
kubernetes-dashboard   kubernetes-dashboard-55c4cbbc7c-ddx2q       1/1     Running   4 (32s ago)    4d23h
</code></pre></div>
<p>It looks like we can connect, so we're ready to start deploying the model service to the cluster.</p>
<h3>Creating a Namespace</h3>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
</code></pre></div>
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                   STATUS   AGE
default                Active   4d23h
kube-node-lease        Active   4d23h
kube-public            Active   4d23h
kube-system            Active   4d23h
kubernetes-dashboard   Active   4d23h
model-services         Active   1s
</code></pre></div>
<p>The new namespace should appear in the listing along with other namespaces created by default by the system. </p>
<h3>Creating the Model Service</h3>
<p>The model service is deployed by using Kubernetes resources. These are:</p>
<ul>
<li>ConfigMap: a set of configuration options; in this case it is a simple YAML file that will be loaded into the running container as a volume mount. This resource allows us to change the configuration of the model service without having to rebuild the Docker image.</li>
<li>Deployment: a declarative way to manage a set of Pods; the model service pods are managed through the Deployment.</li>
<li>Service: a way to expose the Pods in a Deployment; the model service is made available to the outside world through the Service.</li>
</ul>
<p>These resources are defined in the kubernetes/model_service.yaml file; the file is long, so we won't list it here. The env section of the container definition in the Deployment has a special section that allows us to access information about the pod and the node:</p>
<div class="highlight"><pre><span></span><code><span class="nn">...</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">REST_CONFIG</span><span class="w"></span>
<span class="w"> </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">./configuration/kubernetes_rest_config.yaml</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">POD_NAME</span><span class="w"></span>
<span class="w"> </span><span class="nt">valueFrom</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">metadata.name</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">NODE_NAME</span><span class="w"></span>
<span class="w"> </span><span class="nt">valueFrom</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">spec.nodeName</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">APP_NAME</span><span class="w"></span>
<span class="w"> </span><span class="nt">valueFrom</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fieldPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">metadata.labels['app']</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<p>The pod definition uses the <a href="https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/">downward API provided by Kubernetes</a> to access the node name, the pod name, and the contents of the 'app' label. This information is made available as environment variables. We'll add it to the log by listing the names of the environment variables in the logger configuration that we'll give to the model service. Earlier, we built a logging context class for exactly this purpose: adding environment variables to log records.</p>
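<p>To recap how that kind of logging context works, a minimal sketch is a <code>logging.Filter</code> that copies named environment variables onto every record before it is formatted. The class and field names below are illustrative, not necessarily the exact ones used earlier in this post:</p>

```python
import logging
import os


class EnvironmentLoggingFilter(logging.Filter):
    """Copy selected environment variables onto every log record.

    The mapping's keys are the field names added to the record; its
    values are the environment variables to read. This is a hypothetical
    stand-in for the logging context class built earlier.
    """

    def __init__(self, fields: dict) -> None:
        super().__init__()
        self.fields = fields

    def filter(self, record: logging.LogRecord) -> bool:
        for field_name, env_var in self.fields.items():
            # fall back to a placeholder when the variable is not set
            setattr(record, field_name, os.environ.get(env_var, "unknown"))
        return True  # never drop records, only annotate them


# attaching the filter makes pod_name, node_name, and app_name
# available to any formatter, including a JSON formatter
handler = logging.StreamHandler()
handler.addFilter(EnvironmentLoggingFilter({
    "pod_name": "POD_NAME",
    "node_name": "NODE_NAME",
    "app_name": "APP_NAME",
}))
```

<p>Because the filter reads the variables at logging time, the same Docker image picks up whatever values the downward API injects into each pod.</p>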
<p>We're almost ready to deploy the model service, but before starting it we'll need to send the docker image from the local docker daemon to the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<p>We can view the images in the minikube cache with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>docker.io/library/credit_risk_model_service:0.1.0
</code></pre></div>
<p>The model service will need to access the YAML configuration file that we used for the local service above. This file is in the /configuration folder and is called "kubernetes_rest_config.yaml"; it's customized for the Kubernetes environment we're building.</p>
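<p>We won't reproduce the whole file here, but the relevant idea is that its logging configuration refers to the environment variables injected by the downward API. A hypothetical excerpt (the key names are illustrative, not the actual schema of the file) might look like this:</p>

```yaml
# hypothetical excerpt of kubernetes_rest_config.yaml; key names are illustrative
logging:
  filters:
    environment_filter:
      # log record field name -> environment variable to read
      fields:
        pod_name: POD_NAME
        node_name: NODE_NAME
        app_name: APP_NAME
```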
<p>To create a <a href="https://kubernetes.io/docs/concepts/configuration/configmap/">ConfigMap</a> for the service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="n">configmap</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">configuration</span> \
<span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">file</span><span class="o">=../</span><span class="n">configuration</span><span class="o">/</span><span class="n">kubernetes_rest_config</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/model-service-configuration created
</code></pre></div>
<p>The service is deployed to the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/credit-risk-model-deployment created
service/credit-risk-model-service created
</code></pre></div>
<p>The deployment and service for the model service were created together. Let's view the Deployment to see if it is available yet:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">deployments</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
credit-risk-model-deployment   1/1     1            1           33s
</code></pre></div>
<p>You can also view the pods that are running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="o">-</span><span class="n">l</span> <span class="n">app</span><span class="o">=</span><span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                                           READY   STATUS    RESTARTS   AGE
credit-risk-model-deployment-554575f4f-5rl5s   1/1     Running   0          35s
</code></pre></div>
<p>The Kubernetes Service details look like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                        TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
credit-risk-model-service   NodePort   10.104.38.144   <none>        80:32268/TCP   37s
</code></pre></div>
<p>We'll run a proxy process locally to be able to access the model service endpoint:</p>
<div class="highlight"><pre><span></span><code>minikube service credit-risk-model-service --url -n model-services
</code></pre></div>
<p>The command outputs this URL:</p>
<p>http://127.0.0.1:50222</p>
<p>We can send a request to the model service through the local endpoint like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:50222/api/models/credit_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{ </span><span class="se">\</span>
<span class="s1"> "annual_income": 273000, </span><span class="se">\</span>
<span class="s1"> "collections_in_last_12_months": 20, </span><span class="se">\</span>
<span class="s1"> "delinquencies_in_last_2_years": 39, </span><span class="se">\</span>
<span class="s1"> "debt_to_income_ratio": 42.64, </span><span class="se">\</span>
<span class="s1"> "employment_length": "< 1 year", </span><span class="se">\</span>
<span class="s1"> "home_ownership": "MORTGAGE", </span><span class="se">\</span>
<span class="s1"> "number_of_delinquent_accounts": 6, </span><span class="se">\</span>
<span class="s1"> "interest_rate": 28.99, </span><span class="se">\</span>
<span class="s1"> "last_payment_amount": 36475.59, </span><span class="se">\</span>
<span class="s1"> "loan_amount": 35000, </span><span class="se">\</span>
<span class="s1"> "derogatory_public_record_count": 86, </span><span class="se">\</span>
<span class="s1"> "loan_purpose": "debt_consolidation", </span><span class="se">\</span>
<span class="s1"> "revolving_line_utilization_rate": 892.3, </span><span class="se">\</span>
<span class="s1"> "term": " 36 months", </span><span class="se">\</span>
<span class="s1"> "total_payments_to_date": 57777.58, </span><span class="se">\</span>
<span class="s1"> "verification_status": "Source Verified" </span><span class="se">\</span>
<span class="s1">}'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"credit_risk":"safe"}
</code></pre></div>
<p>The model is deployed within Kubernetes!</p>
<h3>Accessing the Logs</h3>
<p>Kubernetes has a built-in system that receives the stdout and stderr outputs of the running containers and saves them to the hard drive of the node for a limited time. You can view the logs emitted by the containers by using this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="mi">554575</span><span class="n">f4f</span><span class="o">-</span><span class="mi">5</span><span class="n">rl5s</span> <span class="o">-</span><span class="n">c</span> <span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span> <span class="o">|</span> <span class="n">grep</span> <span class="s2">"</span><span class="se">\"</span><span class="s2">action</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">predict</span><span class="se">\"</span><span class="s2">"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"asctime": "2023-04-24 01:36:40,696", "pod_name": "credit-risk-model-deployment-554575f4f-5rl5s", "node_name": "minikube", "app_name": "credit-risk-model-service", "name": "credit_risk_model_logger", "levelname": "INFO", "message": "Prediction requested.", "action": "predict", "model_qualified_name": "credit_risk_model", "model_version": "0.1.0", "collections_in_last_12_months": 20, "debt_to_income_ratio": 42.64}
{"asctime": "2023-04-24 01:36:40,781", "pod_name": "credit-risk-model-deployment-554575f4f-5rl5s", "node_name": "minikube", "app_name": "credit-risk-model-service", "name": "credit_risk_model_logger", "levelname": "INFO", "message": "Prediction created.", "action": "predict", "model_qualified_name": "credit_risk_model", "model_version": "0.1.0", "status": "success", "credit_risk": "safe"}
</code></pre></div>
<p>The logs contain every field that we configured and they are in JSON format, as we expected. The log records also contain the pod_name, node_name, and app_name fields that we added through the downward API.</p>
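<p>Because each record is a single JSON object on its own line, the output of kubectl logs is easy to post-process. As a small illustration (using two records in the same shape as the output above), we can parse and filter them with a few lines of Python:</p>

```python
import json

# two example records in the same shape as the service log output above
log_lines = [
    '{"asctime": "2023-04-24 01:36:40,696", "levelname": "INFO", '
    '"action": "predict", "message": "Prediction requested."}',
    '{"asctime": "2023-04-24 01:36:40,781", "levelname": "INFO", '
    '"action": "predict", "message": "Prediction created.", "status": "success"}',
]

# parse each line and keep only the prediction events
records = [json.loads(line) for line in log_lines]
predictions = [r for r in records if r.get("action") == "predict"]

for record in predictions:
    print(record["asctime"], record["message"])
```

<p>This is essentially what a log aggregation pipeline does at scale: parse structured records, then filter and index them by their fields.</p>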
<p>Although we can view the logs this way, it is not an ideal way to manage them, because we need to be able to search through the logs generated across the whole system. To do this, we'll export the logs to an external logging system, which we'll set up in a later section of this blog post.</p>
<h2>Creating the Logging System</h2>
<p>The complexity of modern cloud environments makes it hard to manage logs on individual servers, since we don't know ahead of time where our workloads will be scheduled. Kubernetes workloads are highly distributed, meaning that an application can be replicated across many different nodes in a cluster. This makes it necessary to gather logs together in one place so that we can more easily view and analyze them.</p>
<p>A logging system is responsible for gathering log records from all of the instances of a running application and making them searchable from one centralized location. In this section, we'll add such a logging system to the cluster and use it to monitor the model service we've deployed.</p>
<p>We'll be installing the Elastic Cloud on Kubernetes operator in order to view our logs. The operator installs and manages ElasticSearch, Kibana, and Filebeat services.</p>
<p>To begin, lets install the <a href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/">custom resource definitions</a> needed by the operator:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">download</span><span class="o">.</span><span class="n">elastic</span><span class="o">.</span><span class="n">co</span><span class="o">/</span><span class="n">downloads</span><span class="o">/</span><span class="n">eck</span><span class="o">/</span><span class="mf">2.7.0</span><span class="o">/</span><span class="n">crds</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>customresourcedefinition.apiextensions.k8s.io/agents.agent.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/apmservers.apm.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/beats.beat.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/elasticmapsservers.maps.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/elasticsearchautoscalers.autoscaling.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/elasticsearches.elasticsearch.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/enterprisesearches.enterprisesearch.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/kibanas.kibana.k8s.elastic.co created
customresourcedefinition.apiextensions.k8s.io/stackconfigpolicies.stackconfigpolicy.k8s.elastic.co created
</code></pre></div>
<p>We'll be using these CRDs:</p>
<ul>
<li>elasticsearch.k8s.elastic.co, to deploy ElasticSearch for storing and indexing logs</li>
<li>kibana.k8s.elastic.co, to deploy Kibana for viewing logs</li>
<li>beat.k8s.elastic.co, to deploy Filebeat on each node to forward logs to ElasticSearch</li>
</ul>
<p>The CRDs are used by the ECK operator to manage resources in the cluster. To install the ECK operator itself, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">download</span><span class="o">.</span><span class="n">elastic</span><span class="o">.</span><span class="n">co</span><span class="o">/</span><span class="n">downloads</span><span class="o">/</span><span class="n">eck</span><span class="o">/</span><span class="mf">2.7.0</span><span class="o">/</span><span class="n">operator</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">namespace</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">system</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">serviceaccount</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">secret</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">webhook</span><span class="o">-</span><span class="n">server</span><span class="o">-</span><span class="n">cert</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">configmap</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="o">-</span><span class="n">view</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="o">-</span><span class="n">edit</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">clusterrolebinding</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">service</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">webhook</span><span class="o">-</span><span class="n">server</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">statefulset</span><span class="p">.</span><span class="n">apps</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">operator</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="n">validatingwebhookconfiguration</span><span class="p">.</span><span class="n">admissionregistration</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="o">/</span><span class="n">elastic</span><span class="o">-</span><span class="n">webhook</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">elastic</span><span class="p">.</span><span class="n">co</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
</code></pre></div>
<h3>ElasticSearch</h3>
<p>We'll be storing logs in <a href="https://www.elastic.co/elasticsearch/">ElasticSearch</a>. ElasticSearch is a distributed full-text search engine with a RESTful API. The ElasticSearch service is ideal for our needs because our logs are made up of text strings.</p>
<p>Now we're ready to install the service by applying an "Elasticsearch" custom resource:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">elasticsearch.k8s.elastic.co/v1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Elasticsearch</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8.7.0</span><span class="w"></span>
<span class="w"> </span><span class="nt">nodeSets</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">default</span><span class="w"></span>
<span class="w"> </span><span class="nt">count</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="w"> </span><span class="nt">config</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">node.store.allow_mmap</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">false</span><span class="w"></span>
</code></pre></div>
<p>The manifest is stored in the kubernetes/elastic_search.yaml file and is applied with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">elastic_search</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>elasticsearch.elasticsearch.k8s.elastic.co/quickstart created
</code></pre></div>
<p>To get a list of ElasticSearch clusters currently defined in the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">elasticsearch</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME HEALTH NODES VERSION PHASE AGE
quickstart green 1 8.7.0 Ready 116s
</code></pre></div>
<p>We can look at the pods running the ElasticSearch cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">--</span><span class="n">selector</span><span class="o">=</span><span class="s1">'elasticsearch.k8s.elastic.co/cluster-name=quickstart'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
quickstart-es-default-0 1/1 Running 0 116s
</code></pre></div>
<p>A Kubernetes service is created to make the ElasticSearch service available to other services in the cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">service</span> <span class="n">quickstart</span><span class="o">-</span><span class="n">es</span><span class="o">-</span><span class="n">http</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
quickstart-es-http ClusterIP 10.106.185.54 <none> 9200/TCP 2m2s
</code></pre></div>
<p>A user named "elastic" is automatically created in the ElasticSearch service, with its password stored in a Kubernetes secret. Let's retrieve the password:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">secret</span> <span class="n">quickstart</span><span class="o">-</span><span class="n">es</span><span class="o">-</span><span class="n">elastic</span><span class="o">-</span><span class="n">user</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">o</span><span class="o">=</span><span class="n">jsonpath</span><span class="o">=</span><span class="s1">'{.data.elastic}'</span> <span class="o">|</span> <span class="n">base64</span> <span class="o">--</span><span class="n">decode</span><span class="p">;</span> <span class="n">echo</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>DD097Fe67Qs320Uw6JHIy2Vb
</code></pre></div>
<h3>Kibana</h3>
<p>To view the logs we'll be using <a href="https://www.elastic.co/kibana/">Kibana</a>. Kibana is a web application that can provide access to and visualize logs stored in ElasticSearch.</p>
<p>The custom resource for Kibana looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">kibana.k8s.elastic.co/v1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Kibana</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8.7.0</span><span class="w"></span>
<span class="w"> </span><span class="nt">count</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="w"> </span><span class="nt">elasticsearchRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
</code></pre></div>
<p>We'll apply the manifest with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">kibana</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>kibana.kibana.k8s.elastic.co/quickstart created
</code></pre></div>
<p>Similar to ElasticSearch, you can retrieve details about Kibana instances:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">kibana</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME HEALTH NODES VERSION AGE
quickstart green 1 8.7.0 51s
</code></pre></div>
<p>We can also view the associated Pods:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pod</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">--</span><span class="n">selector</span><span class="o">=</span><span class="s1">'kibana.k8s.elastic.co/name=quickstart'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
quickstart-kb-589dc4f75b-ncpd7 1/1 Running 0 53s
</code></pre></div>
<p>A ClusterIP Service is automatically created for Kibana:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">service</span> <span class="n">quickstart</span><span class="o">-</span><span class="n">kb</span><span class="o">-</span><span class="n">http</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
quickstart-kb-http ClusterIP 10.111.166.26 <none> 5601/TCP 57s
</code></pre></div>
<p>We'll use kubectl port-forward to access Kibana from a local web browser:</p>
<div class="highlight"><pre><span></span><code>kubectl port-forward service/quickstart-kb-http <span class="m">5601</span> -n elastic-system
</code></pre></div>
<p>Now we can access the Kibana service from this URL:</p>
<div class="highlight"><pre><span></span><code>http://localhost:5601
</code></pre></div>
<p>Open the URL in your browser to view the Kibana UI. Login as the "elastic" user. The password is the one we retrieved above.</p>
<h3>Filebeat</h3>
<p>In order to centralize access to logs, we'll first need a way to get the logs off of the individual cluster nodes and forward them to the aggregator service. The service we'll use to do this is called <a href="https://www.elastic.co/beats/filebeat">Filebeat</a>. Filebeat is a lightweight service that can forward logs stored in files to an outside service. We'll deploy Filebeat as a DaemonSet to ensure there’s a running instance on each node of the cluster.</p>
<p>The Filebeat custom resource looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">beat.k8s.elastic.co/v1beta1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Beat</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">filebeat</span><span class="w"></span>
<span class="w"> </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8.7.0</span><span class="w"></span>
<span class="w"> </span><span class="nt">elasticsearchRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
<span class="w"> </span><span class="nt">kibanaRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">quickstart</span><span class="w"></span>
<span class="w"> </span><span class="nt">config</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">processors</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">decode_json_fields</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">fields</span><span class="p">:</span><span class="w"> </span><span class="p p-Indicator">[</span><span class="s">"message"</span><span class="p p-Indicator">]</span><span class="w"></span>
<span class="w"> </span><span class="nt">max_depth</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">3</span><span class="w"></span>
<span class="w"> </span><span class="nt">target</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">parsed_message</span><span class="w"></span>
<span class="w"> </span><span class="nt">add_error_key</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">false</span><span class="w"></span>
<span class="w"> </span><span class="nt">filebeat.inputs</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">container</span><span class="w"></span>
<span class="w"> </span><span class="nt">paths</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/log/containers/*.log</span><span class="w"></span>
<span class="w"> </span><span class="nt">daemonSet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">podTemplate</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">dnsPolicy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ClusterFirstWithHostNet</span><span class="w"></span>
<span class="w"> </span><span class="nt">hostNetwork</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">securityContext</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">runAsUser</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
<span class="w"> </span><span class="nt">containers</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">filebeat</span><span class="w"></span>
<span class="w"> </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlogcontainers</span><span class="w"></span>
<span class="w"> </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/log/containers</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlogpods</span><span class="w"></span>
<span class="w"> </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/log/pods</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlibdockercontainers</span><span class="w"></span>
<span class="w"> </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/lib/docker/containers</span><span class="w"></span>
<span class="w"> </span><span class="nt">volumes</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlogcontainers</span><span class="w"></span>
<span class="w"> </span><span class="nt">hostPath</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/log/containers</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlogpods</span><span class="w"></span>
<span class="w"> </span><span class="nt">hostPath</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/log/pods</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">varlibdockercontainers</span><span class="w"></span>
<span class="w"> </span><span class="nt">hostPath</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/var/lib/docker/containers</span><span class="w"></span>
</code></pre></div>
<p>The host's container log folder (/var/log/containers) is mounted into the Filebeat container. The Filebeat process also has a processor defined:</p>
<ul>
<li>decode_json_fields, which parses fields containing JSON strings and replaces each string with the resulting JSON object</li>
</ul>
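<p>To make the processor's effect concrete, here is a rough Python equivalent of what decode_json_fields does to each event. This is a sketch for illustration, not Filebeat's actual implementation:</p>

```python
import json


def decode_json_fields(event: dict, fields: list[str], target: str) -> dict:
    """Roughly emulate Filebeat's decode_json_fields processor: parse any
    JSON strings found in `fields` and store the result under `target`."""
    for field in fields:
        value = event.get(field)
        if isinstance(value, str):
            try:
                event[target] = json.loads(value)
            except json.JSONDecodeError:
                pass  # add_error_key: false -> leave the event unchanged
    return event


event = {"message": '{"action": "predict", "status": "success"}'}
decode_json_fields(event, fields=["message"], target="parsed_message")
print(event["parsed_message"]["status"])  # the raw string is now a nested object
```

<p>After this step, the JSON string produced by our Python logger becomes a nested object under parsed_message, so ElasticSearch can index fields like parsed_message.action individually instead of treating the record as one opaque string.</p>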
<p>Let's apply the manifest to create the Filebeat DaemonSet:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">filebeat</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>beat.beat.k8s.elastic.co/quickstart created
</code></pre></div>
<p>Details about the Filebeat service can be viewed like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">beat</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME HEALTH AVAILABLE EXPECTED TYPE VERSION AGE
quickstart green 1 1 filebeat 8.7.0 35s
</code></pre></div>
<p>The pods running the service can be listed like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">--</span><span class="n">selector</span><span class="o">=</span><span class="s1">'beat.k8s.elastic.co/name=quickstart'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
quickstart-beat-filebeat-znwsf 1/1 Running 0 38s
</code></pre></div>
<p>The Filebeat service is running on the single node in the cluster.</p>
<p>The logs are being forwarded to ElasticSearch and can be viewed in Kibana:</p>
<p><img alt="Prediction Log Stream" src="https://www.tekhnoal.com/log_stream_lfmlm.png" width="100%"></p>
<p>We have logs arriving from the model service and can view them in Kibana!</p>
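<p>Beyond the Kibana UI, the same records can be searched through ElasticSearch's _search API. Here is a minimal sketch of a query body that matches one model's log records; the parsed_message field name comes from the Filebeat processor configured above, while the filebeat-* index pattern and @timestamp sort field are assumptions about the default Filebeat setup:</p>

```python
import json


def build_log_query(model_name: str, size: int = 10) -> dict:
    """Build an ElasticSearch _search body matching one model's log records."""
    return {
        "size": size,
        "query": {"match": {"parsed_message.model_qualified_name": model_name}},
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


# The body would be POSTed to the _search endpoint, for example
# https://localhost:9200/filebeat-*/_search (index pattern assumed), after
# port-forwarding 9200 and authenticating as the "elastic" user with the
# password retrieved earlier.
print(json.dumps(build_log_query("credit_risk_model"), indent=2))
```

<p>Queries like this are useful for automated checks, for example alerting when no prediction logs have arrived for a model in the last few minutes.</p>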
<h2>Deleting the Resources</h2>
<p>To delete the Filebeat DaemonSet, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">filebeat</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>beat.beat.k8s.elastic.co "quickstart" deleted
</code></pre></div>
<p>To delete the Kibana service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">kibana</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>kibana.kibana.k8s.elastic.co "quickstart" deleted
</code></pre></div>
<p>To delete the ElasticSearch service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">n</span> <span class="n">elastic</span><span class="o">-</span><span class="n">system</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">elastic_search</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>elasticsearch.elasticsearch.k8s.elastic.co "quickstart" deleted
</code></pre></div>
<p>To remove all Elastic resources in all namespaces:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespaces</span> <span class="o">--</span><span class="n">no</span><span class="o">-</span><span class="n">headers</span> <span class="o">-</span><span class="n">o</span> <span class="n">custom</span><span class="o">-</span><span class="n">columns</span><span class="o">=</span><span class="p">:</span><span class="n">metadata</span><span class="o">.</span><span class="n">name</span> <span class="o">|</span> <span class="n">xargs</span> <span class="o">-</span><span class="n">n1</span> <span class="n">kubectl</span> <span class="n">delete</span> <span class="n">elastic</span> <span class="o">--</span><span class="nb">all</span> <span class="o">-</span><span class="n">n</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>No resources found
No resources found
No resources found
No resources found
No resources found
No resources found
No resources found
</code></pre></div>
<p>To uninstall the ECK operator:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">download</span><span class="o">.</span><span class="n">elastic</span><span class="o">.</span><span class="n">co</span><span class="o">/</span><span class="n">downloads</span><span class="o">/</span><span class="n">eck</span><span class="o">/</span><span class="mf">2.7.0</span><span class="o">/</span><span class="n">operator</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">namespace</span><span class="w"> </span><span class="s">"elastic-system"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">serviceaccount</span><span class="w"> </span><span class="s">"elastic-operator"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">secret</span><span class="w"> </span><span class="s">"elastic-webhook-server-cert"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">configmap</span><span class="w"> </span><span class="s">"elastic-operator"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="w"> </span><span class="s">"elastic-operator"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="w"> </span><span class="s">"elastic-operator-view"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">clusterrole</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="w"> </span><span class="s">"elastic-operator-edit"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">clusterrolebinding</span><span class="p">.</span><span class="n">rbac</span><span class="p">.</span><span class="n">authorization</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="w"> </span><span class="s">"elastic-operator"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">service</span><span class="w"> </span><span class="s">"elastic-webhook-server"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">statefulset</span><span class="p">.</span><span class="n">apps</span><span class="w"> </span><span class="s">"elastic-operator"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">validatingwebhookconfiguration</span><span class="p">.</span><span class="n">admissionregistration</span><span class="p">.</span><span class="n">k8s</span><span class="p">.</span><span class="n">io</span><span class="w"> </span><span class="s">"elastic-webhook.k8s.elastic.co"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="o">^</span><span class="n">C</span><span class="w"></span>
</code></pre></div>
<p>To uninstall the Custom Resource Definitions for the ECK operator:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">download</span><span class="o">.</span><span class="n">elastic</span><span class="o">.</span><span class="n">co</span><span class="o">/</span><span class="n">downloads</span><span class="o">/</span><span class="n">eck</span><span class="o">/</span><span class="mf">2.7.0</span><span class="o">/</span><span class="n">crds</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>customresourcedefinition.apiextensions.k8s.io "agents.agent.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "apmservers.apm.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "beats.beat.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "elasticmapsservers.maps.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "elasticsearchautoscalers.autoscaling.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "elasticsearches.elasticsearch.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "enterprisesearches.enterprisesearch.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "kibanas.kibana.k8s.elastic.co" deleted
customresourcedefinition.apiextensions.k8s.io "stackconfigpolicies.stackconfigpolicy.k8s.elastic.co" deleted
</code></pre></div>
<p>To delete the model service kubernetes resources, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps "credit-risk-model-deployment" deleted
service "credit-risk-model-service" deleted
</code></pre></div>
<p>We'll also delete the ConfigMap:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span> <span class="n">configmap</span> <span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">configuration</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "model-service-configuration" deleted
</code></pre></div>
<p>Then the model service namespace:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
</code></pre></div>
<p>To shut down the minikube cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 Powering off "minikube" via SSH ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post we showed how to use the Python logging package and how to create a decorator that adds logging around an MLModel instance. We also set up a logging system within a Kubernetes cluster and used it to aggregate and view logs. Logging is usually the first capability implemented when we need to monitor how a system performs, and machine learning models are no exception. The logging decorator allowed us to do complex logging without having to modify the implementation of the model at all, thus simplifying a common aspect of software observability.</p>
<p>One of the benefits of using the decorator pattern is that we are able to build up complex behaviors around an object. The LoggingDecorator class is very configurable, since we are able to tell it which input and output fields from the model to log. This approach makes the implementation very flexible, since we do not need to modify the decorator's code to add fields to the log. The EnvironmentInfoFilter class that we implemented to grab information from the environment for logs is also built this way. We were able to get information about the Kubernetes deployment into the logs without having to modify the code.</p>
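<p>As a rough illustration of this idea (a hypothetical sketch, not the LoggingDecorator class from the post itself), a field-configurable logging decorator might look like this, with the field names chosen at construction time:</p>

```python
import logging

logger = logging.getLogger(__name__)


class SimpleLoggingDecorator:
    """Hypothetical sketch: wraps any object with a predict() method and
    logs a configurable subset of the input and output fields."""

    def __init__(self, model, input_fields=None, output_fields=None):
        self._model = model
        self._input_fields = input_fields or []
        self._output_fields = output_fields or []

    def predict(self, data: dict) -> dict:
        # log only the configured input fields
        logger.info("prediction requested",
                    extra={f: data.get(f) for f in self._input_fields})
        try:
            result = self._model.predict(data)
        except Exception:
            # exceptions raised by the model are logged and re-raised
            logger.exception("prediction failed")
            raise
        # log only the configured output fields
        logger.info("prediction returned",
                    extra={f: result.get(f) for f in self._output_fields})
        return result
```

<p>Because the decorator exposes the same predict() interface as the wrapped model, callers cannot tell the difference, which is what lets us add fields to the log purely through configuration.</p>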
<p>The LoggingDecorator class is designed to work with MLModel classes, and this is the only hard requirement of the code. This makes the decorator very portable, because we are able to deploy it inside any other model deployment service we may choose to build in the future. For example, we could just as easily decorate an MLModel instance running inside a gRPC service, since the decorator would work exactly the same way. This is due to the interface-driven approach that we took when designing the MLModel interface.</p>
<p>We added logging to the ML model instance from the "outside" and we were not able to access information about the internals of the model. This is a limitation of the decorator approach to logging which only has access to the model inputs, model outputs, and exceptions raised by the model. This approach is best used to add logging functionality to an ML model implementation that we do not control, or in simple situations in which the limitations of the approach do not affect us. If any logging of internal model state is needed, we'll need to generate logs from within the MLModel class. </p>Signed Parameters for Secure ML Model Deployments2023-03-17T22:00:00-05:002023-03-17T22:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2023-03-17:/securing-parameters-for-ml-models.html<p>In the Python ecosystem, using pickle to serialize machine learning models is very common. Pickle is a built-in Python library module that makes it easy to convert in-memory objects into bytestreams that can be saved to a hard drive or sent over networks. Pickling an object is very quick and simple and is the easiest way to persist a complex Python object for later use. However, pickle is not a secure serialization standard. The documentation for the pickle module in the Python standard library explicitly mentions the insecure nature of the pickle format: Warning The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with. In this blog post, we'll be downloading a dataset, exploring it, training a model, signing the model parameters, and deploying the model parameters and model to a Kubernetes cluster as a RESTful service. 
We'll also be loading the model parameters from a network storage service to show how to secure the model parameters while they are stored separately from the model deployment.</p><h1>Signed Parameters for Secure ML Model Deployments</h1>
<p>This blog post was written in a Jupyter notebook, the code and commands found in it reflect this.</p>
<p>All of the code for this blog post is in <a href="https://github.com/schmidtbri/securing-parameters-for-ml-models">this github repository</a>.</p>
<h2>Introduction</h2>
<p>In the Python ecosystem, using pickle to serialize machine learning models is very common. Pickle is a built-in Python library module that makes it easy to convert in-memory objects into bytestreams that can be saved to a hard drive or sent over networks. Pickling an object is very quick and simple and is the easiest way to persist a complex Python object for later use. However, pickle is not a secure serialization standard. The <a href="https://docs.python.org/3/library/pickle.html">documentation</a> for the pickle module in the Python standard library explicitly mentions the insecure nature of the pickle format:</p>
<div class="highlight"><pre><span></span><code><span class="n">Warning</span><span class="w"> </span><span class="n">The</span><span class="w"> </span><span class="n">pickle</span><span class="w"> </span><span class="n">module</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">secure</span><span class="o">.</span><span class="w"> </span><span class="n">Only</span><span class="w"> </span><span class="n">unpickle</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">trust</span><span class="o">.</span><span class="w"></span>
<span class="n">It</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">possible</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">construct</span><span class="w"> </span><span class="n">malicious</span><span class="w"> </span><span class="n">pickle</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="n">which</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">execute</span><span class="w"> </span><span class="n">arbitrary</span><span class="w"> </span><span class="n">code</span><span class="w"> </span><span class="n">during</span><span class="w"> </span><span class="n">unpickling</span><span class="o">.</span><span class="w"> </span><span class="n">Never</span><span class="w"> </span><span class="n">unpickle</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="n">that</span><span class="w"> </span><span class="n">could</span><span class="w"> </span><span class="n">have</span><span class="w"> </span><span class="n">come</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">an</span><span class="w"> </span><span class="n">untrusted</span><span class="w"> </span><span class="n">source</span><span class="p">,</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">that</span><span class="w"> </span><span class="n">could</span><span class="w"> </span><span class="n">have</span><span class="w"> </span><span class="n">been</span><span class="w"> </span><span class="n">tampered</span><span class="w"> </span><span class="n">with</span><span class="o">.</span><span class="w"></span>
</code></pre></div>
<p>What can we do about this? Pickling is the easiest way to save model objects, and using pickle for model serialization is ubiquitous in Data Science. One thing that we can do is make sure that the pickle files that hold our models are not modified in the time between the training process and the prediction process. This way, we can be sure that the contents of the file are benign. This is especially important for models that are deployed in production services running in sensitive environments. If we allow the model service that is hosting the model to load a pickle file that has been compromised, we open the door to arbitrary code execution on the server. </p>
<p>One way to prevent the pickle file from being modified is by "signing" it. Signing a file means processing the data and creating a "signature" that we can use later to make sure that the contents of the file have not been changed since it was signed. In order to still be able to use pickle in a production setting, we'll require that the model parameters be signed right after they are created, then we'll check the signature before we load the parameters within the model service. If the signature does not match, we'll know that the model parameters are not safe to load. However, signing model parameters does not encrypt them, so it is still possible for someone with access to the pickle files to view the model parameters.</p>
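<p>The sign-then-verify scheme described above can be sketched with Python's built-in hmac module. The key value and parameter dictionary below are illustrative assumptions for the sketch, not the code used later in the post; a real deployment would fetch the key from a secrets manager shared by the training and serving environments.</p>

```python
import hashlib
import hmac
import pickle

# Illustrative key; in practice this would come from a secrets manager.
SECRET_KEY = b"training-pipeline-secret"


def sign(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 signature over the serialized parameters."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(data, key), signature)


# After training: serialize the parameters and sign the resulting bytes.
parameters = {"coef": [0.1, -0.3], "intercept": 0.7}
payload = pickle.dumps(parameters)
signature = sign(payload, SECRET_KEY)

# In the model service: verify first, and only unpickle on success.
if verify(payload, SECRET_KEY, signature):
    loaded = pickle.loads(payload)
else:
    raise RuntimeError("Parameters failed signature check, refusing to unpickle.")
```

<p>Note that, as the post says, this protects integrity but not confidentiality: anyone with access to the file can still read the pickled parameters.</p>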
<p>In this blog post, we'll be downloading a dataset, exploring it, training a model, signing the model parameters, and deploying the model parameters and model to a Kubernetes cluster as a RESTful service. We'll also be loading the model parameters from a network storage service to show how to secure the model parameters while they are stored separately from the model deployment. </p>
<h2>Getting Data</h2>
<p>In order to train a model, we'll need a dataset. The dataset we've chosen is the <a href="https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset">Diabetes Health Indicators Dataset</a> available from Kaggle. The dataset contains data about health indicators and the incidence of diabetes. We'll be using the dataset to train a model that predicts whether or not a person is likely to have diabetes.</p>
<p>To make it easy to download the data, we'll install the <a href="https://pypi.org/project/kaggle/">kaggle python package</a>.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">kaggle</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Next, we'll execute these commands to download the data and unzip it into the data folder in the project:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">mkdir</span> <span class="o">-</span><span class="n">p</span> <span class="o">../</span><span class="n">data</span>
<span class="err">!</span><span class="n">kaggle</span> <span class="n">datasets</span> <span class="n">download</span> <span class="o">-</span><span class="n">d</span> <span class="n">alexteboul</span><span class="o">/</span><span class="n">diabetes</span><span class="o">-</span><span class="n">health</span><span class="o">-</span><span class="n">indicators</span><span class="o">-</span><span class="n">dataset</span> <span class="o">-</span><span class="n">p</span> <span class="o">../</span><span class="n">data</span> <span class="o">--</span><span class="n">unzip</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The files downloaded look like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">ls</span> <span class="o">-</span><span class="n">la</span> <span class="o">../</span><span class="n">data</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>total 101232
drwxr-xr-x 5 brian staff 160 Mar 17 22:55 [34m.[m[m
drwxr-xr-x 25 brian staff 800 Mar 17 22:55 [34m..[m[m
-rw-r--r-- 1 brian staff 22738151 Mar 17 22:55 diabetes_012_health_indicators_BRFSS2015.csv
-rw-r--r-- 1 brian staff 6347570 Mar 17 22:55 diabetes_binary_5050split_health_indicators_BRFSS2015.csv
-rw-r--r-- 1 brian staff 22738154 Mar 17 22:55 diabetes_binary_health_indicators_BRFSS2015.csv
</code></pre></div>
<p>We'll focus on the "diabetes_binary_5050split_health_indicators_BRFSS2015.csv" dataset. Let's load the dataset into a Pandas dataframe:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="sa">f</span><span class="s1">'../data/diabetes_binary_5050split_health_indicators_BRFSS2015.csv'</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(70692, 22)
</code></pre></div>
<p>The unprocessed dataset has 70692 rows and 22 columns.</p>
<p>The dataframe columns are these:</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="o">.</span><span class="n">info</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><class 'pandas.core.frame.DataFrame'>
RangeIndex: 70692 entries, 0 to 70691
Data columns (total 22 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Diabetes_binary 70692 non-null float64
1 HighBP 70692 non-null float64
2 HighChol 70692 non-null float64
3 CholCheck 70692 non-null float64
4 BMI 70692 non-null float64
5 Smoker 70692 non-null float64
6 Stroke 70692 non-null float64
7 HeartDiseaseorAttack 70692 non-null float64
8 PhysActivity 70692 non-null float64
9 Fruits 70692 non-null float64
10 Veggies 70692 non-null float64
11 HvyAlcoholConsump 70692 non-null float64
12 AnyHealthcare 70692 non-null float64
13 NoDocbcCost 70692 non-null float64
14 GenHlth 70692 non-null float64
15 MentHlth 70692 non-null float64
16 PhysHlth 70692 non-null float64
17 DiffWalk 70692 non-null float64
18 Sex 70692 non-null float64
19 Age 70692 non-null float64
20 Education 70692 non-null float64
21 Income 70692 non-null float64
dtypes: float64(22)
memory usage: 11.9 MB
</code></pre></div>
<p>The column names are not all easy to understand, so we'll rename some of them:</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"Diabetes_binary"</span><span class="p">:</span> <span class="s2">"Diabetes"</span><span class="p">,</span>
<span class="s2">"HighBP"</span><span class="p">:</span> <span class="s2">"HighBloodPressure"</span><span class="p">,</span>
<span class="s2">"HighChol"</span><span class="p">:</span> <span class="s2">"HighCholesterol"</span><span class="p">,</span>
<span class="s2">"CholCheck"</span><span class="p">:</span> <span class="s2">"CholesterolChecked"</span><span class="p">,</span>
<span class="s2">"HeartDiseaseorAttack"</span><span class="p">:</span> <span class="s2">"HeartDiseaseOrHeartAttack"</span><span class="p">,</span>
<span class="s2">"PhysActivity"</span><span class="p">:</span> <span class="s2">"PhysicalActivity"</span><span class="p">,</span>
<span class="s2">"HvyAlcoholConsump"</span><span class="p">:</span> <span class="s2">"HeavyAlchoholConsumption"</span><span class="p">,</span>
<span class="s2">"NoDocbcCost"</span><span class="p">:</span> <span class="s2">"NoDoctorsVisitBecauseOfCost"</span><span class="p">,</span>
<span class="s2">"GenHlth"</span><span class="p">:</span> <span class="s2">"GeneralHealth"</span><span class="p">,</span>
<span class="s2">"MentHlth"</span><span class="p">:</span> <span class="s2">"MentalHealth"</span><span class="p">,</span>
<span class="s2">"PhysHlth"</span><span class="p">:</span> <span class="s2">"PhysicalHealth"</span><span class="p">,</span>
<span class="s2">"DiffWalk"</span><span class="p">:</span> <span class="s2">"DifficultyWalking"</span>
<span class="p">})</span>
</code></pre></div>
<h2>Profiling the Data</h2>
<p>In order to profile the data, we'll use the <a href="https://github.com/fbdesignpro/sweetviz">sweetviz</a> package. Let's install the package:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">sweetviz</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To profile the data, all that is needed is two lines of code:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sweetviz</span> <span class="k">as</span> <span class="nn">sv</span>
<span class="n">report</span> <span class="o">=</span> <span class="n">sv</span><span class="o">.</span><span class="n">analyze</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Once the report is created, we'll save it to disk as an HTML file.</p>
<div class="highlight"><pre><span></span><code><span class="n">report</span><span class="o">.</span><span class="n">show_html</span><span class="p">(</span><span class="n">filepath</span><span class="o">=</span><span class="s2">"../diabetes_risk_model/model_files/data_report.html"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Report</span><span class="w"> </span><span class="p">..</span><span class="o">/</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">data_report</span><span class="p">.</span><span class="n">html</span><span class="w"> </span><span class="n">was</span><span class="w"> </span><span class="n">generated</span><span class="o">!</span><span class="w"> </span><span class="n">NOTEBOOK</span><span class="o">/</span><span class="n">COLAB</span><span class="w"> </span><span class="nl">USERS:</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">web</span><span class="w"> </span><span class="n">browser</span><span class="w"> </span><span class="n">MAY</span><span class="w"> </span><span class="k">not</span><span class="w"> </span><span class="n">pop</span><span class="w"> </span><span class="n">up</span><span class="p">,</span><span class="w"> </span><span class="n">regardless</span><span class="p">,</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">report</span><span class="w"> </span><span class="n">IS</span><span class="w"> </span><span class="n">saved</span><span class="w"> </span><span class="n">in</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">notebook</span><span class="o">/</span><span class="n">colab</span><span class="w"> </span><span class="n">files</span><span class="p">.</span><span class="w"></span>
</code></pre></div>
<p>Right away the profile will tell us a few key details about the dataset:</p>
<p><img alt="Data Overview" src="https://www.tekhnoal.com/data_overview_spfmlm.png" width="100%"></p>
<p>The dataset has 1635 duplicate rows and 22 features, 18 of which are categorical and 4 of which are numerical. The profile has a description for each variable. Here's the description for the "Diabetes" variable, which we'll use as the target variable.</p>
<p><img alt="Variable Overview" src="https://www.tekhnoal.com/variable_overview_sdfmlm.png" width="100%"></p>
<p>By using the sweetviz package we can avoid writing the most common data profiling code. From the report we can tell that there are a few things we'll need to deal with:</p>
<ul>
<li>There are highly correlated variables.</li>
<li>Some variables have outliers.</li>
</ul>
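<p>Both findings are easy to double-check by hand with pandas. The sketch below runs on a tiny synthetic frame for illustration; in the notebook the same functions could be applied to the renamed <code>data</code> frame.</p>

```python
import pandas as pd


def correlated_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return column pairs whose absolute Pearson correlation exceeds threshold."""
    corr = df.corr().abs()
    return [(a, b, corr.loc[a, b])
            for i, a in enumerate(corr.columns)
            for b in corr.columns[i + 1:]
            if corr.loc[a, b] > threshold]


def iqr_outlier_counts(df: pd.DataFrame) -> pd.Series:
    """Count values outside 1.5 * IQR per column, a common outlier heuristic."""
    q1, q3 = df.quantile(0.25), df.quantile(0.75)
    iqr = q3 - q1
    return ((df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)).sum()


# Synthetic example: "b" is perfectly correlated with "a", and both have one outlier.
demo = pd.DataFrame({"a": [1, 2, 3, 4, 100],
                     "b": [2, 4, 6, 8, 200],
                     "c": [5, 1, 4, 2, 3]})
print(correlated_pairs(demo))
print(iqr_outlier_counts(demo))
```

<p>These checks matter because, as we'll see below, pycaret's setup can be configured to remove multicollinear features and outliers automatically.</p>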
<h2>Training a Model</h2>
<p>To train a model we'll be using the <a href="https://pycaret.org/">pycaret package</a>.</p>
<p>Let's install the package first:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="o">--</span><span class="n">pre</span> <span class="n">pycaret</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll setup the experiment like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">setup</span>
<span class="n">diabetes_experiment</span> <span class="o">=</span> <span class="n">setup</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">,</span>
<span class="n">target</span><span class="o">=</span><span class="s2">"Diabetes"</span><span class="p">,</span>
<span class="n">data_split_stratify</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">fix_imbalance</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">remove_outliers</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">normalize</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">feature_selection</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">remove_multicollinearity</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">session_id</span><span class="o">=</span><span class="mi">42</span><span class="p">)</span>
</code></pre></div>
<style type="text/css">
#T_8dfdd_row8_col1, #T_8dfdd_row12_col1, #T_8dfdd_row14_col1, #T_8dfdd_row16_col1, #T_8dfdd_row18_col1 {
background-color: lightgreen;
}
</style>
<table id="T_8dfdd">
<thead>
<tr>
<th class="blank level0" > </th>
<th id="T_8dfdd_level0_col0" class="col_heading level0 col0" >Description</th>
<th id="T_8dfdd_level0_col1" class="col_heading level0 col1" >Value</th>
</tr>
</thead>
<tbody>
<tr>
<th id="T_8dfdd_level0_row0" class="row_heading level0 row0" >0</th>
<td id="T_8dfdd_row0_col0" class="data row0 col0" >Session id</td>
<td id="T_8dfdd_row0_col1" class="data row0 col1" >42</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row1" class="row_heading level0 row1" >1</th>
<td id="T_8dfdd_row1_col0" class="data row1 col0" >Target</td>
<td id="T_8dfdd_row1_col1" class="data row1 col1" >Diabetes</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row2" class="row_heading level0 row2" >2</th>
<td id="T_8dfdd_row2_col0" class="data row2 col0" >Target type</td>
<td id="T_8dfdd_row2_col1" class="data row2 col1" >Binary</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row3" class="row_heading level0 row3" >3</th>
<td id="T_8dfdd_row3_col0" class="data row3 col0" >Original data shape</td>
<td id="T_8dfdd_row3_col1" class="data row3 col1" >(70692, 22)</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row4" class="row_heading level0 row4" >4</th>
<td id="T_8dfdd_row4_col0" class="data row4 col0" >Transformed data shape</td>
<td id="T_8dfdd_row4_col1" class="data row4 col1" >(68683, 5)</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row5" class="row_heading level0 row5" >5</th>
<td id="T_8dfdd_row5_col0" class="data row5 col0" >Transformed train set shape</td>
<td id="T_8dfdd_row5_col1" class="data row5 col1" >(47421, 5)</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row6" class="row_heading level0 row6" >6</th>
<td id="T_8dfdd_row6_col0" class="data row6 col0" >Transformed test set shape</td>
<td id="T_8dfdd_row6_col1" class="data row6 col1" >(21208, 5)</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row7" class="row_heading level0 row7" >7</th>
<td id="T_8dfdd_row7_col0" class="data row7 col0" >Numeric features</td>
<td id="T_8dfdd_row7_col1" class="data row7 col1" >21</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row8" class="row_heading level0 row8" >8</th>
<td id="T_8dfdd_row8_col0" class="data row8 col0" >Preprocess</td>
<td id="T_8dfdd_row8_col1" class="data row8 col1" >True</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row9" class="row_heading level0 row9" >9</th>
<td id="T_8dfdd_row9_col0" class="data row9 col0" >Imputation type</td>
<td id="T_8dfdd_row9_col1" class="data row9 col1" >simple</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row10" class="row_heading level0 row10" >10</th>
<td id="T_8dfdd_row10_col0" class="data row10 col0" >Numeric imputation</td>
<td id="T_8dfdd_row10_col1" class="data row10 col1" >mean</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row11" class="row_heading level0 row11" >11</th>
<td id="T_8dfdd_row11_col0" class="data row11 col0" >Categorical imputation</td>
<td id="T_8dfdd_row11_col1" class="data row11 col1" >mode</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row12" class="row_heading level0 row12" >12</th>
<td id="T_8dfdd_row12_col0" class="data row12 col0" >Remove multicollinearity</td>
<td id="T_8dfdd_row12_col1" class="data row12 col1" >True</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row13" class="row_heading level0 row13" >13</th>
<td id="T_8dfdd_row13_col0" class="data row13 col0" >Multicollinearity threshold</td>
<td id="T_8dfdd_row13_col1" class="data row13 col1" >0.900000</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row14" class="row_heading level0 row14" >14</th>
<td id="T_8dfdd_row14_col0" class="data row14 col0" >Remove outliers</td>
<td id="T_8dfdd_row14_col1" class="data row14 col1" >True</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row15" class="row_heading level0 row15" >15</th>
<td id="T_8dfdd_row15_col0" class="data row15 col0" >Outliers threshold</td>
<td id="T_8dfdd_row15_col1" class="data row15 col1" >0.050000</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row16" class="row_heading level0 row16" >16</th>
<td id="T_8dfdd_row16_col0" class="data row16 col0" >Normalize</td>
<td id="T_8dfdd_row16_col1" class="data row16 col1" >True</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row17" class="row_heading level0 row17" >17</th>
<td id="T_8dfdd_row17_col0" class="data row17 col0" >Normalize method</td>
<td id="T_8dfdd_row17_col1" class="data row17 col1" >zscore</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row18" class="row_heading level0 row18" >18</th>
<td id="T_8dfdd_row18_col0" class="data row18 col0" >Feature selection</td>
<td id="T_8dfdd_row18_col1" class="data row18 col1" >True</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row19" class="row_heading level0 row19" >19</th>
<td id="T_8dfdd_row19_col0" class="data row19 col0" >Feature selection method</td>
<td id="T_8dfdd_row19_col1" class="data row19 col1" >classic</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row20" class="row_heading level0 row20" >20</th>
<td id="T_8dfdd_row20_col0" class="data row20 col0" >Feature selection estimator</td>
<td id="T_8dfdd_row20_col1" class="data row20 col1" >lightgbm</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row21" class="row_heading level0 row21" >21</th>
<td id="T_8dfdd_row21_col0" class="data row21 col0" >Number of features selected</td>
<td id="T_8dfdd_row21_col1" class="data row21 col1" >0.200000</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row22" class="row_heading level0 row22" >22</th>
<td id="T_8dfdd_row22_col0" class="data row22 col0" >Fold Generator</td>
<td id="T_8dfdd_row22_col1" class="data row22 col1" >StratifiedKFold</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row23" class="row_heading level0 row23" >23</th>
<td id="T_8dfdd_row23_col0" class="data row23 col0" >Fold Number</td>
<td id="T_8dfdd_row23_col1" class="data row23 col1" >10</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row24" class="row_heading level0 row24" >24</th>
<td id="T_8dfdd_row24_col0" class="data row24 col0" >CPU Jobs</td>
<td id="T_8dfdd_row24_col1" class="data row24 col1" >-1</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row25" class="row_heading level0 row25" >25</th>
<td id="T_8dfdd_row25_col0" class="data row25 col0" >Use GPU</td>
<td id="T_8dfdd_row25_col1" class="data row25 col1" >False</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row26" class="row_heading level0 row26" >26</th>
<td id="T_8dfdd_row26_col0" class="data row26 col0" >Log Experiment</td>
<td id="T_8dfdd_row26_col1" class="data row26 col1" >False</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row27" class="row_heading level0 row27" >27</th>
<td id="T_8dfdd_row27_col0" class="data row27 col0" >Experiment Name</td>
<td id="T_8dfdd_row27_col1" class="data row27 col1" >clf-default-name</td>
</tr>
<tr>
<th id="T_8dfdd_level0_row28" class="row_heading level0 row28" >28</th>
<td id="T_8dfdd_row28_col0" class="data row28 col0" >USI</td>
<td id="T_8dfdd_row28_col1" class="data row28 col1" >bd08</td>
</tr>
</tbody>
</table>
<p>We're telling pycaret that the target column is target="Diabetes". We're also asking the pycaret package to take care of several problems in the dataset. The fix_imbalance parameter tells pycaret to balance the target classes before training. The remove_outliers parameter tells the package to drop outlying samples from the training set. The feature_selection option tells the package to drop uninformative features from the training set. The remove_multicollinearity option tells the package to drop a feature if it is highly linearly correlated with another feature.</p>
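<p>For reference, the options described above map onto pycaret's setup() keyword arguments roughly as in the sketch below. This is reconstructed from the configuration table, not the exact call used in this post; any option not listed is assumed to keep its default value.</p>

```python
# Sketch of the setup() arguments described above, reconstructed from the
# configuration table printed by pycaret; values are assumptions, not the
# exact call used in this post.
setup_kwargs = dict(
    target="Diabetes",              # column the model will predict
    fix_imbalance=True,             # balance the target classes before training
    remove_outliers=True,           # drop outlying samples from the training set
    outliers_threshold=0.05,        # fraction of samples treated as outliers
    remove_multicollinearity=True,  # drop features highly correlated with others
    multicollinearity_threshold=0.9,
    feature_selection=True,         # keep only the most informative features
    normalize=True,                 # z-score scale the numeric features
)

# The actual call would then look like:
# from pycaret.classification import setup
# clf = setup(data=df, **setup_kwargs)  # df holds the diabetes dataset
```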
<p>After analyzing the dataset, we can see that pycaret removed some samples and some columns. The original dataset had 70,692 samples; after preprocessing, 68,683 remain. Pycaret also removed features: we started with 21 features, and only 5 remained after preprocessing. Pycaret has also added data imputers to the prediction pipeline; we'll use these later to deal with missing values when making predictions.</p>
<p>Once pycaret has been set up, we're ready to train some models. </p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">compare_models</span>
<span class="n">best_model</span> <span class="o">=</span> <span class="n">compare_models</span><span class="p">()</span>
</code></pre></div>
<style type="text/css">
#T_990fb th {
text-align: left;
}
#T_990fb_row0_col0, #T_990fb_row0_col3, #T_990fb_row0_col4, #T_990fb_row1_col0, #T_990fb_row1_col1, #T_990fb_row1_col2, #T_990fb_row1_col4, #T_990fb_row1_col5, #T_990fb_row1_col6, #T_990fb_row1_col7, #T_990fb_row2_col0, #T_990fb_row2_col1, #T_990fb_row2_col2, #T_990fb_row2_col3, #T_990fb_row2_col4, #T_990fb_row2_col5, #T_990fb_row2_col6, #T_990fb_row2_col7, #T_990fb_row3_col0, #T_990fb_row3_col1, #T_990fb_row3_col2, #T_990fb_row3_col3, #T_990fb_row3_col4, #T_990fb_row3_col5, #T_990fb_row3_col6, #T_990fb_row3_col7, #T_990fb_row4_col0, #T_990fb_row4_col1, #T_990fb_row4_col2, #T_990fb_row4_col3, #T_990fb_row4_col4, #T_990fb_row4_col5, #T_990fb_row4_col6, #T_990fb_row4_col7, #T_990fb_row5_col0, #T_990fb_row5_col1, #T_990fb_row5_col2, #T_990fb_row5_col3, #T_990fb_row5_col5, #T_990fb_row5_col6, #T_990fb_row5_col7, #T_990fb_row6_col0, #T_990fb_row6_col1, #T_990fb_row6_col2, #T_990fb_row6_col3, #T_990fb_row6_col4, #T_990fb_row6_col5, #T_990fb_row6_col6, #T_990fb_row6_col7, #T_990fb_row7_col0, #T_990fb_row7_col1, #T_990fb_row7_col2, #T_990fb_row7_col3, #T_990fb_row7_col4, #T_990fb_row7_col5, #T_990fb_row7_col6, #T_990fb_row7_col7, #T_990fb_row8_col0, #T_990fb_row8_col1, #T_990fb_row8_col2, #T_990fb_row8_col3, #T_990fb_row8_col4, #T_990fb_row8_col5, #T_990fb_row8_col6, #T_990fb_row8_col7, #T_990fb_row9_col0, #T_990fb_row9_col1, #T_990fb_row9_col2, #T_990fb_row9_col3, #T_990fb_row9_col4, #T_990fb_row9_col5, #T_990fb_row9_col6, #T_990fb_row9_col7, #T_990fb_row10_col0, #T_990fb_row10_col1, #T_990fb_row10_col2, #T_990fb_row10_col3, #T_990fb_row10_col4, #T_990fb_row10_col5, #T_990fb_row10_col6, #T_990fb_row10_col7, #T_990fb_row11_col0, #T_990fb_row11_col1, #T_990fb_row11_col2, #T_990fb_row11_col3, #T_990fb_row11_col4, #T_990fb_row11_col5, #T_990fb_row11_col6, #T_990fb_row11_col7, #T_990fb_row12_col0, #T_990fb_row12_col1, #T_990fb_row12_col2, #T_990fb_row12_col3, #T_990fb_row12_col4, #T_990fb_row12_col5, #T_990fb_row12_col6, #T_990fb_row12_col7, #T_990fb_row13_col0, 
#T_990fb_row13_col1, #T_990fb_row13_col2, #T_990fb_row13_col3, #T_990fb_row13_col4, #T_990fb_row13_col5, #T_990fb_row13_col6, #T_990fb_row13_col7 {
text-align: left;
}
#T_990fb_row0_col1, #T_990fb_row0_col2, #T_990fb_row0_col5, #T_990fb_row0_col6, #T_990fb_row0_col7, #T_990fb_row1_col3, #T_990fb_row5_col4 {
text-align: left;
background-color: yellow;
}
#T_990fb_row0_col8, #T_990fb_row1_col8, #T_990fb_row2_col8, #T_990fb_row3_col8, #T_990fb_row4_col8, #T_990fb_row5_col8, #T_990fb_row6_col8, #T_990fb_row7_col8, #T_990fb_row8_col8, #T_990fb_row9_col8, #T_990fb_row10_col8, #T_990fb_row11_col8, #T_990fb_row12_col8 {
text-align: left;
background-color: lightgrey;
}
#T_990fb_row13_col8 {
text-align: left;
background-color: yellow;
background-color: lightgrey;
}
</style>
<table id="T_990fb">
<thead>
<tr>
<th class="blank level0" > </th>
<th id="T_990fb_level0_col0" class="col_heading level0 col0" >Model</th>
<th id="T_990fb_level0_col1" class="col_heading level0 col1" >Accuracy</th>
<th id="T_990fb_level0_col2" class="col_heading level0 col2" >AUC</th>
<th id="T_990fb_level0_col3" class="col_heading level0 col3" >Recall</th>
<th id="T_990fb_level0_col4" class="col_heading level0 col4" >Prec.</th>
<th id="T_990fb_level0_col5" class="col_heading level0 col5" >F1</th>
<th id="T_990fb_level0_col6" class="col_heading level0 col6" >Kappa</th>
<th id="T_990fb_level0_col7" class="col_heading level0 col7" >MCC</th>
<th id="T_990fb_level0_col8" class="col_heading level0 col8" >TT (Sec)</th>
</tr>
</thead>
<tbody>
<tr>
<th id="T_990fb_level0_row0" class="row_heading level0 row0" >gbc</th>
<td id="T_990fb_row0_col0" class="data row0 col0" >Gradient Boosting Classifier</td>
<td id="T_990fb_row0_col1" class="data row0 col1" >0.7318</td>
<td id="T_990fb_row0_col2" class="data row0 col2" >0.8069</td>
<td id="T_990fb_row0_col3" class="data row0 col3" >0.7818</td>
<td id="T_990fb_row0_col4" class="data row0 col4" >0.7108</td>
<td id="T_990fb_row0_col5" class="data row0 col5" >0.7446</td>
<td id="T_990fb_row0_col6" class="data row0 col6" >0.4636</td>
<td id="T_990fb_row0_col7" class="data row0 col7" >0.4660</td>
<td id="T_990fb_row0_col8" class="data row0 col8" >0.6810</td>
</tr>
<tr>
<th id="T_990fb_level0_row1" class="row_heading level0 row1" >lightgbm</th>
<td id="T_990fb_row1_col0" class="data row1 col0" >Light Gradient Boosting Machine</td>
<td id="T_990fb_row1_col1" class="data row1 col1" >0.7313</td>
<td id="T_990fb_row1_col2" class="data row1 col2" >0.8056</td>
<td id="T_990fb_row1_col3" class="data row1 col3" >0.7823</td>
<td id="T_990fb_row1_col4" class="data row1 col4" >0.7100</td>
<td id="T_990fb_row1_col5" class="data row1 col5" >0.7444</td>
<td id="T_990fb_row1_col6" class="data row1 col6" >0.4627</td>
<td id="T_990fb_row1_col7" class="data row1 col7" >0.4652</td>
<td id="T_990fb_row1_col8" class="data row1 col8" >0.2000</td>
</tr>
<tr>
<th id="T_990fb_level0_row2" class="row_heading level0 row2" >ada</th>
<td id="T_990fb_row2_col0" class="data row2 col0" >Ada Boost Classifier</td>
<td id="T_990fb_row2_col1" class="data row2 col1" >0.7309</td>
<td id="T_990fb_row2_col2" class="data row2 col2" >0.8053</td>
<td id="T_990fb_row2_col3" class="data row2 col3" >0.7616</td>
<td id="T_990fb_row2_col4" class="data row2 col4" >0.7176</td>
<td id="T_990fb_row2_col5" class="data row2 col5" >0.7389</td>
<td id="T_990fb_row2_col6" class="data row2 col6" >0.4618</td>
<td id="T_990fb_row2_col7" class="data row2 col7" >0.4627</td>
<td id="T_990fb_row2_col8" class="data row2 col8" >0.3710</td>
</tr>
<tr>
<th id="T_990fb_level0_row3" class="row_heading level0 row3" >ridge</th>
<td id="T_990fb_row3_col0" class="data row3 col0" >Ridge Classifier</td>
<td id="T_990fb_row3_col1" class="data row3 col1" >0.7281</td>
<td id="T_990fb_row3_col2" class="data row3 col2" >0.0000</td>
<td id="T_990fb_row3_col3" class="data row3 col3" >0.7444</td>
<td id="T_990fb_row3_col4" class="data row3 col4" >0.7210</td>
<td id="T_990fb_row3_col5" class="data row3 col5" >0.7325</td>
<td id="T_990fb_row3_col6" class="data row3 col6" >0.4562</td>
<td id="T_990fb_row3_col7" class="data row3 col7" >0.4565</td>
<td id="T_990fb_row3_col8" class="data row3 col8" >0.1030</td>
</tr>
<tr>
<th id="T_990fb_level0_row4" class="row_heading level0 row4" >lda</th>
<td id="T_990fb_row4_col0" class="data row4 col0" >Linear Discriminant Analysis</td>
<td id="T_990fb_row4_col1" class="data row4 col1" >0.7281</td>
<td id="T_990fb_row4_col2" class="data row4 col2" >0.8007</td>
<td id="T_990fb_row4_col3" class="data row4 col3" >0.7444</td>
<td id="T_990fb_row4_col4" class="data row4 col4" >0.7210</td>
<td id="T_990fb_row4_col5" class="data row4 col5" >0.7325</td>
<td id="T_990fb_row4_col6" class="data row4 col6" >0.4562</td>
<td id="T_990fb_row4_col7" class="data row4 col7" >0.4565</td>
<td id="T_990fb_row4_col8" class="data row4 col8" >0.1040</td>
</tr>
<tr>
<th id="T_990fb_level0_row5" class="row_heading level0 row5" >lr</th>
<td id="T_990fb_row5_col0" class="data row5 col0" >Logistic Regression</td>
<td id="T_990fb_row5_col1" class="data row5 col1" >0.7279</td>
<td id="T_990fb_row5_col2" class="data row5 col2" >0.8015</td>
<td id="T_990fb_row5_col3" class="data row5 col3" >0.7409</td>
<td id="T_990fb_row5_col4" class="data row5 col4" >0.7222</td>
<td id="T_990fb_row5_col5" class="data row5 col5" >0.7314</td>
<td id="T_990fb_row5_col6" class="data row5 col6" >0.4559</td>
<td id="T_990fb_row5_col7" class="data row5 col7" >0.4561</td>
<td id="T_990fb_row5_col8" class="data row5 col8" >1.6740</td>
</tr>
<tr>
<th id="T_990fb_level0_row6" class="row_heading level0 row6" >svm</th>
<td id="T_990fb_row6_col0" class="data row6 col0" >SVM - Linear Kernel</td>
<td id="T_990fb_row6_col1" class="data row6 col1" >0.7265</td>
<td id="T_990fb_row6_col2" class="data row6 col2" >0.0000</td>
<td id="T_990fb_row6_col3" class="data row6 col3" >0.7552</td>
<td id="T_990fb_row6_col4" class="data row6 col4" >0.7148</td>
<td id="T_990fb_row6_col5" class="data row6 col5" >0.7338</td>
<td id="T_990fb_row6_col6" class="data row6 col6" >0.4529</td>
<td id="T_990fb_row6_col7" class="data row6 col7" >0.4545</td>
<td id="T_990fb_row6_col8" class="data row6 col8" >0.1080</td>
</tr>
<tr>
<th id="T_990fb_level0_row7" class="row_heading level0 row7" >qda</th>
<td id="T_990fb_row7_col0" class="data row7 col0" >Quadratic Discriminant Analysis</td>
<td id="T_990fb_row7_col1" class="data row7 col1" >0.7265</td>
<td id="T_990fb_row7_col2" class="data row7 col2" >0.7940</td>
<td id="T_990fb_row7_col3" class="data row7 col3" >0.7610</td>
<td id="T_990fb_row7_col4" class="data row7 col4" >0.7119</td>
<td id="T_990fb_row7_col5" class="data row7 col5" >0.7356</td>
<td id="T_990fb_row7_col6" class="data row7 col6" >0.4530</td>
<td id="T_990fb_row7_col7" class="data row7 col7" >0.4541</td>
<td id="T_990fb_row7_col8" class="data row7 col8" >0.1000</td>
</tr>
<tr>
<th id="T_990fb_level0_row8" class="row_heading level0 row8" >nb</th>
<td id="T_990fb_row8_col0" class="data row8 col0" >Naive Bayes</td>
<td id="T_990fb_row8_col1" class="data row8 col1" >0.7210</td>
<td id="T_990fb_row8_col2" class="data row8 col2" >0.7939</td>
<td id="T_990fb_row8_col3" class="data row8 col3" >0.7207</td>
<td id="T_990fb_row8_col4" class="data row8 col4" >0.7212</td>
<td id="T_990fb_row8_col5" class="data row8 col5" >0.7209</td>
<td id="T_990fb_row8_col6" class="data row8 col6" >0.4420</td>
<td id="T_990fb_row8_col7" class="data row8 col7" >0.4420</td>
<td id="T_990fb_row8_col8" class="data row8 col8" >0.1090</td>
</tr>
<tr>
<th id="T_990fb_level0_row9" class="row_heading level0 row9" >rf</th>
<td id="T_990fb_row9_col0" class="data row9 col0" >Random Forest Classifier</td>
<td id="T_990fb_row9_col1" class="data row9 col1" >0.7001</td>
<td id="T_990fb_row9_col2" class="data row9 col2" >0.7617</td>
<td id="T_990fb_row9_col3" class="data row9 col3" >0.7204</td>
<td id="T_990fb_row9_col4" class="data row9 col4" >0.6923</td>
<td id="T_990fb_row9_col5" class="data row9 col5" >0.7061</td>
<td id="T_990fb_row9_col6" class="data row9 col6" >0.4002</td>
<td id="T_990fb_row9_col7" class="data row9 col7" >0.4006</td>
<td id="T_990fb_row9_col8" class="data row9 col8" >1.1040</td>
</tr>
<tr>
<th id="T_990fb_level0_row10" class="row_heading level0 row10" >knn</th>
<td id="T_990fb_row10_col0" class="data row10 col0" >K Neighbors Classifier</td>
<td id="T_990fb_row10_col1" class="data row10 col1" >0.6942</td>
<td id="T_990fb_row10_col2" class="data row10 col2" >0.7485</td>
<td id="T_990fb_row10_col3" class="data row10 col3" >0.7166</td>
<td id="T_990fb_row10_col4" class="data row10 col4" >0.6859</td>
<td id="T_990fb_row10_col5" class="data row10 col5" >0.7008</td>
<td id="T_990fb_row10_col6" class="data row10 col6" >0.3883</td>
<td id="T_990fb_row10_col7" class="data row10 col7" >0.3888</td>
<td id="T_990fb_row10_col8" class="data row10 col8" >0.2640</td>
</tr>
<tr>
<th id="T_990fb_level0_row11" class="row_heading level0 row11" >et</th>
<td id="T_990fb_row11_col0" class="data row11 col0" >Extra Trees Classifier</td>
<td id="T_990fb_row11_col1" class="data row11 col1" >0.6900</td>
<td id="T_990fb_row11_col2" class="data row11 col2" >0.7467</td>
<td id="T_990fb_row11_col3" class="data row11 col3" >0.6760</td>
<td id="T_990fb_row11_col4" class="data row11 col4" >0.6955</td>
<td id="T_990fb_row11_col5" class="data row11 col5" >0.6856</td>
<td id="T_990fb_row11_col6" class="data row11 col6" >0.3800</td>
<td id="T_990fb_row11_col7" class="data row11 col7" >0.3802</td>
<td id="T_990fb_row11_col8" class="data row11 col8" >1.0550</td>
</tr>
<tr>
<th id="T_990fb_level0_row12" class="row_heading level0 row12" >dt</th>
<td id="T_990fb_row12_col0" class="data row12 col0" >Decision Tree Classifier</td>
<td id="T_990fb_row12_col1" class="data row12 col1" >0.6867</td>
<td id="T_990fb_row12_col2" class="data row12 col2" >0.7368</td>
<td id="T_990fb_row12_col3" class="data row12 col3" >0.6688</td>
<td id="T_990fb_row12_col4" class="data row12 col4" >0.6937</td>
<td id="T_990fb_row12_col5" class="data row12 col5" >0.6810</td>
<td id="T_990fb_row12_col6" class="data row12 col6" >0.3735</td>
<td id="T_990fb_row12_col7" class="data row12 col7" >0.3738</td>
<td id="T_990fb_row12_col8" class="data row12 col8" >0.1070</td>
</tr>
<tr>
<th id="T_990fb_level0_row13" class="row_heading level0 row13" >dummy</th>
<td id="T_990fb_row13_col0" class="data row13 col0" >Dummy Classifier</td>
<td id="T_990fb_row13_col1" class="data row13 col1" >0.5000</td>
<td id="T_990fb_row13_col2" class="data row13 col2" >0.5000</td>
<td id="T_990fb_row13_col3" class="data row13 col3" >0.0000</td>
<td id="T_990fb_row13_col4" class="data row13 col4" >0.0000</td>
<td id="T_990fb_row13_col5" class="data row13 col5" >0.0000</td>
<td id="T_990fb_row13_col6" class="data row13 col6" >0.0000</td>
<td id="T_990fb_row13_col7" class="data row13 col7" >0.0000</td>
<td id="T_990fb_row13_col8" class="data row13 col8" >0.0890</td>
</tr>
</tbody>
</table>
<p>The function displays a table of the model metrics, highlighting the models with the highest metrics in each category. The function also returns the best model found:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">best_model</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>GradientBoostingClassifier(ccp_alpha=0.0, criterion='friedman_mse', init=None,
learning_rate=0.1, loss='log_loss', max_depth=3,
max_features=None, max_leaf_nodes=None,
min_impurity_decrease=0.0, min_samples_leaf=1,
min_samples_split=2, min_weight_fraction_leaf=0.0,
n_estimators=100, n_iter_no_change=None,
random_state=42, subsample=1.0, tol=0.0001,
validation_fraction=0.1, verbose=0,
warm_start=False)
</code></pre></div>
<p>In this case, pycaret returned the GradientBoostingClassifier as the best model. The selected model has the highest accuracy, AUC, F1 score, Kappa, and MCC, but does not have the highest recall or precision. This first step only gives us an idea of how the different types of models perform on the problem; we'll still need to choose the model that meets our requirements. </p>
<p>There are other things to take into account when selecting a model. For example, some models need much more memory and CPU time to make predictions. In some situations, it is better to select a slightly less accurate model that is able to meet the resource requirements of the deployment environment.</p>
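<p>One way to compare candidates on this dimension is to time their prediction calls directly. The helper below is a generic sketch; the model and data names in the usage comment are placeholders, not objects defined in this post.</p>

```python
import time

def mean_latency_ms(predict, batch, n_runs=100):
    """Average wall-clock milliseconds per call of predict(batch)."""
    predict(batch)  # warm-up call, excluded from timing
    start = time.perf_counter()
    for _ in range(n_runs):
        predict(batch)
    return (time.perf_counter() - start) / n_runs * 1000.0

# Usage sketch, assuming gbc_model and ridge_model are fitted pipelines
# and X_test is a held-out feature set (hypothetical names):
# print(mean_latency_ms(gbc_model.predict, X_test))
# print(mean_latency_ms(ridge_model.predict, X_test))
```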
<p>The Gradient Boosting Classifier has the highest F1 score while also having high accuracy, so we'll keep working with it. To train a gbc model, we'll call the pycaret create_model() function.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">create_model</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">(</span><span class="s2">"gbc"</span><span class="p">)</span>
</code></pre></div>
<style type="text/css">
#T_95af0_row10_col0, #T_95af0_row10_col1, #T_95af0_row10_col2, #T_95af0_row10_col3, #T_95af0_row10_col4, #T_95af0_row10_col5, #T_95af0_row10_col6 {
background: yellow;
}
</style>
<table id="T_95af0">
<thead>
<tr>
<th class="blank level0" > </th>
<th id="T_95af0_level0_col0" class="col_heading level0 col0" >Accuracy</th>
<th id="T_95af0_level0_col1" class="col_heading level0 col1" >AUC</th>
<th id="T_95af0_level0_col2" class="col_heading level0 col2" >Recall</th>
<th id="T_95af0_level0_col3" class="col_heading level0 col3" >Prec.</th>
<th id="T_95af0_level0_col4" class="col_heading level0 col4" >F1</th>
<th id="T_95af0_level0_col5" class="col_heading level0 col5" >Kappa</th>
<th id="T_95af0_level0_col6" class="col_heading level0 col6" >MCC</th>
</tr>
<tr>
<th class="index_name level0" >Fold</th>
<th class="blank col0" > </th>
<th class="blank col1" > </th>
<th class="blank col2" > </th>
<th class="blank col3" > </th>
<th class="blank col4" > </th>
<th class="blank col5" > </th>
<th class="blank col6" > </th>
</tr>
</thead>
<tbody>
<tr>
<th id="T_95af0_level0_row0" class="row_heading level0 row0" >0</th>
<td id="T_95af0_row0_col0" class="data row0 col0" >0.7264</td>
<td id="T_95af0_row0_col1" class="data row0 col1" >0.8041</td>
<td id="T_95af0_row0_col2" class="data row0 col2" >0.7770</td>
<td id="T_95af0_row0_col3" class="data row0 col3" >0.7057</td>
<td id="T_95af0_row0_col4" class="data row0 col4" >0.7396</td>
<td id="T_95af0_row0_col5" class="data row0 col5" >0.4528</td>
<td id="T_95af0_row0_col6" class="data row0 col6" >0.4551</td>
</tr>
<tr>
<th id="T_95af0_level0_row1" class="row_heading level0 row1" >1</th>
<td id="T_95af0_row1_col0" class="data row1 col0" >0.7367</td>
<td id="T_95af0_row1_col1" class="data row1 col1" >0.8050</td>
<td id="T_95af0_row1_col2" class="data row1 col2" >0.7907</td>
<td id="T_95af0_row1_col3" class="data row1 col3" >0.7137</td>
<td id="T_95af0_row1_col4" class="data row1 col4" >0.7502</td>
<td id="T_95af0_row1_col5" class="data row1 col5" >0.4734</td>
<td id="T_95af0_row1_col6" class="data row1 col6" >0.4762</td>
</tr>
<tr>
<th id="T_95af0_level0_row2" class="row_heading level0 row2" >2</th>
<td id="T_95af0_row2_col0" class="data row2 col0" >0.7298</td>
<td id="T_95af0_row2_col1" class="data row2 col1" >0.8048</td>
<td id="T_95af0_row2_col2" class="data row2 col2" >0.7684</td>
<td id="T_95af0_row2_col3" class="data row2 col3" >0.7133</td>
<td id="T_95af0_row2_col4" class="data row2 col4" >0.7398</td>
<td id="T_95af0_row2_col5" class="data row2 col5" >0.4597</td>
<td id="T_95af0_row2_col6" class="data row2 col6" >0.4611</td>
</tr>
<tr>
<th id="T_95af0_level0_row3" class="row_heading level0 row3" >3</th>
<td id="T_95af0_row3_col0" class="data row3 col0" >0.7357</td>
<td id="T_95af0_row3_col1" class="data row3 col1" >0.8053</td>
<td id="T_95af0_row3_col2" class="data row3 col2" >0.7979</td>
<td id="T_95af0_row3_col3" class="data row3 col3" >0.7096</td>
<td id="T_95af0_row3_col4" class="data row3 col4" >0.7511</td>
<td id="T_95af0_row3_col5" class="data row3 col5" >0.4714</td>
<td id="T_95af0_row3_col6" class="data row3 col6" >0.4751</td>
</tr>
<tr>
<th id="T_95af0_level0_row4" class="row_heading level0 row4" >4</th>
<td id="T_95af0_row4_col0" class="data row4 col0" >0.7318</td>
<td id="T_95af0_row4_col1" class="data row4 col1" >0.8098</td>
<td id="T_95af0_row4_col2" class="data row4 col2" >0.7765</td>
<td id="T_95af0_row4_col3" class="data row4 col3" >0.7128</td>
<td id="T_95af0_row4_col4" class="data row4 col4" >0.7433</td>
<td id="T_95af0_row4_col5" class="data row4 col5" >0.4636</td>
<td id="T_95af0_row4_col6" class="data row4 col6" >0.4655</td>
</tr>
<tr>
<th id="T_95af0_level0_row5" class="row_heading level0 row5" >5</th>
<td id="T_95af0_row5_col0" class="data row5 col0" >0.7314</td>
<td id="T_95af0_row5_col1" class="data row5 col1" >0.8075</td>
<td id="T_95af0_row5_col2" class="data row5 col2" >0.7765</td>
<td id="T_95af0_row5_col3" class="data row5 col3" >0.7123</td>
<td id="T_95af0_row5_col4" class="data row5 col4" >0.7430</td>
<td id="T_95af0_row5_col5" class="data row5 col5" >0.4628</td>
<td id="T_95af0_row5_col6" class="data row5 col6" >0.4647</td>
</tr>
<tr>
<th id="T_95af0_level0_row6" class="row_heading level0 row6" >6</th>
<td id="T_95af0_row6_col0" class="data row6 col0" >0.7268</td>
<td id="T_95af0_row6_col1" class="data row6 col1" >0.7999</td>
<td id="T_95af0_row6_col2" class="data row6 col2" >0.7817</td>
<td id="T_95af0_row6_col3" class="data row6 col3" >0.7043</td>
<td id="T_95af0_row6_col4" class="data row6 col4" >0.7410</td>
<td id="T_95af0_row6_col5" class="data row6 col5" >0.4535</td>
<td id="T_95af0_row6_col6" class="data row6 col6" >0.4563</td>
</tr>
<tr>
<th id="T_95af0_level0_row7" class="row_heading level0 row7" >7</th>
<td id="T_95af0_row7_col0" class="data row7 col0" >0.7357</td>
<td id="T_95af0_row7_col1" class="data row7 col1" >0.8104</td>
<td id="T_95af0_row7_col2" class="data row7 col2" >0.7890</td>
<td id="T_95af0_row7_col3" class="data row7 col3" >0.7129</td>
<td id="T_95af0_row7_col4" class="data row7 col4" >0.7490</td>
<td id="T_95af0_row7_col5" class="data row7 col5" >0.4713</td>
<td id="T_95af0_row7_col6" class="data row7 col6" >0.4740</td>
</tr>
<tr>
<th id="T_95af0_level0_row8" class="row_heading level0 row8" >8</th>
<td id="T_95af0_row8_col0" class="data row8 col0" >0.7310</td>
<td id="T_95af0_row8_col1" class="data row8 col1" >0.8060</td>
<td id="T_95af0_row8_col2" class="data row8 col2" >0.7789</td>
<td id="T_95af0_row8_col3" class="data row8 col3" >0.7108</td>
<td id="T_95af0_row8_col4" class="data row8 col4" >0.7433</td>
<td id="T_95af0_row8_col5" class="data row8 col5" >0.4620</td>
<td id="T_95af0_row8_col6" class="data row8 col6" >0.4641</td>
</tr>
<tr>
<th id="T_95af0_level0_row9" class="row_heading level0 row9" >9</th>
<td id="T_95af0_row9_col0" class="data row9 col0" >0.7328</td>
<td id="T_95af0_row9_col1" class="data row9 col1" >0.8164</td>
<td id="T_95af0_row9_col2" class="data row9 col2" >0.7813</td>
<td id="T_95af0_row9_col3" class="data row9 col3" >0.7122</td>
<td id="T_95af0_row9_col4" class="data row9 col4" >0.7452</td>
<td id="T_95af0_row9_col5" class="data row9 col5" >0.4656</td>
<td id="T_95af0_row9_col6" class="data row9 col6" >0.4678</td>
</tr>
<tr>
<th id="T_95af0_level0_row10" class="row_heading level0 row10" >Mean</th>
<td id="T_95af0_row10_col0" class="data row10 col0" >0.7318</td>
<td id="T_95af0_row10_col1" class="data row10 col1" >0.8069</td>
<td id="T_95af0_row10_col2" class="data row10 col2" >0.7818</td>
<td id="T_95af0_row10_col3" class="data row10 col3" >0.7108</td>
<td id="T_95af0_row10_col4" class="data row10 col4" >0.7446</td>
<td id="T_95af0_row10_col5" class="data row10 col5" >0.4636</td>
<td id="T_95af0_row10_col6" class="data row10 col6" >0.4660</td>
</tr>
<tr>
<th id="T_95af0_level0_row11" class="row_heading level0 row11" >Std</th>
<td id="T_95af0_row11_col0" class="data row11 col0" >0.0034</td>
<td id="T_95af0_row11_col1" class="data row11 col1" >0.0042</td>
<td id="T_95af0_row11_col2" class="data row11 col2" >0.0081</td>
<td id="T_95af0_row11_col3" class="data row11 col3" >0.0031</td>
<td id="T_95af0_row11_col4" class="data row11 col4" >0.0040</td>
<td id="T_95af0_row11_col5" class="data row11 col5" >0.0068</td>
<td id="T_95af0_row11_col6" class="data row11 col6" >0.0070</td>
</tr>
</tbody>
</table>
<p>Once the model has been created, we can do hyperparameter tuning with the tune_model() function.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">tune_model</span>
<span class="n">tuned_model</span> <span class="o">=</span> <span class="n">tune_model</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">n_iter</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">optimize</span><span class="o">=</span><span class="s2">"F1"</span><span class="p">)</span>
</code></pre></div>
<style type="text/css">
#T_c0a20_row10_col0, #T_c0a20_row10_col1, #T_c0a20_row10_col2, #T_c0a20_row10_col3, #T_c0a20_row10_col4, #T_c0a20_row10_col5, #T_c0a20_row10_col6 {
background: yellow;
}
</style>
<table id="T_c0a20">
<thead>
<tr>
<th class="blank level0" > </th>
<th id="T_c0a20_level0_col0" class="col_heading level0 col0" >Accuracy</th>
<th id="T_c0a20_level0_col1" class="col_heading level0 col1" >AUC</th>
<th id="T_c0a20_level0_col2" class="col_heading level0 col2" >Recall</th>
<th id="T_c0a20_level0_col3" class="col_heading level0 col3" >Prec.</th>
<th id="T_c0a20_level0_col4" class="col_heading level0 col4" >F1</th>
<th id="T_c0a20_level0_col5" class="col_heading level0 col5" >Kappa</th>
<th id="T_c0a20_level0_col6" class="col_heading level0 col6" >MCC</th>
</tr>
<tr>
<th class="index_name level0" >Fold</th>
<th class="blank col0" > </th>
<th class="blank col1" > </th>
<th class="blank col2" > </th>
<th class="blank col3" > </th>
<th class="blank col4" > </th>
<th class="blank col5" > </th>
<th class="blank col6" > </th>
</tr>
</thead>
<tbody>
<tr>
<th id="T_c0a20_level0_row0" class="row_heading level0 row0" >0</th>
<td id="T_c0a20_row0_col0" class="data row0 col0" >0.7296</td>
<td id="T_c0a20_row0_col1" class="data row0 col1" >0.8041</td>
<td id="T_c0a20_row0_col2" class="data row0 col2" >0.7826</td>
<td id="T_c0a20_row0_col3" class="data row0 col3" >0.7077</td>
<td id="T_c0a20_row0_col4" class="data row0 col4" >0.7433</td>
<td id="T_c0a20_row0_col5" class="data row0 col5" >0.4593</td>
<td id="T_c0a20_row0_col6" class="data row0 col6" >0.4619</td>
</tr>
<tr>
<th id="T_c0a20_level0_row1" class="row_heading level0 row1" >1</th>
<td id="T_c0a20_row1_col0" class="data row1 col0" >0.7377</td>
<td id="T_c0a20_row1_col1" class="data row1 col1" >0.8051</td>
<td id="T_c0a20_row1_col2" class="data row1 col2" >0.7952</td>
<td id="T_c0a20_row1_col3" class="data row1 col3" >0.7133</td>
<td id="T_c0a20_row1_col4" class="data row1 col4" >0.7520</td>
<td id="T_c0a20_row1_col5" class="data row1 col5" >0.4754</td>
<td id="T_c0a20_row1_col6" class="data row1 col6" >0.4786</td>
</tr>
<tr>
<th id="T_c0a20_level0_row2" class="row_heading level0 row2" >2</th>
<td id="T_c0a20_row2_col0" class="data row2 col0" >0.7254</td>
<td id="T_c0a20_row2_col1" class="data row2 col1" >0.8024</td>
<td id="T_c0a20_row2_col2" class="data row2 col2" >0.7668</td>
<td id="T_c0a20_row2_col3" class="data row2 col3" >0.7081</td>
<td id="T_c0a20_row2_col4" class="data row2 col4" >0.7363</td>
<td id="T_c0a20_row2_col5" class="data row2 col5" >0.4508</td>
<td id="T_c0a20_row2_col6" class="data row2 col6" >0.4524</td>
</tr>
<tr>
<th id="T_c0a20_level0_row3" class="row_heading level0 row3" >3</th>
<td id="T_c0a20_row3_col0" class="data row3 col0" >0.7341</td>
<td id="T_c0a20_row3_col1" class="data row3 col1" >0.8054</td>
<td id="T_c0a20_row3_col2" class="data row3 col2" >0.7971</td>
<td id="T_c0a20_row3_col3" class="data row3 col3" >0.7078</td>
<td id="T_c0a20_row3_col4" class="data row3 col4" >0.7498</td>
<td id="T_c0a20_row3_col5" class="data row3 col5" >0.4682</td>
<td id="T_c0a20_row3_col6" class="data row3 col6" >0.4720</td>
</tr>
<tr>
<th id="T_c0a20_level0_row4" class="row_heading level0 row4" >4</th>
<td id="T_c0a20_row4_col0" class="data row4 col0" >0.7316</td>
<td id="T_c0a20_row4_col1" class="data row4 col1" >0.8088</td>
<td id="T_c0a20_row4_col2" class="data row4 col2" >0.7736</td>
<td id="T_c0a20_row4_col3" class="data row4 col3" >0.7136</td>
<td id="T_c0a20_row4_col4" class="data row4 col4" >0.7424</td>
<td id="T_c0a20_row4_col5" class="data row4 col5" >0.4632</td>
<td id="T_c0a20_row4_col6" class="data row4 col6" >0.4649</td>
</tr>
<tr>
<th id="T_c0a20_level0_row5" class="row_heading level0 row5" >5</th>
<td id="T_c0a20_row5_col0" class="data row5 col0" >0.7357</td>
<td id="T_c0a20_row5_col1" class="data row5 col1" >0.8055</td>
<td id="T_c0a20_row5_col2" class="data row5 col2" >0.7793</td>
<td id="T_c0a20_row5_col3" class="data row5 col3" >0.7167</td>
<td id="T_c0a20_row5_col4" class="data row5 col4" >0.7467</td>
<td id="T_c0a20_row5_col5" class="data row5 col5" >0.4713</td>
<td id="T_c0a20_row5_col6" class="data row5 col6" >0.4731</td>
</tr>
<tr>
<th id="T_c0a20_level0_row6" class="row_heading level0 row6" >6</th>
<td id="T_c0a20_row6_col0" class="data row6 col0" >0.7241</td>
<td id="T_c0a20_row6_col1" class="data row6 col1" >0.7996</td>
<td id="T_c0a20_row6_col2" class="data row6 col2" >0.7805</td>
<td id="T_c0a20_row6_col3" class="data row6 col3" >0.7014</td>
<td id="T_c0a20_row6_col4" class="data row6 col4" >0.7389</td>
<td id="T_c0a20_row6_col5" class="data row6 col5" >0.4483</td>
<td id="T_c0a20_row6_col6" class="data row6 col6" >0.4511</td>
</tr>
<tr>
<th id="T_c0a20_level0_row7" class="row_heading level0 row7" >7</th>
<td id="T_c0a20_row7_col0" class="data row7 col0" >0.7411</td>
<td id="T_c0a20_row7_col1" class="data row7 col1" >0.8086</td>
<td id="T_c0a20_row7_col2" class="data row7 col2" >0.7882</td>
<td id="T_c0a20_row7_col3" class="data row7 col3" >0.7204</td>
<td id="T_c0a20_row7_col4" class="data row7 col4" >0.7528</td>
<td id="T_c0a20_row7_col5" class="data row7 col5" >0.4822</td>
<td id="T_c0a20_row7_col6" class="data row7 col6" >0.4844</td>
</tr>
<tr>
<th id="T_c0a20_level0_row8" class="row_heading level0 row8" >8</th>
<td id="T_c0a20_row8_col0" class="data row8 col0" >0.7342</td>
<td id="T_c0a20_row8_col1" class="data row8 col1" >0.8055</td>
<td id="T_c0a20_row8_col2" class="data row8 col2" >0.7858</td>
<td id="T_c0a20_row8_col3" class="data row8 col3" >0.7123</td>
<td id="T_c0a20_row8_col4" class="data row8 col4" >0.7473</td>
<td id="T_c0a20_row8_col5" class="data row8 col5" >0.4685</td>
<td id="T_c0a20_row8_col6" class="data row8 col6" >0.4710</td>
</tr>
<tr>
<th id="T_c0a20_level0_row9" class="row_heading level0 row9" >9</th>
<td id="T_c0a20_row9_col0" class="data row9 col0" >0.7330</td>
<td id="T_c0a20_row9_col1" class="data row9 col1" >0.8149</td>
<td id="T_c0a20_row9_col2" class="data row9 col2" >0.7789</td>
<td id="T_c0a20_row9_col3" class="data row9 col3" >0.7134</td>
<td id="T_c0a20_row9_col4" class="data row9 col4" >0.7447</td>
<td id="T_c0a20_row9_col5" class="data row9 col5" >0.4660</td>
<td id="T_c0a20_row9_col6" class="data row9 col6" >0.4680</td>
</tr>
<tr>
<th id="T_c0a20_level0_row10" class="row_heading level0 row10" >Mean</th>
<td id="T_c0a20_row10_col0" class="data row10 col0" >0.7327</td>
<td id="T_c0a20_row10_col1" class="data row10 col1" >0.8060</td>
<td id="T_c0a20_row10_col2" class="data row10 col2" >0.7828</td>
<td id="T_c0a20_row10_col3" class="data row10 col3" >0.7115</td>
<td id="T_c0a20_row10_col4" class="data row10 col4" >0.7454</td>
<td id="T_c0a20_row10_col5" class="data row10 col5" >0.4653</td>
<td id="T_c0a20_row10_col6" class="data row10 col6" >0.4677</td>
</tr>
<tr>
<th id="T_c0a20_level0_row11" class="row_heading level0 row11" >Std</th>
<td id="T_c0a20_row11_col0" class="data row11 col0" >0.0050</td>
<td id="T_c0a20_row11_col1" class="data row11 col1" >0.0039</td>
<td id="T_c0a20_row11_col2" class="data row11 col2" >0.0088</td>
<td id="T_c0a20_row11_col3" class="data row11 col3" >0.0051</td>
<td id="T_c0a20_row11_col4" class="data row11 col4" >0.0051</td>
<td id="T_c0a20_row11_col5" class="data row11 col5" >0.0099</td>
<td id="T_c0a20_row11_col6" class="data row11 col6" >0.0100</td>
</tr>
</tbody>
</table>
<div class="highlight"><pre><span></span><code><span class="nv">Fitting</span> <span class="mi">10</span> <span class="nv">folds</span> <span class="k">for</span> <span class="nv">each</span> <span class="nv">of</span> <span class="mi">10</span> <span class="nv">candidates</span>, <span class="nv">totalling</span> <span class="mi">100</span> <span class="nv">fits</span>
</code></pre></div>
<p>We asked pycaret to maximize the F1 score of the model. By tuning the hyperparameters, we were able to raise the F1 score from 0.7446 to 0.7454. </p>
<h2>Validating the Model</h2>
<p>Pycaret is integrated with the <a href="https://www.scikit-yb.org/en/latest/">yellowbrick package</a> for creating visualizations. We can easily generate many standard plots to show the performance of our model.</p>
<p>The area under the curve plot can be generated like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">plot_model</span>
<span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"auc"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="AUC" src="https://www.tekhnoal.com/auc_sdfmlm.png" width="50%"></p>
<p>The AUC plot is useful for understanding the tradeoffs between the true positive rate and the false positive rate of the model's predictions.</p>
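<p>To make that tradeoff concrete, the AUC itself can be interpreted as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. Here is a minimal pure-Python sketch of that interpretation (for illustration only, this is not how yellowbrick computes the plot):</p>

```python
from itertools import product


def auc_score(pos_scores, neg_scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Pure-Python illustration, not an efficient
    implementation.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos_scores, neg_scores)
    )
    return wins / (len(pos_scores) * len(neg_scores))


# A perfect ranker scores every positive above every negative:
print(auc_score([0.9, 0.8], [0.2, 0.1]))  # 1.0
```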
<p>The confusion matrix can be plotted like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"confusion_matrix"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Confusion Matrix" src="https://www.tekhnoal.com/confusion_matrix_sdfmlm.png" width="50%"></p>
<p>The confusion matrix is useful for understanding which classes are being "confused" for each other by the model. The confusion matrix shows how many predictions were correctly and incorrectly made for each combination of classes.</p>
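<p>The underlying computation is just counting (true label, predicted label) pairs. A hypothetical minimal version, with rows for true labels and columns for predictions:</p>

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) label pairs; rows are true labels,
    columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# One class-0 sample was misclassified as class 1; both class-1 samples were correct.
print(confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], labels=[0, 1]))  # [[1, 1], [0, 2]]
```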
<p>The classification report can be plotted like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">plot</span> <span class="o">=</span> <span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"class_report"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Classification Report" src="https://www.tekhnoal.com/class_report_sdfmlm.png" width="50%"></p>
<p>The classification report shows the precision, recall, F1, and support metrics of the model for each class.</p>
<p>The class prediction error can be plotted like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">plot</span> <span class="o">=</span> <span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"error"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Class Prediction Error" src="https://www.tekhnoal.com/prediction_error_sdfmlm.png" width="50%"></p>
<p>The class prediction error is similar to the classification report and confusion matrix, but highlights the per-class prediction error of the model.</p>
<p>The feature importance can be plotted like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">plot</span> <span class="o">=</span> <span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"feature"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Feature Importance" src="https://www.tekhnoal.com/feature_importance_sdfmlm.png" width="50%"></p>
<p>The feature importance plot helps us understand which features contribute most to making accurate predictions.</p>
<p>The learning curve can be plotted like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">plot</span> <span class="o">=</span> <span class="n">plot_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">,</span> <span class="n">plot</span><span class="o">=</span><span class="s2">"learning"</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Learning Curve" src="https://www.tekhnoal.com/learning_curve_sdfmlm.png" width="50%"></p>
<p>The learning curve shows how the model's performance on the training and validation sets changes as the number of training samples grows. This is useful for diagnosing whether the model is underfitting or overfitting the dataset.</p>
<h2>Finalizing the Model</h2>
<p>Once we have a tuned and validated model, we can use the entire dataset to train it again, in order to leverage the data samples that were held out for the testing and validation sets. </p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pycaret.classification</span> <span class="kn">import</span> <span class="n">finalize_model</span>
<span class="n">finalized_model</span> <span class="o">=</span> <span class="n">finalize_model</span><span class="p">(</span><span class="n">tuned_model</span><span class="p">)</span>
</code></pre></div>
<p>Now that we have a trained, validated, and finalized model, we'll save it to disk for later use.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pickle</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"../diabetes_risk_model/model_files/model.pkl"</span><span class="p">,</span> <span class="s2">"wb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">finalized_model</span><span class="p">,</span> <span class="n">file</span><span class="p">)</span>
</code></pre></div>
<h2>Signing the Model Parameters</h2>
<p>Once we have the model parameters saved as a pickle file, we can sign the model parameters cryptographically. Signing the model parameters will enable us to ensure that the bytes that we are saving are exactly the same bytes that will be used to make predictions. The process involves creating a "signature" for the model parameters, and later verifying the signature.</p>
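<p>Conceptually, signing appends a keyed message authentication code (MAC) to the payload, and verification recomputes the code and compares it against the stored one. A minimal sketch of the idea with the standard library's <code>hmac</code> module (signing libraries add key derivation and encoding on top of this):</p>

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed with the secret key."""
    tag = hmac.new(key, data, hashlib.sha256).digest()
    return data + tag


def verify(signed: bytes, key: bytes) -> bytes:
    """Recompute the tag and compare in constant time; return the payload."""
    data, tag = signed[:-32], signed[-32:]  # SHA-256 tags are 32 bytes
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature does not match")
    return data


signed = sign(b"model bytes", key=b"secret")
print(verify(signed, b"secret"))  # b'model bytes'
```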
<p>To sign the model parameters we'll use the <a href="https://itsdangerous.palletsprojects.com/en/2.1.x/">itsdangerous package</a>. This package is useful for sending data through untrusted channels, where there is a chance that an attacker can modify the data.</p>
<p>Let's install the package:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">itsdangerous</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Signing messages requires that we come up with a secret key that is only known to us. We'll create a key and store it in a string variable:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">secrets</span>
<span class="kn">import</span> <span class="nn">string</span>
<span class="n">secret_key</span> <span class="o">=</span> <span class="s2">""</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">secrets</span><span class="o">.</span><span class="n">choice</span><span class="p">(</span><span class="n">string</span><span class="o">.</span><span class="n">ascii_uppercase</span> <span class="o">+</span> <span class="n">string</span><span class="o">.</span><span class="n">ascii_lowercase</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">64</span><span class="p">))</span>
<span class="n">secret_key</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>'wjtRFppXQpxTChQnNcQJKGlLHKJBmAHMepfFbqvOoUrnuxIsKdiLCrrypYFQsqcw'
</code></pre></div>
<p>Next, we'll load the model parameters that we just saved into a bytes object so that we can sign them:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"../diabetes_risk_model/model_files/model.pkl"</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">model_bytes</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
</code></pre></div>
<p>The signing process looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">itsdangerous</span> <span class="kn">import</span> <span class="n">Signer</span>
<span class="n">signer</span> <span class="o">=</span> <span class="n">Signer</span><span class="p">(</span><span class="n">secret_key</span><span class="p">)</span>
<span class="n">signed_model_bytes</span> <span class="o">=</span> <span class="n">signer</span><span class="o">.</span><span class="n">sign</span><span class="p">(</span><span class="n">model_bytes</span><span class="p">)</span>
</code></pre></div>
<p>The signed model bytes now have a signature appended to them, which means that the model can't be deserialized with pickle anymore. We have to unsign the bytes first. Here is how the unsigning process looks:</p>
<div class="highlight"><pre><span></span><code><span class="n">unsigned_model_bytes</span> <span class="o">=</span> <span class="n">signer</span><span class="o">.</span><span class="n">unsign</span><span class="p">(</span><span class="n">signed_model_bytes</span><span class="p">)</span>
</code></pre></div>
<p>The model bytes were verified using the secret key, and the signature was removed from the bytes object. Now we can unpickle the model object as we normally would:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pickle</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">unsigned_model_bytes</span><span class="p">)</span>
<span class="nb">type</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>pycaret.internal.pipeline.Pipeline
</code></pre></div>
<p>To show how the process would go if the model bytes were modified, let's add a single byte to the end of the signed bytes:</p>
<div class="highlight"><pre><span></span><code><span class="n">changed_signed_model_bytes</span> <span class="o">=</span> <span class="n">signed_model_bytes</span> <span class="o">+</span> <span class="nb">bytes</span><span class="p">([</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div>
<p>Now let's try to unsign the bytes object:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">itsdangerous</span> <span class="kn">import</span> <span class="n">BadSignature</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">signer</span><span class="o">.</span><span class="n">unsign</span><span class="p">(</span><span class="n">changed_signed_model_bytes</span><span class="p">)</span>
<span class="k">except</span> <span class="n">BadSignature</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"BadSignature exception raised!"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>BadSignature exception raised!
</code></pre></div>
<p>Because the bytes were modified, the unsign method raised an exception.</p>
<p>Let's save the signed model bytes to disk, alongside the original model pickle file we created above:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"../diabetes_risk_model/model_files/signed_model.pkl"</span><span class="p">,</span> <span class="s2">"wb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">file</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">signed_model_bytes</span><span class="p">)</span>
</code></pre></div>
<h2>Packaging the Model Parameters</h2>
<p>We now have signed model parameters. In order to deploy them we'll package them together with other results of the training process.</p>
<p>The model parameters are in the model_files folder:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">ls</span> <span class="o">-</span><span class="n">la</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>total 11936
drwxr-xr-x 6 brian staff 192 Mar 17 23:31 .
drwxr-xr-x 8 brian staff 256 Feb 25 23:50 ..
-rw-r--r--@ 1 brian staff 6148 Mar 15 22:40 .DS_Store
-rw-r--r--@ 1 brian staff 1261313 Mar 17 22:57 data_report.html
-rw-r--r-- 1 brian staff 2419848 Mar 17 23:20 model.pkl
-rw-r--r-- 1 brian staff 2419876 Mar 17 23:31 signed_model.pkl
</code></pre></div>
<p>In the process of training this model, we also created a few files, such as the descriptive report of the training set. We'll save those files alongside the model parameters in a zip file.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">shutil</span>
<span class="n">shutil</span><span class="o">.</span><span class="n">make_archive</span><span class="p">(</span><span class="s2">"../diabetes_risk_model/diabetes_risk_model-0.1.0-2023_03_17"</span><span class="p">,</span>
<span class="s2">"zip"</span><span class="p">,</span>
<span class="s2">"../diabetes_risk_model/model_files"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="s1">'</span><span class="s">/Users/brian/Code/securing-parameters-for-ml-models/diabetes_risk_model/diabetes_risk_model-0.1.0-2023_03_17.zip</span><span class="s1">'</span>
</code></pre></div>
<p>The command created a .zip file with all of the files in the model_files folder. The name of the zip file has the model name, model version, and today's date in it. This allows us to easily understand what the contents of the zip file are.</p>
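<p>The naming scheme can be generated programmatically. A small helper (hypothetical, not part of the project code) that builds the same <code>name-version-date</code> pattern:</p>

```python
from datetime import date
from typing import Optional


def archive_name(model_name: str, version: str, day: Optional[date] = None) -> str:
    """Build a 'name-version-YYYY_MM_DD' archive name,
    e.g. 'diabetes_risk_model-0.1.0-2023_03_17'."""
    day = day or date.today()
    return f"{model_name}-{version}-{day.strftime('%Y_%m_%d')}"


print(archive_name("diabetes_risk_model", "0.1.0", date(2023, 3, 17)))
# diabetes_risk_model-0.1.0-2023_03_17
```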
<p>Now that we have the model files in a .zip file, we can delete the original files from the folder:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">data_report</span><span class="o">.</span><span class="n">html</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">model</span><span class="o">.</span><span class="n">pkl</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">signed_model</span><span class="o">.</span><span class="n">pkl</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">mv</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">diabetes_risk_model</span><span class="o">-</span><span class="mf">0.1.0</span><span class="o">-</span><span class="mf">2023_03_17.</span><span class="n">zip</span> <span class="o">../</span><span class="n">diabetes_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">diabetes_risk_model</span><span class="o">-</span><span class="mf">0.1.0</span><span class="o">-</span><span class="mf">2023_03_17.</span><span class="n">zip</span>
</code></pre></div>
<p>This packaging process ensures that all of the results of the model training process end up in one archive that we can use later. The data and model check results are packaged along with the serialized model, so it's easy to review the model training process.</p>
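<p>When reviewing a packaged model later, we can list the archive's contents without extracting it. A short sketch using the standard library's <code>zipfile</code> module, with an in-memory archive standing in for the real model package:</p>

```python
import io
import zipfile


def package_contents(zip_source) -> list:
    """Return the sorted file names inside a model archive.

    Accepts a file path or any file-like object."""
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(archive.namelist())


# Demo: build a small in-memory archive and inspect it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("signed_model.pkl", b"...")
    archive.writestr("data_report.html", b"...")
print(package_contents(buffer))  # ['data_report.html', 'signed_model.pkl']
```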
<h2>Storing the Model Parameters</h2>
<p>In order to store the model parameters, we'll be using a local S3 compatible service called <a href="https://min.io/">minio</a>. The minio project replicates the S3 API, and also provides a docker image. </p>
<p>To use the minio service, we'll first need a folder to store the files that it will host:</p>
<div class="highlight"><pre><span></span><code><span class="n">mkdir</span> <span class="o">-</span><span class="n">p</span> <span class="o">../</span><span class="n">minio_data</span>
</code></pre></div>
<p>To run an instance of minio locally, use this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">9000</span><span class="p">:</span><span class="mi">9000</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">9001</span><span class="p">:</span><span class="mi">9001</span> \
<span class="o">-</span><span class="n">e</span> <span class="s2">"MINIO_ACCESS_KEY=TEST"</span> \
<span class="o">-</span><span class="n">e</span> <span class="s2">"MINIO_SECRET_KEY=ASDFGHJKL"</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">minio</span> \
<span class="o">-</span><span class="n">v</span> <span class="err">$</span><span class="p">(</span><span class="n">pwd</span><span class="p">)</span><span class="o">/../</span><span class="n">minio_data</span><span class="p">:</span><span class="o">/</span><span class="n">data</span> \
<span class="n">quay</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">minio</span><span class="o">/</span><span class="n">minio</span> <span class="n">server</span> <span class="o">/</span><span class="n">data</span> <span class="o">--</span><span class="n">console</span><span class="o">-</span><span class="n">address</span> <span class="s2">":9001"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>d5283c718b1b1dc8d60eadbc03a2834647088474431c66bd032eab726670c1d7
</code></pre></div>
<p>The minio service instance running in the docker container serves files from the local filesystem. When minio runs in this way, it exposes the folders it finds in the local filesystem as buckets, through the same API as the AWS S3 service.</p>
<p>In order to easily interact with the minio service, we'll use the <a href="https://pypi.org/project/minio/">minio package</a>.</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">minio</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Let's create a client to connect to the minio service:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">minio</span> <span class="kn">import</span> <span class="n">Minio</span>
<span class="n">minio_client</span> <span class="o">=</span> <span class="n">Minio</span><span class="p">(</span><span class="s2">"127.0.0.1:9000"</span><span class="p">,</span>
<span class="n">access_key</span><span class="o">=</span><span class="s1">'TEST'</span><span class="p">,</span>
<span class="n">secret_key</span><span class="o">=</span><span class="s1">'ASDFGHJKL'</span><span class="p">,</span>
<span class="n">secure</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</code></pre></div>
<p>Let's make a bucket for the model files:</p>
<div class="highlight"><pre><span></span><code><span class="n">minio_client</span><span class="o">.</span><span class="n">make_bucket</span><span class="p">(</span><span class="s2">"model-files"</span><span class="p">)</span>
</code></pre></div>
<p>Now let's upload the packaged model parameters to the bucket so that we can make predictions with the model parameters later.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">io</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"../diabetes_risk_model/model_files/diabetes_risk_model-0.1.0-2023_03_17.zip"</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">zip_bytes</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">minio_client</span><span class="o">.</span><span class="n">put_object</span><span class="p">(</span>
<span class="n">bucket_name</span><span class="o">=</span><span class="s2">"model-files"</span><span class="p">,</span>
<span class="n">object_name</span><span class="o">=</span><span class="s2">"diabetes_risk_model-0.1.0-2023_03_17.zip"</span><span class="p">,</span>
<span class="n">data</span><span class="o">=</span><span class="n">io</span><span class="o">.</span><span class="n">BytesIO</span><span class="p">(</span><span class="n">zip_bytes</span><span class="p">),</span>
<span class="n">length</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">zip_bytes</span><span class="p">)</span>
<span class="p">)</span>
</code></pre></div>
<p>The model parameters are now in place to be used for making predictions. The zip file shows up in the Minio UI:</p>
<p><img alt="Minio UI" src="https://www.tekhnoal.com/minio_ui_sdfmlm.png" width="100%"></p>
<p>We uploaded the model parameters to an external storage service to show how they can be hosted outside of the deployment itself. By signing the model parameters before storing them in the minio service, we can be sure that the parameters have not been tampered with even if the minio service is compromised. Because the parameters are signed, an attacker would also need the secret key in order to modify the parameters that the deployed model uses.</p>
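<p>Putting this together, the deployed model's load path becomes: download the signed bytes from storage, verify the signature, and only then unpickle. A hypothetical sketch of that order of operations, using a raw stdlib HMAC tag in place of the itsdangerous signature format:</p>

```python
import hashlib
import hmac
import pickle


def load_model(signed: bytes, key: bytes):
    """Verify the trailing HMAC-SHA256 tag, then unpickle the payload.

    Never call pickle.loads on bytes that have not been verified first."""
    payload, tag = signed[:-32], signed[-32:]  # SHA-256 tags are 32 bytes
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("model parameters failed signature verification")
    return pickle.loads(payload)


# Demo: sign a stand-in "model" and load it back through verification.
key = b"secret-key"
payload = pickle.dumps({"model": "demo"})
signed = payload + hmac.new(key, payload, hashlib.sha256).digest()
print(load_model(signed, key))  # {'model': 'demo'}
```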
<h2>Making Predictions with the Model</h2>
<p>We now have a working model that accepts Pandas dataframes as input and also returns predictions in dataframes. This is useful in the context of model training, but makes integrating the model with other software components a lot more complicated. To make the model easier to use, we'll need to create input and output schemas for the model and also create a wrapper class that provides a consistent interface for the model.</p>
<p>We'll create the model's input and output schemas with the <a href="https://docs.pydantic.dev/">pydantic package</a>, which is used for data validation. By creating the schemas using this package we're able to fully document the inputs that the model accepts and the expected outputs of the model we're going to deploy.</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">pydantic</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To begin, we'll define the allowed values for the categorical variables. The model uses three categorical variables, so we'll define three Enum classes that contain the values accepted for these variables. By using enumerated values, we can ensure that the model can only receive values in these inputs that it has previously seen in the training set.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">enum</span> <span class="kn">import</span> <span class="n">Enum</span>
<span class="k">class</span> <span class="nc">GeneralHealth</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""How would you say that in general your health is?"""</span>
<span class="n">EXCELLENT</span> <span class="o">=</span> <span class="s2">"EXCELLENT"</span>
<span class="n">VERY_GOOD</span> <span class="o">=</span> <span class="s2">"VERY_GOOD"</span>
<span class="n">GOOD</span> <span class="o">=</span> <span class="s2">"GOOD"</span>
<span class="n">FAIR</span> <span class="o">=</span> <span class="s2">"FAIR"</span>
<span class="n">POOR</span> <span class="o">=</span> <span class="s2">"POOR"</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">map</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"EXCELLENT"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span>
<span class="s2">"VERY_GOOD"</span><span class="p">:</span> <span class="mf">2.0</span><span class="p">,</span>
<span class="s2">"GOOD"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>
<span class="s2">"FAIR"</span><span class="p">:</span> <span class="mf">4.0</span><span class="p">,</span>
<span class="s2">"POOR"</span><span class="p">:</span> <span class="mf">5.0</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">mapping</span><span class="p">[</span><span class="n">value</span><span class="p">]</span>
<span class="k">class</span> <span class="nc">Age</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""How old are you?"""</span>
<span class="n">EIGHTEEN_TO_TWENTY_FOUR</span> <span class="o">=</span> <span class="s2">"EIGHTEEN_TO_TWENTY_FOUR"</span>
<span class="n">TWENTY_FIVE_TO_TWENTY_NINE</span> <span class="o">=</span> <span class="s2">"TWENTY_FIVE_TO_TWENTY_NINE"</span>
<span class="n">THIRTY_TO_THIRTY_FOUR</span> <span class="o">=</span> <span class="s2">"THIRTY_TO_THIRTY_FOUR"</span>
<span class="n">THIRTY_FIVE_TO_THIRTY_NINE</span> <span class="o">=</span> <span class="s2">"THIRTY_FIVE_TO_THIRTY_NINE"</span>
<span class="n">FORTY_TO_FORTY_FOUR</span> <span class="o">=</span> <span class="s2">"FORTY_TO_FORTY_FOUR"</span>
<span class="n">FORTY_FIVE_TO_FORTY_NINE</span> <span class="o">=</span> <span class="s2">"FORTY_FIVE_TO_FORTY_NINE"</span>
<span class="n">FIFTY_TO_FIFTY_FOUR</span> <span class="o">=</span> <span class="s2">"FIFTY_TO_FIFTY_FOUR"</span>
<span class="n">FIFTY_FIVE_TO_FIFTY_NINE</span> <span class="o">=</span> <span class="s2">"FIFTY_FIVE_TO_FIFTY_NINE"</span>
<span class="n">SIXTY_TO_SIXTY_FOUR</span> <span class="o">=</span> <span class="s2">"SIXTY_TO_SIXTY_FOUR"</span>
<span class="n">SIXTY_FIVE_TO_SIXTY_NINE</span> <span class="o">=</span> <span class="s2">"SIXTY_FIVE_TO_SIXTY_NINE"</span>
<span class="n">SEVENTY_TO_SEVENTY_FOUR</span> <span class="o">=</span> <span class="s2">"SEVENTY_TO_SEVENTY_FOUR"</span>
<span class="n">SEVENTY_FIVE_TO_SEVENTY_NINE</span> <span class="o">=</span> <span class="s2">"SEVENTY_FIVE_TO_SEVENTY_NINE"</span>
<span class="n">EIGHTY_OR_OLDER</span> <span class="o">=</span> <span class="s2">"EIGHTY_OR_OLDER"</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">map</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"EIGHTEEN_TO_TWENTY_FOUR"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span>
<span class="s2">"TWENTY_FIVE_TO_TWENTY_NINE"</span><span class="p">:</span> <span class="mf">2.0</span><span class="p">,</span>
<span class="s2">"THIRTY_TO_THIRTY_FOUR"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>
<span class="s2">"THIRTY_FIVE_TO_THIRTY_NINE"</span><span class="p">:</span> <span class="mf">4.0</span><span class="p">,</span>
<span class="s2">"FORTY_TO_FORTY_FOUR"</span><span class="p">:</span> <span class="mf">5.0</span><span class="p">,</span>
<span class="s2">"FORTY_FIVE_TO_FORTY_NINE"</span><span class="p">:</span> <span class="mf">6.0</span><span class="p">,</span>
<span class="s2">"FIFTY_TO_FIFTY_FOUR"</span><span class="p">:</span> <span class="mf">7.0</span><span class="p">,</span>
<span class="s2">"FIFTY_FIVE_TO_FIFTY_NINE"</span><span class="p">:</span> <span class="mf">8.0</span><span class="p">,</span>
<span class="s2">"SIXTY_TO_SIXTY_FOUR"</span><span class="p">:</span> <span class="mf">9.0</span><span class="p">,</span>
<span class="s2">"SIXTY_FIVE_TO_SIXTY_NINE"</span><span class="p">:</span> <span class="mf">10.0</span><span class="p">,</span>
<span class="s2">"SEVENTY_TO_SEVENTY_FOUR"</span><span class="p">:</span> <span class="mf">11.0</span><span class="p">,</span>
<span class="s2">"SEVENTY_FIVE_TO_SEVENTY_NINE"</span><span class="p">:</span> <span class="mf">12.0</span><span class="p">,</span>
<span class="s2">"EIGHTY_OR_OLDER"</span><span class="p">:</span> <span class="mf">13.0</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">mapping</span><span class="p">[</span><span class="n">value</span><span class="p">]</span>
<span class="k">class</span> <span class="nc">Income</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""What is your income?"""</span>
<span class="n">LESS_THAN_10K</span> <span class="o">=</span> <span class="s2">"LESS_THAN_10K"</span>
<span class="n">BETWEEN_10K_AND_15K</span> <span class="o">=</span> <span class="s2">"BETWEEN_10K_AND_15K"</span>
<span class="n">BETWEEN_15K_AND_20K</span> <span class="o">=</span> <span class="s2">"BETWEEN_15K_AND_20K"</span>
<span class="n">BETWEEN_20K_AND_25K</span> <span class="o">=</span> <span class="s2">"BETWEEN_20K_AND_25K"</span>
<span class="n">BETWEEN_25K_AND_35K</span> <span class="o">=</span> <span class="s2">"BETWEEN_25K_AND_35K"</span>
<span class="n">BETWEEN_35K_AND_50K</span> <span class="o">=</span> <span class="s2">"BETWEEN_35K_AND_50K"</span>
<span class="n">BETWEEN_50K_AND_75K</span> <span class="o">=</span> <span class="s2">"BETWEEN_50K_AND_75K"</span>
<span class="n">SEVENTY_FIVE_THOUSAND_OR_MORE</span> <span class="o">=</span> <span class="s2">"SEVENTY_FIVE_THOUSAND_OR_MORE"</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">map</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"LESS_THAN_10K"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_10K_AND_15K"</span><span class="p">:</span> <span class="mf">2.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_15K_AND_20K"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_20K_AND_25K"</span><span class="p">:</span> <span class="mf">4.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_25K_AND_35K"</span><span class="p">:</span> <span class="mf">5.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_35K_AND_50K"</span><span class="p">:</span> <span class="mf">6.0</span><span class="p">,</span>
<span class="s2">"BETWEEN_50K_AND_75K"</span><span class="p">:</span> <span class="mf">7.0</span><span class="p">,</span>
<span class="s2">"SEVENTY_FIVE_THOUSAND_OR_MORE"</span><span class="p">:</span> <span class="mf">8.0</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">mapping</span><span class="p">[</span><span class="n">value</span><span class="p">]</span>
</code></pre></div>
<p>The enum classes contain the values found in the original training dataset. These variables were encoded as numbers in the dataset, so we also added a map() method to each Enum class that returns the number corresponding to the enumerated value passed in. We'll use these map() methods later. </p>
<p>If we didn't provide these enumerated values and the mapping, we'd be asking the user of the model to encode the values before sending them to the model. They would have to read and understand the dataset documentation to create their prediction request. By creating enumerations for the categorical values, we make it much easier to use the model. </p>
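<p>To make this concrete, here is the GeneralHealth enum exercised on its own. This is a minimal, self-contained sketch that repeats the class definition from above so it can run standalone:</p>

```python
from enum import Enum


class GeneralHealth(str, Enum):
    """How would you say that in general your health is?"""
    EXCELLENT = "EXCELLENT"
    VERY_GOOD = "VERY_GOOD"
    GOOD = "GOOD"
    FAIR = "FAIR"
    POOR = "POOR"

    @staticmethod
    def map(value) -> float:
        # converts the enum's string value to the numeric encoding used in the dataset
        mapping = {
            "EXCELLENT": 1.0,
            "VERY_GOOD": 2.0,
            "GOOD": 3.0,
            "FAIR": 4.0,
            "POOR": 5.0
        }
        return mapping[value]


# the user works with readable string values...
health = GeneralHealth("FAIR")

# ...and map() recovers the numeric encoding the model was trained on
encoded = GeneralHealth.map(health.value)
```

<p>Because the enum inherits from str, the user can pass plain strings in requests while the map() method handles the translation back to the dataset's numeric encoding.</p>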
<p>Now that we have the categorical variables defined, we can define the input schema for the model:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
<span class="k">class</span> <span class="nc">DiabetesRiskModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">body_mass_index</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">60</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Body Mass Index."</span><span class="p">)</span>
<span class="n">general_health</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">GeneralHealth</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"How would you say that in general your health is?"</span><span class="p">)</span>
<span class="n">age</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Age</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"How old are you?"</span><span class="p">)</span>
<span class="n">income</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Income</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"What is your income?"</span><span class="p">)</span>
</code></pre></div>
<p>The schema is called "DiabetesRiskModelInput" and contains a field for each variable. We're using the Enum classes defined above for the categorical fields, and a numerical field for body_mass_index. The numerical field has a range of allowed values that matches the range of the variable found in the dataset. Each field also has a description that helps the user of the model feed it the correct data. We only have 4 input variables because the feature selection process removed 17 features from the model's training set.</p>
<p>The process of creating an input data schema exposes information about the dataset the model was trained on to the user of the model. For example, the body_mass_index field only allows values between 15 and 60, which is the range of that variable in the training set.</p>
<p>To show how the schema classes work, let's try to instantiate the DiabetesRiskModelInput class:</p>
<div class="highlight"><pre><span></span><code><span class="n">input_instance</span> <span class="o">=</span> <span class="n">DiabetesRiskModelInput</span><span class="p">(</span>
<span class="n">body_mass_index</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">general_health</span><span class="o">=</span><span class="n">GeneralHealth</span><span class="o">.</span><span class="n">VERY_GOOD</span><span class="p">,</span>
<span class="n">age</span><span class="o">=</span><span class="n">Age</span><span class="o">.</span><span class="n">THIRTY_TO_THIRTY_FOUR</span><span class="p">,</span>
<span class="n">income</span><span class="o">=</span><span class="n">Income</span><span class="o">.</span><span class="n">BETWEEN_20K_AND_25K</span>
<span class="p">)</span>
<span class="n">input_instance</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>DiabetesRiskModelInput(body_mass_index=20, general_health=<GeneralHealth.VERY_GOOD: 'VERY_GOOD'>, age=<Age.THIRTY_TO_THIRTY_FOUR: 'THIRTY_TO_THIRTY_FOUR'>, income=<Income.BETWEEN_20K_AND_25K: 'BETWEEN_20K_AND_25K'>)
</code></pre></div>
<p>The instance of the schema class contains all of the information needed to make a prediction.</p>
<p>Now let's try to instantiate it with values that are not allowed by the schema:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">ValidationError</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">input_instance</span> <span class="o">=</span> <span class="n">DiabetesRiskModelInput</span><span class="p">(</span>
<span class="n">body_mass_index</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
<span class="n">general_health</span><span class="o">=</span><span class="n">GeneralHealth</span><span class="o">.</span><span class="n">VERY_GOOD</span><span class="p">,</span>
<span class="n">age</span><span class="o">=</span><span class="n">Age</span><span class="o">.</span><span class="n">THIRTY_TO_THIRTY_FOUR</span><span class="p">,</span>
<span class="n">income</span><span class="o">=</span><span class="n">Income</span><span class="o">.</span><span class="n">BETWEEN_20K_AND_25K</span><span class="p">)</span>
<span class="k">except</span> <span class="n">ValidationError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"ValidationError exception raised!"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>1 validation error for DiabetesRiskModelInput
body_mass_index
  ensure this value is greater than or equal to 15 (type=value_error.number.not_ge; limit_value=15)
ValidationError exception raised!
</code></pre></div>
<p>The class was not instantiated successfully because the value for body_mass_index is too low and the model cannot accept it. By using the pydantic package, we're able to describe exactly what values the model is able to accept.</p>
<p>We can also omit values that are marked as optional:</p>
<div class="highlight"><pre><span></span><code><span class="n">input_instance</span> <span class="o">=</span> <span class="n">DiabetesRiskModelInput</span><span class="p">(</span>
<span class="n">body_mass_index</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">age</span><span class="o">=</span><span class="n">Age</span><span class="o">.</span><span class="n">THIRTY_TO_THIRTY_FOUR</span><span class="p">,</span>
<span class="n">income</span><span class="o">=</span><span class="n">Income</span><span class="o">.</span><span class="n">BETWEEN_20K_AND_25K</span><span class="p">)</span>
<span class="n">input_instance</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>DiabetesRiskModelInput(body_mass_index=20, general_health=None, age=<Age.THIRTY_TO_THIRTY_FOUR: 'THIRTY_TO_THIRTY_FOUR'>, income=<Income.BETWEEN_20K_AND_25K: 'BETWEEN_20K_AND_25K'>)
</code></pre></div>
<p>In this case, we did not provide a value for general_health, which defaults to None. We can do this because the model has built-in imputers that fill in a value whenever the user does not provide one. </p>
<p>Now that we have the model's input schema defined, we'll define the output schema:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">DiabetesRisk</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""Risk of diabetes."""</span>
<span class="n">NO_DIABETES</span> <span class="o">=</span> <span class="s2">"NO_DIABETES"</span>
<span class="n">DIABETES</span> <span class="o">=</span> <span class="s2">"DIABETES"</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">map</span><span class="p">(</span><span class="n">value</span><span class="p">:</span> <span class="nb">float</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="p">{</span>
<span class="mf">0.0</span><span class="p">:</span> <span class="n">DiabetesRisk</span><span class="o">.</span><span class="n">NO_DIABETES</span><span class="p">,</span>
<span class="mf">1.0</span><span class="p">:</span> <span class="n">DiabetesRisk</span><span class="o">.</span><span class="n">DIABETES</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">mapping</span><span class="p">[</span><span class="n">value</span><span class="p">]</span>
<span class="k">class</span> <span class="nc">DiabetesRiskModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Diabetes risk model output."""</span>
<span class="n">diabetes_risk</span><span class="p">:</span> <span class="n">DiabetesRisk</span>
</code></pre></div>
<p>The model is a classification model and the output schema simply enumerates the classes that the model can predict. We also added a map() method to the DiabetesRisk class in order to map the number that is output by the model to a value that can be returned to the user.</p>
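<p>To show the mapping direction, here is the DiabetesRisk class exercised on its own. This is a self-contained sketch that repeats the definition from above so it can run standalone:</p>

```python
from enum import Enum


class DiabetesRisk(str, Enum):
    """Risk of diabetes."""
    NO_DIABETES = "NO_DIABETES"
    DIABETES = "DIABETES"

    @staticmethod
    def map(value: float) -> str:
        # converts the model's raw numeric prediction into the enumerated class
        mapping = {
            0.0: DiabetesRisk.NO_DIABETES,
            1.0: DiabetesRisk.DIABETES
        }
        return mapping[value]


# the classifier outputs 0.0 or 1.0; map() turns that into a user-friendly value
prediction = DiabetesRisk.map(1.0)
```

<p>Note that this map() goes in the opposite direction from the input enums: instead of encoding a string for the model, it decodes the model's numeric output for the user.</p>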
<p>One of the benefits of using the pydantic package is that each schema class can create a JSON Schema description for itself:</p>
<div class="highlight"><pre><span></span><code><span class="n">json_schema</span> <span class="o">=</span> <span class="n">DiabetesRiskModelOutput</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
<span class="n">json_schema</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'DiabetesRiskModelOutput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Diabetes risk model output.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'diabetes_risk'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/DiabetesRisk'</span><span class="p">}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'required'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'diabetes_risk'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'DiabetesRisk'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'DiabetesRisk'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Risk of diabetes.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'NO_DIABETES'</span><span class="p">,</span><span class="w"> </span><span class="s1">'DIABETES'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>JSON schemas are useful for documenting data structures. We'll use this JSON schema later in order to automatically build documentation for the deployed model.</p>
<p>Now that we have the input and output schemas defined, we can tie it all together by creating a wrapper class for the model. To do this we'll use the <a href="https://pypi.org/project/ml-base/">ml_base package</a>. </p>
<p>To install the ml_base package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">ml_base</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The ml_base package defines a simple base class for model prediction code that allows us to "wrap" the prediction code for a model in a class that follows the MLModel interface. The interface publishes the following information about the model:</p>
<ul>
<li>Qualified Name, a unique identifier for the model.</li>
<li>Display Name, a friendly name for the model used in user interfaces.</li>
<li>Description, a description for the model.</li>
<li>Version, semantic version of the model codebase.</li>
<li>Input Schema, an object that describes the model's input data.</li>
<li>Output Schema, an object that describes the model's output schema.</li>
</ul>
<p>The MLModel interface dictates that the model class implements two methods:</p>
<ul>
<li>__init__(), the initialization method, which loads any model parameters needed to make predictions</li>
<li>predict(), the prediction method, which receives model inputs, makes a prediction, and returns model outputs</li>
</ul>
<p>By using the MLModel base class we'll be able to do more interesting things later with the model. If you'd like to learn more about the ml_base package, <a href="https://schmidtbri.github.io/ml-base/basic/">here</a> is some documentation about it.</p>
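<p>The shape of the interface can be sketched with a plain abstract base class. This is an illustration only, not the actual ml_base implementation, and the EchoModel class is a made-up example:</p>

```python
from abc import ABC, abstractmethod


class MLModelSketch(ABC):
    """Simplified stand-in for the ml_base MLModel interface."""

    @property
    @abstractmethod
    def qualified_name(self) -> str:
        """Unique identifier for the model."""

    @abstractmethod
    def predict(self, data):
        """Receive model input, make a prediction, and return model output."""


class EchoModel(MLModelSketch):
    """Trivial model that returns its input unchanged."""

    @property
    def qualified_name(self) -> str:
        return "echo_model"

    def predict(self, data):
        return data


model = EchoModel()
result = model.predict({"body_mass_index": 20})
```

<p>Because every model exposes the same properties and predict() method, code that serves or decorates models can treat them interchangeably, which is what makes the decorator pattern possible later on.</p>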
<p>We'll define the wrapper class like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">from</span> <span class="nn">io</span> <span class="kn">import</span> <span class="n">BytesIO</span>
<span class="kn">import</span> <span class="nn">zipfile</span>
<span class="kn">from</span> <span class="nn">itsdangerous</span> <span class="kn">import</span> <span class="n">Signer</span>
<span class="kn">from</span> <span class="nn">minio</span> <span class="kn">import</span> <span class="n">Minio</span>
<span class="kn">from</span> <span class="nn">ml_base</span> <span class="kn">import</span> <span class="n">MLModel</span>
<span class="k">class</span> <span class="nc">DiabetesRiskModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="sd">"""Prediction logic for the Diabetes Risk Model."""</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Diabetes Risk Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"diabetes_risk_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Model to predict the diabetes risk of a patient."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"0.1.0"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">DiabetesRiskModelInput</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">DiabetesRiskModelOutput</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model_parameters_version</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">model_files_bucket</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">minio_url</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">minio_access_key</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">minio_secret_key</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">parameters_signing_key</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
<span class="c1"># retrieving values from environment variables if the values provided have ${} around them</span>
<span class="k">if</span> <span class="n">minio_access_key</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"${"</span> <span class="ow">and</span> <span class="n">minio_access_key</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"}"</span><span class="p">:</span>
<span class="n">minio_access_key</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="n">minio_access_key</span><span class="p">[</span><span class="mi">2</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]]</span>
<span class="k">if</span> <span class="n">minio_secret_key</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"${"</span> <span class="ow">and</span> <span class="n">minio_secret_key</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"}"</span><span class="p">:</span>
<span class="n">minio_secret_key</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="n">minio_secret_key</span><span class="p">[</span><span class="mi">2</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]]</span>
<span class="k">if</span> <span class="n">parameters_signing_key</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"${"</span> <span class="ow">and</span> <span class="n">parameters_signing_key</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"}"</span><span class="p">:</span>
<span class="n">parameters_signing_key</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="n">parameters_signing_key</span><span class="p">[</span><span class="mi">2</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]]</span>
<span class="n">minio_client</span> <span class="o">=</span> <span class="n">Minio</span><span class="p">(</span><span class="n">minio_url</span><span class="p">,</span>
<span class="n">access_key</span><span class="o">=</span><span class="n">minio_access_key</span><span class="p">,</span>
<span class="n">secret_key</span><span class="o">=</span><span class="n">minio_secret_key</span><span class="p">,</span>
<span class="n">secure</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="c1"># accessing the model file stored in minio</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">minio_client</span><span class="o">.</span><span class="n">get_object</span><span class="p">(</span><span class="n">model_files_bucket</span><span class="p">,</span>
<span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">qualified_name</span><span class="si">}</span><span class="s2">-</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">version</span><span class="si">}</span><span class="s2">-</span><span class="si">{</span><span class="n">model_parameters_version</span><span class="si">}</span><span class="s2">.zip"</span><span class="p">)</span>
<span class="n">zip_bytes</span> <span class="o">=</span> <span class="n">BytesIO</span><span class="p">(</span><span class="n">response</span><span class="o">.</span><span class="n">data</span><span class="p">)</span>
<span class="n">response</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">response</span><span class="o">.</span><span class="n">release_conn</span><span class="p">()</span>
<span class="c1"># unzipping the parameters</span>
<span class="k">with</span> <span class="n">zipfile</span><span class="o">.</span><span class="n">ZipFile</span><span class="p">(</span><span class="n">zip_bytes</span><span class="p">)</span> <span class="k">as</span> <span class="n">zf</span><span class="p">:</span>
<span class="k">if</span> <span class="s2">"signed_model.pkl"</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">zf</span><span class="o">.</span><span class="n">namelist</span><span class="p">():</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Could not find signed model file in zip file."</span><span class="p">)</span>
<span class="n">signed_model_bytes</span> <span class="o">=</span> <span class="n">zf</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="s2">"signed_model.pkl"</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span><span class="s2">"Could not access model file."</span><span class="p">)</span> <span class="kn">from</span> <span class="nn">e</span>
<span class="c1"># checking the signed parameters</span>
<span class="n">signer</span> <span class="o">=</span> <span class="n">Signer</span><span class="p">(</span><span class="n">parameters_signing_key</span><span class="p">)</span>
<span class="n">unsigned_model_bytes</span> <span class="o">=</span> <span class="n">signer</span><span class="o">.</span><span class="n">unsign</span><span class="p">(</span><span class="n">signed_model_bytes</span><span class="p">)</span>
<span class="c1"># unpickling the model object</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">unsigned_model_bytes</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="n">DiabetesRiskModelInput</span><span class="p">)</span> <span class="o">-></span> <span class="n">DiabetesRiskModelOutput</span><span class="p">:</span>
<span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">DiabetesRiskModelInput</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Input must be of type 'DiabetesRiskModelInput'"</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span>
<span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">body_mass_index</span><span class="p">,</span>
<span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">GeneralHealth</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">general_health</span><span class="p">),</span>
<span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">Age</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">age</span><span class="p">),</span>
<span class="kc">None</span><span class="p">,</span>
<span class="n">Income</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">income</span><span class="p">),</span>
<span class="p">]],</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">'HighBloodPressure'</span><span class="p">,</span> <span class="s1">'HighCholesterol'</span><span class="p">,</span> <span class="s1">'CholesterolChecked'</span><span class="p">,</span> <span class="s1">'BMI'</span><span class="p">,</span> <span class="s1">'Smoker'</span><span class="p">,</span> <span class="s1">'Stroke'</span><span class="p">,</span>
<span class="s1">'HeartDiseaseOrHeartAttack'</span><span class="p">,</span> <span class="s1">'PhysicalActivity'</span><span class="p">,</span> <span class="s1">'Fruits'</span><span class="p">,</span> <span class="s1">'Veggies'</span><span class="p">,</span>
<span class="s1">'HeavyAlchoholConsumption'</span><span class="p">,</span> <span class="s1">'AnyHealthcare'</span><span class="p">,</span> <span class="s1">'NoDoctorsVisitBecauseOfCost'</span><span class="p">,</span>
<span class="s1">'GeneralHealth'</span><span class="p">,</span> <span class="s1">'MentalHealth'</span><span class="p">,</span> <span class="s1">'PhysicalHealth'</span><span class="p">,</span> <span class="s1">'DifficultyWalking'</span><span class="p">,</span> <span class="s1">'Sex'</span><span class="p">,</span>
<span class="s1">'Age'</span><span class="p">,</span> <span class="s1">'Education'</span><span class="p">,</span> <span class="s1">'Income'</span><span class="p">])</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="k">return</span> <span class="n">DiabetesRiskModelOutput</span><span class="p">(</span><span class="n">diabetes_risk</span><span class="o">=</span><span class="n">DiabetesRisk</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">y_hat</span><span class="p">))</span>
</code></pre></div>
<p>The model class's __init__() method loads the model parameters from the minio service, verifies their signature, and deserializes the pickle into a model object that we can use to make predictions. The predict() method uses that model object to make predictions. Notice that the enumerated input values are mapped into the numbers that the model expects before the prediction is made, and the model's numeric prediction is mapped back into an enumerated value that can be returned to the user.</p>
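<p>The verify-then-deserialize step is the important security detail here: the pickle is only loaded after the signature checks out. As a rough stdlib analogue of what the signing library behind the Signer class does (this HMAC sketch and its key are illustrative only, not the post's actual Signer implementation):</p>

```python
import hashlib
import hmac
import pickle

# illustrative key only; the real service receives its key through configuration
SIGNING_KEY = b"example-signing-key"

def sign(payload: bytes, key: bytes) -> bytes:
    # append an HMAC-SHA256 signature after a separator
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + sig

def unsign(signed: bytes, key: bytes) -> bytes:
    # split off the signature and verify it before trusting the payload
    payload, _, sig = signed.rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Signature check failed, refusing to deserialize.")
    return payload

model_object = {"weights": [0.1, 0.2]}  # stand-in for a trained model
signed_bytes = sign(pickle.dumps(model_object), SIGNING_KEY)
restored = pickle.loads(unsign(signed_bytes, SIGNING_KEY))
```

<p>The key point is that pickle.loads() is never called on bytes that failed verification, since unpickling untrusted data can execute arbitrary code.</p>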
<p>Let's instantiate the model class we defined above:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">DiabetesRiskModel</span><span class="p">(</span>
<span class="n">model_parameters_version</span><span class="o">=</span><span class="s2">"2023_03_17"</span><span class="p">,</span>
<span class="n">model_files_bucket</span><span class="o">=</span><span class="s2">"model-files"</span><span class="p">,</span>
<span class="n">minio_url</span><span class="o">=</span><span class="s2">"127.0.0.1:9000"</span><span class="p">,</span>
<span class="n">minio_access_key</span><span class="o">=</span><span class="s2">"TEST"</span><span class="p">,</span>
<span class="n">minio_secret_key</span><span class="o">=</span><span class="s2">"ASDFGHJKL"</span><span class="p">,</span>
<span class="n">parameters_signing_key</span><span class="o">=</span><span class="s2">"wjtRFppXQpxTChQnNcQJKGlLHKJBmAHMepfFbqvOoUrnuxIsKdiLCrrypYFQsqcw"</span><span class="p">)</span>
</code></pre></div>
<p>When the model object was instantiated, the __init__() method loaded the zip file containing the model pickle from the minio service, verified that the bytes had not been changed using the secret key, and deserialized the model. The model object is now ready to make predictions.</p>
<p>We can make a prediction with the model by first building a DiabetesRiskModelInput object:</p>
<div class="highlight"><pre><span></span><code><span class="n">input_instance</span> <span class="o">=</span> <span class="n">DiabetesRiskModelInput</span><span class="p">(</span>
<span class="n">body_mass_index</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">general_health</span><span class="o">=</span><span class="n">GeneralHealth</span><span class="o">.</span><span class="n">VERY_GOOD</span><span class="p">,</span>
<span class="n">age</span><span class="o">=</span><span class="n">Age</span><span class="o">.</span><span class="n">THIRTY_TO_THIRTY_FOUR</span><span class="p">,</span>
<span class="n">income</span><span class="o">=</span><span class="n">Income</span><span class="o">.</span><span class="n">BETWEEN_20K_AND_25K</span>
<span class="p">)</span>
</code></pre></div>
<p>Then use the input object with the model's predict() method:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">input_instance</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>DiabetesRiskModelOutput(diabetes_risk=<DiabetesRisk.NO_DIABETES: 'NO_DIABETES'>)
</code></pre></div>
<p>The model predicted that the patient does not have diabetes.</p>
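<p>The GeneralHealth.map(), Age.map(), and Income.map() calls inside predict() are what translate the user-facing enumerated values into the integer codes the model was trained on. A minimal sketch of that pattern (the enum members and integer codes below are illustrative, not the model's actual coding scheme):</p>

```python
from enum import Enum

class ExampleGeneralHealth(Enum):
    # hypothetical members; the real enums live in the model's input schema module
    EXCELLENT = "EXCELLENT"
    VERY_GOOD = "VERY_GOOD"
    GOOD = "GOOD"
    FAIR = "FAIR"
    POOR = "POOR"

    @classmethod
    def map(cls, value: "ExampleGeneralHealth") -> int:
        # translate the enumerated value into the numeric code the model expects
        codes = {cls.EXCELLENT: 1, cls.VERY_GOOD: 2, cls.GOOD: 3,
                 cls.FAIR: 4, cls.POOR: 5}
        return codes[value]
```

<p>Keeping the mapping next to the enum definition means the API can expose readable string values while the model still receives the numeric encoding it was trained with.</p>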
<h2>Creating a RESTful Service</h2>
<p>Now that we have a working model that loads and verifies parameters from minio, we can deploy the model into a service. To do this, we won't need to write any extra code; we can leverage the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> to provide the RESTful API for the service. You can learn more about the package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all we need to do is add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Diabetes Risk Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">diabetes_risk_model.prediction.model.DiabetesRiskModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">model_parameters_version</span><span class="p">:</span><span class="w"> </span><span class="s">"2023_03_17"</span><span class="w"></span>
<span class="w"> </span><span class="nt">model_files_bucket</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">model-files</span><span class="w"></span>
<span class="w"> </span><span class="nt">minio_url</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">127.0.0.1:9000</span><span class="w"></span>
<span class="w"> </span><span class="nt">minio_access_key</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">TEST</span><span class="w"></span>
<span class="w"> </span><span class="nt">minio_secret_key</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ASDFGHJKL</span><span class="w"></span>
<span class="w"> </span><span class="nt">parameters_signing_key</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">wjtRFppXQpxTChQnNcQJKGlLHKJBmAHMepfFbqvOoUrnuxIsKdiLCrrypYFQsqcw</span><span class="w"></span>
</code></pre></div>
<p>The "service_title" field is the name of the service as it will appear in the documentation. The "models" field is an array that contains the details of the models we would like to deploy in the service. The "class_path" points at the MLModel class that implements the model's prediction logic. </p>
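<p>Under the hood, a "class_path" string like this is typically resolved with a dynamic import. A sketch of how such a loader might work (the rest_model_service package's actual implementation may differ):</p>

```python
import importlib

def load_class(class_path: str) -> type:
    # split "package.module.ClassName" into a module path and a class name
    module_path, _, class_name = class_path.rpartition(".")
    # import the module, then look the class up by name
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# any importable class can be resolved this way, for example:
ordered_dict_class = load_class("collections.OrderedDict")
```

<p>This is why the PYTHONPATH needs to include the project root when the service starts: the module named in "class_path" must be importable from the service process.</p>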
<p>Using the configuration file, we're able to create an OpenAPI specification file for the model service by executing these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
generate_openapi --configuration_file<span class="o">=</span>configuration/rest_config.yaml --output_file<span class="o">=</span>service_contract.yaml
</code></pre></div>
<p>The generate_openapi command generated the service_contract.yaml file, which contains the <a href="https://en.wikipedia.org/wiki/OpenAPI_Specification">OpenAPI specification</a> for the model service. The "/api/models/diabetes_risk_model/prediction" endpoint is the one we'll call to make predictions with the model. The model's input and output schemas were automatically extracted and added to the specification. The service_contract.yaml file is available in the root of the GitHub repository.</p>
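<p>The generated contract includes an entry for the prediction endpoint along the lines of the excerpt below (illustrative only; the generated service_contract.yaml file is the source of truth):</p>

```yaml
paths:
  /api/models/diabetes_risk_model/prediction:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/DiabetesRiskModelInput'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DiabetesRiskModelOutput'
```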
<p>To run the model service locally, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/rest_config.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service comes up and can be accessed in a web browser at http://127.0.0.1:8000. When you access that URL you will be redirected to the documentation page that is generated by the FastAPI package:</p>
<p><img alt="FastAPI Documentation" src="https://www.tekhnoal.com/fastapi_documentation_sdfmlm.png" width="100%"></p>
<p>The documentation allows you to make requests against the API in order to try it out. Here's a prediction request against the diabetes risk model:</p>
<p><img alt="Prediction Request" src="https://www.tekhnoal.com/prediction_request_sdfmlm.png" width="100%"></p>
<p>And the prediction result:</p>
<p><img alt="Prediction Result" src="https://www.tekhnoal.com/prediction_result_sdfmlm.png" width="100%"></p>
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package, we're able to quickly stand up a service to host the model. We're done with the model service for now, so we'll stop it with CTRL+C.</p>
<h2>Creating a Docker Image</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally using Docker.</p>
<p>Let's create a docker image and run it locally. The docker image is generated using instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="c"># syntax=docker/dockerfile:1</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">ARG</span><span class="w"> </span>BUILD_DATE
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Diabetes Risk Model Service"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Diabetes Risk Model Service."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$BUILD_DATE</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/securing-parameters-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="s2">"0.1.0"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/service</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USERNAME</span><span class="o">=</span>service-user
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_UID</span><span class="o">=</span><span class="m">10000</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_GID</span><span class="o">=</span><span class="m">10000</span>
<span class="c"># install packages</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update -y <span class="o">&&</span> <span class="se">\</span>
apt-get install -y --no-install-recommends sudo <span class="o">&&</span> <span class="se">\</span>
apt-get install -y --no-install-recommends libgomp1 <span class="o">&&</span> <span class="se">\</span>
apt-get clean <span class="o">&&</span> <span class="se">\</span>
rm -rf /var/lib/apt/lists/*
<span class="c"># create a user</span>
<span class="k">RUN</span><span class="w"> </span>groupadd --gid <span class="nv">$USER_GID</span> <span class="nv">$USERNAME</span> <span class="o">&&</span> <span class="se">\</span>
useradd --uid <span class="nv">$USER_UID</span> --gid <span class="nv">$USER_GID</span> -m <span class="nv">$USERNAME</span> <span class="o">&&</span> <span class="se">\</span>
<span class="nb">echo</span> <span class="nv">$USERNAME</span> <span class="nv">ALL</span><span class="o">=</span><span class="se">\(</span>root<span class="se">\)</span> NOPASSWD:ALL > /etc/sudoers.d/<span class="nv">$USERNAME</span> <span class="o">&&</span> <span class="se">\</span>
chmod <span class="m">0440</span> /etc/sudoers.d/<span class="nv">$USERNAME</span>
<span class="c"># installing dependencies</span>
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install --no-cache -r service_requirements.txt
<span class="c"># copying model code and license</span>
<span class="k">COPY</span><span class="w"> </span>./diabetes_risk_model ./diabetes_risk_model
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">USER</span><span class="w"> </span><span class="s">$USERNAME</span>
<span class="k">RUN</span><span class="w"> </span>sudo chown <span class="nv">$USERNAME</span>:<span class="nv">$USERNAME</span> -R /service <span class="o">&&</span> <span class="se">\</span>
sudo chmod -R +rw /service <span class="o">&&</span> <span class="se">\</span>
sudo mkdir -p /var/folders/vb <span class="o">&&</span> <span class="se">\</span>
sudo chown <span class="nv">$USERNAME</span>:<span class="nv">$USERNAME</span> -R /var/folders/vb <span class="o">&&</span> <span class="se">\</span>
sudo chmod -R +rw /var/folders/vb
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
</code></pre></div>
<p>This Dockerfile is used by this docker command to create a docker image:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">diabetes_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="o">../</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">diabetes_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>diabetes_risk_model_service 0.1.0 92d771f815ee 48 seconds ago 1.2GB
</code></pre></div>
<p>The diabetes_risk_model_service image is listed. To test the model service docker image with the minio docker container that is already running, we'll need to create a network for them first.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">create</span> <span class="n">local</span><span class="o">-</span><span class="n">test</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">7</span><span class="n">e66d4b4dd92e454d4a662c51678a3e05d61ca1389b566ec07afef7630cb1b93</span><span class="w"></span>
</code></pre></div>
<p>Next, we'll connect the running minio container to the network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">connect</span> <span class="n">local</span><span class="o">-</span><span class="n">test</span><span class="o">-</span><span class="n">network</span> <span class="n">minio</span>
</code></pre></div>
<p>Now we can start the model service docker image connected to the same network as the minio container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">diabetes_risk_model_service</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">test</span><span class="o">-</span><span class="n">network</span> \
<span class="o">-</span><span class="n">v</span> <span class="err">$</span><span class="p">(</span><span class="n">pwd</span><span class="p">)</span><span class="o">/../</span><span class="n">configuration</span><span class="p">:</span><span class="o">/</span><span class="n">service</span><span class="o">/</span><span class="n">configuration</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">docker_rest_config</span><span class="o">.</span><span class="n">yaml</span> \
<span class="n">diabetes_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>a9b9f3b22af0c2b2e74f1c01e062c56c921b9f689c0284b308a3e93ed6990eba
</code></pre></div>
<p>Notice that we provided the configuration YAML file to the service running in the docker image by mounting the local configuration folder.</p>
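<p>The docker_rest_config.yaml file cannot point at 127.0.0.1 for minio, because inside the container that address refers to the container itself. It presumably references the minio container by name, which resolves on the shared docker network; a sketch of what that file might contain (an assumption, since the file isn't shown here):</p>

```yaml
service_title: Diabetes Risk Model Service
models:
  - class_path: diabetes_risk_model.prediction.model.DiabetesRiskModel
    create_endpoint: true
    configuration:
      model_parameters_version: "2023_03_17"
      model_files_bucket: model-files
      minio_url: minio:9000  # the container name resolves on the shared network
      minio_access_key: TEST
      minio_secret_key: ASDFGHJKL
      parameters_signing_key: wjtRFppXQpxTChQnNcQJKGlLHKJBmAHMepfFbqvOoUrnuxIsKdiLCrrypYFQsqcw
```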
<p>To make sure the server process started up correctly, we'll look at the logs:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">logs</span> <span class="n">diabetes_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Started</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">process</span><span class="w"> </span><span class="o">[</span><span class="mi">1</span><span class="o">]</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">application</span><span class="w"> </span><span class="n">startup</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">startup</span><span class="w"> </span><span class="n">complete</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Uvicorn</span><span class="w"> </span><span class="n">running</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">http</span><span class="o">://</span><span class="mf">0.0</span><span class="o">.</span><span class="mf">0.0</span><span class="o">:</span><span class="mi">8000</span><span class="w"> </span><span class="o">(</span><span class="n">Press</span><span class="w"> </span><span class="n">CTRL</span><span class="o">+</span><span class="n">C</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">quit</span><span class="o">)</span><span class="w"></span>
</code></pre></div>
<p>The logs don't show any errors; it looks like the model parameters were loaded and verified correctly from the minio service when the service started up.</p>
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction using the curl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://0.0.0.0:8000/api/models/diabetes_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{ </span><span class="se">\</span>
<span class="s1"> "body_mass_index": 20, </span><span class="se">\</span>
<span class="s1"> "general_health": "EXCELLENT", </span><span class="se">\</span>
<span class="s1"> "age": "EIGHTEEN_TO_TWENTY_FOUR", </span><span class="se">\</span>
<span class="s1"> "income": "LESS_THAN_10K" </span><span class="se">\</span>
<span class="s1">}'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"diabetes_risk":"NO_DIABETES"}
</code></pre></div>
<p>The model predicted that the patient does not have diabetes.</p>
<p>We're done with the docker containers, so we'll shut them down along with the docker network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">diabetes_risk_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">diabetes_risk_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">minio</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">minio</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">rm</span> <span class="n">local</span><span class="o">-</span><span class="n">test</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>diabetes_risk_model_service
diabetes_risk_model_service
minio
minio
local-test-network
</code></pre></div>
<h2>Creating a Kubernetes Cluster</h2>
<p>To show the system in action, we’ll deploy the model service and the minio service to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">😄</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="n">v1</span><span class="o">.</span><span class="mf">28.0</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">Darwin</span><span class="w"> </span><span class="mf">13.2</span><span class="o">.</span><span class="mi">1</span><span class="w"></span>
<span class="err">🎉</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="mf">1.29</span><span class="o">.</span><span class="mi">0</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">available</span><span class="o">!</span><span class="w"> </span><span class="n">Download</span><span class="w"> </span><span class="n">it</span><span class="p">:</span><span class="w"> </span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">minikube</span><span class="o">/</span><span class="n">releases</span><span class="o">/</span><span class="n">tag</span><span class="o">/</span><span class="n">v1</span><span class="o">.</span><span class="mf">29.0</span><span class="w"></span>
<span class="err">💡</span><span class="w"> </span><span class="n">To</span><span class="w"> </span><span class="n">disable</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">notice</span><span class="p">,</span><span class="w"> </span><span class="n">run</span><span class="p">:</span><span class="w"> </span><span class="s1">'minikube config set WantUpdateNotification false'</span><span class="w"></span>
<span class="err">✨</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">docker</span><span class="w"> </span><span class="n">driver</span><span class="w"> </span><span class="n">based</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">existing</span><span class="w"> </span><span class="n">profile</span><span class="w"></span>
<span class="err">👍</span><span class="w"> </span><span class="n">Starting</span><span class="w"> </span><span class="n">control</span><span class="w"> </span><span class="n">plane</span><span class="w"> </span><span class="n">node</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="n">minikube</span><span class="w"></span>
<span class="err">🚜</span><span class="w"> </span><span class="n">Pulling</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="o">...</span><span class="w"></span>
<span class="err">🔄</span><span class="w"> </span><span class="n">Restarting</span><span class="w"> </span><span class="n">existing</span><span class="w"> </span><span class="n">docker</span><span class="w"> </span><span class="n">container</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="s2">"minikube"</span><span class="w"> </span><span class="o">...</span><span class="w"></span>
<span class="err">🐳</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span><span class="n">Kubernetes</span><span class="w"> </span><span class="n">v1</span><span class="o">.</span><span class="mf">25.3</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">Docker</span><span class="w"> </span><span class="mf">20.10</span><span class="o">.</span><span class="mi">20</span><span class="w"> </span><span class="o">...</span><span class="w"></span>
<span class="err">🔎</span><span class="w"> </span><span class="n">Verifying</span><span class="w"> </span><span class="n">Kubernetes</span><span class="w"> </span><span class="n">components</span><span class="o">...</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">gcr</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">k8s</span><span class="o">-</span><span class="n">minikube</span><span class="o">/</span><span class="n">storage</span><span class="o">-</span><span class="n">provisioner</span><span class="p">:</span><span class="n">v5</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">docker</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">kubernetesui</span><span class="o">/</span><span class="n">metrics</span><span class="o">-</span><span class="n">scraper</span><span class="p">:</span><span class="n">v1</span><span class="o">.</span><span class="mf">0.8</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">docker</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">kubernetesui</span><span class="o">/</span><span class="n">dashboard</span><span class="p">:</span><span class="n">v2</span><span class="o">.</span><span class="mf">7.0</span><span class="w"></span>
<span class="err">💡</span><span class="w"> </span><span class="n">Some</span><span class="w"> </span><span class="n">dashboard</span><span class="w"> </span><span class="n">features</span><span class="w"> </span><span class="n">require</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">metrics</span><span class="o">-</span><span class="n">server</span><span class="w"> </span><span class="n">addon</span><span class="o">.</span><span class="w"> </span><span class="n">To</span><span class="w"> </span><span class="n">enable</span><span class="w"> </span><span class="n">all</span><span class="w"> </span><span class="n">features</span><span class="w"> </span><span class="n">please</span><span class="w"> </span><span class="n">run</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="n">addons</span><span class="w"> </span><span class="n">enable</span><span class="w"> </span><span class="n">metrics</span><span class="o">-</span><span class="n">server</span><span class="w"></span>
<span class="err">🌟</span><span class="w"> </span><span class="n">Enabled</span><span class="w"> </span><span class="n">addons</span><span class="p">:</span><span class="w"> </span><span class="n">storage</span><span class="o">-</span><span class="n">provisioner</span><span class="p">,</span><span class="w"> </span><span class="n">default</span><span class="o">-</span><span class="n">storageclass</span><span class="p">,</span><span class="w"> </span><span class="n">dashboard</span><span class="w"></span>
<span class="err">🏄</span><span class="w"> </span><span class="n">Done</span><span class="o">!</span><span class="w"> </span><span class="n">kubectl</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">now</span><span class="w"> </span><span class="n">configured</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">use</span><span class="w"> </span><span class="s2">"minikube"</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="ow">and</span><span class="w"> </span><span class="s2">"default"</span><span class="w"> </span><span class="n">namespace</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="n">default</span><span class="w"></span>
</code></pre></div>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect to it using the kubectl command.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE              NAME                                        READY   STATUS    RESTARTS       AGE
kube-system            coredns-565d847f94-2v6l9                    1/1     Running   15 (82s ago)   72d
kube-system            etcd-minikube                               1/1     Running   15 (2d ago)    72d
kube-system            kube-apiserver-minikube                     1/1     Running   14 (2d ago)    72d
kube-system            kube-controller-manager-minikube            1/1     Running   15 (82s ago)   72d
kube-system            kube-proxy-ztbgd                            1/1     Running   14 (2d ago)    72d
kube-system            kube-scheduler-minikube                     1/1     Running   14 (2d ago)    72d
kube-system            storage-provisioner                         1/1     Running   26 (2d ago)    72d
kubernetes-dashboard   dashboard-metrics-scraper-b74747df5-x559p   1/1     Running   14 (2d ago)    72d
kubernetes-dashboard   kubernetes-dashboard-57bbdc5f89-9jvln       1/1     Running   18 (82s ago)   72d
</code></pre></div>
<p>It looks like we can connect, so we're ready to start deploying the model service to the cluster.</p>
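<p>If the pod listing ever fails or hangs, a quick sanity check (an optional aside, not part of the original walkthrough) is to ask kubectl about the cluster and context it is currently pointed at:</p>

```shell
# Print the control-plane endpoint of the current cluster;
# this fails quickly if the cluster is unreachable.
kubectl cluster-info

# Show which context (cluster + user + namespace) is active,
# which should be "minikube" in this setup.
kubectl config current-context
```

<p>Both commands are read-only, so they are safe to run at any time.</p>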
<h2>Creating a Namespace</h2>
<p>We'll first create a namespace to hold the resources for our model service. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
</code></pre></div>
<p>The namespace was created. To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                   STATUS   AGE
default                Active   72d
kube-node-lease        Active   72d
kube-public            Active   72d
kube-system            Active   72d
kubernetes-dashboard   Active   72d
model-services         Active   3s
</code></pre></div>
<p>The new namespace appears in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Context "minikube" modified.
</code></pre></div>
<p>Now the rest of the kubectl commands that we execute will automatically be applied in the "model-services" namespace.</p>
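<p>To double-check that the context change took effect, you can print the namespace recorded in the active context (a hedged extra step; the jsonpath expression below assumes a standard kubeconfig layout):</p>

```shell
# Show only the active context and extract its configured namespace;
# this should print "model-services" after the set-context command above.
kubectl config view --minify --output 'jsonpath={..namespace}'
```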
<h2>Creating the Storage Service</h2>
<p>To store the model parameters, we'll need to deploy minio to the cluster as a service. We can do this with the helm tool and the helm chart provided by minio.</p>
<p>First let's add the minio helm repository:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">repo</span> <span class="n">add</span> <span class="n">minio</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">charts</span><span class="o">.</span><span class="n">min</span><span class="o">.</span><span class="n">io</span><span class="o">/</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>"minio" has been added to your repositories
</code></pre></div>
<p>The minio helm repository is now available for use.</p>
<p>Let's apply the minio helm chart:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">install</span> <span class="n">minio</span> <span class="o">--</span><span class="nb">set</span> <span class="n">rootUser</span><span class="o">=</span><span class="n">TEST</span><span class="p">,</span><span class="n">rootPassword</span><span class="o">=</span><span class="n">ASDFGHJKL</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">persistence</span><span class="o">.</span><span class="n">enabled</span><span class="o">=</span><span class="n">true</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">persistence</span><span class="o">.</span><span class="n">size</span><span class="o">=</span><span class="mi">2</span><span class="n">Gi</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">resources</span><span class="o">.</span><span class="n">requests</span><span class="o">.</span><span class="n">cpu</span><span class="o">=</span><span class="mi">1</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">resources</span><span class="o">.</span><span class="n">limits</span><span class="o">.</span><span class="n">cpu</span><span class="o">=</span><span class="mi">2</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">resources</span><span class="o">.</span><span class="n">requests</span><span class="o">.</span><span class="n">memory</span><span class="o">=</span><span class="mi">250</span><span class="n">Mi</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">resources</span><span class="o">.</span><span class="n">limits</span><span class="o">.</span><span class="n">memory</span><span class="o">=</span><span class="mi">500</span><span class="n">Mi</span> \
<span class="o">--</span><span class="nb">set</span> <span class="n">mode</span><span class="o">=</span><span class="n">distributed</span><span class="p">,</span><span class="n">replicas</span><span class="o">=</span><span class="mi">2</span> \
<span class="n">minio</span><span class="o">/</span><span class="n">minio</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">NAME</span><span class="p">:</span><span class="w"> </span><span class="n">minio</span><span class="w"></span>
<span class="n">LAST</span><span class="w"> </span><span class="n">DEPLOYED</span><span class="p">:</span><span class="w"> </span><span class="n">Sat</span><span class="w"> </span><span class="n">Mar</span><span class="w"> </span><span class="mi">18</span><span class="w"> </span><span class="mi">00</span><span class="p">:</span><span class="mi">15</span><span class="p">:</span><span class="mi">07</span><span class="w"> </span><span class="mi">2023</span><span class="w"></span>
<span class="n">NAMESPACE</span><span class="p">:</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"></span>
<span class="n">STATUS</span><span class="p">:</span><span class="w"> </span><span class="n">deployed</span><span class="w"></span>
<span class="n">REVISION</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w"></span>
<span class="n">TEST</span><span class="w"> </span><span class="n">SUITE</span><span class="p">:</span><span class="w"> </span><span class="n">None</span><span class="w"></span>
<span class="n">NOTES</span><span class="p">:</span><span class="w"></span>
<span class="n">MinIO</span><span class="w"> </span><span class="n">can</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">accessed</span><span class="w"> </span><span class="n">via</span><span class="w"> </span><span class="n">port</span><span class="w"> </span><span class="mi">9000</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">DNS</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">within</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">cluster</span><span class="p">:</span><span class="w"></span>
<span class="n">minio</span><span class="o">.</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">.</span><span class="n">svc</span><span class="o">.</span><span class="n">cluster</span><span class="o">.</span><span class="n">local</span><span class="w"></span>
<span class="n">To</span><span class="w"> </span><span class="n">access</span><span class="w"> </span><span class="n">MinIO</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">localhost</span><span class="p">,</span><span class="w"> </span><span class="n">run</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">below</span><span class="w"> </span><span class="n">commands</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="mf">1.</span><span class="w"> </span><span class="k">export</span><span class="w"> </span><span class="n">POD_NAME</span><span class="o">=$</span><span class="p">(</span><span class="n">kubectl</span><span class="w"> </span><span class="n">get</span><span class="w"> </span><span class="n">pods</span><span class="w"> </span><span class="o">--</span><span class="n">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="o">-</span><span class="n">l</span><span class="w"> </span><span class="s2">"release=minio"</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">jsonpath</span><span class="o">=</span><span class="s2">"{.items[0].metadata.name}"</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="mf">2.</span><span class="w"> </span><span class="n">kubectl</span><span class="w"> </span><span class="n">port</span><span class="o">-</span><span class="n">forward</span><span class="w"> </span><span class="o">$</span><span class="n">POD_NAME</span><span class="w"> </span><span class="mi">9000</span><span class="w"> </span><span class="o">--</span><span class="n">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"></span>
<span class="n">Read</span><span class="w"> </span><span class="n">more</span><span class="w"> </span><span class="n">about</span><span class="w"> </span><span class="n">port</span><span class="w"> </span><span class="n">forwarding</span><span class="w"> </span><span class="n">here</span><span class="p">:</span><span class="w"> </span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">kubernetes</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">docs</span><span class="o">/</span><span class="n">user</span><span class="o">-</span><span class="n">guide</span><span class="o">/</span><span class="n">kubectl</span><span class="o">/</span><span class="n">kubectl_port</span><span class="o">-</span><span class="n">forward</span><span class="o">/</span><span class="w"></span>
<span class="n">You</span><span class="w"> </span><span class="n">can</span><span class="w"> </span><span class="n">now</span><span class="w"> </span><span class="n">access</span><span class="w"> </span><span class="n">MinIO</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">localhost</span><span class="p">:</span><span class="mf">9000.</span><span class="w"> </span><span class="n">Follow</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">below</span><span class="w"> </span><span class="n">steps</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">connect</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">MinIO</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">mc</span><span class="w"> </span><span class="n">client</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="mf">1.</span><span class="w"> </span><span class="n">Download</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">MinIO</span><span class="w"> </span><span class="n">mc</span><span class="w"> </span><span class="n">client</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="nb">min</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">docs</span><span class="o">/</span><span class="n">minio</span><span class="o">/</span><span class="n">linux</span><span class="o">/</span><span class="n">reference</span><span class="o">/</span><span class="n">minio</span><span class="o">-</span><span class="n">mc</span><span class="o">.</span><span class="n">html</span><span class="c1">#quickstart</span><span class="w"></span>
<span class="w"> </span><span class="mf">2.</span><span class="w"> </span><span class="k">export</span><span class="w"> </span><span class="n">MC_HOST_minio</span><span class="o">-</span><span class="n">local</span><span class="o">=</span><span class="n">http</span><span class="p">:</span><span class="o">//$</span><span class="p">(</span><span class="n">kubectl</span><span class="w"> </span><span class="n">get</span><span class="w"> </span><span class="n">secret</span><span class="w"> </span><span class="o">--</span><span class="n">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="n">minio</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">jsonpath</span><span class="o">=</span><span class="s2">"{.data.rootUser}"</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">base64</span><span class="w"> </span><span class="o">--</span><span class="n">decode</span><span class="p">):</span><span class="o">$</span><span class="p">(</span><span class="n">kubectl</span><span class="w"> </span><span class="n">get</span><span class="w"> </span><span class="n">secret</span><span class="w"> </span><span class="o">--</span><span class="n">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="n">minio</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">jsonpath</span><span class="o">=</span><span class="s2">"{.data.rootPassword}"</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">base64</span><span class="w"> </span><span class="o">--</span><span class="n">decode</span><span class="p">)</span><span class="err">@</span><span class="n">localhost</span><span 
class="p">:</span><span class="mi">9000</span><span class="w"></span>
<span class="w"> </span><span class="mf">3.</span><span class="w"> </span><span class="n">mc</span><span class="w"> </span><span class="n">ls</span><span class="w"> </span><span class="n">minio</span><span class="o">-</span><span class="n">local</span><span class="w"></span>
</code></pre></div>
<p>The MinIO service is now installed. We can view the pods to check that it's running correctly:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
minio-0 1/1 Running 0 82s
minio-1 1/1 Running 0 82s
</code></pre></div>
<p>The MinIO service is running in two pods, and it is accessible through a set of Kubernetes Services:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
minio ClusterIP 10.108.159.154 <none> 9000/TCP 2m4s
minio-console ClusterIP 10.110.151.171 <none> 9001/TCP 2m4s
minio-svc ClusterIP None <none> 9000/TCP 2m4s
</code></pre></div>
<p>We'll upload the model parameters by accessing the minio-console service. To do that, we'll need to connect to the MinIO instance using port forwarding. Port forwarding is a simple way to reach a service running in the cluster from the local environment: it forwards all traffic from a local port to a remote port that is hosting the service.</p>
<p>To start port forwarding the minio-console service, execute this command:</p>
<div class="highlight"><pre><span></span><code>minikube service minio-console --url -n model-services
</code></pre></div>
<p>This command has to run continuously for the port forwarding to work. The UI of the minio instance that is running in the cluster is now available locally:</p>
<p><img alt="Minio UI" src="https://www.tekhnoal.com/minio_kubernetes_ui_sdfmlm.png" width="100%"></p>
<p>To keep things short, I created the "model-files" bucket and uploaded the model .zip file that we were working with above.</p>
<p>We now have model parameters for the model service to access, and we're ready to deploy the model service to the cluster.</p>
<h2>Creating a Deployment and Service</h2>
<p>The model service is deployed by using Kubernetes resources. These are:</p>
<ul>
<li>Secret: a set of configuration strings that are stored by Kubernetes and can be provided to Pods running within the cluster. The Secrets will hold the minio login details and the secret key used to verify the model parameters.</li>
<li>ConfigMap: a set of configuration options, in this case it is a simple YAML file that will be loaded into the running container as a volume mount. This resource allows us to change the configuration of the model service without having to modify the Docker image. </li>
<li>Deployment: a declarative way to manage a set of Pods; the model service Pods are managed through the Deployment.</li>
<li>Service: a way to expose the set of Pods in a Deployment; the model service is made available to the outside world through the Service.</li>
</ul>
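<p>For illustration, here is a minimal sketch of what the Deployment and Service in the model_service.yaml file might look like. The resource names match the ones used in this post, but the labels, mount path, and some port details shown here are assumptions; the actual manifest in the repository is the source of truth.</p>

```yaml
# Hypothetical sketch of ../kubernetes/model_service.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: diabetes-risk-model-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: diabetes-risk-model
  template:
    metadata:
      labels:
        app: diabetes-risk-model
    spec:
      containers:
        - name: diabetes-risk-model
          image: diabetes_risk_model_service:0.1.0
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: diabetes-risk-model-service-secrets
          volumeMounts:
            - name: configuration
              mountPath: /service/configuration  # assumed mount path
      volumes:
        - name: configuration
          configMap:
            name: model-service-configuration
---
apiVersion: v1
kind: Service
metadata:
  name: diabetes-risk-model-service
spec:
  type: NodePort
  selector:
    app: diabetes-risk-model
  ports:
    - port: 80
      targetPort: 8000
```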
<p>We're almost ready to deploy the model service, but before starting it we'll need to send the docker image from the local docker daemon to the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">diabetes_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<p>We can view the images in the minikube cache with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">diabetes_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>docker.io/library/diabetes_risk_model_service:0.1.0
</code></pre></div>
<p>The model service will need to access the YAML configuration file that we used for the local service above. This file is in the /configuration folder and is called "kubernetes_rest_config.yaml"; it's customized for the Kubernetes environment we're building.</p>
<p>To create a <a href="https://kubernetes.io/docs/concepts/configuration/configmap/">ConfigMap</a> for the service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="n">configmap</span> <span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">configuration</span> \
<span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">file</span><span class="o">=../</span><span class="n">configuration</span><span class="o">/</span><span class="n">kubernetes_rest_config</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/model-service-configuration created
</code></pre></div>
<p>The model service also needs to access three secrets:</p>
<ul>
<li>minio access key, used for accessing the minio service</li>
<li>minio secret key, used for accessing the minio service</li>
<li>parameters signing key, used for verifying the model parameters</li>
</ul>
<p>These values can't safely be added to the ConfigMap because they are sensitive. Kubernetes stores sensitive values separately as <a href="https://kubernetes.io/docs/concepts/configuration/secret/">Secrets</a>, which can be encrypted at rest and given tighter access controls. We'll create the Secrets with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="n">secret</span> <span class="n">generic</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">secrets</span> \
<span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">literal</span><span class="o">=</span><span class="n">minio</span><span class="o">-</span><span class="n">access</span><span class="o">-</span><span class="n">key</span><span class="o">=</span><span class="n">TEST</span> \
<span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">literal</span><span class="o">=</span><span class="n">minio</span><span class="o">-</span><span class="n">secret</span><span class="o">-</span><span class="n">key</span><span class="o">=</span><span class="n">ASDFGHJKL</span> \
<span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">literal</span><span class="o">=</span><span class="n">parameters</span><span class="o">-</span><span class="n">signing</span><span class="o">-</span><span class="n">key</span><span class="o">=</span><span class="n">wjtRFppXQpxTChQnNcQJKGlLHKJBmAHMepfFbqvOoUrnuxIsKdiLCrrypYFQsqcw</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>secret/diabetes-risk-model-service-secrets created
</code></pre></div>
<p>The model service Deployment and Service are created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/diabetes-risk-model-deployment created
service/diabetes-risk-model-service created
</code></pre></div>
<p>Let's view the Deployment to see if it is available yet:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">deployments</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY UP-TO-DATE AVAILABLE AGE
diabetes-risk-model-deployment 1/1 1 1 65s
</code></pre></div>
<p>To get an idea of how the service went through the startup process, let's look at the service logs. First, we'll get the names of the pods that are running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>diabetes-risk-model-deployment-ff7887475-5q2j5 1/1 Running 0 68s
</code></pre></div>
<p>Using the pod name, we can get the logs from Kubernetes:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="n">ff7887475</span><span class="o">-</span><span class="mi">5</span><span class="n">q2j5</span> <span class="o">-</span><span class="n">c</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Started</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">process</span><span class="w"> </span><span class="o">[</span><span class="mi">1</span><span class="o">]</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">application</span><span class="w"> </span><span class="n">startup</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">startup</span><span class="w"> </span><span class="n">complete</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Uvicorn</span><span class="w"> </span><span class="n">running</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">http</span><span class="o">://</span><span class="mf">0.0</span><span class="o">.</span><span class="mf">0.0</span><span class="o">:</span><span class="mi">8000</span><span class="w"> </span><span class="o">(</span><span class="n">Press</span><span class="w"> </span><span class="n">CTRL</span><span class="o">+</span><span class="n">C</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">quit</span><span class="o">)</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">35258</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">503</span><span class="w"> </span><span class="n">Service</span><span class="w"> </span><span class="n">Unavailable</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">35272</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">503</span><span class="w"> </span><span class="n">Service</span><span class="w"> </span><span class="n">Unavailable</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">55252</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">503</span><span class="w"> </span><span class="n">Service</span><span class="w"> </span><span class="n">Unavailable</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">55264</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">55270</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">49028</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
</code></pre></div>
<p>The startup health check returned 503 a few times while the service was starting up, and then the health endpoints began returning 200: the process started up correctly.</p>
<p>The Kubernetes Service details look like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>diabetes-risk-model-service NodePort 10.99.180.41 <none> 80:31452/TCP 2m29s
</code></pre></div>
<p>We'll run another proxy process locally to be able to access the model service endpoint:</p>
<div class="highlight"><pre><span></span><code>minikube service diabetes-risk-model-service --url -n model-services
</code></pre></div>
<p>The command outputs this URL:</p>
<p>http://127.0.0.1:55659</p>
<p>We can send a request to the model service through the local endpoint like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:55659/api/models/diabetes_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{ </span><span class="se">\</span>
<span class="s1"> "body_mass_index": 60, </span><span class="se">\</span>
<span class="s1"> "general_health": "EXCELLENT", </span><span class="se">\</span>
<span class="s1"> "age": "EIGHTEEN_TO_TWENTY_FOUR", </span><span class="se">\</span>
<span class="s1"> "income": "LESS_THAN_10K" </span><span class="se">\</span>
<span class="s1">}'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"diabetes_risk":"NO_DIABETES"}
</code></pre></div>
<p>The model is deployed within Kubernetes!</p>
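<p>The same prediction request can also be made from Python. This sketch mirrors the curl command above; only the request construction runs as written, with the actual network call left commented out since it requires the port-forwarded URL to still be active:</p>

```python
import json
import urllib.request

# Payload mirroring the curl request above
payload = {
    "body_mass_index": 60,
    "general_health": "EXCELLENT",
    "age": "EIGHTEEN_TO_TWENTY_FOUR",
    "income": "LESS_THAN_10K",
}

def build_request(base_url: str) -> urllib.request.Request:
    """Build a POST request for the model's prediction endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/models/diabetes_risk_model/prediction",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )

request = build_request("http://127.0.0.1:55659")

# To actually send the request (with the minikube service tunnel running):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```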
<h2>Deleting the Resources</h2>
<p>We're done working with the Kubernetes resources, so we will delete them and shut down the cluster.</p>
<p>To delete the model service Deployment and Service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps "diabetes-risk-model-deployment" deleted
service "diabetes-risk-model-service" deleted
</code></pre></div>
<p>We'll also delete the ConfigMap:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="n">configmap</span> <span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">configuration</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "model-service-configuration" deleted
</code></pre></div>
<p>Next, we'll delete the secrets:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="n">secret</span> <span class="n">diabetes</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="o">-</span><span class="n">secrets</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>secret "diabetes-risk-model-service-secrets" deleted
</code></pre></div>
<p>To delete the minio deployment execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">delete</span> <span class="n">minio</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>release "minio" uninstalled
</code></pre></div>
<p>The minio service used <a href="https://kubernetes.io/docs/concepts/storage/persistent-volumes/">Persistent Volume Claims</a> to store data. Since these are not deleted along with the minio Helm chart, we'll delete them with a kubectl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="n">pvc</span> <span class="o">-</span><span class="n">l</span> <span class="n">app</span><span class="o">=</span><span class="n">minio</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">persistentvolumeclaim</span><span class="w"> </span><span class="s2">"export-minio-0"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
<span class="n">persistentvolumeclaim</span><span class="w"> </span><span class="s2">"export-minio-1"</span><span class="w"> </span><span class="n">deleted</span><span class="w"></span>
</code></pre></div>
<p>To delete the model-services namespace, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
</code></pre></div>
<p>To shut down the minikube cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 Powering off "minikube" via SSH ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post we trained, validated, signed, and verified a set of model parameters to ensure that they remain secure. This process is needed because of the inherent security problems that Python pickles bring with them. The signing and verification process added a little bit of complexity, but it's worth it to ensure the security of the model deployment.</p>
<p>We also showed how to deploy the serialized model parameters to a storage service, and how to access them from the deployed model. We did this to highlight a common vulnerability of machine learning model deployments. Model parameters are often not deployed alongside the prediction code; instead, they are kept in a separate storage service from which they are loaded at runtime. This practice makes the deployment of model parameters faster, but it adds another attack vector that needs to be secured: an attacker who gains access to the storage service can modify the model parameters in order to achieve arbitrary code execution on the server where the model is deployed. By adding a signature verification step before the model parameters are deserialized, we made the deployment of model parameters more secure.</p>
<p>One way to improve this process is to make it into a plug-in that can be easily added to model training and prediction code, making it simpler to add to a training pipeline and model deployment. Another way to improve it is by adding a key rotation mechanism to ensure that secret keys do not remain in production for a long time.</p>
<p>Health Checks for ML Model Deployments, published 2023-01-15 by Brian Schmidt.</p>
<p>Deploying machine learning models in RESTful services is a common way to make the model available for use within a software system. RESTful services are the most common type of service deployed, since they are very simple to build, have wide compatibility, and have lots of tooling available for them. In order to monitor the availability of the service, RESTful APIs often provide health check endpoints which make it easy for an outside system to verify that the service is up and running. A health check endpoint is a simple endpoint that can be called by a process manager to ascertain whether the application is running correctly. In this blog post we'll be working with Kubernetes, so we'll focus on the health checks supported by Kubernetes.</p>
<h1>Health Checks for ML Model Deployments</h1>
<p>In a <a href="https://www.tekhnoal.com/rest-model-service.html">previous blog post</a> we showed how to create a RESTful model service for a machine learning model. In this blog post, we'll extend the model service API by adding health checks to it.</p>
<p>This blog post was written in a Jupyter notebook; some of the code and commands found in it reflect this.</p>
<p>All of the code for this blog post is in <a href="https://github.com/schmidtbri/health-checks-for-ml-model-deployments">this github repository</a>.</p>
<h2>Introduction</h2>
<p>Deploying machine learning models in RESTful services is a common way to make the model available for use within a software system. In general, RESTful services are the most common type of service deployed, since they are simple to build, have wide compatibility, and have lots of tooling available for them. In order to monitor the availability of the service, RESTful APIs often provide health check endpoints which make it easy for an outside system to verify that the service is up and running. A health check endpoint is a simple endpoint that can be called by a process manager to ascertain whether the application is running correctly. In this blog post we'll be working with Kubernetes so we'll focus on the health checks supported by Kubernetes. </p>
<p>There are several types of health check endpoints supported by Kubernetes: startup, readiness, and liveness checks. Startup checks are used to check whether an application has started. If the container has a startup check configured on it, Kubernetes will wait until the application has finished starting up before moving on with the process of making the application available to clients. Startup checks are useful for applications that take a while to start up. They are only called during application startup; once an application has finished starting up, the startup check is not called again.</p>
<p>Readiness checks are used to check if a container is ready to start accepting requests. Once the application has finished starting up, Kubernetes uses the readiness check to make sure that the application is able to accept requests. Service readiness can change during the service lifecycle so the check is called continuously until the application is stopped.</p>
<p>Liveness checks are used to restart a pod if the application is not responding. They are the simplest type of check to implement because they should always succeed while the server process is running. Liveness checks are useful for detecting when the application is in an unsafe state; if the liveness check fails, the process manager restarts the application to get it out of that state. Liveness checks are also called continuously while the application is running.</p>
<p>In this blog post, we’ll be adding startup, readiness, and liveness checks to a RESTful model service that is hosting a machine learning model. We'll also build a model that requires health checks in order to be deployed correctly.</p>
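<p>The division of labor between the three checks can be sketched in plain Python, independent of any web framework. The class and method names below are illustrative stand-ins, not the API we'll build later in this post:</p>

```python
from enum import Enum


class HealthStatus(Enum):
    """Statuses that a health check endpoint can report."""
    HEALTHY = 200
    UNHEALTHY = 503


class Service:
    """Toy service holding the state that the three probes inspect."""

    def __init__(self):
        self.started = False       # flipped once startup work completes
        self.model_loaded = False  # readiness may change over the lifecycle

    def startup(self):
        # stands in for slow startup work, e.g. deserializing model parameters
        self.model_loaded = True
        self.started = True

    def startup_check(self) -> HealthStatus:
        # polled only until it succeeds, then never called again
        return HealthStatus.HEALTHY if self.started else HealthStatus.UNHEALTHY

    def readiness_check(self) -> HealthStatus:
        # polled continuously: can we serve predictions right now?
        return HealthStatus.HEALTHY if self.model_loaded else HealthStatus.UNHEALTHY

    def liveness_check(self) -> HealthStatus:
        # should succeed whenever the process is responsive at all
        return HealthStatus.HEALTHY


service = Service()
print(service.startup_check().name)    # UNHEALTHY until startup finishes
service.startup()
print(service.startup_check().name)    # HEALTHY
print(service.readiness_check().name)  # HEALTHY
```

<p>In a real deployment each method would back one HTTP endpoint, and Kubernetes would be configured to poll each endpoint on a schedule.</p>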
<h2>Getting Data</h2>
<p>In order to train a model, we first need to have a dataset. We went into Kaggle and found a dataset that contained loan information. To make it easy to download the data, we installed the <a href="https://pypi.org/project/kaggle/">kaggle python package</a>. Then we executed these commands to download the data and unzip it into the data folder in the project:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">mkdir</span> <span class="o">-</span><span class="n">p</span> <span class="o">../</span><span class="n">data</span>
<span class="err">!</span><span class="n">kaggle</span> <span class="n">datasets</span> <span class="n">download</span> <span class="o">-</span><span class="n">d</span> <span class="n">ranadeep</span><span class="o">/</span><span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">dataset</span> <span class="o">-</span><span class="n">p</span> <span class="o">../</span><span class="n">data</span> <span class="o">--</span><span class="n">unzip</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Let's look at the data files:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">ls</span> <span class="o">-</span><span class="n">la</span> <span class="o">../</span><span class="n">data</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>total 67648
drwxr-xr-x 6 brian staff 192 Nov 21 14:45 [34m.[m[m
drwxr-xr-x 27 brian staff 864 Jan 11 21:23 [34m..[m[m
-rw-r--r--@ 1 brian staff 6148 Oct 31 17:30 .DS_Store
-rw-r--r-- 1 brian staff 20995 Nov 21 13:49 LCDataDictionary.xlsx
-rw-r--r-- 1 brian staff 34603008 Nov 21 14:58 credit-risk-dataset.zip
drwxr-xr-x 3 brian staff 96 Nov 16 09:33 [34mloan[m[m
</code></pre></div>
<p>Looks like the data is in a .csv file in the /loan folder and the data dictionary is in an Excel spreadsheet file.</p>
<p>Let's load the data .csv file into a pandas dataframe.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../data/loan/loan.csv"</span><span class="p">,</span> <span class="n">low_memory</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="o">.</span><span class="n">info</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><class 'pandas.core.frame.DataFrame'>
RangeIndex: 887379 entries, 0 to 887378
Data columns (total 74 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 887379 non-null int64
1 member_id 887379 non-null int64
2 loan_amnt 887379 non-null float64
3 funded_amnt 887379 non-null float64
4 funded_amnt_inv 887379 non-null float64
5 term 887379 non-null object
6 int_rate 887379 non-null float64
7 installment 887379 non-null float64
8 grade 887379 non-null object
9 sub_grade 887379 non-null object
10 emp_title 835917 non-null object
11 emp_length 842554 non-null object
12 home_ownership 887379 non-null object
13 annual_inc 887375 non-null float64
14 verification_status 887379 non-null object
15 issue_d 887379 non-null object
16 loan_status 887379 non-null object
17 pymnt_plan 887379 non-null object
18 url 887379 non-null object
19 desc 126028 non-null object
20 purpose 887379 non-null object
21 title 887227 non-null object
22 zip_code 887379 non-null object
23 addr_state 887379 non-null object
24 dti 887379 non-null float64
25 delinq_2yrs 887350 non-null float64
26 earliest_cr_line 887350 non-null object
27 inq_last_6mths 887350 non-null float64
28 mths_since_last_delinq 433067 non-null float64
29 mths_since_last_record 137053 non-null float64
30 open_acc 887350 non-null float64
31 pub_rec 887350 non-null float64
32 revol_bal 887379 non-null float64
33 revol_util 886877 non-null float64
34 total_acc 887350 non-null float64
35 initial_list_status 887379 non-null object
36 out_prncp 887379 non-null float64
37 out_prncp_inv 887379 non-null float64
38 total_pymnt 887379 non-null float64
39 total_pymnt_inv 887379 non-null float64
40 total_rec_prncp 887379 non-null float64
41 total_rec_int 887379 non-null float64
42 total_rec_late_fee 887379 non-null float64
43 recoveries 887379 non-null float64
44 collection_recovery_fee 887379 non-null float64
45 last_pymnt_d 869720 non-null object
46 last_pymnt_amnt 887379 non-null float64
47 next_pymnt_d 634408 non-null object
48 last_credit_pull_d 887326 non-null object
49 collections_12_mths_ex_med 887234 non-null float64
50 mths_since_last_major_derog 221703 non-null float64
51 policy_code 887379 non-null float64
52 application_type 887379 non-null object
53 annual_inc_joint 511 non-null float64
54 dti_joint 509 non-null float64
55 verification_status_joint 511 non-null object
56 acc_now_delinq 887350 non-null float64
57 tot_coll_amt 817103 non-null float64
58 tot_cur_bal 817103 non-null float64
59 open_acc_6m 21372 non-null float64
60 open_il_6m 21372 non-null float64
61 open_il_12m 21372 non-null float64
62 open_il_24m 21372 non-null float64
63 mths_since_rcnt_il 20810 non-null float64
64 total_bal_il 21372 non-null float64
65 il_util 18617 non-null float64
66 open_rv_12m 21372 non-null float64
67 open_rv_24m 21372 non-null float64
68 max_bal_bc 21372 non-null float64
69 all_util 21372 non-null float64
70 total_rev_hi_lim 817103 non-null float64
71 inq_fi 21372 non-null float64
72 total_cu_tl 21372 non-null float64
73 inq_last_12m 21372 non-null float64
dtypes: float64(49), int64(2), object(23)
memory usage: 501.0+ MB
</code></pre></div>
<p>We'll be predicting credit risk. Let's select the most promising columns in the dataset so that we won't have to deal with all of the columns.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[[</span>
<span class="s2">"annual_inc"</span><span class="p">,</span>
<span class="s2">"collections_12_mths_ex_med"</span><span class="p">,</span>
<span class="s2">"delinq_2yrs"</span><span class="p">,</span>
<span class="s2">"dti"</span><span class="p">,</span>
<span class="s2">"emp_length"</span><span class="p">,</span>
<span class="s2">"home_ownership"</span><span class="p">,</span>
<span class="s2">"acc_now_delinq"</span><span class="p">,</span>
<span class="s2">"installment"</span><span class="p">,</span>
<span class="s2">"int_rate"</span><span class="p">,</span>
<span class="s2">"last_pymnt_amnt"</span><span class="p">,</span>
<span class="s2">"loan_amnt"</span><span class="p">,</span>
<span class="s2">"loan_status"</span><span class="p">,</span>
<span class="s2">"pub_rec"</span><span class="p">,</span>
<span class="s2">"purpose"</span><span class="p">,</span>
<span class="s2">"revol_util"</span><span class="p">,</span>
<span class="s2">"term"</span><span class="p">,</span>
<span class="s2">"total_pymnt_inv"</span><span class="p">,</span>
<span class="s2">"verification_status"</span>
<span class="p">]]</span>
</code></pre></div>
<p>To make the data easier to explore we'll rename the columns to make their names more user friendly.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span>
<span class="s2">"annual_inc"</span><span class="p">:</span> <span class="s2">"AnnualIncome"</span><span class="p">,</span>
<span class="s2">"collections_12_mths_ex_med"</span><span class="p">:</span> <span class="s2">"CollectionsInLast12Months"</span><span class="p">,</span>
<span class="s2">"delinq_2yrs"</span><span class="p">:</span> <span class="s2">"DelinquenciesInLast2Years"</span><span class="p">,</span>
<span class="s2">"dti"</span><span class="p">:</span> <span class="s2">"DebtToIncomeRatio"</span><span class="p">,</span>
<span class="s2">"emp_length"</span><span class="p">:</span> <span class="s2">"EmploymentLength"</span><span class="p">,</span>
<span class="s2">"home_ownership"</span><span class="p">:</span> <span class="s2">"HomeOwnership"</span><span class="p">,</span>
<span class="s2">"acc_now_delinq"</span><span class="p">:</span> <span class="s2">"NumberOfDelinquentAccounts"</span><span class="p">,</span>
<span class="s2">"installment"</span><span class="p">:</span> <span class="s2">"MonthlyInstallmentPayment"</span><span class="p">,</span>
<span class="s2">"int_rate"</span><span class="p">:</span> <span class="s2">"InterestRate"</span><span class="p">,</span>
<span class="s2">"last_pymnt_amnt"</span><span class="p">:</span> <span class="s2">"LastPaymentAmount"</span><span class="p">,</span>
<span class="s2">"loan_amnt"</span><span class="p">:</span> <span class="s2">"LoanAmount"</span><span class="p">,</span>
<span class="s2">"loan_status"</span><span class="p">:</span> <span class="s2">"LoanStatus"</span><span class="p">,</span>
<span class="s2">"pub_rec"</span><span class="p">:</span> <span class="s2">"DerogatoryPublicRecordCount"</span><span class="p">,</span>
<span class="s2">"purpose"</span><span class="p">:</span> <span class="s2">"LoanPurpose"</span><span class="p">,</span>
<span class="s2">"revol_util"</span><span class="p">:</span> <span class="s2">"RevolvingLineUtilizationRate"</span><span class="p">,</span>
<span class="s2">"term"</span><span class="p">:</span> <span class="s2">"Term"</span><span class="p">,</span>
<span class="s2">"total_pymnt_inv"</span><span class="p">:</span> <span class="s2">"TotalPaymentsToDate"</span><span class="p">,</span>
<span class="s2">"verification_status"</span><span class="p">:</span> <span class="s2">"VerificationStatus"</span><span class="p">,</span>
<span class="p">})</span>
</code></pre></div>
<p>We'll also build a simple data dictionary with the column descriptions that were downloaded with the dataset.</p>
<div class="highlight"><pre><span></span><code><span class="n">data_dictionary</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"AnnualIncome"</span><span class="p">:</span> <span class="s2">"The self-reported annual income provided by the borrower during registration."</span><span class="p">,</span>
<span class="s2">"CollectionsInLast12Months"</span><span class="p">:</span> <span class="s2">"Number of collections in 12 months excluding medical collections."</span><span class="p">,</span>
<span class="s2">"DelinquenciesInLast2Years"</span><span class="p">:</span> <span class="s2">"The number of 30+ days past-due incidences of delinquency in the borrower's credit file for the past 2 years."</span><span class="p">,</span>
<span class="s2">"DebtToIncomeRatio"</span><span class="p">:</span> <span class="s2">"A ratio calculated using the borrower’s total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income."</span><span class="p">,</span>
<span class="s2">"EmploymentLength"</span><span class="p">:</span> <span class="s2">"Employment length in years. Possible values are between 0 and 10 where 0 means less than one year and 10 means ten or more years."</span><span class="p">,</span>
<span class="s2">"HomeOwnership"</span><span class="p">:</span> <span class="s2">"The home ownership status provided by the borrower during registration. Our values are: RENT, OWN, MORTGAGE, OTHER."</span><span class="p">,</span>
<span class="s2">"NumberOfDelinquentAccounts"</span><span class="p">:</span> <span class="s2">"The number of accounts on which the borrower is now delinquent."</span><span class="p">,</span>
<span class="s2">"MonthlyInstallmentPayment"</span><span class="p">:</span> <span class="s2">"The monthly payment owed by the borrower if the loan originates."</span><span class="p">,</span>
<span class="s2">"InterestRate"</span><span class="p">:</span> <span class="s2">"Interest Rate on the loan."</span><span class="p">,</span>
<span class="s2">"LastPaymentAmount"</span><span class="p">:</span> <span class="s2">"Last total payment amount received."</span><span class="p">,</span>
<span class="s2">"LoanAmount"</span><span class="p">:</span> <span class="s2">"The listed amount of the loan applied for by the borrower."</span><span class="p">,</span>
<span class="s2">"LoanStatus"</span><span class="p">:</span> <span class="s2">"Current status of the loan."</span><span class="p">,</span>
<span class="s2">"DerogatoryPublicRecordCount"</span><span class="p">:</span> <span class="s2">"Number of derogatory public records."</span><span class="p">,</span>
<span class="s2">"LoanPurpose"</span><span class="p">:</span> <span class="s2">"A category provided by the borrower for the loan request."</span><span class="p">,</span>
<span class="s2">"RevolvingLineUtilizationRate"</span><span class="p">:</span> <span class="s2">"Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit."</span><span class="p">,</span>
<span class="s2">"Term"</span><span class="p">:</span> <span class="s2">"The number of payments on the loan. Values are in months and can be either 36 or 60."</span><span class="p">,</span>
<span class="s2">"TotalPaymentsToDate"</span><span class="p">:</span> <span class="s2">"Payments received to date for portion of total amount funded by investors."</span><span class="p">,</span>
<span class="s2">"VerificationStatus"</span><span class="p">:</span> <span class="s2">"Indicates if income was verified."</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div>
<h2>Building a Model</h2>
<p>Now that we have the dataset, we'll start working on training a model. We'll be doing data exploration, data preparation, model training, and model validation.</p>
<h3>Profiling the Data</h3>
<p>Profiling the data can tell us a lot about the internal structure of the dataset. To profile the data, we'll use the <a href="https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/">pandas_profiling package</a>.</p>
<p>To install the package, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">pandas_profiling</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The profile is built with this code:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pandas_profiling</span> <span class="kn">import</span> <span class="n">ProfileReport</span>
<span class="n">profile</span> <span class="o">=</span> <span class="n">ProfileReport</span><span class="p">(</span><span class="n">data</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s2">"Credit Risk Analysis Dataset Report"</span><span class="p">,</span>
<span class="n">pool_size</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">progress_bar</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">dataset</span><span class="o">=</span><span class="p">{</span>
<span class="s2">"description"</span><span class="p">:</span> <span class="s2">"Lending Club loan data, complete loan data for all loans issued through the 2007-2015."</span>
<span class="p">},</span>
<span class="n">variables</span><span class="o">=</span><span class="p">{</span>
<span class="s2">"descriptions"</span><span class="p">:</span> <span class="n">data_dictionary</span>
<span class="p">})</span>
</code></pre></div>
<p>We passed the data dictionary we built to the profile; it will be saved in the generated report.</p>
<p>Once the report is created, we'll save it to disk as an HTML file.</p>
<div class="highlight"><pre><span></span><code><span class="n">profile</span><span class="o">.</span><span class="n">to_file</span><span class="p">(</span><span class="s2">"../credit_risk_model/model_files/data_exploration_report.html"</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Right away the profile will tell us a few key details about the dataset:</p>
<p><img alt="Data Overview" src="https://www.tekhnoal.com/data_overview_hcfmlm.png" width="100%"></p>
<p>The profile also contains a few alerts about the data:</p>
<p><img alt="Data Alerts" src="https://www.tekhnoal.com/data_alerts_hcfmlm.png" width="100%"></p>
<p>The profile has a description for each variable. Here's the description for the "LoanStatus" variable, which we will use to build the target variable.</p>
<p><img alt="Loan Status Variable Description" src="https://www.tekhnoal.com/loan_status_description_hcfmlm.png" width="100%"></p>
<p>By using the pandas_profiling package we can avoid rewriting the routine data analysis code that we would otherwise write for every dataset.</p>
<h3>Preparing the Data</h3>
<p>The column that we're interested in is the "LoanStatus" column which tells us the current status of the loan. The values in the column are:</p>
<div class="highlight"><pre><span></span><code><span class="nb">list</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="s2">"LoanStatus"</span><span class="p">]</span><span class="o">.</span><span class="n">unique</span><span class="p">())</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>['Fully Paid',
'Charged Off',
'Current',
'Default',
'Late (31-120 days)',
'In Grace Period',
'Late (16-30 days)',
'Does not meet the credit policy. Status:Fully Paid',
'Does not meet the credit policy. Status:Charged Off',
'Issued']
</code></pre></div>
<p>We'll be using this column to predict how risky a loan is. To do this, we'll need to create a target column that maps the values above into values that represent the riskiness of the loan. To keep things simple we'll simply create two categories for loans:</p>
<ul>
<li>"safe", for all loans that are current, fully paid off, in grace period, or the payment plan has not started yet</li>
<li>"risky", for all other loans</li>
</ul>
<p>Now we'll map the values above to the categories we want:</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="p">[</span><span class="s2">"LoanRisk"</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"LoanStatus"</span><span class="p">]</span><span class="o">.</span><span class="n">replace</span><span class="p">({</span>
<span class="s2">"Fully Paid"</span><span class="p">:</span> <span class="s2">"safe"</span><span class="p">,</span>
<span class="s2">"Charged Off"</span><span class="p">:</span> <span class="s2">"risky"</span><span class="p">,</span>
<span class="s2">"Current"</span><span class="p">:</span> <span class="s2">"safe"</span><span class="p">,</span>
<span class="s2">"Default"</span><span class="p">:</span> <span class="s2">"risky"</span><span class="p">,</span>
<span class="s2">"Late (31-120 days)"</span><span class="p">:</span> <span class="s2">"risky"</span><span class="p">,</span>
<span class="s2">"In Grace Period"</span><span class="p">:</span> <span class="s2">"safe"</span><span class="p">,</span>
<span class="s2">"Late (16-30 days)"</span><span class="p">:</span> <span class="s2">"risky"</span><span class="p">,</span>
<span class="s2">"Does not meet the credit policy. Status:Fully Paid"</span><span class="p">:</span> <span class="s2">"safe"</span><span class="p">,</span>
<span class="s2">"Does not meet the credit policy. Status:Charged Off"</span><span class="p">:</span> <span class="s2">"risky"</span><span class="p">,</span>
<span class="s2">"Issued"</span><span class="p">:</span> <span class="s2">"safe"</span>
<span class="p">})</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(887379, 19)
</code></pre></div>
<p>Now we can remove the "LoanStatus" column since we created a new target column.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">drop</span><span class="p">(</span><span class="s2">"LoanStatus"</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(887379, 18)
</code></pre></div>
<p>Now that we have a defined target column, we can move on to fixing some things in the dataset. From the profile we can see that there are several problems with the data that we need to fix.</p>
<p>The profile tells us that there are rows with missing data. To simplify the data modeling, we'll drop these rows.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">dropna</span><span class="p">()</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(841954, 18)
</code></pre></div>
<p>All of the rows with missing values are now gone. </p>
<p>The "AnnualIncome" column is highly skewed. This is due to some rows which have outlier values, for example the max value for this column is $9,500,000. We'll fix this by removing rows with outlier values in that column. We'll remove the rows with values in this column that are more than three standard deviations from the mean like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">stats</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">stats</span><span class="o">.</span><span class="n">zscore</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="s2">"AnnualIncome"</span><span class="p">]))</span> <span class="o"><</span> <span class="mi">3</span><span class="p">)]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(835167, 18)
</code></pre></div>
<p>Another column in the dataset that is highly skewed is "DebtToIncomeRatio". For example, the maximum value in this column is 9999, which is probably not correct since most of the values in the column fall between 0 and 100.</p>
<p>We'll use the same code to remove the outlier values for DebtToIncomeRatio.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">stats</span><span class="o">.</span><span class="n">zscore</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="s2">"DebtToIncomeRatio"</span><span class="p">]))</span> <span class="o"><</span> <span class="mi">3</span><span class="p">)]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(835112, 18)
</code></pre></div>
<p>The column "NumberOfDelinquentAccounts" is highly skewed because of a single record that has a value of 14. We'll remove the outliers by simply filtering out the rows with values above 6.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"NumberOfDelinquentAccounts"</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">6</span><span class="p">]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(835111, 18)
</code></pre></div>
<p>The "HomeOwnership" column has several values that stand in for missing data. These values make up a small portion of the dataset, so we'll just remove the rows instead of doing data imputation.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"HomeOwnership"</span><span class="p">]</span> <span class="o">!=</span> <span class="s2">"OTHER"</span><span class="p">]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"HomeOwnership"</span><span class="p">]</span> <span class="o">!=</span> <span class="s2">"NONE"</span><span class="p">]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"HomeOwnership"</span><span class="p">]</span> <span class="o">!=</span> <span class="s2">"ANY"</span><span class="p">]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(834889, 18)
</code></pre></div>
<p>Looks like we only lost a few hundred rows from the dataset.</p>
<p>The variable "CollectionsInLast12Months" is not highly skewed, but it contains values that appear only once or very few times. There are very few samples with a value above 5; these samples are unlikely to be useful, so we'll remove them.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"CollectionsInLast12Months"</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">5</span><span class="p">]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(834884, 18)
</code></pre></div>
<p>The same is true for the "DelinquenciesInLast2Years" and "DerogatoryPublicRecordCount" variables. There are very few samples with a value above 10 for these variables. We'll remove those samples.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"DelinquenciesInLast2Years"</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">10</span><span class="p">]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"DerogatoryPublicRecordCount"</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">10</span><span class="p">]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(834415, 18)
</code></pre></div>
<p>The variable "RevolvingLineUtilizationRate" is a percentage whose values must be between 0 and 100. It shouldn't be possible to use more than 100% of a revolving line of credit, yet this variable has values above 100; we'll remove those samples because they're bad data.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">[</span><span class="s2">"RevolvingLineUtilizationRate"</span><span class="p">]</span> <span class="o"><=</span> <span class="mf">100.0</span><span class="p">]</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(831103, 18)
</code></pre></div>
<h3>Validating the Data</h3>
<p>Next, we'll use the <a href="https://docs.deepchecks.com/stable/getting-started/welcome.html">deepchecks package</a> to run ML-specific checks on the data. These checks look for data issues that might affect an ML model.</p>
<p>Let's install the package:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">deepchecks</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Before we can run these checks, we need to specify which variables are categorical and numerical, and which variable is the target variable. We'll create lists of variable names for this purpose.</p>
<div class="highlight"><pre><span></span><code><span class="n">categorical_variables</span> <span class="o">=</span> <span class="p">[</span>
<span class="s2">"EmploymentLength"</span><span class="p">,</span>
<span class="s2">"HomeOwnership"</span><span class="p">,</span>
<span class="s2">"LoanPurpose"</span><span class="p">,</span>
<span class="s2">"VerificationStatus"</span><span class="p">,</span>
<span class="s2">"Term"</span>
<span class="p">]</span>
<span class="n">numerical_variables</span> <span class="o">=</span> <span class="p">[</span>
<span class="s2">"AnnualIncome"</span><span class="p">,</span>
<span class="s2">"CollectionsInLast12Months"</span><span class="p">,</span>
<span class="s2">"DelinquenciesInLast2Years"</span><span class="p">,</span>
<span class="s2">"DebtToIncomeRatio"</span><span class="p">,</span>
<span class="s2">"NumberOfDelinquentAccounts"</span><span class="p">,</span>
<span class="s2">"MonthlyInstallmentPayment"</span><span class="p">,</span>
<span class="s2">"InterestRate"</span><span class="p">,</span>
<span class="s2">"LastPaymentAmount"</span><span class="p">,</span>
<span class="s2">"LoanAmount"</span><span class="p">,</span>
<span class="s2">"DerogatoryPublicRecordCount"</span><span class="p">,</span>
<span class="s2">"RevolvingLineUtilizationRate"</span><span class="p">,</span>
<span class="s2">"TotalPaymentsToDate"</span>
<span class="p">]</span>
<span class="n">target_variable</span> <span class="o">=</span> <span class="s2">"LoanRisk"</span>
<span class="n">all_variables</span> <span class="o">=</span> <span class="n">categorical_variables</span> <span class="o">+</span> <span class="n">numerical_variables</span> <span class="o">+</span> <span class="p">[</span><span class="n">target_variable</span><span class="p">]</span>
<span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">all_variables</span><span class="p">)</span> <span class="o">==</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">columns</span><span class="p">),</span> <span class="s2">"A column is missing from the lists of variables."</span>
</code></pre></div>
<p>The assert statement at the end of the code above is a simple check that ensures we haven't forgotten to include any of the columns in the lists. If a column is missing from the lists, it stops execution of the notebook with an error message.</p>
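<p>The length check tells us <em>that</em> a column is missing, but not <em>which</em> one. A set difference reports the offending names directly; a minimal sketch using stand-in column names:</p>

```python
# Report exactly which columns were left out of the variable lists,
# instead of only failing a length comparison. Names are illustrative.
all_variables = ["AnnualIncome", "LoanAmount", "LoanRisk"]
dataframe_columns = ["AnnualIncome", "LoanAmount", "InterestRate", "LoanRisk"]

missing = sorted(set(dataframe_columns) - set(all_variables))
print(missing)  # ['InterestRate'] -- the column we forgot to list
```

<p>In the notebook, <code>dataframe_columns</code> would be <code>data.columns</code>.</p>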
<p>To avoid errors during training, we'll need to change the column type of the categorical columns in the pandas dataframe to "category".</p>
<div class="highlight"><pre><span></span><code><span class="k">for</span> <span class="n">column_name</span> <span class="ow">in</span> <span class="n">categorical_variables</span><span class="p">:</span>
<span class="n">data</span><span class="p">[</span><span class="n">column_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">column_name</span><span class="p">]</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="s2">"category"</span><span class="p">)</span>
</code></pre></div>
<p>Let's start the data validation checks.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">deepchecks.tabular</span> <span class="kn">import</span> <span class="n">Dataset</span>
<span class="kn">from</span> <span class="nn">deepchecks.tabular.suites</span> <span class="kn">import</span> <span class="n">data_integrity</span>
<span class="n">dataset</span> <span class="o">=</span> <span class="n">Dataset</span><span class="p">(</span><span class="n">data</span><span class="p">,</span>
<span class="n">cat_features</span><span class="o">=</span><span class="n">categorical_variables</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">target_variable</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The Dataset object contains a reference to the original Dataframe that we've been working with, and also contains the metadata about the columns in the dataframe that is needed to analyze the data.</p>
<p>We'll run the checks on the data like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">data_integrity_suite</span> <span class="o">=</span> <span class="n">data_integrity</span><span class="p">()</span>
<span class="n">suite_result</span> <span class="o">=</span> <span class="n">data_integrity_suite</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">dataset</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">suite_result</span><span class="o">.</span><span class="n">save_as_html</span><span class="p">(</span><span class="s2">"../credit_risk_model/model_files/deepchecks_data_integrity_results.html"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>'../credit_risk_model/model_files/deepchecks_data_integrity_results.html'
</code></pre></div>
<p>The checks done by this suite are geared towards datasets used for machine learning.</p>
<p>The results of the data integrity suite look like this:</p>
<p><img alt="Data Integrity Suite" src="https://www.tekhnoal.com/data_integrity_suite_hcfmlm.png" width="100%"></p>
<p>The suite contains many checks that execute on the data set, the checks that passed are:</p>
<ul>
<li>Feature Label Correlation, predictive power score is less than 0.8 for all features.</li>
<li>Single Value in Column, column does not contain only a single value</li>
<li>Special Characters, ratio of samples containing solely special character is less or equal to 0.1%</li>
<li>Mixed Nulls, number of different null types is less or equal to 1</li>
<li>Mixed Data Types, rare data types in column are either more than 10% or less than 1% of the data</li>
<li>Data Duplicates, duplicate data ratio is less or equal to 0%</li>
<li>String Length Out Of Bounds, ratio of string length outliers is less or equal to 0%</li>
<li>Conflicting Labels, ambiguous sample ratio is less or equal to 0%</li>
</ul>
<p>The checks are fully explained in the <a href="https://docs.deepchecks.com/0.9/checks_gallery/tabular.html">deepchecks documentation</a>.</p>
<p>For now, we're more interested in the checks that did not pass:</p>
<p><img alt="Data Integrity Suite Fail" src="https://www.tekhnoal.com/data_integrity_suite_fail_hcfmlm.png" width="100%"></p>
<p>The two checks that didn't pass are:</p>
<ul>
<li>Feature-Feature Correlation, not more than 0 pairs are correlated above 0.9</li>
<li>String Mismatch, no string variants</li>
</ul>
<p>The deepchecks package found that the "LoanAmount" and "MonthlyInstallmentPayment" variables are highly correlated, which makes sense because an increase in loan amount will always cause an increase in payment amount. We can safely drop the MonthlyInstallmentPayment column from the dataset.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">drop</span><span class="p">(</span><span class="s2">"MonthlyInstallmentPayment"</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">numerical_variables</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="s2">"MonthlyInstallmentPayment"</span><span class="p">)</span>
<span class="n">data</span><span class="o">.</span><span class="n">shape</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(831103, 17)
</code></pre></div>
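<p>The Feature-Feature Correlation check can also be reproduced directly with pandas, which is handy for a quick look outside of deepchecks. A sketch on toy data, where "MonthlyInstallmentPayment" is deliberately made an almost-linear function of "LoanAmount" (column names mirror the dataset, the data itself is synthetic):</p>

```python
import numpy as np
import pandas as pd

# "MonthlyInstallmentPayment" tracks "LoanAmount" almost exactly,
# while "InterestRate" is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "LoanAmount": x,
    "MonthlyInstallmentPayment": x * 0.02 + rng.normal(scale=0.001, size=200),
    "InterestRate": rng.normal(size=200),
})

corr = df.corr().abs()
# walk the upper triangle so each pair is considered once
pairs = [
    (a, b, corr.loc[a, b])
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if corr.loc[a, b] > 0.9
]
print(pairs)  # the LoanAmount / MonthlyInstallmentPayment pair, correlation near 1
```

<p>This is the same idea deepchecks applies, though its actual implementation also handles categorical variables.</p>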
<p>The deepchecks package also found that the "EmploymentLength" variable contains string values that are similar to each other. For example, two levels found in the categorical variable are "1 year" and "< 1 year". This is a warning that we can ignore because the levels are correctly set.</p>
<p>We're now getting closer to a dataset that we can use to train a model. We'll be using deepchecks to do train/test dataset checks and model checks later on.</p>
<h3>Training a Model</h3>
<p>To train a model, we'll first create a training and testing set. We'll use 80% of the rows for training and 20% of the rows for testing.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="n">mask</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">))</span> <span class="o"><</span> <span class="mf">0.80</span>
<span class="n">training_data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">mask</span><span class="p">]</span>
<span class="n">testing_data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="o">~</span><span class="n">mask</span><span class="p">]</span>
<span class="nb">print</span><span class="p">(</span><span class="n">training_data</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">testing_data</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>(664983, 17)
(166120, 17)
</code></pre></div>
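<p>One caveat about the split above: the random mask is unseeded, so re-running the notebook produces a different train/test partition each time. Seeding the generator makes the split reproducible; a small sketch of the same masking approach:</p>

```python
import numpy as np

# Same 80/20 masking idea as above, but with a fixed seed so the
# split is identical on every run.
rng = np.random.default_rng(42)
n_rows = 100_000
mask = rng.random(n_rows) < 0.80

train_count = int(mask.sum())
test_count = int((~mask).sum())
assert train_count + test_count == n_rows  # every row lands in exactly one set
```

<p>The same effect can also be had by passing <code>random_state</code> to scikit-learn's <code>train_test_split</code>.</p>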
<p>Next, we'll run a deepchecks test suite on the newly created training and testing sets. The deepchecks package requires two Dataset objects, one for the training set and one for the testing set.</p>
<div class="highlight"><pre><span></span><code><span class="n">train_dataset</span> <span class="o">=</span> <span class="n">Dataset</span><span class="p">(</span><span class="n">training_data</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">target_variable</span><span class="p">,</span>
<span class="n">cat_features</span><span class="o">=</span><span class="n">categorical_variables</span><span class="p">)</span>
<span class="n">test_dataset</span> <span class="o">=</span> <span class="n">Dataset</span><span class="p">(</span><span class="n">testing_data</span><span class="p">,</span>
<span class="n">label</span><span class="o">=</span><span class="n">target_variable</span><span class="p">,</span>
<span class="n">cat_features</span><span class="o">=</span><span class="n">categorical_variables</span><span class="p">)</span>
</code></pre></div>
<p>Now we can run the train-test validation suite.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">deepchecks.tabular.suites</span> <span class="kn">import</span> <span class="n">train_test_validation</span>
<span class="n">validation_suite</span> <span class="o">=</span> <span class="n">train_test_validation</span><span class="p">()</span>
<span class="n">suite_result</span> <span class="o">=</span> <span class="n">validation_suite</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">train_dataset</span><span class="p">,</span> <span class="n">test_dataset</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll save the results to files for this suite as well.</p>
<div class="highlight"><pre><span></span><code><span class="n">suite_result</span><span class="o">.</span><span class="n">save_as_html</span><span class="p">(</span><span class="s2">"../credit_risk_model/model_files/deepchecks_train_test_results.html"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>'../credit_risk_model/model_files/deepchecks_train_test_results.html'
</code></pre></div>
<p>The results of the suite look like this:</p>
<p><img alt="Train Test Suite" src="https://www.tekhnoal.com/train_test_suite_hcfmlm.png" width="100%"></p>
<p>All of the checks in this suite passed:</p>
<ul>
<li>Datasets Size Comparison, Test-Train size ratio is greater than 0.01</li>
<li>Category Mismatch Train Test, ratio of samples with a new category is less or equal to 0%</li>
<li>Feature Label Correlation Change, Train-Test features' Predictive Power Score difference is less than 0.2</li>
<li>Feature Label Correlation Change, Train features' Predictive Power Score is less than 0.7</li>
<li>Train Test Feature Drift, categorical drift score < 0.2 and numerical drift score < 0.1</li>
<li>Train Test Label Drift, categorical drift score < 0.2 and numerical drift score < 0.1 for label drift (the label's Cramer's V drift score is 0)</li>
<li>New Label Train Test, number of new label values is less or equal to 0</li>
<li>String Mismatch Comparison, no new variants allowed in test data</li>
<li>Train Test Samples Mix, percentage of test data samples that appear in train data is less or equal to 10%</li>
<li>Multivariate Drift, drift value is less than 0.25</li>
</ul>
<p>These checks are more fully explained in the <a href="https://docs.deepchecks.com/0.9/checks_gallery/tabular/train_test_validation/">documentation</a>.</p>
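<p>For intuition, the Train Test Samples Mix check boils down to counting test rows that also appear verbatim in the training data, which can be approximated with a pandas inner merge. A toy sketch (not deepchecks' actual implementation):</p>

```python
import pandas as pd

# Toy frames: one of the two test rows also appears verbatim in the training data.
train = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
test = pd.DataFrame({"a": [3, 4], "b": ["z", "w"]})

# With no `on` argument, merge joins on all shared columns,
# so the result is exactly the rows present in both frames.
overlap = test.merge(train, how="inner")
mix_ratio = len(overlap) / len(test)
print(mix_ratio)  # 0.5
```

<p>The check passes when this ratio stays at or below 10%.</p>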
<p>Now that we have verified the contents of the training and testing sets, we're finally ready to train a model. To do this, we'll need to create separate dataframes for the predictor and target columns:</p>
<div class="highlight"><pre><span></span><code><span class="n">feature_columns</span> <span class="o">=</span> <span class="n">categorical_variables</span> <span class="o">+</span> <span class="n">numerical_variables</span>
<span class="n">X_train</span> <span class="o">=</span> <span class="n">training_data</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_train</span> <span class="o">=</span> <span class="n">training_data</span><span class="p">[</span><span class="n">target_variable</span><span class="p">]</span>
<span class="n">X_test</span> <span class="o">=</span> <span class="n">testing_data</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_test</span> <span class="o">=</span> <span class="n">testing_data</span><span class="p">[</span><span class="n">target_variable</span><span class="p">]</span>
</code></pre></div>
<p>We'll be using the <a href="https://lightgbm.readthedocs.io/en/latest/index.html">LightGBM package</a> to train a GBM model and the <a href="https://microsoft.github.io/FLAML/">FLAML package</a> for doing automated machine learning. Let's install the packages:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">lightgbm</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">flaml</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Let's train a model using the default hyperparameters to have a baseline.</p>
<div class="highlight"><pre><span></span><code><span class="o">%%</span><span class="n">time</span>
<span class="kn">from</span> <span class="nn">lightgbm</span> <span class="kn">import</span> <span class="n">LGBMClassifier</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">LGBMClassifier</span><span class="p">()</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CPU times: user 13.1 s, sys: 657 ms, total: 13.8 s
Wall time: 3.46 s
</code></pre></div>
<p>Now let's calculate the classification metrics for this simple model:</p>
<div class="highlight"><pre><span></span><code><span class="o">%%</span><span class="n">time</span>
<span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">classification_report</span>
<span class="n">y_pred</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">classification_report</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">))</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> precision recall f1-score support
risky 0.92 0.14 0.24 11456
safe 0.94 1.00 0.97 154664
accuracy 0.94 166120
macro avg 0.93 0.57 0.61 166120
weighted avg 0.94 0.94 0.92 166120
CPU times: user 8.96 s, sys: 191 ms, total: 9.15 s
Wall time: 6.52 s
</code></pre></div>
<p>The "safe" class has good metrics but the "risky" class does not, because the classes are heavily imbalanced: "risky" samples make up only about 7% of the test set.</p>
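<p>One common remedy for imbalance, not used in this notebook, is to reweight the classes so that errors on the rare class cost more. A sketch of the "balanced" weighting heuristic used by scikit-learn (and accepted by LightGBM's <code>class_weight</code> parameter):</p>

```python
from collections import Counter

def balanced_class_weights(labels):
    """The 'balanced' heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

# Roughly the safe/risky ratio seen in the test set above
weights = balanced_class_weights(["safe"] * 93 + ["risky"] * 7)
print(weights)  # safe weighted near 0.54, risky near 7.14
```

<p>The rare class ends up weighted roughly 13x higher, which pushes the model to stop ignoring it.</p>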
<p>We'll try to fix this issue by doing automated ML with the FLAML package. The automated hyperparameter search will hopefully find some parameters that can improve the metrics of the "risky" class.</p>
<p>The settings are:</p>
<ul>
<li>time_budget: amount of time allowed for the auto ML algorithm to run</li>
<li>metric: the metric that should be maximized by the auto ML algorithm</li>
<li>estimator_list: the types of estimators that can be used by FLAML, in this case we only want to try LightGBM</li>
<li>task: the type of task that the estimator should be solving</li>
<li>log_file_name: name of the log file output by the auto ML algorithm</li>
<li>seed: the random seed to be used by the auto ML algorithm</li>
</ul>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">flaml</span> <span class="kn">import</span> <span class="n">AutoML</span>
<span class="n">automl</span> <span class="o">=</span> <span class="n">AutoML</span><span class="p">()</span>
<span class="n">settings</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"time_budget"</span><span class="p">:</span> <span class="mi">1200</span><span class="p">,</span>
<span class="s2">"metric"</span><span class="p">:</span> <span class="s2">"roc_auc"</span><span class="p">,</span>
<span class="s2">"estimator_list"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"lgbm"</span><span class="p">],</span>
<span class="s2">"task"</span><span class="p">:</span> <span class="s2">"classification"</span><span class="p">,</span>
<span class="s2">"log_file_name"</span><span class="p">:</span> <span class="s2">"experiment.log"</span><span class="p">,</span>
<span class="s2">"seed"</span><span class="p">:</span> <span class="mi">42</span>
<span class="p">}</span>
<span class="n">automl</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="o">=</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="o">=</span><span class="n">y_train</span><span class="p">,</span> <span class="o">**</span><span class="n">settings</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The hyperparameter search has found an optimal set of hyperparameters using the training set and cross validation. These are the hyperparameters found:</p>
<div class="highlight"><pre><span></span><code><span class="n">automl</span><span class="o">.</span><span class="n">best_config</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{'</span><span class="n">n_estimators</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">10707</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">num_leaves</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">7</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">min_child_samples</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">62</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">learning_rate</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">0.24185440044608203</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">log_max_bin</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">10</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">colsample_bytree</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">0.9914098492087268</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">reg_alpha</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">2.551067627605118</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">reg_lambda</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">0.0010846951681516895</span><span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>Let's train a model using the optimal hyperparameters:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">LGBMClassifier</span><span class="p">(</span><span class="o">**</span><span class="n">automl</span><span class="o">.</span><span class="n">best_config</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<p>Let's get the classification metrics for the best model:</p>
<div class="highlight"><pre><span></span><code><span class="n">y_pred</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">classification_report</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">))</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> precision recall f1-score support
risky 0.89 0.46 0.61 11456
safe 0.96 1.00 0.98 154664
accuracy 0.96 166120
macro avg 0.93 0.73 0.79 166120
weighted avg 0.96 0.96 0.95 166120
</code></pre></div>
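<p>The improvement is easiest to read in the "risky" row: recall jumped from 0.14 to 0.46. To make the columns concrete, here is how precision, recall, and f1 fall out of the confusion counts, with illustrative numbers chosen to roughly mirror the "risky" row:</p>

```python
def precision_recall_f1(tp, fp, fn):
    # precision: of the samples predicted positive, how many were right
    # recall: of the actual positives, how many were found
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

print(precision_recall_f1(90, 10, 110))  # precision 0.9, recall 0.45, f1 near 0.6
```

<p>Notice that precision barely moved between the two models; nearly all of the f1 gain came from recall, meaning the tuned model finds far more of the risky loans without flagging many more safe ones.</p>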
<h3>Validating the Model</h3>
<p>Deepchecks is also able to validate the model with the model_evaluation suite of checks.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">deepchecks.tabular.suites</span> <span class="kn">import</span> <span class="n">model_evaluation</span>
<span class="n">evaluation_suite</span> <span class="o">=</span> <span class="n">model_evaluation</span><span class="p">()</span>
<span class="n">suite_result</span> <span class="o">=</span> <span class="n">evaluation_suite</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">train_dataset</span><span class="p">,</span> <span class="n">test_dataset</span><span class="p">,</span> <span class="n">model</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">suite_result</span><span class="o">.</span><span class="n">save_as_html</span><span class="p">(</span><span class="s2">"../credit_risk_model/model_files/deepchecks_model_evaluation_results.html"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>'../credit_risk_model/model_files/deepchecks_model_evaluation_results.html'
</code></pre></div>
<h3>Packaging the Model Files</h3>
<p>Let's serialize the best model to disk:</p>
<div class="highlight"><pre><span></span><code><span class="o">%%</span><span class="n">time</span>
<span class="kn">import</span> <span class="nn">joblib</span>
<span class="n">joblib</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="s2">"../credit_risk_model/model_files/model.joblib"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CPU times: user 1.33 s, sys: 82.1 ms, total: 1.41 s
Wall time: 226 ms
['../credit_risk_model/model_files/model.joblib']
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">ls</span> <span class="o">-</span><span class="n">la</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>total 82824
drwxr-xr-x 8 brian staff 256 Jan 15 20:07 [34m.[m[m
drwxr-xr-x 8 brian staff 256 Dec 13 11:15 [34m..[m[m
-rw-r--r--@ 1 brian staff 6148 Jan 15 19:09 .DS_Store
-rw-r--r-- 1 brian staff 9426966 Jan 15 19:12 data_exploration_report.html
-rw-r--r-- 1 brian staff 7750291 Jan 15 19:15 deepchecks_data_integrity_results.html
-rw-r--r--@ 1 brian staff 7964754 Jan 15 20:06 deepchecks_model_evaluation_results.html
-rw-r--r-- 1 brian staff 7793791 Jan 15 19:17 deepchecks_train_test_results.html
-rw-r--r-- 1 brian staff 9452707 Jan 15 20:07 model.joblib
</code></pre></div>
<p>The serialized model is 9.5 megabytes in size and took 226 milliseconds to write to disk. This is important to note because we will need to deserialize the model later in order to make predictions with it.</p>
<p>In the process of training this model, we created a few files. We'll package them up and save them in a location that the prediction code can access later, using a .zip file for this purpose.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">shutil</span>
<span class="n">shutil</span><span class="o">.</span><span class="n">make_archive</span><span class="p">(</span><span class="s2">"../credit_risk_model/model_files/1"</span><span class="p">,</span> <span class="s2">"zip"</span><span class="p">,</span> <span class="s2">"../credit_risk_model/model_files"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="s1">'</span><span class="s">/Users/brian/Code/health-checks-for-ml-model-deployments/credit_risk_model/model_files/1.zip</span><span class="s1">'</span>
</code></pre></div>
<p>The command created a .zip file containing all of the files in the model_files folder. The archive is named "1.zip", a simple name denoting that it is the first model trained for the credit_risk_model package.</p>
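<p>The packaging step can be exercised end-to-end on a throwaway folder using only the standard library; this sketch builds a fake model_files directory, archives it the same way, and inspects the archive's contents (file names are stand-ins):</p>

```python
import pathlib
import shutil
import tempfile
import zipfile

# Recreate the packaging step on a temporary folder and verify the archive.
with tempfile.TemporaryDirectory() as tmp:
    model_files = pathlib.Path(tmp) / "model_files"
    model_files.mkdir()
    (model_files / "model.joblib").write_bytes(b"serialized model placeholder")
    (model_files / "deepchecks_results.html").write_text("<html></html>")

    # make_archive appends ".zip" and returns the archive's full path
    archive = shutil.make_archive(str(pathlib.Path(tmp) / "1"), "zip", model_files)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)  # ['deepchecks_results.html', 'model.joblib']
```

<p>Because the third argument (<code>root_dir</code>) is the model_files folder itself, entries in the archive are stored relative to it, which is what the prediction code will expect when it unpacks the files.</p>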
<p>Now that we have the model files in a .zip file, we can delete the original files from the folder:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">data_exploration_report</span><span class="o">.</span><span class="n">html</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">deepchecks_data_integrity_results</span><span class="o">.</span><span class="n">html</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">deepchecks_train_test_results</span><span class="o">.</span><span class="n">html</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">deepchecks_model_evaluation_results</span><span class="o">.</span><span class="n">html</span>
<span class="err">!</span><span class="n">rm</span> <span class="o">../</span><span class="n">credit_risk_model</span><span class="o">/</span><span class="n">model_files</span><span class="o">/</span><span class="n">model</span><span class="o">.</span><span class="n">joblib</span>
</code></pre></div>
<p>This packaging process ensures that all of the results of the model training process end up in one archive that we can use later. All of the data and model check results are packaged along with the serialized model, so it's easy to review the model training process.</p>
<h2>Making Predictions with the Model</h2>
<p>We now have a working model that accepts Pandas dataframes as input and returns predictions as dataframes. This is useful during model training, but it makes integrating the model with other software components more complicated. To make the model easier to use, we'll create input and output schemas for the model, along with a wrapper class that provides a consistent interface.</p>
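<p>The wrapper idea can be sketched as an abstract interface that every deployed model implements; callers then depend only on <code>predict</code>, never on the underlying LightGBM object. This is a hypothetical illustration of the pattern, not the actual class we'll build:</p>

```python
import abc


class MLModel(abc.ABC):
    """Hypothetical wrapper interface: a sketch of the idea, not the real base class."""

    @abc.abstractmethod
    def predict(self, data: dict) -> dict:
        """Validate the input, make a prediction, and return a validated output."""


class CreditRiskModelStub(MLModel):
    # Stand-in: the real wrapper would deserialize model.joblib, build a
    # dataframe from the validated input, and call the LightGBM predict().
    def predict(self, data: dict) -> dict:
        return {"loan_risk": "safe"}


print(CreditRiskModelStub().predict({"AnnualIncome": 50000}))  # {'loan_risk': 'safe'}
```

<p>Because decorators implement the same interface, logging or data enrichment can wrap any model without the caller noticing.</p>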
<p>We'll create the model's input and output schemas with the <a href="https://pydantic-docs.helpmanual.io/">pydantic package</a>, which is a package used for data validation. By creating the schemas using this package we're able to fully document the inputs that the model accepts and the expected outputs of the model we're going to deploy.</p>
<p>To begin, we'll define the allowed values for the categorical variables.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
<span class="kn">from</span> <span class="nn">enum</span> <span class="kn">import</span> <span class="n">Enum</span>
<span class="k">class</span> <span class="nc">EmploymentLength</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""Employment length in years."""</span>
<span class="n">less_than_1_year</span> <span class="o">=</span> <span class="s2">"< 1 year"</span>
<span class="n">one_year</span> <span class="o">=</span> <span class="s2">"1 year"</span>
<span class="n">two_years</span> <span class="o">=</span> <span class="s2">"2 years"</span>
<span class="n">three_years</span> <span class="o">=</span> <span class="s2">"3 years"</span>
<span class="n">four_years</span> <span class="o">=</span> <span class="s2">"4 years"</span>
<span class="n">five_years</span> <span class="o">=</span> <span class="s2">"5 years"</span>
<span class="n">six_years</span> <span class="o">=</span> <span class="s2">"6 years"</span>
<span class="n">seven_years</span> <span class="o">=</span> <span class="s2">"7 years"</span>
<span class="n">eight_years</span> <span class="o">=</span> <span class="s2">"8 years"</span>
<span class="n">nine_years</span> <span class="o">=</span> <span class="s2">"9 years"</span>
<span class="n">ten_years_or_more</span> <span class="o">=</span> <span class="s2">"10+ years"</span>
<span class="k">class</span> <span class="nc">HomeOwnership</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""The home ownership status provided by the borrower during registration."""</span>
<span class="n">MORTGAGE</span> <span class="o">=</span> <span class="s2">"MORTGAGE"</span>
<span class="n">RENT</span> <span class="o">=</span> <span class="s2">"RENT"</span>
<span class="n">OWN</span> <span class="o">=</span> <span class="s2">"OWN"</span>
<span class="k">class</span> <span class="nc">LoanPurpose</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""A category provided by the borrower for the loan request."""</span>
<span class="n">debt_consolidation</span> <span class="o">=</span> <span class="s2">"debt_consolidation"</span>
<span class="n">credit_card</span> <span class="o">=</span> <span class="s2">"credit_card"</span>
<span class="n">home_improvement</span> <span class="o">=</span> <span class="s2">"home_improvement"</span>
<span class="n">other</span> <span class="o">=</span> <span class="s2">"other"</span>
<span class="n">major_purchase</span> <span class="o">=</span> <span class="s2">"major_purchase"</span>
<span class="n">small_business</span> <span class="o">=</span> <span class="s2">"small_business"</span>
<span class="n">car</span> <span class="o">=</span> <span class="s2">"car"</span>
<span class="n">medical</span> <span class="o">=</span> <span class="s2">"medical"</span>
<span class="n">moving</span> <span class="o">=</span> <span class="s2">"moving"</span>
<span class="n">vacation</span> <span class="o">=</span> <span class="s2">"vacation"</span>
<span class="n">wedding</span> <span class="o">=</span> <span class="s2">"wedding"</span>
<span class="n">house</span> <span class="o">=</span> <span class="s2">"house"</span>
<span class="n">renewable_energy</span> <span class="o">=</span> <span class="s2">"renewable_energy"</span>
<span class="n">educational</span> <span class="o">=</span> <span class="s2">"educational"</span>
<span class="k">class</span> <span class="nc">Term</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""The number of payments on the loan."""</span>
<span class="n">thirty_six_months</span> <span class="o">=</span> <span class="s2">" 36 months"</span>
<span class="n">sixty_months</span> <span class="o">=</span> <span class="s2">" 60 months"</span>
<span class="k">class</span> <span class="nc">VerificationStatus</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""Indicates if income was verified."""</span>
<span class="n">source_verified</span> <span class="o">=</span> <span class="s2">"Source Verified"</span>
<span class="n">verified</span> <span class="o">=</span> <span class="s2">"Verified"</span>
<span class="n">not_verified</span> <span class="o">=</span> <span class="s2">"Not Verified"</span>
</code></pre></div>
<p>The dataset contains 5 categorical variables, so we defined 5 Enum classes that contain the values accepted for these variables. Each enumeration has a key and a value, with the value being the value as the model expects to see it. By using enumerated values, we ensure that the model only receives input values that it has previously seen in the training set.</p>
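<p>Since each enumeration inherits from both str and Enum, constructing a member from a raw string validates it automatically. Here is a minimal, standard-library-only sketch (reusing the HomeOwnership values from above) showing how an unseen category is rejected:</p>

```python
from enum import Enum

class HomeOwnership(str, Enum):
    """The home ownership status provided by the borrower during registration."""
    MORTGAGE = "MORTGAGE"
    RENT = "RENT"
    OWN = "OWN"

# A value seen in the training set maps to an enum member.
status = HomeOwnership("RENT")

# A value the model has never seen raises a ValueError.
try:
    HomeOwnership("CONDO")
    rejected = False
except ValueError:
    rejected = True
```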
<p>Now that we have the categorical variables defined, we can define the input schema for the model:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">CreditRiskModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Inputs for predicting credit risk."""</span>
<span class="n">annual_income</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mi">1896</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">273000</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The self-reported annual income provided by the borrower during registration."</span><span class="p">)</span>
<span class="n">collections_in_last_12_months</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Number of collections in 12 months excluding medical collections."</span><span class="p">)</span>
<span class="n">delinquencies_in_last_2_years</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">39</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The number of 30+ days past-due incidences of delinquency in the borrower's credit file for the past 2 years."</span><span class="p">)</span>
<span class="n">debt_to_income_ratio</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">42.64</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"A ratio calculated using the borrower’s total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income."</span><span class="p">)</span>
<span class="n">employment_length</span><span class="p">:</span> <span class="n">EmploymentLength</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"Employment length in years."</span><span class="p">)</span>
<span class="n">home_ownership</span><span class="p">:</span> <span class="n">HomeOwnership</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The home ownership status provided by the borrower during registration. Values are: RENT, OWN, MORTGAGE."</span><span class="p">)</span>
<span class="n">number_of_delinquent_accounts</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The number of accounts on which the borrower is now delinquent."</span><span class="p">)</span>
<span class="n">interest_rate</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">5.32</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">28.99</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Interest Rate on the loan."</span><span class="p">)</span>
<span class="n">last_payment_amount</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">36475.59</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Last total payment amount received."</span><span class="p">)</span>
<span class="n">loan_amount</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">500.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">35000.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The listed amount of the loan applied for by the borrower."</span><span class="p">)</span>
<span class="n">derogatory_public_record_count</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">86.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Number of derogatory public records."</span><span class="p">)</span>
<span class="n">loan_purpose</span><span class="p">:</span> <span class="n">LoanPurpose</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"A category provided by the borrower for the loan request."</span><span class="p">)</span>
<span class="n">revolving_line_utilization_rate</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">892.3</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit."</span><span class="p">)</span>
<span class="n">term</span><span class="p">:</span> <span class="n">Term</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The number of payments on the loan. Values are in months and can be either 36 or 60."</span><span class="p">)</span>
<span class="n">total_payments_to_date</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">ge</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">57777.58</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Payments received to date for portion of total amount funded by investors."</span><span class="p">)</span>
<span class="n">verification_status</span><span class="p">:</span> <span class="n">VerificationStatus</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"Indicates if income was verified."</span><span class="p">)</span>
</code></pre></div>
<p>The schema is called "CreditRiskModelInput" and contains fields for each variable found in the dataset. We're using the Enum classes we defined above for the categorical fields, and we defined fields for all of the numerical variables. Each numerical field has a range of allowed values that matches the range of the numerical variable found in the dataset. Each field also has a description of the variable that helps the user of the model to correctly feed data to the model.</p>
<p>The process of creating an input data schema makes the model much more user friendly and exposes information found in the dataset that the model was originally trained on to the user of the model. Doing this also allows us to build documentation for the model service automatically.</p>
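<p>To see the validation in action, here is a small, hypothetical schema with just two of the fields above; pydantic rejects any input that falls outside the declared ranges (this sketch assumes pydantic is installed and works with both pydantic v1 and v2):</p>

```python
from pydantic import BaseModel, Field, ValidationError

class LoanInput(BaseModel):
    """Hypothetical, trimmed-down version of CreditRiskModelInput."""
    annual_income: int = Field(ge=1896, le=273000)
    interest_rate: float = Field(ge=5.32, le=28.99)

# Values inside the training-set ranges are accepted.
valid = LoanInput(annual_income=50000, interest_rate=7.5)

# Values outside the ranges raise a ValidationError.
try:
    LoanInput(annual_income=0, interest_rate=7.5)
    raised = False
except ValidationError:
    raised = True
```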
<p>Now that we have the model's input schema defined, we'll define the output schema:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">CreditRisk</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="sd">"""Indicates whether or not loan is risky."""</span>
<span class="n">safe</span> <span class="o">=</span> <span class="s2">"safe"</span>
<span class="n">risky</span> <span class="o">=</span> <span class="s2">"risky"</span>
<span class="k">class</span> <span class="nc">CreditRiskModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">credit_risk</span><span class="p">:</span> <span class="n">CreditRisk</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"Whether or not the loan is risky."</span><span class="p">)</span>
</code></pre></div>
<p>The model is a classification model and the output schema simply enumerates the classes that the model can predict. </p>
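<p>Because CreditRisk inherits from str as well as Enum, its members compare equal to plain strings and serialize naturally, which keeps the model's output easy to consume. A standard-library-only sketch:</p>

```python
from enum import Enum

class CreditRisk(str, Enum):
    """Indicates whether or not loan is risky."""
    safe = "safe"
    risky = "risky"

prediction = CreditRisk.safe

# str-backed enum members compare equal to their string values...
is_safe = prediction == "safe"

# ...and lookup by name mirrors the CreditRisk[y_hat] call used in predict().
recovered = CreditRisk["risky"]
```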
<p>We now have the input and output schemas defined, so we can tie it all together by creating a wrapper class for the model. The <a href="https://pypi.org/project/ml-base/">ml_base package</a> defines a simple base class for model prediction code that allows us to "wrap" the prediction code in a class that follows the MLModel interface. This interface publishes this information about the model:</p>
<ul>
<li>Qualified Name, a unique identifier for the model</li>
<li>Display Name, a friendly name for the model used in user interfaces</li>
<li>Description, a description for the model</li>
<li>Version, semantic version of the model codebase</li>
<li>Input Schema, an object that describes the model's input data</li>
<li>Output Schema, an object that describes the model's output data</li>
</ul>
<p>The MLModel interface also dictates that the model class implements two methods:</p>
<ul>
<li>__init__, the initialization method, which loads any model artifacts needed to make predictions </li>
<li>predict, the prediction method, which receives model inputs, makes a prediction, and returns model outputs </li>
</ul>
<p>By using the MLModel base class we'll be able to do more interesting things later with the model. If you'd like to learn more about the ml_base package, <a href="https://schmidtbri.github.io/ml-base/basic/">here</a> is some documentation about it.</p>
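<p>As an illustration only (not the actual class from the ml_base package), an interface of this kind can be sketched with the standard library's abc module:</p>

```python
from abc import ABC, abstractmethod

class ModelInterface(ABC):
    """Illustrative stand-in for an MLModel-style base class."""

    @property
    @abstractmethod
    def qualified_name(self) -> str:
        """A unique identifier for the model."""

    @abstractmethod
    def predict(self, data):
        """Receive model input, make a prediction, and return model output."""

class EchoModel(ModelInterface):
    """Trivial implementation used only to show the contract."""

    @property
    def qualified_name(self) -> str:
        return "echo_model"

    def predict(self, data):
        return data

model = EchoModel()
```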
<p>To install the ml_base package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">ml_base</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll define the wrapper class like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">joblib</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="kn">from</span> <span class="nn">ml_base</span> <span class="kn">import</span> <span class="n">MLModel</span>
<span class="kn">import</span> <span class="nn">zipfile</span>
<span class="vm">__file__</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">(</span><span class="s1">''</span><span class="p">))),</span> <span class="s2">"credit_risk_model"</span><span class="p">,</span> <span class="s2">"prediction"</span><span class="p">,</span> <span class="s2">"model.py"</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">CreditRiskModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="sd">"""Prediction logic for the Credit Risk Model."""</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="sd">"""Return display name of model."""</span>
<span class="k">return</span> <span class="s2">"Credit Risk Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="sd">"""Return qualified name of model."""</span>
<span class="k">return</span> <span class="s2">"credit_risk_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="sd">"""Return description of model."""</span>
<span class="k">return</span> <span class="s2">"Model to predict the credit risk of a loan."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="sd">"""Return version of model."""</span>
<span class="k">return</span> <span class="s2">"0.1.0"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">"""Return input schema of model."""</span>
<span class="k">return</span> <span class="n">CreditRiskModelInput</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">"""Return output schema of model."""</span>
<span class="k">return</span> <span class="n">CreditRiskModelOutput</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="sd">"""Class constructor that loads and deserializes the model parameters."""</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)))</span>
<span class="n">file_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"model_files"</span><span class="p">,</span> <span class="s2">"1.zip"</span><span class="p">)</span>
<span class="k">with</span> <span class="n">zipfile</span><span class="o">.</span><span class="n">ZipFile</span><span class="p">(</span><span class="n">file_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">zf</span><span class="p">:</span>
<span class="k">if</span> <span class="s2">"model.joblib"</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">zf</span><span class="o">.</span><span class="n">namelist</span><span class="p">():</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Could not find model file in zip file."</span><span class="p">)</span>
<span class="n">model_file</span> <span class="o">=</span> <span class="n">zf</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"model.joblib"</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">model_file</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="n">CreditRiskModelInput</span><span class="p">)</span> <span class="o">-></span> <span class="n">CreditRiskModelOutput</span><span class="p">:</span>
<span class="sd">"""Make a prediction with the model.</span>
<span class="sd"> Params:</span>
<span class="sd"> data: Data for making a prediction with the model.</span>
<span class="sd"> Returns:</span>
<span class="sd"> The result of the prediction.</span>
<span class="sd"> """</span>
<span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">CreditRiskModelInput</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Input must be of type 'CreditRiskModelInput'"</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span>
<span class="n">data</span><span class="o">.</span><span class="n">employment_length</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">home_ownership</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">loan_purpose</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">verification_status</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">term</span><span class="o">.</span><span class="n">value</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">annual_income</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">collections_in_last_12_months</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">delinquencies_in_last_2_years</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">debt_to_income_ratio</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">number_of_delinquent_accounts</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">interest_rate</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">last_payment_amount</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">loan_amount</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">derogatory_public_record_count</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">revolving_line_utilization_rate</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">total_payments_to_date</span><span class="p">,</span>
<span class="p">]],</span>
<span class="n">columns</span><span class="o">=</span><span class="p">[</span>
<span class="s2">"EmploymentLength"</span><span class="p">,</span>
<span class="s2">"HomeOwnership"</span><span class="p">,</span>
<span class="s2">"LoanPurpose"</span><span class="p">,</span>
<span class="s2">"VerificationStatus"</span><span class="p">,</span>
<span class="s2">"Term"</span><span class="p">,</span>
<span class="s2">"AnnualIncome"</span><span class="p">,</span>
<span class="s2">"CollectionsInLast12Months"</span><span class="p">,</span>
<span class="s2">"DelinquenciesInLast2Years"</span><span class="p">,</span>
<span class="s2">"DebtToIncomeRatio"</span><span class="p">,</span>
<span class="s2">"NumberOfDelinquentAccounts"</span><span class="p">,</span>
<span class="s2">"InterestRate"</span><span class="p">,</span>
<span class="s2">"LastPaymentAmount"</span><span class="p">,</span>
<span class="s2">"LoanAmount"</span><span class="p">,</span>
<span class="s2">"DerogatoryPublicRecordCount"</span><span class="p">,</span>
<span class="s2">"RevolvingLineUtilizationRate"</span><span class="p">,</span>
<span class="s2">"TotalPaymentsToDate"</span>
<span class="p">])</span>
<span class="n">categorical_variables</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"EmploymentLength"</span><span class="p">,</span>
<span class="s2">"HomeOwnership"</span><span class="p">,</span>
<span class="s2">"LoanPurpose"</span><span class="p">,</span>
<span class="s2">"VerificationStatus"</span><span class="p">,</span>
<span class="s2">"Term"</span><span class="p">]</span>
<span class="k">for</span> <span class="n">column_name</span> <span class="ow">in</span> <span class="n">categorical_variables</span><span class="p">:</span>
<span class="n">X</span><span class="p">[</span><span class="n">column_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="n">column_name</span><span class="p">]</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="s2">"category"</span><span class="p">)</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="k">return</span> <span class="n">CreditRiskModelOutput</span><span class="p">(</span><span class="n">credit_risk</span><span class="o">=</span><span class="n">CreditRisk</span><span class="p">[</span><span class="n">y_hat</span><span class="p">])</span>
</code></pre></div>
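<p>The loading pattern in the __init__ method (checking the archive's member list before opening the model file) can be exercised with the standard library alone, using an in-memory zip as a stand-in for the real model_files/1.zip archive:</p>

```python
import io
import zipfile

# Build a small in-memory archive standing in for model_files/1.zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("model.joblib", b"fake model bytes")

# Mirror the checks done in CreditRiskModel.__init__.
with zipfile.ZipFile(buffer) as zf:
    if "model.joblib" not in zf.namelist():
        raise ValueError("Could not find model file in zip file.")
    with zf.open("model.joblib") as model_file:
        contents = model_file.read()
```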
<p>We can make a prediction with the model by first building a CreditRiskModelInput object:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">CreditRiskModelInput</span><span class="p">(</span>
<span class="n">annual_income</span><span class="o">=</span><span class="mi">273000</span><span class="p">,</span>
<span class="n">collections_in_last_12_months</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span>
<span class="n">delinquencies_in_last_2_years</span><span class="o">=</span><span class="mi">39</span><span class="p">,</span>
<span class="n">debt_to_income_ratio</span><span class="o">=</span><span class="mf">42.64</span><span class="p">,</span>
<span class="n">employment_length</span><span class="o">=</span><span class="n">EmploymentLength</span><span class="o">.</span><span class="n">less_than_1_year</span><span class="p">,</span>
<span class="n">home_ownership</span><span class="o">=</span><span class="n">HomeOwnership</span><span class="o">.</span><span class="n">MORTGAGE</span><span class="p">,</span>
<span class="n">number_of_delinquent_accounts</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span>
<span class="n">interest_rate</span><span class="o">=</span><span class="mf">28.99</span><span class="p">,</span>
<span class="n">last_payment_amount</span><span class="o">=</span><span class="mf">36475.59</span><span class="p">,</span>
<span class="n">loan_amount</span><span class="o">=</span><span class="mi">35000</span><span class="p">,</span>
<span class="n">derogatory_public_record_count</span><span class="o">=</span><span class="mi">86</span><span class="p">,</span>
<span class="n">loan_purpose</span><span class="o">=</span><span class="n">LoanPurpose</span><span class="o">.</span><span class="n">debt_consolidation</span><span class="p">,</span>
<span class="n">revolving_line_utilization_rate</span><span class="o">=</span><span class="mf">892.3</span><span class="p">,</span>
<span class="n">term</span><span class="o">=</span><span class="n">Term</span><span class="o">.</span><span class="n">thirty_six_months</span><span class="p">,</span>
<span class="n">total_payments_to_date</span><span class="o">=</span><span class="mf">57777.58</span><span class="p">,</span>
<span class="n">verification_status</span><span class="o">=</span><span class="n">VerificationStatus</span><span class="o">.</span><span class="n">source_verified</span>
<span class="p">)</span>
</code></pre></div>
<p>Next, we'll instantiate the model class we defined above:</p>
<div class="highlight"><pre><span></span><code><span class="o">%%</span><span class="n">time</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">CreditRiskModel</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CPU times: user 372 ms, sys: 29.2 ms, total: 401 ms
Wall time: 113 ms
</code></pre></div>
<p>Notice that the model object took 113 milliseconds to instantiate. This is because the model parameters take up a lot of disk space and take a while to load from disk. This is something that we'll need to deal with later.</p>
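<p>Outside of a notebook, where the <code>%%time</code> magic is not available, the same measurement can be made with the standard library. This is a minimal sketch; the <code>factory</code> argument and the <code>timed_init</code> helper are illustrative, standing in for any model constructor such as <code>CreditRiskModel</code>:</p>

```python
import time


def timed_init(factory):
    """Call factory() and return the instance along with the elapsed wall time in seconds."""
    start = time.perf_counter()
    instance = factory()
    elapsed = time.perf_counter() - start
    return instance, elapsed


# Example with a stand-in for a slow-loading model:
_, elapsed = timed_init(lambda: time.sleep(0.1))
```

<p>Measuring instantiation time this way makes it clear why the startup and readiness checks discussed later matter: the service cannot accept traffic until this time has elapsed.</p>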
<p>We'll use the CreditRiskModelInput instance to make a prediction like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CreditRiskModelOutput(credit_risk=<CreditRisk.safe: 'safe'>)
</code></pre></div>
<p>The model predicted that the loan is "safe".</p>
<h2>Creating a RESTful Service</h2>
<p>Now that we have a model, we can deploy it in a service that allows clients to make predictions. To do this, we won't need to write any extra code; we can leverage the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> to provide the RESTful API for the service. You can learn more about the package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all we need to do is add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Credit Risk Model Service</span><span class="w"></span>
<span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s">"Service</span><span class="nv"> </span><span class="s">hosting</span><span class="nv"> </span><span class="s">the</span><span class="nv"> </span><span class="s">Credit</span><span class="nv"> </span><span class="s">Risk</span><span class="nv"> </span><span class="s">Model."</span><span class="w"></span>
<span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="s">"0.1.0"</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">credit_risk_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">credit_risk_model.prediction.model.CreditRiskModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>At the root of the YAML, the "service_title" field is the name of the service as it will appear in the documentation. The "description" and "version" fields will also be used to create the service documentation.</p>
<p>The models field is an array that contains the details of the models we would like to deploy in the service. The "qualified_name" field is the name we gave to the model. The "class_path" field points at the MLModel class that implements the model's prediction logic, in this case it is pointing to the class we built earlier in this blog post. The "create_endpoint" field tells the service to create an endpoint for the model.</p>
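<p>The "class_path" lookup works like any dotted-path import. The rest_model_service package handles this internally, but a minimal sketch of how such dynamic loading can be done with importlib looks like this (the <code>load_class</code> helper is illustrative, and a standard-library class is loaded here in place of the model class):</p>

```python
from importlib import import_module


def load_class(class_path: str):
    """Load a class from a dotted path such as
    'credit_risk_model.prediction.model.CreditRiskModel'."""
    module_path, _, class_name = class_path.rpartition(".")
    module = import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class:
decoder_class = load_class("json.JSONDecoder")
```

<p>This is why the model package must be importable from the service's working directory, which is also why PYTHONPATH is set before running the commands below.</p>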
<p>Using the configuration file, we can create an OpenAPI specification file for the model service by executing these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
generate_openapi --configuration_file<span class="o">=</span>./configuration/rest_configuration.yaml --output_file<span class="o">=</span><span class="s2">"service_contract.yaml"</span>
</code></pre></div>
<p>The generated service_contract.yaml file contains the OpenAPI specification for the model service, including a description of the model's endpoint. The model's input and output schemas are automatically extracted and added to the specification. The file can be found at the root of the GitHub repository, named service_contract.yaml.</p>
<p>To run the service locally, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/rest_configuration.yaml
uvicorn rest_model_service.main:app
</code></pre></div>
<p>The service process starts up and can be accessed in a web browser at http://127.0.0.1:8000. The service renders the OpenAPI specification as a webpage that looks like this:</p>
<p><img alt="Service Documentation" src="https://www.tekhnoal.com/service_documentation_hcfmlm.png" width="100%"></p>
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package we're able to quickly stand up a service to host the model.</p>
<p>We can make a prediction using the model running in the service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/credit_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">annual_income</span><span class="se">\"</span><span class="s2">: 273000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">collections_in_last_12_months</span><span class="se">\"</span><span class="s2">: 20, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">delinquencies_in_last_2_years</span><span class="se">\"</span><span class="s2">: 39, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">debt_to_income_ratio</span><span class="se">\"</span><span class="s2">: 42.64, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">employment_length</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">< 1 year</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">home_ownership</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">MORTGAGE</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">number_of_delinquent_accounts</span><span class="se">\"</span><span class="s2">: 6, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">interest_rate</span><span class="se">\"</span><span class="s2">: 28.99, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">last_payment_amount</span><span class="se">\"</span><span class="s2">: 36475.59, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_amount</span><span class="se">\"</span><span class="s2">: 35000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">derogatory_public_record_count</span><span class="se">\"</span><span class="s2">: 86, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_purpose</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">debt_consolidation</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">revolving_line_utilization_rate</span><span class="se">\"</span><span class="s2">: 892.3, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">term</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2"> 36 months</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">total_payments_to_date</span><span class="se">\"</span><span class="s2">: 57777.58, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">verification_status</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">Source Verified</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2">}"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"credit_risk":"safe"}
</code></pre></div>
<p>The model returned a prediction of "safe" for the loan.</p>
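<p>The same request can be made from Python instead of curl. This is a sketch using only the standard library; the <code>predict</code> helper is illustrative, the payload is abbreviated (any of the fields from the curl example above can be included), and the service is assumed to be running on localhost:8000:</p>

```python
import json
from urllib.request import Request, urlopen


def predict(base_url: str, model_input: dict) -> dict:
    """POST a JSON payload to the model's prediction endpoint and parse the JSON response."""
    request = Request(
        url=base_url + "/api/models/credit_risk_model/prediction",
        data=json.dumps(model_input).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

<p>A client like this is useful for smoke-testing the deployment from a script rather than from the command line.</p>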
<h2>Understanding Health Checks</h2>
<p>The service exposes health information about itself through health endpoints. The health endpoints are:</p>
<ul>
<li>/api/health: indicates whether the service process is running. This endpoint will return a 200 status once the service has started.</li>
<li>/api/health/ready: indicates whether the service is ready to respond to requests. This endpoint will return a 200 status only if all the models and decorators have finished being instantiated without errors.</li>
<li>/api/health/startup: indicates whether the service is started. This endpoint will return a 200 status only if all the models and decorators have finished being instantiated without errors.</li>
</ul>
<p>These endpoints are important for our use case because our model takes a while to load and become ready to serve predictions over the API. The service will not be ready to serve traffic for a while, so the readiness and startup checks will fail until the models are ready.</p>
<p>The service is running so we'll try out each endpoint with a request:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'GET'</span> \
<span class="s1">'http://127.0.0.1:8000/api/health'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"health_status":"HEALTHY"}
</code></pre></div>
<p>The health endpoint returned a status of "HEALTHY". </p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'GET'</span> \
<span class="s1">'http://127.0.0.1:8000/api/health/ready'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"readiness_status":"ACCEPTING_TRAFFIC"}
</code></pre></div>
<p>The readiness status endpoint returned a status of "ACCEPTING_TRAFFIC".</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'GET'</span> \
<span class="s1">'http://127.0.0.1:8000/api/health/startup'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"startup_status":"STARTED"}
</code></pre></div>
<p>The startup status endpoint returned a status of "STARTED".</p>
<p>During normal operation, the health endpoints are not very interesting. However, in special situations they are very useful. For example, if a model takes a long time to start up, the startup check endpoint will not return a 200 status response until each model is initialized and ready to make predictions. The readiness endpoint will also not return a 200 status until the model is ready. We'll use the health check endpoints to integrate with Kubernetes.</p>
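<p>For example, a deployment script could block until the service reports readiness before sending it any traffic. This is a minimal sketch of such a poller; the <code>wait_until_ready</code> helper is illustrative, and the <code>_get</code> parameter exists only to make the function testable (by default it performs a real HTTP GET):</p>

```python
import time
from urllib.error import URLError
from urllib.request import urlopen


def wait_until_ready(url, timeout=60.0, interval=1.0, _get=None):
    """Poll a health endpoint until it returns HTTP 200 or the timeout expires."""
    get = _get or (lambda u: urlopen(u).status)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if get(url) == 200:
                return True
        except URLError:
            pass  # service not listening yet; keep polling
        time.sleep(interval)
    return False
```

<p>Kubernetes readiness and startup probes implement the same idea natively, which is what makes these endpoints so useful when deploying to a cluster.</p>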
<h2>Creating a Docker Image</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally using Docker.</p>
<p>Let's create a Docker image and run it locally. The image is built using the instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="c"># syntax=docker/dockerfile:1</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">ARG</span><span class="w"> </span>BUILD_DATE
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Health Checks for ML Models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Health checks for ML models."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$BUILD_DATE</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/health-checks-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="s2">"0.1.0"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/service</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USERNAME</span><span class="o">=</span>service-user
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_UID</span><span class="o">=</span><span class="m">10000</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_GID</span><span class="o">=</span><span class="m">10000</span>
<span class="c"># install packages</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update <span class="se">\</span>
<span class="o">&&</span> apt-get install --assume-yes --no-install-recommends sudo <span class="se">\</span>
<span class="o">&&</span> apt-get install --assume-yes --no-install-recommends git <span class="se">\</span>
<span class="o">&&</span> apt-get install -y --no-install-recommends apt-utils <span class="se">\</span>
<span class="o">&&</span> apt-get install -y --no-install-recommends libgomp1 <span class="se">\</span>
<span class="o">&&</span> apt-get clean <span class="se">\</span>
<span class="o">&&</span> rm -rf /var/lib/apt/lists/*
<span class="c"># create a user</span>
<span class="k">RUN</span><span class="w"> </span>groupadd --gid <span class="nv">$USER_GID</span> <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> useradd --uid <span class="nv">$USER_UID</span> --gid <span class="nv">$USER_GID</span> -m <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> <span class="nb">echo</span> <span class="nv">$USERNAME</span> <span class="nv">ALL</span><span class="o">=</span><span class="se">\(</span>root<span class="se">\)</span> NOPASSWD:ALL > /etc/sudoers.d/<span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> chmod <span class="m">0440</span> /etc/sudoers.d/<span class="nv">$USERNAME</span>
<span class="c"># installing dependencies</span>
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r service_requirements.txt
<span class="c"># copying code and license</span>
<span class="k">COPY</span><span class="w"> </span>./credit_risk_model ./credit_risk_model
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">USER</span><span class="w"> </span><span class="s">$USERNAME</span>
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
</code></pre></div>
<p>The Dockerfile is used by this docker command to create a docker image:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="o">../</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>credit_risk_model_service 0.1.0 fc3c3c747e2b 4 seconds ago 614MB
</code></pre></div>
<p>The credit_risk_model_service image is listed. Next, we'll run a container from the image to see if the service is working correctly.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">rest_configuration</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">-</span><span class="n">v</span> <span class="err">$</span><span class="p">(</span><span class="n">pwd</span><span class="p">)</span><span class="o">/../</span><span class="n">configuration</span><span class="p">:</span><span class="o">/</span><span class="n">service</span><span class="o">/</span><span class="n">configuration</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">credit_risk_model_service</span> \
<span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">706</span><span class="n">ff17d8db159568989d8a74221b8bc3bbcb52074ca61da1ee4015297035dc6</span><span class="w"></span>
</code></pre></div>
<p>To make sure the server process started up correctly, we'll look at the logs:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">logs</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Started</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">process</span><span class="w"> </span><span class="o">[</span><span class="mi">1</span><span class="o">]</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">application</span><span class="w"> </span><span class="n">startup</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">startup</span><span class="w"> </span><span class="n">complete</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Uvicorn</span><span class="w"> </span><span class="n">running</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">http</span><span class="o">://</span><span class="mf">0.0</span><span class="o">.</span><span class="mf">0.0</span><span class="o">:</span><span class="mi">8000</span><span class="w"> </span><span class="o">(</span><span class="n">Press</span><span class="w"> </span><span class="n">CTRL</span><span class="o">+</span><span class="n">C</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">quit</span><span class="o">)</span><span class="w"></span>
</code></pre></div>
<p>The logs look good and the service is up and running.</p>
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction using the curl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/credit_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">annual_income</span><span class="se">\"</span><span class="s2">: 273000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">collections_in_last_12_months</span><span class="se">\"</span><span class="s2">: 20, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">delinquencies_in_last_2_years</span><span class="se">\"</span><span class="s2">: 39, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">debt_to_income_ratio</span><span class="se">\"</span><span class="s2">: 42.64, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">employment_length</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">< 1 year</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">home_ownership</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">MORTGAGE</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">number_of_delinquent_accounts</span><span class="se">\"</span><span class="s2">: 6, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">interest_rate</span><span class="se">\"</span><span class="s2">: 28.99, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">last_payment_amount</span><span class="se">\"</span><span class="s2">: 36475.59, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_amount</span><span class="se">\"</span><span class="s2">: 35000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">derogatory_public_record_count</span><span class="se">\"</span><span class="s2">: 86, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_purpose</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">debt_consolidation</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">revolving_line_utilization_rate</span><span class="se">\"</span><span class="s2">: 892.3, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">term</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2"> 36 months</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">total_payments_to_date</span><span class="se">\"</span><span class="s2">: 57777.58, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">verification_status</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">Source Verified</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2">}"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"credit_risk":"safe"}
</code></pre></div>
<p>The model predicted that the loan is safe.</p>
<p>We're done testing, so we'll shut down and remove the model service container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">credit_risk_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>credit_risk_model_service
credit_risk_model_service
</code></pre></div>
<h2>Creating a Kubernetes Cluster</h2>
<p>To show the system in action, we’ll deploy the service to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>😄 <span class="nv">minikube</span> <span class="nv">v1</span>.<span class="mi">28</span>.<span class="mi">0</span> <span class="nv">on</span> <span class="nv">Darwin</span> <span class="mi">13</span>.<span class="mi">0</span>.<span class="mi">1</span>
✨ <span class="nv">Using</span> <span class="nv">the</span> <span class="nv">docker</span> <span class="nv">driver</span> <span class="nv">based</span> <span class="nv">on</span> <span class="nv">existing</span> <span class="nv">profile</span>
👍 <span class="nv">Starting</span> <span class="nv">control</span> <span class="nv">plane</span> <span class="nv">node</span> <span class="nv">minikube</span> <span class="nv">in</span> <span class="nv">cluster</span> <span class="nv">minikube</span>
🚜 <span class="nv">Pulling</span> <span class="nv">base</span> <span class="nv">image</span> ...
🔄 <span class="nv">Restarting</span> <span class="nv">existing</span> <span class="nv">docker</span> <span class="nv">container</span> <span class="k">for</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> ...
🐳 <span class="nv">Preparing</span> <span class="nv">Kubernetes</span> <span class="nv">v1</span>.<span class="mi">25</span>.<span class="mi">3</span> <span class="nv">on</span> <span class="nv">Docker</span> <span class="mi">20</span>.<span class="mi">10</span>.<span class="mi">20</span> ...
🔎 <span class="nv">Verifying</span> <span class="nv">Kubernetes</span> <span class="nv">components</span>...
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">docker</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">dashboard</span>:<span class="nv">v2</span>.<span class="mi">7</span>.<span class="mi">0</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">k8s</span><span class="o">-</span><span class="nv">minikube</span><span class="o">/</span><span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>:<span class="nv">v5</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">docker</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">scraper</span>:<span class="nv">v1</span>.<span class="mi">0</span>.<span class="mi">8</span>
💡 <span class="nv">Some</span> <span class="nv">dashboard</span> <span class="nv">features</span> <span class="nv">require</span> <span class="nv">the</span> <span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span> <span class="nv">addon</span>. <span class="nv">To</span> <span class="nv">enable</span> <span class="nv">all</span> <span class="nv">features</span> <span class="nv">please</span> <span class="nv">run</span>:
<span class="nv">minikube</span> <span class="nv">addons</span> <span class="nv">enable</span> <span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span>
🌟 <span class="nv">Enabled</span> <span class="nv">addons</span>: <span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>, <span class="nv">default</span><span class="o">-</span><span class="nv">storageclass</span>, <span class="nv">dashboard</span>
🏄 <span class="nv">Done</span><span class="o">!</span> <span class="nv">kubectl</span> <span class="nv">is</span> <span class="nv">now</span> <span class="nv">configured</span> <span class="nv">to</span> <span class="nv">use</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> <span class="nv">cluster</span> <span class="nv">and</span> <span class="s2">"</span><span class="s">default</span><span class="s2">"</span> <span class="nv">namespace</span> <span class="nv">by</span> <span class="nv">default</span>
</code></pre></div>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE              NAME                                        READY   STATUS    RESTARTS         AGE
kube-system            coredns-565d847f94-2v6l9                    0/1     Running   10 (3d20h ago)   11d
kube-system            etcd-minikube                               0/1     Running   10 (3d20h ago)   11d
kube-system            kube-apiserver-minikube                     0/1     Running   10 (3d20h ago)   11d
kube-system            kube-controller-manager-minikube            0/1     Running   10 (25s ago)     11d
kube-system            kube-proxy-ztbgd                            1/1     Running   10 (25s ago)     11d
kube-system            storage-provisioner                         1/1     Running   18 (25s ago)     11d
kubernetes-dashboard   dashboard-metrics-scraper-b74747df5-x559p   1/1     Running   9 (25s ago)      11d
kubernetes-dashboard   kubernetes-dashboard-57bbdc5f89-9jvln       1/1     Running   14 (3d20h ago)   11d
</code></pre></div>
<p>The pods running the Kubernetes dashboard and other cluster services appear in the kube-system and kubernetes-dashboard namespaces.</p>
<h2>Creating a Kubernetes Namespace</h2>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
resourcequota/model-services-resource-quota created
</code></pre></div>
<p>The namespace was created, along with a ResourceQuota that limits the amount of resources that objects within the namespace can consume.</p>
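<p>A namespace manifest paired with a ResourceQuota generally looks something like this. This is an illustrative sketch, not the repository's actual kubernetes/namespace.yaml; the resource names match the kubectl output above, but the quota limits are assumptions:</p>

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: model-services
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: model-services-resource-quota
  namespace: model-services
spec:
  hard:
    requests.cpu: "2"      # total CPU all pods in the namespace may request
    requests.memory: 4Gi   # total memory all pods in the namespace may request
    limits.cpu: "4"
    limits.memory: 8Gi
```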
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                   STATUS   AGE
default                Active   11d
kube-node-lease        Active   11d
kube-public            Active   11d
kube-system            Active   11d
kubernetes-dashboard   Active   11d
model-services         Active   0s
</code></pre></div>
<p>The new namespace appears in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Context "minikube" modified.
</code></pre></div>
<h2>Creating a Kubernetes Deployment and Service</h2>
<p>The model service is deployed by using Kubernetes resources. These are:</p>
<ul>
<li>Model Service ConfigMap: a set of configuration options; in this case, a simple YAML file that is loaded into the running container as a volume mount. This resource allows us to change the configuration of the model service without having to rebuild the Docker image. The mounted configuration file overrides the configuration files that were included with the Docker image.</li>
<li>Deployment: a declarative way to manage a set of pods; the model service pods are managed through the Deployment. This Deployment includes the model service as well as the OPA service running as a sidecar container.</li>
<li>Service: a way to expose the set of pods in a Deployment; the model service is made available to the outside world through the Service.</li>
</ul>
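<p>To make the ConfigMap idea concrete, a minimal manifest might look like this. This is illustrative only; the ConfigMap name matches the one created later in this post, but the data key and its contents are assumptions, not the repository's actual file:</p>

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-service-configuration
  namespace: model-services
data:
  # Each key under "data" becomes a file when the ConfigMap is
  # mounted into the container as a volume.
  service_config.yaml: |
    service_title: Credit Risk Model Service
    logging:
      level: INFO
```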
<p>The Deployment resource will be created with some special options that can leverage the health endpoints of the model service. These options look like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">livenessProbe</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">httpGet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HTTP</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/api/health</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8000</span><span class="w"></span>
<span class="w"> </span><span class="nt">initialDelaySeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
<span class="w"> </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">timeoutSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">2</span><span class="w"></span>
<span class="w"> </span><span class="nt">failureThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">successThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="nt">readinessProbe</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">httpGet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HTTP</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/api/health/ready</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8000</span><span class="w"></span>
<span class="w"> </span><span class="nt">initialDelaySeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
<span class="w"> </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">timeoutSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">2</span><span class="w"></span>
<span class="w"> </span><span class="nt">failureThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">successThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="nt">startupProbe</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">httpGet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HTTP</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/api/health/startup</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8000</span><span class="w"></span>
<span class="w"> </span><span class="nt">initialDelaySeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
<span class="w"> </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">timeoutSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">2</span><span class="w"></span>
<span class="w"> </span><span class="nt">failureThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">successThreshold</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
</code></pre></div>
<p>This is not the complete YAML file; the full Deployment is defined in the ./kubernetes/model_service.yaml file.</p>
<p>The model service container has options defined for each type of health check, and each check is configured the same way. The options are:</p>
<ul>
<li>initialDelaySeconds: how long Kubernetes waits after the container starts before making the first health check call.</li>
<li>periodSeconds: how often the health check endpoint is called.</li>
<li>timeoutSeconds: how long Kubernetes waits for a response from the service before counting the check as failed.</li>
<li>failureThreshold: how many consecutive failures are required before Kubernetes acts on them; for the liveness and startup probes this means restarting the container, while for the readiness probe the pod is removed from service.</li>
<li>successThreshold: how many consecutive successes are required before Kubernetes considers the check to be passing.</li>
</ul>
<p>We decided to have Kubernetes check 5 times before labelling the container as unhealthy, with a period of 5 seconds. This means that the service has 25 seconds for the model to finish loading. This is a value we know would work with this model, based on our timing measurements above.</p>
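<p>As a quick sanity check, the startup window implied by the probe settings can be computed directly. A small sketch, using the values from the startupProbe YAML above:</p>

```python
def max_startup_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Worst-case time the startup probe allows before the container is
    marked unhealthy: the initial delay plus one probe period per
    allowed failure."""
    return initial_delay + period * failure_threshold


# Values from the startupProbe configuration above.
print(max_startup_seconds(initial_delay=0, period=5, failure_threshold=5))  # 25
```

If the model ever takes longer than this to load, raising failureThreshold or periodSeconds extends the window without changing the service code.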
<p>We're almost ready to start the model service, but first we need to load the Docker image from the local Docker daemon into the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">credit_risk_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<p>We can view the images in the minikube cache with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">credit_risk_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>docker.io/library/credit_risk_model_service:0.1.0
</code></pre></div>
<p>The model service resources are created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/model-service-configuration created
deployment.apps/credit-risk-model-deployment created
service/credit-risk-model-service created
</code></pre></div>
<p>Let's view the Deployment to see if it is available yet:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">deployments</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
credit-risk-model-deployment   0/2     2            0           2s
</code></pre></div>
<p>Looks like the replicas are not ready yet. Let's wait a bit and try again:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">deployments</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
credit-risk-model-deployment   2/2     2            2           23s
</code></pre></div>
<p>Once the model service finished loading the model, the startup and readiness probes began to pass, which made the replicas available to serve traffic.</p>
<p>To get an idea of how the service went through the startup process, let's look at the service logs. First, we'll get the names of the pods that are running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                                            READY   STATUS    RESTARTS   AGE
credit-risk-model-deployment-55654498f4-2bw9k   1/1     Running   0          29s
credit-risk-model-deployment-55654498f4-rxznw   1/1     Running   0          29s
</code></pre></div>
<p>Using one of the pod names, we'll get the logs from Kubernetes:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="mi">55654498</span><span class="n">f4</span><span class="o">-</span><span class="mi">2</span><span class="n">bw9k</span> <span class="o">-</span><span class="n">c</span> <span class="n">credit</span><span class="o">-</span><span class="n">risk</span><span class="o">-</span><span class="n">model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Started</span><span class="w"> </span><span class="n">server</span><span class="w"> </span><span class="n">process</span><span class="w"> </span><span class="o">[</span><span class="mi">1</span><span class="o">]</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">application</span><span class="w"> </span><span class="n">startup</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">startup</span><span class="w"> </span><span class="n">complete</span><span class="o">.</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="n">Uvicorn</span><span class="w"> </span><span class="n">running</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">http</span><span class="o">://</span><span class="mf">0.0</span><span class="o">.</span><span class="mf">0.0</span><span class="o">:</span><span class="mi">8000</span><span class="w"> </span><span class="o">(</span><span class="n">Press</span><span class="w"> </span><span class="n">CTRL</span><span class="o">+</span><span class="n">C</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">quit</span><span class="o">)</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">57232</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">503</span><span class="w"> </span><span class="n">Service</span><span class="w"> </span><span class="n">Unavailable</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">49828</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">503</span><span class="w"> </span><span class="n">Service</span><span class="w"> </span><span class="n">Unavailable</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">49844</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/startup HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">49858</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">40210</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">40212</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">40224</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">40236</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">58908</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">58910</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">58926</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health/ready HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
<span class="n">INFO</span><span class="o">:</span><span class="w"> </span><span class="mf">172.17</span><span class="o">.</span><span class="mf">0.1</span><span class="o">:</span><span class="mi">58942</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"GET /api/health HTTP/1.1"</span><span class="w"> </span><span class="mi">200</span><span class="w"> </span><span class="n">OK</span><span class="w"></span>
</code></pre></div>
<p>Looks like the process started up correctly: the /api/health/startup endpoint was called three times, succeeding on the third request. Right after the startup check succeeded, the /api/health/ready endpoint was called and immediately succeeded, followed by the /api/health endpoint, which also succeeded. This startup process ensured that clients could not reach the model service before it finished loading the model into memory.</p>
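<p>The behavior visible in the logs (503 until the model finishes loading, then 200) amounts to a simple readiness gate. A minimal sketch of the idea; the actual service is a FastAPI application, and the class and method names here are hypothetical:</p>

```python
class HealthState:
    """Minimal readiness gate mirroring the probe behavior in the logs:
    the health endpoints return 503 until the model is loaded."""

    def __init__(self) -> None:
        self.model_loaded = False

    def startup(self) -> int:
        # Startup probe: 200 only once the model is in memory.
        return 200 if self.model_loaded else 503

    def ready(self) -> int:
        # Readiness probe: the same condition in this simple sketch.
        return 200 if self.model_loaded else 503


state = HealthState()
print(state.startup())   # 503 while the model is still loading
state.model_loaded = True
print(state.startup())   # 200 once loading has finished
```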
<p>To access the model service, we created a Kubernetes service. The service details look like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                        TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
credit-risk-model-service   NodePort   10.110.134.65   &lt;none&gt;        80:31004/TCP   68s
</code></pre></div>
<p>Minikube can expose the service on a local port, but we need to run a proxy process. The proxy is started like this:</p>
<div class="highlight"><pre><span></span><code>minikube service credit-risk-model-service --url -n model-services
</code></pre></div>
<p>The command outputs this URL:</p>
<p>http://127.0.0.1:59091</p>
<p>The command must keep running to keep the tunnel open to the running model service in the minikube cluster.</p>
<p>We can send a request to the model service through the local endpoint like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:59091/api/models/credit_risk_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">annual_income</span><span class="se">\"</span><span class="s2">: 273000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">collections_in_last_12_months</span><span class="se">\"</span><span class="s2">: 20, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">delinquencies_in_last_2_years</span><span class="se">\"</span><span class="s2">: 39, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">debt_to_income_ratio</span><span class="se">\"</span><span class="s2">: 42.64, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">employment_length</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">< 1 year</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">home_ownership</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">MORTGAGE</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">number_of_delinquent_accounts</span><span class="se">\"</span><span class="s2">: 6, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">interest_rate</span><span class="se">\"</span><span class="s2">: 28.99, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">last_payment_amount</span><span class="se">\"</span><span class="s2">: 36475.59, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_amount</span><span class="se">\"</span><span class="s2">: 35000, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">derogatory_public_record_count</span><span class="se">\"</span><span class="s2">: 86, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">loan_purpose</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">debt_consolidation</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">revolving_line_utilization_rate</span><span class="se">\"</span><span class="s2">: 892.3, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">term</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2"> 36 months</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">total_payments_to_date</span><span class="se">\"</span><span class="s2">: 57777.58, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">verification_status</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">Source Verified</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2">}"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"credit_risk":"safe"}
</code></pre></div>
<p>The health check endpoints of the model service are also available to clients:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'GET'</span> \
<span class="s1">'http://127.0.0.1:59091/api/health'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"health_status":"HEALTHY"}
</code></pre></div>
<h2>Deleting the Resources</h2>
<p>We're done working with the Kubernetes resources, so we will delete them and shut down the cluster.</p>
<p>To delete the model service pods, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "model-service-configuration" deleted
deployment.apps "credit-risk-model-deployment" deleted
service "credit-risk-model-service" deleted
</code></pre></div>
<p>To delete the model-services namespace, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
resourcequota "model-services-resource-quota" deleted
</code></pre></div>
<p>To shut down the Kubernetes cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 Powering off "minikube" via SSH ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post we showed how to deal with a common issue that arises when a large model is deployed. When the model parameters take a long time to load, the model service needs to make sure that no clients are depending on it to provide predictions until it is finished starting up. We accomplished this on the Kubernetes platform by adding health check endpoints to the model service and configuring Kubernetes to check on the endpoints. By doing this we are able to guarantee that the model service will only become available to clients once it has finished starting up. </p>
<p>In order to build the health checks, the model did not need to change at all. We were able to build the logic into the model service package, which means that the model prediction logic did not have to change to deal with this requirement. We were able to isolate a deployment concern from the model that we were trying to deploy. This also means that we can reuse the model to make predictions in other contexts without carrying this extra logic along with it.</p>Policies for ML Models2022-09-21T22:00:00-05:002022-09-21T22:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2022-09-21:/policies-for-ml-models.html<p>Machine learning models are being used to make ever more important decisions in the modern world. Because of the power of data modeling, ML models are able to learn the nuances of a domain and make accurate predictions even in situations where a human expert would not be able to. However, ML models are not omniscient and they should not run without oversight from their operators. To handle situations in which we don't want to have an ML model make predictions, we can create a policy that steps in before the prediction is returned to the user. A policy that is applied to an ML model is simply a rule that ensures that the model will never make predictions that are unsafe to use. For example, we can create a policy that makes sure that a machine learning model that makes predictions about optimal airline ticket prices never makes predictions that cost the airline money. A good policy for an ML model is one that allows the model some leeway while also ensuring that the model’s predictions are safe to use. In this blog post, we'll write policies for ML models and deploy the policies alongside the model using the decorator pattern.</p><h1>Policies for ML Model Deployments</h1>
<p>In a <a href="https://www.tekhnoal.com/ml-model-decorators.html">previous blog post</a> we introduced the decorator pattern for ML model deployments and then showed how to use the pattern to build extensions for a deployed model. For example, in <a href="https://www.tekhnoal.com/data-enrichment-for-ml-models.html">this blog post</a> we added data enrichment to a deployed model. In <a href="https://www.tekhnoal.com/caching-for-ml-models.html">this blog post</a> we added prediction caching to a deployed model. These extensions were added without having to modify the machine learning model prediction code at all, we were able to do it using the <a href="https://en.wikipedia.org/wiki/Decorator_pattern">decorator pattern</a>. In this blog post we’ll add policies to a deployed model in the same way.</p>
<p>This blog post was written in a Jupyter notebook; some of the code and commands in it reflect this.</p>
<h2>Introduction</h2>
<p>Machine learning models are being used to make ever more important decisions in the modern world. Because of the power of data modeling, ML models are able to learn the nuances of a domain and make accurate predictions even in situations where a human expert would not be able to. However, ML models are not omniscient and they should not run without oversight from their operators. To handle situations in which we don't want to have an ML model make predictions, we can create a policy that steps in before the prediction is returned to the user. A policy that is applied to an ML model is simply a rule that ensures that the model will never make predictions that are unsafe to use. For example, we can create a policy that makes sure that a machine learning model that makes predictions about optimal airline ticket prices never makes predictions that cost the airline money. A good policy for an ML model is one that allows the model some leeway while also ensuring that the model’s predictions are safe to use. In this blog post, we'll write policies for ML models and deploy the policies alongside the model using the decorator pattern.</p>
<p>A policy is a system of guidelines that are used to make decisions. A software-defined policy is simply a policy that is written as code and can be executed. Most of the time, the policies followed by a software system are hard-coded into the system using whichever programming language the system is written in. This is often good enough, but sometimes the policies are complex enough or change often enough to warrant writing them in a specialized language that is specifically designed for policies. By writing policies separately from the system that they will work in, we can decouple them from the system and make the system simpler to maintain. Policies can also be written by domain experts and integrated into the software system more easily this way.</p>
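<p>To make the contrast concrete, here is what a hard-coded policy might look like for the airline ticket pricing example above (a hypothetical Python sketch, not code from any real system):</p>

```python
def apply_pricing_policy(predicted_price: float, ticket_cost: float) -> float:
    """Hard-coded policy: never return a ticket price below what the ticket
    costs the airline, regardless of what the model predicted."""
    if predicted_price < ticket_cost:
        return ticket_cost
    return predicted_price


# The model predicted $85.00, but the ticket costs the airline $120.00,
# so the policy overrides the prediction.
print(apply_pricing_policy(85.0, 120.0))  # 120.0
```

<p>Because the rule lives inside the application code, changing it means changing and redeploying the system, which is exactly the coupling that a dedicated policy language avoids.</p>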
<p>In this blog post we'll write policies for a deployed machine learning model, and we'll use the <a href="https://www.openpolicyagent.org/docs/latest/policy-language/">Rego policy language</a>. Policy decisions are made by querying policies written in Rego that are executed by the <a href="https://www.openpolicyagent.org/">Open Policy Agent</a>, which is a service that can be integrated into software systems. Other services can offload policy management and execution to the OPA service, accessing it through a RESTful API. The OPA service is specifically built for low-latency evaluation of policies. Rego and OPA are already used to review <a href="https://www.openpolicyagent.org/docs/latest/kubernetes-introduction/">Kubernetes manifests</a> for best practices, to review infrastructure deployments by <a href="https://www.openpolicyagent.org/docs/latest/terraform/">checking Terraform plans</a>, and to check for authorization within the <a href="https://www.openpolicyagent.org/docs/latest/envoy-introduction/">Envoy service mesh</a>. </p>
<p>In this blog post we’ll also build a decorator that applies policies to the input and output of a model by using the OPA service. By using the decorator pattern that we’ve shown in previous blog posts, we’ll be able to show how to integrate policies separately from the model itself. We'll show how to deploy the ML model inside of a RESTful service along with the decorator, all by modifying a simple configuration file.</p>
<h2>Software Architecture</h2>
<p>The system we'll build will ultimately look like this:</p>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/software_architecture_pfmlm.png" width="100%"></p>
<h2>Installing a Model</h2>
<p>To make this blog post a little shorter we won't train a completely new model. Instead we'll install a model that <a href="https://www.tekhnoal.com/regression-model.html">we've built in a previous blog post</a>. The code for the model is in <a href="https://github.com/schmidtbri/regression-model">this github repository</a>.</p>
<p>The model is called the "Insurance Charges Model" and predicts medical insurance charges based on features of a customer. To install the model, we can use the pip command and point it at the GitHub repo of the model.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">regression</span><span class="o">-</span><span class="n">model</span><span class="c1">#egg=insurance_charges_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction with the model, we'll import the model's class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.model</span> <span class="kn">import</span> <span class="n">InsuranceChargesModel</span>
</code></pre></div>
<p>Now we can instantiate the model using the class.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The model object contains everything needed to make a prediction. When the object was instantiated, it loaded the necessary model parameters.</p>
<p>The model object publishes some metadata about the model as attributes:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">display_name</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">description</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model
Insurance Charges Model
0.1.0
Model to predict the insurance charges of a customer.
</code></pre></div>
<p>To make a prediction, we need to use the model's input schema class. The input schema class is a <a href="https://pydantic-docs.helpmanual.io/">Pydantic</a> class that defines a data structure that can be used by the model's predict() method to make a prediction. </p>
<p>The input schema can be accessed directly from the model object like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model.prediction.schemas.InsuranceChargesModelInput
</code></pre></div>
<p>We can view the input schema of the model as a JSON schema document by calling the .schema() method on the Pydantic class.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'InsuranceChargesModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'age'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age of primary beneficiary in years.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sex'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Gender of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/SexEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'bmi'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body Mass Index'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body mass index of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">15.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">50.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'children'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Children'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Number of children covered by health insurance.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether beneficiary is a smoker.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'boolean'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region where beneficiary lives.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/RegionEnum'</span><span class="p">}]}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'SexEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'SexEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'sex' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'male'</span><span class="p">,</span><span class="w"> </span><span class="s1">'female'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'region' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'southwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'southeast'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northeast'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>The model's input schema is called InsuranceChargesModelInput. The model expects five fields to be provided in order to make a prediction. All of the fields have type information as well as the allowed values. For example, the input schema states that the minimum allowed value for "bmi" is 15 and the maximum allowed value is 50.</p>
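<p>These constraints are enforced by Pydantic when the input object is instantiated; a value outside of the allowed range raises a validation error. The sketch below illustrates this with a stand-in class carrying the same "bmi" constraint, rather than the model's actual schema class:</p>

```python
from pydantic import BaseModel, Field, ValidationError


class BmiInputSketch(BaseModel):
    """Stand-in for the constrained 'bmi' field of the model's input schema."""
    bmi: float = Field(ge=15.0, le=50.0)


# A value inside the allowed range is accepted.
BmiInputSketch(bmi=24.0)

# A value outside the allowed range of 15.0 to 50.0 is rejected.
try:
    BmiInputSketch(bmi=100.0)
except ValidationError:
    print("bmi failed validation")
```

<p>This means that invalid inputs are rejected before the model's prediction code ever runs.</p>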
<p>To make a prediction, all we need to do is instantiate the input schema class and give it to the model object's predict() method:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">SexEnum</span><span class="p">,</span> <span class="n">RegionEnum</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The prediction is another Pydantic class, this one is of type InsuranceChargesModelOutput. The output contains a single field called "charges", which is the predicted amount of charges in dollars. The model predicts that the charges will be $8640.78. Notice that we needed to import two Enum classes in order to fill in the categorical fields with allowed values.</p>
<p>The policies that we'll write need to interact with the model through these schemas, so it's important to review them.</p>
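<p>Since Rego policies operate on JSON documents, the Pydantic input objects will need to be serialized into plain dictionaries before they can be evaluated. The sketch below shows the document shape that the policy will receive; the field values are made up, and the "model_input" wrapper key matches the field references used in the policy in the next section:</p>

```python
# Sketch of the "input" document sent with a policy query. The policy will
# reference these fields as input.model_input.smoker and input.model_input.age.
policy_query = {
    "input": {
        "model_input": {
            "age": 62,
            "sex": "male",
            "bmi": 24.0,
            "children": 0,
            "smoker": True,
            "region": "northwest",
        }
    }
}

print(policy_query["input"]["model_input"]["smoker"])  # True
```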
<h2>Creating a Policy</h2>
<p><a href="https://www.openpolicyagent.org/docs/latest/policy-language/">Rego policies</a> are assertions on data, in this blog post that data is the ML model's input and output data structures. Using the Insurance Charges Model we installed above, we'll create a policy for this situation:</p>
<p>"Smokers over the age of 60 should not have a prediction made."</p>
<p>This policy is completely made up; it's an example of a situation in which we would not want to return a prediction from the model for reasons other than the model's capabilities. The prediction that the model makes would still be valid, because the model is capable of predicting the insurance charges for a 62-year-old smoker, but business requirements may prevent the prediction from being used. This is a good place to add a policy that will enforce this business requirement. The policy looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nv">package</span> <span class="nv">insurance_charges_model</span>
<span class="nv">customer_is_a_smoker_over_60</span> <span class="k">if</span> {
<span class="nv">input</span>.<span class="nv">model_input</span>.<span class="nv">smoker</span>
<span class="nv">input</span>.<span class="nv">model_input</span>.<span class="nv">age</span> <span class="o">></span> <span class="mi">60</span>
}
</code></pre></div>
<p>The policy is defined in the "insurance_charges_model" package. The policy uses the model input fields "smoker", which is a boolean field, and "age", which is an integer. The value "customer_is_a_smoker_over_60" is set to "true" if the conditions in the body of the rule are true. This policy is very simple and it does not actually make a decision about what to do with the model's prediction; all it does is detect whether the customer is a smoker over the age of 60. To create a decision we'll add another rule:</p>
<div class="highlight"><pre><span></span><code><span class="nv">allow</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="no">true</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nv">customer_is_a_smoker_over_60</span><span class="w"></span>
<span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="no">false</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nv">customer_is_a_smoker_over_60</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>We've added a rule called "allow" to the policy. Very simply, the value for "allow" is set to true if the customer is not a smoker over the age of 60; otherwise it is set to false. We'll use this rule to actually make a decision as to what to do with the prediction. It would also be nice to have a description as to why the decision was made, so we'll add one last rule to the insurance_charges_model policy package:</p>
<div class="highlight"><pre><span></span><code><span class="nv">messages</span><span class="w"> </span><span class="nv">contains</span><span class="w"> </span><span class="nv">msg</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nv">customer_is_a_smoker_over_60</span><span class="w"></span>
<span class="w"> </span><span class="nv">msg</span><span class="o">:=</span><span class="w"> </span><span class="s">"Prediction cannot be made if customer is a smoker over the age of 60."</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>The last rule creates a "messages" array with explanations for the rules. If the "customer_is_a_smoker_over_60" rule is true, the messages array will contain an explanation for that particular decision. The structure of this policy package is designed to be extendable, so extra clauses can be added to the "allow" rule and "messages" rule as needed.</p>
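<p>Taken together, the three rules produce a small decision document. As a sanity check, here is a pure-Python sketch of the same logic; this is only an illustration of the result we expect back from a policy query, not the OPA engine itself:</p>

```python
def evaluate_policy_sketch(model_input: dict) -> dict:
    """Pure-Python rendering of the Rego rules above, for illustration only."""
    smoker_over_60 = bool(model_input.get("smoker")) and model_input.get("age", 0) > 60
    messages = []
    if smoker_over_60:
        messages.append("Prediction cannot be made if customer is a smoker over the age of 60.")
    return {"allow": not smoker_over_60, "messages": messages}


print(evaluate_policy_sketch({"age": 62, "smoker": True}))
print(evaluate_policy_sketch({"age": 42, "smoker": False}))
```

<p>The first input is denied with an explanatory message, while the second is allowed with an empty messages collection.</p>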
<p>The policy file is called "insurance_charges_model.rego" and it is saved in the "policies" folder of the repository. </p>
<h2>Trying Out the Policy</h2>
<p>To show how the policy works, we'll start up the Open Policy Agent service in a Docker container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8181</span><span class="p">:</span><span class="mi">8181</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">opa</span> \
<span class="n">openpolicyagent</span><span class="o">/</span><span class="n">opa</span> <span class="n">run</span> <span class="o">--</span><span class="n">server</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">84</span><span class="n">f6c5264e3b1c06e5d20891932e4e682cfd45754fac52dfd0a76ee1574f1302</span><span class="w"></span>
</code></pre></div>
<p>Once the container is up and running, we'll install the <a href="https://github.com/Turall/OPA-python-client">OPA python package</a> to make the integration a little easier. By using the package we won't need to make individual REST calls to the service ourselves; we'll let the package handle that.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">OPA</span><span class="o">-</span><span class="n">python</span><span class="o">-</span><span class="n">client</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To contact the OPA service running in the Docker image, we'll create a client object:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">opa_client.opa</span> <span class="kn">import</span> <span class="n">OpaClient</span>
<span class="n">client</span> <span class="o">=</span> <span class="n">OpaClient</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="mi">8181</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="s2">"v1"</span><span class="p">)</span>
<span class="n">client</span><span class="o">.</span><span class="n">check_connection</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>"Yes I'm here :)"
</code></pre></div>
<p>The check_connection() method on the client reached out to the OPA service and checked for connectivity.</p>
<p>We can create policies in the OPA service by loading the policies from a file and sending it to the service.</p>
<div class="highlight"><pre><span></span><code><span class="n">client</span><span class="o">.</span><span class="n">update_opa_policy_fromfile</span><span class="p">(</span><span class="s2">"../policies/insurance_charges_model.rego"</span><span class="p">,</span>
<span class="n">endpoint</span><span class="o">=</span><span class="s2">"insurance_charges_model"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>True
</code></pre></div>
<p>The policy was created succesfully in the service, but just to make sure we can ask for a list of the policies:</p>
<div class="highlight"><pre><span></span><code><span class="n">client</span><span class="o">.</span><span class="n">get_policies_list</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>['insurance_charges_model']
</code></pre></div>
<p>Looks like the insurance_charges_model package is loaded, now we can try it out with some data. We'll create some data using the model's input and output schemas:</p>
<div class="highlight"><pre><span></span><code><span class="n">policy_input_data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"model_qualified_name"</span><span class="p">:</span> <span class="s2">"insurance_charges_model"</span><span class="p">,</span>
<span class="s2">"model_version"</span><span class="p">:</span> <span class="s2">"0.1.0"</span><span class="p">,</span>
<span class="s2">"model_input"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="mi">62</span><span class="p">,</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="s2">"female"</span><span class="p">,</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="mf">24.0</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="s2">"northwest"</span>
<span class="p">},</span>
<span class="s2">"model_output"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"charges"</span><span class="p">:</span> <span class="mf">12345.0</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<p>We'll be sending the model's qualified name and version, along with the model input and model output.</p>
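<p>To make this payload easier to reuse, the structure above can be assembled by a small helper function. This is an illustrative sketch only; the helper name is ours and is not part of the OPA client package:</p>

```python
def build_policy_input(qualified_name, version, model_input, model_output):
    """Assemble the document that is sent to the OPA service for a policy check."""
    return {
        "model_qualified_name": qualified_name,
        "model_version": version,
        "model_input": model_input,
        "model_output": model_output,
    }

# build the same payload as shown above
payload = build_policy_input(
    "insurance_charges_model",
    "0.1.0",
    {"age": 62, "sex": "female", "bmi": 24.0, "children": 2,
     "smoker": True, "region": "northwest"},
    {"charges": 12345.0},
)
```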
<p>We can execute the policy against this data like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">check_policy_rule</span><span class="p">(</span><span class="n">input_data</span><span class="o">=</span><span class="n">policy_input_data</span><span class="p">,</span>
<span class="n">package_path</span><span class="o">=</span><span class="s2">"insurance_charges_model"</span><span class="p">)</span>
<span class="n">result</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s1">'</span><span class="s">result</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">allow</span><span class="s1">'</span>: <span class="nv">False</span>,
<span class="s1">'</span><span class="s">customer_is_a_smoker_over_60</span><span class="s1">'</span>: <span class="nv">True</span>,
<span class="s1">'</span><span class="s">messages</span><span class="s1">'</span>: [<span class="s1">'</span><span class="s">Prediction cannot be made if customer is a smoker over the age of 60.</span><span class="s1">'</span>]}}
</code></pre></div>
<p>The "allow" rule evaluated to False, the reason being that the customer is a smoker over the age of 60. Let's try it again:</p>
<div class="highlight"><pre><span></span><code><span class="n">policy_input_data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"model_qualified_name"</span><span class="p">:</span> <span class="s2">"insurance_charges_model"</span><span class="p">,</span>
<span class="s2">"model_version"</span><span class="p">:</span> <span class="s2">"0.1.0"</span><span class="p">,</span>
<span class="s2">"model_input"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="mi">45</span><span class="p">,</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="s2">"female"</span><span class="p">,</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="mf">24.0</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="s2">"northwest"</span>
<span class="p">},</span>
<span class="s2">"model_output"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"charges"</span><span class="p">:</span> <span class="mf">12345.0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">check_policy_rule</span><span class="p">(</span><span class="n">input_data</span><span class="o">=</span><span class="n">policy_input_data</span><span class="p">,</span>
<span class="n">package_path</span><span class="o">=</span><span class="s2">"insurance_charges_model"</span><span class="p">)</span>
<span class="n">result</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'result': {'allow': True, 'messages': []}}
</code></pre></div>
<p>This time, the "allow" rule evaluated to true, because the age of the customer is below 60, however they are still a smoker. The rule works as expected because we wanted to disallow a prediction if the customer is a smoker AND also over the age of 60.</p>
<p>In this section we showed how to execute the Rego policy using the Open Policy Agent. </p>
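<p>Under the hood, the client's check_policy_rule() method maps onto OPA's REST data API, which accepts a POST to /v1/data/&lt;package_path&gt; with the input document wrapped in an "input" field. The same call can be sketched with only the standard library (the helper names here are ours, not part of any package):</p>

```python
import json
import urllib.request

def opa_data_url(host, port, package_path):
    """Build the URL of OPA's data API for a given policy package."""
    return f"http://{host}:{port}/v1/data/{package_path}"

def check_policy(host, port, package_path, input_data):
    """POST the input document to the OPA service and return the decision."""
    body = json.dumps({"input": input_data}).encode("utf-8")
    request = urllib.request.Request(
        opa_data_url(host, port, package_path),
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```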
<h2>Testing the Policy</h2>
<p>Rego policies can be tested by creating other Rego policies that assert that the policy outputs the correct decision given fake data. A Rego test looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">package</span> <span class="n">insurance_charges_model</span>
<span class="kn">import</span> <span class="nn">future.keywords</span>
<span class="n">test_customer_is_a_smoker_over_60</span> <span class="k">if</span> <span class="p">{</span>
<span class="n">customer_is_a_smoker_over_60</span> <span class="k">with</span> <span class="nb">input</span> <span class="k">as</span> <span class="p">{</span>
<span class="s2">"model_input"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="mi">62</span><span class="p">,</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="s2">"female"</span><span class="p">,</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="mf">24.0</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="n">true</span><span class="p">,</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="s2">"northwest"</span>
<span class="p">},</span>
<span class="s2">"model_output"</span><span class="p">:</span> <span class="p">{</span>
<span class="s2">"charges"</span><span class="p">:</span> <span class="mf">12345.0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<p>The unit test is named "test_customer_is_a_smoker_over_60" and it asserts that the rule evaluates to "true" given the input. This unit test, along with nine others, can be found in the insurance_charges_model_test.rego file in the policies folder of the project repository.</p>
<p>We'll run the test with this Docker command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">-</span><span class="n">v</span> <span class="s2">"$(pwd)"</span><span class="o">/../</span><span class="n">policies</span><span class="p">:</span><span class="o">/</span><span class="n">policies</span> \
<span class="n">openpolicyagent</span><span class="o">/</span><span class="n">opa</span><span class="p">:</span><span class="mf">0.43.0</span> <span class="n">test</span> <span class="o">./</span><span class="n">policies</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">PASS</span><span class="o">:</span><span class="w"> </span><span class="mi">10</span><span class="o">/</span><span class="mi">10</span><span class="w"></span>
</code></pre></div>
<p>The "opa test" command found all 10 tests and executed them. The tests are loaded by sharing the folder containing the policies with the Docker container as a volume. The command then automatically found the insurance_charges_model_test.rego file and executed all of the tests inside it. The tests all passed.</p>
<p>One of the benefits of building policies as code is that the policies themselves can be tested, which adds quality control to the policy codebase.</p>
<h2>Creating the Policy Decorator</h2>
<p>In order to cleanly integrate a deployed ML model with the Open Policy Agent, we'll create a decorator that handles the application of policies. The decorator will execute "around" the model's output_schema property and predict() method.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Union</span>
<span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span>
<span class="kn">from</span> <span class="nn">ml_base.decorator</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
<span class="kn">from</span> <span class="nn">opa_client.opa</span> <span class="kn">import</span> <span class="n">OpaClient</span>
<span class="k">class</span> <span class="nc">PredictionNotAvailable</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Schema returned when a prediction is not available because of a policy decision."""</span>
<span class="n">messages</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span>
<span class="k">class</span> <span class="nc">OPAPolicyDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="sd">"""Decorator to do policy checks using the Open Policy Agent service.</span>
<span class="sd"> Args:</span>
<span class="sd"> host: Hostname of the OPA service.</span>
<span class="sd"> port: Port of the OPA service.</span>
<span class="sd"> policy_package: Name of the policy to apply to the model.</span>
<span class="sd"> """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">port</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">policy_package</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">policy_package</span><span class="o">=</span><span class="n">policy_package</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_client"</span><span class="p">]</span> <span class="o">=</span> <span class="n">OpaClient</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span>
<span class="n">version</span><span class="o">=</span><span class="s2">"v1"</span><span class="p">)</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">BaseModel</span><span class="p">:</span>
<span class="sd">"""Decorator method that modifies the model's output schema to accomodate the policy decision.</span>
<span class="sd"> Note:</span>
<span class="sd"> This method will create a Union of the model's output schema and the PredictionNotAvailable</span>
<span class="sd"> schema and return it.</span>
<span class="sd"> """</span>
<span class="k">class</span> <span class="nc">NewUnion</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">__root__</span><span class="p">:</span> <span class="n">Union</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">output_schema</span><span class="p">,</span> <span class="n">PredictionNotAvailable</span><span class="p">]</span>
<span class="n">NewUnion</span><span class="o">.</span><span class="vm">__name__</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="vm">__name__</span>
<span class="k">return</span> <span class="n">NewUnion</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="sd">"""Decorate the model's predict() method, calling the OPA service with the model's input and output.</span>
<span class="sd"> Note:</span>
<span class="sd"> If a prediction is allowed the OPAPolicyDecorator predict() method will return an</span>
<span class="sd"> instance of the model's output schema. If a prediction is not allowed because of a policy </span>
<span class="sd"> violation, the decorator will return an instance of PredictionNotAvailable.</span>
<span class="sd"> """</span>
<span class="c1"># make a prediction with the model</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="c1"># build up data structure to send to the OPA service</span>
<span class="n">policy_check_data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"model_qualified_name"</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
<span class="s2">"model_version"</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span><span class="p">,</span>
<span class="s2">"model_input"</span><span class="p">:</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">(),</span>
<span class="s2">"model_output"</span><span class="p">:</span> <span class="n">prediction</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="p">}</span>
<span class="c1"># call OPA service with model input and output </span>
<span class="n">response</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_client"</span><span class="p">]</span><span class="o">.</span><span class="n">check_policy_rule</span><span class="p">(</span><span class="n">input_data</span><span class="o">=</span><span class="n">policy_check_data</span><span class="p">,</span>
<span class="n">package_path</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"policy_package"</span><span class="p">])</span>
<span class="c1"># if "allow" is True, then return the prediction</span>
<span class="k">if</span> <span class="n">response</span><span class="p">[</span><span class="s2">"result"</span><span class="p">][</span><span class="s2">"allow"</span><span class="p">]:</span>
<span class="k">return</span> <span class="n">prediction</span>
<span class="c1"># otherwise, return an instance of PredictionNotAvailable</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="n">PredictionNotAvailable</span><span class="p">(</span><span class="n">messages</span><span class="o">=</span><span class="n">response</span><span class="p">[</span><span class="s2">"result"</span><span class="p">][</span><span class="s2">"messages"</span><span class="p">])</span>
<span class="k">def</span> <span class="fm">__del__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_client"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">del</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_client"</span><span class="p">]</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
<span class="k">pass</span>
</code></pre></div>
<p>The OPAPolicyDecorator class implements the decorator. The <code>__init__()</code> method is used to configure the decorator when it is instantiated. It has parameters for the hostname and port of the OPA service, and the policy package that we want to apply to the model.</p>
<p>The decorator actually modifies the output schema of the model that it is decorating. The output schema becomes a Union of the model's output schema and a schema called PredictionNotAvailable. The decorator needs to add this Union because it needs to be able to inform the users of the model when the policy does not allow a prediction to be returned. The modification of the output schema happens transparently to the user of the model; all they need to do is handle the PredictionNotAvailable output when it is returned.</p>
<p>The predict() method is where the action happens. Every time we make a prediction, the decorator passes the prediction input to the model instance and receives the prediction output from the model. The decorator then sends the model's input and output to the OPA service along with the name of the policy package that we want to apply. If the "allow" result comes back as True, the prediction is returned to the calling code; if the "allow" result is False, the decorator returns a PredictionNotAvailable instance containing the "messages" array produced by the policy.</p>
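<p>Stripped of the decorator machinery, the core decision logic of the predict() method can be sketched as a standalone function. The function name is hypothetical and plain dictionaries stand in for the schema objects:</p>

```python
def apply_policy_decision(response, prediction):
    """Return the prediction if the policy allows it, otherwise the messages."""
    result = response["result"]
    if result["allow"]:
        return prediction
    return {"messages": result["messages"]}

# a response that allows the prediction passes it through unchanged
allowed = apply_policy_decision(
    {"result": {"allow": True, "messages": []}},
    {"charges": 8640.78})

# a denying response replaces the prediction with the policy's messages
denied = apply_policy_decision(
    {"result": {"allow": False,
                "messages": ["Prediction cannot be made."]}},
    {"charges": 8640.78})
```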
<h2>Decorating the Model</h2>
<p>To test out the decorator we’ll first instantiate the model object that we want to use with the decorator.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
</code></pre></div>
<p>Next, we’ll instantiate the decorator with the parameters.</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">OPAPolicyDecorator</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">8181</span><span class="p">,</span>
<span class="n">policy_package</span><span class="o">=</span><span class="s2">"insurance_charges_model"</span>
<span class="p">)</span>
</code></pre></div>
<p>We can add the model instance to the decorator after it’s been instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>We can see the decorator and the model objects by printing the reference to the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OPAPolicyDecorator(InsuranceChargesModel)
</code></pre></div>
<p>The decorator object prints out its own type along with the type of the model that it is decorating.</p>
<p>The JSON Schema of the model output schema also reflects the Union that was created by the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">InsuranceChargesModelOutput</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">anyOf</span><span class="s1">'</span>: [{<span class="s1">'</span><span class="s">$ref</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">#/definitions/insurance_charges_model__prediction__schemas__InsuranceChargesModelOutput</span><span class="s1">'</span>},
{<span class="s1">'</span><span class="s">$ref</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">#/definitions/PredictionNotAvailable</span><span class="s1">'</span>}],
<span class="s1">'</span><span class="s">definitions</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">insurance_charges_model__prediction__schemas__InsuranceChargesModelOutput</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">InsuranceChargesModelOutput</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s2">"</span><span class="s">Schema for output of the model's predict method.</span><span class="s2">"</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">object</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">properties</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">charges</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Charges</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Individual medical costs billed by health insurance to customer in US dollars.</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">number</span><span class="s1">'</span>}}},
<span class="s1">'</span><span class="s">PredictionNotAvailable</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">PredictionNotAvailable</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Schema returned when a prediction is not available because of a policy decision.</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">object</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">properties</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">messages</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Messages</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">array</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">items</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">string</span><span class="s1">'</span>}}},
<span class="s1">'</span><span class="s">required</span><span class="s1">'</span>: [<span class="s1">'</span><span class="s">messages</span><span class="s1">'</span>]}}}
</code></pre></div>
<p>As we explained, the PredictionNotAvailable output is added by the OPAPolicyDecorator instance whenever the policy does not allow a prediction to be returned from the model. The Union is shown in the JSON Schema document using the "anyOf" field.</p>
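<p>A caller consuming this union can tell the two variants apart by their required fields, since PredictionNotAvailable is the only one that carries "messages". A sketch of such client-side handling, using plain dictionaries in place of the schema objects:</p>

```python
def interpret_model_output(output):
    """Dispatch on the two possible shapes of the decorated model's output."""
    if "messages" in output:
        # the policy denied the prediction
        return ("denied", output["messages"])
    # a normal prediction came back
    return ("prediction", output["charges"])
```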
<h2>Trying out the Decorator</h2>
<p>Now that we have some policies in the OPA service and a decorated model, we can try to make predictions with the decorated model.</p>
<p>To begin, we'll try a prediction that we know will succeed:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>Since the customer is not a smoker over the age of 60, the policy allowed the prediction and we got it back from the model. Next, we'll try another prediction:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">62</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="nv">PredictionNotAvailable</span><span class="ss">(</span><span class="nv">messages</span><span class="o">=</span>[<span class="s1">'</span><span class="s">Prediction cannot be made if customer is a smoker over the age of 60.</span><span class="s1">'</span>]<span class="ss">)</span>
</code></pre></div>
<p>The policy decorator stepped in when the OPA service returned a result with "allow" set to false. The decorator discarded the model's prediction and returned an instance of PredictionNotAvailable containing the messages array created by the policy running in the OPA service.</p>
<h2>Deploying the Decorator and Model</h2>
<p>Now that we have a model and a decorator, we can combine them in a service that makes predictions and also performs policy checks. To do this, we won't need to write any extra code; we can leverage the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> to provide the RESTful API for the service. You can learn more about the package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all we need to do is add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">policy_decorator.policy_decorator.OPAPolicyDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="s">"localhost"</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8181</span><span class="w"></span>
<span class="w"> </span><span class="nt">policy_package</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
</code></pre></div>
<p>The service_title field is the name of the service as it will appear in the documentation. The models field is an array that contains the details of the models we would like to deploy in the service. The class_path field points at the MLModel class that implements the model's prediction logic. The decorators field contains the details of the decorators that we want to attach to the model instance. In this case, we want to use the OPAPolicyDecorator decorator class with the configuration we've used for local testing.</p>
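<p>The class_path strings in the configuration can be resolved into Python classes with a dynamic import. This is a hypothetical sketch of the mechanism; rest_model_service's actual loader may differ.</p>

```python
import importlib

def load_class(class_path: str):
    """Import and return the class named by a dotted path like 'pkg.module.ClassName'."""
    module_path, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# The service can then wrap the model in each configured decorator, passing the
# "configuration" mapping as keyword arguments, e.g.:
#   decorator_class = load_class(entry["class_path"])
#   model = decorator_class(model, **entry.get("configuration", {}))
```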
<p>Using the configuration file, we're able to create an OpenAPI specification file for the model service by executing these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/local_rest_config.yaml
generate_openapi --output_file<span class="o">=</span><span class="s2">"service_contract.yaml"</span>
</code></pre></div>
<p>The generated service_contract.yaml file contains the OpenAPI specification for the model service. The insurance_charges_model endpoint is the one we'll call to make predictions with the model. The model's input and output schemas were automatically extracted and added to the specification. If you inspect the contract, you'll find that the model's output schema was automatically modified by the decorator in the same way as in the example above: the output schema is a Union of the model's original output schema and the PredictionNotAvailable type. The generated OpenAPI specification file can be found at the root of the repository, named service_contract.yaml.</p>
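<p>The widened output schema can be illustrated with simple dataclasses. The class names follow the examples above, but these definitions are illustrative stand-ins, not the model's actual schema classes.</p>

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class InsuranceChargesModelOutput:  # stand-in for the model's output schema
    charges: float

@dataclass
class PredictionNotAvailable:  # returned when the policy denies a prediction
    messages: List[str]

# After decoration, the output type exposed in the OpenAPI contract is the
# union of the original output type and the policy-denial type.
DecoratedModelOutput = Union[InsuranceChargesModelOutput, PredictionNotAvailable]
```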
<p>To run the service locally, execute this command:</p>
<div class="highlight"><pre><span></span><code>uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service process starts up and can be accessed in a web browser at http://127.0.0.1:8000. The service renders the OpenAPI specification as a webpage that looks like this:</p>
<p><img alt="Service Documentation" src="https://www.tekhnoal.com/service_documentation_pfmlm.png" width="100%"></p>
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package we're able to quickly stand up a service to host the model. The decorator that we want to deploy can also be attached to the model through configuration, including all of its parameters.</p>
<p>We won't be testing the service right now, so we can stop the service process by hitting CTRL+C.</p>
<h2>Creating a Docker Image</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally using Docker.</p>
<p>Let's build a Docker image and run it locally. The image is built from the instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="c"># syntax=docker/dockerfile:1</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">ARG</span><span class="w"> </span>BUILD_DATE
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Policies for ML Models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Policies for machine learning models."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$BUILD_DATE</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/policies-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="s2">"0.1.0"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/service</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USERNAME</span><span class="o">=</span>service-user
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_UID</span><span class="o">=</span><span class="m">10000</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_GID</span><span class="o">=</span><span class="m">10000</span>
<span class="c"># install packages</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update <span class="se">\</span>
<span class="o">&&</span> apt-get install --assume-yes --no-install-recommends sudo <span class="se">\</span>
<span class="o">&&</span> apt-get install --assume-yes --no-install-recommends git <span class="se">\</span>
<span class="o">&&</span> apt-get clean <span class="se">\</span>
<span class="o">&&</span> rm -rf /var/lib/apt/lists/*
<span class="c"># create a user</span>
<span class="k">RUN</span><span class="w"> </span>groupadd --gid <span class="nv">$USER_GID</span> <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> useradd --uid <span class="nv">$USER_UID</span> --gid <span class="nv">$USER_GID</span> -m <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> <span class="nb">echo</span> <span class="nv">$USERNAME</span> <span class="nv">ALL</span><span class="o">=</span><span class="se">\(</span>root<span class="se">\)</span> NOPASSWD:ALL > /etc/sudoers.d/<span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> chmod <span class="m">0440</span> /etc/sudoers.d/<span class="nv">$USERNAME</span>
<span class="c"># installing dependencies</span>
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r service_requirements.txt
<span class="c"># copying code, configuration, and license</span>
<span class="k">COPY</span><span class="w"> </span>./configuration ./configuration
<span class="k">COPY</span><span class="w"> </span>./policy_decorator ./policy_decorator
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
<span class="k">USER</span><span class="w"> </span><span class="s">$USERNAME</span>
</code></pre></div>
<p>To build the Docker image from the Dockerfile, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="o">../</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model_service 0.1.0 4b2747668a67 18 seconds ago 1.37GB
</code></pre></div>
<p>The insurance_charges_model_service image is listed. Next, we'll run the image to see if everything is working as expected. However, the service container needs to be able to reach the OPA container, so we first need to connect both containers to the same Docker network. Let's create the network:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">create</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">8</span><span class="n">a7d2d05523d01dd0fc082adac84bda01a012d7e847dcd4ffcc35df1031e18ab</span><span class="w"></span>
</code></pre></div>
<p>Next, we'll connect the running OPA Docker image to the network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">connect</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> <span class="n">opa</span>
</code></pre></div>
<p>Now we can start the service docker image connected to the same network as the OPA container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">docker_rest_config</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">insurance_charges_model_service</span> \
<span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">02</span><span class="n">dd79117cfe53949b30dea9e1aa8834bf2509e2cc707f42972eec955c3364ae</span><span class="w"></span>
</code></pre></div>
<p>Notice that we're using the "docker_rest_config.yaml" configuration file, which has a different hostname for the OPA service instance. From inside the network, the opa container is not accessible at localhost, so the configuration uses the hostname "opa" instead.</p>
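<p>Based on the local configuration shown earlier, the decorator section of docker_rest_config.yaml likely differs only in the host field. This is an assumed excerpt for illustration, not the verbatim file contents:</p>

```yaml
decorators:
  - class_path: policy_decorator.policy_decorator.OPAPolicyDecorator
    configuration:
      host: "opa"   # the container name resolves as a hostname on local-network
      port: 8181
      policy_package: insurance_charges_model
```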
<p>To make sure the server process started up correctly, we'll look at the logs:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">logs</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="o">/</span><span class="n">usr</span><span class="o">/</span><span class="n">local</span><span class="o">/</span><span class="n">lib</span><span class="o">/</span><span class="n">python3</span><span class="mf">.9</span><span class="o">/</span><span class="n">site</span><span class="o">-</span><span class="n">packages</span><span class="o">/</span><span class="n">tpot</span><span class="o">/</span><span class="n">builtins</span><span class="o">/</span><span class="fm">__init__</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span> <span class="ne">UserWarning</span><span class="p">:</span> <span class="ne">Warning</span><span class="p">:</span> <span class="n">optional</span> <span class="n">dependency</span> <span class="err">`</span><span class="n">torch</span><span class="err">`</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">available</span><span class="o">.</span> <span class="o">-</span> <span class="n">skipping</span> <span class="kn">import</span> <span class="nn">of</span> <span class="n">NN</span> <span class="n">models</span><span class="o">.</span>
<span class="n">warnings</span><span class="o">.</span><span class="n">warn</span><span class="p">(</span><span class="s2">"Warning: optional dependency `torch` is not available. - skipping import of NN models."</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Started</span> <span class="n">server</span> <span class="n">process</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Waiting</span> <span class="k">for</span> <span class="n">application</span> <span class="n">startup</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Application</span> <span class="n">startup</span> <span class="n">complete</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Uvicorn</span> <span class="n">running</span> <span class="n">on</span> <span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="mf">0.0.0.0</span><span class="p">:</span><span class="mi">8000</span> <span class="p">(</span><span class="n">Press</span> <span class="n">CTRL</span><span class="o">+</span><span class="n">C</span> <span class="n">to</span> <span class="n">quit</span><span class="p">)</span>
</code></pre></div>
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction using the curl command running inside a container connected to the network:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> \
<span class="n">curlimages</span><span class="o">/</span><span class="n">curl</span> \
<span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://insurance_charges_model_service:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 42, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">female</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 24.0, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 2, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: false, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">northwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":8640.78}
</code></pre></div>
<p>The model predicted that the insurance charges will be $8640.78.</p>
<p>We'll try a prediction that will fail the policy check as well:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> \
<span class="n">curlimages</span><span class="o">/</span><span class="n">curl</span> \
<span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://insurance_charges_model_service:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 62, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">female</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 24.0, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 2, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">northwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s2">"</span><span class="s">messages</span><span class="s2">"</span>:[<span class="s2">"</span><span class="s">Prediction cannot be made if customer is a smoker over the age of 60.</span><span class="s2">"</span>]}
</code></pre></div>
<p>We're done with the local environment, so we'll shut down the OPA container, the model service container and the network we created for them.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">opa</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">opa</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">rm</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>opa
opa
insurance_charges_model_service
insurance_charges_model_service
local-network
</code></pre></div>
<h2>Deploying the Model</h2>
<p>To show the system in action, we’ll deploy the service and the OPA instance to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">😄</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="n">v1</span><span class="o">.</span><span class="mf">26.1</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">Darwin</span><span class="w"> </span><span class="mf">12.5</span><span class="o">.</span><span class="mi">1</span><span class="w"></span>
<span class="err">🎉</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="mf">1.27</span><span class="o">.</span><span class="mi">0</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">available</span><span class="o">!</span><span class="w"> </span><span class="n">Download</span><span class="w"> </span><span class="n">it</span><span class="p">:</span><span class="w"> </span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">minikube</span><span class="o">/</span><span class="n">releases</span><span class="o">/</span><span class="n">tag</span><span class="o">/</span><span class="n">v1</span><span class="o">.</span><span class="mf">27.0</span><span class="w"></span>
<span class="err">💡</span><span class="w"> </span><span class="n">To</span><span class="w"> </span><span class="n">disable</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">notice</span><span class="p">,</span><span class="w"> </span><span class="n">run</span><span class="p">:</span><span class="w"> </span><span class="s1">'minikube config set WantUpdateNotification false'</span><span class="w"></span>
<span class="err">✨</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">virtualbox</span><span class="w"> </span><span class="n">driver</span><span class="w"> </span><span class="n">based</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">existing</span><span class="w"> </span><span class="n">profile</span><span class="w"></span>
<span class="err">👍</span><span class="w"> </span><span class="n">Starting</span><span class="w"> </span><span class="n">control</span><span class="w"> </span><span class="n">plane</span><span class="w"> </span><span class="n">node</span><span class="w"> </span><span class="n">minikube</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="n">minikube</span><span class="w"></span>
<span class="err">🔄</span><span class="w"> </span><span class="n">Restarting</span><span class="w"> </span><span class="n">existing</span><span class="w"> </span><span class="n">virtualbox</span><span class="w"> </span><span class="n">VM</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="s2">"minikube"</span><span class="w"> </span><span class="o">...</span><span class="w"></span>
<span class="err">🐳</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span><span class="n">Kubernetes</span><span class="w"> </span><span class="n">v1</span><span class="o">.</span><span class="mf">24.3</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">Docker</span><span class="w"> </span><span class="mf">20.10</span><span class="o">.</span><span class="mi">17</span><span class="w"> </span><span class="o">...</span><span class="err"></span><span class="p">[</span><span class="n">K</span><span class="err"></span><span class="p">[</span><span class="n">K</span><span class="err"></span><span class="p">[</span><span class="n">K</span><span class="err"></span><span class="p">[</span><span class="n">K</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">controller</span><span class="o">-</span><span class="n">manager</span><span class="o">.</span><span class="n">horizontal</span><span class="o">-</span><span class="n">pod</span><span class="o">-</span><span class="n">autoscaler</span><span class="o">-</span><span class="n">sync</span><span class="o">-</span><span class="n">period</span><span class="o">=</span><span class="mi">5</span><span class="n">s</span><span class="w"></span>
<span class="err">🔎</span><span class="w"> </span><span class="n">Verifying</span><span class="w"> </span><span class="n">Kubernetes</span><span class="w"> </span><span class="n">components</span><span class="o">...</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">k8s</span><span class="o">.</span><span class="n">gcr</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">metrics</span><span class="o">-</span><span class="n">server</span><span class="o">/</span><span class="n">metrics</span><span class="o">-</span><span class="n">server</span><span class="p">:</span><span class="n">v0</span><span class="o">.</span><span class="mf">6.1</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">kubernetesui</span><span class="o">/</span><span class="n">dashboard</span><span class="p">:</span><span class="n">v2</span><span class="o">.</span><span class="mf">6.0</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">gcr</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">k8s</span><span class="o">-</span><span class="n">minikube</span><span class="o">/</span><span class="n">storage</span><span class="o">-</span><span class="n">provisioner</span><span class="p">:</span><span class="n">v5</span><span class="w"></span>
<span class="w"> </span><span class="err">▪</span><span class="w"> </span><span class="n">Using</span><span class="w"> </span><span class="n">image</span><span class="w"> </span><span class="n">kubernetesui</span><span class="o">/</span><span class="n">metrics</span><span class="o">-</span><span class="n">scraper</span><span class="p">:</span><span class="n">v1</span><span class="o">.</span><span class="mf">0.8</span><span class="w"></span>
<span class="err">🌟</span><span class="w"> </span><span class="n">Enabled</span><span class="w"> </span><span class="n">addons</span><span class="p">:</span><span class="w"> </span><span class="n">storage</span><span class="o">-</span><span class="n">provisioner</span><span class="w"></span>
<span class="err">🏄</span><span class="w"> </span><span class="n">Done</span><span class="o">!</span><span class="w"> </span><span class="n">kubectl</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">now</span><span class="w"> </span><span class="n">configured</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">use</span><span class="w"> </span><span class="s2">"minikube"</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="ow">and</span><span class="w"> </span><span class="s2">"default"</span><span class="w"> </span><span class="n">namespace</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="n">default</span><span class="w"></span>
</code></pre></div>
<p>We'll use the <a href="https://github.com/kubernetes/dashboard">Kubernetes Dashboard</a> to view details about the model service. We can start it up in the minikube cluster with this command:</p>
<div class="highlight"><pre><span></span><code>minikube dashboard --url
</code></pre></div>
<p>The command starts up a proxy that must keep running in order to forward the traffic to the dashboard UI in the minikube cluster.</p>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE              NAME                                         READY   STATUS    RESTARTS       AGE
kube-system            coredns-6d4b75cb6d-wrrwr                     1/1     Running   19 (23h ago)   43d
kube-system            etcd-minikube                                1/1     Running   19 (23h ago)   43d
kube-system            kube-apiserver-minikube                      1/1     Running   19 (23h ago)   43d
kube-system            kube-controller-manager-minikube             1/1     Running   5 (23h ago)    20d
kube-system            kube-proxy-5n4t9                             1/1     Running   18 (23h ago)   43d
kube-system            kube-scheduler-minikube                      1/1     Running   17 (23h ago)   43d
kube-system            metrics-server-8595bd7d4c-ptcsp              1/1     Running   15 (23h ago)   23d
kube-system            storage-provisioner                          1/1     Running   29             43d
kubernetes-dashboard   dashboard-metrics-scraper-78dbd9dbf5-xslpl   1/1     Running   11 (23h ago)   23d
kubernetes-dashboard   kubernetes-dashboard-5fd5574d9f-vbtnd        1/1     Running   14 (23h ago)   23d
</code></pre></div>
<p>The pods running the Kubernetes Dashboard and other cluster services appear in the kube-system and kubernetes-dashboard namespaces.</p>
<h3>Creating a Kubernetes Namespace</h3>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
resourcequota/model-services-resource-quota created
</code></pre></div>
<p>The namespace was created, along with a ResourceQuota that limits the resources that can be consumed by objects within the namespace.</p>
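<p>For reference, a minimal manifest that creates both objects might look like the sketch below. The quota values here are assumptions; the actual kubernetes/namespace.yaml may use different limits.</p>

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: model-services
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: model-services-resource-quota
  namespace: model-services
spec:
  hard:
    requests.cpu: "1"      # assumed values, adjust to your cluster
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
```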
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                   STATUS   AGE
default                Active   43d
kube-node-lease        Active   43d
kube-public            Active   43d
kube-system            Active   43d
kubernetes-dashboard   Active   23d
model-services         Active   3s
</code></pre></div>
<p>The new namespace appears in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Context "minikube" modified.
</code></pre></div>
<h3>Creating a Kubernetes Deployment and Service</h3>
<p>The model service is deployed by using Kubernetes resources. These are:</p>
<ul>
<li>Model Service ConfigMap: a set of configuration options, in this case a simple YAML file that is loaded into the running container as a volume mount. This resource lets us change the configuration of the model service without modifying the Docker image; the mounted file overrides the configuration files included in the image.</li>
<li>Deployment: a declarative way to manage a set of pods; the model service pods are managed through this Deployment, which includes the model service container as well as the OPA service running as a sidecar container.</li>
<li>Service: a way to expose a set of pods running under a Deployment; the model service is made available to the outside world through the Service.</li>
</ul>
<p>The software architecture will look like this when it is running in the Kubernetes cluster:</p>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/better_software_architecture_pfmlm.png" width="100%"></p>
<p>This way of deploying the OPA service is called the "sidecar" pattern because each service Pod contains the main model service container and the OPA container running right beside it in the same Pod.</p>
<p>The sidecar OPA container is added to the model service pod with this YAML:</p>
<div class="highlight"><pre><span></span><code><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">...</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">- name</span><span class="p p-Indicator">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">opa</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">image</span><span class="p p-Indicator">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">openpolicyagent/opa:0.43.0</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ports</span><span class="p p-Indicator">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">http</span><span class="w"></span>
<span class="w"> </span><span class="nt">containerPort</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8181</span><span class="w"></span>
<span class="w"> </span><span class="w w-Error"> </span><span class="nt">imagePullPolicy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Never</span><span class="w"></span>
<span class="w"> </span><span class="nt">resources</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">requests</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s">"100m"</span><span class="w"></span>
<span class="w"> </span><span class="nt">memory</span><span class="p">:</span><span class="w"> </span><span class="s">"250Mi"</span><span class="w"></span>
<span class="w"> </span><span class="nt">limits</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s">"200m"</span><span class="w"></span>
<span class="w"> </span><span class="nt">memory</span><span class="p">:</span><span class="w"> </span><span class="s">"250Mi"</span><span class="w"></span>
<span class="w"> </span><span class="nt">args</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"run"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"--ignore=.*"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"--server"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"/policies"</span><span class="w"></span>
<span class="w"> </span><span class="nt">volumeMounts</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">readOnly</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">mountPath</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/policies</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">policies</span><span class="w"></span>
<span class="w"> </span><span class="nt">livenessProbe</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">httpGet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HTTP</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8181</span><span class="w"></span>
<span class="w"> </span><span class="nt">initialDelaySeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">readinessProbe</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">httpGet</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">/health?bundle=true</span><span class="w"></span>
<span class="w"> </span><span class="nt">scheme</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HTTP</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8181</span><span class="w"></span>
<span class="w"> </span><span class="nt">initialDelaySeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="nt">periodSeconds</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">5</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">...</span><span class="w"></span>
</code></pre></div>
<p>This is not the complete YAML file; the full Deployment is defined in the ./kubernetes/model_service.yaml file.</p>
<p>You'll notice that the policy is not loaded through the OPA API. Instead, we'll add the policy as a volume mounted at the /policies path within the OPA container. The contents of the volume come from a ConfigMap that we'll create with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="n">configmap</span> <span class="n">policies</span> <span class="o">--</span><span class="n">from</span><span class="o">-</span><span class="n">file</span> <span class="o">../</span><span class="n">policies</span><span class="o">/</span><span class="n">insurance_charges_model</span><span class="o">.</span><span class="n">rego</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/policies created
</code></pre></div>
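<p>Inside the Deployment, the ConfigMap reaches the OPA container's /policies mount through a pod-level volume. The fragment below is a sketch of that wiring; the real definition lives in ./kubernetes/model_service.yaml.</p>

```yaml
volumes:
  - name: policies       # matched by the opa container's volumeMounts entry
    configMap:
      name: policies     # the ConfigMap created above
```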
<p>The ConfigMap is managed separately from the OPA service running in the Pod. Let's view the ConfigMap to make sure it was created successfully.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">describe</span> <span class="n">configmaps</span> <span class="n">policies</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Name</span><span class="p">:</span> <span class="n">policies</span>
<span class="n">Namespace</span><span class="p">:</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span>
<span class="n">Labels</span><span class="p">:</span> <span class="o"><</span><span class="n">none</span><span class="o">></span>
<span class="n">Annotations</span><span class="p">:</span> <span class="o"><</span><span class="n">none</span><span class="o">></span>
<span class="n">Data</span>
<span class="o">====</span>
<span class="n">insurance_charges_model</span><span class="o">.</span><span class="n">rego</span><span class="p">:</span>
<span class="o">----</span>
<span class="n">package</span> <span class="n">insurance_charges_model</span>
<span class="kn">import</span> <span class="nn">future.keywords.contains</span>
<span class="kn">import</span> <span class="nn">future.keywords.if</span>
<span class="n">customer_is_a_smoker_over_60</span> <span class="k">if</span> <span class="p">{</span>
<span class="nb">input</span><span class="o">.</span><span class="n">model_input</span><span class="o">.</span><span class="n">smoker</span>
<span class="nb">input</span><span class="o">.</span><span class="n">model_input</span><span class="o">.</span><span class="n">age</span> <span class="o">></span> <span class="mi">60</span>
<span class="p">}</span>
<span class="n">allow</span> <span class="o">:=</span> <span class="n">true</span> <span class="k">if</span> <span class="p">{</span>
<span class="ow">not</span> <span class="n">customer_is_a_smoker_over_60</span>
<span class="p">}</span> <span class="k">else</span> <span class="o">:=</span> <span class="n">false</span> <span class="p">{</span>
<span class="n">customer_is_a_smoker_over_60</span>
<span class="p">}</span>
<span class="n">messages</span> <span class="n">contains</span> <span class="n">msg</span> <span class="k">if</span> <span class="p">{</span>
<span class="n">customer_is_a_smoker_over_60</span>
<span class="n">msg</span><span class="o">:=</span> <span class="s2">"Prediction cannot be made if customer is a smoker over the age of 60."</span>
<span class="p">}</span>
<span class="n">BinaryData</span>
<span class="o">====</span>
<span class="n">Events</span><span class="p">:</span> <span class="o"><</span><span class="n">none</span><span class="o">></span>
</code></pre></div>
<p>The contents of the ConfigMap match the contents of the original insurance_charges_model.rego file.</p>
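<p>As a mental model, the two Rego rules are equivalent to the following plain-Python predicate. This is illustrative only; in the deployment, the evaluation happens inside OPA:</p>

```python
def evaluate_policy(model_input):
    """Plain-Python mirror of the Rego rules (illustrative only)."""
    # customer_is_a_smoker_over_60 if { input.model_input.smoker; input.model_input.age > 60 }
    customer_is_a_smoker_over_60 = (
        model_input.get("smoker") is True and model_input.get("age", 0) > 60
    )

    # allow := true if the customer is not a smoker over 60, false otherwise
    allow = not customer_is_a_smoker_over_60

    # messages contains msg if the denial condition holds
    messages = []
    if customer_is_a_smoker_over_60:
        messages.append(
            "Prediction cannot be made if customer is a smoker over the age of 60."
        )
    return {"allow": allow, "messages": messages}
```

<p>Calling it with the request bodies used later in this post returns allow=False with one message for a 62-year-old smoker, and allow=True with no messages for a 42-year-old non-smoker.</p>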
<p>We're almost ready to start the model service, but first we'll need to load the Docker image from the local Docker daemon into the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<p>We can view the images in the minikube cache with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>docker.io/library/insurance_charges_model_service:0.1.0
</code></pre></div>
<p>The model service resources are created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/model-service-configuration created
deployment.apps/insurance-charges-model-deployment created
service/insurance-charges-model-service created
</code></pre></div>
<p>Let's get the names of the pods that are running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                                                 READY   STATUS    RESTARTS   AGE
insurance-charges-model-deployment-66ff696fd-zbzdv   2/2     Running   0          29s
</code></pre></div>
<p>To make sure the service started up correctly, we'll check the logs of the model service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="mi">66</span><span class="n">ff696fd</span><span class="o">-</span><span class="n">zbzdv</span> <span class="o">-</span><span class="n">c</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="o">/</span><span class="n">usr</span><span class="o">/</span><span class="n">local</span><span class="o">/</span><span class="n">lib</span><span class="o">/</span><span class="n">python3</span><span class="mf">.9</span><span class="o">/</span><span class="n">site</span><span class="o">-</span><span class="n">packages</span><span class="o">/</span><span class="n">tpot</span><span class="o">/</span><span class="n">builtins</span><span class="o">/</span><span class="fm">__init__</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span> <span class="ne">UserWarning</span><span class="p">:</span> <span class="ne">Warning</span><span class="p">:</span> <span class="n">optional</span> <span class="n">dependency</span> <span class="err">`</span><span class="n">torch</span><span class="err">`</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">available</span><span class="o">.</span> <span class="o">-</span> <span class="n">skipping</span> <span class="kn">import</span> <span class="nn">of</span> <span class="n">NN</span> <span class="n">models</span><span class="o">.</span>
<span class="n">warnings</span><span class="o">.</span><span class="n">warn</span><span class="p">(</span><span class="s2">"Warning: optional dependency `torch` is not available. - skipping import of NN models."</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Started</span> <span class="n">server</span> <span class="n">process</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Waiting</span> <span class="k">for</span> <span class="n">application</span> <span class="n">startup</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Application</span> <span class="n">startup</span> <span class="n">complete</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Uvicorn</span> <span class="n">running</span> <span class="n">on</span> <span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="mf">0.0.0.0</span><span class="p">:</span><span class="mi">8000</span> <span class="p">(</span><span class="n">Press</span> <span class="n">CTRL</span><span class="o">+</span><span class="n">C</span> <span class="n">to</span> <span class="n">quit</span><span class="p">)</span>
</code></pre></div>
<p>Looks like the server process started correctly in the Docker container. The UserWarning is generated when we instantiate the model object, which means everything is running as expected.</p>
<p>We can also view the logs of the OPA service sidecar:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="mi">66</span><span class="n">ff696fd</span><span class="o">-</span><span class="n">zbzdv</span> <span class="o">-</span><span class="n">c</span> <span class="n">opa</span> <span class="o">|</span> <span class="n">head</span> <span class="o">-</span><span class="n">n</span> <span class="mi">5</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"addrs":[":8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server.","time":"2022-09-21T14:09:05Z"}
{"level":"warning","msg":"OPA running with uid or gid 0. Running OPA with root privileges is not recommended. Use the -rootless image to avoid running with root privileges. This will be made the default in later OPA releases.","time":"2022-09-21T14:09:05Z"}
{"client_addr":"172.17.0.1:48928","level":"info","msg":"Received request.","req_id":1,"req_method":"GET","req_path":"/","time":"2022-09-21T14:09:13Z"}
{"client_addr":"172.17.0.1:48928","level":"info","msg":"Sent response.","req_id":1,"req_method":"GET","req_path":"/","resp_bytes":1391,"resp_duration":2.031405,"resp_status":200,"time":"2022-09-21T14:09:13Z"}
{"client_addr":"172.17.0.1:48930","level":"info","msg":"Received request.","req_id":2,"req_method":"GET","req_path":"/health","time":"2022-09-21T14:09:13Z"}
</code></pre></div>
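<p>For debugging, the policy can also be evaluated directly through the OPA sidecar's Data API, bypassing the model service. The sketch below assumes the sidecar is reachable at localhost:8181 (for example through kubectl port-forward); the package path comes from the Rego package declaration:</p>

```python
import json
from urllib.request import Request, urlopen

def query_opa(policy_input, base_url="http://localhost:8181"):
    """Evaluate the policy through OPA's Data API.

    The package path matches the `package insurance_charges_model`
    declaration; the base_url is an assumption about where the
    sidecar is reachable from.
    """
    request = Request(
        f"{base_url}/v1/data/insurance_charges_model",
        data=json.dumps({"input": policy_input}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request) as response:
        return json.loads(response.read())["result"]

# Example input shaped like the decorator's policy query:
policy_input = {"model_input": {"age": 62, "sex": "male", "bmi": 22,
                                "children": 5, "smoker": True,
                                "region": "southwest"}}
```

<p>The returned result dictionary contains the allow and messages documents computed by the policy.</p>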
<p>The deployment and service for the model service were created together. You can see the new service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                              TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
insurance-charges-model-service   NodePort   10.107.89.237   <none>        80:30468/TCP   59s
</code></pre></div>
<p>Minikube exposes the service on a local port; we can get a link to the endpoint with this command:</p>
<div class="highlight"><pre><span></span><code>minikube service insurance-charges-model-service --url -n model-services
</code></pre></div>
<p>The command output this URL:</p>
<div class="highlight"><pre><span></span><code>http://192.168.59.100:30468
</code></pre></div>
<p>The command must keep running to hold the tunnel open to the model service in the minikube cluster.</p>
<p>To make a prediction, we'll hit the service with a request:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:30468/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 62, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s2">"</span><span class="s">messages</span><span class="s2">"</span>:[<span class="s2">"</span><span class="s">Prediction cannot be made if customer is a smoker over the age of 60.</span><span class="s2">"</span>]}
</code></pre></div>
<p>We have the model service up and running in the local minikube cluster!</p>
<p>Looks like the policy was evaluated and the PredictionNotAvailable schema was returned. Let's try it with a request that we know will return a prediction:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:30468/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 42, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: false, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":9762.69}
</code></pre></div>
<p>The service is up and running with the OPA sidecar and the decorator is able to interact with the sidecar correctly to evaluate the policy we created.</p>
<h3>Deleting the Resources</h3>
<p>We're done working with the Kubernetes resources, so we will delete them and shut down the cluster.</p>
<p>To delete the policies ConfigMap, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="n">configmap</span> <span class="n">policies</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "policies" deleted
</code></pre></div>
<p>To delete the model service pods, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "model-service-configuration" deleted
deployment.apps "insurance-charges-model-deployment" deleted
service "insurance-charges-model-service" deleted
</code></pre></div>
<p>To delete the model-services namespace, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
resourcequota "model-services-resource-quota" deleted
</code></pre></div>
<p>To shut down the Kubernetes cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post we showed how to deploy a machine learning model with a decorator that applies policies to the model's predictions. We built the policy using the Rego language and executed it with the Open Policy Agent. By adding the policy as a decorator, we're able to decouple the model's prediction logic from the policy logic, which makes both components more reusable and easier to test. In fact, the policy decorator can easily be reused in other ML deployments, as long as we write a policy that matches our model's needs. </p>
<p>By writing the policy in an industry-standard language we’re enabling people that don’t have experience with ML or ML deployments to create complex policies that can be deployed alongside an ML model. The person that writes these policies is often a subject matter expert that understands the domain within which the model is working and the effect that the model’s operation will have on it. By using a policy-based approach to the problem of checking ML model predictions we’re able to simplify the deployment process as well, since a policy can be developed and deployed separately from the ML model deployment.</p>
<p>Adding the OPA sidecar to the deployment increased the complexity of the software because we now have to deploy an extra container in the Kubernetes pod to run the policy. This approach also increased the latency of each prediction, since executing the policy requires inter-process communication for every prediction that the model service makes. For both of these reasons, Rego and the Open Policy Agent may not be the ideal choice for every model deployment. In some situations it may be better to write the policy in Python and deploy it as a decorator alongside the model, so that the policy decision adds less time to the total prediction time.</p>Load Tests for ML Models2022-09-01T07:00:00-05:002022-09-01T07:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2022-09-01:/load-tests-for-ml-models.html<p>In a <a href="https://www.tekhnoal.com/rest-model-service.html">previous blog post</a> we showed how to create a RESTful model service for a machine learning model that we want to deploy. A common requirement for RESTful services is to be able to continue working while being used by many users at the same time. In this blog post we'll show how to create a load testing script for an ML model service.</p><h1>Load Tests for ML Models</h1>
<p>In a <a href="https://www.tekhnoal.com/rest-model-service.html">previous blog post</a> we showed how to create a RESTful model service for a machine learning model that we want to deploy. A common requirement for RESTful services is to be able to continue working while being used by many users at the same time. In this blog post we'll show how to create a load testing script for an ML model service.</p>
<p>This blog post was written in a Jupyter notebook; some of the code and commands found in it reflect this.</p>
<h2>Introduction</h2>
<p>Deploying machine learning models is always done in the context of a bigger software system into which the ML model is being integrated. The ML model needs to be integrated correctly, and the deployed model needs to meet the requirements of the system into which it is being deployed. The requirements that a system must meet are often categorized into two types: functional requirements and non-functional requirements. <a href="https://en.wikipedia.org/wiki/Functional_requirement">Functional requirements</a> are the specific behaviors that a system must have in order to do its assigned tasks. <a href="https://en.wikipedia.org/wiki/Non-functional_requirement">Non-functional requirements</a> are the operational standards that the system must meet in order to do its assigned tasks. An example of a non-functional requirement is resilience, which is the quality that allows a system to experience errors in its operation and still provide an acceptable level of service. Non-functional requirements are often hard to measure objectively, but we can definitely tell when they are missing from a system. In this blog post we'll be dealing with non-functional requirements related to load.</p>
<p>Non-functional requirements can be stated by using <a href="https://en.wikipedia.org/wiki/Service_level_indicator">Service Level Indicators (SLI)</a>. An SLI is simply a metric that measures an aspect of the function of the system. For example, the latency of a system is the amount of time it takes for the system to fulfill one request from beginning to end. An SLI needs to be well-defined and understood by both the clients and operators of a system because it forms the basis for service level objectives. Some examples of SLIs are latency, throughput, availability, error rate, and durability.</p>
<p><a href="https://en.wikipedia.org/wiki/Service-level_objective">Service level objectives (SLO)</a> are requirements on the operation of a system as measured through the SLIs of the system. SLOs are defined and agreed-upon ways to tell when a system is operating outside of the required performance standard. For example, when measuring latency a valid SLO would be something like this: "the latency of the system must be 500 ms or less for 90% of requests". When measuring error rates, an SLO might say "the number of errors must not exceed 10 for every 10,000 requests made to the system".</p>
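<p>The percentile calculation behind a latency SLO like the one above can be sketched in a few lines of Python. The latency samples below are hypothetical, purely to illustrate the check:</p>

```python
# Hypothetical latency samples, in milliseconds, for ten requests.
latencies_ms = [120, 340, 95, 450, 490, 220, 130, 480, 180, 250]

# SLO: "the latency of the system must be 500 ms or less for 90% of requests".
# We check it by computing the 90th-percentile latency (nearest-rank method).
latencies_sorted = sorted(latencies_ms)
rank = max(1, round(0.9 * len(latencies_sorted)))
p90_latency = latencies_sorted[rank - 1]

slo_met = p90_latency <= 500.0
```

<p>A monitoring system would compute the same percentile over a rolling window of real request latencies rather than a fixed list.</p>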
<p><a href="https://en.wikipedia.org/wiki/Service-level_agreement">Service Level Agreements (SLA)</a> are an agreement between a system and its clients about the "level" at which the system will provide its services. SLAs can contain many different types of clauses, the ones we are interested today are the non-functional aspects of the system as measured by SLIs and constrained by SLOs. </p>
<p>Load testing is the process by which we verify that an ML model deployed as a service is able to meet the SLA of the service while under load. Some of the SLIs that we will be measuring are latency, throughput, and error rate.</p>
<p>All of the code for this blog post is available in <a href="https://github.com/schmidtbri/load-tests-for-ml-models">this github repository</a>.</p>
<h2>Installing the Model</h2>
<p>To make this blog post a little shorter we won't train a completely new model. Instead we'll install a model that we've <a href="https://www.tekhnoal.com/regression-model.html">built in a previous blog post</a>. The code for the model is in <a href="https://github.com/schmidtbri/regression-model">this github repository</a>.</p>
<p>To install the model, we can use the pip command and point it at the github repo of the model.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">Markdown</span> <span class="k">as</span> <span class="n">md</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">regression</span><span class="o">-</span><span class="n">model</span><span class="c1">#egg=insurance_charges_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction with the model, we'll import the model's class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.model</span> <span class="kn">import</span> <span class="n">InsuranceChargesModel</span>
</code></pre></div>
<p>Now we can instantiate the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction, we'll need to use the model's input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span><span class="p">,</span> \
<span class="n">SexEnum</span><span class="p">,</span> <span class="n">RegionEnum</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
</code></pre></div>
<p>The model's input schema is called InsuranceChargesModelInput and it encompasses all of the features required by the model to make a prediction.</p>
<p>Now we can make a prediction with the model by calling the predict() method with an instance of the InsuranceChargesModelInput class.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The model predicts that the charges will be $8640.78.</p>
<p>We can view the input schema of the model as a JSON schema document by calling the .schema() method on the class.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'InsuranceChargesModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'age'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age of primary beneficiary in years.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sex'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Gender of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/SexEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'bmi'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body Mass Index'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body mass index of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">15.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">50.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'children'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Children'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Number of children covered by health insurance.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether beneficiary is a smoker.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'boolean'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region where beneficiary lives.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/RegionEnum'</span><span class="p">}]}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'SexEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'SexEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'sex' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'male'</span><span class="p">,</span><span class="w"> </span><span class="s1">'female'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'region' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'southwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'southeast'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northeast'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>We'll make use of the model's input schema to create the load testing script.</p>
<h2>Profiling the Model</h2>
<p>In order to get an idea of how much time it takes for our model to make a prediction, we'll profile it by making predictions with random data. To do this, we'll use the <a href="https://faker.readthedocs.io/en/master/">Faker package</a>. We can install it with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">Faker</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll create a function that can generate a random sample that meets the model's input schema:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">faker</span> <span class="kn">import</span> <span class="n">Faker</span>
<span class="n">faker</span> <span class="o">=</span> <span class="n">Faker</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">generate_record</span><span class="p">()</span> <span class="o">-></span> <span class="n">InsuranceChargesModelInput</span><span class="p">:</span>
<span class="n">record</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">65</span><span class="p">),</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"male"</span><span class="p">,</span> <span class="s2">"female"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">],</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">15000</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">50000</span><span class="p">)</span><span class="o">/</span><span class="mf">1000.0</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">5</span><span class="p">),</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">boolean</span><span class="p">(),</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"southwest"</span><span class="p">,</span> <span class="s2">"southeast"</span><span class="p">,</span> <span class="s2">"northwest"</span><span class="p">,</span> <span class="s2">"northeast"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span><span class="o">**</span><span class="n">record</span><span class="p">)</span>
</code></pre></div>
<p>The function returns an instance of the InsuranceChargesModelInput class, which is the type required by the model's predict() method. We'll use this function to profile the predict() method of the model.</p>
<p>It's really hard to get a complete picture of the model's performance from one sample, so we'll run the test with many random samples. To start, we'll generate 1000 samples and save them:</p>
<div class="highlight"><pre><span></span><code><span class="n">samples</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span>
<span class="n">samples</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">generate_record</span><span class="p">())</span>
</code></pre></div>
<p>By using the timeit module from the standard library, we can measure how much time it takes to call the model's predict method with a random sample. We'll make 1000 predictions.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">timeit</span>
<span class="n">total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">seconds_per_sample</span> <span class="o">=</span> <span class="n">total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">milliseconds_per_sample</span> <span class="o">=</span> <span class="n">seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The model took 31.74 seconds to perform 1000 predictions, so a single prediction took about 31.74 milliseconds (0.032 seconds).</p>
<p>We now have enough information to establish an SLO for the model itself. An acceptable amount of time for the model to make a prediction is 100 ms (this is made up for the sake of the example). Based on the results from the test above, we're pretty sure that the model meets this standard. However, we want to write the requirement directly into the code of the notebook. To do this in a notebook cell, we can simply write an assert statement which checks for the condition:</p>
<div class="highlight"><pre><span></span><code><span class="k">assert</span> <span class="n">milliseconds_per_sample</span> <span class="o"><</span> <span class="mi">100</span><span class="p">,</span> <span class="s2">"Model does not meet the latency SLO."</span>
</code></pre></div>
<p>The assertion above did not fail, so the model meets the requirement. This is an example of a way to encode an SLO for the model so that it is checked programmatically. We can add code like this to the training code of a model so that we always check the SLO right after a model is trained. If the requirement is not met, the assert statement will cause the notebook to stop executing immediately.</p>
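<p>A mean latency can hide outliers, so it is often useful to record per-prediction timings and compute percentiles as well. The sketch below uses a trivial stand-in for the model's predict() method (the real model call would replace <code>predict_fn</code>), since the timing logic is the same either way:</p>

```python
import time

def predict_fn(sample):
    # Stand-in for model.predict(sample); replace with the real model call.
    return sum(sample)

samples = [[1.0, 2.0, 3.0]] * 100

# Record the latency of each call individually instead of only the total.
latencies_ms = []
for sample in samples:
    start = time.perf_counter()
    predict_fn(sample)
    latencies_ms.append((time.perf_counter() - start) * 1000.0)

latencies_ms.sort()
p50_ms = latencies_ms[len(latencies_ms) // 2]
p90_ms = latencies_ms[int(len(latencies_ms) * 0.9) - 1]
```

<p>An SLO assertion written against the 90th-percentile latency would then catch tail-latency regressions that a check on the mean might miss.</p>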
<p>We've profiled the model and this gave us some information about its performance, but a real load test can only be performed on the model once it is deployed. In the real world, users of the model will access it concurrently, while in the example above the model made predictions serially for a single caller. The model was also running in the local memory of the computer, whereas in a real deployment a RESTful service would wrap the model and it would be accessed over the network.</p>
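<p>The effect of concurrency is easy to see with a small simulation. The sketch below uses a fake request function that simply sleeps for 10 ms, standing in for a network call to the model service; with ten simulated users the same 50 requests finish much sooner than they do serially:</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(sample):
    # Stand-in for an HTTP call to the model service.
    time.sleep(0.01)
    return {"charges": 0.0}

samples = list(range(50))

# Serial execution: one request at a time, as in the profiling example.
start = time.perf_counter()
serial_results = [fake_request(s) for s in samples]
serial_seconds = time.perf_counter() - start

# Concurrent execution: ten simulated users sending requests at once.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    concurrent_results = list(pool.map(fake_request, samples))
concurrent_seconds = time.perf_counter() - start
```

<p>This is exactly the gap that a load testing tool like locust explores: throughput and latency under concurrent access, not just serial execution speed.</p>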
<h2>Creating the Model Service</h2>
<p>Now that we have profiled the model, we can deploy the model inside of a RESTful service and do a load test on it. To do this, we'll use the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> to quickly create a RESTful service. You can learn more about this package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all we need to do is add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>This YAML file is in the "configuration" folder of the project repository.</p>
<p>The service_title field is the name of the service as it will appear in the documentation. The models field is an array that contains the details of the models we would like to deploy in the service. The class_path points at the MLModel class that implements the model's prediction logic; in this case we'll be using the same model as in the examples above.</p>
<p>To run the service locally, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/local_rest_config.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should come up and can be accessed in a web browser at http://127.0.0.1:8000. When you access that URL using a web browser you will be redirected to the documentation page that is generated by the FastAPI package. The documentation looks like this:</p>
<p><img alt="Service Documentation" src="https://www.tekhnoal.com/service_documentation_ltfmlm.png" width="100%"></p>
<p>As you can see, the Insurance Charges Model got its own endpoint.</p>
<p>We can try out the service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 42, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">female</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 24.0, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 2, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: false, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">northwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":8640.78}
</code></pre></div>
<p>By accessing the model's endpoint we were able to make a prediction, and we got exactly the same result as when we called the model directly in the example above.</p>
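<p>The same request can be made from Python. The payload below matches the curl example; the actual HTTP call (shown commented out, using the requests package) assumes the service is running locally:</p>

```python
import json

# Payload matching the model's input schema, same values as the curl example.
payload = {
    "age": 42,
    "sex": "female",
    "bmi": 24.0,
    "children": 2,
    "smoker": False,
    "region": "northwest",
}

body = json.dumps(payload)

# With the service running at http://127.0.0.1:8000, the prediction could be
# requested like this (requires the requests package, not executed here):
# import requests
# response = requests.post(
#     "http://127.0.0.1:8000/api/models/insurance_charges_model/prediction",
#     json=payload,
# )
# print(response.json())
```
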
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package we're able to quickly stand up a service to host the model. </p>
<h2>Creating a Load Testing Script</h2>
<p>To create a load testing script, we'll use the <a href="https://locust.io/">locust package</a>. We'll install the package with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">locust</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>In order to run a load test with locust, we need to define what requests locust will make to the model service. To do this we define a subclass of HttpUser.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">locust</span> <span class="kn">import</span> <span class="n">HttpUser</span><span class="p">,</span> <span class="n">constant_throughput</span><span class="p">,</span> <span class="n">task</span>
<span class="kn">from</span> <span class="nn">faker</span> <span class="kn">import</span> <span class="n">Faker</span>
<span class="k">class</span> <span class="nc">ModelServiceUser</span><span class="p">(</span><span class="n">HttpUser</span><span class="p">):</span>
<span class="n">wait_time</span> <span class="o">=</span> <span class="n">constant_throughput</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="nd">@task</span>
<span class="k">def</span> <span class="nf">post_prediction</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">faker</span> <span class="o">=</span> <span class="n">Faker</span><span class="p">()</span>
<span class="n">record</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">65</span><span class="p">),</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"male"</span><span class="p">,</span> <span class="s2">"female"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">],</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">15000</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">50000</span><span class="p">)</span> <span class="o">/</span> <span class="mf">1000.0</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">5</span><span class="p">),</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">boolean</span><span class="p">(),</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span>
<span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"southwest"</span><span class="p">,</span> <span class="s2">"southeast"</span><span class="p">,</span> <span class="s2">"northwest"</span><span class="p">,</span> <span class="s2">"northeast"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="p">}</span>
<span class="bp">self</span><span class="o">.</span><span class="n">client</span><span class="o">.</span><span class="n">post</span><span class="p">(</span><span class="s2">"/api/models/insurance_charges_model/prediction"</span><span class="p">,</span> <span class="n">json</span><span class="o">=</span><span class="n">record</span><span class="p">)</span>
</code></pre></div>
<p>The class above makes a single request to the prediction endpoint in the model service, generating a random sample using the same code that we used to profile the model above. The load test consists of a single task that will be executed over and over, but we can easily add other tasks if we want to exercise the model in different ways. The wait_time attribute of the class is set to a constant throughput of 1, which means that each task will be executed at most one time per second by each simulated user in the load test. We can use this throughput and the number of concurrent users to create a realistic load test profile.</p>
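<p>As a quick sanity check on that arithmetic, the expected request rate is simply the product of the user count and the per-user throughput. The sketch below is purely illustrative; the expected_rps helper is our own and not part of the locust API:</p>

```python
# constant_throughput(1) caps each simulated user at one task per
# second, so the swarm's expected request rate is the product of the
# user count and the per-user throughput. This helper is only an
# illustration of that arithmetic, not part of the locust API.
def expected_rps(users, per_user_throughput=1.0):
    """Requests per second the swarm generates, ignoring response latency."""
    return users * per_user_throughput

print(expected_rps(1))   # one user at one request per second -> 1.0
print(expected_rps(10))  # ten users -> 10.0 requests per second
```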
<p>The code above is saved in the load_test.py file in the tests folder in the repository. We can launch a load test with this command:</p>
<div class="highlight"><pre><span></span><code>locust -f tests/load_test.py
</code></pre></div>
<p>The load test process starts up a web app that can be accessed locally on http://127.0.0.1:8089.</p>
<p><img alt="Locust UI" src="https://www.tekhnoal.com/locust_ui_ltfmlm.png" width="100%"></p>
<p>To start a load test, the locust web app asks for the number of users to simulate, the spawn rate of users, and the base url of the service to send requests to. We set the number of users to 1, the spawn rate to 1 per second, and the url to the service instance that is currently running on the local host.</p>
<p>When we click on the "Start swarming" button, the load test starts and we can see this screen:</p>
<p><img alt="Locust Load Test" src="https://www.tekhnoal.com/locust_load_test_ltfmlm.png" width="100%"></p>
<p>The load test is running and sending requests to the model service at the rate of one request per second from one user. The web UI also shows some charts in a separate tab in the UI, for example the total requests per second:</p>
<p><img alt="Total Requests Per Second" src="https://www.tekhnoal.com/total_requests_per_second_ltfmlm.png" width="100%"></p>
<p>The response time in milliseconds:</p>
<p><img alt="Response Time in Milliseconds" src="https://www.tekhnoal.com/response_time_in_milliseconds_ltfmlm.png" width="100%"></p>
<p>And the number of users:</p>
<p><img alt="Number Of Users" src="https://www.tekhnoal.com/number_of_users_ltfmlm.png" width="100%"></p>
<p>When we're ready to stop the load test, we can click on the "Stop" button in the upper right corner.</p>
<p>Determining whether the model service meets the SLO is as simple as inspecting the "Statistics" tab.</p>
<p><img alt="Statistics Tab" src="https://www.tekhnoal.com/statistics_ltfmlm.png" width="100%"></p>
<p>We can see that the maximum latency of the prediction requests was 122 milliseconds, which does not meet our SLO of 100 ms. However, the maximum is often a noisy measurement because it can be affected by many environmental factors. It's better to use the 90th or 99th percentile. In this case the 99th percentile is 89 ms, which does meet our SLO.</p>
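<p>To see why a tail percentile is a steadier measurement than the max, here is a small standalone sketch using the nearest-rank percentile method; the latency samples are made up for illustration:</p>

```python
# Nearest-rank percentile over a list of observed response times (in
# milliseconds). The sample latencies below are made up; a real load
# test would collect thousands of them.
def percentile(samples, p):
    """Return the value at percentile p (0-100) using the nearest-rank method."""
    ordered = sorted(samples)
    k = max(0, round(p / 100.0 * len(ordered)) - 1)
    return ordered[k]

latencies = [52, 60, 61, 63, 70, 72, 75, 80, 89, 122]
print(max(latencies))             # 122 - dominated by a single outlier
print(percentile(latencies, 90))  # 89 - a steadier tail measurement
```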
<p>This load test is not very realistic because it only has one concurrent user. In the next load tests, we'll add more concurrent users to make it more realistic.</p>
<h2>Adding Shape to the Load Test</h2>
<p>Right now the load test script simulates one concurrent user making one request to the service per second. This is a good place to start, but we should test the service with more users. The load test is also designed to run indefinitely with the same number of users. We'll add "shape" to the load test by raising the number of users over time and then lowering the number of users back down. This will show us the performance of the service under many load conditions. We'll also stop the load test once the load returns to the baseline, which will help us to automate the load test later.</p>
<p>To add a "shape" to the load test, we'll add a class that is a subclass of LoadTestShape to the load test file:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">locust</span> <span class="kn">import</span> <span class="n">LoadTestShape</span>
<span class="k">class</span> <span class="nc">StagesShape</span><span class="p">(</span><span class="n">LoadTestShape</span><span class="p">):</span>
<span class="sd">"""Simple load test shape class."""</span>
<span class="n">stages</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">30</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">60</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">90</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">120</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">150</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">180</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">210</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">240</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"duration"</span><span class="p">:</span> <span class="mi">270</span><span class="p">,</span> <span class="s2">"users"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">"spawn_rate"</span><span class="p">:</span> <span class="mi">1</span><span class="p">}</span>
<span class="p">]</span>
<span class="k">def</span> <span class="nf">tick</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">run_time</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_run_time</span><span class="p">()</span>
<span class="k">for</span> <span class="n">stage</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">stages</span><span class="p">:</span>
<span class="k">if</span> <span class="n">run_time</span> <span class="o"><</span> <span class="n">stage</span><span class="p">[</span><span class="s2">"duration"</span><span class="p">]:</span>
<span class="n">tick_data</span> <span class="o">=</span> <span class="p">(</span><span class="n">stage</span><span class="p">[</span><span class="s2">"users"</span><span class="p">],</span> <span class="n">stage</span><span class="p">[</span><span class="s2">"spawn_rate"</span><span class="p">])</span>
<span class="k">return</span> <span class="n">tick_data</span>
<span class="c1"># returning None to stop the load test</span>
<span class="k">return</span> <span class="kc">None</span>
</code></pre></div>
<p>The tick() method is called once per second by the locust framework to determine how many users are needed and how fast to spawn them. It looks up the desired number of users and spawn rate in the stages list, simply iterating through the list until it finds the correct stage based on the number of seconds elapsed since the beginning of the load test. We defined 9 stages in the stages list, each lasting 30 seconds, and the maximum number of concurrent users will be 5.</p>
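<p>The lookup logic is easy to verify on its own. This standalone sketch reproduces it with a shortened three-stage list:</p>

```python
# A standalone reproduction of the stage-lookup logic in tick(), using
# a shortened stages list. Durations are cumulative: the first stage is
# active until 30 seconds, the second until 60, and so on.
stages = [
    {"duration": 30, "users": 1, "spawn_rate": 1},
    {"duration": 60, "users": 2, "spawn_rate": 1},
    {"duration": 90, "users": 3, "spawn_rate": 1},
]

def users_for(run_time, stages):
    """Return (users, spawn_rate) for the elapsed run_time, or None to stop."""
    for stage in stages:
        if run_time < stage["duration"]:
            return stage["users"], stage["spawn_rate"]
    return None  # past the last stage, so the load test stops

print(users_for(45, stages))   # (2, 1): the second stage covers 30-60 seconds
print(users_for(100, stages))  # None: past the last stage
```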
<p>To run the load test, simply execute the same command as above:</p>
<div class="highlight"><pre><span></span><code>locust -f tests/load_test.py
</code></pre></div>
<p>The load test will start when we press the "Start swarming" button, as before. However, this load test will vary the number of users according to the shape defined in the class. Since the number of users and the spawn rate are determined by the shape class, we don't need to provide these values to start the load test.</p>
<p>The load test runs for four and a half minutes and the number of users chart looks like this:</p>
<p><img alt="Number Of Users" src="https://www.tekhnoal.com/shaped_number_of_users_ltfmlm.png" width="100%"></p>
<p>The response time chart looks like this:</p>
<p><img alt="Response Time" src="https://www.tekhnoal.com/shaped_response_time_ltfmlm.png" width="100%"></p>
<p>The response time of the service definitely suffered when the number of users went above one, and the maximum response time of the service was 225 ms. It looks like a single instance of the model service cannot handle much more than one concurrent user making one request per second.</p>
<p>The requests per second chart looks like this:</p>
<p><img alt="Requests Per Second" src="https://www.tekhnoal.com/shaped_requests_per_second_ltfmlm.png" width="100%"></p>
<p>The number of requests per second scaled with the number of users because we're making one request per second per user.</p>
<h2>Adding Service Level Objectives</h2>
<p>Right now, the load test script simply runs the load test and displays the results on a webpage. However, we can make it more useful by adding support for SLOs. For example, we can have the load test fail if the latency of any request is above a certain threshold, or if the average latency of all requests is above a certain threshold.</p>
<p>We'll add support for checking the following SLOs:
- latency, we'll check that the latency at the 99th percentile is less than 100 ms
- error rate, we'll check that there are no errors returned on any request
- throughput, we'll check that the service can handle at least 5 requests per second</p>
<p>To do this we'll add a listener function that receives events from the locust package:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">logging</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="nd">@events</span><span class="o">.</span><span class="n">test_stop</span><span class="o">.</span><span class="n">add_listener</span>
<span class="k">def</span> <span class="nf">on_test_stop</span><span class="p">(</span><span class="n">environment</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="n">process_exit_code</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">max_requests_per_second</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span>
<span class="p">[</span><span class="n">requests_per_second</span> <span class="k">for</span> <span class="n">requests_per_second</span> <span class="ow">in</span> <span class="n">environment</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">total</span><span class="o">.</span><span class="n">num_reqs_per_sec</span><span class="o">.</span><span class="n">values</span><span class="p">()])</span>
<span class="k">if</span> <span class="n">environment</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">total</span><span class="o">.</span><span class="n">fail_ratio</span> <span class="o">></span> <span class="mf">0.0</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Test failed because there was one or more errors."</span><span class="p">)</span>
<span class="n">process_exit_code</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="n">environment</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">total</span><span class="o">.</span><span class="n">get_response_time_percentile</span><span class="p">(</span><span class="mf">0.99</span><span class="p">)</span> <span class="o">></span> <span class="mi">100</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s2">"Test failed because the response time at the 99th percentile was above 100 ms. The 99th "</span>
<span class="s2">"percentile latency is '</span><span class="si">{}</span><span class="s2">'."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">environment</span><span class="o">.</span><span class="n">stats</span><span class="o">.</span><span class="n">total</span><span class="o">.</span><span class="n">get_response_time_percentile</span><span class="p">(</span><span class="mf">0.99</span><span class="p">)))</span>
<span class="n">process_exit_code</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">if</span> <span class="n">max_requests_per_second</span> <span class="o"><</span> <span class="mi">5</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span>
<span class="s2">"Test failed because the max requests per second never reached 5. The max requests per second "</span>
<span class="s2">"is: '</span><span class="si">{}</span><span class="s2">'."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">max_requests_per_second</span><span class="p">))</span>
<span class="n">process_exit_code</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">environment</span><span class="o">.</span><span class="n">process_exit_code</span> <span class="o">=</span> <span class="n">process_exit_code</span>
</code></pre></div>
<p>The on_test_stop function executes at the end of every load test. The function has access to all of the statistics gathered during the load test, so we can check each SLO condition against them. If any of the SLOs are not met, we set the process exit code to 1, which signals a failure to the operating system.</p>
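<p>The way the checks combine into a single exit code can be sketched as a pure function, which is also easy to unit test. The thresholds below match the SLOs defined above; the slo_exit_code helper is our own illustration, not part of the load test file:</p>

```python
# A distilled version of the SLO checks in the listener above: any
# failed check flips the exit code to 1, and it stays 1 regardless of
# how many other checks pass. The thresholds match the SLOs defined
# earlier in the post.
def slo_exit_code(fail_ratio, p99_ms, max_rps):
    """Return 0 if all SLOs are met, 1 otherwise."""
    exit_code = 0
    if fail_ratio > 0.0:   # error-rate SLO: no failed requests allowed
        exit_code = 1
    if p99_ms > 100:       # latency SLO: 99th percentile under 100 ms
        exit_code = 1
    if max_rps < 5:        # throughput SLO: at least 5 requests per second
        exit_code = 1
    return exit_code

print(slo_exit_code(0.0, 89, 5))   # 0: every SLO met
print(slo_exit_code(0.0, 180, 5))  # 1: latency SLO violated
```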
<p>To run the load test, execute the same command as above. When the load test finishes, the process will output the results to the command line. In this case the load test failed with this output:</p>
<div class="highlight"><pre><span></span><code>Test failed because the response <span class="nb">time</span> at the 99th percentile was above <span class="m">100</span> ms. The 99th percentile latency is <span class="s1">'180.0'</span>.
</code></pre></div>
<h2>Running a Headless Load Test</h2>
<p>The locust package can also run a load test without the web UI. This is useful for automated load tests that run on a server, without anyone watching the UI. The command is:</p>
<div class="highlight"><pre><span></span><code>locust -f tests/load_test.py --host<span class="o">=</span>http://127.0.0.1:8000 --headless --loglevel ERROR --csv<span class="o">=</span>./load_test_report/load_test --html ./load_test_report/load_test_report.html
</code></pre></div>
<p>Once the test finishes, we see the same error as above because the load test did not meet the required latency SLO. The error message is:</p>
<div class="highlight"><pre><span></span><code>Test failed because the response time at the 99th percentile was above 100 ms. The 99th percentile latency is '180.0'.
</code></pre></div>
<p>All of the code for the load test script is found in the "tests/load_test.py" file in the repository for this blog post. The results are stored in CSV files and an HTML file in the "load_test_report" folder.</p>
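<p>The CSV report can be post-processed without re-running the test. The sketch below parses an in-memory sample that mimics the "&lt;prefix&gt;_stats.csv" file that locust writes for the --csv option; note that the exact column names can vary between locust versions, and the sample rows are made up for illustration:</p>

```python
# Parse the aggregated row from a locust stats CSV report. The sample
# below stands in for the "<prefix>_stats.csv" file written by the
# --csv option; column names can vary between locust versions.
import csv
import io

sample_report = io.StringIO(
    '"Type","Name","Request Count","Failure Count","99%"\n'
    '"POST","/api/models/insurance_charges_model/prediction","270","0","180"\n'
    '"","Aggregated","270","0","180"\n'
)

reader = csv.DictReader(sample_report)
aggregated = next(row for row in reader if row["Name"] == "Aggregated")
print(aggregated["99%"])  # 180: the 99th percentile latency in milliseconds
```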
<h2>Building a Docker Image</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally using Docker. </p>
<p>Let's create a docker image and run it locally. The docker image is generated using instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="c"># syntax=docker/dockerfile:1</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">ARG</span><span class="w"> </span>BUILD_DATE
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Load Tests for ML Models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Load tests for machine learning models."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$BUILD_DATE</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/load-tests-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="s2">"0.1.0"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">/service</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USERNAME</span><span class="o">=</span>service-user
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_UID</span><span class="o">=</span><span class="m">1000</span>
<span class="k">ARG</span><span class="w"> </span><span class="nv">USER_GID</span><span class="o">=</span><span class="m">1000</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update
<span class="c"># create a user</span>
<span class="k">RUN</span><span class="w"> </span>groupadd --gid <span class="nv">$USER_GID</span> <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> useradd --uid <span class="nv">$USER_UID</span> --gid <span class="nv">$USER_GID</span> -m <span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> apt-get install --assume-yes --no-install-recommends sudo <span class="se">\</span>
<span class="o">&&</span> <span class="nb">echo</span> <span class="nv">$USERNAME</span> <span class="nv">ALL</span><span class="o">=</span><span class="se">\(</span>root<span class="se">\)</span> NOPASSWD:ALL > /etc/sudoers.d/<span class="nv">$USERNAME</span> <span class="se">\</span>
<span class="o">&&</span> chmod <span class="m">0440</span> /etc/sudoers.d/<span class="nv">$USERNAME</span>
<span class="c"># installing git because we need to install the model package from it's own github repository</span>
<span class="k">RUN</span><span class="w"> </span>apt-get install --assume-yes --no-install-recommends git
<span class="k">RUN</span><span class="w"> </span>apt-get clean <span class="se">\</span>
<span class="o">&&</span> rm -rf /var/lib/apt/lists/*
<span class="c"># installing dependencies first in order to speed up build by using cached layers</span>
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r service_requirements.txt
<span class="k">COPY</span><span class="w"> </span>./configuration ./configuration
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
<span class="k">USER</span><span class="w"> </span><span class="s">$USERNAME</span>
</code></pre></div>
<p>The Dockerfile is used by this docker command to create a docker image:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span> <span class="o">../</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model_service latest 446f5f06805f 37 seconds ago 1.25GB
</code></pre></div>
<p>Next, we'll start the image to see if everything is working as expected.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">local_rest_config</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">insurance_charges_model_service</span> \
<span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">44</span><span class="n">c4794160f941e44d1670b70c7fd5722c41bf0c2e470a0b0c8648c966b9923b</span><span class="w"></span>
</code></pre></div>
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 42, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">female</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 24.0, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 2, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: false, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">northwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":8640.78}
</code></pre></div>
<p>We'll use the model service Docker image to deploy the model service and automate the load test later.</p>
<p>Now that we're done with the local instance of the model service, we'll stop and remove the docker container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model_service
insurance_charges_model_service
</code></pre></div>
<h2>Deploying the Model Service</h2>
<p>To show the system in action, we'll deploy the model service to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>😄 <span class="nv">minikube</span> <span class="nv">v1</span>.<span class="mi">26</span>.<span class="mi">1</span> <span class="nv">on</span> <span class="nv">Darwin</span> <span class="mi">12</span>.<span class="mi">5</span>
✨ <span class="nv">Using</span> <span class="nv">the</span> <span class="nv">virtualbox</span> <span class="nv">driver</span> <span class="nv">based</span> <span class="nv">on</span> <span class="nv">existing</span> <span class="nv">profile</span>
👍 <span class="nv">Starting</span> <span class="nv">control</span> <span class="nv">plane</span> <span class="nv">node</span> <span class="nv">minikube</span> <span class="nv">in</span> <span class="nv">cluster</span> <span class="nv">minikube</span>
🔄 <span class="nv">Restarting</span> <span class="nv">existing</span> <span class="nv">virtualbox</span> <span class="nv">VM</span> <span class="k">for</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> ...
🐳 <span class="nv">Preparing</span> <span class="nv">Kubernetes</span> <span class="nv">v1</span>.<span class="mi">24</span>.<span class="mi">3</span> <span class="nv">on</span> <span class="nv">Docker</span> <span class="mi">20</span>.<span class="mi">10</span>.<span class="mi">17</span> ...[<span class="nv">K</span>[<span class="nv">K</span>[<span class="nv">K</span>[<span class="nv">K</span>
▪ <span class="nv">controller</span><span class="o">-</span><span class="nv">manager</span>.<span class="nv">horizontal</span><span class="o">-</span><span class="nv">pod</span><span class="o">-</span><span class="nv">autoscaler</span><span class="o">-</span><span class="nv">sync</span><span class="o">-</span><span class="nv">period</span><span class="o">=</span><span class="mi">5</span><span class="nv">s</span>
🔎 <span class="nv">Verifying</span> <span class="nv">Kubernetes</span> <span class="nv">components</span>...
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">k8s</span>.<span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span>:<span class="nv">v0</span>.<span class="mi">6</span>.<span class="mi">1</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">k8s</span><span class="o">-</span><span class="nv">minikube</span><span class="o">/</span><span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>:<span class="nv">v5</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">dashboard</span>:<span class="nv">v2</span>.<span class="mi">6</span>.<span class="mi">0</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">scraper</span>:<span class="nv">v1</span>.<span class="mi">0</span>.<span class="mi">8</span>
🌟 <span class="nv">Enabled</span> <span class="nv">addons</span>: <span class="nv">dashboard</span>
🏄 <span class="nv">Done</span><span class="o">!</span> <span class="nv">kubectl</span> <span class="nv">is</span> <span class="nv">now</span> <span class="nv">configured</span> <span class="nv">to</span> <span class="nv">use</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> <span class="nv">cluster</span> <span class="nv">and</span> <span class="s2">"</span><span class="s">default</span><span class="s2">"</span> <span class="nv">namespace</span> <span class="nv">by</span> <span class="nv">default</span>
</code></pre></div>
<p>We'll need to use the <a href="https://github.com/kubernetes/dashboard">Kubernetes Dashboard</a> to view details about the model service. We can start it up in the minikube cluster with this command:</p>
<div class="highlight"><pre><span></span><code>minikube dashboard --url
</code></pre></div>
<p>The command starts up a proxy that must keep running in order to forward the traffic to the dashboard UI in the minikube cluster.</p>
<p>The dashboard UI looks like this:</p>
<p><img alt="Kubernetes Dashboard" src="https://www.tekhnoal.com/kubernetes_dashboard_ltfmlm.png" width="100%"></p>
<p>We'll also need to use the <a href="https://github.com/kubernetes-sigs/metrics-server#readme">metrics server</a> in Kubernetes. We can enable that in minikube with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">addons</span> <span class="n">enable</span> <span class="n">metrics</span><span class="o">-</span><span class="n">server</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>💡 <span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span> <span class="nv">is</span> <span class="nv">an</span> <span class="nv">addon</span> <span class="nv">maintained</span> <span class="nv">by</span> <span class="nv">Kubernetes</span>. <span class="k">For</span> <span class="nv">any</span> <span class="nv">concerns</span> <span class="nv">contact</span> <span class="nv">minikube</span> <span class="nv">on</span> <span class="nv">GitHub</span>.
<span class="nv">You</span> <span class="nv">can</span> <span class="nv">view</span> <span class="nv">the</span> <span class="nv">list</span> <span class="nv">of</span> <span class="nv">minikube</span> <span class="nv">maintainers</span> <span class="nv">at</span>: <span class="nv">https</span>:<span class="o">//</span><span class="nv">github</span>.<span class="nv">com</span><span class="o">/</span><span class="nv">kubernetes</span><span class="o">/</span><span class="nv">minikube</span><span class="o">/</span><span class="nv">blob</span><span class="o">/</span><span class="nv">master</span><span class="o">/</span><span class="nv">OWNERS</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">k8s</span>.<span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">server</span>:<span class="nv">v0</span>.<span class="mi">6</span>.<span class="mi">1</span>
🌟 <span class="nv">The</span> <span class="s1">'</span><span class="s">metrics-server</span><span class="s1">'</span> <span class="nv">addon</span> <span class="nv">is</span> <span class="nv">enabled</span>
</code></pre></div>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d4b75cb6d-wrrwr 1/1 Running 16 (22h ago) 23d
kube-system etcd-minikube 1/1 Running 16 (22h ago) 23d
kube-system kube-apiserver-minikube 1/1 Running 16 (22h ago) 23d
kube-system kube-controller-manager-minikube 1/1 Running 2 (22h ago) 24h
kube-system kube-proxy-5n4t9 1/1 Running 15 (22h ago) 23d
kube-system kube-scheduler-minikube 1/1 Running 14 (22h ago) 23d
kube-system metrics-server-8595bd7d4c-ptcsp 1/1 Running 12 (22h ago) 4d2h
kube-system storage-provisioner 1/1 Running 25 (24s ago) 23d
kubernetes-dashboard dashboard-metrics-scraper-78dbd9dbf5-xslpl 1/1 Running 8 (22h ago) 4d2h
kubernetes-dashboard kubernetes-dashboard-5fd5574d9f-vbtnd 1/1 Running 10 (22h ago) 4d2h
</code></pre></div>
<p>The pods running the Kubernetes Dashboard and metrics server appear in the kube-system and kubernetes-dashboard namespaces.</p>
<h3>Creating a Kubernetes Namespace</h3>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
</code></pre></div>
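<p>For reference, a Namespace manifest needs little more than a name. A sketch that matches the namespace name shown in the output above (the real file in the repository is the source of truth):</p>

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: model-services
```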
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME STATUS AGE
default Active 23d
kube-node-lease Active 23d
kube-public Active 23d
kube-system Active 23d
kubernetes-dashboard Active 4d2h
model-services Active 1s
</code></pre></div>
<p>The new namespace appears in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Context "minikube" modified.
</code></pre></div>
<h3>Creating a Model Deployment and Service</h3>
<p>The model service is deployed by using Kubernetes resources. These are:</p>
<ul>
<li>ConfigMap: a set of configuration options; in this case, a simple YAML file that is loaded into the running container as a volume mount. This resource allows us to change the configuration of the model service without having to modify the Docker image.</li>
<li>Deployment: a declarative way to manage a set of pods; the model service pods are managed through the Deployment.</li>
<li>Service: a way to expose the set of pods in a Deployment; the model service is made available to the outside world through the Service. The service type is LoadBalancer, which means that a load balancer will be created for the service.</li>
</ul>
<p>These resources are defined in the ./kubernetes/model_service.yaml file in the project repository.</p>
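<p>To make the structure concrete, here is an illustrative sketch of the three resources. The label names, mounted file name, and some values are assumptions for illustration; the names, ports, and CPU request match those that appear elsewhere in this post, and the actual manifest in the repository is the source of truth:</p>

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-service-configuration
data:
  # configuration file mounted into the container as a volume
  service_config.yaml: |
    # (service configuration goes here)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: insurance-charges-model-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: insurance-charges-model-service
  template:
    metadata:
      labels:
        app: insurance-charges-model-service
    spec:
      containers:
        - name: insurance-charges-model
          image: insurance_charges_model_service:latest
          imagePullPolicy: Never   # use the image loaded into the minikube cache
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: 100m
---
apiVersion: v1
kind: Service
metadata:
  name: insurance-charges-model-service
spec:
  type: LoadBalancer
  selector:
    app: insurance-charges-model-service
  ports:
    - port: 80
      targetPort: 8000
```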
<p>To start the model service, first we'll need to send the docker image from the local docker daemon to the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span>
</code></pre></div>
<p>We can view the images in the minikube cache like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">cache</span> <span class="nb">list</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">insurance_charges_model_service</span><span class="o">:</span><span class="n">latest</span><span class="w"></span>
</code></pre></div>
<p>The model service resources are created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap/model-service-configuration created
deployment.apps/insurance-charges-model-deployment created
service/insurance-charges-model-service created
</code></pre></div>
<p>Let's get the names of the pods that are running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
insurance-charges-model-deployment-5454fc7cfb-rhl2t 1/1 Running 0 4s
</code></pre></div>
<p>To make sure the service started up correctly, we'll check the logs of the single pod running the service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">logs</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">-</span><span class="mi">5454</span><span class="n">fc7cfb</span><span class="o">-</span><span class="n">rhl2t</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="o">/</span><span class="n">usr</span><span class="o">/</span><span class="n">local</span><span class="o">/</span><span class="n">lib</span><span class="o">/</span><span class="n">python3</span><span class="mf">.9</span><span class="o">/</span><span class="n">site</span><span class="o">-</span><span class="n">packages</span><span class="o">/</span><span class="n">tpot</span><span class="o">/</span><span class="n">builtins</span><span class="o">/</span><span class="fm">__init__</span><span class="o">.</span><span class="n">py</span><span class="p">:</span><span class="mi">36</span><span class="p">:</span> <span class="ne">UserWarning</span><span class="p">:</span> <span class="ne">Warning</span><span class="p">:</span> <span class="n">optional</span> <span class="n">dependency</span> <span class="err">`</span><span class="n">torch</span><span class="err">`</span> <span class="ow">is</span> <span class="ow">not</span> <span class="n">available</span><span class="o">.</span> <span class="o">-</span> <span class="n">skipping</span> <span class="kn">import</span> <span class="nn">of</span> <span class="n">NN</span> <span class="n">models</span><span class="o">.</span>
<span class="n">warnings</span><span class="o">.</span><span class="n">warn</span><span class="p">(</span><span class="s2">"Warning: optional dependency `torch` is not available. - skipping import of NN models."</span><span class="p">)</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Started</span> <span class="n">server</span> <span class="n">process</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Waiting</span> <span class="k">for</span> <span class="n">application</span> <span class="n">startup</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Application</span> <span class="n">startup</span> <span class="n">complete</span><span class="o">.</span>
<span class="n">INFO</span><span class="p">:</span> <span class="n">Uvicorn</span> <span class="n">running</span> <span class="n">on</span> <span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="mf">0.0.0.0</span><span class="p">:</span><span class="mi">8000</span> <span class="p">(</span><span class="n">Press</span> <span class="n">CTRL</span><span class="o">+</span><span class="n">C</span> <span class="n">to</span> <span class="n">quit</span><span class="p">)</span>
</code></pre></div>
<p>Looks like the server process started correctly in the Docker container. The UserWarning is generated when we instantiate the model object, which means everything is running as expected.</p>
<p>The deployment and service for the model service were created together. You can see the new service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
insurance-charges-model-service NodePort 10.98.168.223 <none> 80:31687/TCP 48s
</code></pre></div>
<p>Minikube exposes the service on a local port. We can get a link to the endpoint with this command:</p>
<div class="highlight"><pre><span></span><code>minikube service insurance-charges-model-service --url -n model-services
</code></pre></div>
<p>The command output this URL:</p>
<div class="highlight"><pre><span></span><code>http://192.168.59.100:31687
</code></pre></div>
<p>The command must keep running to keep the tunnel open to the running model service in the minikube cluster.</p>
<p>To make a prediction, we'll hit the service with a request:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:31687/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":25390.95}
</code></pre></div>
<p>We have the model service up and running in the local minikube cluster!</p>
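<p>The same request can also be made from Python instead of curl. This is an illustrative sketch using only the standard library; the URL comes from the <code>minikube service</code> tunnel above and will differ on each machine:</p>

```python
import json
from urllib import request

# input fields expected by the insurance charges model
payload = {
    "age": 65,
    "sex": "male",
    "bmi": 22,
    "children": 5,
    "smoker": True,
    "region": "southwest",
}

def predict(url: str, data: dict) -> dict:
    """POST the model input to the prediction endpoint and return the parsed JSON response."""
    req = request.Request(
        url,
        data=json.dumps(data).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )
    with request.urlopen(req) as response:
        return json.loads(response.read())

# the URL printed by the `minikube service` command; it will differ on each machine:
# predict("http://192.168.59.100:31687/api/models/insurance_charges_model/prediction", payload)
```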
<h3>Running the Load Test</h3>
<p>We can run the load test by using the IP address and port of the service running in minikube.</p>
<div class="highlight"><pre><span></span><code>locust -f tests/load_test.py --host<span class="o">=</span>http://192.168.59.100:31687 --headless --loglevel ERROR --csv<span class="o">=</span>./load_test_report/load_test --html ./load_test_report/load_test_report.html
</code></pre></div>
<p>While the load test is running, we'll check the CPU usage of the single pod running the model service every 15 seconds:</p>
<div class="highlight"><pre><span></span><code>%%bash
kubectl top pods
<span class="k">while</span> sleep <span class="m">15</span><span class="p">;</span> <span class="k">do</span>
kubectl top pods <span class="p">|</span> grep insurance-charges-model-deployment
<span class="k">done</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME CPU(cores) MEMORY(bytes)
insurance-charges-model-deployment-5454fc7cfb-rhl2t 4m 104Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 27m 104Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 27m 104Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 27m 104Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 27m 104Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 132m 105Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 132m 105Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 132m 105Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 132m 105Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 198m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 198m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 198m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 198m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 200m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 200m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 200m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 200m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 94m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 94m 107Mi
insurance-charges-model-deployment-5454fc7cfb-rhl2t 94m 107Mi
Process is interrupted.
</code></pre></div>
<p>We can clearly see how the CPU usage rises as the load goes from 1 user to 5 users. The CPU request for the deployment is 100 millicores, and the CPU usage goes as high as 200 millicores. The memory usage did not change much under load.</p>
<p>The load test output this error message right before stopping:</p>
<div class="highlight"><pre><span></span><code> Test failed because the response time at the 99th percentile was above 100 ms. The 99th percentile latency is '3300.0'.
</code></pre></div>
<p>We can see that the single instance of the service running in Kubernetes is not enough to meet the requirements of the load test, and that the CPU usage is the limiting factor.</p>
<h2>Adding Autoscaling to the Model Service</h2>
<p>Kubernetes supports autoscaling, which is the ability to change the resources assigned to a service based on the current load on the service. We'll be doing horizontal scaling, which means that the number of replicas increases and decreases according to the load. Kubernetes supports this kind of autoscaling through the <a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/">HorizontalPodAutoscaler</a> resource.</p>
<p>The HorizontalPodAutoscaler resource is defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">autoscaling/v2</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HorizontalPodAutoscaler</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model-autoscaler</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">scaleTargetRef</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">apps/v1</span><span class="w"></span>
<span class="w"> </span><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Deployment</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model-deployment</span><span class="w"></span>
<span class="w"> </span><span class="nt">minReplicas</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="w"> </span><span class="nt">maxReplicas</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">10</span><span class="w"></span>
<span class="w"> </span><span class="nt">metrics</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Resource</span><span class="w"></span>
<span class="w"> </span><span class="nt">resource</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">cpu</span><span class="w"></span>
<span class="w"> </span><span class="nt">target</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Utilization</span><span class="w"></span>
<span class="w"> </span><span class="nt">averageUtilization</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">50</span><span class="w"></span>
</code></pre></div>
<p>This resource is defined in the /kubernetes/autoscaler.yaml file in the repository.</p>
<p>The HorizontalPodAutoscaler resource states that the pods of the deployment should be kept at an average of 50% CPU utilization, measured against the CPU request. Since the pods of our service request 100 millicores, the autoscaler controller will step in whenever the average CPU usage goes above 50 millicores per pod and add replicas to the deployment.</p>
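<p>Under the hood, the autoscaler computes the desired replica count from a simple ratio of current to target utilization. A minimal sketch of the formula from the Kubernetes HPA documentation, in Python:</p>

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Compute the replica count the HorizontalPodAutoscaler will request.

    Utilization values are expressed as a percentage of the pod's CPU request.
    """
    return math.ceil(current_replicas * (current_utilization / target_utilization))

# one pod running at 200% of its 100m CPU request, with a 50% target:
print(desired_replicas(1, 200.0, 50.0))  # -> 4
```

<p>This matches what we saw in the load test: a single pod at 200 millicores is at 200% of its 100 millicore request, so the autoscaler asks for four replicas to bring the average back down toward 50%.</p>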
<p>We can deploy the HorizontalPodAutoscaler resource with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">autoscaler</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>horizontalpodautoscaler.autoscaling/insurance-charges-model-autoscaler created
</code></pre></div>
<p>We can view the number of replicas in the Deployment in the Kubernetes Dashboard:</p>
<p><img alt="Kubernetes Deployments" src="https://www.tekhnoal.com/deployments_ltfmlm.png" width="100%"></p>
<p>The deployment currently has 1 pod, with 1 requested pod.</p>
<p>We can also see the HorizontalPodAutoscaler:</p>
<p><img alt="Kubernetes HPA" src="https://www.tekhnoal.com/hpa_ltfmlm.png" width="100%"></p>
<p>The number of replicas is currently set to 1, the autoscaler will increase and decrease this number automatically.</p>
<p>Let's try running the load test with more concurrent users and see if we can trigger an autoscaling event.</p>
<div class="highlight"><pre><span></span><code>locust -f tests/load_test.py --host<span class="o">=</span>http://192.168.59.100:31687 --headless --loglevel ERROR --csv<span class="o">=</span>./load_test_report/load_test --html ./load_test_report/load_test_report.html
</code></pre></div>
<p>While it's running, let's watch the deployment for the number of replicas:</p>
<div class="highlight"><pre><span></span><code>%%bash
kubectl get deployment insurance-charges-model-deployment
<span class="k">while</span> sleep <span class="m">15</span><span class="p">;</span> <span class="k">do</span>
kubectl get deployment insurance-charges-model-deployment <span class="p">|</span> grep insurance-charges-model-deployment
<span class="k">done</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY UP-TO-DATE AVAILABLE AGE
insurance-charges-model-deployment 1/1 1 1 14m
insurance-charges-model-deployment 1/1 1 1 14m
insurance-charges-model-deployment 1/1 1 1 15m
insurance-charges-model-deployment 2/2 2 2 15m
insurance-charges-model-deployment 2/2 2 2 15m
insurance-charges-model-deployment 2/2 2 2 15m
insurance-charges-model-deployment 2/2 2 2 16m
insurance-charges-model-deployment 4/4 4 4 16m
insurance-charges-model-deployment 4/4 4 4 16m
insurance-charges-model-deployment 4/4 4 4 16m
insurance-charges-model-deployment 4/4 4 4 17m
insurance-charges-model-deployment 6/6 6 6 17m
insurance-charges-model-deployment 6/6 6 6 17m
insurance-charges-model-deployment 6/6 6 6 17m
insurance-charges-model-deployment 6/6 6 6 18m
insurance-charges-model-deployment 6/6 6 6 18m
insurance-charges-model-deployment 6/6 6 6 18m
insurance-charges-model-deployment 6/6 6 6 18m
insurance-charges-model-deployment 6/6 6 6 19m
insurance-charges-model-deployment 6/6 6 6 19m
insurance-charges-model-deployment 6/6 6 6 19m
insurance-charges-model-deployment 6/6 6 6 19m
insurance-charges-model-deployment 6/6 6 6 20m
insurance-charges-model-deployment 6/6 6 6 20m
insurance-charges-model-deployment 6/6 6 6 20m
insurance-charges-model-deployment 6/6 6 6 20m
insurance-charges-model-deployment 6/6 6 6 21m
insurance-charges-model-deployment 6/6 6 6 21m
Process is interrupted.
</code></pre></div>
<p>The increased load caused the number of replicas to go up to 6.</p>
<p>Autoscaling can also be triggered by other metrics, such as memory usage. It ensures that a service can scale to meet the current needs of the clients of the system.</p>
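<p>For example, to scale on memory instead of CPU, the metrics section of the HorizontalPodAutoscaler above could be changed like this (the 70% target is an illustrative value, not one used in this deployment):</p>

```yaml
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
```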
<h2>Deleting the Resources</h2>
<p>Now that we're done with the service we need to destroy the resources. </p>
<p>To delete the service autoscaler, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">autoscaler</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>horizontalpodautoscaler.autoscaling "insurance-charges-model-autoscaler" deleted
</code></pre></div>
<p>To delete the model service, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>configmap "model-service-configuration" deleted
deployment.apps "insurance-charges-model-deployment" deleted
service "insurance-charges-model-service" deleted
</code></pre></div>
<p>To delete the namespace:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
</code></pre></div>
<p>Lastly, to stop the Kubernetes cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post we showed how to create a load testing script for a machine learning model that is deployed within a RESTful service. The load testing script is able to generate random inputs for the model. We also showed how to add a shape to the load test in order to simplify load testing and how to add SLOs to the load testing script so that we can quickly tell if the model and model service are able to meet the requirements of the deployment. Lastly, we deployed the model service to a Kubernetes and showed how to implement autoscaling so that the model service can meet the SLO adaptively.</p>Caching for ML Model Deployments2022-08-10T07:00:00-05:002022-08-10T07:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2022-08-10:/caching-for-ml-models.html<p>In a software system, a <a href="https://en.wikipedia.org/wiki/Cache_(computing)">cache</a> is a data store that is used to temporarily store computation results or frequently-accessed data. When accessing the results of a computation from a cache, we are able to avoid paying the cost of recomputing the result. When accessing a frequently accessed piece of data we are able to avoid paying the cost of accessing the data from a slower data store. This type of caching is used when accessing data from a slower data store than the cache. When a cache hit occurs, the data being sought is found and returned to the caller. When a “miss” occurs, the data is not found and must be recomputed or accessed from the slower data store by the caller. A data cache is generally built using storage that has low latency, which means that it is more expensive to run. Machine learning model deployments can benefit from caching because making predictions with a model can be a CPU-intensive process, especially for large and complex models. Predictions that take a long time to make can be cached and returned later when the same prediction is requested. 
This type of caching is also known as <a href="https://en.wikipedia.org/wiki/Memoization">memoization</a>. Another reason that a prediction can take a long time to create is if data enrichment is needed. Data enrichment is the process of adding fields to a model's input from a data store before a prediction is made, this process can add latency to the prediction and can benefit from caching.</p><h1>Caching for ML Model Deployments</h1>
<p>In a <a href="https://www.tekhnoal.com/ml-model-decorators.html">previous blog post</a> we introduced the decorator pattern for ML model deployments and then showed how to use the pattern to build extensions to a normal model deployment. For example, in <a href="https://www.tekhnoal.com/data-enrichment-for-ml-models.html">this blog post</a> we added data enrichment to a deployed model. This extension was added without having to modify the machine learning model code at all, we were able to do it by using the decorator pattern. In this blog post we’ll add caching functionality to a model in the same way.</p>
<p>This blog post was written in a Jupyter notebook, so some of the code and commands found in it reflect this.</p>
<h2>Introduction</h2>
<p>In a software system, a <a href="https://en.wikipedia.org/wiki/Cache_(computing)">cache</a> is a data store that is used to temporarily store computation results or frequently-accessed data. By reading the result of a computation from a cache, we avoid paying the cost of recomputing it; by reading a frequently-accessed piece of data from a cache, we avoid paying the cost of fetching it from a slower data store. When a cache hit occurs, the data being sought is found and returned to the caller. When a “miss” occurs, the data is not found and must be recomputed or accessed from the slower data store by the caller. A data cache is generally built using storage that has low latency, which means that it is more expensive to run. </p>
<p>Machine learning model deployments can benefit from caching because making predictions with a model can be a CPU-intensive process, especially for large and complex models. Predictions that take a long time to make can be cached and returned later when the same prediction is requested. This type of caching is also known as <a href="https://en.wikipedia.org/wiki/Memoization">memoization</a>. Another reason that a prediction can take a long time to create is if data enrichment is needed. Data enrichment is the process of adding fields to a model's input from a data store before a prediction is made; this process can add latency to the prediction and can benefit from caching.</p>
<p>In order to make prediction caching possible, we need to make sure that the model produces deterministic predictions. Determinism is a property of an algorithm which guarantees that it will always return the same output for the same input. If a model returns different predictions for the same input, we can't cache its predictions at all, because we can't guarantee that a cached prediction matches what the model would have returned.</p>
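<p>The determinism requirement is what makes memoization safe in the first place. As a minimal in-process sketch (using a made-up pricing formula in place of a real model), Python's functools.lru_cache caches results keyed on the function's hashable arguments:</p>

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def predict_charges(age: int, bmi: float, children: int) -> float:
    # Stand-in for an expensive, deterministic model prediction;
    # the formula here is invented purely for illustration.
    return round(250.0 * age + 120.0 * bmi + 425.0 * children, 2)

first = predict_charges(42, 24.0, 2)   # computed on the first call
second = predict_charges(42, 24.0, 2)  # served from the cache

print(first == second)                    # True
print(predict_charges.cache_info().hits)  # 1
```

<p>This only works because the function is deterministic; for a non-deterministic function the cache would silently return results that the function would no longer produce.</p>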
<p>In this blog post, we’ll show how to create a simple decorator that is able to cache predictions for an ML model that is deployed to a production system. We'll also show how to deploy the decorator along with the model to a RESTful service.</p>
<p>All of the code is available in this <a href="https://github.com/schmidtbri/caching-for-ml-models">github repository</a>.</p>
<h2>Software Architecture</h2>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/software_architecture_cfmlm.png" width="100%"></p>
<p>For caching predictions, we’ll be using <a href="https://en.wikipedia.org/wiki/Redis">Redis</a>. Redis is a data structure store that allows users to save and modify data structures in a remote service. This allows many clients to safely access the same data from a centralized service. Redis supports many different data structures, but we’ll be using the key-value store functionality to save our predictions.</p>
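<p>The only key-value operations we need from Redis are a write with a time-to-live and a read that signals a miss by returning nothing. As a rough sketch of those semantics (a plain dictionary standing in for Redis; the class is hypothetical, though the setex/get method names mirror redis-py's):</p>

```python
import time
from typing import Dict, Optional, Tuple

class TTLKeyValueStore:
    """Dictionary-based stand-in for the Redis SETEX/GET operations."""

    def __init__(self) -> None:
        # key -> (value, absolute expiry timestamp)
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def setex(self, key: str, ttl_seconds: float, value: bytes) -> None:
        # store the value along with the time at which it expires
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._store.get(key)
        if entry is None:
            return None  # cache miss
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired entry, treat as a miss
            return None
        return value

cache = TTLKeyValueStore()
cache.setex("prediction/abc123", 60.0, b'{"charges": 8640.78}')
print(cache.get("prediction/abc123"))  # b'{"charges": 8640.78}'
print(cache.get("missing-key"))        # None
```

<p>Redis adds what this sketch lacks: a centralized store that outlives any single process, so every replica of the model service shares one cache.</p>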
<h2>Installing the Model</h2>
<p>To make this blog post a little shorter we won't train a completely new model. Instead we'll install a model that we've <a href="https://www.tekhnoal.com/regression-model.html">built in a previous blog post</a>. The code for the model is in <a href="https://github.com/schmidtbri/regression-model">this github repository</a>.</p>
<p>To install the model, we can use the pip command and point it at the github repo of the model.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">Markdown</span> <span class="k">as</span> <span class="n">md</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">regression</span><span class="o">-</span><span class="n">model</span><span class="c1">#egg=insurance_charges_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction with the model, we'll import the model's class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.model</span> <span class="kn">import</span> <span class="n">InsuranceChargesModel</span>
</code></pre></div>
<p>Now we can instantiate the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction, we'll need to use the model's input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span><span class="p">,</span> \
    <span class="n">SexEnum</span><span class="p">,</span> <span class="n">RegionEnum</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
    <span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
    <span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
    <span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
    <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
    <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
    <span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
</code></pre></div>
<p>The model's input schema is called InsuranceChargesModelInput and it encompasses all of the features required by the model to make a prediction.</p>
<p>Now we can make a prediction with the model by calling the predict() method with an instance of the InsuranceChargesModelInput class.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The model predicts that the charges will be $8640.78.</p>
<p>We can view the input schema of the model as a JSON schema document by calling the .schema() method on the input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'InsuranceChargesModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'age'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age of primary beneficiary in years.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sex'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Gender of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/SexEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'bmi'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body Mass Index'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body mass index of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">15.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">50.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'children'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Children'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Number of children covered by health insurance.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether beneficiary is a smoker.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'boolean'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region where beneficiary lives.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/RegionEnum'</span><span class="p">}]}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'SexEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'SexEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'sex' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'male'</span><span class="p">,</span><span class="w"> </span><span class="s1">'female'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'region' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'southwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'southeast'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northeast'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<h2>Profiling the Model</h2>
<p>In order to get an idea of how much time it takes for our model to make a prediction, we'll profile it by making predictions with random data. To do this, we'll use the <a href="https://faker.readthedocs.io/en/master/">Faker package</a>. We can install it with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">Faker</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll create a function that can generate a random sample that meets the model's input schema:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">faker</span> <span class="kn">import</span> <span class="n">Faker</span>
<span class="n">faker</span> <span class="o">=</span> <span class="n">Faker</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">generate_record</span><span class="p">()</span> <span class="o">-></span> <span class="n">InsuranceChargesModelInput</span><span class="p">:</span>
    <span class="n">record</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s2">"age"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">65</span><span class="p">),</span>
        <span class="s2">"sex"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"male"</span><span class="p">,</span> <span class="s2">"female"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">],</span>
        <span class="s2">"bmi"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">15000</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">50000</span><span class="p">)</span><span class="o">/</span><span class="mf">1000.0</span><span class="p">,</span>
        <span class="s2">"children"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">5</span><span class="p">),</span>
        <span class="s2">"smoker"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">boolean</span><span class="p">(),</span>
        <span class="s2">"region"</span><span class="p">:</span> <span class="n">faker</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"southwest"</span><span class="p">,</span> <span class="s2">"southeast"</span><span class="p">,</span> <span class="s2">"northwest"</span><span class="p">,</span> <span class="s2">"northeast"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span><span class="o">**</span><span class="n">record</span><span class="p">)</span>
</code></pre></div>
<p>The function returns an instance of the InsuranceChargesModelInput class, which is the type required by the model's predict() method. We'll use this function to profile the predict() method of the model.</p>
<p>It's really hard to measure performance from a single prediction, so we'll run the test with many random samples. To start, we'll generate 1000 samples and save them:</p>
<div class="highlight"><pre><span></span><code><span class="n">samples</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span>
    <span class="n">samples</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">generate_record</span><span class="p">())</span>
</code></pre></div>
<p>By using the timeit module from the standard library, we can measure how much time it takes to call the model's predict method with a random sample. We'll make 1000 predictions.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">timeit</span>
<span class="n">total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[model.predict(sample) for sample in samples]"</span><span class="p">,</span>
                              <span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">seconds_per_sample</span> <span class="o">=</span> <span class="n">total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">milliseconds_per_sample</span> <span class="o">=</span> <span class="n">seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The model took 32.997 seconds to make 1000 predictions, which works out to about 32.997 milliseconds (0.033 seconds) per prediction.</p>
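<p>A single timing run can be noisy. If steadier numbers are needed, timeit.repeat plus simple summary statistics works well; the sketch below times a trivial stand-in function, since the pattern is the same for the real model:</p>

```python
import statistics
import timeit

def fake_predict() -> float:
    # trivial stand-in for model.predict(sample)
    return float(sum(i * i for i in range(1000)))

# five independent timing runs of 100 calls each
runs = timeit.repeat(fake_predict, repeat=5, number=100)
per_call_ms = [run / 100 * 1000.0 for run in runs]

print(f"min:    {min(per_call_ms):.4f} ms")
print(f"median: {statistics.median(per_call_ms):.4f} ms")
```

<p>The minimum over several runs is usually the most stable estimate, since it is the measurement least affected by scheduler and background noise.</p>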
<h2>Hashing Model Inputs</h2>
<p>Before we can build a caching decorator, we'll need to understand a little bit about hashing and how to use it for caching. A hashing operation is an operation that takes in data of arbitrary size as input and returns data of a fixed size. A "hash" value refers to the fixed-size data that is returned from a hashing operation. Hashing has many uses in computer science; in this application we'll use hashing to uniquely identify the inputs that are provided to the ML model that we are decorating.</p>
<p>Hashing is already built into the Python standard library through the hash() function, but it is only supported on certain types of objects. We can try it out using an instance of the model's input schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
    <span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
    <span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
    <span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
    <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
    <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
    <span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">model_input_dict</span> <span class="o">=</span> <span class="n">model_input</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="n">frozen_dict</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">keys</span><span class="p">()),</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
<span class="nb">hash</span><span class="p">(</span><span class="n">frozen_dict</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>-4360805119606244359
</code></pre></div>
<p>To try out hashing, we converted an instance of the model's input schema into a dictionary, and then converted the keys and values of the dictionary into <a href="https://docs.python.org/3/library/stdtypes.html#frozenset">frozensets</a>, since dictionaries themselves are not hashable. We then used the frozensets with the hash() function to create an integer value. The integer is the hashed value that we need to uniquely identify the inputs to the model.</p>
<p>To see how hashing works, we'll create a separate input instance for the model that has the exact same values and hash it:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
    <span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
    <span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
    <span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
    <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
    <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
    <span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">model_input_dict</span> <span class="o">=</span> <span class="n">model_input</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="n">frozen_dict</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">keys</span><span class="p">()),</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
<span class="nb">hash</span><span class="p">(</span><span class="n">frozen_dict</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>-4360805119606244359
</code></pre></div>
<p>The hashed values are exactly the same, as we expected. The hashed value should be different if any of the values in the model input change:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
    <span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
    <span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
    <span class="n">bmi</span><span class="o">=</span><span class="mf">24.2</span><span class="p">,</span>
    <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
    <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
    <span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">model_input_dict</span> <span class="o">=</span> <span class="n">model_input</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="n">frozen_dict</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">keys</span><span class="p">()),</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
<span class="nb">hash</span><span class="p">(</span><span class="n">frozen_dict</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>-7065881474845529459
</code></pre></div>
<p>The "bmi" field changed from 24.0 to 24.2, so we got a completely different hashed value.</p>
<p>Hashing is a quick and easy way to identify inputs, which allows us to store the predictions of the model in the cache and retrieve them later.</p>
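<p>One caveat before relying on hash() for a shared cache: since Python 3.3 the hashes of strings are randomized per interpreter process (see PYTHONHASHSEED), so two replicas of the model service, or the same service after a restart, would compute different hashes for the same input. For keys stored in an external cache like Redis, a stable digest over the serialized input is a safer sketch (the function name here is made up for illustration):</p>

```python
import hashlib
import json

def stable_cache_key(model_input_dict: dict) -> str:
    # serialize with sorted keys so identical inputs always produce
    # identical bytes, then take a SHA-256 digest of those bytes
    serialized = json.dumps(model_input_dict, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

key_a = stable_cache_key({"age": 42, "bmi": 24.0, "children": 2})
key_b = stable_cache_key({"children": 2, "age": 42, "bmi": 24.0})
key_c = stable_cache_key({"age": 42, "bmi": 24.2, "children": 2})

print(key_a == key_b)  # True: same input, same key, regardless of field order
print(key_a == key_c)  # False: changing bmi changes the key
```

<p>Unlike hash(), this digest is identical across processes and restarts, so every service replica computes the same Redis key for the same input.</p>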
<h2>Creating the Redis Cache Decorator</h2>
<p>We'll be using Redis to hold the cached predictions of the model. To access the Redis instance, we'll use the redis python package, which we'll install with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">redis</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Now we can implement the decorator class:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="nn">ml_base.decorator</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
<span class="kn">import</span> <span class="nn">redis</span>
<span class="kn">import</span> <span class="nn">json</span>
<span class="k">class</span> <span class="nc">RedisCachingDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
    <span class="sd">"""Decorator for caching around an MLModel instance."""</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">port</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">database</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">prefix</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
                 <span class="n">hashing_fields</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">database</span><span class="o">=</span><span class="n">database</span><span class="p">,</span> <span class="n">prefix</span><span class="o">=</span><span class="n">prefix</span><span class="p">,</span>
                         <span class="n">hashing_fields</span><span class="o">=</span><span class="n">hashing_fields</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span> <span class="o">=</span> <span class="n">redis</span><span class="o">.</span><span class="n">Redis</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">db</span><span class="o">=</span><span class="n">database</span><span class="p">)</span>
    <span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"prefix"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
            <span class="n">prefix</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"prefix"</span><span class="p">],</span>
                                        <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
                                        <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">prefix</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="c1"># select hashing fields from input</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"hashing_fields"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">data_dict</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">()[</span><span class="n">key</span><span class="p">]</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"hashing_fields"</span><span class="p">]}</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">data_dict</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="c1"># creating a key for the prediction inputs provided</span>
<span class="n">frozen_data</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">data_dict</span><span class="o">.</span><span class="n">items</span><span class="p">())</span>
<span class="n">key</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="nb">hash</span><span class="p">(</span><span class="n">frozen_data</span><span class="p">))</span>
<span class="c1"># check if the prediction is in the cache</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="c1"># if the prediction is present in the cache, then deserialize it and return the prediction</span>
<span class="k">if</span> <span class="n">prediction</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">prediction</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">output_schema</span><span class="p">(</span><span class="o">**</span><span class="n">prediction</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
<span class="c1"># if the prediction is not present in the cache, then make a prediction, save it to the cache, and return the prediction</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="n">serialized_prediction</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">prediction</span><span class="o">.</span><span class="n">dict</span><span class="p">())</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">serialized_prediction</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
</code></pre></div>
<p>The caching decorator works simply. When it receives inputs for the model, it:</p>
<ul>
<li>creates a key for the model input using hashing</li>
<li>checks if the key is present in the cache</li>
<li>if the key is present:<ul>
<li>retrieves the prediction for that key </li>
<li>deserializes the contents of the cache into the output type of the model</li>
<li>returns the prediction to the caller</li>
</ul>
</li>
<li>if the key is not present:<ul>
<li>makes a prediction with the model it is decorating</li>
<li>serializes the prediction to a JSON string</li>
<li>saves the prediction to the cache with the key created</li>
<li>returns the prediction to the caller</li>
</ul>
</li>
</ul>
<p>The key created for each cache entry is made up of an optional prefix, the model's qualified name, and the model version, followed by a hash of the input. The prefix makes it possible to partition the cached predictions beyond just the model name and version. The caching decorator uses JSON as the serialization format for the predictions it stores in the cache. </p>
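<p>One caveat with this scheme: Python's built-in <code>hash()</code> is randomized per process for strings, so keys built with it will not match across service restarts or between worker processes. A deterministic digest avoids that. The sketch below is an illustrative alternative, not the decorator's actual implementation; the prefix, model name, and version values are just examples:</p>

```python
import hashlib
import json

def cache_key(data_dict, prefix="prefix", name="insurance_charges_model", version="0.1.0"):
    # Serialize with sorted keys so identical inputs always produce the
    # same byte string, then hash it deterministically with SHA-256.
    payload = json.dumps(data_dict, sort_keys=True, default=str).encode()
    digest = hashlib.sha256(payload).hexdigest()
    return "{}/{}/{}/{}".format(prefix, name, version, digest)

key = cache_key({"age": 46, "sex": "female", "bmi": 24.0})
print(key)
```

<p>Because the digest is stable, the same input maps to the same Redis key no matter which process computed it or when.</p>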
<h2>Using the Redis Cache Decorator</h2>
<p>In order to try out the decorator, we'll need to run a local Redis instance. We can start one using Docker with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> <span class="o">-</span><span class="n">p</span> <span class="mi">6379</span><span class="p">:</span><span class="mi">6379</span> <span class="o">--</span><span class="n">name</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">/</span><span class="n">redis</span><span class="o">-</span><span class="n">stack</span><span class="o">-</span><span class="n">server</span><span class="p">:</span><span class="n">latest</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">836</span><span class="n">c0d557926df641a2e657bcf0d935ec7b1e361b4de5dab6a9abad9371262ea</span><span class="w"></span>
</code></pre></div>
<p>To test out the decorator we first need to instantiate the model object that we want to use with the decorator.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
</code></pre></div>
<p>Next, we’ll instantiate the decorator with the connection parameters for the Redis docker container.</p>
<div class="highlight"><pre><span></span><code><span class="n">caching_decorator</span> <span class="o">=</span> <span class="n">RedisCachingDecorator</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">prefix</span><span class="o">=</span><span class="s2">"prefix"</span><span class="p">)</span>
</code></pre></div>
<p>We can add the model instance to the decorator after it’s been instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span> <span class="o">=</span> <span class="n">caching_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>We can see the decorator and the model objects by printing the reference to the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>RedisCachingDecorator(InsuranceChargesModel)
</code></pre></div>
<p>The decorator object prints its own type along with the type of the model that it is decorating.</p>
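<p>One way to produce this nested representation is for the decorator's <code>__repr__</code> to wrap the decorated model's type name in its own. This is a minimal sketch with hypothetical class names standing in for <code>MLModelDecorator</code> and the model class, not the library's actual code:</p>

```python
class Model:
    """Stand-in for a decorated MLModel instance."""
    pass

class Decorator:
    """Stand-in for an MLModelDecorator subclass."""

    def __init__(self, model=None):
        self._model = model

    def set_model(self, model):
        # attach the model and return self so calls can be chained
        self._model = model
        return self

    def __repr__(self):
        # render as Outer(Inner), e.g. RedisCachingDecorator(InsuranceChargesModel)
        inner = type(self._model).__name__ if self._model is not None else ""
        return "{}({})".format(type(self).__name__, inner)

print(Decorator().set_model(Model()))  # → Decorator(Model)
```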
<p>Now we’ll try to use the decorator and the model together by making a few predictions.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">46</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=9612.64)
</code></pre></div>
<p>The first time we make a prediction with a given input, we'll get the prediction made by the model and the decorator will store the prediction in the cache. </p>
<p>We can view the key in the redis database to see how it is stored.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">SCAN</span> <span class="mi">0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">0</span><span class="w"></span>
<span class="n">prefix</span><span class="o">/</span><span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/</span><span class="mf">5926980192354242260</span><span class="w"></span>
</code></pre></div>
<p>There is a single key in the redis database. We'll access the contents of the key like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">GET</span> <span class="n">prefix</span><span class="o">/</span><span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/</span><span class="mi">5926980192354242260</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges": 9612.64}
</code></pre></div>
<p>The prediction is stored in the key as a JSON string.</p>
<p>We'll try the same prediction again:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">46</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=9612.64)
</code></pre></div>
<p>This time the prediction was not made by the model; it was found in the Redis cache and returned by the decorator instead.</p>
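<p>The cache-aside flow that produced these two results can be sketched without Redis, using a plain dict as the store and a hypothetical model function in place of the real model:</p>

```python
import json

cache = {}   # stands in for the Redis database
calls = []   # records each time the underlying model actually runs

def model_fn(data):
    # hypothetical stand-in for the decorated model's predict() method
    calls.append(data)
    return {"charges": 9612.64}

def predict_with_cache(key, data):
    cached = cache.get(key)
    if cached is not None:
        # cache hit: deserialize and return without calling the model
        return json.loads(cached)
    # cache miss: run the model, store the serialized result, return it
    prediction = model_fn(data)
    cache[key] = json.dumps(prediction)
    return prediction

p1 = predict_with_cache("k1", {"age": 46})
p2 = predict_with_cache("k1", {"age": 46})
print(p1 == p2, len(calls))  # identical predictions, and the model ran once
```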
<p>Next, we'll use the 1000 samples we generated above to make predictions with the decorated model:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">decorated_seconds_per_sample</span> <span class="o">=</span> <span class="n">decorated_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">decorated_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">decorated_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The decorated model took 36.419 seconds to make 1000 predictions the first time it saw these inputs, which works out to about 36.4 milliseconds per prediction.</p>
<p>We'll run the same samples through again:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">decorated_seconds_per_sample</span> <span class="o">=</span> <span class="n">decorated_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">decorated_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">decorated_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The second time through, the decorated model took 0.88 seconds to make the same 1000 predictions, which works out to about 0.88 milliseconds per prediction.</p>
<p>It took less time because we requested the same predictions again, so they were served from the cache instead of being recomputed by the model.</p>
<p>We can get the amount of memory used by the cache by accessing the keys and summing up the number of bytes.</p>
<div class="highlight"><pre><span></span><code><span class="n">r</span> <span class="o">=</span> <span class="n">redis</span><span class="o">.</span><span class="n">StrictRedis</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s1">'localhost'</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span> <span class="n">db</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">decorated_number_of_bytes</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">decorated_total_entries</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">r</span><span class="o">.</span><span class="n">scan_iter</span><span class="p">(</span><span class="s2">"prefix*"</span><span class="p">):</span>
<span class="n">decorated_number_of_bytes</span> <span class="o">+=</span> <span class="nb">len</span><span class="p">(</span><span class="n">r</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">))</span>
<span class="n">decorated_total_entries</span> <span class="o">=</span> <span class="n">decorated_total_entries</span> <span class="o">+</span> <span class="mi">1</span>
<span class="n">decorated_average_number_of_bytes</span> <span class="o">=</span> <span class="n">decorated_number_of_bytes</span> <span class="o">/</span> <span class="n">decorated_total_entries</span>
</code></pre></div>
<p>The keys in the cache take up a total of 20624 bytes. The average number of bytes per cache entry is 20.6.</p>
<p>We'll clear the redis database to make sure the contents don't interfere with the next things we want to try.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h2>Selecting Fields For Hashing</h2>
<p>In certain situations, not all of the fields in the model's input should be used to create a hash. This may be because not all of the model's input fields are actually used for making a prediction. Some fields may be used for logging or debugging and do not actually affect the prediction created by the model. If changing the value of a field does not affect the value of the prediction created by the model, it should not be used to create the hashed key for the cache.</p>
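<p>The effect of restricting the hash to selected fields can be sketched in plain Python; the key function here is illustrative, not the decorator's code:</p>

```python
import hashlib
import json

def cache_key(data_dict, hashing_fields=None):
    # keep only the fields that actually influence the prediction
    if hashing_fields is not None:
        data_dict = {key: data_dict[key] for key in hashing_fields}
    payload = json.dumps(data_dict, sort_keys=True, default=str).encode()
    return hashlib.sha256(payload).hexdigest()

fields = ["age", "sex", "bmi", "children", "smoker"]
a = {"age": 52, "sex": "female", "bmi": 24.0, "children": 3,
     "smoker": False, "region": "northwest"}
b = dict(a, region="southeast")  # differs only in the excluded field

print(cache_key(a, fields) == cache_key(b, fields))  # True: same key
print(cache_key(a) == cache_key(b))                  # False: region included
```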
<p>The caching decorator supports selecting specific fields from the input to create the cache key. The option is called "hashing_fields" and is provided to the decorator instance like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">caching_decorator</span> <span class="o">=</span> <span class="n">RedisCachingDecorator</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">prefix</span><span class="o">=</span><span class="s2">"prefix"</span><span class="p">,</span>
<span class="n">hashing_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">])</span>
<span class="n">decorated_model</span> <span class="o">=</span> <span class="n">caching_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>The decorator now uses all of the input fields except for the "region" field to create the key.</p>
<p>To try out the functionality, we'll create a prediction with the decorated model. The prediction will get saved in the cache.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">52</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15219.19)
</code></pre></div>
<p>We'll now make the same prediction, but this time the prediction will come from the cache because it was saved there previously.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">52</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15219.19)
</code></pre></div>
<p>We'll make the prediction one more time, but this time we'll change the value of the "region" field.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">52</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">southeast</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15219.19)
</code></pre></div>
<p>The predicted value should have changed when the region changed, but it didn't: the prediction was served from the cache instead of being recomputed. This happened because the "region" field was ignored when creating the hashed key for the cache.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h2>Improving the Performance of the Decorator</h2>
<p>When a prediction is stored in the cache, it is currently serialized using the JSON format. This format is simple and easy to understand, but it is not the most efficient format for serialization in terms of the size of the data and the time it takes to do the serialization.</p>
<p>To try to improve the efficiency of the caching decorator we'll add options for other serialization formats and also try to use compression. Another way to reduce the memory usage of the cache is to reduce the precision of the numbers given to the model. These approaches will be fully explained below.</p>
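<p>The trade-offs can be previewed with the standard library before adding any dependencies; zlib stands in for Snappy here, and the payloads are illustrative:</p>

```python
import json
import zlib

# A single small prediction: compression overhead can outweigh the savings.
small = json.dumps({"charges": 9612.64}).encode()

# A batch of predictions: repetitive JSON compresses well.
batch = json.dumps([{"charges": 9612.64 + i} for i in range(100)]).encode()

print(len(small), len(zlib.compress(small)))
print(len(batch), len(zlib.compress(batch)))

# Reduced precision: rounding inputs collapses near-identical requests
# onto the same cache key, which raises the hit rate.
print(round(24.0137, 1) == round(24.04, 1))  # True: both round to 24.0
```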
<p>We'll be using <a href="https://msgpack.org/index.html">MessagePack</a> to do serialization and <a href="https://en.wikipedia.org/wiki/Snappy_(compression)">Snappy</a> for compression, so we need to install the packages:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">msgpack</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">python</span><span class="o">-</span><span class="n">snappy</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>We'll recreate the RedisCachingDecorator class with the code needed to support the new features we want to work with.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">msgpack</span>
<span class="kn">import</span> <span class="nn">snappy</span>
<span class="k">class</span> <span class="nc">RedisCachingDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="sd">"""Decorator for caching around an MLModel instance."""</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">port</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">database</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">prefix</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
<span class="n">hashing_fields</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">serder</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="s2">"JSON"</span><span class="p">,</span>
<span class="n">use_compression</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">False</span><span class="p">,</span>
<span class="n">reduced_precision</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">False</span><span class="p">,</span>
<span class="n">number_of_places</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span>
<span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="k">if</span> <span class="n">serder</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="s2">"JSON"</span><span class="p">,</span> <span class="s2">"MessagePack"</span><span class="p">]:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Serder option not supported."</span><span class="p">)</span>
<span class="k">if</span> <span class="n">reduced_precision</span> <span class="ow">is</span> <span class="kc">True</span> <span class="ow">and</span> <span class="n">number_of_places</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"number_of_places must be provided when reduced_precision is True."</span><span class="p">)</span>
<span class="k">if</span> <span class="n">number_of_places</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">reduced_precision</span> <span class="ow">is</span> <span class="kc">False</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"reduced_precision must be True when number_of_places is provided."</span><span class="p">)</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">database</span><span class="o">=</span><span class="n">database</span><span class="p">,</span> <span class="n">prefix</span><span class="o">=</span><span class="n">prefix</span><span class="p">,</span>
<span class="n">hashing_fields</span><span class="o">=</span><span class="n">hashing_fields</span><span class="p">,</span> <span class="n">serder</span><span class="o">=</span><span class="n">serder</span><span class="p">,</span>
<span class="n">use_compression</span><span class="o">=</span><span class="n">use_compression</span><span class="p">,</span>
<span class="n">reduced_precision</span><span class="o">=</span><span class="n">reduced_precision</span><span class="p">,</span>
<span class="n">number_of_places</span><span class="o">=</span><span class="n">number_of_places</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span> <span class="o">=</span> <span class="n">redis</span><span class="o">.</span><span class="n">Redis</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">db</span><span class="o">=</span><span class="n">database</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"prefix"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">prefix</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"prefix"</span><span class="p">],</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">prefix</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">/</span><span class="si">{}</span><span class="s2">/"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="c1"># reducing the precision of the numerical fields, if it is enabled</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"reduced_precision"</span><span class="p">]</span> <span class="ow">is</span> <span class="kc">True</span><span class="p">:</span>
<span class="k">for</span> <span class="n">field_name</span><span class="p">,</span> <span class="n">field_attributes</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()[</span><span class="s2">"properties"</span><span class="p">]</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
<span class="k">if</span> <span class="s2">"type"</span> <span class="ow">in</span> <span class="n">field_attributes</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span> <span class="ow">and</span> <span class="n">field_attributes</span><span class="p">[</span><span class="s2">"type"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"number"</span><span class="p">:</span>
<span class="n">field_value</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">field_name</span><span class="p">)</span>
<span class="nb">setattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">field_name</span><span class="p">,</span> <span class="nb">round</span><span class="p">(</span><span class="n">field_value</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"number_of_places"</span><span class="p">]))</span>
<span class="c1"># select hashing fields from input</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"hashing_fields"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">data_dict</span> <span class="o">=</span> <span class="p">{</span><span class="n">key</span><span class="p">:</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">()[</span><span class="n">key</span><span class="p">]</span> <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"hashing_fields"</span><span class="p">]}</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">data_dict</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="c1"># creating a key from the prediction inputs, pairing each field name with its value</span>
<span class="n">frozen_data</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">data_dict</span><span class="o">.</span><span class="n">items</span><span class="p">())</span>
<span class="n">key</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="nb">hash</span><span class="p">(</span><span class="n">frozen_data</span><span class="p">))</span>
<span class="c1"># check if the prediction is in the cache</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="c1"># if the prediction is present in the cache</span>
<span class="k">if</span> <span class="n">prediction</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="c1"># optionally decompressing the bytes</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"use_compression"</span><span class="p">]:</span>
<span class="n">decompressed_prediction</span> <span class="o">=</span> <span class="n">snappy</span><span class="o">.</span><span class="n">decompress</span><span class="p">(</span><span class="n">prediction</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">decompressed_prediction</span> <span class="o">=</span> <span class="n">prediction</span>
<span class="c1"># deserializing from bytes</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"serder"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"JSON"</span><span class="p">:</span>
<span class="n">deserialized_prediction</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">decompressed_prediction</span><span class="o">.</span><span class="n">decode</span><span class="p">())</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"serder"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"MessagePack"</span><span class="p">:</span>
<span class="n">deserialized_prediction</span> <span class="o">=</span> <span class="n">msgpack</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">decompressed_prediction</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Serder option not supported."</span><span class="p">)</span>
<span class="c1"># creating the output instance</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">output_schema</span><span class="p">(</span><span class="o">**</span><span class="n">deserialized_prediction</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
<span class="c1"># if the prediction is not present in the cache</span>
<span class="k">else</span><span class="p">:</span>
<span class="c1"># making a prediction with the model</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
<span class="c1"># serializing to bytes</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"serder"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"JSON"</span><span class="p">:</span>
<span class="n">serialized_prediction</span> <span class="o">=</span> <span class="nb">str</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">prediction</span><span class="o">.</span><span class="n">dict</span><span class="p">()))</span>
<span class="k">elif</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"serder"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"MessagePack"</span><span class="p">:</span>
<span class="n">serialized_prediction</span> <span class="o">=</span> <span class="n">msgpack</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">prediction</span><span class="o">.</span><span class="n">dict</span><span class="p">())</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Serder option not supported."</span><span class="p">)</span>
<span class="c1"># optionally compressing the bytes</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"use_compression"</span><span class="p">]:</span>
<span class="n">serialized_prediction</span> <span class="o">=</span> <span class="n">snappy</span><span class="o">.</span><span class="n">compress</span><span class="p">(</span><span class="n">serialized_prediction</span><span class="p">)</span>
<span class="c1"># saving the prediction to the cache</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_redis_client"</span><span class="p">]</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">serialized_prediction</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
</code></pre></div>
<p>The new implementation above includes options to enable MessagePack for serialization/deserialization, snappy for compression, and the ability to reduce the precision of numerical fields in the model input. We'll try out each option individually.</p>
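<p>One subtlety in the implementation above is worth flagging: the cache key is derived with Python's built-in <code>hash()</code>, which is randomized per process for strings (controlled by <code>PYTHONHASHSEED</code>), so two service replicas sharing one Redis instance would not generate the same keys for the same inputs. A hedged sketch of a stable alternative, using a <code>hashlib</code> digest over the canonicalized input (the function name and the example prefix below are illustrative, not part of the original code):</p>

```python
import hashlib
import json


def stable_cache_key(prefix: str, data_dict: dict) -> str:
    """Build a cache key that is identical across processes and restarts."""
    # sort_keys makes the byte representation canonical for a given input
    payload = json.dumps(data_dict, sort_keys=True).encode()
    # sha256 is deterministic, unlike the built-in hash() of strings
    return prefix + hashlib.sha256(payload).hexdigest()


key = stable_cache_key("insurance_model/0.1.0/", {"age": 55, "bmi": 25.0})
```

<p>With a scheme like this, cache entries written by one replica can be read by any other, and the field order of the input cannot produce distinct keys for the same data.</p>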
<h3>MessagePack Serialization</h3>
<p><a href="https://msgpack.org/index.html">MessagePack</a> is a binary serialization format designed to be compact, fast, and flexible; it fills the same role as JSON but produces smaller payloads.</p>
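<p>To get a feel for the difference before wiring it into the decorator, we can serialize a small record both ways and compare the payload sizes (the record shown is illustrative, not the model's actual output schema):</p>

```python
import json

import msgpack

record = {"age": 55, "bmi": 25.0, "children": 4, "charges": 15113.29}

json_bytes = json.dumps(record).encode()  # text-based encoding
msgpack_bytes = msgpack.dumps(record)     # binary encoding

# the binary encoding avoids quotes, braces, and whitespace,
# so it comes out smaller for the same record
print(len(json_bytes), len(msgpack_bytes))
```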
<p>To enable MessagePack, we'll instantiate the decorator setting the "serder" option to "MessagePack". We'll use a prefix to separate the cache entries that use MessagePack from the other cache entries.</p>
<div class="highlight"><pre><span></span><code><span class="n">msgpack_caching_decorator</span> <span class="o">=</span> <span class="n">RedisCachingDecorator</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">prefix</span><span class="o">=</span><span class="s2">"msgpack"</span><span class="p">,</span>
<span class="n">serder</span><span class="o">=</span><span class="s2">"MessagePack"</span><span class="p">)</span>
<span class="n">mspgpack_decorated_model</span> <span class="o">=</span> <span class="n">msgpack_caching_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>The first time we make a prediction, the model will be used and the prediction will get serialized to MessagePack and saved to the cache.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">55</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">25.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">mspgpack_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15113.29)
</code></pre></div>
<p>The second time we make a prediction, the cache entry will be used instead.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">55</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">25.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">mspgpack_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15113.29)
</code></pre></div>
<p>The MessagePack format works; now we'll do some testing to see if it improves the serialization/deserialization performance.</p>
<p>As before, we'll make the predictions on the samples to fill in the cache with predictions. We'll be using the 1000 samples generated above to keep the comparison fair.</p>
<div class="highlight"><pre><span></span><code><span class="n">msgpack_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[mspgpack_decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">msgpack_seconds_per_sample</span> <span class="o">=</span> <span class="n">msgpack_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">msgpack_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">msgpack_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The decorated model that uses MessagePack took 35.627 seconds to perform 1000 predictions the first time that it saw the prediction inputs. The decorated model takes about 35.627 milliseconds to make a single prediction.</p>
<p>Most of the time for this step is taken up by the model's prediction algorithm, which is why it takes a similar amount of time as the JSON serder we used before.</p>
<p>Now we can try the same predictions again. This time, they'll be accessed from the cache and returned more quickly.</p>
<div class="highlight"><pre><span></span><code><span class="n">msgpack_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[mspgpack_decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">msgpack_seconds_per_sample</span> <span class="o">=</span> <span class="n">msgpack_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">msgpack_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">msgpack_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The model that uses MessagePack took 0.955 seconds to perform 1000 predictions the second time that it saw the prediction inputs. The decorated model takes about 0.955 milliseconds to access a single prediction and return it.</p>
<p>The MessagePack serder performs at around the same speed as the JSON serder: the test we did with JSON above took about 0.88 ms for each sample, while the MessagePack serder took 0.955 ms per sample.</p>
<p>We can see how much space the cache entries are taking up by querying each key and summing up the number of bytes:</p>
<div class="highlight"><pre><span></span><code><span class="n">msgpack_number_of_bytes</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">msgpack_total_entries</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">r</span><span class="o">.</span><span class="n">scan_iter</span><span class="p">(</span><span class="s2">"msgpack*"</span><span class="p">):</span>
<span class="n">msgpack_number_of_bytes</span> <span class="o">+=</span> <span class="nb">len</span><span class="p">(</span><span class="n">r</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">))</span>
<span class="n">msgpack_total_entries</span> <span class="o">=</span> <span class="n">msgpack_total_entries</span> <span class="o">+</span> <span class="mi">1</span>
<span class="n">msgpack_average_number_of_bytes</span> <span class="o">=</span> <span class="n">msgpack_number_of_bytes</span> <span class="o">/</span> <span class="n">msgpack_total_entries</span>
</code></pre></div>
<p>The keys in the original JSON cache took up a total of 20624 bytes. The keys in the MessagePack cache take up a total of 18018 bytes and the average number of bytes per MessagePack cache entry is 18.0.</p>
<p>By using MessagePack serialization we were able to use less memory in the cache.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h3>Snappy Compression</h3>
<p><a href="https://github.com/google/snappy">Snappy</a> is a compression algorithm built by Google that aims for very high compression and decompression speeds, at the cost of somewhat lower compression ratios. We can try to reduce the memory used by the cache by compressing the cache entries with the Snappy algorithm. This approach was inspired by <a href="https://doordash.engineering/2019/01/02/speeding-up-redis-with-compression/">another blog post</a>.</p>
<p>Enabling compression on the decorator is simple: we just set the "use_compression" parameter to "True" when instantiating the caching decorator. In this example we'll use JSON serialization combined with compression.</p>
<div class="highlight"><pre><span></span><code><span class="n">compressing_caching_decorator</span> <span class="o">=</span> <span class="n">RedisCachingDecorator</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">prefix</span><span class="o">=</span><span class="s2">"json+compression"</span><span class="p">,</span>
<span class="n">serder</span><span class="o">=</span><span class="s2">"JSON"</span><span class="p">,</span>
<span class="n">use_compression</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">compressing_decorated_model</span> <span class="o">=</span> <span class="n">compressing_caching_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>The first time we make a prediction, the model will be used and the prediction will get serialized to JSON, then compressed, and saved to the cache.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">53</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">25.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">compressing_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15207.01)
</code></pre></div>
<p>The second time we make a prediction, the compressed cache entry will be used instead.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">53</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">25.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">compressing_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=15207.01)
</code></pre></div>
<p>The compression works; now we'll do some testing to see if it improves the serialization/deserialization performance.</p>
<div class="highlight"><pre><span></span><code><span class="n">compressed_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[compressing_decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">compressed_seconds_per_sample</span> <span class="o">=</span> <span class="n">compressed_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">compressed_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">compressed_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The decorator that does compression took around 35.224 ms to make a prediction and add it to the cache the first time that it saw the prediction inputs.</p>
<p>Most of the time for this step is taken up by the model's prediction algorithm.</p>
<p>Now we can try the same predictions again.</p>
<div class="highlight"><pre><span></span><code><span class="n">compressed_total_seconds</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="s2">"[compressing_decorated_model.predict(sample) for sample in samples]"</span><span class="p">,</span>
<span class="n">number</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="nb">globals</span><span class="p">())</span>
<span class="n">compressed_seconds_per_sample</span> <span class="o">=</span> <span class="n">compressed_total_seconds</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span>
<span class="n">compressed_milliseconds_per_sample</span> <span class="o">=</span> <span class="n">compressed_seconds_per_sample</span> <span class="o">*</span> <span class="mf">1000.0</span>
</code></pre></div>
<p>The decorator that uses compressed JSON took 0.906 ms to make a prediction the second time that it saw the prediction inputs.</p>
<p>The serder that uses JSON serialization and compression performs around the same as the JSON serder. The test we did with uncompressed JSON above took about 0.88 ms for each sample.</p>
<p>We can see how much space the cache entries is taking up by querying each key and summing up the number of bytes:</p>
<div class="highlight"><pre><span></span><code><span class="n">compressed_number_of_bytes</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">compressed_total_entries</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">r</span><span class="o">.</span><span class="n">scan_iter</span><span class="p">(</span><span class="s2">"json+compression*"</span><span class="p">):</span>
<span class="n">compressed_number_of_bytes</span> <span class="o">+=</span> <span class="nb">len</span><span class="p">(</span><span class="n">r</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">))</span>
<span class="n">compressed_total_entries</span> <span class="o">=</span> <span class="n">compressed_total_entries</span> <span class="o">+</span> <span class="mi">1</span>
<span class="n">compressed_average_number_of_bytes</span> <span class="o">=</span> <span class="n">compressed_number_of_bytes</span> <span class="o">/</span> <span class="n">compressed_total_entries</span>
</code></pre></div>
<p>The keys in the original JSON cache took up a total of 20624 bytes. The keys in the MessagePack cache take up a total of 18018 bytes. The keys in the compressed JSON cache take up a total of 22627 bytes, and the average number of bytes per cache entry is 22.6.</p>
<p>The keys that were serialized with JSON and compressed were a few bytes bigger than the keys serialized and not compressed. It seems that compression is not saving memory in the cache, this is probably due to the small size of the entries and the fact that information was not repeated inside of the serialized data structures.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h3>Reducing the Precision of the Inputs</h3>
<p>We can also try to limit the size of the cache by reducing the number of possible inputs to the hashing function. We'll demonstrate this with a few examples.</p>
<p>We'll start by hashing a single sample of the input of the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.12345</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">model_input_dict</span> <span class="o">=</span> <span class="n">model_input</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="n">frozen_dict</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">keys</span><span class="p">()),</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
<span class="nb">hash</span><span class="p">(</span><span class="n">frozen_dict</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>-2801283067008197552
</code></pre></div>
<p>Next, we'll hash a very similar model input:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.12346</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">model_input_dict</span> <span class="o">=</span> <span class="n">model_input</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="n">frozen_dict</span> <span class="o">=</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">keys</span><span class="p">()),</span> <span class="nb">frozenset</span><span class="p">(</span><span class="n">model_input_dict</span><span class="o">.</span><span class="n">values</span><span class="p">())</span>
<span class="nb">hash</span><span class="p">(</span><span class="n">frozen_dict</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">5034586836711654789</span><span class="w"></span>
</code></pre></div>
<p>The hash value produced is the second time is completely different even though the "bmi" field only changed by 0.00001. This means that these two predictions will have two different cache entries even though they are very lilely to produce exactly the same prediction. Just to make sure, we'll make the predictions using these inputs:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.12345</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>Let's try the prediction and hash with a different value for the "bmi" field:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.12346</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The prediction came out to be the same for both values of "bmi". However, the hashed value of the input was completely different. These predictions would be saved separately from each other in the cache, event though they are exactly the same. We can cut down on the number of entries in the cache by reducing the precision of floating point numbers so that these predictions can be cached one time instead of many. By rounding down the number we'll be reducing the number of cache entries that will be placed in the cache but also affecting the accuracy of the model's predictions. </p>
<p>The caching decorator supports this feature, we'll just enable it by adding the "reduced_precision" and "number_of_places" options to the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="n">low_precision_caching_decorator</span> <span class="o">=</span> <span class="n">RedisCachingDecorator</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="mi">6379</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">prefix</span><span class="o">=</span><span class="s2">"low_precision"</span><span class="p">,</span>
<span class="n">reduced_precision</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">number_of_places</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">low_precision_decorated_model</span> <span class="o">=</span> <span class="n">low_precision_caching_decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>The first time we make a prediction, the model will be used and the prediction input will get the precision of the "bmi" field reduced to one decimal place, then the prediction will get serialized to JSON, and saved to the cache.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.12345</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">low_precision_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The second time the prediction is requested, the precision of the "bmi" field is reduced again in the same way, making the prediction input the same as before even though the values for the "bmi" field are not exactly the same. This will create the same hashed value which will retrieve the prediction from the cache and return it to the user.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.4321</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">low_precision_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The predictions are the same even though the inputs were different. We can view the keys in the cache like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">SCAN</span> <span class="mi">0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">0</span><span class="w"></span>
<span class="n">low_precision</span><span class="o">/</span><span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/-</span><span class="mf">4360805119606244359</span><span class="w"></span>
</code></pre></div>
<p>There's only one entry in the cache, which means that the first prediction was reused and no new entry was created for the second set of inputs.</p>
<p>Although this is not always an ideal way to save memory, there are some model deployments that can benefit from this approach. All that is needed is to analyze how much precision the model needs from its numerical inputs. It rarely makes sense to store predictions with an unlimited precision in their numerical inputs in the cache.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h2>Adding the Decorator to a Deployed Model</h2>
<p>Now that we have a working decorator, we can use it inside of a service alongside the model. To do this, we'll use the <a href="https://pypi.org/project/rest-model-service/">rest_model_service</a> package to quickly create a RESTful service. You can learn more about this package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all that is needed is that we add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ml_model_caching.redis.RedisCachingDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="s">"localhost"</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">6379</span><span class="w"></span>
<span class="w"> </span><span class="nt">database</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
</code></pre></div>
<p>The service_title field is the name of the service as it will appear in the documentation. The models field is an array that contains the details of the models we would like to deploy in the service. The class_path points at the MLModel class that implement's the model's prediction logic, in this case we'll be using the same model as in the examples above. The decorators field contains the details of the decorators that we want to attach to the model instance. We want to use the RedisCachingDecorator decorator class with the configuration we've used for local testing.</p>
<p>To run the service locally, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/rest_configuration.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should come up and can be accessed in a web browser at http://127.0.0.1:8000. When you access that URL using a web browser you will be redirected to the documentation page that is generated by the FastAPI package.</p>
<p>We can try out the service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 50, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46277.67}
</code></pre></div>
<p>We can check the Redis instance to make sure that the cache is being used:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">SCAN</span> <span class="mi">0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">0</span><span class="w"></span>
<span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/-</span><span class="mf">3948524794153351987</span><span class="w"></span>
</code></pre></div>
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package we're able to quickly stand up a service to host the model. The decorator that we want to test can also be added to the model through configuration, including all of its parameters. </p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">FLUSHDB</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>OK
</code></pre></div>
<h2>Deploying the Caching Decorator</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally using Docker. Once we have the service and Redis working locally, we'll deploy everything to a local Minikube instance.</p>
<h3>Creating a Docker Image</h3>
<p>Let's create a docker image and run it locally. The docker image is generated using instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">ARG</span><span class="w"> </span>BUILD_DATE
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.title<span class="o">=</span><span class="s2">"Caching for ML Models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.description<span class="o">=</span><span class="s2">"Caching for machine learning models."</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.created<span class="o">=</span><span class="nv">$BUILD_DATE</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.authors<span class="o">=</span><span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.source<span class="o">=</span><span class="s2">"https://github.com/schmidtbri/caching-for-ml-models"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.version<span class="o">=</span><span class="s2">"0.1.0"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.licenses<span class="o">=</span><span class="s2">"MIT License"</span>
<span class="k">LABEL</span><span class="w"> </span>org.opencontainers.image.base.name<span class="o">=</span><span class="s2">"python:3.9-slim"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">./service</span>
<span class="c"># installing git because we need to install the model package from the github repository</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update
<span class="k">RUN</span><span class="w"> </span>apt-get --assume-yes install git
<span class="k">COPY</span><span class="w"> </span>./ml_model_caching ./ml_model_caching
<span class="k">COPY</span><span class="w"> </span>./configuration ./configuration
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r service_requirements.txt
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
</code></pre></div>
<p>The Dockerfile is used by this command to create a docker image:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span> <span class="o">../</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model_service latest 2c8c19151e65 32 hours ago 1.26GB
</code></pre></div>
<p>Next, we'll start the image to see if everything is working as expected. To do this we'll create a local docker network and connect the redis container and the model service container to it.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">create</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">1</span><span class="n">d8ad0b59ad831f1c6205cea3e799ee31f40109006b9a02d39db8207a7e3f339</span><span class="w"></span>
</code></pre></div>
<p>We'll connect the running redis container that we were working with to the network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">connect</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span>
</code></pre></div>
<p>Now we can start the service docker image connected to the same network as the redis container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">local_rest_config</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">insurance_charges_model_service</span> \
<span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">83</span><span class="n">db77417dfa5cd33c3d7fabea8349df8b3932ef0cd2544a94b7d4958eed93bc</span><span class="w"></span>
</code></pre></div>
<p>Notice that we're using a different configuration file that has a different hostname for the Redis instance. The Redis container is not accessible at localhost from inside the network, so we need to use the hostname "local-redis" in the configuration.</p>
<p>The service should be accessible on port 8000 of localhost, so we'll try to make a prediction using the curl command running inside of a container connected to the network:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span> \
<span class="n">curlimages</span><span class="o">/</span><span class="n">curl</span> \
<span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://insurance_charges_model_service:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 50, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46277.67}
</code></pre></div>
<p>The model predicted that the insurance charges would be $46277.67 and also saved the prediction to the Redis cache. We can view the cache entries in Redis with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">SCAN</span> <span class="mi">0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">0</span><span class="w"></span>
<span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/</span><span class="mf">7732985413081947687</span><span class="w"></span>
</code></pre></div>
<p>The key in the cache has this value:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">exec</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">GET</span> <span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/</span><span class="mi">7732985413081947687</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges": 46277.67}
</code></pre></div>
<p>Since we didn't use MessagePack or Snappy compression the value is easily read as a plain JSON string.</p>
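<p>The key and value above hint at how the caching decorator builds its cache entries: the model's qualified name, the model version, and a hash of the prediction inputs. The sketch below illustrates one way such a key could be derived; the helper name and hashing scheme are assumptions for illustration, not the decorator's actual code.</p>

```python
import hashlib
import json


def make_cache_key(model_name: str, model_version: str, inputs: dict) -> str:
    # Serialize with sorted keys so that logically identical requests always
    # produce the same bytes, then apply a stable hash. (Python's built-in
    # hash() is salted per process, so it would not survive a service restart.)
    canonical = json.dumps(inputs, sort_keys=True).encode("utf-8")
    digest = int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big", signed=True)
    return f"{model_name}/{model_version}/{digest}"


key = make_cache_key(
    "insurance_charges_model", "0.1.0",
    {"age": 65, "sex": "male", "bmi": 50, "children": 5,
     "smoker": True, "region": "southwest"},
)
print(key)
```

<p>Because the key is derived only from the model identity and the inputs, any repeat of the same request maps to the same cache entry.</p>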
<p>Now that we're done with the local redis instance we'll stop and remove the docker container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">local</span><span class="o">-</span><span class="n">redis</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">rm</span> <span class="n">local</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>local-redis
local-redis
insurance_charges_model_service
insurance_charges_model_service
local-network
</code></pre></div>
<h2>Deploying the Solution</h2>
<p>To show the system in action, we’ll deploy the service and the Redis instance to a Kubernetes cluster. A local cluster can be easily started by using <a href="https://minikube.sigs.k8s.io/docs/">minikube</a>. Installation instructions can be found <a href="https://minikube.sigs.k8s.io/docs/start/">here</a>.</p>
<h3>Creating the Kubernetes Cluster</h3>
<p>To start the minikube cluster execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">start</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>😄 <span class="nv">minikube</span> <span class="nv">v1</span>.<span class="mi">26</span>.<span class="mi">1</span> <span class="nv">on</span> <span class="nv">Darwin</span> <span class="mi">12</span>.<span class="mi">5</span>
✨ <span class="nv">Using</span> <span class="nv">the</span> <span class="nv">virtualbox</span> <span class="nv">driver</span> <span class="nv">based</span> <span class="nv">on</span> <span class="nv">existing</span> <span class="nv">profile</span>
👍 <span class="nv">Starting</span> <span class="nv">control</span> <span class="nv">plane</span> <span class="nv">node</span> <span class="nv">minikube</span> <span class="nv">in</span> <span class="nv">cluster</span> <span class="nv">minikube</span>
🔄 <span class="nv">Restarting</span> <span class="nv">existing</span> <span class="nv">virtualbox</span> <span class="nv">VM</span> <span class="k">for</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> ...
🐳 <span class="nv">Preparing</span> <span class="nv">Kubernetes</span> <span class="nv">v1</span>.<span class="mi">24</span>.<span class="mi">3</span> <span class="nv">on</span> <span class="nv">Docker</span> <span class="mi">20</span>.<span class="mi">10</span>.<span class="mi">17</span> ...
🔎 <span class="nv">Verifying</span> <span class="nv">Kubernetes</span> <span class="nv">components</span>...
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">gcr</span>.<span class="nv">io</span><span class="o">/</span><span class="nv">k8s</span><span class="o">-</span><span class="nv">minikube</span><span class="o">/</span><span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>:<span class="nv">v5</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">dashboard</span>:<span class="nv">v2</span>.<span class="mi">6</span>.<span class="mi">0</span>
▪ <span class="nv">Using</span> <span class="nv">image</span> <span class="nv">kubernetesui</span><span class="o">/</span><span class="nv">metrics</span><span class="o">-</span><span class="nv">scraper</span>:<span class="nv">v1</span>.<span class="mi">0</span>.<span class="mi">8</span>
🌟 <span class="nv">Enabled</span> <span class="nv">addons</span>: <span class="nv">default</span><span class="o">-</span><span class="nv">storageclass</span>, <span class="nv">storage</span><span class="o">-</span><span class="nv">provisioner</span>, <span class="nv">dashboard</span>
🏄 <span class="nv">Done</span><span class="o">!</span> <span class="nv">kubectl</span> <span class="nv">is</span> <span class="nv">now</span> <span class="nv">configured</span> <span class="nv">to</span> <span class="nv">use</span> <span class="s2">"</span><span class="s">minikube</span><span class="s2">"</span> <span class="nv">cluster</span> <span class="nv">and</span> <span class="s2">"</span><span class="s">default</span><span class="s2">"</span> <span class="nv">namespace</span> <span class="nv">by</span> <span class="nv">default</span>
</code></pre></div>
<p>Let's view all of the pods running in the minikube cluster to make sure we can connect.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">-</span><span class="n">A</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAMESPACE              NAME                                         READY   STATUS    RESTARTS      AGE
kube-system            coredns-6d4b75cb6d-wrrwr                     1/1     Running   7 (9h ago)    2d10h
kube-system            etcd-minikube                                1/1     Running   7 (9h ago)    2d10h
kube-system            kube-apiserver-minikube                      0/1     Running   7 (9h ago)    2d10h
kube-system            kube-controller-manager-minikube             0/1     Running   6 (9h ago)    2d10h
kube-system            kube-proxy-5n4t9                             1/1     Running   7 (9h ago)    2d10h
kube-system            kube-scheduler-minikube                      1/1     Running   6 (9h ago)    2d10h
kube-system            storage-provisioner                          1/1     Running   12 (9h ago)   2d10h
kubernetes-dashboard   dashboard-metrics-scraper-78dbd9dbf5-d4zv8   1/1     Running   4 (9h ago)    2d10h
kubernetes-dashboard   kubernetes-dashboard-5fd5574d9f-7mjlt        1/1     Running   5 (9h ago)    2d10h
</code></pre></div>
<h3>Creating a Kubernetes Namespace</h3>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yaml file. To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
</code></pre></div>
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME                   STATUS   AGE
default                Active   2d10h
kube-node-lease        Active   2d10h
kube-public            Active   2d10h
kube-system            Active   2d10h
kubernetes-dashboard   Active   2d10h
model-services         Active   2s
</code></pre></div>
<p>The new namespace should appear in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Context "minikube" modified.
</code></pre></div>
<h3>Creating the Redis Service</h3>
<p>Before we can deploy the model service we need to create the Redis service that will hold the cached predictions. For this service we will create a StatefulSet that manages two instances of the Redis service. We will use both instances from the decorator running in the model service.</p>
<p>A StatefulSet is similar to a Deployment because it deploys Pods that are based on an identical specification. However, a StatefulSet will maintain an identity for each Pod and each one will be able to keep internal state. This is important because the Redis service is saving the cache for us, which is stateful. </p>
<p>Using Redis in this manner is an example of sharding. Sharding is the process of splitting up data that is too big to fit on a single computer across multiple computers. By using sharding we make our data layer distributed, which makes it easier to scale in the future.</p>
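<p>The routing idea behind sharding can be sketched in a few lines: hash each cache key and map the hash onto one of the shards, so every key deterministically belongs to exactly one Redis instance. This is a simplified illustration of the concept, not the actual routing logic used by the service.</p>

```python
import zlib

SHARDS = ["redis-st-0", "redis-st-1"]  # the two Redis pods in the StatefulSet


def shard_for(key: str) -> str:
    # Hash the cache key and take it modulo the number of shards, so each
    # key always lives on exactly one Redis instance.
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


print(shard_for("insurance_charges_model/0.1.0/7732985413081947687"))
```

<p>As long as every client hashes keys the same way, reads and writes for a given key always land on the same instance.</p>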
<p>A more detailed diagram of our software architecture looks like this:</p>
<p><img alt="Better Software Architecture" src="https://www.tekhnoal.com/better_software_architecture_cfmlm.png" width="100%"></p>
<p>The Redis service is defined in the kubernetes/redis_service.yaml file. We can create it with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">redis_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>service/redis-service created
statefulset.apps/redis-st created
</code></pre></div>
<p>We can view the pods associated with this service:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">redis</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>redis-st-0   1/1   Running             0   4s
redis-st-1   0/1   ContainerCreating   0   1s
</code></pre></div>
<p>We wanted to create two instances of Redis in the StatefulSet. Because the pods are part of a StatefulSet, their names end with an ordinal index, and we'll be able to reach each individual pod from the model service.</p>
<p>The .yaml file also created a Service for the StatefulSet pods, which makes them accessible through DNS. We can view the service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
redis-service   ClusterIP   None         <none>        6379/TCP   7s
</code></pre></div>
<h3>Creating a Model Deployment and Service</h3>
<p>The model service now has a Redis instance to access, so we'll be creating the model service resources. These are:</p>
<ul>
<li>Deployment: a declarative way to manage a set of pods; the model service pods are managed through the Deployment.</li>
<li>Service: a way to expose the pods in a Deployment; the model service is made available to the outside world through the Service. The service type is LoadBalancer, which means that a load balancer will be created for the service.</li>
</ul>
<p>The model service pod requires an extra container running inside of it to enable easy access to the Redis service. Because we sharded the Redis service into two instances, the caching decorator would need to be aware of both instances of Redis in order to access the right one for each cache entry. We can avoid this by adding an ambassador service to the model service pod. An ambassador takes care of interactions between the application
and any outside services. In this case, the ambassador container will take care of routing the cache request to the right Redis instance. We'll use <a href="https://github.com/twitter/twemproxy">Twemproxy</a> to act as the ambassador between the model service and the Redis instances.</p>
<p>The YAML for the ambassador container is defined in the Deployment resource of the model service and it looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nn">...</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ambassador</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">image</span><span class="p p-Indicator">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">malexer/twemproxy</span><span class="w"></span>
<span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">env</span><span class="p p-Indicator">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">REDIS_SERVERS</span><span class="w"></span>
<span class="w"> </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">redis-st-0.redis-service.model-services.svc.cluster.local:6379:1,redis-st-1.redis-service.model-services.svc.cluster.local:6379:1</span><span class="w"></span>
<span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">containerPort</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">6380</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<p>Notice that the ambassador is listening on localhost port 6380. We'll need to set this correctly in the caching decorator's configuration.</p>
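<p>Since the ambassador runs inside the same pod, the caching decorator can treat it as if it were a single local Redis instance. A hypothetical configuration fragment might look like this; the field names are illustrative, not the service's actual configuration schema.</p>

```python
# Illustrative settings for the caching decorator; the field names here are
# assumptions for illustration, not the service's actual schema.
cache_settings = {
    "host": "localhost",  # the ambassador container runs in the same pod
    "port": 6380,         # Twemproxy's listening port, not Redis's 6379
}
```

<p>The decorator never needs to know how many Redis shards exist; Twemproxy handles the routing.</p>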
<p>To start the model service, first we'll need to send the docker image from the local docker daemon to the minikube image cache:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">image</span> <span class="n">load</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="n">latest</span>
</code></pre></div>
<p>We can view the images in the minikube cache like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">cache</span> <span class="nb">list</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">insurance_charges_model_service</span><span class="o">:</span><span class="n">latest</span><span class="w"></span>
</code></pre></div>
<p>The model service, along with the ambassador, is created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/insurance-charges-model-deployment created
service/insurance-charges-model-service created
</code></pre></div>
<p>The deployment and service for the model service were created together. You can see the new service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance-charges-model-service NodePort 10.107.94.124 <none> 80:32440/TCP 3s
</code></pre></div>
<p>Minikube exposes the service on a local port; we can get a link to the endpoint with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">service</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span> <span class="o">--</span><span class="n">url</span> <span class="o">-</span><span class="n">n</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>http://192.168.59.100:32440
</code></pre></div>
<p>To make a prediction, we'll hit the service with a request:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">time</span> <span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:32440/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":25390.95}curl -X 'POST' -H 'accept: application/json' -H -d 0.01s user 0.01s system 8% cpu 0.158 total
</code></pre></div>
<p>The service and decorator are working! The prediction request took 0.158 seconds. We'll try the same prediction one more time to see if it takes less time.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">time</span> <span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:32440/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":25390.95}curl -X 'POST' -H 'accept: application/json' -H -d 0.01s user 0.01s system 55% cpu 0.022 total
</code></pre></div>
<p>The second time we made the prediction it took 0.022 seconds, which is faster than the first time we made the prediction. This tells us that the caching is working as expected.</p>
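<p>The speedup pattern we just measured is easy to demonstrate with a toy in-memory cache decorator; this is a simplified stand-in for the Redis-backed decorator, not the service's actual code.</p>

```python
import time
from functools import wraps


def cached(func):
    # Toy in-memory cache illustrating the pattern: the first call pays the
    # full inference cost, repeat calls with the same inputs are served
    # straight from the cache.
    store = {}

    @wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]

    return wrapper


@cached
def slow_predict(age, bmi):
    time.sleep(0.05)  # stand-in for real model inference
    return 25390.95


start = time.perf_counter()
slow_predict(65, 22)
miss = time.perf_counter() - start

start = time.perf_counter()
slow_predict(65, 22)
hit = time.perf_counter() - start

print(f"cache miss: {miss:.3f}s, cache hit: {hit:.3f}s")
```

<p>Just as with the deployed service, the second call skips the expensive work entirely and only pays the cost of a lookup.</p>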
<p>We can review the contents of the Redis caches by executing the Redis CLI in the pods:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">exec</span> <span class="o">--</span><span class="n">stdin</span> <span class="o">--</span><span class="n">tty</span> <span class="n">redis</span><span class="o">-</span><span class="n">st</span><span class="o">-</span><span class="mi">1</span> <span class="o">--</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">SCAN</span> <span class="mi">0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">1</span><span class="p">)</span><span class="w"> </span><span class="s">"0"</span><span class="w"></span>
<span class="mf">2</span><span class="p">)</span><span class="w"> </span><span class="mf">1</span><span class="p">)</span><span class="w"> </span><span class="s">"insurance_charges_model/0.1.0/-4784352684431719157"</span><span class="w"></span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">exec</span> <span class="o">--</span><span class="n">stdin</span> <span class="o">--</span><span class="n">tty</span> <span class="n">redis</span><span class="o">-</span><span class="n">st</span><span class="o">-</span><span class="mi">1</span> <span class="o">--</span> <span class="n">redis</span><span class="o">-</span><span class="n">cli</span> <span class="n">GET</span> <span class="n">insurance_charges_model</span><span class="o">/</span><span class="mf">0.1.0</span><span class="o">/-</span><span class="mi">4784352684431719157</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>"{\"charges\": 25390.95}"
</code></pre></div>
<p>Notice that the cache entry was found in the second instance of Redis in the StatefulSet.</p>
<h3>Adding a Prediction ID</h3>
<p>The model has a single decorator working on it within the model service but we can add any number of decorators to add functionality. In a <a href="https://www.tekhnoal.com/ml-model-decorators.html">previous blog post</a> we created a decorator that added a unique prediction id to every prediction returned by the model. We can add this decorator to the service by simply changing the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="nn">...</span><span class="w"></span>
<span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">data_enrichment.prediction_id.PredictionIDDecorator</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ml_model_caching.redis.RedisCachingDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="s">"localhost"</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">6380</span><span class="w"></span>
<span class="w"> </span><span class="nt">database</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0</span><span class="w"></span>
<span class="w"> </span><span class="nt">hashing_fields</span><span class="p">:</span><span class="w"> </span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">age</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">sex</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">bmi</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">children</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">smoker</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">region</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<p>The PredictionIDDecorator adds a unique identifier field to the prediction input data structure before the request is passed to the caching decorator. We need to leave this field out of the list of hashing fields because it should not contribute to the cache key; if the prediction_id field were included, every prediction request would hash to a unique key and the cache would never be hit.</p>
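<p>To make the exclusion concrete, here is an illustrative sketch of how a cache key might be derived from only the configured hashing fields. The real decorator appears to use a numeric hash (as seen in the Redis keys above); this hypothetical version uses SHA-256 so that the key is stable across processes.</p>

```python
import hashlib

def make_cache_key(model_name: str, model_version: str,
                   data: dict, hashing_fields: list) -> str:
    # Serialize only the configured hashing fields, in a fixed order,
    # so that fields like prediction_id never influence the key.
    serialized = "|".join(f"{f}={data[f]}" for f in sorted(hashing_fields))
    digest = hashlib.sha256(serialized.encode("utf-8")).hexdigest()[:16]
    return f"{model_name}/{model_version}/{digest}"
```

<p>Two requests that differ only in their prediction_id produce the same key, so the cached prediction is reused.</p>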
<p>This configuration is in the ./configuration/kubernetes_rest_config2.yaml file. We'll change the configuration file being used and recreate the Deployment again:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/insurance-charges-model-deployment configured
service/insurance-charges-model-service unchanged
</code></pre></div>
<p>We'll try the service one more time:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://192.168.59.100:32440/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">sex</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">male</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 22, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">children</span><span class="se">\"</span><span class="s2">: 5, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">region</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">southwest</span><span class="se">\"</span><span class="s2"> </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":25390.95,"prediction_id":"1aed2c71-9451-4cba-8d42-640d4b9695d8"}
</code></pre></div>
<p>The service returned a unique identifier field called "prediction_id" along with the prediction. This field was generated by the decorator we added through configuration. A full explanation of how the prediction ID decorator works can be found in the previous blog post.</p>
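<p>The core idea of the prediction ID decorator can be sketched in a few lines. This is a hypothetical simplification, not the actual PredictionIDDecorator from the earlier post, which inherits from the MLModelDecorator base class and works with the model's pydantic schemas.</p>

```python
import uuid

class PredictionIDSketch:
    """Wraps a model and attaches a UUID to every prediction result."""

    def __init__(self, model):
        self._model = model

    def predict(self, data: dict) -> dict:
        result = dict(self._model.predict(data))
        # A new UUID4 per call makes each prediction individually traceable.
        result["prediction_id"] = str(uuid.uuid4())
        return result
```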
<p>This shows how easily decorators can be combined with models to perform more complex operations.</p>
<h3>Deleting the Resources</h3>
<p>Now that we're done with the service, we'll clean up the resources. To delete the Redis deployment, we'll delete the Kubernetes resources:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">redis_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>service "redis-service" deleted
statefulset.apps "redis-st" deleted
</code></pre></div>
<p>To delete the model service, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps "insurance-charges-model-deployment" deleted
service "insurance-charges-model-service" deleted
</code></pre></div>
<p>To delete the namespace:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="o">../</span><span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yaml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
</code></pre></div>
<p>Lastly, to stop the kubernetes cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">minikube</span> <span class="n">stop</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>✋ Stopping node "minikube" ...
🛑 1 node stopped.
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post, we showed how to build a decorator class that is able to cache predictions made by a machine learning model. Caching is a simple way to speed up predictions that we know can be reused and are requested often from a model. </p>
<p>The cache decorator classes can be applied to any model that uses the MLModel base class without having to modify the model class at all. The caching functionality is contained completely in the RedisCacheDecorator class. The same thing is true for the RESTful model service, the cache functionality did not need to be added to the service because we separated the concerns of the service and the cache decorator. We were able to add caching to the deployed model by modifying the configuration. By using decorators we’re able to create software components that can be reused in many different contexts. For example, if we chose to deploy the cache decorator in a gRPC service we should be able to do so as long as we instantiate and manage the decorator instance correctly.</p>
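<p>When managing the decorator instances yourself, for example in a gRPC service, the stacking that the YAML configuration expresses can be done directly in code. The helper below is a sketch that assumes the first entry in the decorators list becomes the outermost wrapper, matching the request flow described above; the constructor signatures are illustrative, not the libraries' actual APIs.</p>

```python
def build_decorated_model(model, decorator_specs):
    """Wrap a model with decorators; decorator_specs is a list of
    (decorator_class, config_dict) pairs, outermost first."""
    wrapped = model
    # Apply in reverse so the first spec ends up as the outer wrapper.
    for decorator_class, config in reversed(decorator_specs):
        wrapped = decorator_class(wrapped, **config)
    return wrapped
```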
<p>Combining the caching decorator with other decorators that require I/O like data enrichment is very easy because of the way that decorators can be "stacked" together. We showed how to do this in this blog post by adding a decorator that adds a unique identifier to each prediction.</p>Data Enrichment for ML Model Deployments2022-05-01T07:00:00-05:002022-05-01T07:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2022-05-01:/data-enrichment-for-ml-models.html<p>Machine learning models need data to make predictions. When deploying a model to a production setting, this data is not necessarily available from the client system that is requesting the prediction. When this happens, some other source is needed for the data that is required by the model but not provided by the client system. The process of accessing the data and joining it to the client's prediction request is called data enrichment. In all cases, the model itself should not need to be modified in order to do data enrichment, the process should be transparent to the model. In this blog post, we'll show a method for doing data enrichment that does not require the model itself to be modified.</p><h1>Data Enrichment for ML Model Deployments</h1>
<p>In the <a href="https://www.tekhnoal.com/ml-model-decorators.html">previous blog post</a> we introduced the decorator pattern for ML model deployments and then showed how to use the pattern to build extensions for machine learning models. The extensions that we showed in the previous post were added without having to modify the machine learning model code at all, we were able to do it by creating a decorator class that wrapped the model. In this blog post we’ll use decorators to add data enrichment capabilities to an ML model.</p>
<h2>Introduction</h2>
<p>Machine learning models need data to make predictions. When deploying a model to a production setting, this data is not necessarily available from the client system that is requesting the prediction. When this happens, some other source is needed for the data that is required by the model but not provided by the client system. The process of accessing the data and joining it to the client's prediction request is called data enrichment. In all cases, the model itself should not need to be modified in order to do data enrichment, the process should be transparent to the model. In this blog post, we'll show a method for doing data enrichment that does not require the model itself to be modified.</p>
<p>Data enrichment is often done because the client system does not have access to the data that the model needs to make a prediction. In this case, the client must provide a field that can be used to look up the missing data; we'll call this the "index field". For example, in order to load the customer details needed for a prediction, we need a customer id field that uniquely identifies the customer record. Once the data is loaded from the data source, the model can be called to make a prediction using the fields that it expects.</p>
<p>Other times, the client system is simply not the right place to manage the data that the model needs for predictions because of its complexity. In this case, we would like to prevent the client system from having to manage data that does not really fall within its responsibilities. To allow the client system to use the model without having to manage the extra data, we can add data enrichment capabilities to the model deployment.</p>
<p>Data enrichment simplifies the work of the client system because the client only needs to tell the deployed ML model how to find the correct data. The model deployment is then responsible for fetching the correct record, joining it to the data provided by the client system, and making a prediction. Data enrichment also keeps the client system from having to manage the data needed by the model, which prevents the two systems from becoming too tightly coupled. </p>
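<p>The fetch-join-predict flow just described can be sketched with a minimal wrapper. An in-memory dictionary stands in for the database and the index field name is hypothetical; the real decorator built later in this post issues a query against PostgreSQL instead.</p>

```python
class SimpleEnrichmentDecorator:
    """Illustrative enrichment wrapper: look up a record by the index
    field, merge it with the client's request, and call the model."""

    def __init__(self, model, data_source: dict, index_field: str):
        self._model = model
        self._data_source = data_source   # stand-in for a database table
        self._index_field = index_field

    def predict(self, data: dict):
        # Fetch the record identified by the index field and merge it
        # with the fields the client supplied.
        record = self._data_source[data[self._index_field]]
        enriched = {**data, **record}
        # In this sketch the index field is dropped so the model only
        # receives the fields it expects.
        del enriched[self._index_field]
        return self._model.predict(enriched)
```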
<p>One more benefit of doing data enrichment is that the model can evolve by using new fields for predictions without affecting the client system at all. By having the model access the data that it needs to make a prediction, the model can access new data and the client system is not responsible for providing or managing the new fields. This allows the deployed model to evolve more easily.</p>
<p>In this blog post, we’ll show how to create a simple decorator that is able to access a database in order to do data enrichment for an ML model that is deployed to a production system. We'll also show how to deploy the decorator along with the model to a RESTful service, and how to create the necessary database to hold the data.</p>
<p>All of the code is available in this <a href="https://github.com/schmidtbri/data-enrichment-for-ml-models">github repository</a>.</p>
<h2>Software Architecture</h2>
<p>The decorator that we will be building requires an outside database in order to access data to do data enrichment. The software architecture will be a little more complicated because we’ll have to deploy a service for the model as well as a database for the data.</p>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/software_architecture_defml.png" width="100%"></p>
<p>The client system accesses the model by reaching out to the model service which hosts both the model and the decorator that we will be building in this blog post. The decorator is the software component that does the data enrichment needed by the model. The decorator reaches out to the database to access data needed by the model, provides the data to the model to make a prediction, and then returns the prediction to the client system. </p>
<p>To store the data that we want to use for enrichment, we’ll use a PostgreSQL database.</p>
<h1>Installing a Model</h1>
<p>To make this blog post a little shorter we won't train a completely new model. Instead we'll install a model that we've built in <a href="https://www.tekhnoal.com/regression-model.html">a previous blog post</a>. The code for the model is in <a href="https://github.com/schmidtbri/regression-model">this github repository</a>.</p>
<p>To install the model, we can use the pip command and point it at the github repo of the model.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">regression</span><span class="o">-</span><span class="n">model</span><span class="c1">#egg=insurance_charges_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction with the model, we'll import the model's class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.model</span> <span class="kn">import</span> <span class="n">InsuranceChargesModel</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Now we can instantiate the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction, we'll need to use the model's input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span><span class="p">,</span> \
<span class="n">SexEnum</span><span class="p">,</span> <span class="n">RegionEnum</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span>
<span class="n">age</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">female</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">24.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">northwest</span><span class="p">)</span>
</code></pre></div>
<p>The model's input schema is called InsuranceChargesModelInput and it encompasses all of the features required by the model to make a prediction.</p>
<p>Now we can make a prediction with the model by calling the predict() method with an instance of the InsuranceChargesModelInput class.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=8640.78)
</code></pre></div>
<p>The model predicts that the charges will be $8640.78.</p>
<p>When deploying the model we’ll pretend that the age, sex, bmi, children, smoker, and region fields are not available from the client system that is calling the model. Because of this, we’ll need to add them to the model input by loading the data from the database.</p>
<p>We can view the input schema of the model as a JSON schema document by calling the .schema() method on the input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'InsuranceChargesModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'age'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age of primary beneficiary in years.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sex'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Gender of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/SexEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'bmi'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body Mass Index'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body mass index of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">15.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">50.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'children'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Children'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Number of children covered by health insurance.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether beneficiary is a smoker.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'boolean'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region where beneficiary lives.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/RegionEnum'</span><span class="p">}]}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'SexEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'SexEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'sex' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'male'</span><span class="p">,</span><span class="w"> </span><span class="s1">'female'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'region' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'southwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'southeast'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northeast'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<h2>Creating the Data Enrichment Decorator</h2>
<p>A decorator needs to inherit from the MLModelDecorator base class, which requires that a specific set of methods and properties be implemented. The decorator that can access PostgreSQL looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span>
<span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">create_model</span>
<span class="kn">import</span> <span class="nn">psycopg2</span>
<span class="kn">from</span> <span class="nn">ml_base.decorator</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
<span class="kn">from</span> <span class="nn">ml_base.ml_model</span> <span class="kn">import</span> <span class="n">MLModelSchemaValidationException</span>
<span class="k">class</span> <span class="nc">PostgreSQLEnrichmentDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="sd">"""Decorator to do data enrichment using a PostgreSQL database."""</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">port</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">username</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">password</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">database</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">table</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">index_field_name</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">index_field_type</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
<span class="n">enrichment_fields</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">])</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="c1"># if password has ${}, then replace with environment variable</span>
<span class="k">if</span> <span class="n">password</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"${"</span> <span class="ow">and</span> <span class="n">password</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"}"</span><span class="p">:</span>
<span class="n">password</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="n">password</span><span class="p">[</span><span class="mi">2</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]]</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">port</span><span class="p">,</span> <span class="n">username</span><span class="o">=</span><span class="n">username</span><span class="p">,</span> <span class="n">password</span><span class="o">=</span><span class="n">password</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="n">database</span><span class="p">,</span> <span class="n">table</span><span class="o">=</span><span class="n">table</span><span class="p">,</span> <span class="n">index_field_name</span><span class="o">=</span><span class="n">index_field_name</span><span class="p">,</span>
<span class="n">index_field_type</span><span class="o">=</span><span class="n">index_field_type</span><span class="p">,</span> <span class="n">enrichment_fields</span><span class="o">=</span><span class="n">enrichment_fields</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="n">BaseModel</span><span class="p">:</span>
<span class="c1"># converting the index field type from a string to a class</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">index_field_type</span> <span class="o">=</span> <span class="n">__builtins__</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_type"</span><span class="p">]]</span>
<span class="k">except</span> <span class="ne">TypeError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">index_field_type</span> <span class="o">=</span> <span class="n">__builtins__</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_type"</span><span class="p">]]</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">input_schema</span>
<span class="c1"># adding index field to schema because it is required in order to retrieve</span>
<span class="c1"># the right record in the database</span>
<span class="n">fields</span> <span class="o">=</span> <span class="p">{</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_name"</span><span class="p">]:</span> <span class="p">(</span><span class="n">index_field_type</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">for</span> <span class="n">field_name</span><span class="p">,</span> <span class="n">schema</span> <span class="ow">in</span> <span class="n">input_schema</span><span class="o">.</span><span class="n">__fields__</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
<span class="c1"># remove enrichment_fields from schema because they'll be added from the</span>
<span class="c1"># database and don't need to be provided by the client</span>
<span class="k">if</span> <span class="n">field_name</span> <span class="ow">not</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"enrichment_fields"</span><span class="p">]:</span>
<span class="k">if</span> <span class="n">schema</span><span class="o">.</span><span class="n">required</span><span class="p">:</span>
<span class="n">fields</span><span class="p">[</span><span class="n">field_name</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">schema</span><span class="o">.</span><span class="n">type_</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">fields</span><span class="p">[</span><span class="n">field_name</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">schema</span><span class="o">.</span><span class="n">type_</span><span class="p">,</span> <span class="n">schema</span><span class="o">.</span><span class="n">default</span><span class="p">)</span>
<span class="n">new_input_schema</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">(</span>
<span class="n">input_schema</span><span class="o">.</span><span class="vm">__name__</span><span class="p">,</span>
<span class="o">**</span><span class="n">fields</span>
<span class="p">)</span>
<span class="k">return</span> <span class="n">new_input_schema</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="c1"># create a connection to the database, if it doesn't exist already</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span> <span class="o">=</span> <span class="n">psycopg2</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"host"</span><span class="p">],</span>
<span class="n">port</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"port"</span><span class="p">],</span>
<span class="n">database</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"database"</span><span class="p">],</span>
<span class="n">user</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"username"</span><span class="p">],</span>
<span class="n">password</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"password"</span><span class="p">])</span>
<span class="n">cursor</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span>
<span class="c1"># build a SELECT statement using the index_field and the enrichment_fields</span>
<span class="n">enrichment_fields</span> <span class="o">=</span> <span class="s2">", "</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"enrichment_fields"</span><span class="p">])</span>
<span class="n">sql_statement</span> <span class="o">=</span> <span class="s2">"SELECT </span><span class="si">{}</span><span class="s2"> FROM </span><span class="si">{}</span><span class="s2"> WHERE </span><span class="si">{}</span><span class="s2"> = </span><span class="si">%s</span><span class="s2">;"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
<span class="n">enrichment_fields</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"table"</span><span class="p">],</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_name"</span><span class="p">])</span>
<span class="c1"># executing the SELECT statement</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">sql_statement</span><span class="p">,</span>
<span class="p">(</span><span class="nb">getattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_name"</span><span class="p">]),</span> <span class="p">))</span>
<span class="n">records</span> <span class="o">=</span> <span class="n">cursor</span><span class="o">.</span><span class="n">fetchall</span><span class="p">()</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">records</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Could not find a record for data enrichment."</span><span class="p">)</span>
<span class="k">elif</span> <span class="nb">len</span><span class="p">(</span><span class="n">records</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
<span class="n">record</span> <span class="o">=</span> <span class="n">records</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Query returned more than one record."</span><span class="p">)</span>
<span class="c1"># creating an instance of the model's input schema using the fields that</span>
<span class="c1"># came back from the database and fields that are provided by calling code</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">input_schema</span>
<span class="n">enriched_data</span> <span class="o">=</span> <span class="p">{}</span>
<span class="k">for</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">__fields__</span><span class="o">.</span><span class="n">keys</span><span class="p">():</span>
<span class="k">if</span> <span class="n">field_name</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"index_field_name"</span><span class="p">]:</span>
<span class="k">pass</span>
<span class="k">elif</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"enrichment_fields"</span><span class="p">]:</span>
<span class="n">field_index</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_configuration</span><span class="p">[</span><span class="s2">"enrichment_fields"</span><span class="p">]</span><span class="o">.</span><span class="n">index</span><span class="p">(</span><span class="n">field_name</span><span class="p">)</span>
<span class="n">enriched_data</span><span class="p">[</span><span class="n">field_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">record</span><span class="p">[</span><span class="n">field_index</span><span class="p">]</span>
<span class="k">elif</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="n">data</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span><span class="o">.</span><span class="n">keys</span><span class="p">():</span>
<span class="n">enriched_data</span><span class="p">[</span><span class="n">field_name</span><span class="p">]</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">field_name</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Could not find value for field '</span><span class="si">{}</span><span class="s2">'."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">field_name</span><span class="p">))</span>
<span class="c1"># making a prediction with the model, using the enriched fields</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">enriched_data</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="o">**</span><span class="n">enriched_data</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">ValueError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">raise</span> <span class="n">MLModelSchemaValidationException</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">enriched_data</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
<span class="k">def</span> <span class="fm">__del__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_connection"</span><span class="p">]</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">except</span> <span class="ne">KeyError</span><span class="p">:</span>
<span class="k">pass</span>
</code></pre></div>
<p>The code is quite long, but it is mainly made up of two methods: the <code>input_schema</code> property and the <code>predict</code> method. The <code>input_schema</code> property modifies the model's input schema according to the requirements of the data enrichment we want to do. The <code>predict</code> method is responsible for retrieving the data needed by the model from the database and joining it to the data already provided by the client system.</p>
<p>The <code>__init__()</code> method accepts configuration that is used to customize the way that the decorator finds data in the database. The decorator accepts these parameters:</p>
<ul>
<li>host: hostname for connecting to the database server</li>
<li>port: port for connecting to the database server</li>
<li>username: username for accessing the database</li>
<li>password: password for accessing the database</li>
<li>database: name of the database in which the enrichment data is stored</li>
<li>table: name of the table in the database where data used for enrichment is found</li>
<li>index_field_name: name of the field used for selecting a record</li>
<li>index_field_type: type of the index field</li>
<li>enrichment_fields: names of the fields that will be added to the data sent to the model to make a prediction</li>
</ul>
<p>The configuration is saved by passing it up to the super class using the <code>super().__init__()</code> method. The configuration values can then be accessed inside of the decorator instance in the <code>self._configuration</code> attribute, which is a dictionary.</p>
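<p>The password parameter can also reference an environment variable using the <code>${VAR}</code> syntax, so that secrets don't need to be hard-coded in configuration. The substitution logic from the <code>__init__()</code> method can be sketched on its own (the <code>DB_PASSWORD</code> variable name below is just an example):</p>

```python
import os


def resolve_password(password: str) -> str:
    # passwords of the form "${VAR}" are replaced with the value
    # of the corresponding environment variable
    if password.startswith("${") and password.endswith("}"):
        return os.environ[password[2:-1]]
    return password


os.environ["DB_PASSWORD"] = "s3cret"
print(resolve_password("${DB_PASSWORD}"))  # s3cret
print(resolve_password("plain_password"))  # plain_password
```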
<p>When the decorator is applied to a model, it modifies the input_schema of the model. It removes the enrichment_fields from the input schema because these fields are going to be added from the database. This means that the client does not need to provide values for them anymore. It also adds the index_field to the input schema because the decorator needs to use this field to access the correct record in the database table. The index_field is added as a required field in the model’s input_schema because the decorator always needs it.</p>
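<p>The schema rebuild relies on pydantic's <code>create_model</code> function. As a simplified sketch (the field names here are illustrative), removing an enrichment field and adding the required index field looks like this:</p>

```python
from pydantic import BaseModel, create_model


class InputSchema(BaseModel):
    age: int    # will be enriched from the database
    bmi: float  # still provided by the client


# drop "age" from the schema and add the required index field "ssn"
fields = {"ssn": (str, ...), "bmi": (float, ...)}
NewInputSchema = create_model("InputSchema", **fields)

instance = NewInputSchema(ssn="123-45-6789", bmi=22.5)
print(instance.ssn)  # 123-45-6789
```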
<p>When a prediction request is made to the decorator, it uses the value in the index_field to access the record in the database table. If the decorator finds the record in the table, it selects the enrichment fields, creates a new input object for the model, and sends it to the model. If the record is not found, or if more than one record is returned from the database, the decorator raises an exception. The index_field itself is not sent to the model at all; it is used purely to access the data needed by the model in the database.</p>
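<p>The merge step can be illustrated in isolation. Enrichment fields are taken from the database record by position, and the remaining fields are taken from the client request (the field names and values below are made up for the example):</p>

```python
enrichment_fields = ["age", "sex"]
record = (65, "male")  # row returned by the SELECT, in enrichment_fields order
client_data = {"ssn": "123-45-6789", "bmi": 22.5, "children": 1}
model_fields = ["age", "sex", "bmi", "children"]

enriched_data = {}
for field_name in model_fields:
    if field_name in enrichment_fields:
        # value comes from the database record
        enriched_data[field_name] = record[enrichment_fields.index(field_name)]
    else:
        # value comes from the client request ("ssn" is never passed along)
        enriched_data[field_name] = client_data[field_name]

print(enriched_data)
# {'age': 65, 'sex': 'male', 'bmi': 22.5, 'children': 1}
```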
<p>The SQL statement is built dynamically based on the fields required by the model and the index field selected through configuration. For example, if we wanted to do enrichment with all of the input fields of the InsuranceChargesModel, the SELECT statement would look like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">SELECT</span><span class="w"> </span><span class="n">age</span><span class="p">,</span><span class="w"> </span><span class="n">sex</span><span class="p">,</span><span class="w"> </span><span class="n">bmi</span><span class="p">,</span><span class="w"> </span><span class="n">children</span><span class="p">,</span><span class="w"> </span><span class="n">smoker</span><span class="p">,</span><span class="w"> </span><span class="n">region</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="n">clients</span><span class="w"></span>
<span class="k">WHERE</span><span class="w"> </span><span class="n">ssn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'123-45-6789'</span><span class="w"></span>
</code></pre></div>
<p>In this case we would be accessing a client record by using their social security number as the index field.</p>
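<p>The statement itself is assembled with simple string formatting for the field, table, and index names, while the index value is passed separately as a query parameter so that psycopg2 can escape it safely. A minimal sketch of the string building:</p>

```python
enrichment_fields = ["age", "sex", "bmi", "children", "smoker", "region"]
table = "clients"
index_field_name = "ssn"

# field, table, and index names come from trusted configuration;
# the index *value* is left as a %s placeholder for psycopg2 to fill in
field_list = ", ".join(enrichment_fields)
sql_statement = "SELECT {} FROM {} WHERE {} = %s;".format(
    field_list, table, index_field_name)

print(sql_statement)
# SELECT age, sex, bmi, children, smoker, region FROM clients WHERE ssn = %s;
```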
<h2>Decorating the Model</h2>
<p>To test out the decorator we’ll first instantiate the model object that we want to use with the decorator.</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
</code></pre></div>
<p>Next, we’ll instantiate the decorator with the parameters.</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">PostgreSQLEnrichmentDecorator</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">username</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">table</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">index_field_name</span><span class="o">=</span><span class="s2">"ssn"</span><span class="p">,</span>
<span class="n">index_field_type</span><span class="o">=</span><span class="s2">"str"</span><span class="p">,</span>
<span class="n">enrichment_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
</code></pre></div>
<p>We won't fill in the database connection details because we don't have a database to connect to yet. However, we can still see how the model's input and output schemas change because of the decorator. In this example, we'll use a client's social security number to uniquely identify records in the database table.</p>
<p>We can add the model instance to the decorator after it’s been instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>We can see the decorator and the model objects by printing the reference to the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>PostgreSQLEnrichmentDecorator(InsuranceChargesModel)
</code></pre></div>
<p>The decorator object prints out its own type along with the type of the model that it is decorating.</p>
<p>Now we’ll try to use the decorator and the model together by doing a few things. First, we’ll look at the model input schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'title': 'InsuranceChargesModelInput',
'type': 'object',
'properties': {'ssn': {'title': 'Ssn', 'type': 'string'}},
'required': ['ssn']}
</code></pre></div>
<p>As we can see, the input schema is no longer the same as what the model exposed: all of the model's input fields have been removed because they are now provided by the decorator from the database. The user of the model is not expected to provide values for those fields. However, there is a new field in the schema, the "ssn" field, which the decorator uses to select the correct record in the database.</p>
<p>We can also use a few fields from the database and require the client to provide the rest. To do this we'll instantiate the decorator with a few, but not all, of the fields required by the model as enrichment fields.</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">PostgreSQLEnrichmentDecorator</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">username</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">table</span><span class="o">=</span><span class="s2">""</span><span class="p">,</span>
<span class="n">index_field_name</span><span class="o">=</span><span class="s2">"ssn"</span><span class="p">,</span>
<span class="n">index_field_type</span><span class="o">=</span><span class="s2">"str"</span><span class="p">,</span>
<span class="n">enrichment_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">decorated_model</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'title': 'InsuranceChargesModelInput',
'type': 'object',
'properties': {'ssn': {'title': 'Ssn', 'type': 'string'},
'bmi': {'title': 'Bmi', 'minimum': 15.0, 'maximum': 50.0, 'type': 'number'},
'children': {'title': 'Children',
'minimum': 0,
'maximum': 5,
'type': 'integer'}},
'required': ['ssn']}
</code></pre></div>
<p>The model's input schema now requires the fields that are not listed as enrichment fields to be provided by the client. The "ssn" field is still added because the decorator needs it in order to retrieve the enrichment fields from the database.</p>
<p>Next, we’ll look at the decorated model’s output schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">output_schema</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
<span class="n">output_schema</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">InsuranceChargesModelOutput</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s2">"</span><span class="s">Schema for output of the model's predict method.</span><span class="s2">"</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">object</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">properties</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">charges</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Charges</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Individual medical costs billed by health insurance to customer in US dollars.</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">number</span><span class="s1">'</span>}}}
</code></pre></div>
<p>The output schema has not changed at all; the decorator does not modify the prediction or the schema of the prediction returned by the model.</p>
<h2>Creating a Database</h2>
<p>Now that we have a model and a decorator that can add data to the input of the model, we need a database table to pull data from. To do this we’ll first start a PostgreSQL instance in a local Docker container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">--</span><span class="n">name</span> <span class="n">postgres</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">5432</span><span class="p">:</span><span class="mi">5432</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">POSTGRES_USER</span><span class="o">=</span><span class="n">data_enrichment_user</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">POSTGRES_PASSWORD</span><span class="o">=</span><span class="n">data_enrichment_password</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">POSTGRES_DB</span><span class="o">=</span><span class="n">data_enrichment</span> \
<span class="o">-</span><span class="n">d</span> <span class="n">postgres</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">695889</span><span class="n">c4c39617d44b158d7307d431180b1358e62ad07bdf26347a85f725468e</span><span class="w"></span>
</code></pre></div>
<p>We can connect to the database by running the psql client in a separate container on the host network and executing a SQL statement.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">network</span><span class="o">=</span><span class="s2">"host"</span> <span class="n">postgres</span> \
<span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">data_enrichment_user</span><span class="p">:</span><span class="n">data_enrichment_password</span><span class="o">@</span><span class="mf">127.0.0.1</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">data_enrichment</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"SELECT current_database();"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> current_database
------------------
data_enrichment
(1 row)
</code></pre></div>
<p>The current database within the server is called "data_enrichment" and it was created when the container started.</p>
<p>Next we'll execute a SQL statement that creates a table within the database.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">network</span><span class="o">=</span><span class="s2">"host"</span> <span class="n">postgres</span> \
<span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">data_enrichment_user</span><span class="p">:</span><span class="n">data_enrichment_password</span><span class="o">@</span><span class="mf">127.0.0.1</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">data_enrichment</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"CREATE TABLE clients ( </span><span class="se">\</span>
<span class="s2"> ssn varchar(11) PRIMARY KEY, </span><span class="se">\</span>
<span class="s2"> first_name varchar(30) NOT NULL, </span><span class="se">\</span>
<span class="s2"> last_name varchar(30) NOT NULL, </span><span class="se">\</span>
<span class="s2"> age integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> sex varchar(6) NOT NULL, </span><span class="se">\</span>
<span class="s2"> bmi integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> children integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> smoker boolean NOT NULL, </span><span class="se">\</span>
<span class="s2"> region varchar(10) NOT NULL </span><span class="se">\</span>
<span class="s2">);"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CREATE TABLE
</code></pre></div>
<p>The table has been created. We can view its schema like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">network</span> <span class="n">host</span> <span class="n">postgres</span> \
<span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">data_enrichment_user</span><span class="p">:</span><span class="n">data_enrichment_password</span><span class="o">@</span><span class="mf">127.0.0.1</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">data_enrichment</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"\d clients"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="w"> </span><span class="n">Table</span><span class="w"> </span><span class="s2">"public.clients"</span><span class="w"></span>
<span class="w"> </span><span class="n">Column</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Type</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Collation</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Nullable</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Default</span><span class="w"> </span>
<span class="o">------------+-----------------------+-----------+----------+---------</span><span class="w"></span>
<span class="w"> </span><span class="n">ssn</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">character</span><span class="w"> </span><span class="n">varying</span><span class="p">(</span><span class="mi">11</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">first_name</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">character</span><span class="w"> </span><span class="n">varying</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">last_name</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">character</span><span class="w"> </span><span class="n">varying</span><span class="p">(</span><span class="mi">30</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">age</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">integer</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">sex</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">character</span><span class="w"> </span><span class="n">varying</span><span class="p">(</span><span class="mi">6</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">bmi</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">integer</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">children</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">integer</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">smoker</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">boolean</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">character</span><span class="w"> </span><span class="n">varying</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="nb nb-Type">null</span><span class="w"> </span><span class="o">|</span><span class="w"> </span>
<span class="n">Indexes</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="s2">"clients_pkey"</span><span class="w"> </span><span class="n">PRIMARY</span><span class="w"> </span><span class="n">KEY</span><span class="p">,</span><span class="w"> </span><span class="n">btree</span><span class="w"> </span><span class="p">(</span><span class="n">ssn</span><span class="p">)</span><span class="w"></span>
</code></pre></div>
<p>The table has columns for all of the fields that the model requires to make a prediction, plus two columns for the client's first and last name. It also has a primary key column called "ssn" because we'll be referencing each record by a fake Social Security number. The ssn field uniquely identifies each record and is a convenient key for correlating data across systems.</p>
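<p>To make the lookup concrete, here is a sketch of how an enrichment query could be built from the table name, index field, and enrichment fields. The helper name is hypothetical and the decorator's internals may differ; the important detail is that the lookup value goes in as a <code>%s</code> parameter rather than being interpolated into the SQL string.</p>

```python
def build_enrichment_query(table, index_field, enrichment_fields):
    """Build a parameterized SELECT for fetching enrichment fields by key.

    The %s placeholder keeps the lookup value out of the SQL string,
    which is how psycopg2 expects query parameters to be passed.
    """
    columns = ", ".join(enrichment_fields)
    return f"SELECT {columns} FROM {table} WHERE {index_field} = %s;"


query = build_enrichment_query(
    "clients", "ssn", ["age", "sex", "bmi", "children", "smoker", "region"])
print(query)
# SELECT age, sex, bmi, children, smoker, region FROM clients WHERE ssn = %s;
```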
<p>Next we'll run some code that connects to the database and inserts fake data into the table. To do this we'll use the Faker package, which we need to install first.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">Faker</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To populate the table, we'll generate a value for each column in the database table and save the records into a list.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">faker</span> <span class="kn">import</span> <span class="n">Faker</span>
<span class="n">fake</span> <span class="o">=</span> <span class="n">Faker</span><span class="p">()</span>
<span class="n">records</span> <span class="o">=</span> <span class="nb">list</span><span class="p">()</span>
<span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span>
<span class="n">sex</span> <span class="o">=</span> <span class="n">fake</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"male"</span><span class="p">,</span> <span class="s2">"female"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">record</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"ssn"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">ssn</span><span class="p">(),</span>
<span class="s2">"age"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">80</span><span class="p">),</span>
<span class="s2">"sex"</span><span class="p">:</span> <span class="n">sex</span><span class="p">,</span>
<span class="s2">"bmi"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">50</span><span class="p">),</span>
<span class="s2">"children"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">random_int</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="mi">5</span><span class="p">),</span>
<span class="s2">"smoker"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">boolean</span><span class="p">(),</span>
<span class="s2">"region"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">random_choices</span><span class="p">(</span><span class="n">elements</span><span class="o">=</span><span class="p">(</span><span class="s2">"southwest"</span><span class="p">,</span> <span class="s2">"southeast"</span><span class="p">,</span> <span class="s2">"northwest"</span><span class="p">,</span> <span class="s2">"northeast"</span><span class="p">),</span> <span class="n">length</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">],</span>
<span class="s2">"first_name"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">first_name_male</span><span class="p">()</span> <span class="k">if</span> <span class="n">sex</span> <span class="o">==</span><span class="s2">"male"</span> <span class="k">else</span> <span class="n">fake</span><span class="o">.</span><span class="n">first_name_female</span><span class="p">(),</span>
<span class="s2">"last_name"</span><span class="p">:</span> <span class="n">fake</span><span class="o">.</span><span class="n">last_name</span><span class="p">()</span>
<span class="p">}</span>
<span class="n">records</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">record</span><span class="p">)</span>
</code></pre></div>
<p>Notice that some fields generate data that does not necessarily fit the model's schema. For example, the maximum value the model allows for the "age" field is 65, but the fake data goes up to 80. We'll use records that don't match the model's schema to test the decorator's error handling later.</p>
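<p>A quick way to see which records fall outside the model's range is a simple check against the age bound. This is just a sketch with two hard-coded sample records standing in for the Faker output; the 65-year limit comes from the model's schema described above.</p>

```python
# Hypothetical sample records; in the notebook these come from Faker.
records = [
    {"ssn": "646-87-1351", "age": 31},
    {"ssn": "361-47-3850", "age": 72},
]


def fits_model_schema(record, max_age=65):
    # The model rejects ages above 65, so fake ages of 66-80 will
    # fail schema validation when they reach the model.
    return record["age"] <= max_age


invalid = [r for r in records if not fits_model_schema(r)]
print(len(invalid))  # one of the two sample records exceeds the limit
```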
<p>Let's take a look at the first record that matches the model schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">valid_record</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">record</span> <span class="k">for</span> <span class="n">record</span> <span class="ow">in</span> <span class="n">records</span> <span class="k">if</span> <span class="n">record</span><span class="p">[</span><span class="s2">"age"</span><span class="p">]</span> <span class="o"><=</span> <span class="mi">65</span><span class="p">)</span>
<span class="n">valid_record</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{'</span><span class="n">ssn</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="mh">646</span><span class="o">-</span><span class="mh">87</span><span class="o">-</span><span class="mh">1351</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">age</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">31</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">sex</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">female</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">bmi</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">31</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">children</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">1</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">smoker</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="n">False</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">region</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">northeast</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">first_name</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">Vickie</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">last_name</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">Anderson</span><span class="p">'}</span><span class="w"></span>
</code></pre></div>
<p>Now let's find a record that does not fit the model's schema so we can use it later:</p>
<div class="highlight"><pre><span></span><code><span class="n">invalid_record</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">record</span> <span class="k">for</span> <span class="n">record</span> <span class="ow">in</span> <span class="n">records</span> <span class="k">if</span> <span class="n">record</span><span class="p">[</span><span class="s2">"age"</span><span class="p">]</span> <span class="o">></span> <span class="mi">65</span><span class="p">)</span>
<span class="n">invalid_record</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{'</span><span class="n">ssn</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="mh">361</span><span class="o">-</span><span class="mh">47</span><span class="o">-</span><span class="mh">3850</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">age</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">72</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">sex</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">male</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">bmi</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">34</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">children</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mh">4</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">smoker</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="n">False</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">region</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">northeast</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">first_name</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">Michael</span><span class="p">',</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">last_name</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="p">'</span><span class="n">Pena</span><span class="p">'}</span><span class="w"></span>
</code></pre></div>
<p>We'll save the SSNs of these two records so we can test the decorator's error handling later.</p>
<div class="highlight"><pre><span></span><code><span class="n">valid_ssn</span> <span class="o">=</span> <span class="n">valid_record</span><span class="p">[</span><span class="s2">"ssn"</span><span class="p">]</span>
<span class="n">invalid_ssn</span> <span class="o">=</span> <span class="n">invalid_record</span><span class="p">[</span><span class="s2">"ssn"</span><span class="p">]</span>
</code></pre></div>
<p>Next, we'll insert the 1000 generated records into the database table that we created above.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">psycopg2</span>

<span class="n">connection</span> <span class="o">=</span> <span class="n">psycopg2</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">"5432"</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">"data_enrichment"</span><span class="p">,</span>
<span class="n">user</span><span class="o">=</span><span class="s2">"data_enrichment_user"</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">"data_enrichment_password"</span><span class="p">)</span>
<span class="n">cursor</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span>
<span class="k">for</span> <span class="n">record</span> <span class="ow">in</span> <span class="n">records</span><span class="p">:</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"INSERT INTO clients (ssn, first_name, last_name, age, sex, bmi, children, smoker, region)"</span>
<span class="s2">"VALUES (</span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">);"</span><span class="p">,</span>
<span class="p">(</span><span class="n">record</span><span class="p">[</span><span class="s2">"ssn"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"first_name"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"last_name"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"age"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"sex"</span><span class="p">],</span>
<span class="n">record</span><span class="p">[</span><span class="s2">"bmi"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"children"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"smoker"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"region"</span><span class="p">]))</span>
<span class="n">connection</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div>
<p>The database table is now populated with records that we can use to try out the decorated model.</p>
<p>We'll query a few records to look at the data:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">network</span> <span class="n">host</span> <span class="n">postgres</span> \
<span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">data_enrichment_user</span><span class="p">:</span><span class="n">data_enrichment_password</span><span class="o">@</span><span class="mf">127.0.0.1</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">data_enrichment</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"SELECT ssn, first_name, last_name FROM clients LIMIT 5;"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> ssn | first_name | last_name
-------------+------------+-----------
646-87-1351 | Vickie | Anderson
194-94-3733 | Patricia | Lee
709-08-5148 | Seth | James
132-30-5594 | Edward | Allen
096-55-1187 | Mark | Keith
(5 rows)
</code></pre></div>
<h2>Trying out the Decorator</h2>
<p>Now that we have some data in the database, we can try to make predictions with the decorated model.</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">PostgreSQLEnrichmentDecorator</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">"5432"</span><span class="p">,</span>
<span class="n">username</span><span class="o">=</span><span class="s2">"data_enrichment_user"</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">"data_enrichment_password"</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">"data_enrichment"</span><span class="p">,</span>
<span class="n">table</span><span class="o">=</span><span class="s2">"clients"</span><span class="p">,</span>
<span class="n">index_field_name</span><span class="o">=</span><span class="s2">"ssn"</span><span class="p">,</span>
<span class="n">index_field_type</span><span class="o">=</span><span class="s2">"str"</span><span class="p">,</span>
<span class="n">enrichment_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">decorated_model</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">ssn</span><span class="o">=</span><span class="n">valid_ssn</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=6416.86)
</code></pre></div>
<p>We provided a value for the ssn field and the decorator was able to retrieve the values for the other fields for the model to use.</p>
<p>Next, we'll see what happens when we try to do data enrichment with a record that does not exist in the database.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">ssn</span><span class="o">=</span><span class="s2">"123-45-6789"</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">ValueError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="nv">Could</span> <span class="nv">not</span> <span class="nv">find</span> <span class="nv">a</span> <span class="nv">record</span> <span class="k">for</span> <span class="nv">data</span> <span class="nv">enrichment</span>.
</code></pre></div>
<p>The decorator raised a ValueError exception because it could not find the needed record.</p>
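<p>The error-handling behavior can be sketched as: run the lookup, and raise a ValueError when no row comes back. The function below is hypothetical and uses an in-memory sqlite3 database in place of PostgreSQL, but it mirrors the "missing record is an error, not a silent None" behavior shown above.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (ssn TEXT PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO clients VALUES ('646-87-1351', 31)")


def fetch_enrichment(conn, ssn):
    """Look up the enrichment record, raising if it does not exist."""
    row = conn.execute(
        "SELECT age FROM clients WHERE ssn = ?", (ssn,)).fetchone()
    if row is None:
        # Mirrors the decorator: fail loudly instead of predicting
        # with missing enrichment data.
        raise ValueError("Could not find a record for data enrichment.")
    return {"age": row[0]}


try:
    fetch_enrichment(conn, "123-45-6789")
except ValueError as e:
    message = str(e)
print(message)  # Could not find a record for data enrichment.
```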
<p>We can also leave some fields for the client of the model to provide and pull all other fields from the database. We just need to instantiate the decorator a little differently.</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">PostgreSQLEnrichmentDecorator</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">"5432"</span><span class="p">,</span>
<span class="n">username</span><span class="o">=</span><span class="s2">"data_enrichment_user"</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">"data_enrichment_password"</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">"data_enrichment"</span><span class="p">,</span>
<span class="n">table</span><span class="o">=</span><span class="s2">"clients"</span><span class="p">,</span>
<span class="n">index_field_name</span><span class="o">=</span><span class="s2">"ssn"</span><span class="p">,</span>
<span class="n">index_field_type</span><span class="o">=</span><span class="s2">"str"</span><span class="p">,</span>
<span class="n">enrichment_fields</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">decorated_model</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">set_model</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>To see which fields are now required by the model, we'll take a look at the input schema of the decorated model.</p>
<div class="highlight"><pre><span></span><code><span class="n">input_schema</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
<span class="n">input_schema</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'title': 'InsuranceChargesModelInput',
'type': 'object',
'properties': {'ssn': {'title': 'Ssn', 'type': 'string'},
'children': {'title': 'Children',
'minimum': 0,
'maximum': 5,
'type': 'integer'},
'smoker': {'title': 'Smoker', 'type': 'boolean'}},
'required': ['ssn']}
</code></pre></div>
<p>The decorator has removed the age, sex, bmi, and region fields from the input schema. It has left the smoker and children fields in place, and it has added the ssn field as we expected.</p>
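<p>The schema rewriting can be pictured as plain JSON-schema surgery: drop the enrichment fields from <code>properties</code> and <code>required</code>, then add the index field as a required string. This sketch operates on a dict rather than the decorator's actual pydantic models, and the helper name is hypothetical.</p>

```python
def rewrite_input_schema(schema, enrichment_fields, index_field="ssn"):
    """Remove enrichment fields from a JSON schema and require the index field."""
    props = {k: v for k, v in schema["properties"].items()
             if k not in enrichment_fields}
    props[index_field] = {"title": index_field.capitalize(), "type": "string"}
    required = [f for f in schema.get("required", [])
                if f not in enrichment_fields]
    if index_field not in required:
        required.append(index_field)
    return {**schema, "properties": props, "required": required}


original = {
    "title": "InsuranceChargesModelInput",
    "type": "object",
    "properties": {
        "age": {"type": "integer"},
        "children": {"type": "integer"},
        "smoker": {"type": "boolean"},
    },
    "required": ["age"],
}
rewritten = rewrite_input_schema(original, ["age"])
print(sorted(rewritten["properties"]))  # ['children', 'smoker', 'ssn']
```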
<p>Now we can try the decorator with this new input schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">ssn</span><span class="o">=</span><span class="n">valid_ssn</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=6123.85)
</code></pre></div>
<p>The decorator was able to bring in the values for the missing fields from the database and join them to the fields provided by the client in order to make a prediction. </p>
<p>Lastly, we'll select a client record in the database that does not meet the schema requirements of the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_input</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">ssn</span><span class="o">=</span><span class="n">invalid_ssn</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="k">except</span> <span class="n">MLModelSchemaValidationException</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">1</span><span class="w"> </span><span class="nb">val</span><span class="n">idation</span><span class="w"> </span><span class="n">error</span><span class="w"> </span><span class="kr">for</span><span class="w"> </span><span class="n">InsuranceChargesModelInput</span><span class="w"></span>
<span class="n">age</span><span class="w"></span>
<span class="w"> </span><span class="n">ensure</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="nb">val</span><span class="n">ue</span><span class="w"> </span><span class="n">is</span><span class="w"> </span><span class="n">less</span><span class="w"> </span><span class="n">than</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">equal</span><span class="w"> </span><span class="kr">to</span><span class="w"> </span><span class="mf">65</span><span class="w"> </span><span class="p">(</span><span class="n">type</span><span class="o">=</span><span class="nb">val</span><span class="n">ue_error</span><span class="mf">.</span><span class="n">number</span><span class="mf">.</span><span class="ow">not</span><span class="n">_le</span><span class="p">;</span><span class="w"> </span><span class="n">limit_value</span><span class="o">=</span><span class="mf">65</span><span class="p">)</span><span class="w"></span>
</code></pre></div>
<p>Because we put some records in the database that do not meet the model's input schema, a validation error was raised inside of the decorator instance. The record had an age value above 65, which is outside the range the model can predict for.</p>
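<p>The key point is that enriched values pass through the same validation as client-provided values. A minimal sketch of that re-validation step is below; the <code>validate_record</code> helper is hypothetical, and the age lower bound of 18 is an assumption (only the upper bound of 65 appears in the error above):</p>

```python
# Hypothetical re-validation step: even though the enriched values come
# from our own database, they still pass through the model's input
# schema, so out-of-range records fail just like bad client input would.
CONSTRAINTS = {
    "age": (18, 65),       # upper bound from the validation error above
    "children": (0, 5),    # bounds from the input schema shown earlier
}


def validate_record(record):
    """Return a list of validation error messages for an enriched record."""
    errors = []
    for field, (low, high) in CONSTRAINTS.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            errors.append(f"{field}: ensure this value is between {low} and {high}")
    return errors
```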
<h2>Adding a Decorator to a Deployed Model</h2>
<p>Now that we have a model and a decorator, we can combine them together into a service that is able to make predictions and also do data enrichment. To do this, we won't need to write any extra code, we can leverage the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a> to provide the RESTful API for the service. You can learn more about the package in <a href="https://www.tekhnoal.com/rest-model-service.html">this blog post</a>.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To create a service for our model, all we need to do is add a YAML configuration file to the project. The configuration file looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">data_enrichment.postgresql.PostgreSQLEnrichmentDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="s">"localhost"</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s">"5432"</span><span class="w"></span>
<span class="w"> </span><span class="nt">username</span><span class="p">:</span><span class="w"> </span><span class="s">"data_enrichment_user"</span><span class="w"></span>
<span class="w"> </span><span class="nt">password</span><span class="p">:</span><span class="w"> </span><span class="s">"data_enrichment_password"</span><span class="w"></span>
<span class="w"> </span><span class="nt">database</span><span class="p">:</span><span class="w"> </span><span class="s">"data_enrichment"</span><span class="w"></span>
<span class="w"> </span><span class="nt">table</span><span class="p">:</span><span class="w"> </span><span class="s">"clients"</span><span class="w"></span>
<span class="w"> </span><span class="nt">index_field_name</span><span class="p">:</span><span class="w"> </span><span class="s">"ssn"</span><span class="w"></span>
<span class="w"> </span><span class="nt">index_field_type</span><span class="p">:</span><span class="w"> </span><span class="s">"str"</span><span class="w"></span>
<span class="w"> </span><span class="nt">enrichment_fields</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"age"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"sex"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"bmi"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"children"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"smoker"</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="s">"region"</span><span class="w"></span>
</code></pre></div>
<p>The service_title field is the name of the service as it will appear in the documentation. The models field is an array that contains the details of the models we would like to deploy in the service. The class_path field points at the MLModel class that implements the model's prediction logic. The decorators field contains the details of the decorators that we want to attach to the model instance. In this case, we're using the PostgreSQLEnrichmentDecorator class with the same configuration we used for local testing.</p>
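<p>Under the hood, a class_path string like the one in the configuration can be turned into a class with a dynamic import. This is a sketch of the mechanism, not the rest_model_service package's actual code:</p>

```python
import importlib


def load_class(class_path):
    """Import a class from a dotted path like 'package.module.ClassName'."""
    module_path, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# The service could then instantiate each decorator with its configuration
# mapping and wrap the model, roughly like:
#   decorator_class = load_class(
#       "data_enrichment.postgresql.PostgreSQLEnrichmentDecorator")
#   decorator = decorator_class(**configuration)
#   decorator.set_model(model)
```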
<p>Using the configuration file, we're able to create an OpenAPI specification file for the model service by executing these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>./configuration/rest_config.yaml
generate_openapi --output_file<span class="o">=</span><span class="s2">"service_contract.yaml"</span>
</code></pre></div>
<p>The service_contract.yaml file will be generated and it will contain the specification that was generated for the model service. The insurance_charges_model endpoint is the one we'll call to make predictions with the model. The model's input and output schemas were automatically extracted and added to the specification. If you inspect the contract, you'll find that the enrichment fields are not part of the input schema because they are being removed by the enrichment decorator. The ssn field has been added to the contract because it is needed to do data enrichment.</p>
<p>To run the service locally, execute this command:</p>
<div class="highlight"><pre><span></span><code>uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should come up and can be accessed in a web browser at http://127.0.0.1:8000. When you access that URL you will be redirected to the documentation page that is generated by the FastAPI package:</p>
<p><img alt="FastAPI Documnetation" src="https://www.tekhnoal.com/fastapi_documentation_defml.png" width="100%"></p>
<p>The documentation allows you to make requests against the API in order to try it out. Here's a prediction request against the insurance charges model:</p>
<p><img alt="Prediction Request" src="https://www.tekhnoal.com/prediction_request_defml.png" width="100%"></p>
<p>And the prediction result:</p>
<p><img alt="Prediction Result" src="https://www.tekhnoal.com/prediction_result_defml.png" width="100%"></p>
<p>By using the MLModel base class provided by the ml_base package and the REST service framework provided by the rest_model_service package we're able to quickly stand up a service to host the model. The decorator that we want to test can also be added to the model through configuration, including all of its parameters.</p>
<h2>Deploying the Model</h2>
<p>Now that we have a working model and model service, we'll need to deploy it somewhere. We'll start by deploying the service locally. Once we have the service and database working locally, we'll deploy everything to the cloud using DigitalOcean's managed Kubernetes service.</p>
<h3>Creating a Docker Image</h3>
<p>Before moving forward, let's create a docker image and run it locally. The docker image is generated using instructions in the Dockerfile:</p>
<div class="highlight"><pre><span></span><code><span class="k">FROM</span><span class="w"> </span><span class="s">python:3.9-slim</span>
<span class="k">MAINTAINER</span><span class="w"> </span><span class="s">Brian Schmidt "6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">./service</span>
<span class="k">RUN</span><span class="w"> </span>apt-get update
<span class="k">RUN</span><span class="w"> </span>apt-get --assume-yes install git
<span class="k">COPY</span><span class="w"> </span>./data_enrichment ./data_enrichment
<span class="k">COPY</span><span class="w"> </span>./configuration ./configuration
<span class="k">COPY</span><span class="w"> </span>./LICENSE ./LICENSE
<span class="k">COPY</span><span class="w"> </span>./requirements.txt ./requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r requirements.txt
<span class="k">CMD</span><span class="w"> </span><span class="p">[</span><span class="s2">"uvicorn"</span><span class="p">,</span><span class="w"> </span><span class="s2">"rest_model_service.main:app"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--host"</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w"> </span><span class="s2">"--port"</span><span class="p">,</span><span class="w"> </span><span class="s2">"8000"</span><span class="p">]</span>
</code></pre></div>
<p>The Dockerfile is used by this command to create a docker image:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">build</span> <span class="o">-</span><span class="n">t</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="o">..</span>\
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the docker images in our system:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">image</span> <span class="n">ls</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">insurance_charges_model_service</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>insurance_charges_model_service 0.1.0 f5b85418ebc7 2 days ago 1.53GB
</code></pre></div>
<p>The insurance_charges_model_service image is listed. Next, we'll run a container from the image to see if everything is working as expected. However, we first need to connect the docker containers to the same network. </p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">create</span> <span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>bcfa5ed0334b609c6f553caac67375c0571438f4541d75d63be79638a6e300f7
</code></pre></div>
<p>Next, we'll connect the running postgres image to the network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">connect</span> <span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="n">network</span> <span class="n">postgres</span>
</code></pre></div>
<p>Now we can start the service docker image connected to the same network as the postgres container.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">d</span> \
<span class="o">-</span><span class="n">p</span> <span class="mi">8000</span><span class="p">:</span><span class="mi">8000</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="n">network</span> \
<span class="o">-</span><span class="n">e</span> <span class="n">REST_CONFIG</span><span class="o">=./</span><span class="n">configuration</span><span class="o">/</span><span class="n">local_rest_config</span><span class="o">.</span><span class="n">yaml</span> \
<span class="o">--</span><span class="n">name</span> <span class="n">insurance_charges_model_service</span> \
<span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">6</span><span class="n">e1bc98063053f9260e078fb4bef3e36637bb84e73b04441791e2c75fd0ad833</span><span class="w"></span>
</code></pre></div>
<p>Notice that we're using a different configuration file that has a different hostname for the postgres instance. The postgres container is not accessible at localhost from inside the docker network, so the configuration uses the hostname "postgres" instead.</p>
<p>The service is also published on port 8000 of localhost, but we'll make a prediction using the curl command running inside of a container connected to the same network:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">run</span> <span class="o">-</span><span class="n">it</span> <span class="o">--</span><span class="n">rm</span> \
<span class="o">--</span><span class="n">net</span> <span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="n">network</span> \
<span class="n">curlimages</span><span class="o">/</span><span class="n">curl</span> \
<span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://insurance_charges_model_service:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{"ssn": "646-87-1351"}'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":6416.86}
</code></pre></div>
<p>The model predicted that the insurance charges will be $6416.86 for the person whose SSN is 646-87-1351.</p>
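<p>For clients written in Python, the request body and response parsing are simple JSON operations. The helper functions below are a hypothetical client sketch that mirrors the curl call above:</p>

```python
import json


def build_prediction_request(ssn, children=None, smoker=None):
    """Build the JSON body for the prediction endpoint; only ssn is required."""
    body = {"ssn": ssn}
    if children is not None:
        body["children"] = children
    if smoker is not None:
        body["smoker"] = smoker
    return json.dumps(body)


def parse_prediction_response(response_text):
    """Extract the predicted charges from the service's JSON response."""
    return json.loads(response_text)["charges"]
```

<p>For example, <code>parse_prediction_response('{"charges":6416.86}')</code> returns the predicted charges as a float, matching the response we received above.</p>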
<p>We're done with the service and the database so we'll shut down the docker containers and the docker network.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">postgres</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">postgres</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">kill</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">rm</span> <span class="n">insurance_charges_model_service</span>
<span class="err">!</span><span class="n">docker</span> <span class="n">network</span> <span class="n">rm</span> <span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="n">network</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>postgres
postgres
insurance_charges_model_service
insurance_charges_model_service
data-enrichment-network
</code></pre></div>
<h3>Setting up Digital Ocean</h3>
<p>In order to deploy the model service to a DigitalOcean kubernetes cluster, we'll need to connect to the DigitalOcean API. </p>
<p>In this section we'll be using the doctl command line utility, which helps us interact with the DigitalOcean Kubernetes service. We followed <a href="https://docs.digitalocean.com/reference/doctl/how-to/install/">these instructions</a> to install the doctl utility. Before we can do anything with the DigitalOcean API, we need to authenticate, so we created an API token by following <a href="https://docs.digitalocean.com/reference/api/create-personal-access-token/">these instructions</a>. Once we have the token, we can add it to the doctl utility by creating a new authentication context with this command:</p>
<div class="highlight"><pre><span></span><code>doctl auth init --context model-services-context
</code></pre></div>
<p>The command asks for the API token we generated on the website and saves it into the tool's configuration file under a new context called "model-services-context", which we'll use to interact with the DigitalOcean API. To make sure that the context was created correctly and is the current context, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">doctl</span> <span class="n">auth</span> <span class="nb">list</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>default
model-services-context (current)
</code></pre></div>
<p>The newly created context should be listed and have "(current)" by its name. If the context we created is not the current context, we can switch to it with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">doctl</span> <span class="n">auth</span> <span class="n">switch</span> <span class="o">--</span><span class="n">context</span> <span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">context</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Now using context [model-services-context] by default
</code></pre></div>
<p>Now that we have the credentials necessary, we can start creating the infrastructure for our deployment.</p>
<h3>Creating the Kubernetes Cluster</h3>
<p>To create the Kubernetes cluster and supporting infrastructure, we'll use <a href="https://www.terraform.io/">Terraform</a>. Terraform is an Infrastructure as Code (IaC) tool that allows us to declaratively define our infrastructure in configuration files, and then create, manage, and destroy it with simple commands. The command line Terraform tool can be installed by following <a href="https://learn.hashicorp.com/tutorials/terraform/install-cli">these instructions</a>.</p>
<p>We won't be doing a deep dive into Terraform for this blog post because it would make the post too long. The Terraform module that we'll apply is in the "terraform" folder of the source code attached to this post. </p>
<p>To begin, we'll switch into the terraform folder and add our API token to an environment variable.</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">cd</span> <span class="o">../</span><span class="n">terraform</span>
<span class="o">%</span><span class="n">env</span> <span class="n">DIGITALOCEAN_TOKEN</span><span class="o">=</span><span class="n">dop_v1_c857bb7bb4bed089000125513c49f642f03401253ec09178c41f94df665312a</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Next, we'll initialize the Terraform environment.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">terraform</span> <span class="n">init</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/kubernetes...
- Finding digitalocean/digitalocean versions matching "~> 2.0"...
- Installing hashicorp/kubernetes v2.11.0...
- Installed hashicorp/kubernetes v2.11.0 (signed by HashiCorp)
- Installing digitalocean/digitalocean v2.19.0...
- Installed digitalocean/digitalocean v2.19.0 (signed by a HashiCorp partner, key ID F82037E524B9C0E8)
...
</code></pre></div>
<p>The terraform environment is now initialized and stored in the terraform folder. We can now create a plan for the deployment of the resources.</p>
<p>The plan command requires an input variable called "project_name", which gives the resources a shared naming convention. We provide the value through the -var command line option.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">terraform</span> <span class="n">plan</span> <span class="o">-</span><span class="n">var</span><span class="o">=</span><span class="s2">"project_name=model-services"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Terraform</span><span class="w"> </span><span class="n">used</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">selected</span><span class="w"> </span><span class="n">providers</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">generate</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">execution</span><span class="w"> </span><span class="n">plan</span><span class="p">.</span><span class="w"></span>
<span class="n">Resource</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="n">are</span><span class="w"> </span><span class="n">indicated</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">symbols:</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">create</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">perform</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">actions:</span><span class="w"></span>
<span class="w"> </span><span class="p">#</span><span class="w"> </span><span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="n">container_registry</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">resource</span><span class="w"> </span><span class="s">"digitalocean_container_registry"</span><span class="w"> </span><span class="s">"container_registry"</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">server_url</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">storage_usage_bytes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">subscription_tier_slug</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"basic"</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="p">#</span><span class="w"> </span><span class="n">digitalocean_container_registry_docker_credentials</span><span class="p">.</span><span class="n">registry_credentials</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">resource</span><span class="w"> </span><span class="s">"digitalocean_container_registry_docker_credentials"</span><span class="w"> </span><span class="s">"registry_credentials"</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">credential_expiration_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">docker_credentials</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">sensitive</span><span class="w"> </span><span class="n">value</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">expiry_seconds</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mh">1576800000</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">registry_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">write</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">true</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
<span class="nl">Plan:</span><span class="w"> </span><span class="mh">5</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">add</span><span class="p">,</span><span class="w"> </span><span class="mh">0</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">change</span><span class="p">,</span><span class="w"> </span><span class="mh">0</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">destroy</span><span class="p">.</span><span class="w"></span>
<span class="n">Changes</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="nl">Outputs:</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">kubernetes_cluster_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">registry_endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="err">───────────────────────────────────────────────────────────────────────────────</span><span class="w"></span>
<span class="nl">Note:</span><span class="w"> </span><span class="n">You</span><span class="w"> </span><span class="n">didn</span><span class="p">'</span><span class="n">t</span><span class="w"> </span><span class="n">use</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="o">-</span><span class="n">out</span><span class="w"> </span><span class="n">option</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">save</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">plan</span><span class="p">,</span><span class="w"> </span><span class="n">so</span><span class="w"> </span><span class="n">Terraform</span><span class="w"> </span><span class="n">can</span><span class="p">'</span><span class="n">t</span><span class="w"></span>
<span class="n">guarantee</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="n">exactly</span><span class="w"> </span><span class="n">these</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">run</span><span class="w"> </span><span class="s">"terraform apply"</span><span class="w"> </span><span class="n">now</span><span class="p">.</span><span class="w"></span>
</code></pre></div>
<p>The output of the plan command gives us a list of the resources that will be created. These resources are:</p>
<ul>
<li>docker registry, used to deploy images to the cluster</li>
<li>docker registry credentials, used to allow access to the images from the cluster</li>
<li>VPC, a private network for the cluster nodes</li>
<li>kubernetes cluster, used to host the services</li>
<li>kubernetes secret, to hold the docker registry credentials so that the cluster can load images from the docker registry</li>
</ul>
<p>We can create the resources with the apply command.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">terraform</span> <span class="n">apply</span> <span class="o">-</span><span class="n">var</span><span class="o">=</span><span class="s2">"project_name=model-services"</span> <span class="o">-</span><span class="n">auto</span><span class="o">-</span><span class="n">approve</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Terraform</span><span class="w"> </span><span class="n">used</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">selected</span><span class="w"> </span><span class="n">providers</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">generate</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">execution</span><span class="w"> </span><span class="n">plan</span><span class="p">.</span><span class="w"></span>
<span class="n">Resource</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="n">are</span><span class="w"> </span><span class="n">indicated</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">symbols:</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">create</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">perform</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">actions:</span><span class="w"></span>
<span class="w"> </span><span class="p">#</span><span class="w"> </span><span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="n">container_registry</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">created</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">resource</span><span class="w"> </span><span class="s">"digitalocean_container_registry"</span><span class="w"> </span><span class="s">"container_registry"</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">server_url</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">storage_usage_bytes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">known</span><span class="w"> </span><span class="n">after</span><span class="w"> </span><span class="n">apply</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">subscription_tier_slug</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"basic"</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
<span class="nl">Outputs:</span><span class="w"></span>
<span class="n">kubernetes_cluster_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"7eda057c-501f-414c-ad36-e4a75feac4e0"</span><span class="w"></span>
<span class="n">registry_endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com/model-services-registry"</span><span class="w"></span>
</code></pre></div>
<p>The Terraform stack returned the ID of the cluster that was created. We'll need this ID later to connect to the cluster.</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">cd</span> <span class="o">..</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="o">/</span><span class="nv">Users</span><span class="o">/</span><span class="nv">brian</span><span class="o">/</span><span class="nv">Code</span><span class="o">/</span><span class="nv">data</span><span class="o">-</span><span class="nv">enrichment</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="nv">ml</span><span class="o">-</span><span class="nv">models</span>
</code></pre></div>
<h3>Pushing the Image</h3>
<p>Now that we have a registry, we need to add its credentials to our local Docker daemon so that we can upload images. To do that, we'll use this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">doctl</span> <span class="n">registry</span> <span class="n">login</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Logging</span><span class="w"> </span><span class="n">Docker</span><span class="w"> </span><span class="n">in</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">registry</span><span class="p">.</span><span class="n">digitalocean</span><span class="p">.</span><span class="n">com</span><span class="w"></span>
</code></pre></div>
<p>In order to upload the image, we need to tag it with the URL of the DigitalOcean registry, which was an output of the Terraform stack we applied above. The docker tag command looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">tag</span> <span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span> <span class="n">registry</span><span class="o">.</span><span class="n">digitalocean</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="o">/</span><span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
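<p>The fully-qualified tag is just the registry endpoint, the image name, and the version joined together. As a quick sketch of the convention (the helper function name is ours, not part of the service code):</p>

```python
def qualified_tag(registry_endpoint: str, image: str, version: str) -> str:
    """Join a registry endpoint, image name, and version into a full Docker tag."""
    return f"{registry_endpoint}/{image}:{version}"

# The tag used in the docker tag command above
print(qualified_tag(
    "registry.digitalocean.com/model-services-registry",
    "insurance_charges_model_service",
    "0.1.0",
))
# registry.digitalocean.com/model-services-registry/insurance_charges_model_service:0.1.0
```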
<p>Now we can push the image to the DigitalOcean docker registry.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">docker</span> <span class="n">push</span> <span class="n">registry</span><span class="o">.</span><span class="n">digitalocean</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="o">/</span><span class="n">insurance_charges_model_service</span><span class="p">:</span><span class="mf">0.1.0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">The</span><span class="w"> </span><span class="n">push</span><span class="w"> </span><span class="n">refers</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">repository</span><span class="w"> </span><span class="p">[</span><span class="n">registry</span><span class="p">.</span><span class="n">digitalocean</span><span class="p">.</span><span class="n">com</span><span class="o">/</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="o">/</span><span class="n">insurance_charges_model_service</span><span class="p">]</span><span class="w"></span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B4e8c730f:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B262abd28:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B103cfdc5:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">Be6d9a4d6:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">Ba89df31c:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B3bc716a2:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">Bb9727396:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B7bf074b6:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B85df8c54:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">Bafbe089a:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B90f11bed:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">1</span><span class="nl">B50a1245f:</span><span class="w"> </span><span class="n">Preparing</span><span class="w"> </span>
<span class="p">[</span><span class="mh">13</span><span class="nl">Be8c730f:</span><span class="w"> </span><span class="n">Pushing</span><span class="w"> </span><span class="mf">426.5</span><span class="n">MB</span><span class="o">/</span><span class="mf">1.3</span><span class="n">GB</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
<h3>Accessing the Kubernetes Cluster</h3>
<p>To access the cluster, doctl provides a command that will configure the kubectl tool for us:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">doctl</span> <span class="n">kubernetes</span> <span class="n">cluster</span> <span class="n">kubeconfig</span> <span class="n">save</span> <span class="mi">7</span><span class="n">eda057c</span><span class="o">-</span><span class="mi">501</span><span class="n">f</span><span class="o">-</span><span class="mi">414</span><span class="n">c</span><span class="o">-</span><span class="n">ad36</span><span class="o">-</span><span class="n">e4a75feac4e0</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">Notice</span><span class="o">:</span><span class="w"> </span><span class="n">Adding</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="n">credentials</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">kubeconfig</span><span class="w"> </span><span class="n">file</span><span class="w"> </span><span class="n">found</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="s2">"/Users/brian/.kube/config"</span><span class="w"></span>
<span class="n">Notice</span><span class="o">:</span><span class="w"> </span><span class="n">Setting</span><span class="w"> </span><span class="n">current</span><span class="o">-</span><span class="n">context</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">do</span><span class="o">-</span><span class="n">nyc1</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">cluster</span><span class="w"></span>
</code></pre></div>
<p>The unique identifier passed to this command is the cluster ID that was output by the Terraform stack. When the command finishes, the current kubectl context should be switched to the newly created cluster. To list the contexts in kubectl, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="n">get</span><span class="o">-</span><span class="n">contexts</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="nv">CURRENT</span> <span class="nv">NAME</span> <span class="nv">CLUSTER</span> <span class="nv">AUTHINFO</span> <span class="nv">NAMESPACE</span>
<span class="o">*</span> <span class="k">do</span><span class="o">-</span><span class="nv">nyc1</span><span class="o">-</span><span class="nv">model</span><span class="o">-</span><span class="nv">services</span><span class="o">-</span><span class="nv">cluster</span> <span class="k">do</span><span class="o">-</span><span class="nv">nyc1</span><span class="o">-</span><span class="nv">model</span><span class="o">-</span><span class="nv">services</span><span class="o">-</span><span class="nv">cluster</span> <span class="k">do</span><span class="o">-</span><span class="nv">nyc1</span><span class="o">-</span><span class="nv">model</span><span class="o">-</span><span class="nv">services</span><span class="o">-</span><span class="nv">cluster</span><span class="o">-</span><span class="nv">admin</span>
<span class="nv">minikube</span> <span class="nv">minikube</span> <span class="nv">minikube</span>
</code></pre></div>
<p>A listing of the contexts currently in the kubectl configuration should appear, and there should be a star next to the new cluster's context. To make sure everything is working we can get a list of the nodes in the cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">nodes</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME STATUS ROLES AGE VERSION
model-services-cluster-worker-pool-crmkf Ready <none> 55m v1.22.8
model-services-cluster-worker-pool-crmkx Ready <none> 55m v1.22.8
model-services-cluster-worker-pool-crmky Ready <none> 55m v1.22.8
</code></pre></div>
<h3>Creating a Kubernetes Namespace</h3>
<p>Now that we have a cluster and are connected to it, we'll create a namespace to hold the resources for our model deployment. The resource definition is in the kubernetes/namespace.yml file. To apply the manifest to the cluster, execute this command:</p>
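<p>The contents of the manifest file are not shown here, but a minimal namespace manifest like the one in kubernetes/namespace.yml would look roughly like this (the name matches the namespace created below; the rest is a sketch):</p>

```yaml
# Minimal Kubernetes Namespace manifest
apiVersion: v1
kind: Namespace
metadata:
  name: model-services
```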
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">create</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace/model-services created
</code></pre></div>
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">namespace</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME STATUS AGE
default Active 164m
kube-node-lease Active 164m
kube-public Active 164m
kube-system Active 164m
model-services Active 2s
</code></pre></div>
<p>The new namespace should appear in the listing along with other namespaces created by default by the system. To use the new namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">config</span> <span class="nb">set</span><span class="o">-</span><span class="n">context</span> <span class="o">--</span><span class="n">current</span> <span class="o">--</span><span class="n">namespace</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="nv">Context</span> <span class="s2">"</span><span class="s">do-nyc1-model-services-cluster</span><span class="s2">"</span> <span class="nv">modified</span>.
</code></pre></div>
<h3>Creating a Database</h3>
<p>To create a PostgreSQL database instance in Kubernetes, we'll use the <a href="https://github.com/bitnami/charts/tree/master/bitnami/postgresql">bitnami helm chart</a>. </p>
<p>Helm charts are packaged applications that can be easily installed on a Kubernetes cluster. To install PostgreSQL we'll first add the bitnami helm repository:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">repo</span> <span class="n">add</span> <span class="n">bitnami</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">charts</span><span class="o">.</span><span class="n">bitnami</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">bitnami</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>"bitnami" has been added to your repositories
</code></pre></div>
<p>Now we can apply the PostgreSQL chart to the current cluster and namespace with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">install</span> <span class="n">postgres</span> <span class="n">bitnami</span><span class="o">/</span><span class="n">postgresql</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">NAME</span><span class="o">:</span><span class="w"> </span><span class="n">postgres</span><span class="w"></span>
<span class="n">LAST</span><span class="w"> </span><span class="n">DEPLOYED</span><span class="o">:</span><span class="w"> </span><span class="n">Sun</span><span class="w"> </span><span class="n">May</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="mi">23</span><span class="o">:</span><span class="mi">36</span><span class="o">:</span><span class="mi">50</span><span class="w"> </span><span class="mi">2022</span><span class="w"></span>
<span class="n">NAMESPACE</span><span class="o">:</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"></span>
<span class="n">STATUS</span><span class="o">:</span><span class="w"> </span><span class="n">deployed</span><span class="w"></span>
<span class="n">REVISION</span><span class="o">:</span><span class="w"> </span><span class="mi">1</span><span class="w"></span>
<span class="n">TEST</span><span class="w"> </span><span class="n">SUITE</span><span class="o">:</span><span class="w"> </span><span class="n">None</span><span class="w"></span>
<span class="n">NOTES</span><span class="o">:</span><span class="w"></span>
<span class="n">CHART</span><span class="w"> </span><span class="n">NAME</span><span class="o">:</span><span class="w"> </span><span class="n">postgresql</span><span class="w"></span>
<span class="n">CHART</span><span class="w"> </span><span class="n">VERSION</span><span class="o">:</span><span class="w"> </span><span class="mf">11.1</span><span class="o">.</span><span class="mi">25</span><span class="w"></span>
<span class="n">APP</span><span class="w"> </span><span class="n">VERSION</span><span class="o">:</span><span class="w"> </span><span class="mf">14.2</span><span class="o">.</span><span class="mi">0</span><span class="w"></span>
<span class="o">**</span><span class="w"> </span><span class="n">Please</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">patient</span><span class="w"> </span><span class="k">while</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">chart</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">being</span><span class="w"> </span><span class="n">deployed</span><span class="w"> </span><span class="o">**</span><span class="w"></span>
<span class="n">PostgreSQL</span><span class="w"> </span><span class="n">can</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">accessed</span><span class="w"> </span><span class="n">via</span><span class="w"> </span><span class="n">port</span><span class="w"> </span><span class="mi">5432</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">DNS</span><span class="w"> </span><span class="n">names</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">within</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">cluster</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="o">.</span><span class="na">model</span><span class="o">-</span><span class="n">services</span><span class="o">.</span><span class="na">svc</span><span class="o">.</span><span class="na">cluster</span><span class="o">.</span><span class="na">local</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">Read</span><span class="o">/</span><span class="n">Write</span><span class="w"> </span><span class="n">connection</span><span class="w"></span>
<span class="n">To</span><span class="w"> </span><span class="kd">get</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">password</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="s2">"postgres"</span><span class="w"> </span><span class="n">run</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">export</span><span class="w"> </span><span class="n">POSTGRES_PASSWORD</span><span class="o">=</span><span class="n">$</span><span class="o">(</span><span class="n">kubectl</span><span class="w"> </span><span class="kd">get</span><span class="w"> </span><span class="n">secret</span><span class="w"> </span><span class="o">--</span><span class="kd">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">jsonpath</span><span class="o">=</span><span class="s2">"{.data.postgres-password}"</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">base64</span><span class="w"> </span><span class="o">--</span><span class="n">decode</span><span class="o">)</span><span class="w"></span>
<span class="n">To</span><span class="w"> </span><span class="n">connect</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">database</span><span class="w"> </span><span class="n">run</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">command</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">kubectl</span><span class="w"> </span><span class="n">run</span><span class="w"> </span><span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="o">-</span><span class="n">client</span><span class="w"> </span><span class="o">--</span><span class="n">rm</span><span class="w"> </span><span class="o">--</span><span class="n">tty</span><span class="w"> </span><span class="o">-</span><span class="n">i</span><span class="w"> </span><span class="o">--</span><span class="n">restart</span><span class="o">=</span><span class="s1">'Never'</span><span class="w"> </span><span class="o">--</span><span class="kd">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="o">--</span><span class="n">image</span><span class="w"> </span><span class="n">docker</span><span class="o">.</span><span class="na">io</span><span class="sr">/bitnami/</span><span class="n">postgresql</span><span class="o">:</span><span class="mf">14.2</span><span class="o">.</span><span class="mi">0</span><span class="o">-</span><span class="n">debian</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="n">r77</span><span class="w"> </span><span class="o">--</span><span class="n">env</span><span class="o">=</span><span class="s2">"PGPASSWORD=$POSTGRES_PASSWORD"</span><span class="w"> </span><span class="o">\</span><span class="w"></span>
<span class="w"> </span><span class="o">--</span><span class="n">command</span><span class="w"> </span><span class="o">--</span><span class="w"> </span><span class="n">psql</span><span class="w"> </span><span class="o">--</span><span class="n">host</span><span class="w"> </span><span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="w"> </span><span class="o">-</span><span class="n">U</span><span class="w"> </span><span class="n">postgres</span><span class="w"> </span><span class="o">-</span><span class="n">d</span><span class="w"> </span><span class="n">postgres</span><span class="w"> </span><span class="o">-</span><span class="n">p</span><span class="w"> </span><span class="mi">5432</span><span class="w"></span>
<span class="w"> </span><span class="o">></span><span class="w"> </span><span class="n">NOTE</span><span class="o">:</span><span class="w"> </span><span class="n">If</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">access</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">container</span><span class="w"> </span><span class="n">using</span><span class="w"> </span><span class="n">bash</span><span class="o">,</span><span class="w"> </span><span class="n">make</span><span class="w"> </span><span class="n">sure</span><span class="w"> </span><span class="n">that</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">execute</span><span class="w"> </span><span class="s2">"/opt/bitnami/scripts/entrypoint.sh /bin/bash"</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">order</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">avoid</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">error</span><span class="w"> </span><span class="s2">"psql: local user with ID 1001} does not exist"</span><span class="w"></span>
<span class="n">To</span><span class="w"> </span><span class="n">connect</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">database</span><span class="w"> </span><span class="n">from</span><span class="w"> </span><span class="n">outside</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">cluster</span><span class="w"> </span><span class="n">execute</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">commands</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">kubectl</span><span class="w"> </span><span class="n">port</span><span class="o">-</span><span class="n">forward</span><span class="w"> </span><span class="o">--</span><span class="kd">namespace</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="w"> </span><span class="n">svc</span><span class="o">/</span><span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="w"> </span><span class="mi">5432</span><span class="o">:</span><span class="mi">5432</span><span class="w"> </span><span class="o">&</span><span class="w"></span>
<span class="w"> </span><span class="n">PGPASSWORD</span><span class="o">=</span><span class="s2">"$POSTGRES_PASSWORD"</span><span class="w"> </span><span class="n">psql</span><span class="w"> </span><span class="o">--</span><span class="n">host</span><span class="w"> </span><span class="mf">127.0</span><span class="o">.</span><span class="mf">0.1</span><span class="w"> </span><span class="o">-</span><span class="n">U</span><span class="w"> </span><span class="n">postgres</span><span class="w"> </span><span class="o">-</span><span class="n">d</span><span class="w"> </span><span class="n">postgres</span><span class="w"> </span><span class="o">-</span><span class="n">p</span><span class="w"> </span><span class="mi">5432</span><span class="w"></span>
</code></pre></div>
<p>The output of the helm chart contains some info about the deployment that we'll need later. The DNS name of the new PostgreSQL service is used in the configuration of the decorator.</p>
<p>We can view the newly created database instance by looking for the pods that are hosting it:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">pods</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
postgres-postgresql-0 1/1 Running 0 104s
</code></pre></div>
<p>To access the database, we'll need to get the password created by the helm chart:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">secret</span> <span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span> <span class="o">-</span><span class="n">o</span> <span class="n">jsonpath</span><span class="o">=</span><span class="s2">"{.data.postgres-password}"</span> <span class="o">|</span> <span class="n">base64</span> <span class="o">--</span><span class="n">decode</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>SaF0fhHrRj
</code></pre></div>
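<p>Rather than copying the password by hand, we can also capture it in an environment variable in one step (the variable name here is just a convention, not something the chart requires):</p>
<div class="highlight"><pre><span></span><code>export POSTGRES_PASSWORD=$(kubectl get secret postgres-postgresql -o jsonpath="{.data.postgres-password}" | base64 --decode)
</code></pre></div>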
<p>We can test the database by executing a simple SELECT statement from another pod in the cluster:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">run</span> <span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="o">-</span><span class="n">client</span> <span class="o">--</span><span class="n">rm</span> <span class="o">--</span><span class="n">tty</span> <span class="o">-</span><span class="n">i</span> \
<span class="o">--</span><span class="n">restart</span><span class="o">=</span><span class="s1">'Never'</span> \
<span class="o">--</span><span class="n">image</span> <span class="n">docker</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">bitnami</span><span class="o">/</span><span class="n">postgresql</span><span class="p">:</span><span class="mf">14.2.0</span><span class="o">-</span><span class="n">debian</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="n">r77</span> \
<span class="o">--</span><span class="n">command</span> <span class="o">--</span> <span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">postgres</span><span class="p">:</span><span class="n">SaF0fhHrRj</span><span class="nd">@postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">postgres</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"SELECT current_database();"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> current_database
------------------
postgres
(1 row)
pod "postgres-postgresql-client" deleted
</code></pre></div>
<p>To create a table in the database, we'll execute a SQL command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">run</span> <span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="o">-</span><span class="n">client</span> <span class="o">--</span><span class="n">rm</span> <span class="o">--</span><span class="n">tty</span> <span class="o">-</span><span class="n">i</span> \
<span class="o">--</span><span class="n">restart</span><span class="o">=</span><span class="s1">'Never'</span> \
<span class="o">--</span><span class="n">image</span> <span class="n">docker</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">bitnami</span><span class="o">/</span><span class="n">postgresql</span><span class="p">:</span><span class="mf">14.2.0</span><span class="o">-</span><span class="n">debian</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="n">r77</span> \
<span class="o">--</span><span class="n">command</span> <span class="o">--</span> <span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">postgres</span><span class="p">:</span><span class="n">SaF0fhHrRj</span><span class="nd">@postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">postgres</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"CREATE TABLE clients ( </span><span class="se">\</span>
<span class="s2"> ssn varchar(11) PRIMARY KEY, </span><span class="se">\</span>
<span class="s2"> first_name varchar(30) NOT NULL, </span><span class="se">\</span>
<span class="s2"> last_name varchar(30) NOT NULL, </span><span class="se">\</span>
<span class="s2"> age integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> sex varchar(6) NOT NULL, </span><span class="se">\</span>
<span class="s2"> bmi integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> children integer NOT NULL, </span><span class="se">\</span>
<span class="s2"> smoker boolean NOT NULL, </span><span class="se">\</span>
<span class="s2"> region varchar(10) NOT NULL);"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>CREATE TABLE
pod "postgres-postgresql-client" deleted
</code></pre></div>
<p>Next, we'll add some data to the table using the same code that we used for the local Docker PostgreSQL instance. Before that, we'll need to connect to the instance using port forwarding. Port forwarding is a simple way to connect to a pod running in the cluster from the local environment: it forwards all traffic from a local port to a remote port in the pod.</p>

<p>To start port forwarding, execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl port-forward svc/postgres-postgresql <span class="m">5432</span>:5432
</code></pre></div>
<p>Now we can execute the python code that will add the data to the table:</p>
<div class="highlight"><pre><span></span><code><span class="n">connection</span> <span class="o">=</span> <span class="n">psycopg2</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span>
<span class="n">host</span><span class="o">=</span><span class="s2">"localhost"</span><span class="p">,</span>
<span class="n">port</span><span class="o">=</span><span class="s2">"5432"</span><span class="p">,</span>
<span class="n">database</span><span class="o">=</span><span class="s2">"postgres"</span><span class="p">,</span>
<span class="n">user</span><span class="o">=</span><span class="s2">"postgres"</span><span class="p">,</span>
<span class="n">password</span><span class="o">=</span><span class="s2">"SaF0fhHrRj"</span><span class="p">)</span>
<span class="n">cursor</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span>
<span class="k">for</span> <span class="n">record</span> <span class="ow">in</span> <span class="n">records</span><span class="p">:</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"INSERT INTO clients (ssn, first_name, last_name, age, sex, bmi, children, smoker, region)"</span>
<span class="s2">"VALUES (</span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">, </span><span class="si">%s</span><span class="s2">);"</span><span class="p">,</span>
<span class="p">(</span><span class="n">record</span><span class="p">[</span><span class="s2">"ssn"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"first_name"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"last_name"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"age"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"sex"</span><span class="p">],</span>
<span class="n">record</span><span class="p">[</span><span class="s2">"bmi"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"children"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"smoker"</span><span class="p">],</span> <span class="n">record</span><span class="p">[</span><span class="s2">"region"</span><span class="p">]))</span>
<span class="n">connection</span><span class="o">.</span><span class="n">commit</span><span class="p">()</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div>
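<p>The snippet above assumes that <code>records</code> is already defined as a list of dictionaries matching the columns of the clients table. For reference, a single record might look like this (the values beyond the name and SSN are purely illustrative):</p>
<div class="highlight"><pre><span></span><code># Each record maps directly onto the columns of the clients table.
records = [
    {
        "ssn": "646-87-1351",   # primary key
        "first_name": "Vickie",
        "last_name": "Anderson",
        "age": 34,              # illustrative value
        "sex": "female",        # illustrative value
        "bmi": 27,              # illustrative value
        "children": 1,          # illustrative value
        "smoker": False,        # illustrative value
        "region": "southwest",  # illustrative value
    },
]
</code></pre></div>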
<p>The remote database instance should now have the data needed to try out the decorator running in the service. We can view some of the data with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">run</span> <span class="n">postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="o">-</span><span class="n">client</span> <span class="o">--</span><span class="n">rm</span> <span class="o">--</span><span class="n">tty</span> <span class="o">-</span><span class="n">i</span> \
<span class="o">--</span><span class="n">restart</span><span class="o">=</span><span class="s1">'Never'</span> \
<span class="o">--</span><span class="n">image</span> <span class="n">docker</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">bitnami</span><span class="o">/</span><span class="n">postgresql</span><span class="p">:</span><span class="mf">14.2.0</span><span class="o">-</span><span class="n">debian</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="n">r77</span> \
<span class="o">--</span><span class="n">command</span> <span class="o">--</span> <span class="n">psql</span> <span class="n">postgresql</span><span class="p">:</span><span class="o">//</span><span class="n">postgres</span><span class="p">:</span><span class="n">SaF0fhHrRj</span><span class="nd">@postgres</span><span class="o">-</span><span class="n">postgresql</span><span class="p">:</span><span class="mi">5432</span><span class="o">/</span><span class="n">postgres</span> \
<span class="o">-</span><span class="n">c</span> <span class="s2">"SELECT ssn, first_name, last_name FROM clients LIMIT 5;"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code> ssn | first_name | last_name
-------------+------------+-----------
646-87-1351 | Vickie | Anderson
194-94-3733 | Patricia | Lee
709-08-5148 | Seth | James
132-30-5594 | Edward | Allen
096-55-1187 | Mark | Keith
(5 rows)
pod "postgres-postgresql-client" deleted
</code></pre></div>
<p>Now that we're done putting data in the database, we can shut down the port forwarding process by pressing CTRL-C or with this command:</p>
<div class="highlight"><pre><span></span><code>pkill -f kubectl port-forward
</code></pre></div>
<h3>Creating a Kubernetes Deployment and Service</h3>
<p>The model service now has a database to access, so we'll be creating the model service resources. These are:</p>
<ul>
<li>Deployment: a declarative way to manage a set of pods; the model service pods are managed through the Deployment.</li>
<li>Service: a way to expose the set of pods in a Deployment; the model service is made available to the outside world through the Service. The service type is LoadBalancer, which means that a load balancer will be created for the service.</li>
</ul>
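<p>For reference, the kubernetes/model_service.yml file contains two resources along these lines (only the fields relevant to this discussion are shown; the container spec and remaining fields are abbreviated):</p>
<div class="highlight"><pre><span></span><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: insurance-charges-model-deployment
...
---
apiVersion: v1
kind: Service
metadata:
  name: insurance-charges-model-service
spec:
  type: LoadBalancer
  ports:
    - port: 80
...
</code></pre></div>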
<p>They are created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/insurance-charges-model-deployment created
service/insurance-charges-model-service created
</code></pre></div>
<p>The deployment and service for the model service were created together. You can see the new service with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">get</span> <span class="n">services</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
insurance-charges-model-service LoadBalancer 10.245.246.238 <pending> 80:31223/TCP 32s
postgres-postgresql ClusterIP 10.245.0.250 <none> 5432/TCP 15m
postgres-postgresql-hl ClusterIP None <none> 5432/TCP 15m
</code></pre></div>
<p>The Service type is LoadBalancer, which means that the cloud provider provisions a load balancer and public IP address through which we can contact the service. To view details about the load balancer provided by Digital Ocean for this Service, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">describe</span> <span class="n">service</span> <span class="n">insurance</span><span class="o">-</span><span class="n">charges</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">service</span> <span class="o">|</span> <span class="n">grep</span> <span class="s2">"LoadBalancer Ingress"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>LoadBalancer Ingress: 157.230.202.103
</code></pre></div>
<p>The load balancer can take a while longer than the service to come up; until it is running, the command won't return anything. Once the load balancer is ready, the command lists the public IP address assigned to it by Digital Ocean. </p>
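<p>If you'd rather wait for the address programmatically than re-run the command by hand, a small polling loop works (this is just a convenience sketch):</p>
<div class="highlight"><pre><span></span><code># poll until the load balancer has an external IP assigned
until kubectl get service insurance-charges-model-service \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}' | grep -q .; do
  sleep 10
done
</code></pre></div>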
<p>Once the load balancer comes up, we can view the service through a web browser:</p>
<p><img alt="FastAPI Documentation" src="https://www.tekhnoal.com/service_documentation_defml.png" width="100%"></p>
<p>The same documentation is displayed as when we deployed the service locally.</p>
<p>To make a prediction, we'll send a request to the service's IP address:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://157.230.202.103/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{ "ssn": "646-87-1351" }'</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":6416.86}
</code></pre></div>
<p>The decorator is working and accessing data from the database!</p>
<p>The service is using the configuration file in ./configuration/kubernetes_rest_config.yaml right now, which is configuring the PostgreSQL decorator to accept the "ssn" field and use it to load all other features needed by the model from the database. This is not the only way that we can use the decorator, so we'll try out another configuration. </p>
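<p>Based on the decorator configuration format used elsewhere in this project, the relevant section of that file likely looks something like this (the field names follow the same schema; treat this as a sketch, not the exact file contents):</p>
<div class="highlight"><pre><span></span><code>decorators:
  - class_path: data_enrichment.postgresql.PostgreSQLEnrichmentDecorator
    configuration:
      host: "postgres-postgresql.model-services.svc.cluster.local"
      port: "5432"
...
</code></pre></div>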
<p>To load another configuration file, we'll just change the environment variable value in the Kubernetes Deployment resource for the model service:</p>
<div class="highlight"><pre><span></span><code><span class="nt">env</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">REST_CONFIG</span><span class="w"></span>
<span class="w"> </span><span class="nt">value</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">./configuration/kubernetes_rest_config2.yaml</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<p>The new configuration file causes the decorator to accept more fields from the user of the service. After changing the Deployment, we'll recreate it in the cluster with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/insurance-charges-model-deployment configured
service/insurance-charges-model-service unchanged
</code></pre></div>
<p>The Deployment's pods are restarted with the new configuration, while the Service remains unchanged. We can try out a request with this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://157.230.202.103/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">ssn</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">646-87-1351</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 50, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46627.88}
</code></pre></div>
<p>The service now requires more fields from the caller because the decorator is no longer loading those features from the database.</p>
<p>We'll try out one more configuration to show how powerful decorators can be. In a <a href="https://www.tekhnoal.com/ml-model-decorators.html">previous blog post</a> we created a decorator that added a unique prediction id to every prediction returned by the model. We can add this decorator to the service by simply changing the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">data_enrichment.prediction_id.PredictionIDDecorator</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">data_enrichment.postgresql.PostgreSQLEnrichmentDecorator</span><span class="w"></span>
<span class="w"> </span><span class="nt">configuration</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="s">"postgres-postgresql.model-services.svc.cluster.local"</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="s">"5432"</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<p>This configuration is in the ./configuration/kubernetes_rest_config3.yaml file. We recreate the Deployment again, this time pointing at this configuration file:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">apply</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps/insurance-charges-model-deployment configured
service/insurance-charges-model-service unchanged
</code></pre></div>
<p>We'll try the service one more time:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://157.230.202.103/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s2">"{ </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">ssn</span><span class="se">\"</span><span class="s2">: </span><span class="se">\"</span><span class="s2">646-87-1351</span><span class="se">\"</span><span class="s2">, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">age</span><span class="se">\"</span><span class="s2">: 65, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">bmi</span><span class="se">\"</span><span class="s2">: 50, </span><span class="se">\</span>
<span class="s2"> </span><span class="se">\"</span><span class="s2">smoker</span><span class="se">\"</span><span class="s2">: true </span><span class="se">\</span>
<span class="s2"> }"</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46627.88,"prediction_id":"4db189c2-5200-44a6-b6af-0e341d0fb9bc"}
</code></pre></div>
<p>The service returned a unique identifier field called "prediction_id" along with the prediction. This field was generated by the decorator we added through configuration. A full explanation of how the prediction ID decorator works can be found in that blog post.</p>
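<p>To give a flavor of how such a decorator can work, here is a minimal sketch (this is an illustration, not the actual implementation from that post; the class and method names are assumptions):</p>
<div class="highlight"><pre><span></span><code>import uuid


class PredictionIDDecorator:
    """Wraps a model and attaches a unique id to every prediction."""

    def __init__(self, model):
        self._model = model

    def predict(self, data):
        prediction = self._model.predict(data)
        # add a UUID so individual predictions can be traced later
        prediction["prediction_id"] = str(uuid.uuid4())
        return prediction
</code></pre></div>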
<p>This shows how easily decorators can be combined with models to perform more complex operations.</p>
<h3>Deleting the Resources</h3>
<p>Now that we're done with the service, we need to destroy the resources. To delete the database deployment, we'll uninstall the Helm release:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">helm</span> <span class="n">delete</span> <span class="n">postgres</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>release "postgres" uninstalled
</code></pre></div>
<p>Since the persistent volume claim is not deleted with the chart, we'll delete it with a kubectl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="n">pvc</span> <span class="o">-</span><span class="n">l</span> <span class="n">app</span><span class="o">.</span><span class="n">kubernetes</span><span class="o">.</span><span class="n">io</span><span class="o">/</span><span class="n">instance</span><span class="o">=</span><span class="n">postgres</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>persistentvolumeclaim "data-postgres-postgresql-0" deleted
</code></pre></div>
<p>To delete the model service, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">model_service</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>deployment.apps "insurance-charges-model-deployment" deleted
service "insurance-charges-model-service" deleted
</code></pre></div>
<p>To delete the namespace:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">kubectl</span> <span class="n">delete</span> <span class="o">-</span><span class="n">f</span> <span class="n">kubernetes</span><span class="o">/</span><span class="n">namespace</span><span class="o">.</span><span class="n">yml</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>namespace "model-services" deleted
</code></pre></div>
<p>Lastly, to destroy the Kubernetes cluster, execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="o">%</span><span class="n">cd</span> <span class="o">./</span><span class="n">terraform</span>
<span class="err">!</span><span class="n">terraform</span> <span class="n">plan</span> <span class="o">-</span><span class="n">var</span><span class="o">=</span><span class="s2">"project_name=model-services"</span> <span class="o">-</span><span class="n">destroy</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="o">/</span><span class="n">Users</span><span class="o">/</span><span class="n">brian</span><span class="o">/</span><span class="n">Code</span><span class="o">/</span><span class="n">data</span><span class="o">-</span><span class="n">enrichment</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="n">ml</span><span class="o">-</span><span class="n">models</span><span class="o">/</span><span class="n">terraform</span><span class="w"></span>
<span class="n">digitalocean_vpc</span><span class="p">.</span><span class="nl">cluster_vpc:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="mh">5</span><span class="n">a0e94f2</span><span class="o">-</span><span class="n">bb9d</span><span class="o">-</span><span class="mh">4814</span><span class="o">-</span><span class="n">bf5f</span><span class="o">-</span><span class="n">ccc2c2e98b84</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="nl">container_registry:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_container_registry_docker_credentials</span><span class="p">.</span><span class="nl">registry_credentials:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_kubernetes_cluster</span><span class="p">.</span><span class="nl">cluster:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="mh">7</span><span class="n">eda057c</span><span class="o">-</span><span class="mf">501f</span><span class="o">-</span><span class="mh">414</span><span class="n">c</span><span class="o">-</span><span class="n">ad36</span><span class="o">-</span><span class="n">e4a75feac4e0</span><span class="p">]</span><span class="w"></span>
<span class="n">kubernetes_secret</span><span class="p">.</span><span class="nl">cluster_registry_crendentials:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="k">default</span><span class="o">/</span><span class="n">docker</span><span class="o">-</span><span class="n">cfg</span><span class="p">]</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">used</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">selected</span><span class="w"> </span><span class="n">providers</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">generate</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">execution</span><span class="w"> </span><span class="n">plan</span><span class="p">.</span><span class="w"></span>
<span class="n">Resource</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="n">are</span><span class="w"> </span><span class="n">indicated</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">symbols:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">destroy</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">perform</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">actions:</span><span class="w"></span>
<span class="w"> </span><span class="p">#</span><span class="w"> </span><span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="n">container_registry</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">destroyed</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">resource</span><span class="w"> </span><span class="s">"digitalocean_container_registry"</span><span class="w"> </span><span class="s">"container_registry"</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"2022-05-02 00:48:55 +0000 UTC"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com/model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"sfo3"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">server_url</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">storage_usage_bytes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mh">694379520</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">subscription_tier_slug</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"basic"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
<span class="nl">Plan:</span><span class="w"> </span><span class="mh">0</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">add</span><span class="p">,</span><span class="w"> </span><span class="mh">0</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">change</span><span class="p">,</span><span class="w"> </span><span class="mh">5</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">destroy</span><span class="p">.</span><span class="w"></span>
<span class="n">Changes</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="nl">Outputs:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">kubernetes_cluster_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"7eda057c-501f-414c-ad36-e4a75feac4e0"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">registry_endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com/model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="err">───────────────────────────────────────────────────────────────────────────────</span><span class="w"></span>
<span class="nl">Note:</span><span class="w"> </span><span class="n">You</span><span class="w"> </span><span class="n">didn</span><span class="p">'</span><span class="n">t</span><span class="w"> </span><span class="n">use</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="o">-</span><span class="n">out</span><span class="w"> </span><span class="n">option</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">save</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">plan</span><span class="p">,</span><span class="w"> </span><span class="n">so</span><span class="w"> </span><span class="n">Terraform</span><span class="w"> </span><span class="n">can</span><span class="p">'</span><span class="n">t</span><span class="w"></span>
<span class="n">guarantee</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">take</span><span class="w"> </span><span class="n">exactly</span><span class="w"> </span><span class="n">these</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">run</span><span class="w"> </span><span class="s">"terraform apply"</span><span class="w"> </span><span class="n">now</span><span class="p">.</span><span class="w"></span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">terraform</span> <span class="n">apply</span> <span class="o">-</span><span class="n">var</span><span class="o">=</span><span class="s2">"project_name=model-services"</span> <span class="o">-</span><span class="n">auto</span><span class="o">-</span><span class="n">approve</span> <span class="o">-</span><span class="n">destroy</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="nl">container_registry:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_vpc</span><span class="p">.</span><span class="nl">cluster_vpc:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="mh">5</span><span class="n">a0e94f2</span><span class="o">-</span><span class="n">bb9d</span><span class="o">-</span><span class="mh">4814</span><span class="o">-</span><span class="n">bf5f</span><span class="o">-</span><span class="n">ccc2c2e98b84</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_container_registry_docker_credentials</span><span class="p">.</span><span class="nl">registry_credentials:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="n">model</span><span class="o">-</span><span class="n">services</span><span class="o">-</span><span class="n">registry</span><span class="p">]</span><span class="w"></span>
<span class="n">digitalocean_kubernetes_cluster</span><span class="p">.</span><span class="nl">cluster:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="mh">7</span><span class="n">eda057c</span><span class="o">-</span><span class="mf">501f</span><span class="o">-</span><span class="mh">414</span><span class="n">c</span><span class="o">-</span><span class="n">ad36</span><span class="o">-</span><span class="n">e4a75feac4e0</span><span class="p">]</span><span class="w"></span>
<span class="n">kubernetes_secret</span><span class="p">.</span><span class="nl">cluster_registry_crendentials:</span><span class="w"> </span><span class="n">Refreshing</span><span class="w"> </span><span class="n">state</span><span class="p">...</span><span class="w"> </span><span class="p">[</span><span class="n">id</span><span class="o">=</span><span class="k">default</span><span class="o">/</span><span class="n">docker</span><span class="o">-</span><span class="n">cfg</span><span class="p">]</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">used</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">selected</span><span class="w"> </span><span class="n">providers</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">generate</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="n">execution</span><span class="w"> </span><span class="n">plan</span><span class="p">.</span><span class="w"></span>
<span class="n">Resource</span><span class="w"> </span><span class="n">actions</span><span class="w"> </span><span class="n">are</span><span class="w"> </span><span class="n">indicated</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">symbols:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">destroy</span><span class="w"></span>
<span class="n">Terraform</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">perform</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">actions:</span><span class="w"></span>
<span class="w"> </span><span class="p">#</span><span class="w"> </span><span class="n">digitalocean_container_registry</span><span class="p">.</span><span class="n">container_registry</span><span class="w"> </span><span class="n">will</span><span class="w"> </span><span class="n">be</span><span class="w"> </span><span class="n">destroyed</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">resource</span><span class="w"> </span><span class="s">"digitalocean_container_registry"</span><span class="w"> </span><span class="s">"container_registry"</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="o">-</span><span class="w"> </span><span class="n">created_at</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"2022-05-02 00:48:55 +0000 UTC"</span><span class="w"> </span><span class="o">-&gt;</span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com/model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"model-services-registry"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"sfo3"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">server_url</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry.digitalocean.com"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">storage_usage_bytes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mh">694379520</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">subscription_tier_slug</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"basic"</span><span class="w"> </span><span class="o">-></span><span class="w"> </span><span class="n">null</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">...</span><span class="w"></span>
</code></pre></div>
<h2>Closing</h2>
<p>In this blog post, we showed how to use decorators to perform data enrichment for machine learning models. Data enrichment is a common requirement across many different ML model deployments. We went through the entire design and coding process for the decorator, tested it locally using Docker, created the infrastructure using Terraform, and then deployed the solution to Kubernetes.</p>
<p>One of the benefits of using a decorator for the ML model is that we keep the model prediction code and the data access code separate from each other. The model code did not have to change at all for us to be able to perform data enrichment for the model. The RESTful service package code also didn't have to be modified because it supports adding decorators to models through configuration rather than through code. In the end, it was possible to cleanly combine the model, decorator, and service components into one cohesive solution through configuration alone. The service is also able to host multiple decorators for each model, which allows for more complex use cases for decorators.</p>
<p>Another benefit is that we are able to reuse the decorator we built in this blog post to do data enrichment for any ML model deployment that needs to pull data from a PostgreSQL database. The same decorator class can easily be instantiated and added to any model instance that follows the MLModel interface. We can do this because the decorator is built for flexibility: it can be configured to load any number of fields from a database table and join the values into the model's input.</p>Decorator Pattern for ML Models2022-02-27T07:00:00-05:002022-02-27T07:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2022-02-27:/ml-model-decorators.html<p>The decorator pattern is a software engineering pattern that allows software to be more flexible, more reusable, and more cohesive. In this blog post, we’ll explore how decorators work, how to implement them, and how to apply them to the MLModel base class.</p><h1>Decorator Pattern for ML Models</h1>
<h2>Introduction</h2>
<p>The decorator pattern is a software engineering pattern that allows software to be more flexible, more reusable, and more cohesive. In this blog post, we’ll explore how decorators work, how to implement them, how to apply them to the MLModel base class, and how to deploy them in a REST service.</p>
<p>We’ll be building on top of the MLModel base class that we’ve built in a <a href="https://www.tekhnoal.com/introducing-ml-base-package.html">previous blog post</a>. The MLModel base class is designed to be wrapped around the prediction functionality of a machine learning model. It has several properties that allow a model object to describe itself to the outside world, including its name, version, and input and output schemas. The MLModel base class also requires any class that inherits from it to implement the __init__() method and the predict() method. These two methods form the simplest functionality of a machine learning model: the __init__() method is where model parameters are loaded, and the predict() method is where predictions are made.</p>
<h2>The Decorator Pattern</h2>
<p>The decorator pattern is an object-oriented design pattern that is useful when behavior needs to be added to an object without changing or subclassing the object’s class. A decorator is an object that “decorates” the object it wraps without modifying that object’s API. The decorator executes its own behavior before and after the behavior of the decorated object; in this way, the decorator instance acts as a “gateway” to the decorated object.</p>
<h3>How to Build a Decorator</h3>
<p>A decorator is a class that has the same API as the class that we want to decorate. In order to build a decorator, we’ll first create a Decorator base class by following these steps:</p>
<ul>
<li>Subclass the class we want to decorate, creating a Decorator class with the same API.</li>
<li>In the Decorator class, add an instance attribute that can point to an instance of the class that we want to decorate.</li>
<li>When instantiating the Decorator class, receive an instance of the class we want to decorate and save it to the instance attribute.</li>
<li>In the Decorator class, implement the methods of the API of the class we want to decorate, calling the methods of the instance attribute and returning the results to the caller.</li>
</ul>
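<p>The steps above can be sketched in Python. The Service class and its decorator here are hypothetical stand-ins for illustration; they are not taken from the ml_base package:</p>

```python
class Service:
    """The class we want to decorate."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


class ServiceDecorator(Service):
    """Decorator base class: same API as Service, forwards every call."""

    def __init__(self, service: Service) -> None:
        # instance attribute pointing at the decorated object
        self._service = service

    def greet(self, name: str) -> str:
        # call the decorated object's method and return the result to the caller
        return self._service.greet(name)
```

<p>An instance of this base class just forwards calls, so the decorated object behaves exactly as it did before being decorated.</p>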
<p>If we instantiate the Decorator base class, the decorator instance will just forward all method calls to the decorated object, which is not very useful. To actually build a Decorator, we’ll need to create a subclass of it like this:</p>
<ul>
<li>Create a subclass of Decorator that overrides the methods that you want to modify, adding your own behavior.</li>
<li>Make sure that you call the corresponding methods in the instance attribute from the Decorator’s methods in order to allow the decorated object to still execute its own behavior.</li>
</ul>
<p>Notice that a decorator instance can actually decorate another decorator instance, which allows us to “stack” decorators together to do more complex things. </p>
<h3>Benefits of the Decorator Pattern</h3>
<p>One of the great benefits of decorators is the flexibility that they bring to software development. Without the use of decorators, an object’s class must be modified or subclassed in order to modify its behavior. By using decorators, we can modify the behavior just by attaching the decorator to the object. The “decoration” of an object can be done at runtime and can be configuration-driven, which means that we can change a program’s behavior quickly and easily by modifying its configuration instead of its source code.</p>
<p>Another benefit of decorators is that the API of the object that is decorated does not change at all. Any other object that depends on the API of the decorated object can use it without modification and without being aware that it is decorated. The only problems that arise when applying decorators are when an object depends on another object’s specific behavior instead of its API; however, this is an antipattern and should be avoided.</p>
<p>Yet another benefit of decorators is the ability to reuse them across different parts of an application. If we need to add the same behavior to many different objects which share the same API, we can create a decorator class that implements the behavior and attach it to the specific objects that we need to modify. If we had modified the behavior of the objects by changing their class, we would force all instances of the class to have the new behavior that we needed. If we subclassed the original class to add behavior, we would be adding another level of abstraction to the design, which makes everything more complicated. By using decorator instances and attaching them only to the objects that we actually need to modify, we simplify the application’s codebase.</p>
<p>By adding the decorator pattern to a codebase, we are able to make the whole codebase more cohesive. This is because we’re making individual classes that do one thing only. If we need to add some extra behavior to the class, we can attach a decorator that adds only that behavior instead of adding the behavior to the original class. The single responsibility principle tells us that a class should have only one reason to change; by using decorators we can make following this principle in our code a lot easier.</p>
<p>Decorators also encourage us to use a compositional approach to software development, which means that we create the desired behavior of the program by “composing” it from various smaller pieces of code. This is different from a hierarchical approach in which we define new behaviors by inheriting from and extending the behavior of base classes. Building software through composition is simpler in the long run because it incentivizes us to use simpler inheritance hierarchies that are easier to work with.</p>
<h3>Decorators in the Python Language</h3>
<p>The Python programming language already has a feature called decorators, which is syntactic sugar that allows a programmer to extend the functionality of a function or class. A decorator of this type is a function that takes a function or a class as a parameter and extends it with new behavior. Functions that are “decorated” have the name of the decorator function prepended with an “@” symbol:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@my_decorator</span>
<span class="k">def</span> <span class="nf">my_function</span><span class="p">():</span>
<span class="o">...</span>
</code></pre></div>
<p>In this case the decorated function is called my_function and the decorator function is called my_decorator. It’s important to understand that in this blog post, we are not talking about this kind of decorator, although it is a similar concept. A great place to learn about Python decorators is <a href="https://realpython.com/primer-on-python-decorators/">here</a>.</p>
<p>The decorator that is supported by the Python language allows you to decorate code, but does not allow for dynamic runtime behavior. That is to say, we can modify a function when it is defined, but not dynamically while the program runs. The type of decoration we will be building in this blog post will allow us to decorate MLModel objects at runtime, regardless of the actual code. This means that we’ll be able to add decorators at runtime from configuration, adding some flexibility to our software.</p>
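<p>To make the contrast concrete, here is a hedged sketch (all names are hypothetical) showing a function decorator applied at definition time next to an object decorator attached at runtime based on configuration:</p>

```python
def logged(func):
    """A Python function decorator, applied when the function is defined."""
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


@logged  # decoration happens here, at definition time
def double(x):
    return x * 2


class Doubler:
    """A simple object whose behavior we may want to extend."""

    def predict(self, x):
        return x * 2


class LoggingDecorator:
    """An object decorator that can be attached (or not) at runtime."""

    def __init__(self, model):
        self._model = model

    def predict(self, x):
        print("calling predict")
        return self._model.predict(x)


# configuration-driven decoration, decided while the program runs
config = {"decorate": True}
model = Doubler()
if config["decorate"]:
    model = LoggingDecorator(model)
```

<p>The @logged decoration is fixed in the source code, while the LoggingDecorator is attached only when the configuration asks for it.</p>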
<h2>Base Class for Decorators</h2>
<p>The decorator pattern requires that we define a base class for the decorators that we want to actually build. </p>
<p>First, we'll install the ml_base package:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
<span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="s2">"ml_base&gt;=0.2.0"</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The MLModelDecorator base class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="nn">ml_base.ml_model</span> <span class="kn">import</span> <span class="n">MLModel</span>
<span class="k">class</span> <span class="nc">MLModelDecorator</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">MLModel</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="k">if</span> <span class="n">model</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">MLModel</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Only objects of type MLModel can be wrapped with MLModelDecorator instances."</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span> <span class="o">=</span> <span class="n">model</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_configuration"</span><span class="p">]</span> <span class="o">=</span> <span class="n">kwargs</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">display_name</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">qualified_name</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">description</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">version</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">input_schema</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">output_schema</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="s2">"_model"</span><span class="p">]</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The MLModelDecorator base class is actually defined in the <a href="https://github.com/schmidtbri/ml-base">ml_base package</a> in version 0.2.0 and above. We can also import the class from the ml_base package like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_base.decorator</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
</code></pre></div>
<p>The base class for ML Model decorators is designed to hold a reference to an MLModel instance and add no behavior to it. Every method in the decorator just calls the corresponding method on the MLModel instance. This is done on purpose so that we can easily build simple decorators that only work on a single method while leaving all of the other methods and properties alone.</p>
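<p>To make the pass-through idea concrete, here is a minimal standalone sketch of the same pattern. It uses plain Python classes (a hypothetical <code>Model</code> standin rather than a real MLModel subclass) so it runs without the ml_base package:</p>

```python
class Model:
    """Hypothetical standin for an MLModel implementation."""
    display_name = "Example Model"

    def predict(self, data):
        return {"result": data * 2}


class PassThroughDecorator:
    """Minimal sketch of a pass-through decorator: it adds no behavior."""

    def __init__(self, model):
        self._model = model

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so every
        # attribute and method access is forwarded to the wrapped model.
        return getattr(self._model, name)


decorated = PassThroughDecorator(Model())
print(decorated.display_name)  # forwarded property
print(decorated.predict(2))    # forwarded method
```

<p>The real MLModelDecorator forwards each property and method explicitly instead of relying on <code>__getattr__</code>, which makes it easy for a subclass to override a single method while leaving the rest alone.</p>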
<h2>Installing a Model</h2>
<p>To make this blog post a little shorter we won't build a new model to work with. Instead we'll install a model that we've built in the past.</p>
<p>To install the model, we can use the pip command and point it at the github repo of the model:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="o">-</span><span class="n">e</span> <span class="n">git</span><span class="o">+</span><span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">github</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">schmidtbri</span><span class="o">/</span><span class="n">regression</span><span class="o">-</span><span class="n">model</span><span class="c1">#egg=insurance_charges_model</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>The model is used to estimate insurance charges and we built it in a <a href="https://www.tekhnoal.com/regression-model.html">previous blog post</a>. The code for the model is in <a href="https://github.com/schmidtbri/regression-model">this github repository</a>.</p>
<p>To make a prediction with the model, we'll import the model's class:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.model</span> <span class="kn">import</span> <span class="n">InsuranceChargesModel</span>
</code></pre></div>
<p>Now we can instantiate the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">InsuranceChargesModel</span><span class="p">()</span>
</code></pre></div>
<p>To make a prediction, we'll need to use the model's input schema class.</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span><span class="p">,</span> \
<span class="n">SexEnum</span><span class="p">,</span> <span class="n">RegionEnum</span>
<span class="n">model_input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span><span class="n">age</span><span class="o">=</span><span class="mi">21</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">male</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">20.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">southwest</span><span class="p">)</span>
</code></pre></div>
<p>Now we can make a prediction with the model by calling predict() with the input.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=2231.7)
</code></pre></div>
<p>The model predicts that the charges will be $2231.70.</p>
<h2>Decorating the Model</h2>
<p>To show how the simplest possible decorator works, we'll wrap the model instance directly with the MLModelDecorator base class:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_base</span> <span class="kn">import</span> <span class="n">MLModelDecorator</span>
<span class="n">decorator</span> <span class="o">=</span> <span class="n">MLModelDecorator</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">decorator</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>MLModelDecorator(InsuranceChargesModel)
</code></pre></div>
<p>The decorator instance is wrapping the model. When we print the decorator object, it shows us that it is wrapping the InsuranceChargesModel instance.</p>
<p>All of the properties of the InsuranceChargesModel instance can still be accessed:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">display_name</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">description</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">input_schema</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">decorator</span><span class="o">.</span><span class="n">output_schema</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Insurance Charges Model
insurance_charges_model
Model to predict the insurance charges of a customer.
0.1.0
<class 'insurance_charges_model.prediction.schemas.InsuranceChargesModelInput'>
<class 'insurance_charges_model.prediction.schemas.InsuranceChargesModelOutput'>
</code></pre></div>
<p>The MLModelDecorator base class actually makes no modifications to the results that it "passes through" from the model instance.</p>
<p>We can also make predictions with the predict() method:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=2231.7)
</code></pre></div>
<p>The MLModelDecorator base class is not very useful by itself; we need to subclass it to add custom behaviors.</p>
<h2>Creating a Simple Decorator</h2>
<p>We'll override the default implementation of the MLModelDecorator base class in order to add some behavior.</p>
<p>This decorator executes around the predict() method:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">SimplePredictDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Executing before prediction."</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Executing after prediction."</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
</code></pre></div>
<p>The decorator wraps around the predict() method and does nothing except print a message before and after executing the predict method of the model.</p>
<p>We can try it out by wrapping the model instance again:</p>
<div class="highlight"><pre><span></span><code><span class="n">decorator</span> <span class="o">=</span> <span class="n">SimplePredictDecorator</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
</code></pre></div>
<p>Now, we'll call the predict method:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">decorator</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">model_input</span><span class="p">)</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Executing before prediction.
Executing after prediction.
InsuranceChargesModelOutput(charges=2231.7)
</code></pre></div>
<p>The decorator instance executed before and after the model's predict() method and printed some messages.</p>
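<p>The same around-the-method structure supports more useful behaviors than printing. For example, a decorator could time each prediction. This is a sketch using plain classes and a hypothetical <code>Model</code> standin, not the ml_base API:</p>

```python
import time


class Model:
    """Hypothetical standin for an MLModel implementation."""

    def predict(self, data):
        return data + 1


class TimingDecorator:
    """Sketch: record how long the wrapped predict() call takes."""

    def __init__(self, model):
        self._model = model
        self.last_elapsed = None

    def predict(self, data):
        start = time.perf_counter()
        prediction = self._model.predict(data)
        self.last_elapsed = time.perf_counter() - start
        return prediction


timed_model = TimingDecorator(Model())
print(timed_model.predict(1))
print(f"predict() took {timed_model.last_elapsed:.6f} seconds")
```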
<h2>Adding UUIDs to Predictions</h2>
<p>Now we’ll build a decorator class that adds the ability to generate UUIDs for each prediction that a model makes. A UUID is a universally unique 128-bit identifier that can be generated for anything that we want to identify uniquely. In this case, we’d like to identify an individual prediction that an ML model makes. </p>
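<p>The standard library's uuid module is all we need to generate these identifiers. A quick illustration:</p>

```python
from uuid import uuid4

# uuid4() generates a random 128-bit identifier; str() renders it in the
# canonical 8-4-4-4-12 hexadecimal format.
prediction_id = str(uuid4())
print(prediction_id)       # e.g. 'e84ab429-acec-4630-83d2-12809f222ae2'
print(len(prediction_id))  # always 36 characters: 32 hex digits and 4 hyphens
```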
<p>To do this, we’ll have to do four things:</p>
<ul>
<li>Modify the description of the model to add info about the prediction id.</li>
<li>Modify the input schema of the model to add an optional field that accepts UUIDs.</li>
<li>Modify the output schema of the model to add a field for the UUID.</li>
<li>Modify the predict() method to generate a UUID and return it alongside the prediction.</li>
</ul>
<p>Here is the code for the decorator:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">create_model</span>
<span class="kn">from</span> <span class="nn">uuid</span> <span class="kn">import</span> <span class="n">uuid4</span>
<span class="k">class</span> <span class="nc">PredictionIDDecorator</span><span class="p">(</span><span class="n">MLModelDecorator</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="n">decorator_description</span> <span class="o">=</span> <span class="s2">" This model also has an optional input called 'prediction_id' that accepts a UUID string to uniquely identify the prediction returned. If the prediction id is not provided, a UUID is generated and returned in a field called 'prediction_id' in the model output."</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">description</span> <span class="o">+</span> <span class="n">decorator_description</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">input_schema</span>
<span class="n">new_input_schema</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">(</span>
<span class="n">input_schema</span><span class="o">.</span><span class="vm">__name__</span><span class="p">,</span>
<span class="n">prediction_id</span><span class="o">=</span><span class="p">(</span><span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span> <span class="kc">None</span><span class="p">),</span>
<span class="n">__base__</span><span class="o">=</span><span class="n">input_schema</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">return</span> <span class="n">new_input_schema</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">output_schema</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">output_schema</span>
<span class="n">new_output_schema</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">(</span>
<span class="n">output_schema</span><span class="o">.</span><span class="vm">__name__</span><span class="p">,</span>
<span class="n">prediction_id</span><span class="o">=</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="o">...</span><span class="p">),</span>
<span class="n">__base__</span><span class="o">=</span><span class="n">output_schema</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">return</span> <span class="n">new_output_schema</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="s2">"prediction_id"</span><span class="p">)</span> <span class="ow">and</span> <span class="n">data</span><span class="o">.</span><span class="n">prediction_id</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">prediction_id</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">prediction_id</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">prediction_id</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">uuid4</span><span class="p">())</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">wrapped_prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">output_schema</span><span class="p">(</span><span class="n">prediction_id</span><span class="o">=</span><span class="n">prediction_id</span><span class="p">,</span> <span class="o">**</span><span class="n">prediction</span><span class="o">.</span><span class="n">dict</span><span class="p">())</span>
<span class="k">return</span> <span class="n">wrapped_prediction</span>
</code></pre></div>
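<p>The schema modifications rely on pydantic's create_model() function, which builds a new model class at runtime. Passing the original class as <code>__base__</code> means the new class inherits every existing field and only adds the new one. Here is a minimal standalone sketch with a hypothetical <code>Original</code> schema:</p>

```python
from typing import Optional

from pydantic import BaseModel, create_model


class Original(BaseModel):
    """Hypothetical input schema with a single field."""
    age: int


# Build a subclass of Original that adds an optional prediction_id field,
# reusing the original class name so generated documentation stays consistent.
Extended = create_model(
    Original.__name__,
    prediction_id=(Optional[str], None),
    __base__=Original,
)

print(Extended(age=21).prediction_id)                       # None by default
print(Extended(age=21, prediction_id="abc").prediction_id)  # provided value
```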
<p>We'll try it out by instantiating the decorator with the InsuranceChargesModel instance:</p>
<div class="highlight"><pre><span></span><code><span class="n">uuid_decorated_model</span> <span class="o">=</span> <span class="n">PredictionIDDecorator</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">uuid_decorated_model</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>PredictionIDDecorator(InsuranceChargesModel)
</code></pre></div>
<p>The description should be different:</p>
<div class="highlight"><pre><span></span><code><span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">description</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="s2">"</span><span class="s">Model to predict the insurance charges of a customer. This model also has an optional input called 'prediction_id' that accepts an UUID string to uniquely identify the prediction returned. If the prediction id is not provided, a UUID is generated and returned in a field called 'prediction_id' in the model output.</span><span class="s2">"</span>
</code></pre></div>
<p>Next, we'll take a look at the input schema:</p>
<div class="highlight"><pre><span></span><code><span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'InsuranceChargesModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'age'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Age of primary beneficiary in years.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">18</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">65</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sex'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Gender of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/SexEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'bmi'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body Mass Index'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Body mass index of beneficiary.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">15.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">50.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'children'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Children'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Number of children covered by health insurance.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'minimum'</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'maximum'</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'integer'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Whether beneficiary is a smoker.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'boolean'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Region where beneficiary lives.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'allOf'</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/RegionEnum'</span><span class="p">}]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'prediction_id'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Prediction Id'</span><span class="p">,</span><span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'SexEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'SexEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'sex' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'male'</span><span class="p">,</span><span class="w"> </span><span class="s1">'female'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'RegionEnum'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s2">"Enumeration for the value of the 'region' input of the model."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'southwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'southeast'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northwest'</span><span class="p">,</span><span class="w"> </span><span class="s1">'northeast'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>Even though the InsuranceChargesModel didn't have a "prediction_id" field in its input schema, the decorated model instance now has the field as an optional string. This new field was added by the decorator instance.</p>
<p>We can see the prediction_id field schema by selecting it from the properties:</p>
<div class="highlight"><pre><span></span><code><span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()[</span><span class="s2">"properties"</span><span class="p">][</span><span class="s2">"prediction_id"</span><span class="p">]</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'title': 'Prediction Id', 'type': 'string'}
</code></pre></div>
<p>The output schema of the model was also modified.</p>
<div class="highlight"><pre><span></span><code><span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">InsuranceChargesModelOutput</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s2">"</span><span class="s">Schema for output of the model's predict method.</span><span class="s2">"</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">object</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">properties</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">charges</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Charges</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">description</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Individual medical costs billed by health insurance to customer in US dollars.</span><span class="s1">'</span>,
<span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">number</span><span class="s1">'</span>},
<span class="s1">'</span><span class="s">prediction_id</span><span class="s1">'</span>: {<span class="s1">'</span><span class="s">title</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">Prediction Id</span><span class="s1">'</span>, <span class="s1">'</span><span class="s">type</span><span class="s1">'</span>: <span class="s1">'</span><span class="s">string</span><span class="s1">'</span>}},
<span class="s1">'</span><span class="s">required</span><span class="s1">'</span>: [<span class="s1">'</span><span class="s">prediction_id</span><span class="s1">'</span>]}
</code></pre></div>
<p>In the output schema, "prediction_id" is a required field; we did this because we always want a prediction_id associated with a prediction. To see how the decorator uses these new fields, we'll make a prediction:</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span>
<span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">age</span><span class="o">=</span><span class="mi">21</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">male</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">20.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">southwest</span><span class="p">))</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=2231.7, prediction_id='e84ab429-acec-4630-83d2-12809f222ae2')
</code></pre></div>
<p>The prediction now has a randomly generated UUID attached to it by the decorator in the "prediction_id" field. </p>
<p>We had to use the input schema returned by the decorator because the original InsuranceChargesModelInput schema class is no longer the model's input schema. The decorator creates a new class that becomes the model's new input schema.</p>
<p>If we provide a prediction_id in the model's input, the decorator will not generate a new one; instead, it will return the prediction_id that was provided in the input.</p>
<div class="highlight"><pre><span></span><code><span class="n">prediction</span> <span class="o">=</span> <span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span>
<span class="n">uuid_decorated_model</span><span class="o">.</span><span class="n">input_schema</span><span class="p">(</span><span class="n">age</span><span class="o">=</span><span class="mi">21</span><span class="p">,</span>
<span class="n">sex</span><span class="o">=</span><span class="n">SexEnum</span><span class="o">.</span><span class="n">male</span><span class="p">,</span>
<span class="n">bmi</span><span class="o">=</span><span class="mf">20.0</span><span class="p">,</span>
<span class="n">children</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">smoker</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">region</span><span class="o">=</span><span class="n">RegionEnum</span><span class="o">.</span><span class="n">southwest</span><span class="p">,</span>
<span class="n">prediction_id</span><span class="o">=</span><span class="s2">"asdf-1234-asdf-1234"</span><span class="p">))</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>InsuranceChargesModelOutput(charges=2231.7, prediction_id='asdf-1234-asdf-1234')
</code></pre></div>
<p>The prediction now carries the same prediction_id that we provided in the model's input; the decorator did not generate a new one.</p>
<p>This decorator will work with any model that works with the MLModel base class, as long as the UUID field can be attached to the root of the input and output schemas.</p>
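<p>To make the mechanics concrete, here is a minimal sketch of how such a decorator can be built with pydantic's <code>create_model</code>. This is a simplified stand-in, not the real PredictionIDDecorator or MLModel code; the <code>ToyModel</code> class and its fields are invented for illustration.</p>

```python
import uuid
from typing import Optional

from pydantic import BaseModel, create_model


class ToyInput(BaseModel):
    x: float


class ToyOutput(BaseModel):
    y: float


class ToyModel:
    """Stand-in for an MLModel: one predict method, two schemas."""
    input_schema = ToyInput
    output_schema = ToyOutput

    def predict(self, data: ToyInput) -> ToyOutput:
        return ToyOutput(y=data.x * 2.0)


class PredictionIDDecorator:
    """Wraps a model, extending its schemas with a prediction_id field."""

    def __init__(self, model):
        self._model = model
        # new input schema: optional prediction_id added to the original fields
        self.input_schema = create_model(
            model.input_schema.__name__ + "WithID",
            prediction_id=(Optional[str], None),
            __base__=model.input_schema)
        # new output schema: prediction_id is always present in the output
        self.output_schema = create_model(
            model.output_schema.__name__ + "WithID",
            prediction_id=(str, ...),
            __base__=model.output_schema)

    def predict(self, data):
        # use the caller's prediction_id if given, otherwise generate a UUID
        prediction_id = data.prediction_id or str(uuid.uuid4())
        inner = self._model.input_schema(**data.dict(exclude={"prediction_id"}))
        result = self._model.predict(inner)
        return self.output_schema(**result.dict(), prediction_id=prediction_id)


decorated = PredictionIDDecorator(ToyModel())
generated = decorated.predict(decorated.input_schema(x=1.0))
provided = decorated.predict(
    decorated.input_schema(x=1.0, prediction_id="asdf-1234"))
```

<p>This mirrors the behavior shown above: the id is passed through when supplied and generated otherwise, and the decorated model exposes new input and output schema classes.</p>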
<h2>Adding Decorators to a Deployed Model</h2>
<p>In order to deploy a model with a decorator we'll need a service that can add decorators to the model instance right after it is instantiated. This is supported by the rest_model_service package in version 0.2.0 and above. We built the rest_model_service package in a <a href="https://www.tekhnoal.com/rest-model-service.html">previous blog post</a> to easily deploy MLModel instances.</p>
<p>First, we'll install the rest_model_service package.</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">rest_model_service</span><span class="o">>=</span><span class="mf">0.2.0</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>In order to deploy the InsuranceChargesModel class, we'll create a configuration YAML file for the service:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>Notice that we're pointing to the InsuranceChargesModel class through its full class path in the insurance_charges_model package, so the service can import and instantiate it.</p>
<p>We can run the REST model service with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>configuration/rest_config.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>We can access the documentation at the root of the model service:</p>
<p><img alt="Service Documentation" src="https://www.tekhnoal.com/service_documentation.png" width="100%"></p>
<p>The model is running behind the "api/models/insurance_charges_model/prediction" endpoint. We can make a prediction with a curl command:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="p">(</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{"age": 65, "sex": "male", "bmi": 50, "children": 5, "smoker": true, "region": "southwest"}'</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46277.67}
</code></pre></div>
<p>We were able to make a prediction with the undecorated model; notice that we haven't loaded the decorator for the model yet. We'll stop the service with Ctrl-C and try that next.</p>
<p>Adding a decorator to the InsuranceChargesModel instance is done by adding the "decorators" key to the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
<span class="w"> </span><span class="nt">decorators</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">ml_model_decorators.prediction_id_decorator.PredictionIDDecorator</span><span class="w"></span>
</code></pre></div>
<p>We'll point the service to the new config file and restart it:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>configuration/decorators_config.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>With the service now restarted using the PredictionIDDecorator, we can view the documentation for this endpoint:</p>
<p><img alt="Endpoint Documentation" src="https://www.tekhnoal.com/endpoint_documentation.png" width="100%"></p>
<p>As you can see, the modified description of the model is now displayed instead of the old description and the example value has the prediction_id field. Now we can try to make a prediction again:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="p">(</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{"age": 65, "sex": "male", "bmi": 50, "children": 5, "smoker": true, "region": "southwest"}'</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46277.67,"prediction_id":"5edbec33-ebec-4cdc-908b-e7d90d4bc2a2"}
</code></pre></div>
<p>We've made a prediction without providing a prediction_id, and we have the generated prediction_id in the response.</p>
<p>We can make another prediction request but with a provided prediction_id:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="p">(</span><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="s1">'POST'</span> \
<span class="s1">'http://127.0.0.1:8000/api/models/insurance_charges_model/prediction'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'accept: application/json'</span> \
<span class="o">-</span><span class="n">H</span> <span class="s1">'Content-Type: application/json'</span> \
<span class="o">-</span><span class="n">d</span> <span class="s1">'{"age": 65, "sex": "male", "bmi": 50, "children": 5, "smoker": true, "region": "southwest", "prediction_id": "asdf-1234-asdf-1234"}'</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{"charges":46277.67,"prediction_id":"asdf-1234-asdf-1234"}
</code></pre></div>
<p>As we expected, the model is now returning prediction ids along with the predictions themselves. </p>
<p>Since the service is able to load decorators along with models, we can modify the runtime behavior of any model we wish, as long as we wrap the code in an MLModelDecorator class.</p>
<h2>Closing</h2>
<p>In this blog post, we showed how decorators work and how to create decorators that work with the MLModel base class. We also showed how to quickly deploy decorators on models inside a RESTful model service through configuration. Decorators are an easy way to add functionality to a model without having to modify the code of the model class itself: we deployed a UUID generator on an ML model instance without touching the model's class or the REST model service that hosts it. The combination of decorators and machine learning models can help us quickly and easily deploy common functionality to many different models.</p>
<h1>Property-Based Testing for ML Models</h1>
<p>Brian Schmidt, 2021-09-03</p>
<h2>Introduction</h2>
<p>Property-based testing is a form of software testing that allows
developers to write more comprehensive tests for software components.
Property-based tests work by asserting that certain properties of the
software component under test hold over a wide range of inputs.</p>
<p>Property-based tests rely on the generation of inputs for a component
and are a form of generative testing. When doing property-based testing
it is useful to think in terms of invariants within the software
component that we are testing. An invariant is a condition or assumption
that we expect will never be violated by the component.</p>
<p>Generative software testing is a type of testing in which a developer
does not have to come up with test cases manually. To accomplish this,
an engine is used that can come up with any number of test cases, as
long as we're able to state our requirements for the test cases clearly
and concisely. When the engine generates a test case for us, we send it
to the code that we are testing and see if any errors come up.
Generative testing is a form of black-box testing because we don't
need to know anything about the internals of the component under
test, only how to structure its input correctly. Once a
test case is generated, we test by making sure that the component
returns valid output or that it is not in an invalid state.</p>
<p>Machine learning models are just like any other software component, they
require input and provide an output. In fact, ML models are some of the
simplest software components that make up a software system because they
usually only have one function: the "predict" function. The prediction
usually only requires one object, and the prediction result is also a
single object. Because of these factors, ML models are actually great
candidates for property-based testing. In this blog post, we'll focus on
testing the input and output schemas of the model and we'll make sure
that the model is able to accept the inputs that it says that it can
accept. In terms of invariants, we'll be testing that the model can
handle any input that is within its stated input schema.</p>
<p>In this blog post, we'll do property-based testing of an ML model and a
RESTful model service that we'll build around the same model. To do
property-based testing we'll use the <a href="https://hypothesis.readthedocs.io/en/latest/">hypothesis
package</a>, and to do
property-based testing on the model service we'll use the <a href="https://schemathesis.readthedocs.io/en/stable/">schemathesis
package</a>.</p>
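<p>As a small taste of the style of test that hypothesis enables, here is a toy property-based test. The <code>clamp</code> function and its invariant are invented for illustration and are not part of the model we build below.</p>

```python
from hypothesis import given, strategies as st


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Clamp a number into the [low, high] interval."""
    return max(low, min(high, value))


# invariant: for any finite float, the result never leaves the interval
@given(st.floats(allow_nan=False, allow_infinity=False))
def test_clamp_stays_in_bounds(value):
    result = clamp(value)
    assert 0.0 <= result <= 1.0


# a @given-decorated test can be called directly; hypothesis then
# generates and runs many input cases against it
test_clamp_stays_in_bounds()
```

<p>Instead of hand-picking a few example inputs, we state the invariant once and let the engine search for inputs that break it.</p>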
<h1>Package Structure</h1>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">mobile_handset_price_model</span>
<span class="o">-</span> <span class="nv">model_files</span> <span class="ss">(</span><span class="nv">output</span> <span class="nv">files</span> <span class="nv">from</span> <span class="nv">model</span> <span class="nv">training</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">prediction</span> <span class="ss">(</span><span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">prediction</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">model</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">prediction</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">schemas</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">input</span> <span class="nv">and</span> <span class="nv">output</span> <span class="nv">schemas</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">transformers</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">transformers</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">training</span> <span class="ss">(</span><span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">training</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">data_exploration</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">exploration</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">data_preparation</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">preparation</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_training</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">training</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_validation</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">validation</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">unit</span> <span class="nv">tests</span> <span class="k">for</span> <span class="nv">model</span> <span class="nv">codel</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span> <span class="ss">(</span><span class="nv">list</span> <span class="nv">of</span> <span class="nv">dependencies</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">rest_config</span>.<span class="nv">yaml</span> <span class="ss">(</span><span class="nv">configuration</span> <span class="k">for</span> <span class="nv">REST</span> <span class="nv">model</span> <span class="nv">service</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">service_contract</span>.<span class="nv">yaml</span> <span class="ss">(</span><span class="nv">OpenAPI</span> <span class="nv">service</span> <span class="nv">contract</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span> <span class="ss">(</span><span class="nv">test</span> <span class="nv">dependencies</span><span class="ss">)</span>
</code></pre></div>
<p>All of the code is available in a <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models">github repository</a>.</p>
<h1>Creating a Model</h1>
<p>To be able to do property-based testing on an ML model, we'll need to
have a model to work with. In this section we will get a dataset,
explore it, preprocess it, train a model on it, and validate the
resulting model.</p>
<h2>Getting Data</h2>
<p>In order to train a model, we first need to have a dataset. We went into
Kaggle and found a dataset that contains mobile handset price
information. To make it easy to download the dataset, we installed the
kaggle python package and then we executed these commands to download
the data and unzip it into the data folder in the project:</p>
<div class="highlight"><pre><span></span><code>mkdir -p data
kaggle datasets download -d iabhishekofficial/mobile-price-classification -p ./data --unzip
</code></pre></div>
<p>To make it even easier to download the data, we added a Makefile target
for the commands:</p>
<div class="highlight"><pre><span></span><code><span class="nf">download-dataset</span><span class="o">:</span>
	mkdir -p data
	kaggle datasets download -d iabhishekofficial/mobile-price-classification -p ./data --unzip
</code></pre></div>
<p>Now all we need to do to get the data is execute this command:</p>
<div class="highlight"><pre><span></span><code>make download-data
</code></pre></div>
<p>Instead of having to remember how to get the data needed to do modeling,
I always try to create a repeatable and documented process for creating
the dataset. We also need to make sure to never store the dataset in
source control, so we'll add this line to the .gitignore file:</p>
<div class="highlight"><pre><span></span><code>data/
</code></pre></div>
<h2>Training a Model</h2>
<h3>Data Exploration</h3>
<p>In order to create a model, we'll first explore the data. Before we can
do that, we need to first load the data and do some basic housekeeping.</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/train.csv"</span><span class="p">)</span>
</code></pre></div>
<p>The datatypes of the columns are:</p>
<div class="highlight"><pre><span></span><code>data.dtypes
battery_power int64
blue int64
clock_speed float64
dual_sim int64
fc int64
four_g int64
int_memory int64
m_dep float64
mobile_wt int64
n_cores int64
pc int64
px_height int64
px_width int64
ram int64
sc_h int64
sc_w int64
talk_time int64
three_g int64
touch_screen int64
wifi int64
price_range int64
dtype: object
</code></pre></div>
<p>In order to more easily work with the data, we'll rename some of the
columns so that they have clearer names:</p>
<div class="highlight"><pre><span></span><code><span class="n">columns</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"blue"</span><span class="p">:</span> <span class="s2">"has_bluetooth"</span><span class="p">,</span>
<span class="s2">"dual_sim"</span><span class="p">:</span> <span class="s2">"has_dual_sim"</span><span class="p">,</span>
<span class="s2">"fc"</span><span class="p">:</span> <span class="s2">"front_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"four_g"</span><span class="p">:</span> <span class="s2">"has_four_g"</span><span class="p">,</span>
<span class="s2">"int_memory"</span><span class="p">:</span> <span class="s2">"internal_memory"</span><span class="p">,</span>
<span class="s2">"m_dep"</span><span class="p">:</span> <span class="s2">"depth"</span><span class="p">,</span>
<span class="s2">"mobile_wt"</span><span class="p">:</span> <span class="s2">"weight"</span><span class="p">,</span>
<span class="s2">"n_cores"</span><span class="p">:</span> <span class="s2">"number_of_cores"</span><span class="p">,</span>
<span class="s2">"pc"</span><span class="p">:</span> <span class="s2">"primary_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"px_height"</span><span class="p">:</span> <span class="s2">"pixel_resolution_height"</span><span class="p">,</span>
<span class="s2">"px_width"</span><span class="p">:</span> <span class="s2">"pixel_resolution_width"</span><span class="p">,</span>
<span class="s2">"sc_h"</span><span class="p">:</span> <span class="s2">"screen_height"</span><span class="p">,</span>
<span class="s2">"sc_w"</span><span class="p">:</span> <span class="s2">"screen_width"</span><span class="p">,</span>
<span class="s2">"three_g"</span><span class="p">:</span> <span class="s2">"has_three_g"</span><span class="p">,</span>
<span class="s2">"touch_screen"</span><span class="p">:</span> <span class="s2">"has_touch_screen"</span><span class="p">,</span>
<span class="s2">"wifi"</span><span class="p">:</span> <span class="s2">"has_wifi"</span>
<span class="p">}</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="n">columns</span><span class="p">)</span>
</code></pre></div>
<p>We also need to get the unique values of the variable we intend to use
as the target variable:</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="p">[</span><span class="s2">"price_range"</span><span class="p">]</span><span class="o">.</span><span class="n">unique</span><span class="p">()</span>
<span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
</code></pre></div>
<p>The target variable holds categorical values.</p>
<p>To finish the data exploration we'll use the
<a href="https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/">pandas_profiling</a>
package. This package is able to take a pandas dataframe and quickly
create a full report about the dataset in the dataframe. Here are some
simple statistics found by pandas_profiling:</p>
<p><img alt="Dataset Statistics" src="https://www.tekhnoal.com/dataset_statistics.png" width="100%"></p>
<p>The dataset has 21 variables in total, with 14 numeric variables and 7
categorical variables. There are 2000 samples, with no missing values or
duplicate values. After examining the report, we can see that the
categorical variables all hold only two values, for example the
"has_bluetooth" variable:</p>
<p><img alt="Variable Description" src="https://www.tekhnoal.com/variable_description.png" width="100%"></p>
<p>From this we can see that these are really just boolean values, we'll
use this later in order to simplify the input schema of the model.</p>
<p>The data exploration is in <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/training/data_exploration.ipynb">this
notebook</a>.</p>
<h3>Preparing the Data</h3>
<p>To prepare the data for modeling, we'll first create lists of
categorical, numerical, and boolean variables:</p>
<div class="highlight"><pre><span></span><code><span class="n">categorical_cols</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">numerical_columns</span> <span class="o">=</span> <span class="p">[</span>
<span class="s2">"battery_power"</span><span class="p">,</span>
<span class="s2">"clock_speed"</span><span class="p">,</span>
<span class="s2">"front_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"internal_memory"</span><span class="p">,</span>
<span class="s2">"depth"</span><span class="p">,</span>
<span class="s2">"weight"</span><span class="p">,</span>
<span class="s2">"number_of_cores"</span><span class="p">,</span>
<span class="s2">"primary_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"pixel_resolution_height"</span><span class="p">,</span>
<span class="s2">"pixel_resolution_width"</span><span class="p">,</span>
<span class="s2">"ram"</span><span class="p">,</span>
<span class="s2">"screen_height"</span><span class="p">,</span>
<span class="s2">"screen_width"</span><span class="p">,</span>
<span class="s2">"talk_time"</span>
<span class="p">]</span>
<span class="n">boolean_columns</span> <span class="o">=</span> <span class="p">[</span>
<span class="s2">"has_bluetooth"</span><span class="p">,</span>
<span class="s2">"has_dual_sim"</span><span class="p">,</span>
<span class="s2">"has_four_g"</span><span class="p">,</span>
<span class="s2">"has_three_g"</span><span class="p">,</span>
<span class="s2">"has_touch_screen"</span><span class="p">,</span>
<span class="s2">"has_wifi"</span><span class="p">,</span>
<span class="p">]</span>
</code></pre></div>
<p>Because all of the categorical variables are in fact boolean variables,
we don't have any variables in the "categorical_cols" list. Next, we'll
create a transformer that will work with the numerical variables:</p>
<div class="highlight"><pre><span></span><code><span class="n">numerical_transformer</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[</span>
<span class="p">(</span><span class="s2">"imputer"</span><span class="p">,</span> <span class="n">SimpleImputer</span><span class="p">(</span><span class="n">strategy</span><span class="o">=</span><span class="s2">"mean"</span><span class="p">)),</span>
<span class="p">(</span><span class="s2">"scaler"</span><span class="p">,</span> <span class="n">StandardScaler</span><span class="p">())</span>
<span class="p">])</span>
</code></pre></div>
<p>Next, we'll create a transformer that is able to convert the values in
the boolean columns to boolean values:</p>
<div class="highlight"><pre><span></span><code><span class="n">boolean_transformer</span> <span class="o">=</span> <span class="n">BooleanTransformer</span><span class="p">(</span><span class="n">true_value</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">false_value</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div>
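<p>The BooleanTransformer is a custom transformer that lives in the transformers.py module of the package. A minimal sketch of what such a transformer can look like follows; the real implementation may differ in details, this version simply maps the two sentinel values to booleans.</p>

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class BooleanTransformer(BaseEstimator, TransformerMixin):
    """Convert columns holding two sentinel values into boolean columns."""

    def __init__(self, true_value=1, false_value=0):
        self.true_value = true_value
        self.false_value = false_value

    def fit(self, X, y=None):
        # stateless transformer: there is nothing to learn from the data
        return self

    def transform(self, X):
        X = np.asarray(X)
        # values equal to true_value become True, everything else False
        return X == self.true_value


transformer = BooleanTransformer(true_value=1, false_value=0)
result = transformer.fit_transform(np.array([[1, 0], [0, 1]]))
```

<p>Deriving from BaseEstimator and TransformerMixin gives us <code>fit_transform</code> and parameter handling for free, which is what lets the transformer plug into a scikit-learn Pipeline or ColumnTransformer.</p>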
<p>Lastly, we'll combine both transformers using a ColumnTransformer:</p>
<div class="highlight"><pre><span></span><code><span class="n">column_transformer</span> <span class="o">=</span> <span class="n">ColumnTransformer</span><span class="p">(</span>
<span class="n">remainder</span><span class="o">=</span><span class="s2">"passthrough"</span><span class="p">,</span>
<span class="n">transformers</span><span class="o">=</span><span class="p">[</span>
<span class="p">(</span><span class="s2">"numerical"</span><span class="p">,</span> <span class="n">numerical_transformer</span><span class="p">,</span> <span class="n">numerical_columns</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"boolean"</span><span class="p">,</span> <span class="n">boolean_transformer</span><span class="p">,</span> <span class="n">boolean_columns</span><span class="p">)</span>
<span class="p">]</span>
<span class="p">)</span>
</code></pre></div>
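<p>ColumnTransformer applies each listed transformer to its own subset of columns and concatenates the results, passing any remaining columns through untouched. A small standalone example of that behavior, with a built-in scaler standing in for the pipeline above and invented column names:</p>

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "battery": [1000.0, 2000.0, 3000.0],   # numerical column to scale
    "has_wifi": [1, 0, 1],                 # column left for the remainder
})

ct = ColumnTransformer(
    remainder="passthrough",
    transformers=[("numerical", StandardScaler(), ["battery"])])

result = ct.fit_transform(df)
# column 0 is the scaled battery values, column 1 is has_wifi passed through
```
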
<p>Now we can save the transformer object so we can fit it to the data
later:</p>
<div class="highlight"><pre><span></span><code><span class="n">joblib</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">column_transformer</span><span class="p">,</span> <span class="s2">"column_transformer.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>The data preparation code is in <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/training/data_preparation.ipynb">this
notebook</a>.</p>
<h3>Training a Model</h3>
<p>Now that we have the data transformations built, we can train a model.
To do that, we'll first create lists of the predictor variables and the
target column:</p>
<div class="highlight"><pre><span></span><code><span class="n">feature_columns</span> <span class="o">=</span> <span class="p">[</span>
<span class="s2">"battery_power"</span><span class="p">,</span>
<span class="s2">"has_bluetooth"</span><span class="p">,</span>
<span class="s2">"clock_speed"</span><span class="p">,</span>
<span class="s2">"has_dual_sim"</span><span class="p">,</span>
<span class="s2">"front_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"has_four_g"</span><span class="p">,</span>
<span class="s2">"internal_memory"</span><span class="p">,</span>
<span class="s2">"depth"</span><span class="p">,</span>
<span class="s2">"weight"</span><span class="p">,</span>
<span class="s2">"number_of_cores"</span><span class="p">,</span>
<span class="s2">"primary_camera_megapixels"</span><span class="p">,</span>
<span class="s2">"pixel_resolution_height"</span><span class="p">,</span>
<span class="s2">"pixel_resolution_width"</span><span class="p">,</span>
<span class="s2">"ram"</span><span class="p">,</span>
<span class="s2">"screen_height"</span><span class="p">,</span>
<span class="s2">"screen_width"</span><span class="p">,</span>
<span class="s2">"talk_time"</span><span class="p">,</span>
<span class="s2">"has_three_g"</span><span class="p">,</span>
<span class="s2">"has_touch_screen"</span><span class="p">,</span>
<span class="s2">"has_wifi"</span>
<span class="p">]</span>
<span class="n">target_column</span> <span class="o">=</span> <span class="s2">"price_range"</span>
</code></pre></div>
<p>Next, we'll split the dataset into training, validation, and test sets
and then create dataframes for the predictor and target variables:</p>
<div class="highlight"><pre><span></span><code><span class="n">train</span><span class="p">,</span> <span class="n">validate</span><span class="p">,</span> <span class="n">test</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">sample</span><span class="p">(</span><span class="n">frac</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="p">[</span><span class="nb">int</span><span class="p">(</span><span class="mf">0.6</span><span class="o">*</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)),</span> <span class="nb">int</span><span class="p">(</span><span class="mf">0.8</span><span class="o">*</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">))])</span>
<span class="n">X_train</span> <span class="o">=</span> <span class="n">train</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_train</span> <span class="o">=</span> <span class="n">train</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
<span class="n">X_validate</span> <span class="o">=</span> <span class="n">validate</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_validate</span> <span class="o">=</span> <span class="n">validate</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
</code></pre></div>
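<p>The <code>np.split</code> call above splits at the 60% and 80% marks of the shuffled data, which yields a 60/20/20 train/validate/test split. A minimal sketch (using a stand-in array rather than the real dataset) shows the proportions:</p>

```python
import numpy as np

# Stand-in for the shuffled dataset; only the length matters here.
data = np.arange(100)

# Splitting at the 60% and 80% indices produces three pieces:
# [0, 60), [60, 80), and [80, 100).
train, validate, test = np.split(data, [int(0.6 * len(data)), int(0.8 * len(data))])

print(len(train), len(validate), len(test))  # → 60 20 20
```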
<p>We'll need the transformer we created earlier, so we'll load it from
disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">transformer</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">"column_transformer.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>Next, we'll create an XGBClassifier model:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">XGBClassifier</span><span class="p">()</span>
</code></pre></div>
<p>And combine it with the transformer to create a single pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="n">pipeline</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[</span>
<span class="p">(</span><span class="s2">"preprocessor"</span><span class="p">,</span> <span class="n">transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"model"</span><span class="p">,</span> <span class="n">model</span><span class="p">)</span>
<span class="p">])</span>
</code></pre></div>
<p>Next, we'll fit the pipeline to the training set:</p>
<div class="highlight"><pre><span></span><code><span class="n">pipeline</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<p>Now we can make a single prediction to check that everything is working. Note that we call <code>predict</code> on the pipeline rather than on the bare model, so that the preprocessing step runs before the classifier sees the data:</p>
<div class="highlight"><pre><span></span><code><span class="n">result</span> <span class="o">=</span> <span class="n">pipeline</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_validate</span><span class="o">.</span><span class="n">iloc</span><span class="p">[[</span><span class="mi">0</span><span class="p">]])</span>
<span class="nb">print</span><span class="p">(</span><span class="n">result</span><span class="p">)</span>
<span class="n">array</span><span class="p">([</span><span class="mi">3</span><span class="p">])</span>
</code></pre></div>
<p>However, this is not the final model yet; we'll tune the hyperparameters
using the <a href="https://hyperopt.github.io/hyperopt/">hyperopt package</a>. The hyperparameter
space is defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">space</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"max_depth"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">quniform</span><span class="p">(</span><span class="s2">"max_depth"</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"gamma"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="s2">"gamma"</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">9</span><span class="p">),</span>
<span class="s2">"reg_alpha"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">quniform</span><span class="p">(</span><span class="s2">"reg_alpha"</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">180</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"reg_lambda"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="s2">"reg_lambda"</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"colsample_bytree"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="s2">"colsample_bytree"</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"min_child_weight"</span><span class="p">:</span> <span class="n">hp</span><span class="o">.</span><span class="n">quniform</span><span class="p">(</span><span class="s2">"min_child_weight"</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"n_estimators"</span><span class="p">:</span> <span class="mi">180</span><span class="p">,</span>
<span class="s2">"seed"</span><span class="p">:</span> <span class="mi">0</span>
<span class="p">}</span>
</code></pre></div>
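<p>One detail worth noting: <code>hp.quniform</code> draws uniformly from the range and rounds the result to a multiple of <code>q</code>, but the sampled value is still a float. A rough pure-Python sketch of that behavior (hyperopt itself computes this with numpy; the helper below is hypothetical, for illustration only):</p>

```python
import random

rng = random.Random(0)

def quniform_sample(low, high, q):
    # Mimics hp.quniform: round(uniform(low, high) / q) * q.
    # hyperopt computes this with numpy, so the result is a float
    # even when q == 1 -- which is why integer-valued parameters
    # like max_depth must be cast to int before use.
    return float(round(rng.uniform(low, high) / q) * q)

value = quniform_sample(3, 18, 1)
print(type(value).__name__, value)
```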
<p>And the objective function looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">objective</span><span class="p">(</span><span class="n">space</span><span class="p">):</span>
<span class="n">classifier</span> <span class="o">=</span> <span class="n">XGBClassifier</span><span class="p">(</span>
<span class="n">n_estimators</span><span class="o">=</span><span class="n">space</span><span class="p">[</span><span class="s2">"n_estimators"</span><span class="p">],</span>
<span class="n">max_depth</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">space</span><span class="p">[</span><span class="s2">"max_depth"</span><span class="p">]),</span>
<span class="n">gamma</span><span class="o">=</span><span class="n">space</span><span class="p">[</span><span class="s2">"gamma"</span><span class="p">],</span>
<span class="n">reg_alpha</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">space</span><span class="p">[</span><span class="s2">"reg_alpha"</span><span class="p">]),</span>
<span class="n">min_child_weight</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">space</span><span class="p">[</span><span class="s2">"min_child_weight"</span><span class="p">]),</span>
<span class="n">colsample_bytree</span><span class="o">=</span><span class="n">space</span><span class="p">[</span><span class="s2">"colsample_bytree"</span><span class="p">]</span>  <span class="c1"># a float in [0.5, 1]; must not be cast to int</span>
<span class="p">)</span>
<span class="n">evaluation</span> <span class="o">=</span> <span class="p">[(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">),</span> <span class="p">(</span><span class="n">X_validate</span><span class="p">,</span> <span class="n">y_validate</span><span class="p">)]</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span> <span class="n">eval_set</span><span class="o">=</span><span class="n">evaluation</span><span class="p">,</span> <span class="n">eval_metric</span><span class="o">=</span><span class="s2">"merror"</span><span class="p">,</span> <span class="n">early_stopping_rounds</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="n">predictions</span> <span class="o">=</span> <span class="n">classifier</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_validate</span><span class="p">)</span>
<span class="n">accuracy</span> <span class="o">=</span> <span class="n">accuracy_score</span><span class="p">(</span><span class="n">y_validate</span><span class="p">,</span> <span class="n">predictions</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"SCORE: "</span><span class="p">,</span> <span class="n">accuracy</span><span class="p">)</span>
<span class="k">return</span> <span class="p">{</span>
<span class="s2">"loss"</span><span class="p">:</span> <span class="o">-</span><span class="n">accuracy</span><span class="p">,</span>
<span class="s2">"status"</span><span class="p">:</span> <span class="n">STATUS_OK</span>
<span class="p">}</span>
</code></pre></div>
<p>We'll run the hyperparameter search like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">trials</span> <span class="o">=</span> <span class="n">Trials</span><span class="p">()</span>
<span class="n">best_hyperparameters</span> <span class="o">=</span> <span class="n">fmin</span><span class="p">(</span><span class="n">fn</span> <span class="o">=</span> <span class="n">objective</span><span class="p">,</span>
<span class="n">space</span> <span class="o">=</span> <span class="n">space</span><span class="p">,</span>
<span class="n">algo</span> <span class="o">=</span> <span class="n">tpe</span><span class="o">.</span><span class="n">suggest</span><span class="p">,</span>
<span class="n">max_evals</span> <span class="o">=</span> <span class="mi">100</span><span class="p">,</span>
<span class="n">trials</span> <span class="o">=</span> <span class="n">trials</span><span class="p">)</span>
</code></pre></div>
<p>The best hyperparameters found are these:</p>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">colsample_bytree</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">0.7805313948569044</span><span class="p">,</span><span class="w"> </span>
<span class="w"> </span><span class="p">'</span><span class="n">gamma</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">2.8457210780834963</span><span class="p">,</span><span class="w"> </span>
<span class="w"> </span><span class="p">'</span><span class="n">max_depth</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">8.0</span><span class="p">,</span><span class="w"> </span>
<span class="w"> </span><span class="p">'</span><span class="n">min_child_weight</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">8.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="p">'</span><span class="n">reg_alpha</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">86.0</span><span class="p">,</span><span class="w"> </span>
<span class="w"> </span><span class="p">'</span><span class="n">reg_lambda</span><span class="p">'</span><span class="o">:</span><span class="w"> </span><span class="mf">0.23805965814363095</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>Now that we have found the best hyperparameters, we'll train the final
model. Because the quniform-sampled values come back as floats, we cast the
integer-valued hyperparameters to int before passing them to the
XGBClassifier:</p>
<div class="highlight"><pre><span></span><code><span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="p">[</span><span class="s2">"max_depth"</span><span class="p">,</span> <span class="s2">"min_child_weight"</span><span class="p">,</span> <span class="s2">"reg_alpha"</span><span class="p">]:</span>
    <span class="n">best_hyperparameters</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">best_hyperparameters</span><span class="p">[</span><span class="n">key</span><span class="p">])</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">XGBClassifier</span><span class="p">(</span><span class="o">**</span><span class="n">best_hyperparameters</span><span class="p">)</span>
<span class="n">pipeline</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[</span>
<span class="p">(</span><span class="s2">"preprocessor"</span><span class="p">,</span> <span class="n">transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"model"</span><span class="p">,</span> <span class="n">model</span><span class="p">)])</span>
<span class="n">pipeline</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<p>Lastly, we can save the model object:</p>
<div class="highlight"><pre><span></span><code><span class="n">joblib</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">pipeline</span><span class="p">,</span> <span class="s2">"model.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>The model training code is in <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/training/model_training.ipynb">this
notebook</a>.</p>
<h3>Validating the Model</h3>
<p>To validate the model, we'll use the <a href="https://www.scikit-yb.org/en/latest/">yellowbrick
package</a>. First, we'll load
the fitted model object that was saved in a previous step:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">"model.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>The yellowbrick package can create a classification report like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">yellowbrick.classifier</span> <span class="kn">import</span> <span class="n">ClassificationReport</span>
<span class="n">visualizer</span> <span class="o">=</span> <span class="n">ClassificationReport</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">classes</span><span class="p">,</span> <span class="n">support</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p>The resulting graph looks like this:</p>
<p><img alt="Classification Report" src="https://www.tekhnoal.com/classification_report.png" width="100%"></p>
<p>The classification report visualizer displays the precision, recall, F1,
and support scores for the model for each class in the target variable.</p>
<p>A confusion matrix is created like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">yellowbrick.classifier</span> <span class="kn">import</span> <span class="n">ConfusionMatrix</span>
<span class="n">visualizer</span> <span class="o">=</span> <span class="n">ConfusionMatrix</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">classes</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Confusion Matrix" src="https://www.tekhnoal.com/confusion_matrix.png" width="100%"></p>
<p>The ROC/AUC plot is created like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">yellowbrick.classifier</span> <span class="kn">import</span> <span class="n">ROCAUC</span>
<span class="n">visualizer</span> <span class="o">=</span> <span class="n">ROCAUC</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">classes</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p><img alt="ROCAUC" src="https://www.tekhnoal.com/roc_auc.png" width="100%"></p>
<p>The class prediction error plot is done like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">yellowbrick.classifier</span> <span class="kn">import</span> <span class="n">ClassPredictionError</span>
<span class="n">visualizer</span> <span class="o">=</span> <span class="n">ClassPredictionError</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">classes</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p><img alt="Class Prediction Error" src="https://www.tekhnoal.com/class_prediction_error.png" width="100%"></p>
<p>Now that we have a fully trained and validated model and we understand
the underlying data that we used to create the model, we can move
forward with writing the code that we'll use to make predictions with
the model.</p>
<p>The model validation code is in <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/training/model_validation.ipynb">this
notebook</a>.</p>
<h2>Creating the Model Schemas</h2>
<p>In order to use the model, we'll need to define its input and output
schemas. To do this, we'll use the <a href="https://docs.pydantic.dev/">pydantic
package</a> to
define two classes. The model input class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MobileHandsetPriceModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Schema for input of the model's predict method."""</span>
<span class="n">battery_power</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"battery_power"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">2000</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Total energy a battery can store in one time measured in mAh."</span><span class="p">)</span>
<span class="n">has_bluetooth</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_bluetooth"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has bluetooth."</span><span class="p">)</span>
<span class="n">clock_speed</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">float</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"clock_speed"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">3.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Speed of microprocessor in gHz."</span><span class="p">)</span>
<span class="n">has_dual_sim</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">bool</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_dual_sim"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has dual SIM slots."</span><span class="p">)</span>
<span class="n">front_camera_megapixels</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"front_camera_megapixels"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Front camera mega pixels."</span><span class="p">)</span>
<span class="n">has_four_g</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_four_g"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has 4G."</span><span class="p">)</span>
<span class="n">internal_memory</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"internal_memory"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">664</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Internal memory in gigabytes."</span><span class="p">)</span>
<span class="n">depth</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">float</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"depth"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Depth of mobile phone in cm."</span><span class="p">)</span>
<span class="n">weight</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"weight"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">80</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Weight of mobile phone."</span><span class="p">)</span>
<span class="n">number_of_cores</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"number_of_cores"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Number of cores of processor."</span><span class="p">)</span>
<span class="n">primary_camera_megapixels</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"primary_camera_megapixels"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Primary camera mega pixels."</span><span class="p">)</span>
<span class="n">pixel_resolution_height</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"pixel_resolution_height"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">1960</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Pixel resolution height."</span><span class="p">)</span>
<span class="n">pixel_resolution_width</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"pixel_resolution_width"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">1998</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Pixel resolution width."</span><span class="p">)</span>
<span class="n">ram</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"ram"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">3998</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Random access memory in megabytes."</span><span class="p">)</span>
<span class="n">screen_height</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"screen_height"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">19</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Screen height of mobile in cm."</span><span class="p">)</span>
<span class="n">screen_width</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"screen_width"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Screen width of mobile in cm."</span><span class="p">)</span>
<span class="n">talk_time</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"talk_time"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Longest time that a single battery charge will last when on phone call."</span><span class="p">)</span>
<span class="n">has_three_g</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_three_g"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has 3G or not."</span><span class="p">)</span>
<span class="n">has_touch_screen</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_touch_screen"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has a touchscreen or not."</span><span class="p">)</span>
<span class="n">has_wifi</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_wifi"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has wifi or not."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/schemas.py#L6-L41">here</a>.</p>
<p>The input schema of the model defines what is acceptable input for the
model and also provides a user-friendly interface to the code that is
calling the model.</p>
<p>To make the model's input easier to understand, we've replaced the
binary categorical input variables with booleans that can have values
of "True" or "False". For example, the model expects the has_bluetooth
variable to contain either a "0" or a "1". Instead of forcing the user
to understand the semantics of these values in order to provide input
to the model, we convert "True" to "1" and "False" to "0" before
passing the input to the model.</p>
<p>Another example of user-friendliness is the addition of "greater
than or equal" and "less than or equal" limits to the numerical
variables. These limits are enforced by pydantic when the class is
instantiated, and they clearly communicate which values the model
accepts for the numerical variables. The bounds match the contents of
the training set; for example, "battery_power" has a lower bound of
500 and an upper bound of 2000, which are the minimum and maximum
values found in the training data for this variable.</p>
<p>The pydantic package allows us to add descriptions to each field that
help the user to understand the fields that the model expects. The
pydantic package also supports the generation of JSON schema documents
from a schema class. The JSON schema of the input class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"MobileHandsetPriceModelInput"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Schema for input of the model's predict method."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"object"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"properties"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"battery_power"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"battery_power"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Total energy a battery can store in one time measured in mAh."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"minimum"</span><span class="p">:</span><span class="w"> </span><span class="mi">500</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"maximum"</span><span class="p">:</span><span class="w"> </span><span class="mi">2000</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"integer"</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="nt">"has_bluetooth"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"has_bluetooth"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Whether the phone has bluetooth."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"boolean"</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="nt">"clock_speed"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"clock_speed"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Speed of microprocessor in gHz."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"minimum"</span><span class="p">:</span><span class="w"> </span><span class="mf">0.5</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"maximum"</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"number"</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="nt">"has_dual_sim"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"has_dual_sim"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Whether the phone has dual SIM slots."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"boolean"</span><span class="w"></span>
<span class="w"> </span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="nt">"front_camera_megapixels"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"front_camera_megapixels"</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Front camera mega pixels."</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"minimum"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"maximum"</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"integer"</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="err">...</span><span class="w"></span>
</code></pre></div>
<p>The model also requires a schema for its output. Before we can define
it, we need to define the allowed values. To do that we'll use an Enum
class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">PriceEnum</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="n">zero</span> <span class="o">=</span> <span class="s2">"zero"</span>
<span class="n">one</span> <span class="o">=</span> <span class="s2">"one"</span>
<span class="n">two</span> <span class="o">=</span> <span class="s2">"two"</span>
<span class="n">three</span> <span class="o">=</span> <span class="s2">"three"</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/schemas.py#L44-L48">here</a>.</p>
<p>The four allowed values match the output of the model. We defined this
as an enumeration because this is a classification model, even though
the outputs look like numbers.</p>
<p>Now we can define the output schema class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MobileHandsetPriceModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">price_range</span><span class="p">:</span> <span class="n">PriceEnum</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Price Range"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Price range class."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/schemas.py#L51-L53">here</a>.</p>
<p>The "price_range" variable uses the PriceEnum enumeration to define what
the allowed values are.</p>
<h2>Creating the Model Class</h2>
<p>Now that we have the model's input and output schemas defined, we can
move on to creating a class that wraps the model and makes predictions.
This class makes the model much easier to use because it abstracts away
the model's low-level details.</p>
<p>To start, we'll define the class and add all of the required
properties:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MobileHandsetPriceModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Mobile Handset Price Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"mobile_handset_price_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Model to predict the price of a mobile phone."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="n">__version__</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">MobileHandsetPriceModelInput</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">MobileHandsetPriceModelOutput</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/model.py#L19-L50">here</a>.</p>
<p>The properties of the class return metadata about the model. The
input and output schema classes are returned from the input_schema and
output_schema properties, allowing callers to introspect the model's
schemas.</p>
<p>The __init__ method of the class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)))</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"model_files"</span><span class="p">,</span> <span class="s2">"1"</span><span class="p">,</span> <span class="s2">"model.joblib"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/model.py#L52-L61">here</a>.</p>
<p>The __init__ method initializes the model; after it completes, the
model object is ready to make predictions.</p>
<p>The predict() method is the last method we need to define:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="n">MobileHandsetPriceModelInput</span><span class="p">)</span> <span class="o">-></span> <span class="n">MobileHandsetPriceModelOutput</span><span class="p">:</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span><span class="n">data</span><span class="o">.</span><span class="n">battery_power</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">has_bluetooth</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">clock_speed</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">has_dual_sim</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">front_camera_megapixels</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">has_four_g</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">internal_memory</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">depth</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">weight</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">number_of_cores</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">primary_camera_megapixels</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">pixel_resolution_height</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">pixel_resolution_width</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">ram</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">screen_height</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">screen_width</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">talk_time</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">has_three_g</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">has_touch_screen</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">has_wifi</span><span class="p">]],</span>
<span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s2">"battery_power"</span><span class="p">,</span> <span class="s2">"has_bluetooth"</span><span class="p">,</span> <span class="s2">"clock_speed"</span><span class="p">,</span>
<span class="s2">"has_dual_sim"</span><span class="p">,</span> <span class="s2">"front_camera_megapixels"</span><span class="p">,</span> <span class="s2">"has_four_g"</span><span class="p">,</span>
<span class="s2">"internal_memory"</span><span class="p">,</span> <span class="s2">"depth"</span><span class="p">,</span> <span class="s2">"weight"</span><span class="p">,</span> <span class="s2">"number_of_cores"</span><span class="p">,</span>
<span class="s2">"primary_camera_megapixels"</span><span class="p">,</span> <span class="s2">"pixel_resolution_height"</span><span class="p">,</span>
<span class="s2">"pixel_resolution_width"</span><span class="p">,</span> <span class="s2">"ram"</span><span class="p">,</span> <span class="s2">"screen_height"</span><span class="p">,</span>
<span class="s2">"screen_width"</span><span class="p">,</span> <span class="s2">"talk_time"</span><span class="p">,</span> <span class="s2">"has_three_g"</span><span class="p">,</span>
<span class="s2">"has_touch_screen"</span><span class="p">,</span> <span class="s2">"has_wifi"</span><span class="p">])</span>
<span class="c1"># making the prediction and extracting the result from the array</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="n">output_class_map</span><span class="p">[</span><span class="nb">str</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])]</span>
<span class="k">return</span> <span class="n">MobileHandsetPriceModelOutput</span><span class="p">(</span><span class="n">price_range</span><span class="o">=</span><span class="n">y_hat</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/mobile_handset_price_model/prediction/model.py#L63-L85">here</a>.</p>
<p>This method accepts a pydantic object that matches the model's input
schema and returns a pydantic object that matches the model's output
schema.</p>
<h1>Adding the Property-Based Tests</h1>
<p>The model class is now ready for property-based testing. To write the
tests we'll use the hypothesis package, which we can install with this
command:</p>
<div class="highlight"><pre><span></span><code>pip install hypothesis
</code></pre></div>
<p>To launch a set of hypothesis tests, we'll write a simple test class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelPropertyBasedTests</span><span class="p">(</span><span class="n">TestCase</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">setUp</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">counter</span> <span class="o">=</span> <span class="mi">0</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">MobileHandsetPriceModel</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">tearDown</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Generated and tested </span><span class="si">{}</span><span class="s2"> examples."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">counter</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/tests/property_based_tests.py#L10-L17">here</a>.</p>
<p>The test class defines a setUp() method which initializes a counter
to 0 and instantiates the model object. The setUp() method is executed
before every test case, so by loading the model object here rather than
inside the test method, we avoid re-instantiating it for each of the
many examples hypothesis generates. The tearDown() method is executed
after each test case; we'll use it to print out how many examples we
tested.</p>
<div class="highlight"><pre><span></span><code><span class="nd">@settings</span><span class="p">(</span><span class="n">deadline</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">max_examples</span><span class="o">=</span><span class="mi">1000</span><span class="p">)</span>
<span class="nd">@given</span><span class="p">(</span><span class="n">strategies</span><span class="o">.</span><span class="n">builds</span><span class="p">(</span><span class="n">MobileHandsetPriceModelInput</span><span class="p">))</span>
<span class="k">def</span> <span class="nf">test_model_input</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="c1"># act</span>
<span class="n">result</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="c1"># assert</span>
<span class="bp">self</span><span class="o">.</span><span class="n">assertTrue</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">result</span><span class="p">)</span> <span class="ow">is</span> <span class="n">MobileHandsetPriceModelOutput</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">assertTrue</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">result</span><span class="o">.</span><span class="n">price_range</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PriceEnum</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">counter</span> <span class="o">+=</span> <span class="mi">1</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/tests/property_based_tests.py#L19-L29">here</a>.</p>
<p>The test_model_input test case is decorated with two decorators that
turn it into a hypothesis test. The @settings decorator tells the
hypothesis package that there is no deadline for completion of the test
case and that we would like to test with 1000 samples. The @given
decorator tells the hypothesis package to build samples for testing
from the MobileHandsetPriceModelInput schema. The hypothesis package
then generates 1000 samples from the schema class and calls the
test_model_input method 1000 times with the generated samples.</p>
<p>The test method itself is very simple: it makes a prediction with the
sample generated by hypothesis and asserts that the result is of the
right type. If any exception is raised during the execution of the
predict method, the test fails. The counter we initialized is
incremented every time a test case executes.</p>
<p>To execute the hypothesis tests, we'll use the pytest command:</p>
<div class="highlight"><pre><span></span><code>py.test ./tests/property_based_tests.py --hypothesis-show-statistics
</code></pre></div>
<p>The output of the command tells us a bit about the test:</p>
<div class="highlight"><pre><span></span><code><span class="o">==========================</span> Hypothesis <span class="nv">Statistics</span><span class="o">=============================</span>
tests/property_based_tests.py::ModelPropertyBasedTests::test_model_input:
- during reuse phase <span class="o">(</span><span class="m">0</span>.00 seconds<span class="o">)</span>:
- Typical runtimes: < 1ms, ~ <span class="m">86</span>% <span class="k">in</span> data generation
- <span class="m">0</span> passing examples, <span class="m">0</span> failing examples, <span class="m">1</span> invalid examples
- during generate phase <span class="o">(</span><span class="m">0</span>.36 seconds<span class="o">)</span>:
- Typical runtimes: <span class="m">6</span>-293 ms, ~ <span class="m">7</span>% <span class="k">in</span> data generation
- <span class="m">2</span> passing examples, <span class="m">7</span> failing examples, <span class="m">0</span> invalid examples
- Found <span class="m">1</span> failing example <span class="k">in</span> this phase
- during shrink phase <span class="o">(</span><span class="m">0</span>.10 seconds<span class="o">)</span>:
- Typical runtimes: <span class="m">0</span>-7 ms, ~ <span class="m">68</span>% <span class="k">in</span> data generation
- <span class="m">2</span> passing examples, <span class="m">6</span> failing examples, <span class="m">22</span> invalid examples
- Tried <span class="m">30</span> shrinks of which <span class="m">8</span> were successful
- Stopped because nothing left to <span class="k">do</span>
<span class="o">===========================</span> short <span class="nb">test</span> summary <span class="nv">info</span><span class="o">===========================</span>
FAILED
tests/property_based_tests.py::ModelPropertyBasedTests::test_model_input
- ValueError: Value: -1 cannot be mapped to a boolean value.
<span class="o">==============================</span> <span class="m">1</span> failed <span class="k">in</span> <span class="m">1</span>.70s<span class="o">=============================</span>
</code></pre></div>
<p>The test failed almost immediately. The error raised is:
"ValueError: Value: -1 cannot be mapped to a boolean value." in the
mobile_handset_price_model/prediction/transformers.py file. This error
is easy to debug because we introduced the problem ourselves in the
first place! The problem lies in the input schema of the model, where
the field called "has_bluetooth" is defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">has_bluetooth</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_bluetooth"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has bluetooth."</span><span class="p">)</span>
</code></pre></div>
<p>The problem is that the hypothesis package generated the value -1 for
the "has_bluetooth" field because the field's type is "int", and the
model could not process that value. This happened because we matched
the type of the field as it appears in the dataset instead of the
boolean type the field semantically represents. We can fix it easily by
defining the field like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">has_bluetooth</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_bluetooth"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has bluetooth."</span><span class="p">)</span>
</code></pre></div>
<p>Now we can try to run the tests again. The results came back like this:</p>
<div class="highlight"><pre><span></span><code><span class="o">=========================</span> Hypothesis <span class="nv">Statistics</span><span class="o">===========================</span>
tests/property_based_tests.py::ModelPropertyBasedTests::test_model_input:
- during reuse phase <span class="o">(</span><span class="m">0</span>.29 seconds<span class="o">)</span>:
- Typical runtimes: ~ 287ms, ~ <span class="m">0</span>% <span class="k">in</span> data generation
- <span class="m">0</span> passing examples, <span class="m">1</span> failing examples, <span class="m">0</span> invalid examples
- Found <span class="m">1</span> failing example <span class="k">in</span> this phase
- during shrink phase <span class="o">(</span><span class="m">0</span>.01 seconds<span class="o">)</span>:
- Typical runtimes: ~ 6ms, ~ <span class="m">8</span>% <span class="k">in</span> data generation
- <span class="m">0</span> passing examples, <span class="m">1</span> failing examples, <span class="m">0</span> invalid examples
- Tried <span class="m">1</span> shrinks of which <span class="m">0</span> were successful
- Stopped because nothing left to <span class="k">do</span>
<span class="o">=======================</span> short <span class="nb">test</span> summary <span class="nv">info</span><span class="o">==============================</span>
FAILED
tests/property_based_tests.py::ModelPropertyBasedTests::test_model_input
- ValueError: Value: None cannot be mapped to a boolean value.
<span class="o">==========================</span> <span class="m">1</span> failed <span class="k">in</span> <span class="m">1</span>.56s<span class="o">==================================</span>
</code></pre></div>
<p>The hypothesis test failed again, this time during the reuse phase,
in which hypothesis replays previously saved failing examples. The
error raised is: "ValueError: Value: None cannot be mapped to a
boolean value." in the
mobile_handset_price_model/prediction/transformers.py file. The problem
again lies in the input schema of the model, in the field called
"has_dual_sim", which is defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">has_dual_sim</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">bool</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_dual_sim"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has dual SIM slots."</span><span class="p">)</span>
</code></pre></div>
<p>The problem is that the model cannot impute a value for the
boolean inputs in the same way that it can for the numerical inputs.
This kind of error arises when we forget which fields the model is able
to impute and mark a field that must be provided as optional. We'll fix
the issue by making the "has_dual_sim" input field a required field:</p>
<div class="highlight"><pre><span></span><code><span class="n">has_dual_sim</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="o">...</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"has_dual_sim"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether the phone has dual SIM slots."</span><span class="p">)</span>
</code></pre></div>
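<p>To make the distinction concrete, here is a minimal sketch of why a numerical input can be imputed while a boolean input cannot. The function names are hypothetical and are not taken from the actual transformer code:</p>

```python
def impute_numeric(value, default=1500.0):
    # A missing numerical value can fall back to a statistic learned
    # from the training set (here, a made-up default value).
    return default if value is None else float(value)


def to_bool(value):
    # A missing boolean has no sensible middle ground to impute,
    # so the transformer can only reject it.
    if value is None:
        raise ValueError("Value: None cannot be mapped to a boolean value.")
    return bool(value)
```

<p>Any schema that marks a boolean field as Optional is therefore guaranteed to fail as soon as hypothesis generates a None for it.</p>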
<p>We ran the tests one last time and got back this result:</p>
<div class="highlight"><pre><span></span><code><span class="o">==============================</span> Hypothesis <span class="nv">Statistics</span> <span class="o">==========================</span>
tests/property_based_tests.py::ModelPropertyBasedTests::test_model_input:
- during generate phase <span class="o">(</span><span class="m">0</span>.52 seconds<span class="o">)</span>:
- Typical runtimes: <span class="m">6</span>-8 ms, ~ <span class="m">9</span>% <span class="k">in</span> data generation
- <span class="m">64</span> passing examples, <span class="m">0</span> failing examples, <span class="m">0</span> invalid examples
- Stopped because nothing left to <span class="k">do</span>
<span class="o">===============================</span> <span class="m">1</span> passed <span class="k">in</span> <span class="m">1</span>.41s <span class="o">==========================</span>
</code></pre></div>
<p>None of the samples generated by the hypothesis package were able to
raise an exception in the model's prediction class.</p>
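<p>The property these tests enforce is simple to state: for every input that the schema accepts, the model's prediction code must not raise an exception. Stripped of the hypothesis machinery, the check amounts to something like this sketch, which uses a toy predict function rather than the real model:</p>

```python
def check_no_exceptions(predict, samples):
    # Collect the samples (and their errors) for which prediction raised.
    failures = []
    for sample in samples:
        try:
            predict(sample)
        except Exception as error:
            failures.append((sample, error))
    return failures


def toy_predict(sample):
    # Stand-in for the model: fails exactly when a boolean field is missing.
    if sample.get("has_dual_sim") is None:
        raise ValueError("Value: None cannot be mapped to a boolean value.")
    return "three"


failures = check_no_exceptions(
    toy_predict, [{"has_dual_sim": True}, {"has_dual_sim": None}])
```

<p>Hypothesis adds the hard parts on top of this loop: generating samples from the schema and shrinking any failing sample down to a minimal counterexample.</p>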
<h1>Creating a RESTful Model Service</h1>
<p>Creating a RESTful model service is very simple because we'll be
leveraging the <a href="https://pypi.org/project/rest-model-service/">rest_model_service
package</a>. The
package works through a configuration file that points at the model
classes of the ML model that we would like to host in the service. If
you'd like to learn more about the rest_model_service package, here is
a <a href="https://www.tekhnoal.com/rest-model-service.html">blog post</a>
about it.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code>pip install rest_model_service
</code></pre></div>
<p>To create a service for our model, all we need to do is add a
YAML configuration file to the project. The configuration file looks
like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Mobile Handset Price Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">mobile_handset_price_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">mobile_handset_price_model.prediction.model.MobileHandsetPriceModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>The configuration file can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/rest_config.yaml">here</a>.</p>
<p>The configuration file sets up the service_title, which is the title
that will be shown in the documentation of the service. The models array
allows us to host any number of models within the service. The only
model we'll host today is the mobile_handset_price_model; its class_path
points at the location of the model class in the Python environment. The
create_endpoint setting is set to true, which means that the service will
create an endpoint for the model.</p>
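<p>The class_path setting implies that the service loads the model class dynamically from its dotted path. As an illustration of how such a lookup can work (this is a sketch, not the actual rest_model_service code), the standard library's importlib is enough:</p>

```python
import importlib


def load_class(class_path):
    # Split "package.module.ClassName" into a module path and a class
    # name, import the module, and return the class object.
    module_path, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Any importable class can be resolved this way, for example:
ordered_dict_class = load_class("collections.OrderedDict")
```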
<p>Now that we have the configuration set up, we can automatically generate
an OpenAPI specification file for the service, with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
generate_openapi --output_file<span class="o">=</span>service_contract.yaml
</code></pre></div>
<p>The OpenAPI spec file can be found
<a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/service_contract.yaml">here</a>.
We can render the documentation using the <a href="https://editor.swagger.io/">Swagger
Editor</a>, which looks like this:</p>
<p><img alt="Swagger Editor" src="https://www.tekhnoal.com/swagger_editor.png" width="100%"></p>
<p>The service contract is set up, so now we can run the service locally,
with these commands:</p>
<div class="highlight"><pre><span></span><code>uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should come up and can be accessed in a web browser at
http://127.0.0.1:8000. When you access that URL you will be redirected
to the documentation page that is generated by the FastAPI package:</p>
<p><img alt="API Documentation" src="https://www.tekhnoal.com/api_documentation.png" width="100%"></p>
<p>The service is running locally, so now we can try out a request against
the model's endpoint:</p>
<div class="highlight"><pre><span></span><code>curl -X <span class="s1">'POST'</span>
<span class="s1">'http://127.0.0.1:8000/api/models/mobile_handset_price_model/prediction'</span>
-H <span class="s1">'accept: application/json'</span>
-H <span class="s1">'Content-Type: application/json'</span>
-d <span class="s1">'{</span>
<span class="s1">"battery_power": 2000,</span>
<span class="s1">"has_bluetooth": true,</span>
<span class="s1">"clock_speed": 3,</span>
<span class="s1">"has_dual_sim": true,</span>
<span class="s1">"front_camera_megapixels": 20,</span>
<span class="s1">"has_four_g": true,</span>
<span class="s1">"internal_memory": 664,</span>
<span class="s1">"depth": 1,</span>
<span class="s1">"weight": 200,</span>
<span class="s1">"number_of_cores": 8,</span>
<span class="s1">"primary_camera_megapixels": 20,</span>
<span class="s1">"pixel_resolution_height": 1960,</span>
<span class="s1">"pixel_resolution_width": 1998,</span>
<span class="s1">"ram": 3998,</span>
<span class="s1">"screen_height": 19,</span>
<span class="s1">"screen_width": 18,</span>
<span class="s1">"talk_time": 20,</span>
<span class="s1">"has_three_g": true,</span>
<span class="s1">"has_touch_screen": true,</span>
<span class="s1">"has_wifi": true</span>
<span class="s1">}'</span>
</code></pre></div>
<p>The service responds with this result:</p>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="nt">"price_range"</span><span class="p">:</span><span class="s2">"three"</span><span class="p">}</span><span class="w"></span>
</code></pre></div>
<p>By using the rest_model_service package we've just set up a RESTful API
service that is hosting our model. We can now move on to do
property-based testing on the model through the service.</p>
<h1>Adding Property-Based API Tests</h1>
<p>The <a href="https://schemathesis.readthedocs.io/en/stable/#">schemathesis
package</a> allows
us to use the hypothesis package against REST API services, doing all of
the things that the hypothesis package can do. The schemathesis package
uses the OpenAPI specification of the service to introspect the service
contract and generate test cases.</p>
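<p>The core idea can be illustrated with a toy generator that reads a tiny subset of JSON Schema and produces conforming samples. This is only a simplified sketch of the concept, not schemathesis's actual implementation:</p>

```python
import random


def sample_from_schema(schema, rng=None):
    # Generate a random value conforming to a small subset of JSON Schema.
    rng = rng or random.Random()
    schema_type = schema.get("type")
    if schema_type == "object":
        return {name: sample_from_schema(prop, rng)
                for name, prop in schema.get("properties", {}).items()}
    if schema_type == "integer":
        return rng.randint(schema.get("minimum", 0), schema.get("maximum", 100))
    if schema_type == "boolean":
        return rng.choice([True, False])
    raise ValueError(f"unsupported type: {schema_type}")


# A fragment resembling the model's input schema:
input_schema = {
    "type": "object",
    "properties": {
        "battery_power": {"type": "integer", "minimum": 500, "maximum": 2000},
        "has_dual_sim": {"type": "boolean"},
    },
}
sample = sample_from_schema(input_schema)
```

<p>Each generated sample can then be sent to the prediction endpoint; schemathesis adds negative data generation, shrinking, and response validation on top of this basic idea.</p>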
<p>There are two ways for schemathesis to execute the tests: by sending
requests to the service as it runs in its own process or by sending
request objects to the ASGI application object as it lives in the memory
of a process. The second way is very fast because it does not require
that we send requests over the network, so we'll execute the tests that
way.</p>
<p>To begin, we'll import the ASGI application object from the
rest_model_service package:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">rest_model_service.main</span> <span class="kn">import</span> <span class="n">app</span>
</code></pre></div>
<p>Next, we'll ask schemathesis to extract the schema from the
application object:</p>
<div class="highlight"><pre><span></span><code><span class="n">schema</span> <span class="o">=</span> <span class="n">schemathesis</span><span class="o">.</span><span class="n">from_asgi</span><span class="p">(</span><span class="s2">"/openapi.json"</span><span class="p">,</span> <span class="n">app</span><span class="p">,</span> <span class="n">data_generation_methods</span><span class="o">=</span><span class="p">[</span><span class="n">DataGenerationMethod</span><span class="o">.</span><span class="n">negative</span><span class="p">])</span>
</code></pre></div>
<p>Next, we'll generate two strategies from the schema, one strategy per
endpoint defined in the application:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_metadata_strategy</span> <span class="o">=</span> <span class="n">schema</span><span class="p">[</span><span class="s2">"/api/models"</span><span class="p">][</span><span class="s2">"GET"</span><span class="p">]</span><span class="o">.</span><span class="n">as_strategy</span><span class="p">()</span>
<span class="n">model_prediction_strategy</span> <span class="o">=</span> <span class="n">schema</span><span class="p">[</span><span class="s2">"/api/models/mobile_handset_price_model/prediction"</span><span class="p">][</span><span class="s2">"POST"</span><span class="p">]</span><span class="o">.</span><span class="n">as_strategy</span><span class="p">()</span>
</code></pre></div>
<p>Now we're ready to start writing the test class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">APITests</span><span class="p">(</span><span class="n">TestCase</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">setUp</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">counter</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">def</span> <span class="nf">tearDown</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Generated and tested </span><span class="si">{}</span><span class="s2"> examples."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">counter</span><span class="p">))</span>
</code></pre></div>
<p>The test class keeps track of the number of test cases executed through
a counter that is initialized in the setUp method and printed in the
tearDown method.</p>
<p>The test case for the metadata endpoint is very simple:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@given</span><span class="p">(</span><span class="n">case</span><span class="o">=</span><span class="n">model_metadata_strategy</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">test_model_metadata_endpoint</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">case</span><span class="p">):</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">case</span><span class="o">.</span><span class="n">call_asgi</span><span class="p">()</span>
<span class="k">case</span><span class="o">.</span><span class="n">validate_response</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">counter</span> <span class="o">+=</span> <span class="mi">1</span>
</code></pre></div>
<p>The @given decorator uses the model_metadata_strategy to generate test
cases for the endpoint. This is a very simple endpoint that does not
accept input and provides a static output that contains metadata about
the model being hosted in the service.</p>
<p>The next test is much more interesting:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@given</span><span class="p">(</span><span class="n">case</span><span class="o">=</span><span class="n">model_prediction_strategy</span><span class="p">)</span>
<span class="nd">@settings</span><span class="p">(</span><span class="n">max_examples</span><span class="o">=</span><span class="mi">1000</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">test_model_prediction_endpoint</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">case</span><span class="p">):</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">case</span><span class="o">.</span><span class="n">call_asgi</span><span class="p">()</span>
<span class="k">case</span><span class="o">.</span><span class="n">validate_response</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">counter</span> <span class="o">+=</span> <span class="mi">1</span>
</code></pre></div>
<p>The model_prediction_strategy generates test cases for the model's
prediction endpoint. The @settings decorator raises the maximum number of
generated examples to 1000. The case.validate_response() method looks
for unexpected responses from the service endpoint.</p>
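<p>Conceptually, validating a response means checking it against the documented contract: the status code must be one that the OpenAPI specification declares, and the body must contain the fields the response schema requires. A simplified, hypothetical version of such a check might look like this:</p>

```python
def validate_response(status_code, body, documented):
    # 'documented' is a hypothetical summary of the endpoint contract:
    # the declared status codes and the required response body fields.
    if status_code not in documented["status_codes"]:
        raise AssertionError(f"Undocumented status code: {status_code}")
    for field in documented["required_fields"]:
        if field not in body:
            raise AssertionError(f"Missing response field: {field}")


contract = {"status_codes": {200, 422}, "required_fields": ["price_range"]}
validate_response(200, {"price_range": "three"}, contract)  # passes silently
```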
<p>We executed the API tests with this command:</p>
<div class="highlight"><pre><span></span><code>py.test ./tests/api_tests.py
</code></pre></div>
<p>The command provided this output:</p>
<div class="highlight"><pre><span></span><code><span class="o">==========================</span> <span class="nv">test</span> <span class="nv">session</span> <span class="nv">starts</span> <span class="o">============================</span>
<span class="nv">platform</span> <span class="nv">darwin</span> <span class="o">--</span> <span class="nv">Python</span> <span class="mi">3</span>.<span class="mi">8</span>.<span class="mi">10</span>, <span class="nv">pytest</span><span class="o">-</span><span class="mi">6</span>.<span class="mi">2</span>.<span class="mi">4</span>, <span class="nv">py</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">10</span>.<span class="mi">0</span>,
<span class="nv">pluggy</span><span class="o">-</span><span class="mi">0</span>.<span class="mi">13</span>.<span class="mi">1</span>
<span class="nv">rootdir</span>: <span class="o">/</span><span class="nv">Users</span><span class="o">/</span><span class="nv">brian</span><span class="o">/</span><span class="nv">Code</span><span class="o">/</span><span class="nv">property</span><span class="o">-</span><span class="nv">based</span><span class="o">-</span><span class="nv">testing</span><span class="o">-</span><span class="k">for</span><span class="o">-</span><span class="nv">ml</span><span class="o">-</span><span class="nv">models</span>
<span class="nv">plugins</span>: <span class="nv">pylama</span><span class="o">-</span><span class="mi">7</span>.<span class="mi">7</span>.<span class="mi">1</span>, <span class="nv">hypothesis</span><span class="o">-</span><span class="mi">6</span>.<span class="mi">14</span>.<span class="mi">5</span>, <span class="nv">subtests</span><span class="o">-</span><span class="mi">0</span>.<span class="mi">5</span>.<span class="mi">0</span>,
<span class="nv">schemathesis</span><span class="o">-</span><span class="mi">3</span>.<span class="mi">9</span>.<span class="mi">7</span>, <span class="nv">anyio</span><span class="o">-</span><span class="mi">3</span>.<span class="mi">3</span>.<span class="mi">0</span>, <span class="nv">html</span><span class="o">-</span><span class="mi">3</span>.<span class="mi">1</span>.<span class="mi">1</span>, <span class="nv">metadata</span><span class="o">-</span><span class="mi">1</span>.<span class="mi">11</span>.<span class="mi">0</span>
<span class="nv">collected</span> <span class="mi">2</span> <span class="nv">items</span>
<span class="nv">tests</span><span class="o">/</span><span class="nv">api_tests</span>.<span class="nv">py</span> .. [<span class="mi">100</span><span class="o">%</span>]
<span class="o">=======================</span> <span class="mi">2</span> <span class="nv">passed</span> <span class="nv">in</span> <span class="mi">62</span>.<span class="mi">51</span><span class="nv">s</span> <span class="ss">(</span><span class="mi">0</span>:<span class="mi">01</span>:<span class="mi">02</span><span class="ss">)</span> <span class="o">===========================</span>
</code></pre></div>
<p>The code for the property-based API tests is in <a href="https://github.com/schmidtbri/property-based-testing-for-ml-models/blob/main/tests/api_tests.py">this
file</a>.</p>
<p>By executing property-based tests against the model service, we're able
to test the model deployment more thoroughly, because the tests exercise
the service code along with the model code. Although the service code
is very simple and lightweight, including it turns the hypothesis tests
into full integration tests that cover the entire service along with the
model.</p>
<h1>Conclusion</h1>
<p>Using property-based tests we were able to find two common errors that
can come up when deploying machine learning models. A mismatch between
the model's schema and the data that it is actually able to process can
cause many issues that are hard to debug. By using this type of
generative testing, we were able to find both errors that we introduced
to the schema pretty easily.</p>
<p>In this blog post we also saw the benefits of using a package like
pydantic for creating the input and output schemas for an ML model. By
stating the schemas as code, we're able to clearly show what data is
allowed as input and what data is returned by the model. The model's
designer does not have to write documentation to explain the input and
output data because it is already built into the input and output schema
classes. If we didn't have the model's schemas as pydantic classes, the
hypothesis and schemathesis packages would not even be able to generate
test cases for the model and model service.</p>Training and Deploying an ML Model2021-07-15T08:26:00-05:002021-07-15T08:26:00-05:00Brian Schmidttag:www.tekhnoal.com,2021-07-15:/regression-model.html<p>This post is a collection of several different techniques that I wanted to learn. In this blog post I'll be using open source python packages to do automated data exploration, automated feature engineering, automated machine learning, and model validation. I'll also be using docker and kubernetes to deploy the model. I'll cover the entire codebase of the model, from the initial data exploration to the deployment of the model behind a RESTful API in Kubernetes.</p><h1>Introduction</h1>
<p>This post is a collection of several different techniques that I wanted
to learn. In this blog post I'll be using open source python packages to
do automated data exploration, automated feature engineering, automated
machine learning, and model validation. I'll also be using docker and
kubernetes to deploy the model. I'll cover the entire codebase of the
model, from the initial data exploration to the deployment of the model
behind a RESTful API in Kubernetes.</p>
<p>Automated feature engineering is a technique that is used to automate
the creation of features from a dataset without having to manually
design them and write the code to create the features. Feature
engineering is very important for being able to create ML models that
work well on a dataset, but it takes a lot of time and effort. Automated
feature engineering is able to generate many candidate features from a
given dataset, from which we can then select the useful ones. In this
blog post, I'll be using the <a href="https://www.featuretools.com/">featuretools library</a>,
which automatically generates candidate features from a dataset using a
technique called deep feature synthesis.</p>
<p>Automated machine learning is a process through which we can create
machine learning models without having to explore many different model
types and hyperparameters. AutoML can automate the process of choosing
the best solution for a dataset, going from a raw dataset to a trained
model. AutoML tools allow non-experts to be able to create ML models
without having to understand everything that is happening under the
hood. All that is needed is a properly processed data set and anyone can
generate a model from the data. In this blog post, I'll be using the
<a href="https://epistasislab.github.io/tpot/">TPOT library</a>, which helps
to do feature preprocessing, feature selection, model selection, and
hyperparameter search.</p>
<p>In this blog post, I'll also show how to create a RESTful service for
the model that will allow us to deploy the model quickly and simply.
We'll also show how to deploy the model service using docker and
Kubernetes. This blog post covers a lot of different tools and
techniques for building and deploying ML models, and it is not meant to
be a deep dive into any of the individual techniques. I just wanted to
show how to take a model all the way from data exploration, to training,
and finally to deployment.</p>
<h1>Package Structure</h1>
<p>The package we'll develop in this blog post has this structure:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">insurance_charges_model</span>
<span class="o">-</span> <span class="nv">model_files</span> <span class="ss">(</span><span class="nv">output</span> <span class="nv">files</span> <span class="nv">from</span> <span class="nv">model</span> <span class="nv">training</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">prediction</span> <span class="ss">(</span><span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">prediction</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">model</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">prediction</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">schemas</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">input</span> <span class="nv">and</span> <span class="nv">output</span> <span class="nv">schemas</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">transformers</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">transformers</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">training</span> <span class="ss">(</span><span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">training</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">data_exploration</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">exploration</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">data_preparation</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">preparation</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_training</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">training</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_validation</span>.<span class="nv">ipynb</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">validation</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">kubernetes</span> <span class="ss">(</span><span class="nv">kubernetes</span> <span class="nv">manifests</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">deployment</span>.<span class="nv">yml</span>
<span class="o">-</span> <span class="nv">namespace</span>.<span class="nv">yml</span>
<span class="o">-</span> <span class="nv">service</span>.<span class="nv">yml</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">unit</span> <span class="nv">tests</span> <span class="k">for</span> <span class="nv">model</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Dockerfile</span> <span class="ss">(</span><span class="nv">instructions</span> <span class="k">for</span> <span class="nv">generating</span> <span class="nv">a</span> <span class="nv">docker</span> <span class="nv">image</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span> <span class="ss">(</span><span class="nv">list</span> <span class="nv">of</span> <span class="nv">dependencies</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">rest_config</span>.<span class="nv">yaml</span> <span class="ss">(</span><span class="nv">configuration</span> <span class="k">for</span> <span class="nv">REST</span> <span class="nv">model</span> <span class="nv">service</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">service_contract</span>.<span class="nv">yaml</span> <span class="ss">(</span><span class="nv">OpenAPI</span> <span class="nv">service</span> <span class="nv">contract</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span> <span class="ss">(</span><span class="nv">test</span> <span class="nv">dependencies</span><span class="ss">)</span>
</code></pre></div>
<p>All of the code is available in a <a href="https://github.com/schmidtbri/regression-model">github repository.</a></p>
<h2>Getting the Data</h2>
<p>In order to train a regression model, we first need to have a dataset.
We went to Kaggle and found <a href="https://www.kaggle.com/mirichoi0218/insurance">a dataset</a> that
contained insurance charges information. To make it easy to download the
data, we installed the <a href="https://pypi.org/project/kaggle/">kaggle python package</a>. Then we executed these
commands to download the data and unzip it into the data folder in the
project:</p>
<div class="highlight"><pre><span></span><code>mkdir -p data
kaggle datasets download -d mirichoi0218/insurance -p ./data --unzip
</code></pre></div>
<p>To make it even easier to download the data, we added a Makefile target
for the commands:</p>
<div class="highlight"><pre><span></span><code><span class="nf">download-dataset</span><span class="o">:</span> <span class="c">## download dataset from Kaggle</span>
mkdir -p data
kaggle datasets download -d mirichoi0218/insurance -p ./data --unzip
</code></pre></div>
<p>Now all we need to do is execute this command:</p>
<div class="highlight"><pre><span></span><code>make download-dataset
</code></pre></div>
<p>Instead of having to remember how to get the data needed to do modeling,
I always try to create a repeatable and documented process for creating
the dataset. We also make sure to never store the dataset in source
control, so we'll add this line to the .gitignore file:</p>
<div class="highlight"><pre><span></span><code>data/
</code></pre></div>
<h1>Training a Regression Model</h1>
<p>Now that we have the dataset, we'll start working on training a
regression model. We'll be doing data exploration, data preparation,
feature engineering, automated model training and selection, and model
validation.</p>
<h2>Exploring the Data</h2>
<p>Data exploration is a key step that can tell us a lot about the dataset
that we have to model. Data exploration can be highly customized to the
specific dataset, but there are also tools that allow us to calculate
the most common things we want to learn about a dataset automatically.
<a href="https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/">pandas_profiling</a>
is a package that accepts a pandas data frame and creates an HTML report
with a profile of the dataset in the data frame. According to the
pandas_profiling documentation it has these capabilities:</p>
<ul>
<li>Type inference: detect the types of columns in a dataframe.</li>
<li>Essentials: type, unique values, missing values</li>
<li>Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range</li>
<li>Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness</li>
<li>Most frequent values</li>
<li>Histograms</li>
<li>Correlations highlighting of highly correlated variables, Spearman, Pearson and Kendall matrices</li>
<li>Missing values matrix, count, heatmap and dendrogram of missing values</li>
<li>Duplicate rows Lists the most occurring duplicate rows</li>
<li>Text analysis learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data</li>
</ul>
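<p>Several of these statistics are simple to compute by hand for a single column. For example, the quartiles and interquartile range, shown here with made-up sample values rather than the insurance dataset:</p>

```python
import statistics

# A made-up "age" column, not values from the insurance dataset.
ages = [18, 23, 25, 28, 31, 33, 37, 46, 52, 60]

# quantiles() with n=4 returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(ages, n=4)
interquartile_range = q3 - q1
```

<p>pandas_profiling computes these statistics, and the rest of the list above, for every column at once, which is what makes the automated report so convenient.</p>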
<p>These are the things that we would be looking into to learn more about
the data set. To use the pandas_profiling package, we'll first load the
dataset into a pandas dataframe:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
<span class="kn">from</span> <span class="nn">pandas_profiling</span> <span class="kn">import</span> <span class="n">ProfileReport</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/insurance.csv"</span><span class="p">)</span>
</code></pre></div>
<p>Now we can query the dataframe to find out the column types:</p>
<div class="highlight"><pre><span></span><code><span class="n">data</span><span class="o">.</span><span class="n">dtypes</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">age</span><span class="w"> </span><span class="n">int64</span><span class="w"></span>
<span class="n">sex</span><span class="w"> </span><span class="n">object</span><span class="w"></span>
<span class="n">bmi</span><span class="w"> </span><span class="n">float64</span><span class="w"></span>
<span class="n">children</span><span class="w"> </span><span class="n">int64</span><span class="w"></span>
<span class="n">smoker</span><span class="w"> </span><span class="n">object</span><span class="w"></span>
<span class="n">region</span><span class="w"> </span><span class="n">object</span><span class="w"></span>
<span class="n">charges</span><span class="w"> </span><span class="n">float64</span><span class="w"></span>
<span class="nl">dtype:</span><span class="w"> </span><span class="n">object</span><span class="w"></span>
</code></pre></div>
<p>To create the profile, we'll execute this code:</p>
<div class="highlight"><pre><span></span><code><span class="n">profile</span> <span class="o">=</span> <span class="n">ProfileReport</span><span class="p">(</span><span class="n">data</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s1">'Insurance Dataset Profile Report'</span><span class="p">,</span>
<span class="n">pool_size</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span>
<span class="n">html</span><span class="o">=</span><span class="p">{</span><span class="s1">'style'</span><span class="p">:</span> <span class="p">{</span><span class="s1">'full_width'</span><span class="p">:</span> <span class="kc">True</span><span class="p">}})</span>
<span class="n">profile</span><span class="o">.</span><span class="n">to_notebook_iframe</span><span class="p">()</span>
</code></pre></div>
<p>Once the report is created, we'll save it to disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">profile</span><span class="o">.</span><span class="n">to_file</span><span class="p">(</span><span class="s2">"data_exploration_report.html"</span><span class="p">)</span>
</code></pre></div>
<p>Right away the profile will tell us a few key details about the dataset:</p>
<p><img alt="Dataset Statistics" src="https://www.tekhnoal.com/1.png" width="100%"></p>
<p>The profile also contains a few warnings about the data:</p>
<p><img alt="Dataset Warnings" src="https://www.tekhnoal.com/2.png" width="100%"></p>
<p>None of these warnings is particularly surprising given what we know
about insurance charges: health insurance premiums go up with age, and
being a smoker increases them as well.</p>
<p>The profile has a description for each variable; here's the description
of the age variable:</p>
<p><img alt="Age Variable" src="https://www.tekhnoal.com/3.png" width="100%"></p>
<p>As well as interactions between variables:</p>
<p><img alt="Dataset Interactions" src="https://www.tekhnoal.com/4.png" width="100%"></p>
<p>And finally the correlations between the variables:</p>
<p><img alt="Dataset Correlations" src="https://www.tekhnoal.com/5.png" width="100%"></p>
<p>By using the pandas_profiling package we can avoid writing the most
common data analysis code that we write for all datasets. All of the
code for data exploration is in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/training/1.%20data_exploration.ipynb">data_exploration.ipynb</a>
notebook.</p>
<h2>Preparing the Data</h2>
<p>In order to model the dataset, we'll first need to prepare and
preprocess the data. To start, let's load the dataset into a dataframe
again:</p>
<div class="highlight"><pre><span></span><code><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/insurance.csv"</span><span class="p">)</span>
</code></pre></div>
<p>To do data preparation, we'll use the <a href="https://www.featuretools.com/">featuretools package</a>
to create new features from the data already in the dataset. To create features, we'll need
to tell the featuretools package about our data by identifying entities
in the data:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">featuretools</span> <span class="k">as</span> <span class="nn">ft</span>
<span class="n">entityset</span> <span class="o">=</span> <span class="n">ft</span><span class="o">.</span><span class="n">EntitySet</span><span class="p">(</span><span class="nb">id</span><span class="o">=</span><span class="s2">"Transactions"</span><span class="p">)</span>
<span class="n">entityset</span> <span class="o">=</span> <span class="n">entityset</span><span class="o">.</span><span class="n">entity_from_dataframe</span><span class="p">(</span><span class="n">entity_id</span><span class="o">=</span><span class="s2">"Transactions"</span><span class="p">,</span>
<span class="n">dataframe</span><span class="o">=</span><span class="n">df</span><span class="p">,</span>
<span class="n">make_index</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">index</span><span class="o">=</span><span class="s2">"index"</span><span class="p">)</span>
</code></pre></div>
<p>In the code above, we created an EntitySet with the id "Transactions",
which is the entity in the dataframe. The featuretools package
identified the variables associated with the Transactions entity:</p>
<div class="highlight"><pre><span></span><code><span class="n">entityset</span><span class="p">[</span><span class="s2">"Transactions"</span><span class="p">]</span><span class="o">.</span><span class="n">variables</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">[</span><span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">index</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">age</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">numeric</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">sex</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">categorical</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">bmi</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">numeric</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">children</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">numeric</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">smoker</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">categorical</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">region</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">categorical</span><span class="p">)</span><span class="o">></span><span class="p">,</span><span class="w"></span>
<span class="o"><</span><span class="nl">Variable:</span><span class="w"> </span><span class="n">charges</span><span class="w"> </span><span class="p">(</span><span class="n">dtype</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">numeric</span><span class="p">)</span><span class="o">></span><span class="p">]</span><span class="w"></span>
</code></pre></div>
<p>We can now generate some new features on the entity:</p>
<div class="highlight"><pre><span></span><code><span class="n">feature_dataframe</span><span class="p">,</span> <span class="n">features</span> <span class="o">=</span> <span class="n">ft</span><span class="o">.</span><span class="n">dfs</span><span class="p">(</span><span class="n">entityset</span><span class="o">=</span><span class="n">entityset</span><span class="p">,</span>
<span class="n">target_entity</span><span class="o">=</span><span class="s2">"Transactions"</span><span class="p">,</span>
<span class="n">trans_primitives</span><span class="o">=</span><span class="p">[</span><span class="s2">"add_numeric"</span><span class="p">,</span> <span class="s2">"subtract_numeric"</span><span class="p">,</span>
<span class="s2">"multiply_numeric"</span><span class="p">,</span> <span class="s2">"divide_numeric"</span><span class="p">,</span>
<span class="s2">"greater_than"</span><span class="p">,</span> <span class="s2">"less_than"</span><span class="p">],</span>
<span class="n">ignore_variables</span><span class="o">=</span><span class="p">{</span><span class="s2">"Transactions"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">,</span>
<span class="s2">"charges"</span><span class="p">]})</span>
</code></pre></div>
<p>The featuretools package uses a set of primitive operations to generate
new features from the data. In this case, the "add_numeric"
primitive generates a new feature by adding the values in each pair
of numeric variables. By combining the numerical variables in this way,
we'll generate three new columns:</p>
<ul>
<li>age + bmi</li>
<li>age + children</li>
<li>bmi + children</li>
</ul>
<p>The subtract_numeric, multiply_numeric, and divide_numeric primitives
also create new columns in a similar way, by applying subtraction,
multiplication, and division respectively. The greater_than and
less_than primitives create new boolean columns by comparing the values
in all pairs of numerical variables. The greater_than primitive
generated these new features:</p>
<ul>
<li>age > bmi</li>
<li>age > children</li>
<li>bmi > age</li>
<li>bmi > children</li>
<li>children > age</li>
<li>children > bmi</li>
</ul>
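<p>The feature counts follow from simple combinatorics: a symmetric primitive produces one feature per unordered pair of numeric columns, while an asymmetric primitive produces one per ordered pair. A quick sketch in plain Python (independent of featuretools):</p>

```python
from itertools import combinations, permutations

numeric_columns = ["age", "bmi", "children"]

# Symmetric primitives such as add_numeric act on unordered pairs.
unordered_pairs = list(combinations(numeric_columns, 2))

# Asymmetric primitives such as greater_than act on ordered pairs.
ordered_pairs = list(permutations(numeric_columns, 2))

print(len(unordered_pairs))  # 3 features per symmetric primitive
print(len(ordered_pairs))    # 6 features per asymmetric primitive
```

<p>Two primitives at 3 features each plus four primitives at 6 features each accounts for the 30 new features generated here.</p>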
<p>At the end of the feature generation, we have 30 new features in the
dataset that were generated from the data already there. Before we can
use these new features, we need to figure out how to integrate the
transformer with <a href="https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html">scikit-learn
pipelines</a>,
which is what we will be using to build up our model. To accomplish this
we created a <a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/transformers.py#L56-L99">transformer</a>
which is instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">dfs_transformer</span> <span class="o">=</span> <span class="n">DFSTransformer</span><span class="p">(</span><span class="s2">"Transactions"</span><span class="p">,</span>
<span class="n">trans_primitives</span><span class="o">=</span><span class="p">[</span><span class="s2">"add_numeric"</span><span class="p">,</span> <span class="s2">"subtract_numeric"</span><span class="p">,</span>
<span class="s2">"multiply_numeric"</span><span class="p">,</span> <span class="s2">"divide_numeric"</span><span class="p">,</span>
<span class="s2">"greater_than"</span><span class="p">,</span> <span class="s2">"less_than"</span><span class="p">],</span>
<span class="n">ignore_variables</span><span class="o">=</span><span class="p">{</span><span class="s2">"Transactions"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span>
<span class="s2">"region"</span><span class="p">]})</span>
</code></pre></div>
<p>Since the feature generation sometimes creates infinite values, we'll
also need a
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/transformers.py#L102-L119">transformer</a>
to convert them to NaN values. This transformer is instantiated like
this:</p>
<div class="highlight"><pre><span></span><code><span class="n">infinity_transformer</span> <span class="o">=</span> <span class="n">InfinityToNaNTransformer</span><span class="p">()</span>
</code></pre></div>
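<p>The transformer linked above lives in the model's repository; a minimal sketch of the same idea, written as a custom scikit-learn transformer (an illustration, not the repository's exact code), could look like this:</p>

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class InfinityToNaNTransformer(BaseEstimator, TransformerMixin):
    """Replace +inf and -inf values with NaN so a downstream imputer can handle them."""

    def fit(self, X, y=None):
        # Stateless transformer, nothing to learn from the data.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return np.where(np.isinf(X), np.nan, X)
```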
<p>To handle the NaN values generated by the InfinityToNaNTransformer,
we'll use a
<a href="https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html">SimpleImputer</a>
from the scikit-learn library. It is instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">simple_imputer</span> <span class="o">=</span> <span class="n">SimpleImputer</span><span class="p">(</span><span class="n">missing_values</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">strategy</span><span class="o">=</span><span class="s1">'mean'</span><span class="p">)</span>
</code></pre></div>
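<p>A quick demonstration of the mean strategy on a small made-up matrix:</p>

```python
import numpy as np
from sklearn.impute import SimpleImputer

# A small made-up matrix with one missing value per column.
X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 8.0]])

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
result = imputer.fit_transform(X)

# Each NaN is replaced with its column mean: 2.0 in the first column, 6.0 in the second.
print(result)
```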
<p>The SimpleImputer transformer has problems imputing values that are
not floats when using the 'mean' strategy. To fix this, we'll create
a
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/transformers.py#L36-L53">transformer</a>
that will convert all integer columns into floating-point columns:</p>
<div class="highlight"><pre><span></span><code><span class="n">int_to_float_transformer</span> <span class="o">=</span> <span class="n">IntToFloatTransformer</span><span class="p">()</span>
</code></pre></div>
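<p>Again, the linked transformer is in the repository; a minimal sketch of the idea (illustrative only) is to cast every integer column of the incoming dataframe to float:</p>

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class IntToFloatTransformer(BaseEstimator, TransformerMixin):
    """Cast integer columns to float so mean imputation can yield fractional values."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = pd.DataFrame(X).copy()
        for column in X.columns:
            if pd.api.types.is_integer_dtype(X[column]):
                X[column] = X[column].astype(float)
        return X
```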
<p>Lastly, we'll put the DFSTransformer, IntToFloatTransformer,
InfinityToNaNTransformer, and SimpleImputer transformers into a Pipeline
so they'll all work together as a unit:</p>
<div class="highlight"><pre><span></span><code><span class="n">dfs_pipeline</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">([</span>
<span class="p">(</span><span class="s2">"dfs_transformer"</span><span class="p">,</span> <span class="n">dfs_transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"int_to_float_transformer"</span><span class="p">,</span> <span class="n">int_to_float_transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"infinity_transformer"</span><span class="p">,</span> <span class="n">infinity_transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"simple_imputer"</span><span class="p">,</span> <span class="n">simple_imputer</span><span class="p">),</span>
<span class="p">])</span>
</code></pre></div>
<p>Next, we'll deal with the boolean features in the dataset. To do this,
we created a
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/transformers.py#L7-L33">transformer</a>
that converts string values into the corresponding true or false values.
It's instantiated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">boolean_transformer</span> <span class="o">=</span> <span class="n">BooleanTransformer</span><span class="p">(</span><span class="n">true_value</span><span class="o">=</span><span class="s2">"yes"</span><span class="p">,</span> <span class="n">false_value</span><span class="o">=</span><span class="s2">"no"</span><span class="p">)</span>
</code></pre></div>
<p>This transformer will be used to convert the "smoker" variable into a
boolean value. The values found in the dataset are "yes" and "no"; the
transformer is configured to convert "yes" to True and "no" to False.</p>
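<p>A sketch of what such a transformer might look like (the repository's implementation may differ in its details):</p>

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class BooleanTransformer(BaseEstimator, TransformerMixin):
    """Map a configurable pair of string values to True and False."""

    def __init__(self, true_value="yes", false_value="no"):
        self.true_value = true_value
        self.false_value = false_value

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        mapping = {self.true_value: True, self.false_value: False}
        return pd.DataFrame(X).replace(mapping)
```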
<p>Next, we'll create an encoder for the categorical features, 'sex'
and 'region'. We'll use the
<a href="https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OrdinalEncoder.html">OrdinalEncoder</a>
from the scikit-learn library:</p>
<div class="highlight"><pre><span></span><code><span class="n">ordinal_encoder</span> <span class="o">=</span> <span class="n">OrdinalEncoder</span><span class="p">()</span>
</code></pre></div>
<p>Now we can create a
<a href="https://scikit-learn.org/stable/modules/generated/sklearn.compose.ColumnTransformer.html">ColumnTransformer</a>
that combines all of the pipelines and transformers we created above
into one bigger pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="n">column_transformer</span> <span class="o">=</span> <span class="n">ColumnTransformer</span><span class="p">(</span><span class="n">remainder</span><span class="o">=</span><span class="s2">"passthrough"</span><span class="p">,</span>
<span class="n">transformers</span><span class="o">=</span><span class="p">[</span>
<span class="p">(</span><span class="s2">"dfs_pipeline"</span><span class="p">,</span> <span class="n">dfs_pipeline</span><span class="p">,</span> <span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span>
<span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">]),</span>
<span class="p">(</span><span class="s2">"boolean_transformer"</span><span class="p">,</span> <span class="n">boolean_transformer</span><span class="p">,</span> <span class="p">[</span><span class="s2">"smoker"</span><span class="p">]),</span>
<span class="p">(</span><span class="s2">"ordinal_encoder"</span><span class="p">,</span> <span class="n">ordinal_encoder</span><span class="p">,</span> <span class="p">[</span><span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="p">])</span>
</code></pre></div>
<p>The ColumnTransformer applies the deep feature synthesis pipeline to all
of the input variables, then it applies the boolean transformer to the
"smoker" variable, and the ordinal encoder to the "sex" and "region"
variables.</p>
<p>Now we do a small test to make sure that the transformations are
happening as expected:</p>
<div class="highlight"><pre><span></span><code><span class="n">test_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span><span class="mi">65</span><span class="p">,</span> <span class="s2">"male"</span><span class="p">,</span> <span class="mf">12.5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">"yes"</span><span class="p">,</span> <span class="s2">"southwest"</span><span class="p">],</span>
<span class="p">[</span><span class="mi">75</span><span class="p">,</span> <span class="s2">"female"</span><span class="p">,</span> <span class="mf">78.770</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="s2">"no"</span><span class="p">,</span> <span class="s2">"southeast"</span><span class="p">]],</span>
<span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">column_transformer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">test_df</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">column_transformer</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">test_df</span><span class="p">)</span>
<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">result</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o">!=</span> <span class="mi">33</span><span class="p">:</span> <span class="c1"># expecting 33 features to come out of the ColumnTransformer</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Unexpected number of columns found in the dataframe."</span><span class="p">)</span>
</code></pre></div>
<p>To test the pipeline, we created a dataframe with two rows, then we
fitted the pipeline to it and transformed the dataframe. We expect to
get 33 columns in the output dataframe because of the deep feature
synthesis, so we test for that and raise an exception if it is not the
case.</p>
<p>The column transformer can now be saved so we can use it later in the
model training process:</p>
<div class="highlight"><pre><span></span><code><span class="n">joblib</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">column_transformer</span><span class="p">,</span> <span class="s2">"transformer.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>In this section we used scikit-learn pipelines to compose a complex
series of data transformations that will be executed when the model is
trained and also when it is used for predictions. By using pipelines, we
are able to make sure that the steps always happen in the same order and
with the same parameters. If we didn't use pipelines, we would end up
writing the transformations twice: once for model training and once
for prediction. All of the code for data preparation is in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/training/2.%20data_preparation.ipynb">data_preparation.ipynb</a>
notebook.</p>
<h2>Training a Model</h2>
<p>The next step after preparing the data is to train a model. For this,
we'll use the <a href="https://epistasislab.github.io/tpot/">TPOT
package</a>, which is an
automated machine learning tool that is able to search through many
possible model types and hyperparameters and find the best pipeline for
the dataset. The package uses <a href="https://en.wikipedia.org/wiki/Genetic_programming">genetic
programming</a> to
search the space of possible ML pipelines.</p>
<p>To train the model, we'll first load the dataset:</p>
<div class="highlight"><pre><span></span><code><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/insurance.csv"</span><span class="p">)</span>
</code></pre></div>
<p>Then, we'll create a training set and a test set by randomly selecting
samples. The training/testing split will be approximately 80:20.</p>
<div class="highlight"><pre><span></span><code><span class="n">mask</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">df</span><span class="p">))</span> <span class="o"><</span> <span class="mf">0.8</span>
<span class="n">training_set</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="n">mask</span><span class="p">]</span>
<span class="n">testing_set</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="o">~</span><span class="n">mask</span><span class="p">]</span>
</code></pre></div>
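<p>The random mask yields an approximately 80:20 split. If an exact split is preferred, scikit-learn's train_test_split does the same job; here's a sketch with a stand-in dataframe (the notebook itself uses the mask approach above):</p>

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# A stand-in dataframe; in the notebook this would be the insurance dataset.
df = pd.DataFrame({"age": range(100), "charges": range(100)})

training_set, testing_set = train_test_split(df, test_size=0.2, random_state=42)
print(len(training_set), len(testing_set))  # 80 20
```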
<p>Next, we'll save both data sets to the data folder because we'll need
them later for model validation. Since we're choosing to do the
validation in another Jupyter notebook, we need to keep the data sets on
disk until then.</p>
<div class="highlight"><pre><span></span><code><span class="n">training_set</span><span class="o">.</span><span class="n">to_csv</span><span class="p">(</span><span class="s2">"../../data/training_set.csv"</span><span class="p">)</span>
<span class="n">testing_set</span><span class="o">.</span><span class="n">to_csv</span><span class="p">(</span><span class="s2">"../../data/testing_set.csv"</span><span class="p">)</span>
</code></pre></div>
<p>Now that we have a training set, we'll need to separate the feature
columns from the target column:</p>
<div class="highlight"><pre><span></span><code><span class="n">feature_columns</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">]</span>
<span class="n">target_column</span> <span class="o">=</span> <span class="s2">"charges"</span>
<span class="n">X_train</span> <span class="o">=</span> <span class="n">training_set</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_train</span> <span class="o">=</span> <span class="n">training_set</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
<span class="n">X_test</span> <span class="o">=</span> <span class="n">testing_set</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_test</span> <span class="o">=</span> <span class="n">testing_set</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
</code></pre></div>
<p>Next, we'll apply the preprocessing pipeline that we built in the data
preprocessing code. First we'll load the transformer that we saved to
disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">transformer</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">"transformer.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>Now we can apply it to the features dataframe in order to calculate the
features that we created using automated feature engineering:</p>
<div class="highlight"><pre><span></span><code><span class="n">features</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X_train</span><span class="p">)</span>
</code></pre></div>
<p>Now that we have a features dataframe that we can train a model with,
we'll launch the training by instantiating a TPOTRegressor object and
calling the fit method:</p>
<div class="highlight"><pre><span></span><code><span class="n">tpot_regressor</span> <span class="o">=</span> <span class="n">TPOTRegressor</span><span class="p">(</span><span class="n">generations</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span>
<span class="n">population_size</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span>
<span class="n">random_state</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>
<span class="n">cv</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
<span class="n">n_jobs</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span>
<span class="n">verbosity</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span>
<span class="n">early_stop</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
<span class="n">tpot_regressor</span> <span class="o">=</span> <span class="n">tpot_regressor</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">features</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<p>The TPOTRegressor uses genetic programming, so we need to provide some
parameters that define the size of the population and the number of
generations. The random_state parameter makes it easier to replicate the
training run, the cv parameter sets the number of cross-validation splits
that we want to use, and the n_jobs parameter tells TPOT how many processes
to launch to train the model. The early_stop parameter ends the search
early if the score does not improve for ten consecutive generations.</p>
<p>Here is a sample of the output of the tpot_regressor as it trains:</p>
<div class="highlight"><pre><span></span><code>Optimization Progress: 100%
2550/2550 [35:22<00:00, 1.15pipeline/s]
Generation 1 - Current best internal CV score: -19328040.90181576
Generation 2 - Current best internal CV score: -19328040.90181576
Generation 3 - Current best internal CV score: -19291161.694311526
Generation 4 - Current best internal CV score: -19216662.844604537
Generation 5 - Current best internal CV score: -19194856.36477192
...
Generation 48 - Current best internal CV score: -18848299.473418456
Generation 49 - Current best internal CV score: -18848299.473418456
Generation 50 - Current best internal CV score: -18848299.473418456
Best pipeline:
RandomForestRegressor(MaxAbsScaler(SGDRegressor(Normalizer(input_matrix,
norm=l2), alpha=0.01, eta0=1.0, fit_intercept=True, l1_ratio=0.0,
learning_rate=invscaling, loss=squared_loss, penalty=elasticnet,
power_t=0.1)), bootstrap=True, max_features=0.7500000000000001,
min_samples_leaf=16, min_samples_split=14, n_estimators=100)
</code></pre></div>
<p>It looks like the best pipeline found by TPOT includes a
RandomForestRegressor combined with several preprocessing steps. Now
that we have an optimal pipeline created by TPOT, we'll add our
own preprocessors to it. To do this we need an unfitted
pipeline object, which we don't have right now because the TPOTRegressor
pipeline has already been fitted.</p>
<p>To get an unfitted pipeline, we'll ask TPOT for the fitted pipeline and
<a href="https://scikit-learn.org/stable/modules/generated/sklearn.base.clone.html">clone</a>
it:</p>
<div class="highlight"><pre><span></span><code><span class="n">unfitted_tpot_regressor</span> <span class="o">=</span> <span class="n">clone</span><span class="p">(</span><span class="n">tpot_regressor</span><span class="o">.</span><span class="n">fitted_pipeline_</span><span class="p">)</span>
</code></pre></div>
<p>Now that we have an unfitted Pipeline identical to the one that was
found by the TPOT package, we'll add our own preprocessors to the
pipeline. This ensures that the final pipeline will accept the
features in the original dataset and will process them
correctly. We'll compose the unfitted TPOT pipeline and the transformer
Pipeline into one Pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">([(</span><span class="s2">"transformer"</span><span class="p">,</span> <span class="n">transformer</span><span class="p">),</span>
<span class="p">(</span><span class="s2">"tpot_pipeline"</span><span class="p">,</span> <span class="n">unfitted_tpot_regressor</span><span class="p">)</span>
<span class="p">])</span>
</code></pre></div>
<p>Now we can train the model on the original, unprocessed dataset:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
</code></pre></div>
<p>The final fitted pipeline contains all of the transformations that we
used to do deep feature synthesis and data preprocessing, and all of the
transformations that were added by TPOT. This is the final pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[(</span><span class="s1">'transformer'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">ColumnTransformer</span><span class="p">(</span><span class="n">remainder</span><span class="o">=</span><span class="s1">'passthrough'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">transformers</span><span class="o">=</span><span class="p">[(</span><span class="s1">'dfs_pipeline'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[(</span><span class="s1">'dfs_transformer'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">DFSTransformer</span><span class="p">(</span><span class="n">ignore_variables</span><span class="o">=</span><span class="p">{</span><span class="s1">'Transactions'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'sex'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'smoker'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'region'</span><span class="p">]},</span><span class="w"></span>
<span class="w"> </span><span class="n">target_entity</span><span class="o">=</span><span class="s1">'Transactions'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">trans_primitives</span><span class="o">=</span><span class="p">[</span><span class="s1">'add_numeric'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'subtract_numeric'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'multiply_numeric'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'divide_numeric'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'greater_than'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'less_...</span>
<span class="w"> </span><span class="n">Pipeline</span><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[(</span><span class="s1">'normalizer'</span><span class="p">,</span><span class="w"> </span><span class="n">Normalizer</span><span class="p">()),</span><span class="w"></span>
<span class="w"> </span><span class="p">(</span><span class="s1">'stackingestimator'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">StackingEstimator</span><span class="p">(</span><span class="n">estimator</span><span class="o">=</span><span class="n">SGDRegressor</span><span class="p">(</span><span class="n">alpha</span><span class="o">=</span><span class="mf">0.01</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">eta0</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">l1_ratio</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">penalty</span><span class="o">=</span><span class="s1">'elasticnet'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">power_t</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">random_state</span><span class="o">=</span><span class="mi">42</span><span class="p">))),</span><span class="w"></span>
<span class="w"> </span><span class="p">(</span><span class="s1">'maxabsscaler'</span><span class="p">,</span><span class="w"> </span><span class="n">MaxAbsScaler</span><span class="p">()),</span><span class="w"></span>
<span class="w"> </span><span class="p">(</span><span class="s1">'randomforestregressor'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">RandomForestRegressor</span><span class="p">(</span><span class="n">max_features</span><span class="o">=</span><span class="mf">0.7500000000000001</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">min_samples_leaf</span><span class="o">=</span><span class="mi">16</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">min_samples_split</span><span class="o">=</span><span class="mi">14</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">random_state</span><span class="o">=</span><span class="mi">42</span><span class="p">))]))])</span><span class="w"></span>
</code></pre></div>
<p>Finally, we'll test the model with a single sample:</p>
<div class="highlight"><pre><span></span><code><span class="n">test_df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span><span class="mi">65</span><span class="p">,</span> <span class="s2">"male"</span><span class="p">,</span> <span class="mf">12.5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">"yes"</span><span class="p">,</span> <span class="s2">"southwest"</span><span class="p">]],</span>
<span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">test_df</span><span class="p">)</span>
</code></pre></div>
<p>The result is:</p>
<div class="highlight"><pre><span></span><code>array([19326.59077456])
</code></pre></div>
<p>In order to use the model later, we'll serialize it to disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">joblib</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="s2">"model.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>All of the code for training the model is in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/training/3.%20model_training.ipynb">model_training.ipynb</a>
notebook.</p>
<h2>Validating the Model</h2>
<p>In order to validate the model generated by the autoML process, we'll
use the <a href="https://www.scikit-yb.org/en/latest/">yellow_brick
library</a>.</p>
<p>First, we'll load the training and testing sets that we previously saved
to disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">training_set</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/training_set.csv"</span><span class="p">)</span>
<span class="n">testing_set</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"../../data/testing_set.csv"</span><span class="p">)</span>
</code></pre></div>
<p>Next, we'll separate the predictor variables from the target variable:</p>
<div class="highlight"><pre><span></span><code><span class="n">feature_columns</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">]</span>
<span class="n">target_column</span> <span class="o">=</span> <span class="s2">"charges"</span>
<span class="n">X_train</span> <span class="o">=</span> <span class="n">training_set</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_train</span> <span class="o">=</span> <span class="n">training_set</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
<span class="n">X_test</span> <span class="o">=</span> <span class="n">testing_set</span><span class="p">[</span><span class="n">feature_columns</span><span class="p">]</span>
<span class="n">y_test</span> <span class="o">=</span> <span class="n">testing_set</span><span class="p">[</span><span class="n">target_column</span><span class="p">]</span>
</code></pre></div>
<p>We'll load the fitted model object that was saved in a previous step:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="s2">"model.joblib"</span><span class="p">)</span>
</code></pre></div>
<p>We can now make predictions on the test set with the fitted
pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="n">predictions</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">)</span>
</code></pre></div>
<p>The model's R² score and errors are calculated like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">r2</span> <span class="o">=</span> <span class="n">r2_score</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">predictions</span><span class="p">)</span>
<span class="n">mse</span> <span class="o">=</span> <span class="n">mean_squared_error</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">predictions</span><span class="p">)</span>
<span class="n">mae</span> <span class="o">=</span> <span class="n">mean_absolute_error</span><span class="p">(</span><span class="n">y_test</span><span class="p">,</span> <span class="n">predictions</span><span class="p">)</span>
</code></pre></div>
<p>The results are:</p>
<div class="highlight"><pre><span></span><code>r2 score: 0.827414647586443
mean squared error: 24830561.579995826
mean absolute error: 2713.6533067216383
</code></pre></div>
<p>Next, we'll create a yellow_brick visualizer for the model:</p>
<div class="highlight"><pre><span></span><code><span class="n">visualizer</span> <span class="o">=</span> <span class="n">ResidualsPlot</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p>The <a href="https://www.scikit-yb.org/en/latest/api/regressor/residuals.html">ResidualsPlot
visualizer</a>
shows us the difference between the observed value and the predicted
value of the target variable. This visualization is useful to see if
there are value ranges for the target variable that have more or less
error than other value ranges. The plot generated for our model looks
like this:</p>
<p><img alt="Residuals Plot" src="https://www.tekhnoal.com/6.png" width="100%"></p>
<p>Next, we'll generate the prediction error plot for the model using the
<a href="https://www.scikit-yb.org/en/latest/api/regressor/peplot.html">PredictionError
visualizer</a>:</p>
<div class="highlight"><pre><span></span><code><span class="n">visualizer</span> <span class="o">=</span> <span class="n">PredictionError</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">score</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span>
<span class="n">visualizer</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div>
<p>The prediction error plot shows the actual values of the target variable
against the predicted values generated by the model. This allows us to
see how much variance is in the predictions made by the model. The plot
generated for our model looks like this:</p>
<p><img alt="Prediction Error Plot" src="https://www.tekhnoal.com/7.png" width="100%"></p>
<p>All of the code for validating the model is in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/training/4.%20model_validation.ipynb">model_validation.ipynb</a>
notebook.</p>
<h1>Making Predictions with the Model</h1>
<p>The insurance charges model is now ready to make predictions, so we
need to make it available in an easy-to-use format. The
<a href="https://schmidtbri.github.io/ml-base/">ml_base package</a> defines
a simple base class for model prediction code that allows us to "wrap"
the code in a class that follows the MLModel interface. The interface
publishes the following information about the model:</p>
<ul>
<li>Qualified Name, a unique identifier for the model</li>
<li>Display Name, a friendly name for the model used in user interfaces</li>
<li>Description, a description for the model</li>
<li>Version, semantic version of the model codebase</li>
<li>Input Schema, an object that describes the model's input data</li>
<li>Output Schema, an object that describes the model's output data</li>
</ul>
<p>The MLModel interface also dictates that the model class implement two
methods:</p>
<ul>
<li>__init__, an initialization method that loads any model artifacts needed to make predictions</li>
<li>predict, a prediction method that receives model inputs, makes a prediction, and returns model outputs</li>
</ul>
<p>By using the MLModel base class we'll be able to do more interesting
things later with the model. If you'd like to learn more about the
ml_base package, there is a <a href="https://www.tekhnoal.com/introducing-ml-base-package.html">blog post</a>
about it. </p>
<p>To install the ml_base package, execute this command:</p>
<div class="highlight"><pre><span></span><code>pip install ml_base
</code></pre></div>
<h2>Creating Input and Output Schemas</h2>
<p>Before writing the model class, we'll need to define the input and
output schemas of the model. To do this, we'll use the <a href="https://pydantic-docs.helpmanual.io/">pydantic
package</a>.</p>
<p>The "sex" feature used by the model is a categorical feature that can be
stated as an enumeration because it has a limited number of allowed
values:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">SexEnum</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="n">male</span> <span class="o">=</span> <span class="s2">"male"</span>
<span class="n">female</span> <span class="o">=</span> <span class="s2">"female"</span>
</code></pre></div>
<p>We'll use this class as a type in the input schema of the model.</p>
<p>We'll also need another enumeration for the region feature:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">RegionEnum</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="n">southwest</span> <span class="o">=</span> <span class="s2">"southwest"</span>
<span class="n">southeast</span> <span class="o">=</span> <span class="s2">"southeast"</span>
<span class="n">northwest</span> <span class="o">=</span> <span class="s2">"northwest"</span>
<span class="n">northeast</span> <span class="o">=</span> <span class="s2">"northeast"</span>
</code></pre></div>
<p>Now we're ready to create the input schema for the model:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">InsuranceChargesModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">age</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Age"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">18</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">65</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Age of primary beneficiary in years."</span><span class="p">)</span>
<span class="n">sex</span><span class="p">:</span> <span class="n">SexEnum</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Sex"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Gender of beneficiary."</span><span class="p">)</span>
<span class="n">bmi</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Body Mass Index"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mf">15.0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mf">50.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Body mass index of beneficiary."</span><span class="p">)</span>
<span class="n">children</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Children"</span><span class="p">,</span> <span class="n">ge</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">le</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Number of children covered by health insurance."</span><span class="p">)</span>
<span class="n">smoker</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Smoker"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Whether beneficiary is a smoker."</span><span class="p">)</span>
<span class="n">region</span><span class="p">:</span> <span class="n">RegionEnum</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Region"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Region where beneficiary lives."</span><span class="p">)</span>
</code></pre></div>
<p>We used the SexEnum and RegionEnum as types for the categorical
variables and added descriptions to each field. We also added the age,
bmi, children, and smoker fields, which are of type integer, float,
integer, and boolean respectively.</p>
<p>We can use the class to create an object like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">insurance_charges_model.prediction.schemas</span> <span class="kn">import</span> <span class="n">InsuranceChargesModelInput</span>
<span class="nb">input</span> <span class="o">=</span> <span class="n">InsuranceChargesModelInput</span><span class="p">(</span><span class="n">age</span><span class="o">=</span><span class="mi">22</span><span class="p">,</span> <span class="n">sex</span><span class="o">=</span><span class="s2">"male"</span><span class="p">,</span> <span class="n">bmi</span><span class="o">=</span><span class="mf">20.0</span><span class="p">,</span> <span class="n">children</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">region</span><span class="o">=</span><span class="s2">"southwest"</span><span class="p">)</span>
</code></pre></div>
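<p>Because the fields carry ge/le constraints, pydantic rejects out-of-range values automatically. Here is a self-contained sketch of that behavior, re-declaring a trimmed-down version of the schema (only three of the six fields) so it runs on its own:</p>

```python
from enum import Enum
from pydantic import BaseModel, Field, ValidationError

class SexEnum(str, Enum):
    male = "male"
    female = "female"

# trimmed-down stand-in for InsuranceChargesModelInput
class Input(BaseModel):
    age: int = Field(None, title="Age", ge=18, le=65)
    sex: SexEnum = Field(None, title="Sex")
    bmi: float = Field(None, title="Body Mass Index", ge=15.0, le=50.0)

ok = Input(age=22, sex="male", bmi=20.0)   # passes validation

try:
    Input(age=10, sex="male", bmi=20.0)    # age is below the ge=18 bound
except ValidationError:
    print("age=10 was rejected")
```

<p>Any input that fails validation never reaches the model's predict method, which keeps bad data out of the pipeline.</p>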
<p>Now that we have the model input defined, we'll move on to the model
output. This class is a lot simpler:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">InsuranceChargesModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">charges</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="kc">None</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s2">"Charges"</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Individual medical costs billed by health insurance to customer in US dollars."</span><span class="p">)</span>
</code></pre></div>
<p>The model has a single output: the predicted charges in US dollars,
represented as a floating point field. The model schemas are in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/schemas.py">schemas
module</a>
in the prediction package.</p>
<h2>Creating the Model Class</h2>
<p>Since we now have the input and output schemas defined for the model,
we'll be able to create the class that wraps around the model.</p>
<p>To start, we'll define the class and add all of the required properties:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">InsuranceChargesModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Insurance Charges Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"insurance_charges_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="s2">"Model to predict the insurance charges of a customer."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
<span class="k">return</span> <span class="n">__version__</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">InsuranceChargesModelInput</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">InsuranceChargesModelOutput</span>
</code></pre></div>
<p>The properties are required by the MLModel base class and they are used
to easily access metadata about the model. The input and output schema
classes are returned from the input_schema and output_schema properties.</p>
<p>The __init__ method of the class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)))</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"model_files"</span><span class="p">,</span> <span class="s2">"1"</span><span class="p">,</span> <span class="s2">"model.joblib"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">joblib</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
</code></pre></div>
<p>The init method is used to load the model parameters from disk and store
the model object as an object attribute. The model object will be used
to make predictions. Once the init method completes, the model object
should be initialized and ready to make predictions.</p>
<p>The prediction method of the model class looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="n">InsuranceChargesModelInput</span><span class="p">)</span> <span class="o">-></span> <span class="n">InsuranceChargesModelOutput</span><span class="p">:</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">([[</span><span class="n">data</span><span class="o">.</span><span class="n">age</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">sex</span><span class="o">.</span><span class="n">value</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">bmi</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">children</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">smoker</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">region</span><span class="o">.</span><span class="n">value</span><span class="p">]],</span>
<span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s2">"age"</span><span class="p">,</span> <span class="s2">"sex"</span><span class="p">,</span> <span class="s2">"bmi"</span><span class="p">,</span> <span class="s2">"children"</span><span class="p">,</span> <span class="s2">"smoker"</span><span class="p">,</span> <span class="s2">"region"</span><span class="p">])</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">round</span><span class="p">(</span><span class="nb">float</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">]),</span> <span class="mi">2</span><span class="p">)</span>
<span class="k">return</span> <span class="n">InsuranceChargesModelOutput</span><span class="p">(</span><span class="n">charges</span><span class="o">=</span><span class="n">y_hat</span><span class="p">)</span>
</code></pre></div>
<p>The predict method accepts an object of type InsuranceChargesModelInput
and returns an object of type InsuranceChargesModelOutput. First, the
method converts the incoming data into a pandas dataframe, then the
dataframe is used to make a prediction, and the result is converted to a
floating point number and rounded to two decimal places. Lastly, the
output object is created using the prediction and returned to the
caller.</p>
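<p>The same convert-predict-round flow can be sketched with a stub in place of the fitted pipeline. StubModel and its raw output value are made up for illustration; only the dataframe construction and rounding mirror the method above:</p>

```python
import pandas as pd

class StubModel:
    """Stand-in for the fitted scikit-learn pipeline."""
    def predict(self, X):
        return [19326.590774]  # made-up raw prediction

def predict(model, age, sex, bmi, children, smoker, region):
    # build a single-row dataframe, matching the training column order
    X = pd.DataFrame([[age, sex, bmi, children, smoker, region]],
                     columns=["age", "sex", "bmi", "children", "smoker", "region"])
    # predict, then round to two decimal places for a dollar amount
    return round(float(model.predict(X)[0]), 2)

print(predict(StubModel(), 65, "male", 12.5, 0, True, "southwest"))  # 19326.59
```

<p>Keeping the column order identical to the training data is essential, since the ColumnTransformer at the front of the pipeline selects features by name and position.</p>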
<p>The model class is defined in the <a href="https://github.com/schmidtbri/regression-model/blob/master/insurance_charges_model/prediction/model.py">model
module</a>
in the prediction package.</p>
<h1>Creating a RESTful Service</h1>
<p>Now that we have a model class defined, we are finally able to build the
RESTful service that will host the model when it is deployed. Luckily,
we don't actually need to write any code for this because we'll be using
the <a href="https://pypi.org/project/rest-model-service/">rest_model_service package</a>. If you'd
like to learn more about the rest_model_service package, there is a
<a href="https://www.tekhnoal.com/rest-model-service.html">blog post</a>
about it.</p>
<p>To install the package, execute this command:</p>
<div class="highlight"><pre><span></span><code>pip install rest_model_service
</code></pre></div>
<p>To create a service for our model, all we need to do is add a
YAML configuration file to the project. The <a href="https://github.com/schmidtbri/regression-model/blob/master/rest_config.yaml">configuration
file</a>
looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Insurance Charges Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance_charges_model.prediction.model.InsuranceChargesModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>The service title is the name we'll give the service in the
documentation. The models array contains references to the models that
we'd like to host within the service. Each model needs to have the
qualified name of the model along with the class path to the model's
MLModel class. The create_endpoint option is set to true to tell the
service to create an endpoint for the model.</p>
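<p>The class_path setting works because a dotted path can be resolved to a class at runtime. The rest_model_service package handles this internally; the helper below is only a sketch of the idea, demonstrated with a standard-library class:</p>

```python
import importlib


def load_class(class_path: str):
    """Resolve a dotted path like 'package.module.ClassName' to a class object."""
    module_path, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# demonstration with a class from the standard library
cls = load_class("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```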
<p>Using the configuration file, we're able to create an OpenAPI
specification file for the model service by executing this command:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
generate_openapi --output_file<span class="o">=</span>service_contract.yaml
</code></pre></div>
<p>The
<a href="https://github.com/schmidtbri/regression-model/blob/master/service_contract.yaml">service_contract.yaml</a>
file will be generated and it will contain the specification that was
generated for the model service. The
<a href="https://github.com/schmidtbri/regression-model/blob/master/service_contract.yaml#L183-L218">insurance_charges_model</a>
endpoint is the one we'll call to make predictions with the model. The
model's <a href="https://github.com/schmidtbri/regression-model/blob/master/service_contract.yaml#L183-L218">input and output
schemas</a>
were automatically extracted and added to the specification.</p>
<p>To run the service locally, execute this command:</p>
<div class="highlight"><pre><span></span><code>uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should come up and can be accessed in a web browser at
<a href="http://127.0.0.1:8000">http://127.0.0.1:8000</a>. When you access
that URL you will be redirected to the documentation page that is
generated by the FastAPI package:</p>
<p><img alt="Documentation Page" src="https://www.tekhnoal.com/8.png" width="100%"></p>
<p>The documentation allows you to make requests against the API in order
to try it out. Here's a prediction request against the insurance charges
model:</p>
<p><img alt="Request" src="https://www.tekhnoal.com/9.png" width="100%"></p>
<p>And the prediction result:</p>
<p><img alt="Prediction Result" src="https://www.tekhnoal.com/10.png" width="100%"></p>
<p>By using the MLModel base class provided by the ml_base package and the
REST service framework provided by the rest_model_service package we're
able to quickly stand up a service to host the model.</p>
<h1>Deploying the Model</h1>
<p>Now that we have a working model and model service, we'll need to deploy
it somewhere. To do this, we'll use docker and kubernetes.</p>
<h2>Creating a Docker Image</h2>
<p>Before moving forward, let's create a docker image and run it locally.
The docker image is generated using instructions in the
<a href="https://github.com/schmidtbri/regression-model/blob/master/Dockerfile">Dockerfile</a>:</p>
<div class="highlight"><pre><span></span><code><span class="k">FROM</span><span class="w"> </span><span class="s">tiangolo/uvicorn-gunicorn-fastapi:python3.7</span>
<span class="k">MAINTAINER</span><span class="w"> </span><span class="s">Brian Schmidt</span>
<span class="s2">"6666331+schmidtbri@users.noreply.github.com"</span>
<span class="k">WORKDIR</span><span class="w"> </span><span class="s">./service</span>
<span class="k">COPY</span><span class="w"> </span>./insurance_charges_model ./insurance_charges_model
<span class="k">COPY</span><span class="w"> </span>./rest_config.yaml ./rest_config.yaml
<span class="k">COPY</span><span class="w"> </span>./service_requirements.txt ./service_requirements.txt
<span class="k">RUN</span><span class="w"> </span>pip install -r service_requirements.txt
<span class="k">ENV</span><span class="w"> </span><span class="nv">APP_MODULE</span><span class="o">=</span>rest_model_service.main:app
</code></pre></div>
<p>The Dockerfile is used by this command to create the docker image:</p>
<div class="highlight"><pre><span></span><code>docker build -t insurance_charges_model:0.1.0 .
</code></pre></div>
<p>To make sure everything worked as expected, we'll look through the
docker images in our system:</p>
<div class="highlight"><pre><span></span><code>docker image ls
</code></pre></div>
<p>The insurance_charges_model image should be listed. Next, we'll start
the image to see if everything is working as expected:</p>
<div class="highlight"><pre><span></span><code>docker run -d -p <span class="m">80</span>:80 insurance_charges_model:0.1.0
</code></pre></div>
<p>The service should be accessible on port 80 of localhost, so we'll try
to make a prediction using the curl command:</p>
<div class="highlight"><pre><span></span><code>curl -X <span class="s1">'POST'</span> <span class="se">\</span>
<span class="s1">'http://localhost/api/models/insurance_charges_model/prediction'</span> <span class="se">\</span>
-H <span class="s1">'accept: application/json'</span> <span class="se">\</span>
-H <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
-d <span class="s1">'{</span>
<span class="s1">"age": 65,</span>
<span class="s1">"sex": "male",</span>
<span class="s1">"bmi": 50,</span>
<span class="s1">"children": 5,</span>
<span class="s1">"smoker": true,</span>
<span class="s1">"region": "southwest"</span>
<span class="s1">}'</span>
</code></pre></div>
<p>We got back this output, which tells us that the service is working as
expected:</p>
<div class="highlight"><pre><span></span><code>{"charges":46918.68}
</code></pre></div>
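<p>The same request can be made from Python using only the standard library. The function below builds the POST request that the curl command above sends; actually sending it requires the container from the previous step to be running:</p>

```python
import json
import urllib.request


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Build the JSON POST request for the model's prediction endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST")


payload = {"age": 65, "sex": "male", "bmi": 50, "children": 5,
           "smoker": True, "region": "southwest"}
request = build_request(
    "http://localhost/api/models/insurance_charges_model/prediction", payload)

# to send the request (with the service running):
#   with urllib.request.urlopen(request) as response:
#       print(json.loads(response.read()))
```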
<p>If there are any problems, we should be able to debug them using the
logs. To see the logs emitted by the running container, execute this
command:</p>
<div class="highlight"><pre><span></span><code>docker logs <span class="k">$(</span>docker ps -lq<span class="k">)</span>
</code></pre></div>
<p>To stop the docker container, execute this command:</p>
<div class="highlight"><pre><span></span><code>docker <span class="nb">kill</span> <span class="k">$(</span>docker ps -lq<span class="k">)</span>
</code></pre></div>
<h2>Setting up Digital Ocean</h2>
<p>To show how to deploy the model service we created, we'll use <a href="https://www.digitalocean.com/">Digital
Ocean</a>. In this section we'll be
using the doctl command line utility which will help us to interact with
the Digital Ocean Kubernetes service. We followed <a href="https://docs.digitalocean.com/reference/doctl/how-to/install/">these
instructions</a>
to install the doctl utility. Before we can do anything with the Digital
Ocean API, we need to authenticate, so we created an API token by
following these instructions. Once we have the token we can add it to
the doctl utility by creating a new authentication context with this
command:</p>
<div class="highlight"><pre><span></span><code>doctl auth init <span class="se">\-</span>-context model-services-context
</code></pre></div>
<p>The command creates a new context called "model-services-context" that
we'll use to interact with the Digital Ocean API. The command asks for
the API token we generated and saves it into the configuration file of
the tool. To make sure that the context was created correctly and is the
current context, execute this command:</p>
<div class="highlight"><pre><span></span><code>doctl auth list
</code></pre></div>
<p>If the context we created is not the current context, we can switch to
it with this command:</p>
<div class="highlight"><pre><span></span><code>doctl auth switch <span class="se">\-</span>-context model-services-context
</code></pre></div>
<p>To make sure that we are working in the right account, execute this
command:</p>
<div class="highlight"><pre><span></span><code>doctl account get
</code></pre></div>
<p>The account details should match the account that you used to login. Now
that we are connecting to the right account in DO, we'll work on
uploading the docker image that contains the model service so that we
can use it in the Kubernetes cluster. First, we'll create a container
registry with this command:</p>
<div class="highlight"><pre><span></span><code>doctl registry create model-services-registry <span class="se">\-</span>-subscription-tier basic
</code></pre></div>
<p>We called the new registry "model-services-registry" and we used the
basic tier, which costs $5 a month.</p>
<h3>Pushing the Image</h3>
<p>Now that we have a registry, we need to add credentials to our local
docker daemon so that we can upload images. To do that, we'll use
this command:</p>
<div class="highlight"><pre><span></span><code>doctl registry login
</code></pre></div>
<p>In order to upload the image, we need to tag it with the URL of the DO
registry we created. The docker tag command looks like this:</p>
<div class="highlight"><pre><span></span><code>docker tag insurance_charges_model:0.1.0
registry.digitalocean.com/model-services-registry/insurance_charges_model:0.1.0
</code></pre></div>
<p>Now we can push the image to the DO registry:</p>
<div class="highlight"><pre><span></span><code>docker push registry.digitalocean.com/model-services-registry/insurance_charges_model:0.1.0
</code></pre></div>
<h3>Creating the Kubernetes Cluster</h3>
<p>The doctl tool provides an option for creating a Kubernetes cluster; the
command looks like this:</p>
<div class="highlight"><pre><span></span><code>doctl kubernetes cluster create model-services-cluster
</code></pre></div>
<p>The cluster should come up after a while. The default cluster size is 3
nodes, which should cost about $30 to run for a month. We'll shut the
cluster down later to save money.</p>
<p>Next, we need to integrate the Kubernetes cluster with Digital Ocean's
docker registry; this allows the cluster to pull images from the
docker registry we created above. To do this, execute this command:</p>
<div class="highlight"><pre><span></span><code>doctl kubernetes cluster registry add model-services-cluster
</code></pre></div>
<p>To access the cluster, doctl has another option that will set up the
kubectl tool for us:</p>
<div class="highlight"><pre><span></span><code>doctl kubernetes cluster kubeconfig save <span class="m">85866655</span>-708d-47a9-8797-bcca56a10401
</code></pre></div>
<p>The unique identifier is for the cluster that was just created and is
returned by the previous command. When the command finishes, the current
context in kubectl should be switched to the newly created cluster. To
list the contexts in kubectl, execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl config get-contexts
</code></pre></div>
<p>A listing of the contexts currently in the kubectl configuration should
appear, and there should be a star next to the new cluster's context. We
can get a list of the nodes in the cluster with this command:</p>
<div class="highlight"><pre><span></span><code>kubectl get nodes
</code></pre></div>
<p>Now that we have a cluster and are connected to it, we'll create a
namespace to hold the resources for our model deployment. We'll create a
namespace using this YAML manifest:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">v1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Namespace</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">model-services-namespace</span><span class="w"></span>
</code></pre></div>
<p>The manifest can be found in <a href="https://github.com/schmidtbri/regression-model/blob/master/kubernetes/namespace.yml">this
file</a>.
To apply the manifest to the cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl create -f kubernetes/namespace.yml
</code></pre></div>
<p>To take a look at the namespaces, execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl get namespace
</code></pre></div>
<p>The new namespace should appear in the listing along with other
namespaces created by default by the system. To use the new
namespace for the rest of the operations, execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl config set-context --current --namespace=model-services-namespace
</code></pre></div>
<h2>Creating a Kubernetes Deployment</h2>
<p>We are now ready to actually create a deployment in the cluster. A
deployment is a resource created within the Kubernetes cluster that
provides declarative updates to individual pods and ReplicaSets. A pod
represents a single instance of the web service that is hosting our
model. We'll use a Deployment to launch a single instance of the service
in the cluster; the replica count can be raised later to run more
instances. The Deployment will manage the state of the Pods that hold
the service instances and make sure that the desired state is always
maintained in the cluster.</p>
<p>The Deployment is defined as YAML like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">apps/v1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Deployment</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model-deployment</span><span class="w"></span>
<span class="w"> </span><span class="nt">labels</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">replicas</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1</span><span class="w"></span>
<span class="w"> </span><span class="nt">selector</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">matchLabels</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model</span><span class="w"></span>
<span class="w"> </span><span class="nt">template</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">labels</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model</span><span class="w"></span>
<span class="w"> </span><span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">containers</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model</span><span class="w"></span>
<span class="w"> </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">registry.digitalocean.com/model-services-registry/insurance_charges_model:0.1.0</span><span class="w"></span>
<span class="w"> </span><span class="nt">ports</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">containerPort</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">80</span><span class="w"></span>
<span class="w"> </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">TCP</span><span class="w"></span>
<span class="w"> </span><span class="nt">imagePullPolicy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Always</span><span class="w"></span>
<span class="w"> </span><span class="nt">resources</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">requests</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">cpu</span><span class="p">:</span><span class="w"> </span><span class="s">"250m"</span><span class="w"></span>
</code></pre></div>
<p>The file containing the YAML is
<a href="https://github.com/schmidtbri/regression-model/blob/master/kubernetes/deployment.yml">here</a>.
The deployment specifies that there should be one replica of the docker
image running in the cluster. The "app=insurance-charges-model" label is
applied to the Pod and is used to select it later.
<p>The Deployment is created within the Kubernetes cluster with this
command:</p>
<div class="highlight"><pre><span></span><code>kubectl apply -f kubernetes/deployment.yml
</code></pre></div>
<p>Once the command finishes we can see the new deployment with this
command:</p>
<div class="highlight"><pre><span></span><code>kubectl get deployments
</code></pre></div>
<p>We can view the pods that are being managed by the deployment with this
command:</p>
<div class="highlight"><pre><span></span><code>kubectl get pods
</code></pre></div>
<p>The output should look something like this:</p>
<div class="highlight"><pre><span></span><code>NAME READY STATUS RESTARTS AGE
insurance-charges-model-deployment-7d58f6d569-zwjpw 1/1 Running 0 3m48s
</code></pre></div>
<h2>Creating a Kubernetes Service</h2>
<p>Now that we have a set of pods, we need to make them accessible to the
outside world. The Service resource within Kubernetes is used to select
a set of Pods and allow access to them through a single entry point. The
Service allows us to decouple the Pods and Deployment resources that
make up our REST service from the way that they are exposed to users.</p>
<p>The Service is defined as YAML like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">v1</span><span class="w"></span>
<span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Service</span><span class="w"></span>
<span class="nt">metadata</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model-service</span><span class="w"></span>
<span class="nt">spec</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">LoadBalancer</span><span class="w"></span>
<span class="w"> </span><span class="nt">selector</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">app</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">insurance-charges-model</span><span class="w"></span>
<span class="w"> </span><span class="nt">ports</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">http</span><span class="w"></span>
<span class="w"> </span><span class="nt">protocol</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">TCP</span><span class="w"></span>
<span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">80</span><span class="w"></span>
<span class="w"> </span><span class="nt">targetPort</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">80</span><span class="w"></span>
</code></pre></div>
<p>The YAML file is
<a href="https://github.com/schmidtbri/regression-model/blob/master/kubernetes/service.yml">here</a>.
The Service is selecting the same Pods that are managed by the
Deployment resource which we created above by using the same selector.</p>
<p>The Service is created within the Kubernetes cluster with this command:</p>
<div class="highlight"><pre><span></span><code>kubectl apply -f kubernetes/service.yml
</code></pre></div>
<p>You can see the new service with this command:</p>
<div class="highlight"><pre><span></span><code>kubectl get services
</code></pre></div>
<p>The Service type is LoadBalancer, which means that the cloud provider is
providing a load balancer and public IP address through which we can
contact the service. To view details about the load balancer provided by
Digital Ocean for this Service, we'll execute this command:</p>
<div class="highlight"><pre><span></span><code>kubectl describe service insurance-charges-model-service <span class="p">|</span> grep <span class="s2">"LoadBalancer Ingress"</span>
</code></pre></div>
<p>The load balancer can take a while longer than the service to come up;
until it is running, the command won't return anything.
The IP address that the Digital Ocean load balancer sits behind will be
listed in the output of the command. To get access to the service, we'll
hit the IP address with a web browser:</p>
<p><img alt="Prediction Result" src="https://www.tekhnoal.com/11.png" width="100%"></p>
<p>We can access the service documentation through the load balancer; the
webpage is returned by the Pod that is running the REST service.</p>
<p>We'll try the same curl command as before to see if the model is
reachable:</p>
<div class="highlight"><pre><span></span><code>curl -X <span class="s1">'POST'</span> <span class="s1">'http://143.244.214.226/api/models/insurance_charges_model/prediction'</span> <span class="se">\</span>
-H <span class="s1">'accept: application/json'</span> <span class="se">\</span>
-H <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
-d <span class="s1">'{</span>
<span class="s1">"age": 65,</span>
<span class="s1">"sex": "male",</span>
<span class="s1">"bmi": 50,</span>
<span class="s1">"children": 5,</span>
<span class="s1">"smoker": true,</span>
<span class="s1">"region": "southwest"</span>
<span class="s1">}'</span>
</code></pre></div>
<p>A prediction was returned from the model:</p>
<div class="highlight"><pre><span></span><code>{"charges":46277.67}
</code></pre></div>
<h1>Deleting the Resources</h1>
<p>Now that we're done with the service we need to destroy the resources.
To destroy the load balancer, execute this command:</p>
<div class="highlight"><pre><span></span><code>doctl compute load-balancer delete <span class="se">\-</span>-force <span class="k">$(</span>kubectl get svc insurance-charges-model-service -o <span class="nv">jsonpath</span><span class="o">=</span><span class="s2">"{.metadata.annotations.kubernetes\.digitalocean\.com/load-balancer-id}"</span><span class="k">)</span>
</code></pre></div>
<p>To destroy the kubernetes cluster, execute this command:</p>
<div class="highlight"><pre><span></span><code>doctl k8s cluster delete <span class="m">85866655</span>-708d-47a9-8797-bcca56a10401
</code></pre></div>
<p>To destroy the docker registry, execute this command:</p>
<div class="highlight"><pre><span></span><code>doctl registry delete model-services-registry
</code></pre></div>
<h1>Closing</h1>
<p>This blog post was created as a demonstration of how to build and deploy
machine learning models quickly and easily. Although I didn't do any
deep explanations of how the different tools work, I made sure to link
to other resources from which you can learn more about them. The
techniques and packages used are all open source and can be easily
downloaded and used in other projects.</p>
<p>The dataset that we used happens to be useful for predicting insurance
charges, but the code in this project can be used to train a model based
on any regression data set because of the automated feature engineering
and automated machine learning techniques that we used. We should be
able to throw any dataset at the code and the automations that we built
will enable us to quickly build a model and deploy a RESTful service
with it.</p>
<p>Something that we can improve on in the future is to create a Helm chart
that we can use to deploy an ML model service quickly and easily. Since
the Kubernetes resources for the model service are likely to be very
similar to other model services, we should be able to create a Helm
chart that we can reuse to quickly spin up model services that follow
the same pattern as this one.</p>
<p>Another thing that we can improve on is the automated generation of
input and output schemas for the model. When we built the input and
output schemas for the model, we had to manually extract the field
information from the dataframes. By introspecting the dataframe
metadata, we should be able to automatically generate the input and
output schemas, which can be used to automatically generate the code in
the schemas.py module. This is just one way in which we can further
automate the deployment process of an ML model.</p>A RESTful ML Model Service2021-04-29T07:32:00-05:002021-04-29T07:32:00-05:00Brian Schmidttag:www.tekhnoal.com,2021-04-29:/rest-model-service.html<p>Sometimes you find yourself writing the same code over and over. When that starts happening you know it's time to take what you've learned and create a reusable piece of code that can be applied in the future. Because of the experience that we've gained in writing previous blog posts, I think that it is a good time to make a reusable service that can host any number of machine learning models.</p><h1>Introduction</h1>
<p>Sometimes you find yourself writing the same code over and over. When
that starts happening you know it's time to take what you've learned and
create a reusable piece of code that can be applied in the future.
Because of the experience that we've gained in writing previous blog
posts, I think that it is a good time to make a reusable service that
can host any number of machine learning models.</p>
<p>In previous blog posts we've built many different types of services that
can host ML models. In this blog post we'll aim at building a reusable
service that can host an ML model behind a RESTful API. APIs are called
RESTful when they follow the guidelines of the REST standard. REST
stands for <a href="https://en.wikipedia.org/wiki/Representational_state_transfer">Representational State
Transfer</a>
and is an architectural style built on the HTTP protocol that is useful for building web
applications. RESTful APIs are widely used in production systems and are
an industry standard for integrating different systems.</p>
<p>The features that we want this reusable service to have are simple. We
want to be able to install the service code as a package through the
pip Python package manager. We want the API of the service
to follow well-established standards, in this case we'll follow the REST
standard for web APIs. We want to be able to configure the service to
host any number of ML models. Lastly, we want the service to be
self-documenting, so that we don't have to create OpenAPI documentation
for the service manually.</p>
<p>All of these things are possible and indeed easy to implement because we
will be relying on a common interface for all the ML models that the
service will host. This interface is the MLModel interface and it is
defined in another package that we've already created. This interface
and the package are fully described in a <a href="https://www.tekhnoal.com/introducing-ml-base-package.html">previous blog
post</a>.
By requiring that every model we want to host in the service fulfill the
requirements of the interface, we are able to write the service once
and reuse it.</p>
<p>The MLModel interface is very simple. It requires that a model class be
created that contains two methods: an __init__ method that
initializes the model object and a predict method that actually makes a
prediction. This approach is very similar to the one taken by Uber
in their internal ML platform; they describe how they structure their ML
model code
<a href="https://eng.uber.com/michelangelo-pyml/">here</a>.
SeldonCore, an open source project for deploying ML models, takes a
similar approach, which is described
<a href="https://docs.seldon.io/projects/seldon-core/en/latest/python/python_component.html">here</a>.
In this blog post we will leverage the standardization that the MLModel
interface makes possible to write a RESTful service that can host any
model that follows the standard.</p>
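<p>The contract can be sketched as an abstract base class with those two methods. The real MLModel base class in the ml_base package also carries metadata such as the model's qualified name (see the linked post); the toy model below is purely illustrative:</p>

```python
from abc import ABC, abstractmethod


class MLModel(ABC):
    """Sketch of the contract: initialize once, then predict many times."""

    @abstractmethod
    def __init__(self):
        """Load model parameters and any other state the model needs."""

    @abstractmethod
    def predict(self, data):
        """Make a prediction from the input data."""


class AddOneModel(MLModel):
    """Trivial model used only to show the interface being exercised."""

    def __init__(self):
        self._offset = 1

    def predict(self, data):
        return data + self._offset


model = AddOneModel()
print(model.predict(41))  # 42
```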
<h1>Package Structure</h1>
<p>The service codebase will be structured into the following files:</p>
<div class="highlight"><pre><span></span><code>- rest_model_service
- __init__.py
- configuration.py # data models for configuration
- generate_openapi.py # script to generate an openapi spec
- main.py # entry point for service
- routes.py # controllers for routes
- schemas.py # service schemas
- tests
- requirements.txt
- setup.py
- test_requirements.txt
</code></pre></div>
<p>This structure can be seen in the <a href="https://github.com/schmidtbri/rest-model-service">github
repository</a>.</p>
<h1>FastAPI</h1>
<p>Now that we have a set of requirements and have described our approach,
let's start building the REST service. For the web framework, we'll use
the popular FastAPI framework. FastAPI is a modern framework for
building web applications that uses python 3.6 and above. One of the
great things about it is that it uses type hints by default, which helps
to reduce the number of bugs in your code. By using the
<a href="https://pydantic-docs.helpmanual.io/">pydantic package</a> for
defining schemas, FastAPI can generate an OpenAPI specification for
your application without any extra effort. Because FastAPI supports
asynchronous operations, it is also one of the fastest python web
frameworks available. FastAPI is a great choice for our REST service
because it follows a number of best practices by default which will
raise the quality of our code. The ml_base package uses the pydantic
package to define model input and output schemas which makes interfacing
with FastAPI very easy.</p>
<p>We'll build up our understanding of how the service works by exploring
the individual endpoints of the service. An endpoint is simply a point
through which the service interacts with the outside world. The service
has two types of endpoints: the metadata endpoint and all of the model
endpoints. We'll talk about the metadata endpoint first.</p>
<h1>Model Metadata Endpoint</h1>
<p>The service needs to be able to expose information about the models that
it is hosting to client systems. To do this, we'll add an endpoint that
returns model metadata. The first thing we need to do is create the data
model for the information that the endpoint will return:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelMetadata</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Metadata of a model."""</span>
<span class="n">display_name</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The display name of the model."</span><span class="p">)</span>
<span class="n">qualified_name</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The qualified name of the model."</span><span class="p">)</span>
<span class="n">description</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The description of the model."</span><span class="p">)</span>
<span class="n">version</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The version of the model."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/schemas.py#L6-L12">here</a>.</p>
<p>The ModelMetadata object represents one model that is being hosted by
the service. We actually want to be able to host many models within the
service, so we need to create a "collection" data model that can hold
many ModelMetadata objects:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelMetadataCollection</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="sd">"""Collection of model metadata."""</span>
<span class="n">models</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">ModelMetadata</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"A collection of model descriptions."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/schemas.py#L15-L18">here</a>.</p>
<p>Now that we have the data models, we can build the function that the
client will interact with to get the model metadata:</p>
<div class="highlight"><pre><span></span><code><span class="k">async</span> <span class="k">def</span> <span class="nf">get_models</span><span class="p">():</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">models_metadata_collection</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="n">models_metadata_collection</span> <span class="o">=</span> <span class="n">ModelMetadataCollection</span><span class="p">(</span><span class="o">**</span><span class="p">{</span><span class="s2">"models"</span><span class="p">:</span> <span class="n">models_metadata_collection</span><span class="p">})</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="n">models_metadata_collection</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">Error</span><span class="p">(</span><span class="nb">type</span><span class="o">=</span><span class="s2">"ServiceError"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="n">error</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/routes.py#L18-L30">here</a>.</p>
<p>The function does not accept any parameters because we don't need to
select any specific model, we want to return metadata about all of the
models. The first thing the function does is instantiate the
ModelManager singleton. The ModelManager is a simple utility that we use
to manage model instances; we described how it operates in a <a href="https://brianschmidt-78145.medium.com/introducing-the-ml-base-package-1cc80ded39b4">previous
blog
post</a>.
The ModelManager object should already contain instances of models, and
by calling the get_models() method, we can get the metadata that we
will return to the client.</p>
<p>The models_metadata_collection object is instantiated using the data
model we created above, and returned as a JSONResponse to the client. If
anything goes wrong, the function catches the exception object and
returns a JSONResponse with the error details and a 500 status code.</p>
<h1>Prediction Endpoint</h1>
<p>To enable the service to host many instances of models, the code for the
prediction endpoint needs to be a bit more complex than the metadata
endpoint. We'll use a class instead of a function to create the
controller for the endpoint:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">PredictionController</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model</span><span class="p">:</span> <span class="n">MLModel</span><span class="p">)</span> <span class="o">-></span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/routes.py#L33-L43">here</a>.</p>
<p>The class is initialized with a reference to the instance of the model
that it will be hosting. In this way, we can instantiate one controller
object for each model that is living inside of the model service. To
make predictions with the model, we'll add a method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="n">prediction</span><span class="p">)</span>
<span class="k">except</span> <span class="n">MLModelSchemaValidationException</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">Error</span><span class="p">(</span><span class="nb">type</span><span class="o">=</span><span class="s2">"SchemaValidationError"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">400</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="n">error</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">Error</span><span class="p">(</span><span class="nb">type</span><span class="o">=</span><span class="s2">"ServiceError"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span><span class="o">.</span><span class="n">dict</span><span class="p">()</span>
<span class="k">return</span> <span class="n">JSONResponse</span><span class="p">(</span><span class="n">status_code</span><span class="o">=</span><span class="mi">500</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="n">error</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/routes.py#L45-L55">here</a>.</p>
<p>The method is a dunder method named "__call__". This type of dunder
method makes an object instantiated from the class <a href="https://www.geeksforgeeks.org/__call__-in-python/">behave like a
function</a>,
which means that once we instantiate it, we'll be able to register it as
an endpoint on the service.</p>
<p>The method is pretty simple: it takes the data object and sends it to
the model to make a prediction. It then returns a JSONResponse that
contains the prediction and a 200 status code. This response will be
returned by the service if everything goes well. If the model raises an
MLModelSchemaValidationException, then the method will return a
JSONResponse with the 400 status code. For any other exceptions the
method will return a 500 status code.</p>
<p>In the next section we'll see how this class is instantiated in order to
allow the service to host any number of MLModel instances. We'll also
see how we use the input and output models provided by each model object
to create the documentation automatically.</p>
<h1>Application Startup</h1>
<p>At startup, the service does not know anything about which models it
will be hosting, so it needs to load a configuration file to find out.
In the main.py file, the configuration file is loaded from disk with
this code:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"REST_CONFIG"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">file_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"REST_CONFIG"</span><span class="p">]</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">file_path</span> <span class="o">=</span> <span class="s2">"rest_config.yaml"</span>
<span class="k">if</span> <span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">file_path</span><span class="p">)</span> <span class="ow">and</span> <span class="n">path</span><span class="o">.</span><span class="n">isfile</span><span class="p">(</span><span class="n">file_path</span><span class="p">):</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">configuration</span> <span class="o">=</span> <span class="n">yaml</span><span class="o">.</span><span class="n">full_load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">configuration</span> <span class="o">=</span> <span class="n">Configuration</span><span class="p">(</span><span class="o">**</span><span class="n">configuration</span><span class="p">)</span>
<span class="n">app</span> <span class="o">=</span> <span class="n">create_app</span><span class="p">(</span><span class="n">configuration</span><span class="o">.</span><span class="n">service_title</span><span class="p">,</span> <span class="n">configuration</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Could not find configuration file '</span><span class="si">{}</span><span class="s2">'."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">file_path</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/main.py#L65-L79">here</a>.</p>
<p>The default configuration file path is "rest_config.yaml" which is used
if no other path is provided to the service. To provide an alternative
path, we can set it in the "REST_CONFIG" environment variable. Once we
have the yaml file loaded, we can call the create_app() function which
creates the FastAPI application object.</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">create_app</span><span class="p">(</span><span class="n">service_title</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">models</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">Model</span><span class="p">])</span> <span class="o">-></span> <span class="n">FastAPI</span><span class="p">:</span>
<span class="n">app</span><span class="p">:</span> <span class="n">FastAPI</span> <span class="o">=</span> <span class="n">FastAPI</span><span class="p">(</span><span class="n">title</span><span class="o">=</span><span class="n">service_title</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="n">__version__</span><span class="p">)</span>
<span class="n">app</span><span class="o">.</span><span class="n">add_api_route</span><span class="p">(</span><span class="s2">"/"</span><span class="p">,</span>
<span class="n">get_root</span><span class="p">,</span>
<span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s2">"GET"</span><span class="p">])</span>
<span class="n">app</span><span class="o">.</span><span class="n">add_api_route</span><span class="p">(</span><span class="s2">"/api/models"</span><span class="p">,</span>
<span class="n">get_models</span><span class="p">,</span>
<span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s2">"GET"</span><span class="p">],</span>
<span class="n">response_model</span><span class="o">=</span><span class="n">ModelMetadataCollection</span><span class="p">,</span>
<span class="n">responses</span><span class="o">=</span><span class="p">{</span>
<span class="mi">500</span><span class="p">:</span> <span class="p">{</span><span class="s2">"model"</span><span class="p">:</span> <span class="n">Error</span><span class="p">}</span>
<span class="p">})</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/main.py#L19-L35">here</a>.</p>
<p>The create_app() function first creates the app object with the service
title that we loaded from the configuration file and the version. We
then add two routes to the app: the root route and the model metadata
route. The root route simply reroutes the request to the /docs route
which hosts the auto-generated documentation. The model metadata route
returns metadata for all of the models hosted by the service.</p>
<p>The next thing the function does is actually load the models:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">models</span><span class="p">:</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_model</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">class_path</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model</span><span class="o">.</span><span class="n">create_endpoint</span><span class="p">:</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="n">controller</span> <span class="o">=</span> <span class="n">PredictionController</span><span class="p">(</span><span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="p">)</span>
<span class="n">controller</span><span class="o">.</span><span class="fm">__call__</span><span class="o">.</span><span class="vm">__annotations__</span><span class="p">[</span><span class="s2">"data"</span><span class="p">]</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">input_schema</span>
<span class="n">app</span><span class="o">.</span><span class="n">add_api_route</span><span class="p">(</span><span class="s2">"/api/models/</span><span class="si">{}</span><span class="s2">/prediction"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">),</span>
<span class="n">controller</span><span class="p">,</span>
<span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s2">"POST"</span><span class="p">],</span>
<span class="n">response_model</span><span class="o">=</span><span class="n">model</span><span class="o">.</span><span class="n">output_schema</span><span class="p">,</span>
<span class="n">description</span><span class="o">=</span><span class="n">model</span><span class="o">.</span><span class="n">description</span><span class="p">,</span>
<span class="n">responses</span><span class="o">=</span><span class="p">{</span>
<span class="mi">400</span><span class="p">:</span> <span class="p">{</span><span class="s2">"model"</span><span class="p">:</span> <span class="n">Error</span><span class="p">},</span>
<span class="mi">500</span><span class="p">:</span> <span class="p">{</span><span class="s2">"model"</span><span class="p">:</span> <span class="n">Error</span><span class="p">}</span>
<span class="p">})</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Skipped creating an endpoint for model:</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">))</span>
<span class="k">return</span> <span class="n">app</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/rest_model_service/main.py#L37-L62">here</a>.</p>
<p>The first thing we do is instantiate the ModelManager singleton. Next,
we'll process each model in the configuration. For each model, we'll
load it into the ModelManager and then create an endpoint for it. An
endpoint is only created for a model if the configuration sets the
"create_endpoint" option to true for that model.</p>
<p>Creating an endpoint for a model is a little tricky because we need to
dynamically create an endpoint and add all of the options that FastAPI
supports.</p>
<p>To create an endpoint for a model, we first need to get a reference to
the model from the ModelManager singleton. We then instantiate the
PredictionController class and pass the reference to the model to the
__init__() method of the class. We now have a function that we can
register with the FastAPI application as an endpoint controller. Before
we can do that, we need to add an annotation to the function that will
allow FastAPI to automatically create documentation for the endpoint.
We'll annotate the controller function with the pydantic type that the
model accepts as input. Now we are ready to register the function as a
controller, when we do that we also provide the FastAPI app with the
HTTP method, response pydantic model, description, and error response
models. All of these options give the FastAPI app information about the
endpoint which will be used later to auto-generate the documentation.</p>
<h1>Creating a Package</h1>
<p>This service will be most useful when it can be "added on" to a model
project so that it can provide the deployment functionality for a
machine learning model without becoming part of the codebase. If we take
this approach, then the rest_model_service package is installed in the
python environment and it will live as a dependency of the ml model
package.</p>
<p>To enable all of this, the rest_model_service package is published to
PyPI and can be installed with the pip package manager.
To install the package into your project, execute this command:</p>
<div class="highlight"><pre><span></span><code>pip install rest_model_service
</code></pre></div>
<p>Once the service package is installed, we can use it within an ML model
project to create a RESTful service for the model.</p>
<h1>Using the Service</h1>
<p>In order to try out the service we'll need a model that follows the
MLModel interface. There is a simple mocked model in the tests.mocks
module that we'll use to try out the service:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">IrisModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">sepal_length</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">5.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Length of the sepal of the flower."</span><span class="p">)</span>
<span class="n">sepal_width</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">6.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Width of the sepal of the flower."</span><span class="p">)</span>
<span class="n">petal_length</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">6.8</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Length of the petal of the flower."</span><span class="p">)</span>
<span class="n">petal_width</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">3.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"Width of the petal of the flower."</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">Species</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="n">iris_setosa</span> <span class="o">=</span> <span class="s2">"Iris setosa"</span>
<span class="n">iris_versicolor</span> <span class="o">=</span> <span class="s2">"Iris versicolor"</span>
<span class="n">iris_virginica</span> <span class="o">=</span> <span class="s2">"Iris virginica"</span>
<span class="k">class</span> <span class="nc">IrisModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">species</span><span class="p">:</span> <span class="n">Species</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"Predicted species of the flower."</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">IrisModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="n">display_name</span> <span class="o">=</span> <span class="s2">"Iris Model"</span>
<span class="n">qualified_name</span> <span class="o">=</span> <span class="s2">"iris_model"</span>
<span class="n">description</span> <span class="o">=</span> <span class="s2">"Model for predicting the species of a flower based on its measurements."</span>
<span class="n">version</span> <span class="o">=</span> <span class="s2">"1.0.0"</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="n">IrisModelInput</span>
<span class="n">output_schema</span> <span class="o">=</span> <span class="n">IrisModelOutput</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">pass</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">return</span> <span class="n">IrisModelOutput</span><span class="p">(</span><span class="n">species</span><span class="o">=</span><span class="s2">"Iris setosa"</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/tests/mocks.py#L7-L38">here</a>.</p>
<p>The mock model class works just like any other MLModel class, but it
always returns a prediction of "Iris setosa". As you can see, the model
references the IrisModelInput and IrisModelOutput pydantic models for
its input and output.</p>
<p>Once we have a model, we'll need to add a configuration file to the
project that will be used by the model service to find the models that
we want to deploy. The configuration file should look like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">service_title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">REST Model Service</span><span class="w"></span>
<span class="nt">models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="nt">qualified_name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">iris_model</span><span class="w"></span>
<span class="w"> </span><span class="nt">class_path</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">tests.mocks.IrisModel</span><span class="w"></span>
<span class="w"> </span><span class="nt">create_endpoint</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">true</span><span class="w"></span>
</code></pre></div>
<p>This file can be found in the examples folder
<a href="https://github.com/schmidtbri/rest-model-service/blob/0d1705cb62e6a942f90150da3bcf51e3e1265a25/examples/rest_config.yaml#L1-L5">here</a>.</p>
<p>To start up the service locally, we need to point the service at the
configuration file using an environment variable and then execute the
uvicorn command:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>examples/rest_config.yaml
uvicorn rest_model_service.main:app --reload
</code></pre></div>
<p>The service should start and we can view the documentation page on port
8000:</p>
<p><img alt="Documentation" src="https://www.tekhnoal.com/documentation.png" width="100%"></p>
<p>As you can see, the root endpoint and model metadata endpoint are part
of the API. We also have an automatically generated endpoint for the
iris_model mocked model that we added to the service through the
configuration. The model's input and output data models are also added
to documentation:</p>
<p><img alt="Model Endpoint" src="https://www.tekhnoal.com/model_endpoint.png" width="100%"></p>
<p>We can even try a prediction out:</p>
<p><img alt="Prediction" src="https://www.tekhnoal.com/prediction.png" width="100%"></p>
<p>Of course, the prediction will always be the same because it's a mocked
model.</p>
<h1>Generating the OpenAPI Contract</h1>
<p>The FastAPI application generates the OpenAPI service
specification at runtime, and it is available for download from the
documentation page. However, we'd like to generate the specification and
save it to a file in source control. To do this we can use a script
provided by the rest_model_service package called "generate_openapi".
The script is installed along with the package and is available on the
command line in any environment where the package is installed. Here is
how to use it:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">REST_CONFIG</span><span class="o">=</span>examples/rest_config.yaml
generate_openapi --output_file<span class="o">=</span>example.yaml
</code></pre></div>
<p>The script uses the same configuration that the service uses, but it
doesn't run the webservice. It instead uses the FastAPI framework to
generate the contract and saves it to the output file.</p>
<p>The generated contract will look like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">info</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">title</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">REST Model Service</span><span class="w"></span>
<span class="w"> </span><span class="nt">version</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain"><version_placeholder></span><span class="w"></span>
<span class="nt">openapi</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">3.0.2</span><span class="w"></span>
<span class="nt">paths</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">/</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">get</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Root of API.</span><span class="w"></span>
<span class="w"> </span><span class="nt">operationId</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">get_root__get</span><span class="w"></span>
<span class="w"> </span><span class="nt">responses</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="s">'200'</span><span class="p p-Indicator">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">content</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">application/json</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">schema</span><span class="p">:</span><span class="w"> </span><span class="p p-Indicator">{}</span><span class="w"></span>
<span class="w"> </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Successful Response</span><span class="w"></span>
<span class="w"> </span><span class="nt">summary</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">Get Root</span><span class="w"></span>
<span class="w"> </span><span class="nt">/api/models</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">get</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">List of models available.</span><span class="w"></span>
<span class="nn">...</span><span class="w"></span>
</code></pre></div>
<h1>Closing</h1>
<p>In this blog post we've shown how to create a web service that is
easy to install, configure, and deploy, and that can host any machine
learning model that we throw at it. By using the MLModel base class, any
model can be made to work with the service. When deploying machine
learning models to production systems, it's a common practice to create
a custom service that "wraps" around the model code and creates an
interface that other systems can use to access the model. With the
approach described in this blog post, the service is created
automatically by using the interface definition provided by the model
itself. Furthermore, the documentation is also created automatically by
using the tooling provided by FastAPI. Lastly, we've made the service
easy to add to any project by publishing the package to the PyPI
repository, from where it can be installed with a simple "pip
install" command.</p>
<p>The service currently does not allow any code other than model code
to be hosted. When deploying a model into a production setting, we often
have extra logic that needs to be deployed alongside the model but is
not technically part of it; this is usually called the "business logic"
of the solution. Granted, it is possible to put the business logic into
the model class and deploy it that way, but combining the code into one
class makes it harder to test and reason about. To fix this shortcoming,
we can add "plugin points" that let us run our own logic before and
after the model executes, which is where the business logic can
live.</p>
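<p>As a rough illustration of the "plugin point" idea, here is a minimal sketch of a decorator class that runs hooks before and after a wrapped model's predict() method. The class and hook names are hypothetical, chosen just for this example, and are not part of any package mentioned in this post.</p>

```python
class SimpleModel:
    """A stand-in model used only to demonstrate the decorator."""

    def predict(self, data: dict) -> dict:
        return {"result": data["value"] * 2}


class BusinessLogicDecorator:
    """Wraps a model and runs optional hooks around its predict() call."""

    def __init__(self, model, before=None, after=None):
        self._model = model
        self._before = before  # callable applied to the input dict
        self._after = after    # callable applied to the output dict

    def predict(self, data: dict) -> dict:
        if self._before is not None:
            data = self._before(data)
        prediction = self._model.predict(data)
        if self._after is not None:
            prediction = self._after(prediction)
        return prediction


# business logic: clamp negative inputs before prediction, tag outputs after
decorated = BusinessLogicDecorator(
    SimpleModel(),
    before=lambda d: {"value": max(d["value"], 0)},
    after=lambda p: {**p, "source": "decorated"},
)
```

<p>Because the decorator exposes the same predict() interface as the model it wraps, the business logic stays in its own class and can be tested independently of the model code.</p>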
<p>One of the ways in which we could improve the service in the future is
to allow more configuration of the models when they are instantiated by
the service. It's currently not possible to customize a model when it is
created by the service at startup time. In the future, it would be nice
to allow the service configuration to hold parameters that would be
passed to the model classes when they are instantiated.</p>Introducing the ml_base Package2021-02-22T07:54:00-05:002021-02-22T07:54:00-05:00Brian Schmidttag:www.tekhnoal.com,2021-02-22:/introducing-ml-base-package.html<p>The ml_base package defines a common set of base classes that are useful for working with machine learning model prediction code. The base classes define a set of interfaces that help to write ML code that is reusable and testable. The core of the ml_base package is the MLModel class which defines a simple interface for doing machine learning model prediction. In this blog post, we'll show how to use the MLModel class.</p><h1>Introducing the ml_base Package</h1>
<p>These examples run within a Jupyter notebook session. To clear out the results of cells that we don't want to see, we'll use the clear_output() function provided by Jupyter:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">IPython.display</span> <span class="kn">import</span> <span class="n">clear_output</span>
</code></pre></div>
<p>To get started we'll install the ml_base package:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">ml_base</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<h2>Creating a Simple Model</h2>
<p>To show how to work with the MLModel base class we'll create a simple model that we can make predictions with. We'll use the scikit-learn library, so we'll need to install it:</p>
<div class="highlight"><pre><span></span><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">scikit</span><span class="o">-</span><span class="n">learn</span>
<span class="n">clear_output</span><span class="p">()</span>
</code></pre></div>
<p>Now we can write some code to train a model:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">sklearn</span> <span class="kn">import</span> <span class="n">datasets</span>
<span class="kn">from</span> <span class="nn">sklearn</span> <span class="kn">import</span> <span class="n">svm</span>
<span class="kn">import</span> <span class="nn">pickle</span>
<span class="c1"># loading the Iris dataset</span>
<span class="n">iris</span> <span class="o">=</span> <span class="n">datasets</span><span class="o">.</span><span class="n">load_iris</span><span class="p">()</span>
<span class="c1"># instantiating an SVM model from scikit-learn</span>
<span class="n">svm_model</span> <span class="o">=</span> <span class="n">svm</span><span class="o">.</span><span class="n">SVC</span><span class="p">(</span><span class="n">gamma</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">C</span><span class="o">=</span><span class="mf">1.0</span><span class="p">)</span>
<span class="c1"># fitting the model</span>
<span class="n">svm_model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">iris</span><span class="o">.</span><span class="n">data</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="n">iris</span><span class="o">.</span><span class="n">target</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
<span class="c1"># serializing the model and saving it</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"svc_model.pickle"</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">svm_model</span><span class="p">,</span> <span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div>
<h2>Creating a Wrapper Class for Your Model</h2>
<p>Now that we have a model object, we'll define a class that implements the prediction functionality for the code:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">numpy</span> <span class="kn">import</span> <span class="n">array</span>
<span class="k">class</span> <span class="nc">IrisModel</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">(</span><span class="s1">''</span><span class="p">)</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="nb">dict</span><span class="p">):</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="n">data</span><span class="p">[</span><span class="s2">"sepal_length"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"sepal_width"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"petal_length"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"petal_width"</span><span class="p">]])</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="n">targets</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'setosa'</span><span class="p">,</span> <span class="s1">'versicolor'</span><span class="p">,</span> <span class="s1">'virginica'</span><span class="p">]</span>
<span class="n">species</span> <span class="o">=</span> <span class="n">targets</span><span class="p">[</span><span class="n">y_hat</span><span class="p">]</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"species"</span><span class="p">:</span> <span class="n">species</span><span class="p">}</span>
</code></pre></div>
<p>The class above wraps the pickled model object and makes the model easier to use by converting the inputs and outputs.
To use the model, all we need to do is this:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="p">{</span>
<span class="s2">"sepal_length"</span><span class="p">:</span><span class="mf">1.0</span><span class="p">,</span>
<span class="s2">"sepal_width"</span><span class="p">:</span><span class="mf">1.1</span><span class="p">,</span>
<span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span>
<span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">})</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'species': 'virginica'}
</code></pre></div>
<h2>Creating an MLModel Class for Your Model</h2>
<p>The model is already much easier to use because it provides the prediction from a class. The user of the model doesn't
need to worry about loading the pickled model object, or converting the model's input into a numpy array. However, we
are still not using the MLModel abstract base class. Next, we'll implement part of the MLModel interface to show how
it works:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_base</span> <span class="kn">import</span> <span class="n">MLModel</span>
<span class="k">class</span> <span class="nc">IrisModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"Iris Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"iris_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"A model to predict the species of a flower based on its measurements."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"1.0.0"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">(</span><span class="s1">''</span><span class="p">)</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="nb">dict</span><span class="p">):</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="n">data</span><span class="p">[</span><span class="s2">"sepal_length"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"sepal_width"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"petal_length"</span><span class="p">],</span> <span class="n">data</span><span class="p">[</span><span class="s2">"petal_width"</span><span class="p">]])</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="n">targets</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'setosa'</span><span class="p">,</span> <span class="s1">'versicolor'</span><span class="p">,</span> <span class="s1">'virginica'</span><span class="p">]</span>
<span class="n">species</span> <span class="o">=</span> <span class="n">targets</span><span class="p">[</span><span class="n">y_hat</span><span class="p">]</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"species"</span><span class="p">:</span> <span class="n">species</span><span class="p">}</span>
</code></pre></div>
<p>The MLModel base class defines a set of properties that must be provided by any class that inherits from it. Because the IrisModel class now provides this metadata about the model, we can access it directly from the model object like this:</p>
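<p>To make the "must be provided" requirement concrete, here is a simplified sketch of how a base class can force subclasses to implement metadata properties using Python's standard abc module. This is an illustration of the pattern only, not the actual ml_base source, and the class names are hypothetical.</p>

```python
from abc import ABC, abstractmethod


class ModelBase(ABC):
    """A simplified base class that requires metadata properties."""

    @property
    @abstractmethod
    def qualified_name(self) -> str:
        """A unique, URL-safe identifier for the model."""

    @property
    @abstractmethod
    def display_name(self) -> str:
        """A human-readable name for the model."""


class ExampleModel(ModelBase):
    @property
    def qualified_name(self) -> str:
        return "example_model"

    @property
    def display_name(self) -> str:
        return "Example Model"


model = ExampleModel()
```

<p>A subclass that fails to implement every abstract property cannot be instantiated; Python raises a TypeError at construction time, which is how this pattern guarantees that the metadata is always available on a model object.</p>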
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>iris_model
</code></pre></div>
<p>The qualified name of the model uniquely identifies the instance of the model within the system. Right now the qualified name is hardcoded in the model's class, but this can be made more dynamic in the future. The qualified name should also be a string that is easy to embed in a URL, so it shouldn't have spaces or special characters.</p>
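<p>The URL-safety rule can be checked with a small helper. The function and the exact character set below are assumptions made for this sketch (lowercase letters, digits, and underscores, starting with a letter); they are not part of the ml_base package.</p>

```python
import re

# Hypothetical validator (not part of ml_base): accepts names made of
# lowercase letters, digits, and underscores, starting with a letter,
# so the name can be embedded in a URL path without escaping.
QUALIFIED_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def is_valid_qualified_name(name: str) -> bool:
    return QUALIFIED_NAME_PATTERN.match(name) is not None
```
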
<p>The model's display name is also available from the model object:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">display_name</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Iris Model
</code></pre></div>
<p>The display name of a model should be a string that looks good in a user interface.</p>
<p>The model description is also available from the model object:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">description</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>A model to predict the species of a flower based on its measurements.
</code></pre></div>
<p>The model version is also available as a string from the model object:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">version</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">1.0.0</span><span class="w"></span>
</code></pre></div>
<p>As you can see, we didn't implement the input_schema and output_schema properties above, we'll add those next.</p>
<h2>Adding Schemas to Your Model</h2>
<p>To add schema information to the model class, we'll use the pydantic package. The pydantic package allows us to state the schema requirements of the model's input and output programmatically as Python classes:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
<span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">ValidationError</span>
<span class="kn">from</span> <span class="nn">enum</span> <span class="kn">import</span> <span class="n">Enum</span>
<span class="k">class</span> <span class="nc">ModelInput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">sepal_length</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">5.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The length of the sepal of the flower."</span><span class="p">)</span>
<span class="n">sepal_width</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">6.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The width of the sepal of the flower."</span><span class="p">)</span>
<span class="n">petal_length</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">6.8</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The length of the petal of the flower."</span><span class="p">)</span>
<span class="n">petal_width</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">gt</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">lt</span><span class="o">=</span><span class="mf">3.0</span><span class="p">,</span> <span class="n">description</span><span class="o">=</span><span class="s2">"The width of the petal of the flower."</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">Species</span><span class="p">(</span><span class="nb">str</span><span class="p">,</span> <span class="n">Enum</span><span class="p">):</span>
<span class="n">iris_setosa</span> <span class="o">=</span> <span class="s2">"Iris setosa"</span>
<span class="n">iris_versicolor</span> <span class="o">=</span> <span class="s2">"Iris versicolor"</span>
<span class="n">iris_virginica</span> <span class="o">=</span> <span class="s2">"Iris virginica"</span>
<span class="k">class</span> <span class="nc">ModelOutput</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
<span class="n">species</span><span class="p">:</span> <span class="n">Species</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s2">"The predicted species of the flower."</span><span class="p">)</span>
</code></pre></div>
<p>The ModelInput class inherits from the pydantic BaseModel class and it defines four required fields, all of them floating point numbers. The pydantic package allows for defining upper bounds and lower bounds for the values accepted by each field, and also a description for the field.</p>
<p>The ModelOutput is made up of a single field, an enumerated string that contains the predicted species of the flower.</p>
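<p>Before wiring the schemas into the model class, we can see the bounds enforcement in isolation. The snippet below repeats the ModelInput field constraints from above so that it is self-contained; constructing an instance with an out-of-range value raises a pydantic ValidationError, so invalid inputs never reach the model.</p>

```python
from pydantic import BaseModel, Field, ValidationError


# the same bounds as the ModelInput schema defined above
class ModelInput(BaseModel):
    sepal_length: float = Field(gt=5.0, lt=8.0)
    sepal_width: float = Field(gt=2.0, lt=6.0)
    petal_length: float = Field(gt=1.0, lt=6.8)
    petal_width: float = Field(gt=0.0, lt=3.0)


# a value inside every field's bounds is accepted
valid = ModelInput(sepal_length=6.0, sepal_width=2.1,
                   petal_length=1.2, petal_width=1.3)

# sepal_length=4.0 violates the gt=5.0 constraint, so pydantic raises
try:
    ModelInput(sepal_length=4.0, sepal_width=2.1,
               petal_length=1.2, petal_width=1.3)
    error_raised = False
except ValidationError:
    error_raised = True
```
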
<p>Now that we have the ModelInput and ModelOutput schemas defined as pydantic BaseModel classes, we'll add them to the IrisModel class by returning them from the input_schema and output_schema properties:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_base.ml_model</span> <span class="kn">import</span> <span class="n">MLModel</span><span class="p">,</span> <span class="n">MLModelSchemaValidationException</span>
<span class="k">class</span> <span class="nc">IrisModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"Iris Model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"iris_model"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"A model to predict the species of a flower based on its measurements."</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">version</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="s2">"1.0.0"</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">ModelInput</span>
<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="n">ModelOutput</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">abspath</span><span class="p">(</span><span class="s1">''</span><span class="p">)</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">:</span> <span class="n">ModelInput</span><span class="p">):</span>
<span class="c1"># creating a numpy array using the fields in the input object</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="n">data</span><span class="o">.</span><span class="n">sepal_length</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">sepal_width</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">petal_length</span><span class="p">,</span>
<span class="n">data</span><span class="o">.</span><span class="n">petal_width</span><span class="p">])</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="c1"># making a prediction, at this point its a number</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="c1"># converting the prediction from a number to a string</span>
<span class="n">targets</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"Iris setosa"</span><span class="p">,</span> <span class="s2">"Iris versicolor"</span><span class="p">,</span> <span class="s2">"Iris virginica"</span><span class="p">]</span>
<span class="n">species</span> <span class="o">=</span> <span class="n">targets</span><span class="p">[</span><span class="n">y_hat</span><span class="p">]</span>
<span class="c1"># returning the prediction inside an object</span>
<span class="k">return</span> <span class="n">ModelOutput</span><span class="p">(</span><span class="n">species</span><span class="o">=</span><span class="n">species</span><span class="p">)</span>
</code></pre></div>
<p>Notice that we are also using the pydantic models to validate the input before prediction and to
create an object that will be returned from the model's predict() method.</p>
<p>If we use the model class now, we'll get this result:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">ModelInput</span><span class="p">(</span>
<span class="n">sepal_length</span><span class="o">=</span><span class="mf">6.0</span><span class="p">,</span>
<span class="n">sepal_width</span><span class="o">=</span><span class="mf">2.1</span><span class="p">,</span>
<span class="n">petal_length</span><span class="o">=</span><span class="mf">1.2</span><span class="p">,</span>
<span class="n">petal_width</span><span class="o">=</span><span class="mf">1.3</span><span class="p">))</span>
<span class="n">prediction</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>ModelOutput(species=&lt;Species.iris_virginica: 'Iris virginica'&gt;)
</code></pre></div>
<p>By adding input and output schemas to the model, we can automate many other operations later on, and we can query
the model object itself for its schemas. The pydantic package generates a JSON schema from the fields of the model's input and output schema objects:</p>
<div class="highlight"><pre><span></span><code><span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>{'title': 'ModelInput',
'type': 'object',
'properties': {'sepal_length': {'title': 'Sepal Length',
'description': 'The length of the sepal of the flower.',
'exclusiveMinimum': 5.0,
'exclusiveMaximum': 8.0,
'type': 'number'},
'sepal_width': {'title': 'Sepal Width',
'description': 'The width of the sepal of the flower.',
'exclusiveMinimum': 2.0,
'exclusiveMaximum': 6.0,
'type': 'number'},
'petal_length': {'title': 'Petal Length',
'description': 'The length of the petal of the flower.',
'exclusiveMinimum': 1.0,
'exclusiveMaximum': 6.8,
'type': 'number'},
'petal_width': {'title': 'Petal Width',
'description': 'The width of the petal of the flower.',
'exclusiveMinimum': 0.0,
'exclusiveMaximum': 3.0,
'type': 'number'}},
'required': ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']}
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="n">model</span><span class="o">.</span><span class="n">output_schema</span><span class="o">.</span><span class="n">schema</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'ModelOutput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'species'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/Species'</span><span class="p">}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'required'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'species'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'Species'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Species'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'An enumeration.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'Iris setosa'</span><span class="p">,</span><span class="w"> </span><span class="s1">'Iris versicolor'</span><span class="p">,</span><span class="w"> </span><span class="s1">'Iris virginica'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}</span><span class="w"></span>
</code></pre></div>
<p>Although it is not required to use the pydantic package to create model schemas, it is recommended. The pydantic
package is installed as a dependency of the ml_base package.</p>
<h2>Using the ModelManager Class</h2>
<p>The ModelManager class is provided to help manage model objects. It is a singleton class that is designed to enable
model instances to be instantiated once during the lifecycle of a process and accessed many times:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_base.utilities</span> <span class="kn">import</span> <span class="n">ModelManager</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
</code></pre></div>
<p>Because it is a singleton object, a reference to the same object is returned no matter how many times we instantiate it:</p>
<div class="highlight"><pre><span></span><code><span class="nb">print</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="n">model_manager</span><span class="p">))</span>
<span class="n">another_model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="nb">id</span><span class="p">(</span><span class="n">another_model_manager</span><span class="p">))</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="mf">4505980208</span><span class="w"></span>
<span class="mf">4505980208</span><span class="w"></span>
</code></pre></div>
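<p>The singleton behavior can be implemented in a few ways; the sketch below shows one common approach, overriding __new__() so that every instantiation returns the same object. This is only an illustration of the pattern, not the actual ml_base implementation:</p>
<div class="highlight"><pre><span></span><code>class SingletonExample:
    _instance = None

    def __new__(cls):
        # create the instance only on the first call, then reuse it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

first = SingletonExample()
second = SingletonExample()
assert first is second  # both names point at the same object
</code></pre></div>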
<p>You can add model instances to the ModelManager singleton by asking it to instantiate the model class:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span><span class="o">.</span><span class="n">load_model</span><span class="p">(</span><span class="s2">"__main__.IrisModel"</span><span class="p">)</span>
</code></pre></div>
<p>The load_model() method is able to find the MLModel class that we defined above and instantiate it; after that, it stores a reference to the instance internally.</p>
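<p>Finding a class from a string like "__main__.IrisModel" is typically done with dynamic importing. The helper below sketches the general technique; instantiate_by_path() is a hypothetical name for illustration and makes no claim about ml_base's actual implementation:</p>
<div class="highlight"><pre><span></span><code>import importlib

def instantiate_by_path(class_path):
    # split "package.module.ClassName" into its module and class parts
    module_name, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    model_class = getattr(module, class_name)
    return model_class()

# works with any importable class, e.g. from the standard library:
ordered_dict = instantiate_by_path("collections.OrderedDict")
</code></pre></div>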
<p>The ModelManager is also able to save references to model instances that were instantiated in some other way by using the add_model() method:</p>
<div class="highlight"><pre><span></span><code><span class="n">another_iris_model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">add_model</span><span class="p">(</span><span class="n">another_iris_model</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">ValueError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>A model with the same qualified name is already in the ModelManager singleton.
</code></pre></div>
<p>In this case, the ModelManager did not save the instance of the IrisModel because we already had an instance of the model. The models are uniquely identified by their qualified name properties.</p>
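<p>The duplicate check itself is easy to picture as a dictionary keyed by qualified name. The registry below is a simplified sketch of that idea, not ml_base's code; the class and method names are illustrative:</p>
<div class="highlight"><pre><span></span><code>class ModelRegistryExample:
    def __init__(self):
        self._models = {}

    def add_model(self, qualified_name, model):
        # refuse to register two models under the same qualified name
        if qualified_name in self._models:
            raise ValueError(
                "A model with the same qualified name is already registered.")
        self._models[qualified_name] = model

registry = ModelRegistryExample()
registry.add_model("iris_model", object())
try:
    registry.add_model("iris_model", object())
except ValueError as e:
    print(e)
</code></pre></div>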
<p>The ModelManager instance can list the models that it contains with the get_models() method. The details of the instance of IrisModel that we just created are returned:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>[{'display_name': 'Iris Model',
'qualified_name': 'iris_model',
'description': 'A model to predict the species of a flower based on its measurements.',
'version': '1.0.0'}]
</code></pre></div>
<p>The ModelManager instance can return the metadata of any of the models. The metadata includes the input and output schemas as well:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span><span class="o">.</span><span class="n">get_model_metadata</span><span class="p">(</span><span class="s2">"iris_model"</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="s1">'display_name'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Iris Model'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'qualified_name'</span><span class="p">:</span><span class="w"> </span><span class="s1">'iris_model'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'A model to predict the species of a flower based on its measurements.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'version'</span><span class="p">:</span><span class="w"> </span><span class="s1">'1.0.0'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'input_schema'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'ModelInput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'sepal_length'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sepal Length'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'The length of the sepal of the flower.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMinimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">5.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMaximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">8.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'sepal_width'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Sepal Width'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'The width of the sepal of the flower.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMinimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">2.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMaximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">6.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'petal_length'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Petal Length'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'The length of the petal of the flower.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMinimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">1.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMaximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">6.8</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'petal_width'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Petal Width'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'The width of the petal of the flower.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMinimum'</span><span class="p">:</span><span class="w"> </span><span class="mf">0.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'exclusiveMaximum'</span><span class="p">:</span><span class="w"> </span><span class="mf">3.0</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'number'</span><span class="p">}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'required'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'sepal_length'</span><span class="p">,</span><span class="w"> </span><span class="s1">'sepal_width'</span><span class="p">,</span><span class="w"> </span><span class="s1">'petal_length'</span><span class="p">,</span><span class="w"> </span><span class="s1">'petal_width'</span><span class="p">]},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'output_schema'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'ModelOutput'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'object'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'properties'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'species'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'$ref'</span><span class="p">:</span><span class="w"> </span><span class="s1">'#/definitions/Species'</span><span class="p">}},</span><span class="w"></span>
<span class="w"> </span><span class="s1">'required'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'species'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'definitions'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'Species'</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s1">'title'</span><span class="p">:</span><span class="w"> </span><span class="s1">'Species'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'description'</span><span class="p">:</span><span class="w"> </span><span class="s1">'An enumeration.'</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="s1">'enum'</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'Iris setosa'</span><span class="p">,</span><span class="w"> </span><span class="s1">'Iris versicolor'</span><span class="p">,</span><span class="w"> </span><span class="s1">'Iris virginica'</span><span class="p">],</span><span class="w"></span>
<span class="w"> </span><span class="s1">'type'</span><span class="p">:</span><span class="w"> </span><span class="s1">'string'</span><span class="p">}}}}</span><span class="w"></span>
</code></pre></div>
<p>The ModelManager can return a reference to the instance of any model that it is holding:</p>
<div class="highlight"><pre><span></span><code><span class="n">iris_model</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="s2">"iris_model"</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">iris_model</span><span class="o">.</span><span class="n">display_name</span><span class="p">)</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>Iris Model
</code></pre></div>
<p>The instance is identified by the qualified name of the model.</p>
<p>Lastly, a model instance can be removed by calling the remove_model() method:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span><span class="o">.</span><span class="n">remove_model</span><span class="p">(</span><span class="s2">"iris_model"</span><span class="p">)</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>[]
</code></pre></div>
<p>To clear the ModelManager instance, you can call the clear_instance() method:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span><span class="o">.</span><span class="n">clear_instance</span><span class="p">()</span>
</code></pre></div>
<p>To create a new singleton you have to instantiate the ModelManager again:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
</code></pre></div>
<h1>10 Ways to Deploy a Machine Learning Model</h1>
<p>Brian Schmidt, 2020-10-28, tag:www.tekhnoal.com,2020-10-28:/10-ways-to-deploy-an-ml-model.html</p>
<p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
posts</a>.</p>
<p>This blog post also references previous blog posts in which I
deployed the same ML model in several different ways. I deployed
the model as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog
post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog
post</a>,
as a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog
post</a>,
as a gRPC service in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog
post</a>,
as a MapReduce job in this <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">blog
post</a>,
as a Websocket service in this <a href="https://www.tekhnoal.com/websocket-ml-model-deployment.html">blog
post</a>,
as a ZeroRPC service in this <a href="https://www.tekhnoal.com/zerorpc-ml-model-deployment.html">blog
post</a>,
and as an Apache Beam job in this <a href="https://www.tekhnoal.com/apache-beam-ml-model-deployment.html">blog
post</a>.</p>
<h1>Introduction</h1>
<p>In previous blog posts we've seen how it is possible to deploy the same
model in ten different ways. The model itself was developed one time and
released as a package, which was then used in each deployment. These
blog posts started as an exercise in finding new and interesting ways to
deploy an ML model, so we decided to write this blog post about some of
the things that we've learned along the way.</p>
<p>In order to be able to deploy the same model in 10 different ways, we
needed to build the model so that it was compatible with all the
different ways we wanted to deploy it. We also needed to make it easy to
install and to make sure that the model published metadata about itself.
All of these features of the model became very important once we needed
to deploy it into a real software system.</p>
<p>In a <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous blog
post</a>,
we developed a model that we called the "iris_model". This model was
designed for the purposes of the blog posts that we planned to write
later on, so it followed several best practices that we will be
describing in this blog post. To make sure that the model was compatible
with every deployment option we wanted to pursue, we needed to build it
to work as a software component, as a software library, and as a
software package. In this blog post we'll describe how and why these
approaches make it easier to deploy the model.</p>
<p>To be able to abstract away the details of an ML model from the code
that is using it, we developed the MLModel base class in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">these</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">blog
posts</a>.
The base class is used to create a standard interface for the prediction
code of an ML model, which makes it easier to deploy the model. This
approach made it possible to write the model deployment code in such a
way that it can support any model that implements the MLModel interface.
This approach can be thought of as applying the strategy design pattern
to machine learning models. In this blog post we'll describe how the
strategy pattern is useful in ML model deployments.</p>
<p>When we started implementing all of the different deployments for the
model, we started seeing patterns around the way that the model is
accessible to its clients. These patterns coalesced into a few different
classes of model deployments which help to talk about the strengths and
weaknesses of each approach to deploying the model. In this blog post,
we'll describe an ontology that can help developers to talk about and
choose the best approach to deploying an ML model.</p>
<h1>ML Models as Software Components</h1>
<p>To create an ML model that is easy to deploy, we need to build it as a
software component. A software component is simply a small part of a
bigger software system that can be easily isolated from the rest of the
system. That is to say, the component is not deeply tied to the rest of
the system and it exposes an interface so that the rest of the system
can access it. A software component is designed to fulfill a small part
of the requirements of a larger software system, and to be easy to
integrate with other software components in the system. Good software
components are designed to be reused in many contexts and must follow
good design patterns to achieve this goal.</p>
<p>One of the most important parts of a software component is the public
API of the component. The API of the IrisModel class has proven to be
very simple and adaptable to a wide variety of technologies. For
example, when we deployed the IrisModel as a <a href="https://www.tekhnoal.com/websocket-ml-model-deployment.html">Websocket
service</a>,
we didn't need to rewrite any of the model code to adapt it to the model
component's API. The reason for this is that the <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_predict.py#L10-L67">IrisModel
class</a>
inherits from the <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_predict.py#L10-L67">MLModel
interface</a>.
This interface has a few requirements: your model must instantiate
itself, it must receive prediction requests, and it must publish certain
metadata about itself. By creating a standard interface around these
requirements, the MLModel interface makes it possible to deploy a wide
range of machine learning models in the same way.</p>
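<p>The shape of such an interface can be sketched with Python's abc module. The class below captures the requirements mentioned above; the property and method names are illustrative and do not necessarily match ml_base's MLModel exactly:</p>
<div class="highlight"><pre><span></span><code>from abc import ABC, abstractmethod

class PredictionModel(ABC):
    """Illustrative interface: construct, publish metadata, predict."""

    @property
    @abstractmethod
    def qualified_name(self):
        """Metadata that uniquely identifies the model."""
        raise NotImplementedError()

    @abstractmethod
    def predict(self, data):
        """Receive a prediction request and return a prediction."""
        raise NotImplementedError()

class ExampleModel(PredictionModel):
    @property
    def qualified_name(self):
        return "example_model"

    def predict(self, data):
        return {"result": 42}

model = ExampleModel()
</code></pre></div>
<p>Any deployment code written against PredictionModel can then host ExampleModel, or any other implementation, without changes.</p>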
<p>When we designed the MLModel interface we made sure that it would not
enforce any specific technology on the user. For example, there is no
requirement that says that the models that implement the MLModel
interface must use a specific serialization and deserialization
standard. In all of the blog posts where we deployed the iris_model
package we used JSON for serialization and deserialization, but this was
an implementation detail that can easily be changed since the model code
itself does not do any serialization or deserialization. Another
important aspect of the design is the fact that the MLModel interface
does not enforce any particular integration pattern on the code. For
example, we were able to create a <a href="https://www.tekhnoal.com/using-ml-model-abc.html">RESTful
service</a>
and <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">a batch
job</a>
with the same model. In fact, the choice of deployment technology had no
effect on the model codebase. This makes it possible to reuse the same
model in many different contexts.</p>
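<p>To make this concrete, here is a sketch of how a deployment layer can own serialization while the model only ever sees native Python objects. DummyModel and handle_request are hypothetical names used only for this illustration:</p>
<div class="highlight"><pre><span></span><code>import json

class DummyModel:
    def predict(self, data):
        return {"species": "Iris setosa"}

def handle_request(model, request_body):
    # the deployment code deserializes the request...
    data = json.loads(request_body)
    # ...the model itself never touches JSON...
    prediction = model.predict(data)
    # ...and the deployment code serializes the response
    return json.dumps(prediction)

response = handle_request(DummyModel(), '{"sepal_length": 6.0}')
</code></pre></div>
<p>Swapping JSON for another format only changes handle_request(); the model code is untouched.</p>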
<p>Certain technologies required advanced knowledge of the schema of the
data that the model component would receive and send back. For example,
the <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">gRPC
service</a>
required that we compile a protocol buffer from the input and output
schemas of the model. In this case we were able to isolate the
requirements of the deployment from the model itself by leveraging the
schema metadata provided by the model. In other cases, the schema
metadata was only useful for documentation purposes, since a user of the
model would need to know about the model's input and output schemas to be
able to use it. Because we return schema information from the API of the
ML model software component, we were able to handle this situation
smoothly.</p>
<h1>ML Models as Libraries</h1>
<p>To create an ML model that is easy to deploy, we must build it so that
it works as a software library. A software library is a collection of
reusable software components that can be used in many different
contexts. A library is designed and built so that it is reusable.</p>
<p>By treating a machine learning model as a library we gain many different
benefits. For example, models can easily be reused in many different
services and applications without having to copy and paste the model
code and parameters. There is no need to embed an ML model inside of a
codebase in such a way that it cannot be reused somewhere else because
the library can be installed into a project. When we used the
iris_model library in our deployments, all we had to do was execute
"from iris_model.iris_predict import IrisModel" and the model would be
available to be used.</p>
<p>Another benefit that we gain when we treat ML models as libraries is
that it is easy to version them. Since libraries are built and released
many times, everyone understands how to version them and release them
for use by other developers. The semantic versioning standard has been
used widely in the software world and we used it to version the
iris_model package. One of the main benefits of a strong versioning
standard for ML models is that everyone understands that the ML model
will be evolving in the future, and that they can access newer versions
of the model by installing a newer version of the library.</p>
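<p>Semantic version strings compare correctly once they are parsed into tuples of integers, which is a handy trick whenever code needs to reason about model versions. A small, self-contained illustration:</p>
<div class="highlight"><pre><span></span><code>def parse_semver(version):
    # "1.10.0" becomes the tuple (1, 10, 0), which compares numerically
    major, minor, patch = version.split(".")
    return (int(major), int(minor), int(patch))

versions = ["1.10.0", "1.2.0", "0.9.1"]
ordered = sorted(versions, key=parse_semver)
# a plain string sort would incorrectly place "1.10.0" before "1.2.0"
</code></pre></div>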
<p>By thinking about ML models as libraries we break the pattern of making
custom models for very specific use cases. If we are going to spend the
time and effort to build a complex ML model, why not make it easy to
reuse in different contexts? This requires a bit of realignment in most
cases, but it is certainly possible.</p>
<h1>ML Models as Packages</h1>
<p>To create an ML model that is easy to deploy, we must build it so that
it is a software package. A software package is a distributable file that
contains the necessary files to install a software component or library
in the programming environment. Software packages are usually managed
using package managers. Software libraries are usually released as
packages as well, to make them easy to install.</p>
<p>One of the most important factors that allowed us to deploy the
IrisModel in 10 different ways is the fact that the model code is
isolated inside of a Python package. The first two blog posts were
concerned with creating a model codebase that could be installed into
any python environment. Once we could install the model as a python
package with the pip install command, it was easy to reuse the same
model in many different contexts.</p>
<p>An important part of this approach is the fact that we can install all
of the dependencies of the model package automatically when the model
package is installed. Often, a model that runs on one person's computer
won't run on another's because dependency management is not taken care
of. To create a Python package, the dependencies of the package must be
listed in the setup.py file of the project; because of this, the ML
model is much easier to work with and can be installed by anybody. For
example, the iris_model package
lists the exact version of scikit-learn that it needs, which takes the
guesswork out of installing and using it.</p>
<p>Lastly, by distributing the ML model as a package, we're able to
download and install the model parameters along with the model code.
Oftentimes, an ML model is just a file that contains serialized model
parameters (often a pickle file). However, distributing a model this way
ignores the fact that we might need to install some custom prediction
code along with the model parameters. By using a package manager, we are
able to ensure that the model parameters and the prediction codebase are
installed correctly into the programming environment. In the case of the
IrisModel package, the model parameters were installed by including the
file in the package's manifest which ensures that the parameters are
copied into the distributable file.</p>
<h1>ML Models and the Strategy Pattern</h1>
<p>The strategy pattern is a design pattern used in object oriented design.
It is a behavioral design pattern that allows a software component to
select an appropriate algorithm at runtime to execute a task. The
strategy pattern is applied by defining an interface that every
implementation of the strategy must inherit and implement. The MLModel
class that the IrisModel class inherits from fulfills this purpose. The
benefit that we gain from using the strategy pattern is that we can
write code that doesn't care about the details of a machine learning
model's prediction algorithm, because it can use any algorithm that
meets the requirements of the interface.</p>
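<p>A minimal sketch of this arrangement is shown below; the real MLModel
base class also exposes metadata and input/output schemas, and the toy
IrisModel here only stands in for the real implementation:</p>

```python
# Minimal sketch of the strategy pattern for ML models; the real
# MLModel base class also exposes metadata and input/output schemas.
from abc import ABC, abstractmethod

class MLModel(ABC):
    """Interface that every deployable model strategy must implement."""

    @abstractmethod
    def predict(self, data: dict) -> dict:
        """Make a prediction from a dictionary of input features."""

class IrisModel(MLModel):
    """Toy stand-in for the real iris_model implementation."""

    def predict(self, data: dict) -> dict:
        # a real implementation would run the serialized classifier here
        return {"species": "setosa"}

# deployment code depends only on the interface, not on the algorithm
def deploy(model: MLModel, data: dict) -> dict:
    return model.predict(data)
```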
<p>In practice, this means that we were able to deploy an ML model simply
by installing the package and writing a reference to the class that
implements the MLModel interface into the configuration. The deployment
code reads the configuration at runtime, loads the right model, and
makes it available to the client. Some model deployments that we built
were even able to handle multiple models. For example, the ZeroRPC
service that we created in <a href="https://www.tekhnoal.com/zerorpc-ml-model-deployment.html">this blog
post</a>
is able to dynamically create an endpoint for every model that is listed
in the configuration.</p>
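<p>Configuration-driven loading of this kind can be sketched as follows;
the configuration structure shown is illustrative:</p>

```python
# Sketch of configuration-driven model loading; the config entries
# shown are illustrative.
import importlib

config = [
    {"module_name": "iris_model.iris_predict", "class_name": "IrisModel"},
]

def load_models(entries):
    """Instantiate every model class listed in the configuration."""
    models = []
    for entry in entries:
        module = importlib.import_module(entry["module_name"])
        model_class = getattr(module, entry["class_name"])
        models.append(model_class())
    return models

# in an environment with the iris_model package installed,
# load_models(config) would return a list containing an IrisModel object
```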
<p>By creating models as components and making them available as packages,
we're able to make models reusable in many different situations. When we
use the strategy pattern, we get a similar benefit, because the pattern
makes it possible to reuse the model deployment code to deploy any model
in the future. As long as the model we want to deploy implements the
MLModel interface, we are able to reuse the deployment codebase to
deploy it. In the future, it would be easy to build reusable codebases
that can deploy models: the code would be configured with the model that
needs to be deployed, and there would be no need to create a custom
service for each model we wanted to deploy.</p>
<h1>An Ontology of ML Model Deployments</h1>
<p>Now that we have deployed the same model in ten different ways, we can
compare and contrast the ways the model was deployed. This section tries
to build a complete picture of the effect that a deployment option can
have on the way we can use the model.</p>
<h2>Interactive and Non-Interactive Model Deployments</h2>
<p>ML models can be deployed in an interactive manner and a non-interactive
manner. A model is deployed "interactively" when a client of the model
is able to request predictions from the model and get a prediction
directly back without waiting an indeterminate amount of time to get the
prediction. Interactive model deployments make the model directly
available to the client through an API and make it possible for the
client to send in any data allowed by the model's input schema to make a
prediction. In "non-interactive" model deployments, the client is not
able to send data to the model directly, which usually means that the
client has to access predictions that were previously stored in a data
store. The distinction between interactive and non-interactive model
deployments can have a large impact on the design of the client systems
that make use of the ML model. If a model is deployed non-interactively,
the clients of the system don't have direct access to the model and they
can't send arbitrary data to it; the only predictions available from the
model are the ones previously made and stored.</p>
<p>An example of an interactive deployment is the REST service that we
built in <a href="https://www.tekhnoal.com/using-ml-model-abc.html">this blog
post</a>.
The service is designed to run continuously, which means that a client
can contact the service anytime, request a prediction, and get a
prediction back directly from the model. An example of a non-interactive
deployment is the batch job that we built in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
since a user of the model can only access the predictions that are saved
by the batch job. At first sight, it would seem that the task queue
deployment that we built in <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">this blog
post</a>
is non-interactive because the user has to wait to get a prediction.
However, the task queue is actually interactive because the predictions
are always made from the input provided by the client and the
predictions become available to the client after the asynchronous task
completes.</p>
<h2>Single-Record and Batch Model Deployments</h2>
<p>Single-record model deployments are designed to receive inputs from
clients, make a single prediction, and return the results to the client.
Batch model deployments are designed to receive many inputs from the
client system, make predictions and return the results to the client as
a batch of records. Batch systems often make better use of resources
because they are able to
<a href="https://en.wikipedia.org/wiki/Array_programming">vectorize</a>
their operations, which makes them more efficient.
Single-record systems are usually more responsive to clients because
they are able to quickly return a result.</p>
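<p>The efficiency gain from vectorization can be seen in a toy example
like the one below, which assumes NumPy is available; the batch
computation replaces a per-record Python loop with a single call:</p>

```python
# Toy illustration of vectorization (assumes NumPy is available): the
# batch computation replaces a per-record Python loop with one call.
import numpy as np

inputs = np.random.rand(10000, 4)  # 10,000 records with 4 features each

# single-record style: process one input at a time
single_record_results = [row.sum() for row in inputs]

# batch style: one vectorized operation over the whole dataset,
# which is typically much faster and yields the same results
batch_results = inputs.sum(axis=1)
```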
<p>System performance can be measured in two ways: throughput and latency.
Throughput is defined as the number of records that can be processed by
the system in a given period of time. Latency is the amount of time it
takes the system to process a single request. A single-record model
deployment is often optimizing for the total latency of a single
request, and a batch model deployment is often optimizing for the total
throughput of the system.</p>
<p>An example of a single-record model deployment is the gRPC service that
we built in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog
post</a>.
The gRPC service only allows one prediction to be made for each RPC call
to the model; this is enforced in the protocol buffer interface definition of
the service which does not allow arrays of prediction inputs to be
received by the service. An example of a batch model deployment is the
MapReduce job we built in this <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">blog
post</a>.
The MapReduce system is specifically designed to allow massive parallel
batch jobs that run across multiple computers in a cluster. The system
is most efficient when processing large datasets because of the amount
of time it takes to start a processing run. The distinction between
single-record and batch deployments can sometimes be hard to draw
because we can support multiple predictions in the gRPC service API, as
long as the client is willing to wait for all of the predictions to
complete. As always, there are many tradeoffs that we can make between
the two extremes.</p>
<h2>Synchronous and Asynchronous Model Deployments</h2>
<p>Synchronous ML model deployments are characterized by the client being
blocked while the model is making a prediction. An asynchronous model
deployment allows the client system to request a prediction from the
model and not wait for the prediction to complete to continue
processing. Typically, an asynchronous deployment allows the client to
retrieve the model's prediction after it completes, but this is not
required for the system to be considered asynchronous. The predictions
made by a synchronous model deployment are returned to the client as
soon as they are completed.</p>
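<p>The difference between the two styles can be sketched with a
stand-in prediction function; the names below are illustrative:</p>

```python
# Contrast of synchronous and asynchronous prediction calls; the
# predict function is a stand-in for a real model.
import time
from concurrent.futures import ThreadPoolExecutor

def predict(features: dict) -> dict:
    time.sleep(0.01)  # simulated model latency
    return {"species": "setosa"}

# synchronous: the client blocks until the prediction is returned
sync_result = predict({"sepal_length": 1.1})

# asynchronous: the client submits the request, keeps working, and
# retrieves the prediction later (analogous to a task queue's
# result backend)
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(predict, {"sepal_length": 1.1})
    # ... the client is free to do other work here ...
    async_result = future.result()
```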
<p>An example of a synchronous model deployment is the AWS Lambda
deployment we built in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog
post</a>.
The Lambda receives prediction requests through an AWS API Gateway,
makes a prediction and returns it while the client system waits for it.
An example of an asynchronous model deployment is the task queue we
built for this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog
post</a>.
The task queue is specifically designed to receive predictions requests
from clients and fulfill them while the client system works on other
things. The task queue makes the prediction available to the client in a
"result backend" which can be accessed by the client once the prediction
is completed. Another asynchronous deployment is the Kafka stream
processor we built in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog
post</a>,
although it is not designed to return the prediction results directly to
the client like the task queue deployment.</p>
<h2>Real-time and Non-real-time Model Deployments</h2>
<p>Another area of optimization for ML model deployments is the ability to
return a prediction very quickly. A real-time system needs to be
optimized to have very low and very predictable latency so that we can
ensure that interactions with the model can always happen quickly and
end within a defined period of time.</p>
<p>An example of a real time model deployment is the Websocket service that
we created in <a href="https://www.tekhnoal.com/websocket-ml-model-deployment.html">this blog
post.</a>
The Websocket service is particularly useful for this type of deployment
because websocket connections are designed to transfer data with very
low overhead. Two examples of non-real-time deployments are the Apache
Beam ETL job we built in <a href="https://www.tekhnoal.com/apache-beam-ml-model-deployment.html">this blog
post</a>
and the Hadoop MapReduce job we built in <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">this blog
post</a>.
These deployments are designed to make millions of predictions and are
optimized for that purpose, which means that they are not useful in
situations in which we need real-time predictions.</p>
<p>In the blog posts that we wrote, we didn't try to deploy a model on a
consumer device like a phone or tablet. All of the approaches we took
were designed to execute the model on a server and return the prediction
to the client through the network. For a real-time system, being able to
execute directly on the client device would be more efficient and faster
since no network hop is required.</p>
<h2>Deterministic and Non-deterministic Models</h2>
<p>The last distinction we will make is between deterministic and
non-deterministic model prediction code. Deterministic models will always
return the same result when given the same input, while non-deterministic
models can return different results for the same input. This
distinction can have a large impact on the deployment of the model. If
we don't distinguish between models that are deterministic and
non-deterministic, doing things like storing predictions for later use
and prediction caching can become much more complicated. Any model that
is being deployed that is non-deterministic should publish that fact to
its users so that they can be ready to deal with the side effects of
non-determinism.</p>
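<p>A small sketch makes the caching problem concrete; both functions
below are illustrative stand-ins for real models:</p>

```python
# Sketch of why caching only suits deterministic prediction code;
# both functions are illustrative stand-ins for real models.
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def deterministic_predict(sepal_length: float) -> str:
    # the same input always yields the same output, so caching is safe
    return "setosa" if sepal_length < 5.0 else "versicolor"

def nondeterministic_predict(sepal_length: float) -> str:
    # repeated calls can disagree, so a cache would silently freeze one
    # arbitrary sample as "the" answer
    return random.choice(["setosa", "versicolor", "virginica"])
```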
<h1>Conclusion</h1>
<p>At the beginning of this series of blog posts we challenged ourselves to
come up with a simple base class that would enable us to abstract out
the details of a machine learning model. We started by creating a base
class that could hide the details of the ML model behind an abstraction,
then added features that we thought would be useful. From the beginning,
the base class was designed to make it easy to deploy machine learning
models. The base class was not designed for the training parts of a
model codebase.</p>
<p>To be able to introspect details about the model, we also added the
ability for the model to provide metadata about itself. The metadata
aspect of the model was not really required for most model deployments,
but it did become important for certain deployments. Model metadata like
the version and the input and output schemas of the model becomes more
important when we have to manage dozens or hundreds of deployed models.</p>
<p>To enable us to easily deploy any ML model, we also needed to make the
model codebase easy to install, which we accomplished by making the ML
model into a Python package that could be installed with the pip package
manager. By making the model codebase easy to install we enabled anybody
to reuse the model in whichever context they needed it without having to
understand the code or manually install the dependencies of the model.
Having the model inside of a package also allowed us to install the very
same model in 10 different applications with no changes to the model
code.</p>
<p>Overall, this series of blog posts is much less concerned with the
details of training a machine learning model. It is mainly concerned
with integrating the trained ML model with other software systems. To
this end, we sought to use a wide variety of integration technologies to
make sure that our approach worked in every situation. In every case,
the model codebase remained the same and we did not have to adapt it to
any of the integrations. This speaks to the flexibility of the approach,
which allowed us to isolate the details of the ML model from the
deployment and integration problems. Furthermore, we can reuse any of
the deployment codebases to deploy any ML model code that implements the
MLModel base class, which makes the deployment codebases reusable as
well.</p>
<p>To sum up, the best strategy for building an ML model that can be used
in many different contexts is to: code the model prediction code behind
an interface, build and release the model as a package, and then to
install it into the environment where it will be used. All deployment
details should be kept out of the model package so that we are able to
choose the right approach to model deployment later on.</p>An Apache Beam ML Model Deployment2020-07-31T19:00:00-05:002020-07-31T19:00:00-05:00Brian Schmidttag:www.tekhnoal.com,2020-07-31:/apache-beam-ml-model-deployment.html<p>Data processing pipelines are useful for solving a wide range of problems. For example, an Extract, Transform, and Load (ETL) pipeline is a type of data processing pipeline that is used to extract data from one system and save it to another system. Inside of an ETL, the data may be transformed and aggregated into more useful formats. ETL jobs are useful for making the predictions made by a machine learning model available to users or to other systems. The ETL for such an ML model deployment looks like this: extract features used for prediction from a source system, send the features to the model for prediction, and save the predictions to a destination system. In this blog post we will show how to deploy a machine learning model inside of a data processing pipeline that runs on the Apache Beam framework.</p><p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that we
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog post</a>,
as a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog post</a>,
as a gRPC service in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog post</a>,
as a MapReduce job in this <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">blog post</a>,
as a Websocket service in this <a href="https://www.tekhnoal.com/websocket-ml-model-deployment.html">blog post</a>,
and as a ZeroRPC service in this <a href="https://www.tekhnoal.com/zerorpc-ml-model-deployment.html">blog post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment">github repo</a>.</p>
<h1>Introduction</h1>
<p>Data processing pipelines are useful for solving a wide range of
problems. For example, an Extract, Transform, and Load (ETL) pipeline is
a type of data processing pipeline that is used to extract data from one
system and save it to another system. Inside of an ETL, the data may be
transformed and aggregated into more useful formats. ETL jobs are useful
for making the predictions made by a machine learning model available to
users or to other systems. The ETL for such an ML model deployment looks
like this: extract features used for prediction from a source system,
send the features to the model for prediction, and save the predictions
to a destination system. In this blog post we will show how to deploy a
machine learning model inside of a data processing pipeline that runs on
the Apache Beam framework.</p>
<p>Apache Beam is an open source framework for doing data processing. It is
most useful for doing parallel data processing that can easily be split
among many computers. The Beam framework is different from other data
processing frameworks because it supports batch and stream processing
using the same API, which allows developers to write the code one time
and deploy it in two different contexts without change. An interesting
feature of the Beam programming model is that once we have written the
code, we can deploy it onto an array of different runners like Apache
Spark, Apache Flink, and Google Cloud Dataflow, among others.</p>
<p>The Google Cloud Platform has a service that can run Beam pipelines. The
Dataflow service allows users to run their workloads in the cloud
without having to worry about managing servers, and it handles automated
provisioning and management of processing resources for the user. In
this blog post, we'll also be deploying the machine learning pipeline to
the Dataflow service to demonstrate how it works in the cloud.</p>
<h1>Building Beam Jobs</h1>
<p>A Beam job is defined by a driver process that uses the Beam SDK to
declare the data processing steps that the job performs. The Beam SDK can
be used from Python, Java, or Go processes. The driver process defines a
data processing pipeline of components which are executed in the right
order to load data, process it, and store the results. The driver
program also accepts execution options that can be set to modify the
behavior of the pipeline. In our example, we will be loading data from
an LDJSON file, sending it to a model to make predictions, and storing
the results in an LDJSON file.</p>
<p>The Beam programming model works by defining a PCollection, which is a
collection of data records that need to be processed. A PCollection is a
data structure that is created at the beginning of the execution of the
pipeline, and is received and processed by each step in a Beam pipeline.
Each step in the pipeline that modifies the contents of the PCollection
is called a PTransform. For this blog post we will create a PTransform
component that takes a PCollection, makes predictions with it, and
returns a PCollection with the prediction results. We will combine this
PTransform with other components to build a data processing pipeline.</p>
<h1>Package Structure</h1>
<p>The code used in this blog post is hosted in <a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment">this Github
repository.</a>
The codebase is structured like this:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">data</span> <span class="ss">(</span> <span class="nv">data</span> <span class="k">for</span> <span class="nv">testing</span> <span class="nv">job</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_beam_job</span> <span class="ss">(</span><span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">apache</span> <span class="nv">beam</span> <span class="nv">package</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">main</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">pipeline</span> <span class="nv">definition</span> <span class="nv">and</span> <span class="nv">launcher</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">ml_model_operator</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">prediction</span> <span class="nv">step</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span> <span class="nv">unit</span> <span class="nv">tests</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<h1>Installing the Model</h1>
<p>As in previous blog posts, we'll be deploying a model that is packaged
separately from the deployment codebase. This approach allows us to
deploy the same model in many different systems and contexts. We'll
install the model package into the virtual environment; it can be
installed from a git repository with this
command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Now that we have the model installed in the environment, we can try it
out by opening a python interpreter and entering this code:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">iris_model.iris_predict</span> <span class="kn">import</span> <span class="n">IrisModel</span>
<span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span><span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">})</span>
<span class="p">{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}</span>
</code></pre></div>
<p>The IrisModel class implements the prediction logic of the iris_model
package. This class is a subtype of the MLModel class, which ensures
that a standard interface is followed. The MLModel interface allows us
to deploy any model we want into the Beam job, as long as it implements
the required interface. More details about this approach to deploying
machine learning models can be found in the first
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">blog posts</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">in this series.</a></p>
<h1>MLModelPredictOperation Class</h1>
<p>The first thing we'll do is create a PTransform class for the code that
receives records from the Beam framework and makes predictions with the
MLModel class. This is the class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelPredictOperation</span><span class="p">(</span><span class="n">beam</span><span class="o">.</span><span class="n">DoFn</span><span class="p">):</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/ml_model_operator.py#L10">here</a>.</p>
<p>The class we'll be working with is called MLModelPredictOperation and it
is a subtype of the <a href="https://beam.apache.org/documentation/programming-guide/#core-beam-transforms">DoFn
class</a>
that is part of the Beam framework. The DoFn class defines a method
which will be applied to each record in the PCollection. To initialize
the object with the right model, we'll add an __init__ method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">module_name</span><span class="p">,</span> <span class="n">class_name</span><span class="p">):</span>
<span class="n">beam</span><span class="o">.</span><span class="n">DoFn</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span>
<span class="n">model_module</span> <span class="o">=</span> <span class="n">importlib</span><span class="o">.</span><span class="n">import_module</span><span class="p">(</span><span class="n">module_name</span><span class="p">)</span>
<span class="n">model_class</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">model_module</span><span class="p">,</span> <span class="n">class_name</span><span class="p">)</span>
<span class="n">model_object</span> <span class="o">=</span> <span class="n">model_class</span><span class="p">()</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">model_object</span><span class="p">,</span> <span class="n">MLModel</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"The model object is not a subclass of MLModel."</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_object</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/ml_model_operator.py#L22-L34">here</a>.</p>
<p>We'll start by calling the __init__ method of the DoFn superclass,
which initializes it. We then find and load the Python
module that contains the MLModel class that contains the prediction
code, get a reference to the class, and instantiate the MLModel class
into an object. Now that we have an instantiated model object, we check
the type of the object to make sure that it is a subtype of MLModel. If
it is a subtype, we store a reference to it.</p>
<p>Now that we have an initialized DoFn object with a model object inside
of it, we need to actually do the prediction:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="k">yield</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/ml_model_operator.py#L36-L38">here</a>.</p>
<p>The prediction step is very simple: we take the record, pass it directly
to the model, and yield the result of the prediction. To make sure that
this code will work inside of a Beam pipeline, we need to make sure that
the pipeline feeds a PCollection of dictionaries to the DoFn object.
When we create the pipeline, we'll make sure that this is the case.</p>
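<p>A simplified stand-in for the kind of coder that turns LDJSON lines
into dictionaries is sketched below; the real class in the repository
would plug into Beam's coder machinery rather than stand alone:</p>

```python
# Simplified stand-in for a JSON coder used when reading LDJSON files;
# the real class would plug into Beam's coder machinery.
import json

class JsonCoder:
    """Translate between LDJSON lines and Python dictionaries."""

    def encode(self, record: dict) -> bytes:
        return json.dumps(record).encode("utf-8")

    def decode(self, line: bytes) -> dict:
        return json.loads(line)
```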
<h1>Creating the Pipeline</h1>
<p>Now that we have a class that can make a prediction with the model, we
need to build a simple pipeline around it that can load data, send it to
the model, and save the resulting predictions.</p>
<p>The creation of the Beam pipeline is done in the <a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/main.py#L30-L50">run
function</a>
in the main.py module:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="n">argv</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">argparse</span><span class="o">.</span><span class="n">ArgumentParser</span><span class="p">()</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'--input'</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="s1">'input'</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s1">'Input file to process.'</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'--output'</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="s1">'output'</span><span class="p">,</span> <span class="n">required</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s1">'Output file to write results to.'</span><span class="p">)</span>
<span class="n">known_args</span><span class="p">,</span> <span class="n">pipeline_args</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">parse_known_args</span><span class="p">(</span><span class="n">argv</span><span class="p">)</span>
<span class="n">pipeline_options</span> <span class="o">=</span> <span class="n">PipelineOptions</span><span class="p">(</span><span class="n">pipeline_args</span><span class="p">)</span>
<span class="n">pipeline_options</span><span class="o">.</span><span class="n">view_as</span><span class="p">(</span><span class="n">SetupOptions</span><span class="p">)</span><span class="o">.</span><span class="n">save_main_session</span> <span class="o">=</span> <span class="kc">True</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/main.py#L30-L38">here</a>.</p>
<p>The pipeline options object is given to the Beam job to
modify the way that it runs. The parameters parsed from the command line
are fed directly to the PipelineOptions object. Two parameters
are loaded by the command line parser: the location of the input files,
and the location where the output of the job will be stored.</p>
<p>When we are done loading the pipeline options, we can arrange the steps
that make up the pipeline:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="n">beam</span><span class="o">.</span><span class="n">Pipeline</span><span class="p">(</span><span class="n">options</span><span class="o">=</span><span class="n">pipeline_options</span><span class="p">)</span> <span class="k">as</span> <span class="n">p</span><span class="p">:</span>
<span class="p">(</span><span class="n">p</span>
<span class="o">|</span> <span class="s1">'read_input'</span> <span class="o">>></span> <span class="n">ReadFromText</span><span class="p">(</span><span class="n">known_args</span><span class="o">.</span><span class="n">input</span><span class="p">,</span> <span class="n">coder</span><span class="o">=</span><span class="n">JsonCoder</span><span class="p">())</span>
<span class="o">|</span> <span class="s1">'apply_model'</span> <span class="o">>></span> <span class="n">beam</span><span class="o">.</span><span class="n">ParDo</span><span class="p">(</span><span class="n">MLModelPredictOperation</span><span class="p">(</span><span class="n">module_name</span><span class="o">=</span><span class="s2">"iris_model.iris_predict"</span><span class="p">,</span> <span class="n">class_name</span><span class="o">=</span><span class="s2">"IrisModel"</span><span class="p">))</span>
<span class="o">|</span> <span class="s1">'write_output'</span> <span class="o">>></span> <span class="n">WriteToText</span><span class="p">(</span><span class="n">known_args</span><span class="o">.</span><span class="n">output</span><span class="p">,</span> <span class="n">coder</span><span class="o">=</span><span class="n">JsonCoder</span><span class="p">())</span>
<span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/main.py#L40-L47">here</a>.</p>
<p>The pipeline object is created by providing it with the PipelineOptions
object that we created above. The pipeline is made up of three steps: a
step that loads data from an LDJSON file and creates a PCollection from
it, a step that makes predictions with that PCollection, and a step that
saves the resulting predictions as an LDJSON file. The input and output
steps use a class called
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/main.py#L18-L27">JsonCoder</a>,
which takes care of serializing and deserializing the data in the LDJSON
files.</p>
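<p>As a rough illustration of what such a coder does, here is a minimal
standalone sketch of the encode/decode logic (the real JsonCoder in the
repository subclasses Beam's coder class; this version only shows the
serialization itself):</p>

```python
import json


class JsonCoder:
    """Minimal sketch of an LDJSON coder; the real implementation in the
    repository subclasses the Apache Beam coder base class."""

    def encode(self, value):
        # each record becomes one JSON-encoded line of bytes
        return json.dumps(value).encode("utf-8")

    def decode(self, encoded):
        # each line of bytes is parsed back into a Python object
        return json.loads(encoded.decode("utf-8"))
```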
<p>Now that we have a configured pipeline, we can run it:</p>
<div class="highlight"><pre><span></span><code><span class="n">result</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">run</span><span class="p">()</span>
<span class="n">result</span><span class="o">.</span><span class="n">wait_until_finish</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/model_beam_job/main.py#L49-L50">here</a>.</p>
<p>The main.py module is responsible for arranging the steps of the
pipeline, receiving parameters, and running the Beam job. This script
will be used to run the job locally and in the cloud.</p>
<h1>Testing the Job Locally</h1>
<p>We can test the job locally by running it with the Python interpreter:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python -m model_beam_job.main --input data/input.json --output data/output.json
</code></pre></div>
<p>The job takes as input the "input.json" file in the data folder, and
writes a file called "output.json" to the same folder.</p>
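<p>Each line of the input file is an independent JSON record. With the iris
model used in this series, an input line might look like the following (the
field names come from the model's input schema; the values are made up for
illustration):</p>

```python
import json

# hypothetical LDJSON input line for the iris model; values are illustrative
input_line = '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'

# each line parses independently into one prediction request
record = json.loads(input_line)
```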
<h1>Deploying to Google Cloud</h1>
<p>The next thing we'll do is run the same job that we ran locally in the
<a href="https://cloud.google.com/dataflow">Google Cloud Dataflow
service</a>. The Dataflow
service is an offering in the Google Cloud suite of services that can do
scalable data processing for batch and streaming jobs. The Dataflow
service runs Beam jobs exclusively, handling resource management and
performance optimization for them.</p>
<p>To run the model Beam job in the cloud, we'll need to create a project.
In the Cloud Console, in the <a href="https://console.cloud.google.com/projectselector2/home/dashboard">project selector
page</a>
click on "Create Cloud Project", then create a project for your
solution. Make sure the newly created project is the currently selected
project, so that any resources we create next will be held in it. In
order to use the GCP Dataflow service, we'll need to have
billing enabled for the project. To make sure that billing is working,
follow <a href="https://cloud.google.com/billing/docs/how-to/modify-project#confirm_billing_is_enabled_on_a_project">these
steps</a>.</p>
<p>To be able to create the Dataflow job, we'll need to enable the
Cloud Dataflow, Compute Engine, Stackdriver Logging, Cloud Storage,
Cloud Storage JSON, BigQuery, Cloud Pub/Sub, Cloud Datastore, and Cloud
Resource Manager APIs in your new project. To enable access to these
APIs, follow <a href="https://console.cloud.google.com/flows/enableapi?apiid=dataflow,compute_component,logging,storage_component,storage_api,bigquery,pubsub,datastore.googleapis.com,cloudresourcemanager.googleapis.com">this
link</a>,
then select your new project and click the "Continue" button.</p>
<p>Next, we'll create a service account for our project. In the Cloud
Console, go to the <a href="https://console.cloud.google.com/apis/credentials/serviceaccountkey">Create service account key
page</a>.
From the Service account list, select "New service account". In the
Service account name field, enter a name. From the Role list, select
Project -> Owner and click on the "Create" button. A JSON file will be
created and downloaded to your computer, copy this file to the root of
the project directory. To use the file in the project, open a command
shell and set the GOOGLE_APPLICATION_CREDENTIALS environment variable
to the full path to the JSON file that you placed in the project root.
The command will look like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">export</span><span class="w"> </span><span class="n">GOOGLE_APPLICATION_CREDENTIALS</span><span class="o">=/</span><span class="n">Users</span><span class="o">/.../</span><span class="n">apache</span><span class="o">-</span><span class="n">beam</span><span class="o">-</span><span class="n">ml</span><span class="o">-</span><span class="n">model</span><span class="o">-</span><span class="n">deployment</span><span class="o">/</span><span class="n">model</span><span class="o">-</span><span class="n">beam</span><span class="o">-</span><span class="n">job</span><span class="o">-</span><span class="n">a7c5c1d9c22c</span><span class="o">.</span><span class="n">json</span><span class="w"></span>
</code></pre></div>
<p>To store the file we will be processing, we need to create a storage
bucket in the Google Cloud Storage service. To do this, go to the
<a href="https://console.cloud.google.com/storage/browser">bucket browser
page</a>,
click on the "Create Bucket" button, and fill in the details to create a
bucket. Now we can upload our test data to a bucket so that it can be
processed by the job. To upload the test data click on the "Upload
Files" button in the bucket details page and select the <a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/data/input.json">input.json
file</a>
in the data directory of the project.</p>
<p>Next, we need to create a tar.gz file that contains the model package
that will be run by the Beam job. This package is special because it
cannot be installed from the public PyPI repository, so it must be
uploaded to the Dataflow service along with the Beam job. To create the
tar.gz file, we created <a href="https://github.com/schmidtbri/apache-beam-ml-model-deployment/blob/master/Makefile#L10-L17">a target in the project
Makefile</a>
called "build-dependencies". When executed, the target downloads the
code for the iris_model package, builds a tar.gz distribution file, and
leaves it in the "dependencies" directory.</p>
<p>We're finally ready to send the job to be executed in the Dataflow
service. To do this, execute this command:</p>
<div class="highlight"><pre><span></span><code>python -m model_beam_job.main --region us-east1 <span class="se">\</span>
    --input gs://model-beam-job/input.json <span class="se">\</span>
    --output gs://model-beam-job/results/outputs <span class="se">\</span>
    --runner DataflowRunner <span class="se">\</span>
    --machine_type n1-standard-4 <span class="se">\</span>
    --project model-beam-job-294711 <span class="se">\</span>
    --temp_location gs://model-beam-job/tmp/ <span class="se">\</span>
    --extra_package dependencies/iris_model-0.1.0.tar.gz <span class="se">\</span>
    --setup_file ./setup.py
</code></pre></div>
<p>The job is sent by executing the same Python script that we used to
test the job locally, but with more command line options. The
input and output options work the same as in the local execution of the
job, but now they point to locations in the Google Cloud Storage bucket.
The runner option tells the Beam framework that we want to use the
Dataflow runner. The machine_type option tells the Dataflow service
which machine type to use when running the job. The
project option points to the Google Cloud project we created above. The
temp_location option tells the Dataflow service to store
temporary files in the same Google Cloud Storage bucket that we are
using for the input and output. The extra_package option points to the
iris_model distribution tar.gz file that we created above; this file
will be sent to the Dataflow service along with the job code. Lastly,
the setup_file option points at the setup.py file of the
model_beam_job package itself, which allows the command to package up
any code files that the job depends on.</p>
<p>Once we execute the command, the job will be started in the cloud. As
the job runs it will output a link to a webpage that can be used to
monitor the progress of the job. Once the job completes, the results
will be in the Google Cloud Storage bucket that we created above.</p>
<p><img alt="Dataflow UI" src="https://www.tekhnoal.com/dataflow_ui.png" width="100%"></p>
<h1>Closing</h1>
<p>By using the Beam framework, we are able to easily deploy a machine
learning prediction job to the cloud. Because of the simple design of
the Beam framework, a lot of the complexities of running a job on many
computers are abstracted out. Furthermore, we are able to leverage all
of the features of the Beam framework for advanced data processing.</p>
<p>One of the important features of this codebase is that it can
accept any machine learning model that implements the MLModel interface.
By installing another model package and importing the class that
inherits from the MLModel base class, we can easily deploy any number of
models in the same Beam job without changing the code. However, we do
need to change the pipeline definition to change or add models to it.
Once again, the MLModel interface allowed us to abstract the building
of a machine learning model away from the complexity of deploying
it.</p>
<p>One thing that we can improve about the code is the fact that the job
only accepts files encoded as LDJSON. We did this to make the code easy
to understand, but we could easily add support for other input data
formats, making the pipeline more flexible and easier to use.</p>
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that we
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog post</a>,
as a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog post</a>,
a gRPC service in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog post</a>,
as a MapReduce job in this <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">blog post</a>,
and as a Websocket service in this <a href="https://www.tekhnoal.com/websocket-ml-model-deployment.html">blog post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment">github
repo</a>.</p>
<h1>Introduction</h1>
<p>There are many different ways for two software processes to communicate
with each other. When deploying a machine learning model, it's often
simpler to isolate the model code inside of its own process. Any code
that needs to use the model to make predictions then needs to
communicate with the process that is running the model code to make
predictions. This approach is easier than embedding the model code in
the process that needs the predictions because it saves us the trouble
of recreating the model's algorithm in the programming language of the
process that needs the predictions. RPC calls are also used widely to
connect code that is executing in different processes. In the last few
years, the rise in popularity of microservice architectures has also
caused the rise in popularity of RPC for integrating systems.</p>
<p>RPC stands for Remote Procedure Call. A remote procedure is just a
function call that is executed in a different process from the process
that initiated the call. The input parameters for the call come from the
calling process and the result of the call is returned to the calling
process. The function call looks as if it was executed locally. RPC
therefore executes as a request-response protocol. The process that
initiates the call is called the client and the process that executes
the call is the server. RPC is useful when you want to call a function
that is not implemented in the local process and you don't want to worry
about the complexities of inter-process communication. RPC is similar
to, but a lot simpler than, REST and other HTTP-based styles of
inter-process communication.</p>
<p>An RPC call follows a series of steps to complete the call. First, the
client code will call a piece of code called the "stub" in the client
process. The stub behaves like a normal function but actually calls the
remote procedure in the server. The stub then takes the parameters
provided by the client code and serializes them so that they can be
transported over the communication channel. The stub uses the
communication channel to communicate with the remote process, sending
the necessary information to execute the procedure. The server stub
receives the information and deserializes the parameters, then executes
the procedure. The series of steps are then executed in reverse order to
return the results of the procedure to the client code.</p>
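<p>The steps above can be illustrated with a toy, in-process sketch of a
client stub. Real RPC frameworks replace the direct function call below
with a network transport, and the JSON serialization here stands in for
whatever wire format the framework uses; everything in this sketch is
invented purely for illustration:</p>

```python
import json


def remote_add(a, b):
    """The 'remote procedure' that would live in the server process."""
    return a + b


class Stub:
    """Toy client stub: serializes arguments, 'transports' them to the
    server procedure, and deserializes the result."""

    def __init__(self, procedure):
        self._procedure = procedure

    def __call__(self, *args):
        request = json.dumps(args)              # client stub serializes
        server_args = json.loads(request)       # server stub deserializes
        result = self._procedure(*server_args)  # server executes the call
        response = json.dumps(result)           # server stub serializes
        return json.loads(response)             # client stub deserializes


add = Stub(remote_add)  # looks like a local function to the caller
```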
<p>In previous blog posts we showed how to do RPC with a RESTful service
and a gRPC service. In this blog post we'll continue exploring the
options available to us for interprocess communication with a ZeroRPC
service that can host machine learning models.</p>
<h1>ZeroRPC</h1>
<p>ZeroRPC is a simple RPC framework that works in many different
languages. ZeroRPC uses
<a href="https://msgpack.org/index.html">MessagePack</a> for
parameter serialization and deserialization, and it uses
<a href="https://zeromq.org/">ZeroMQ</a> for transporting data
between processes. ZeroRPC supports advanced features such as streamed
responses, heartbeats, and timeouts. The framework also supports
introspection of the service and exceptions.</p>
<p>The ZeroRPC framework uses the ZeroMQ messaging framework to transport
messages between processes. ZeroMQ is a high-performance low-level
messaging framework that can be used in many different types of
communication patterns. The ZeroRPC framework uses the ZeroMQ framework
in a request-response pattern to do RPC calls. ZeroMQ also supports the
publish-subscribe pattern along with other patterns. ZeroMQ is designed
to support highly distributed and concurrent applications. ZeroMQ works
in many different programming languages and in many operating systems.</p>
<p>The ZeroRPC framework uses the MessagePack format for serialization.
This format is similar to JSON but is binary, which makes it more space
efficient and allows for faster serialization and deserialization. The
MessagePack format is similar to the Protocol Buffer format that is used
by gRPC, but it allows us to serialize arbitrary data structures. This
is different from Protocol Buffers which require a schema for the data
to be serialized. MessagePack is also dynamically typed which makes
developing code with it faster and simpler, but lacks the documentation
and code generation features of Protocol Buffers.</p>
<h1>Package Structure</h1>
<p>The service codebase is structured like this:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_zerorpc_service</span> <span class="ss">(</span> <span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">zerorpc</span> <span class="nv">service</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">ml_model_zerorpc_endpoint</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">ml_model_manager</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">service</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">scripts</span> <span class="ss">(</span><span class="nv">scripts</span> <span class="k">for</span> <span class="nv">testing</span> <span class="nv">the</span> <span class="nv">service</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">unit</span> <span class="nv">tests</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Dockerfile</span> <span class="ss">(</span><span class="nv">used</span> <span class="nv">to</span> <span class="nv">build</span> <span class="nv">a</span> <span class="nv">docker</span> <span class="nv">image</span> <span class="nv">of</span> <span class="nv">the</span> <span class="nv">service</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen in the <a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment">github
repository</a>.</p>
<h1>Installing the Model</h1>
<p>Our aim for this blog post is to show how to build a ZeroRPC service
that is able to host any ML model that works with the MLModel base
class. To show how this can be done, we'll use the same model that we've
deployed in previous blog posts. To install the model into the Python
environment, execute this command:</p>
<div class="highlight"><pre><span></span><code>pip install git+<span class="o">[</span>https://github.com/schmidtbri/ml-model-abc-improvements<span class="o">](</span>https://github.com/schmidtbri/ml-model-abc-improvements%5C<span class="o">)</span>
</code></pre></div>
<p>This command installs the model code and parameters from the model's git
repository. To understand how the model code works, check out <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">this blog post</a>.
Once the model is installed, we can test it out by executing this Python
code in an interactive session:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">iris_model.iris_predict</span> <span class="kn">import</span> <span class="n">IrisModel</span>
<span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span><span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">})</span>
<span class="p">{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}</span>
</code></pre></div>
<p>The code above imports the class that implements the MLModel interface,
instantiates it, and sends the model object a prediction request. The
model successfully responds with a prediction for the flower species.</p>
<p>In order for the ZeroRPC service to find the model that we want to
deploy, we'll create a configuration module that points to the model's
package and module:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
<span class="n">models</span> <span class="o">=</span> <span class="p">[{</span>
<span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
<span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
<span class="p">}]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/config.py#L4-L12">here</a>.</p>
<p>This configuration gives us the flexibility to add and remove models
from the service dynamically. A service can host any number of models if
they are installed in the environment and added to the configuration.
The module_name and class_name fields in the configuration point to a
class that implements the MLModel interface, which allows the service to
make predictions using the model.</p>
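<p>Because the configuration is just a list of entries, hosting an
additional model is, in principle, only a matter of appending another
entry. The second entry below uses made-up module and class names purely
for illustration; they do not exist in the repository:</p>

```python
class Config(dict):
    models = [{
        "module_name": "iris_model.iris_predict",
        "class_name": "IrisModel"
    }]


class ExtendedConfig(Config):
    # hypothetical second model; the module and class names are invented
    # for illustration only
    models = Config.models + [{
        "module_name": "example_model.example_predict",
        "class_name": "ExampleModel"
    }]
```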
<p>As in previous blog posts, we'll use a singleton object to manage the
ML model objects that will be used to make predictions. The class that
the singleton object is instantiated from is called ModelManager. The
class is responsible for instantiating MLModel objects, managing the
instances, returning information about the MLModel objects, and
returning references to the objects when needed. The code for the
ModelManager class can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/model_manager.py">here</a>.
A complete explanation of the ModelManager class can be found in <a href="https://www.tekhnoal.com/using-ml-model-abc.html">this
blog
post</a>.</p>
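<p>At its core, the ModelManager turns each configuration entry into a
model object through a dynamic import. A minimal sketch of that lookup,
using only the standard library, might look like this (the real class in
the repository does more, such as caching the instantiated models):</p>

```python
from importlib import import_module


def load_class(module_name, class_name):
    """Import module_name and return the attribute class_name from it,
    mirroring how a configuration entry is resolved into a model class."""
    module = import_module(module_name)
    return getattr(module, class_name)
```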
<h1>ZeroRPC Endpoint</h1>
<p>In order to host a machine learning model, we have to handle incoming
prediction requests, produce responses for them, and integrate with the
ZeroRPC framework. The class described in this section will handle these
aspects of the service.</p>
<p>First, we'll declare the class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelZeroRPCCEndpoint</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/ml_model_zerorpc_endpoint.py#L10">here</a>.</p>
<p>Next, we'll add the code that will initialize the object when the class
is instantiated:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model_qualified_name</span><span class="p">):</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_instance</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_instance</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"'</span><span class="si">{}</span><span class="s2">' not found in ModelManager instance."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">))</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_instance</span>
<span class="bp">self</span><span class="o">.</span><span class="vm">__doc__</span> <span class="o">=</span> <span class="s2">"Predict with the </span><span class="si">{}</span><span class="s2">."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">display_name</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/ml_model_zerorpc_endpoint.py#L19-L40">here</a>.</p>
<p>The __init__ method has one argument called
"model_qualified_name" which tells the endpoint class which model it
will be hosting. The __init__ method first gets a reference to the
ModelManager singleton instance that is initialized when the service
starts up. Then we get a reference to the specific model that is being
hosted by this instance of the MLModelZeroRPCCEndpoint class. Next, we
check whether the model reference is None, which happens when the
ModelManager can't find a model with the requested name; if there is no
model by that name, we raise an exception. If the model exists, we save a
reference to it on the self variable, which will make it easier to access
in the future. Lastly, we set the docstring property of the object, which
the service will return when doing introspection; we'll see how this
works later.</p>
<p>Now that we have an instance of the endpoint, we need to handle incoming
prediction requests:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/ml_model_zerorpc_endpoint.py#L42-L47">here</a>.</p>
<p>The code in this method is very simple: it receives a parameter called
"data" from the client, sends it to the model's predict method, and
returns the prediction object that is returned by the model. Behind the
scenes, the ZeroRPC framework handles serialization and
deserialization, the connection between the client and the server, and
any exceptions raised by the server.</p>
<p>This class uses a special feature of Python which is the <a href="https://www.journaldev.com/22761/python-callable-__call__">callable
magic
method</a>.
The __call__ method is a special method that turns any instance of
the MLModelZeroRPCCEndpoint class into a callable, which allows
instances of the class to be used as functions or methods. This will be
useful later when we need to initialize a dynamic number of endpoints in
the ZeroRPC service.</p>
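<p>As a small standalone illustration of the callable magic method (the
class and names below are invented for the example, not taken from the
repository):</p>

```python
class Greeter:
    """Instances of this class behave like functions because of __call__."""

    def __init__(self, name):
        self._name = name

    def __call__(self, greeting):
        # calling the instance invokes this method
        return "{}, {}!".format(greeting, self._name)


hello = Greeter("world")  # an object that can now be called like a function
```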
<h1>ZeroRPC Service</h1>
<p>Now that we have a model and a way to host the model within an endpoint,
we can go ahead and write the code that will create the service. Before
we can do that, we have to load the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="n">configuration</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">config</span><span class="p">,</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"APP_SETTINGS"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/service.py#L15-L16">here</a>.</p>
<p>The configuration is loaded dynamically by importing a class from the
config.py module. The name of the class is received through an
environment variable called APP_SETTINGS.</p>
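<p>To illustrate the lookup (the config classes below are stand-ins for the project's config.py module, not its actual contents), a minimal sketch of environment-driven configuration selection:</p>

```python
# Sketch of environment-driven configuration selection: APP_SETTINGS
# names a class that is looked up with getattr on the config module.
# The config classes here are illustrative stand-ins.
import os


class ProdConfig:
    models = [{"module_name": "iris_model.iris_predict",
               "class_name": "IrisModel"}]


class TestConfig:
    models = []


class _ConfigModule:
    # stands in for "import config" in the service code
    ProdConfig = ProdConfig
    TestConfig = TestConfig


config = _ConfigModule
os.environ["APP_SETTINGS"] = "ProdConfig"
configuration = getattr(config, os.environ["APP_SETTINGS"])
```

Switching the environment variable to "TestConfig" would select the other class without any code changes.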
<p>A ZeroRPC service is built as a class that provides methods that are
exposed to the outside world as RPC calls. The class that will become
the service is defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelZeroRPCService</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/service.py#L19">here</a>.</p>
<p>When the model service is started the __init__ method will be
executed:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="n">configuration</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
<span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">():</span>
<span class="n">endpoint</span> <span class="o">=</span> <span class="n">MLModelZeroRPCCEndpoint</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">])</span>
<span class="n">operation_name</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">_predict"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">])</span>
<span class="nb">setattr</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">operation_name</span><span class="p">,</span> <span class="n">endpoint</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/service.py#L22-L30">here</a>.</p>
<p>The service starts by instantiating the ModelManager singleton, and
loading the models from the configuration. Next the service instantiates
one MLModelZeroRPCCEndpoint class for each model in the ModelManager and
attaches it to the "self" parameter with a dynamically created operation
name. The service method is mapped to the model's "predict" method by
the endpoint object that wraps it. This design lets the service host any
number of MLModel objects, since the code attaches them to the service
object dynamically. At the end of the initialization method, we have one
service method for each model hosted by the service.</p>
<p>The service is now able to receive prediction requests for the models
and return the predictions to the clients, but we can add some
functionality by exposing metadata about the models being hosted, the
get_models method does this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_models</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">models</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="k">return</span> <span class="n">models</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/service.py#L32-L36">here</a>.</p>
<p>The get_models procedure returns a list of models available for use,
but does not return all of the metadata available for a model. To
provide all of the metadata for a model, we'll add the
get_model_metadata method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_model_metadata</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">qualified_name</span><span class="p">):</span>
<span class="n">model_metadata</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">get_model_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="k">if</span> <span class="n">model_metadata</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">return</span> <span class="n">model_metadata</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"Metadata not found for this model."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/model_zerorpc_service/service.py#L38-L44">here</a>.</p>
<h1>Using the Service</h1>
<p>To show how to use the service, we wrote a few scripts in the <a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/tree/master/scripts">scripts
folder</a>
of the project. To execute the scripts we first have to start up the
service with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
python model_zerorpc_service/service.py
</code></pre></div>
<p>The ZeroRPC Python package has a utility that allows for communication
with a ZeroRPC service from the command line. Now that we have a ZeroRPC
service running, we can execute this command to get a list of procedures
available on the service:</p>
<div class="highlight"><pre><span></span><code>zerorpc tcp://127.0.0.1:4242
connecting to <span class="s2">"tcp://127.0.0.1:4242"</span>
<span class="o">[</span>ModelZeroRPCService<span class="o">]</span>
get_model_metadata Return metadata about a model hosted by the service.
get_models Return list of models hosted <span class="k">in</span> this service.
iris_model_predict Predict with the Iris Model.
</code></pre></div>
<p>The ZeroRPC tool will return a description of the methods available in
the service. The iris_model_predict procedure's documentation string
was generated when we instantiated the model's endpoint.</p>
<p>Next, we'll call a procedure on the service with Python code. Getting
a list of the available models by calling the "get_models" procedure is
very simple:</p>
<div class="highlight"><pre><span></span><code><span class="n">client</span> <span class="o">=</span> <span class="n">zerorpc</span><span class="o">.</span><span class="n">Client</span><span class="p">()</span>
<span class="n">client</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s2">"tcp://127.0.0.1:4242"</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Result: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">result</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/scripts/get_models.py">here</a>.</p>
<p>Executing the code above should print out a list of models that are being
hosted by the service:</p>
<div class="highlight"><pre><span></span><code>Result: <span class="o">[{</span><span class="s1">'display_name'</span>: <span class="s1">'Iris Model'</span>, <span class="s1">'qualified_name'</span>:
<span class="s1">'iris_model'</span>, <span class="s1">'description'</span>: <span class="s1">'A machine learning model for</span>
<span class="s1">predicting the species of a flower based on its measurements.'</span>,
<span class="s1">'major_version'</span>: <span class="m">0</span>, <span class="s1">'minor_version'</span>: <span class="m">1</span><span class="o">}]</span>
</code></pre></div>
<p>Making a prediction with the service is just as easy:</p>
<div class="highlight"><pre><span></span><code><span class="n">client</span> <span class="o">=</span> <span class="n">zerorpc</span><span class="o">.</span><span class="n">Client</span><span class="p">()</span>
<span class="n">client</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s2">"tcp://127.0.0.1:4242"</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">iris_model_predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.5</span><span class="p">})</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Result: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">result</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/zerorpc-ml-model-deployment/blob/master/scripts/predict_with_model.py#L5-L11">here</a>.</p>
<p>To see how exceptions are handled by the ZeroRPC service, we'll change
the code of the client to purposefully cause an exception in the MLModel
class:</p>
<div class="highlight"><pre><span></span><code><span class="n">client</span> <span class="o">=</span> <span class="n">zerorpc</span><span class="o">.</span><span class="n">Client</span><span class="p">()</span>
<span class="n">client</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s2">"tcp://127.0.0.1:4242"</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">iris_model_predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="s2">"abc"</span><span class="p">})</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Result: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">result</span><span class="p">))</span>
</code></pre></div>
<p>When we execute the client code, we get this exception being thrown:</p>
<div class="highlight"><pre><span></span><code>python scripts/predict_with_model.py
Traceback <span class="o">(</span>most recent call last<span class="o">)</span>:
File <span class="s2">"scripts/predict_with_model.py"</span>, line <span class="m">15</span>, <span class="k">in</span> <module>
...
File /Users/brian/Code/zerorpc-ml-model-deployment/venv/lib/python3.7/site-packages/iris_model/iris_predict.py<span class="s2">",</span>
<span class="s2">line 51, in predict</span>
<span class="s2">raise MLModelSchemaValidationException("</span>Failed to validate input data:
<span class="o">{}</span><span class="s2">".format(str(e)))</span>
<span class="s2">ml_model_abc.MLModelSchemaValidationException: Failed to validate</span>
<span class="s2">input data: Key 'petal_width' error: 'abc' should be instance of 'float'</span>
</code></pre></div>
<h1>Closing</h1>
<p>In this blog post we've shown how it is possible to deploy a machine
learning model using the ZeroRPC framework. The service is able to host
any number of models that implement the MLModel interface. The service
codebase is simpler than that of a RESTful service, and the MessagePack
serialization format used by ZeroRPC is more lightweight than the JSON
format usually used by REST web services.
RPC services are also simpler to understand than REST services, since
they mimic a normal function call on the client side.</p>
<p>The ZeroRPC service has some benefits, but also has some drawbacks when
compared to gRPC. The ZeroRPC framework does not have any way to provide
schema information for the data structures that make up the request and
responses of the service. In comparison, gRPC Protocol Buffers require
the developer of the service to provide a full data contract for the
service, and REST services have JSON Schema and the OpenAPI
specification that can provide this information about the service. By
building the get_model_metadata endpoint, we've been able to provide
this information for each model hosted in the service, but not for the
whole service.</p>
<p>The ZeroRPC framework provides additional functionality for RPC calls
by allowing the server to stream responses back to the client. This lets
the server send prediction responses as they become available, through a
simple interface. In the future, it would be
interesting to leverage this feature of ZeroRPC to stream prediction
responses to the client.</p>A Websocket ML Model Deployment2020-04-04T09:25:00-05:002020-04-04T09:25:00-05:00Brian Schmidttag:www.tekhnoal.com,2020-04-04:/websocket-ml-model-deployment.html<p>In the world of web applications, the ability to create responsive and interactive experiences is limited when we do normal request-response requests against a REST API. In the request-response programming paradigm, requests are always initiated by the client system and fulfilled by the server and continuously sending and receiving data is not supported. To fix this problem, the Websocket standard was created. Websockets allow a client and service to exchange data in a bidirectional, full-duplex connection which stays open for a long period of time. This approach offers much higher efficiency in the communication between the server and client. Just like a normal HTTP connection, Websockets work in ports 80 and 443 and support proxies and load balancers. Websockets also allow the server to send data to the client without having first received a request from the client which helps us to build more interactive applications.</p><p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that I
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog post</a>,
as a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog post</a>,
as a gRPC service in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog post</a>,
and as a MapReduce job in this <a href="https://www.tekhnoal.com/map-reduce-ml-model-deployment.html">blog post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/websocket-ml-model-deployment">github repo</a>.</p>
<h1>Introduction</h1>
<p>In the world of web applications, the ability to create responsive and
interactive experiences is limited when we do normal request-response
requests against a REST API. In the request-response programming
paradigm, requests are always initiated by the client system and
fulfilled by the server and continuously sending and receiving data is
not supported. To fix this problem, the Websocket standard was created.
Websockets allow a client and service to exchange data in a
bidirectional, full-duplex connection which stays open for a long period
of time. This approach offers much higher efficiency in the
communication between the server and client. Just like a normal HTTP
connection, Websockets work in ports 80 and 443 and support proxies and
load balancers. Websockets also allow the server to send data to the
client without having first received a request from the client which
helps us to build more interactive applications.</p>
<p>Just like other web technologies, Websockets are useful for creating
applications that run in a web browser. Websockets are useful for
deploying machine learning models when the predictions made by the model
need to be available to a user interface running in a web browser. One
benefit of the Websocket protocol is that we are not limited to making a
prediction when the client requests it, since the server is able to send
a prediction from the model to the client at any time without waiting
for the client to make a prediction request. In this blog post we will
show how to build a Websocket service that works with machine learning
models.</p>
<h1>Package Structure</h1>
<p>To begin, we set up the project structure for the websocket service:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_websocket_service</span> <span class="ss">(</span> <span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">websocket</span> <span class="nv">service</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">static</span> <span class="ss">(</span><span class="nv">Javascript</span> <span class="nv">files</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">templates</span> <span class="ss">(</span><span class="nv">HTML</span> <span class="nv">templates</span> <span class="k">for</span> <span class="nv">UI</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">configuration</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">service</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">endpoints</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">Websocket</span> <span class="nv">handler</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">ml_model_manager</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">class</span> <span class="k">for</span> <span class="nv">managing</span> <span class="nv">models</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">schemas</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">schemas</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">API</span> <span class="nv">data</span> <span class="nv">objects</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">views</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">web</span> <span class="nv">views</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">UI</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">scripts</span> <span class="ss">(</span><span class="nv">test</span> <span class="nv">script</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">unit</span> <span class="nv">tests</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Dockerfile</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen here in the <a href="https://github.com/schmidtbri/websocket-ml-model-deployment">github repository</a>.</p>
<h1>Websockets</h1>
<p>Websockets are fundamentally different from normal HTTP connections.
They are full-duplex, which means that the client and server can
exchange data in both directions. Websocket connections are also
long-lived, which means that the connection stays open even when no
messages are being exchanged. Lastly, websocket connections are
event-based, which means that messages from the server are handled by
the client in an "event handler" function that is registered to an event
type. The same happens in the server code, which handles events from the
client by registering handlers. There are four default events that are
built into the Websocket protocol: open, message, error, and close.
Apart from these event types, we are free to add our own event types and
exchange messages through them.</p>
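<p>The registration-and-dispatch mechanism described above can be sketched in plain Python (this is a simplified stand-in for what a Websocket library does internally, not any library's actual API):</p>

```python
# Minimal sketch of event-based dispatch: handler functions register for
# an event type, and incoming events are routed to the matching handler.
handlers = {}


def on(event_type):
    """Decorator that registers a function as the handler for an event."""
    def register(func):
        handlers[event_type] = func
        return func
    return register


@on("message")
def handle_message(payload):
    return "received: {}".format(payload)


@on("close")
def handle_close(payload):
    return "connection closed"


def dispatch(event_type, payload=None):
    """Route an incoming event to its registered handler."""
    return handlers[event_type](payload)
```

Custom event types like "prediction_request" fit the same scheme: registering a handler under a new name is all that is needed to support a new message type.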
<h1>Installing the Model</h1>
<p>To begin working on a Websocket service that can host any ML model,
we'll need a model to work with. For this, we'll use the same model that
we've used in the previous blog posts, the iris_model package. The
package can be installed directly from the git repository where it is
hosted with this command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>This command should install the model code and parameters, along with
all of its dependencies. To make sure everything is working correctly,
we can make a prediction with the model in an interactive Python
session:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">iris_model.iris_predict</span> <span class="kn">import</span> <span class="n">IrisModel</span>
<span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span><span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">})</span>
<span class="p">{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}</span>
</code></pre></div>
<p>Now that we have a working model in the Python environment, we'll need
to point the service to it. To do this, we'll add the IrisModel class to
the configuration in the config.py file:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
<span class="n">models</span> <span class="o">=</span> <span class="p">[{</span>
<span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
<span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
<span class="p">}]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/config.py#L4-L15">here</a>.</p>
<p>This configuration gives us flexibility when adding and removing models
from the service. The service is able to host any number of models, as
long as they are installed in the environment and added to the
configuration. The module_name and class_name fields in the
configuration point to a class that implements the MLModel interface,
which allows the service to make predictions with the model.</p>
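<p>A hedged sketch of how a module_name/class_name entry can be resolved to a class at runtime (the helper name is illustrative, and a standard-library class stands in for the model class):</p>

```python
# Sketch of loading a class from a module_name/class_name config entry
# using importlib. A standard-library class stands in for a model class;
# load_model_class is an illustrative helper, not the project's code.
import importlib


def load_model_class(entry):
    """Import the module named in the entry and return the named class."""
    module = importlib.import_module(entry["module_name"])
    return getattr(module, entry["class_name"])


entry = {"module_name": "collections", "class_name": "OrderedDict"}
model_class = load_model_class(entry)
instance = model_class()
```

With this approach, adding a model to the service is a matter of installing its package and appending an entry to the configuration list.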
<p>As in previous blog posts, we'll use a singleton object to manage the
ML model objects that will be used to make predictions. The class that
the singleton object is instantiated from is called ModelManager. The
class is responsible for instantiating MLModel objects, managing the
instances, returning information about the MLModel objects, and
returning references to the objects when needed. The code for the
ModelManager class can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/model_manager.py">here</a>.
A complete explanation of the ModelManager class can be found in <a href="https://medium.com/@brianschmidt_78145/using-the-ml-model-base-class-7b984edf47c5">this
blog
post</a>.</p>
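<p>A minimal sketch of the singleton behavior (the internals shown here are illustrative; the project's ModelManager may implement it differently):</p>

```python
# Sketch of a singleton: every instantiation returns the same shared
# instance, so models loaded once are visible to all callers.
# Implementation details are illustrative, not the project's actual code.
class ModelManager:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = []
        return cls._instance

    def load_models(self, configuration):
        self._models.extend(configuration)

    def get_models(self):
        return list(self._models)


manager_a = ModelManager()
manager_a.load_models([{"class_name": "IrisModel"}])
manager_b = ModelManager()
```

Because both variables point at the same instance, models loaded at application startup are available anywhere the manager is instantiated.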
<h1>Defining the Service</h1>
<p>The websocket service is built around the <a href="https://flask.palletsprojects.com/en/1.1.x/">Flask
framework</a>,
which can be extended to support Websockets with the
<a href="https://flask-socketio.readthedocs.io/en/latest/">flask_socketio</a>
extension. The Flask application is initialized in the __init__.py
file of the package like this:</p>
<div class="highlight"><pre><span></span><code><span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/__init__.py#L16">here</a>.</p>
<p>Now that we have an application object, we can load the configuration
into it:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"APP_SETTINGS"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">app</span><span class="o">.</span><span class="n">config</span><span class="o">.</span><span class="n">from_object</span><span class="p">(</span><span class="s2">"model_websocket_service.config.</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'APP_SETTINGS'</span><span class="p">]))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/__init__.py#L18-L20">here</a>.</p>
<p>The configuration is loaded according to the value in the APP_SETTINGS
environment variable. This allows us to change the settings based on the
environment we are running in. Now that we have the app configured we
can initialize the Flask extensions we'll be using:</p>
<div class="highlight"><pre><span></span><code><span class="n">bootstrap</span> <span class="o">=</span> <span class="n">Bootstrap</span><span class="p">(</span><span class="n">app</span><span class="p">)</span>
<span class="n">socketio</span> <span class="o">=</span> <span class="n">SocketIO</span><span class="p">(</span><span class="n">app</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/__init__.py#L22-L23">here</a>.</p>
<p>The Bootstrap extensions will be used to build a user interface and the
SocketIO extension will be used to handle the Websocket connections and
events. With the extensions loaded, we can now import the code that
handles the Websocket events, REST requests, and renders the views of
the UI:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">model_websocket_service.endpoints</span>
<span class="kn">import</span> <span class="nn">model_websocket_service.views</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/__init__.py#L26-L27">here</a>.</p>
<p>Lastly, we will instantiate the ModelManager singleton at application
startup. This function is executed by the Flask framework before the
application starts serving requests. The models that will be loaded are
retrieved from the configuration object that we loaded above.</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">before_first_request</span>
<span class="k">def</span> <span class="nf">instantiate_model_manager</span><span class="p">():</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="n">app</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s2">"MODELS"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/__init__.py#L30-L36">here</a>.</p>
<p>With this code, we set up the basic Flask application that will handle
the Websocket events.</p>
<h1>Websocket Event Handler</h1>
<p>With the application set up, we can now work on the code that handles
the Websocket events. This code is in the <a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py">endpoints.py
module</a>.
To begin, we'll import the Flask app object and the socketio extension
object from the package:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">model_websocket_service</span> <span class="kn">import</span> <span class="n">app</span><span class="p">,</span> <span class="n">socketio</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L8">here</a>.</p>
<p>A Websocket handler is just a function that is decorated with the
@socketio.on() decorator. The decorator registers the function as a
Websocket event handler with the Flask framework, which will call the
function whenever an event of the type described in the decorator is
received by the application. We'll use the decorator here to handle
events of type "prediction_request", which will handle the prediction
requests that clients send to the server.</p>
<div class="highlight"><pre><span></span><code><span class="nd">@socketio</span><span class="o">.</span><span class="n">on</span><span class="p">(</span><span class="s1">'prediction_request'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">message</span><span class="p">(</span><span class="n">message</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">prediction_request_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
<span class="k">except</span> <span class="n">ValidationError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">type</span><span class="o">=</span><span class="s2">"DESERIALIZATION_ERROR"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">error_response_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">response_data</span><span class="p">)</span>
<span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_error'</span><span class="p">,</span> <span class="n">response</span><span class="p">)</span>
<span class="k">return</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L80-L90">here</a>.</p>
<p>The first thing we do when receiving a message from the client is to try
to deserialize it with the <a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/schemas.py#L64-L69">PredictionRequest
schema</a>.
This schema contains the inputs to the model predict() method and also
the model's qualified name. If the deserialization fails, we'll emit a
prediction error message back to the client using the
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/schemas.py#L55-L61">ErrorResponse
schema</a>.
The emit() function is provided by the socketio extension and is used to
send events to the client from the server.</p>
<p>Now that we have a deserialized prediction request from a client, we'll
try to get a reference to the model from the model manager. The service
will emit an ErrorResponse object back to the client system if it fails
to find the model that is requested by the client.</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_object</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">data</span><span class="p">[</span><span class="s2">"model_qualified_name"</span><span class="p">])</span>
<span class="k">if</span> <span class="n">model_object</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">data</span><span class="p">[</span><span class="s2">"model_qualified_name"</span><span class="p">],</span> <span class="nb">type</span><span class="o">=</span><span class="s2">"ERROR"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="s2">"Model not found."</span><span class="p">)</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">error_response_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">response_data</span><span class="p">)</span>
<span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_error'</span><span class="p">,</span> <span class="n">response</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L92-L101">here</a>.</p>
<p>If the model is found, then this code will be executed:</p>
<div class="highlight"><pre><span></span><code><span class="k">else</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">model_object</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="s2">"input_data"</span><span class="p">])</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model_object</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span> <span class="n">prediction</span><span class="o">=</span><span class="n">prediction</span><span class="p">)</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">prediction_response_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">response_data</span><span class="p">)</span>
<span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_response'</span><span class="p">,</span> <span class="n">response</span><span class="p">)</span>
<span class="k">except</span> <span class="n">MLModelSchemaValidationException</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model_object</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="s2">"SCHEMA_ERROR"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="nb">str</span><span class="p">(</span><span class="n">e</span><span class="p">))</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">error_response_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">response_data</span><span class="p">)</span>
<span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_error'</span><span class="p">,</span> <span class="n">response</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model_object</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="s2">"ERROR"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="s2">"Could not make a prediction."</span><span class="p">)</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">error_response_schema</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">response_data</span><span class="p">)</span>
<span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_error'</span><span class="p">,</span> <span class="n">response</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L102-L119">here</a>.</p>
<p>If the prediction is made successfully by the model, a
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/schemas.py#L72-L77">PredictionResponse
object</a>
is serialized and emitted back to the client through the
'prediction_response' event type. If the model raises an
MLModelSchemaValidationException error, the error is serialized and sent
back by emitting an <a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/76992cd67785476788c50add221d498310952ac9/model_websocket_service/schemas.py#L55-L61">ErrorResponse
object</a>
back to the client. If any other type of exception is raised, an
ErrorResponse object is created and sent back to the client.</p>
<p>The Websocket handler that we built in this section is the only one that
we need to add to the service in order to expose any machine learning
models to clients of the Websocket service. The handler is able to
forward prediction requests to any model that is loaded in the
ModelManager singleton. The handler is also able to handle any
exceptions raised by the model and return the error back to the client.</p>
<h2>REST Endpoints</h2>
<p>In order to make the Websocket service easy to use, we will be adding
two REST endpoints that expose data about the models that are being
hosted by the service. Even though the models can be reached directly by
connecting to the Websocket endpoint and sending prediction request
events, knowing which models are available and what data each model
expects is helpful for users of the service.</p>
<p>The first REST endpoint queries the ModelManager for information about
all of the models in it and returns the information as a JSON data
structure to the client.</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/api/models"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s1">'GET'</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">get_models</span><span class="p">():</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">models</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="n">model_collection_schema</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span><span class="n">models</span><span class="o">=</span><span class="n">models</span><span class="p">))</span>
<span class="k">return</span> <span class="n">response_data</span><span class="p">,</span> <span class="mi">200</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L22-L41">here</a>.</p>
<p>The second REST endpoint is used to return metadata about a specific
model hosted by the service. The metadata returned includes the input
and output schemas that the model uses for its prediction function.</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/api/models/<qualified_name>/metadata"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s1">'GET'</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">get_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="p">):</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">metadata</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="k">if</span> <span class="n">metadata</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="n">model_metadata_schema</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">metadata</span><span class="p">)</span>
<span class="k">return</span> <span class="n">Response</span><span class="p">(</span><span class="n">response_data</span><span class="p">,</span> <span class="n">status</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">mimetype</span><span class="o">=</span><span class="s1">'application/json'</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">response</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">type</span><span class="o">=</span><span class="s2">"ERROR"</span><span class="p">,</span> <span class="n">message</span><span class="o">=</span><span class="s2">"Model not found."</span><span class="p">)</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="n">error_response_schema</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
<span class="k">return</span> <span class="n">Response</span><span class="p">(</span><span class="n">response_data</span><span class="p">,</span> <span class="n">status</span><span class="o">=</span><span class="mi">400</span><span class="p">,</span> <span class="n">mimetype</span><span class="o">=</span><span class="s1">'application/json'</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/endpoints.py#L44-L77">here</a>.</p>
<h1>Using the Service</h1>
<p>In order to test the Websocket server we wrote a short python script
that connects through a websocket, sends a prediction request, and
receives and displays a prediction response. The script can be found in
the <a href="https://github.com/schmidtbri/websocket-ml-model-deployment/tree/master/scripts">scripts
folder</a>.</p>
<p>The script's main function connects to localhost on port 80 and sends a
single message to the prediction_request channel:</p>
<div class="highlight"><pre><span></span><code><span class="n">sio</span> <span class="o">=</span> <span class="n">socketio</span><span class="o">.</span><span class="n">Client</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="n">sio</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s1">'http://0.0.0.0:80'</span><span class="p">)</span>
<span class="n">data</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'model_qualified_name'</span><span class="p">:</span> <span class="s1">'iris_model'</span><span class="p">,</span> <span class="s1">'input_data'</span><span class="p">:</span> <span class="p">{</span><span class="s1">'sepal_length'</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s1">'sepal_width'</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s1">'petal_length'</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">,</span> <span class="s1">'petal_width'</span><span class="p">:</span> <span class="mf">1.1</span><span class="p">}}</span>
<span class="n">sio</span><span class="o">.</span><span class="n">emit</span><span class="p">(</span><span class="s1">'prediction_request'</span><span class="p">,</span> <span class="n">data</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/scripts/test_prediction.py#L4-L11">here</a>.</p>
<p>To receive a prediction response from the server, we register a function
that will be called on every message in the "prediction_response"
channel:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@sio</span><span class="o">.</span><span class="n">on</span><span class="p">(</span><span class="s1">'prediction_response'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">on_message</span><span class="p">(</span><span class="n">data</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">'Prediction response: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">data</span><span class="p">)))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/scripts/test_prediction.py#L14-L16">here</a>.</p>
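<p>The script only listens for successful responses; a client will
usually also want a callback for the 'prediction_error' events that the
server emits. The sketch below shows such a handler as a plain function;
in the script it would be registered with the @sio.on('prediction_error')
decorator. The payload field names ('type' and 'message') mirror the
ErrorResponse schema, but the handler itself is hypothetical and not
part of the repository.</p>

```python
# Hypothetical handler for 'prediction_error' events; in the test script
# it would be registered with @sio.on('prediction_error').
def on_error(data):
    # The payload mirrors the ErrorResponse schema: a 'type' and a 'message'.
    formatted = 'Prediction error ({}): {}'.format(data.get('type'), data.get('message'))
    print(formatted)
    return formatted
```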
<p>To use the script, we first start the server with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
gunicorn --worker-class eventlet -w <span class="m">1</span> -b <span class="m">0</span>.0.0.0:80 model_websocket_service:app
</code></pre></div>
<p>Then we can run the script with this command:</p>
<div class="highlight"><pre><span></span><code>python scripts/test_prediction.py
</code></pre></div>
<p>The script will send the prediction request and then print the response
from the server to the screen:</p>
<div class="highlight"><pre><span></span><code>Prediction response: {'prediction': {'species': 'setosa'}, 'model_qualified_name': 'iris_model'}
</code></pre></div>
<h1>Building a User Interface</h1>
<p>In order to show how to use the Websocket service in a real-world client
application we built a simple website around the Websocket and REST
endpoints that were described above. The user interface leverages the
models and metadata REST endpoints to display information about the
models being hosted by the service, and it uses the Websocket endpoint
to make predictions with the models.</p>
<p>This user interface is similar to the one we built for <a href="https://medium.com/@brianschmidt_78145/using-the-ml-model-base-class-7b984edf47c5">this blog
post</a>,
where we showed how to deploy models behind a Flask REST service. We are
reusing a lot of the same code here.</p>
<h2>Flask Views</h2>
<p>The Flask framework supports rendering HTML web pages through the
<a href="https://jinja.palletsprojects.com/en/2.11.x/">Jinja</a>
templating engine. We created an HTML template that displays the models
available through the service. The <a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/views.py#L11-L20">view
code</a>
uses the ModelManager object to get a list of the models being hosted,
then renders the list to an HTML document that is returned to the
client's web browser:</p>
<p><img alt="Index View" src="https://www.tekhnoal.com/index_view.png" width="100%"></p>
<p>In order to show a model's metadata, we built a view that queries the
model object directly and renders an HTML view with the information:</p>
<p><img alt="Metadata View" src="https://www.tekhnoal.com/metadata_view.png" width="100%"></p>
<p>Both of these views are rendered in the service and do not use the REST
endpoints to retrieve the information about the models.</p>
<h2>Dynamic Web Form</h2>
<p>The last webpage we'll build for the application is special because it
renders a dynamically generated form that is created from the model's
input schema. The webpage uses the model's metadata REST endpoint to get
the input schema of the model and uses the <a href="https://github.com/brutusin/json-forms">brutusin forms
package</a> to render
the form in the browser.</p>
<p>The form accepts input from the user and sends it to the server as a
Websocket event of type 'prediction_request'. The webpage also has a
Websocket event listener that is able to render all of the
'prediction_response' and 'prediction_error' Websocket events that the
server emits back to the client. The code for this webpage can be found
<a href="https://github.com/schmidtbri/websocket-ml-model-deployment/blob/master/model_websocket_service/templates/predict.html">here</a>.</p>
<p><img alt="Predict View" src="https://www.tekhnoal.com/predict_view.png" width="100%"></p>
<h1>Closing</h1>
<p>The Websocket protocol, which has wide support in modern browsers, is
a simple way to build more interactive web pages. By deploying ML models
in a Websocket service, we're able to integrate predictions from the
models into web applications quickly and easily. As in previous blog
posts, the service is built so that it is able to host any ML model that
implements the MLModel interface. Deploying a new ML model is as simple
as installing its Python package and adding the model to the
configuration of the service. Combining the Websocket protocol with
machine learning models is quick and easy if the code is written in the
right way.</p>A MapReduce ML Model Deployment2020-02-23T09:25:00-05:002020-02-23T09:25:00-05:00Brian Schmidttag:www.tekhnoal.com,2020-02-23:/map-reduce-ml-model-deployment.html<p>Because of the growing need to process large amounts of data across many computers, the Hadoop project was started in 2006. Hadoop is a set of software components that help to solve large scale data processing problems using clusters of computers. Hadoop supports mass data storage through the HDFS component and large scale data processing through the MapReduce component. Hadoop clusters have become a central part of the infrastructure of many companies because of their usefulness. In this blog post, we'll focus on the MapReduce component of Hadoop since we will be deploying a machine learning model, which is a compute-intensive process. MapReduce is a programming framework for data processing which is useful for processing large amounts of distributed data. MapReduce is able to handle errors and failures in the computation. MapReduce is also inherently parallel in nature but abstracts out that fact, making the code look like single-process code.</p><p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that I
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog
post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog
post</a>,
as a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog
post</a>,
and a gRPC service in this <a href="https://www.tekhnoal.com/grpc-ml-model-deployment.html">blog
post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment">github
repo</a>.</p>
<h1>Introduction</h1>
<p>Because of the growing need to process large amounts of data across many
computers, the Hadoop project was started in 2006. Hadoop is a set of
software components that help to solve large scale data processing
problems using clusters of computers. Hadoop supports mass data storage
through the HDFS component and large scale data processing through the
MapReduce component. Hadoop clusters have become a central part of the
infrastructure of many companies because of their usefulness.</p>
<p>In this blog post, we'll focus on the MapReduce component of Hadoop
since we will be deploying a machine learning model, which is a
compute-intensive process. MapReduce is a programming framework for data
processing which is useful for processing large amounts of distributed
data. MapReduce is able to handle errors and failures in the
computation. MapReduce is also inherently parallel in nature but
abstracts out that fact, making the code look like single-process code.</p>
<p>Hadoop and MapReduce are used almost exclusively to process large
data sets. Even though machine learning models are trained over large
data sets, we'll focus on using MapReduce to execute predictions. Hadoop
and MapReduce should be considered when a prediction batch job needs to
be executed on millions or billions of records. This blog post is
similar to <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">a previous blog
post</a>
that deployed an ML model as a batch job, but that post was focused on
small scale batch jobs that could run quickly on single machines.</p>
<p>Because the results of a batch prediction job are stored and accessed
later by clients, the user can't interact with the model directly. A
client that needs the model's predictions cannot request them from the
ML model software component; it must read them from the data set
produced by the batch job.</p>
<h1>Package Structure</h1>
<p>To begin, I set up the project structure for the job package:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">data</span> <span class="ss">(</span><span class="nv">data</span> <span class="nv">files</span> <span class="nv">used</span> <span class="k">for</span> <span class="nv">testing</span> <span class="nv">the</span> <span class="nv">job</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_map_reduce_job</span> <span class="ss">(</span><span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">map</span> <span class="nv">reduce</span> <span class="nv">job</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">ml_model_map_reduce_job</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">ml_model_manager</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span> <span class="nv">unit</span> <span class="nv">tests</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">mrjob</span>.<span class="nv">conf</span> <span class="ss">(</span><span class="nv">configuration</span> <span class="nv">file</span> <span class="k">for</span> <span class="nv">MapReduce</span> <span class="nv">framework</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen here in the <a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment">github
repository</a>.</p>
<h1>Building MapReduce Jobs</h1>
<p>A MapReduce job is made up of two basic steps: the map step and the
reduce step. Both steps are implemented as simple functions that receive
data, process it and return the results. The map step is responsible for
implementing filtering and sorting and the reduce step is responsible
for calculating aggregate results. The MapReduce system is responsible
for starting, managing, and stopping the code in the map and reduce
functions, for serializing and deserializing the data, and for managing
the redundancy and fault tolerance of the execution of the map and
reduce functions.</p>
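<p>To make the two steps concrete, here is a toy illustration in plain
Python, with no Hadoop involved: the mapper emits (key, value) pairs,
the framework sorts the pairs by key, and the reducer aggregates the
values for each key. The record fields are made up for the example.</p>

```python
from itertools import groupby

def mapper(record):
    # map step: transform each input record into (key, value) pairs
    yield record["species"], 1

def reducer(key, values):
    # reduce step: aggregate all of the values that share a key
    yield key, sum(values)

records = [{"species": "setosa"}, {"species": "setosa"}, {"species": "versicolor"}]

# the framework shuffles and sorts pairs by key before calling the reducer
pairs = sorted(pair for record in records for pair in mapper(record))
result = {}
for key, group in groupby(pairs, key=lambda pair: pair[0]):
    for out_key, out_value in reducer(key, (value for _, value in group)):
        result[out_key] = out_value

# result is now {"setosa": 2, "versicolor": 1}
```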
<p>The MapReduce implementation provided by Hadoop is able to do data
processing with map and reduce functions implemented in many different
programming languages by using the <a href="https://hadoop.apache.org/docs/r1.2.1/streaming.html">streaming
interface</a>.
In this blog post, we'll use this interface to run a model prediction
job using Python. This simplifies the deployment of the model greatly,
since we don't need to rewrite the model's prediction code in order to
deploy it to a Hadoop cluster. We'll be using the <a href="https://mrjob.readthedocs.io/en/latest/index.html">mrjob python
package</a>
to write the MapReduce job.</p>
<h1>Installing the Model</h1>
<p>In order to write a MapReduce job that is able to handle any machine
learning model, we'll start by installing a model into the environment.
For this we can use the same model we've used before, the iris_model
package. This package can be installed from a git repository with this
command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Now that we have the model installed in the environment, we can try it
out by opening a python interpreter and entering this code:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">iris_model.iris_predict</span> <span class="kn">import</span> <span class="n">IrisModel</span>
<span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">({</span><span class="s2">"sepal_length"</span><span class="p">:</span><span class="mf">1.1</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">})</span>
<span class="p">{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}</span>
</code></pre></div>
<p>To load the model inside of the MapReduce job, we'll point at the
IrisModel class in a configuration file. The configuration file looks
like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
    <span class="n">models</span> <span class="o">=</span> <span class="p">[{</span>
        <span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
        <span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
    <span class="p">}]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/config.py#L4-L12">here</a>.</p>
<p>This configuration will be used by the job to dynamically load the model
packages. The module_name and class_name fields allow the job to
import the class that contains the implementation of the model's
prediction algorithm. The models list can contain pointers to many
models, so there are no limitations to how many models can be hosted by
the MapReduce job.</p>
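<p>To make the dynamic loading concrete, here is a sketch of how a class
can be imported from such a configuration entry. The
load_model_class helper is hypothetical, and a standard-library class is
used as a stand-in for the real model class:</p>

```python
import importlib


def load_model_class(module_name: str, class_name: str):
    """Import a module by name and return the named class from it."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Using a standard-library class as a stand-in for the real model class:
model_class = load_model_class("collections", "OrderedDict")
print(model_class.__name__)  # OrderedDict
```

<p>The real job does essentially this for every entry in the models list,
then instantiates each class it finds.</p>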
<h1>Managing Models</h1>
<p>As in previous blog posts, we'll use a singleton object to manage the ML
model objects that will be used to make predictions. The class that the
singleton object is instantiated from is called "ModelManager". The
class is responsible for instantiating MLModel objects, managing the
instances, returning information about the MLModel objects, and
returning references to the objects when needed. The code for the
ModelManager class can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/model_manager.py">here</a>.
For a full explanation of the code in the class, read this <a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
post</a>.</p>
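<p>Although the real ModelManager code is linked above, the singleton
behavior it relies on can be sketched in a few lines. This is a
simplified, hypothetical illustration rather than the actual class:</p>

```python
class ModelManager:
    """Simplified sketch of a singleton: every call to ModelManager()
    returns the same instance, so models are only loaded once."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # Create the single instance on first use only.
            cls._instance = super().__new__(cls)
            cls._instance.models = {}
        return cls._instance


manager_a = ModelManager()
manager_b = ModelManager()
print(manager_a is manager_b)  # True
```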
<h1>MLModelMapReduceJob Class</h1>
<p>We now have the model package installed and the ModelManager class to
manage it, so we can start to write the MapReduce job itself. The
MapReduce job is defined as a subclass of the MRJob base class which
defines map() and reduce() methods that implement the functionality of
the job. To start, we'll load the right configuration by accessing the
APP_SETTINGS environment variable:</p>
<div class="highlight"><pre><span></span><code><span class="n">configuration</span> <span class="o">=</span> <span class="nb">__import__</span><span class="p">(</span><span class="s2">"model_mapreduce_job"</span><span class="p">)</span><span class="o">.</span> <span class="se">\</span>
    <span class="fm">__getattribute__</span><span class="p">(</span><span class="s2">"config"</span><span class="p">)</span><span class="o">.</span> <span class="se">\</span>
    <span class="fm">__getattribute__</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"APP_SETTINGS"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L12-L15">here</a>.</p>
<p>With the configuration loaded, we'll instantiate the ModelManager
singleton which will hold the references to the model objects that we
want to host in this MapReduce job:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L17-L19">here</a>.</p>
<p>By putting this initialization at the top of the module, we can be sure
that the models are initialized one time only, when the module is loaded
by the python interpreter.</p>
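<p>This works because Python caches imported modules in sys.modules: the
module body, including the model-loading code, executes only on the first
import. A small illustration:</p>

```python
import sys

import json  # first import executes the module body and caches it
first_reference = sys.modules["json"]

import json  # second import returns the cached module object
print(sys.modules["json"] is first_reference)  # True
```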
<p>Now we can write the class that makes up the MapReduce job:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelMapReduceJob</span><span class="p">(</span><span class="n">MRJob</span><span class="p">):</span>
<span class="n">INPUT_PROTOCOL</span> <span class="o">=</span> <span class="n">JSONValueProtocol</span>
<span class="n">OUTPUT_PROTOCOL</span> <span class="o">=</span> <span class="n">JSONProtocol</span>
<span class="n">DIRS</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'../model_mapreduce_job'</span><span class="p">]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L22-L29">here</a>.</p>
<p>The INPUT_PROTOCOL and OUTPUT_PROTOCOL class properties define the
input and output
<a href="https://mrjob.readthedocs.io/en/latest/guides/writing-mrjobs.html#protocols">protocols</a>
of the MapReduce steps. A protocol is a piece of code that reads and
writes data to the filesystem; it abstracts the map and reduce steps away
from the format in which the data is stored. The DIRS class
property tells the mrjob package that the code in this module depends on
code inside of the "model_mapreduce_job" directory, which causes mrjob to
copy that code whenever it creates a deployment package for this job.
These options help to simplify the code and deployment of the job.</p>
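<p>A protocol is simply an object with read() and write() methods. The
class below is a simplified stand-in for mrjob's own JSONValueProtocol,
shown only to illustrate what a protocol does; it is not the library's
implementation:</p>

```python
import json


class SimpleJSONValueProtocol:
    """Decode each input line to (None, value) and encode each output
    value back to a JSON line; the key is unused on the way in."""

    def read(self, line):
        return None, json.loads(line)

    def write(self, key, value):
        return json.dumps(value).encode("utf-8")


protocol = SimpleJSONValueProtocol()
key, value = protocol.read('{"sepal_length": 5.0}')
print(value)  # {'sepal_length': 5.0}
```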
<p>The job class needs to be initialized, so we'll add a __init__()
method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
    <span class="nb">super</span><span class="p">(</span><span class="n">MLModelMapReduceJob</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">options</span><span class="o">.</span><span class="n">model_qualified_name</span><span class="p">)</span>
    <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
        <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"'</span><span class="si">{}</span><span class="s2">' not found in the ModelManager instance."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">options</span><span class="o">.</span><span class="n">model_qualified_name</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L31-L38">here</a>.</p>
<p>The __init__ method first calls the MRJob base class's __init__
method so that it can do framework-level initialization. Next, we ask
the ModelManager singleton for an instance of the model that we want to
host in the MapReduce job. The qualified name of the model is accessed
from the self.options.model_qualified_name variable, which is set by a
command line option. Lastly, we check that a model object was actually
returned by the ModelManager and raise an exception if it wasn't.</p>
<p>Next, the MapReduce job must be able to run on any model that is inside
of the ModelManager instance. To support this, we will add a command
line option to the job that accepts the qualified name of the model we
want to run:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">configure_args</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
    <span class="nb">super</span><span class="p">(</span><span class="n">MLModelMapReduceJob</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">configure_args</span><span class="p">()</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">add_passthru_arg</span><span class="p">(</span><span class="s1">'--model_qualified_name'</span><span class="p">,</span> \
        <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s1">'Qualified name of the model.'</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L40-L43">here</a>.</p>
<p>This function allows us to extend the command line options already
supported by the MrJob framework. The command line argument passes
through the framework and is stored in the self.options object, which we
used in the code in the __init__ method to select the model we want
to use for the job.</p>
<p>Now that we have an initialized job class, we can write the code that
actually does the work of the MapReduce job. The mapper function looks
like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">mapper</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
    <span class="n">prediction</span> <span class="o">=</span> <span class="kc">None</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
        <span class="n">prediction</span> <span class="o">=</span> <span class="kc">None</span>
    <span class="k">yield</span> <span class="n">data</span><span class="p">,</span> <span class="n">prediction</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/model_mapreduce_job/ml_model_map_reduce_job.py#L45-L55">here</a>.</p>
<p>This function is simple: it receives a dictionary in the "data"
argument, makes a prediction with the model, and yields a tuple of the
prediction input and output. The data argument is a dictionary because
we used the "JSONValueProtocol" as the INPUT_PROTOCOL for this job.
This protocol deserializes a JSON string into a native Python object. By
using this protocol, we saved ourselves the trouble of having to
parse the input JSON ourselves in the mapper step. If the model fails to
make a prediction, then None is returned as the prediction. The
OUTPUT_PROTOCOL option is set to "JSONProtocol", which serializes the
key-value pair to two JSON strings separated by a tab character.</p>
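<p>The error-handling behavior of the mapper can be exercised in
isolation with a stand-in model object. The FakeModel class here is
hypothetical and only mimics the predict() interface:</p>

```python
class FakeModel:
    """Hypothetical stand-in that raises on inputs missing a field."""

    def predict(self, data):
        if "sepal_length" not in data:
            raise ValueError("bad schema")
        return {"species": "setosa"}


def map_record(model, data):
    """Mirror of the mapper logic: return (input, prediction),
    with None as the prediction when the model raises."""
    try:
        prediction = model.predict(data=data)
    except Exception:
        prediction = None
    return data, prediction


model = FakeModel()
good = map_record(model, {"sepal_length": 5.0})
bad = map_record(model, {"petal_width": 0.2})
print(good)  # ({'sepal_length': 5.0}, {'species': 'setosa'})
print(bad)   # ({'petal_width': 0.2}, None)
```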
<p>The output of the mapper step is always a key-value pair in which the
key must be unique across the inputs of the step. If any input is
repeated, the mapper step will make a prediction on it, but the
MapReduce framework will only return one result for the key to the next
step. This behavior sets up a limitation on our model: it must always
produce the same prediction given the same input, which is to say that
the model must make predictions deterministically. If the model is not
deterministic, the MapReduce framework will choose the first prediction
made for the input record. This may not matter in some situations but
may break any steps that use the results of this step if this behavior
is not handled correctly.</p>
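<p>The deduplication effect can be simulated directly: grouping
key-value pairs by their serialized key keeps only one value per
distinct input, which is why a non-deterministic model would silently
lose predictions. A small illustration:</p>

```python
import json

# Two identical inputs produce two mapper outputs...
mapper_output = [
    ({"sepal_length": 5.0}, {"species": "setosa"}),
    ({"sepal_length": 5.0}, {"species": "setosa"}),
]

# ...but grouping by key collapses them to a single result.
grouped = {json.dumps(key, sort_keys=True): value
           for key, value in mapper_output}
print(len(grouped))  # 1
```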
<p>This MapReduce job does not need a reduce step since we only need to
make predictions and return the results. However, this job can be
combined with other MapReduce jobs that do use reduce steps to make more
a complex data processing pipeline.</p>
<h1>Testing the Job</h1>
<p>Now that we have the code for the MapReduce job, we will test it locally
against a small data file. Because of the input and output protocol
options, the model is able to accept JSON files as input and it will
produce JSON files as output. Here is an example of the JSON that we
will feed to the job:</p>
<div class="highlight"><pre><span></span><code><span class="p">{</span><span class="w"> </span><span class="nt">"sepal_length"</span><span class="p">:</span><span class="w"> </span><span class="mf">5.0</span><span class="p">,</span><span class="w"> </span><span class="nt">"sepal_width"</span><span class="p">:</span><span class="w"> </span><span class="mf">3.2</span><span class="p">,</span><span class="w"> </span><span class="nt">"petal_length"</span><span class="p">:</span><span class="w"> </span><span class="mf">1.2</span><span class="p">,</span><span class="w"> </span><span class="nt">"petal_width"</span><span class="p">:</span><span class="w"> </span><span class="mf">0.2</span><span class="p">}</span><span class="w"></span>
<span class="p">{</span><span class="w"> </span><span class="nt">"sepal_length"</span><span class="p">:</span><span class="w"> </span><span class="mf">5.5</span><span class="p">,</span><span class="w"> </span><span class="nt">"sepal_width"</span><span class="p">:</span><span class="w"> </span><span class="mf">3.5</span><span class="p">,</span><span class="w"> </span><span class="nt">"petal_length"</span><span class="p">:</span><span class="w"> </span><span class="mf">1.3</span><span class="p">,</span><span class="w"> </span><span class="nt">"petal_width"</span><span class="p">:</span><span class="w"> </span><span class="mf">0.2</span><span class="p">}</span><span class="w"></span>
<span class="err">...</span><span class="w"></span>
</code></pre></div>
<p>The data file can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/data/input.ldjson">here</a>.</p>
<p>To execute the job locally, these commands need to be run:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
python model_mapreduce_job/ml_model_map_reduce_job.py <span class="se">\</span>
--model_qualified_name iris_model ./data/input.ldjson > data/output.ldjson
</code></pre></div>
<p>After the job runs, the output of the map step will be in the /data
folder. The input json string and resulting prediction will be on one
line of the file separated by a tab character. One input line had JSON
with a schema that the model could not accept, so the output should
contain a null prediction for that input. The --model_qualified_name
command line argument tells the job to use the iris_model model from
the ModelManager when running the job.</p>
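<p>Each output line can be parsed back into Python objects by splitting
on the tab character and decoding both sides as JSON. The helper below
is a hypothetical sketch of that parsing step:</p>

```python
import json


def parse_output_line(line):
    """Split a JSONProtocol output line into (key, value) objects."""
    key_json, value_json = line.rstrip("\n").split("\t")
    return json.loads(key_json), json.loads(value_json)


line = '{"sepal_length": 5.0}\t{"species": "setosa"}\n'
key, value = parse_output_line(line)
print(value)  # {'species': 'setosa'}
```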
<h1>Deploying to AWS</h1>
<p>The mrjob package supports running jobs in the <a href="https://aws.amazon.com/emr/">AWS Elastic Map
Reduce</a> (EMR) service. To run
the model job, we'll need an account in AWS. To interact with AWS, we'll
need to install the boto3 and awscli python packages:</p>
<div class="highlight"><pre><span></span><code>pip install boto3 awscli
</code></pre></div>
<p>Next we'll configure the API access keys. A set of access keys can be
generated and configured by <a href="https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html">following these
instructions</a>.
The configuration will look like this:</p>
<div class="highlight"><pre><span></span><code>aws configure
AWS Access Key ID <span class="o">[</span>*******************<span class="o">]</span>: xxxxxxxxxxxxxxxxxx
AWS Secret Access Key <span class="o">[</span>******************<span class="o">]</span>: xxxxxxxxxxxxxxxxxxx
Default region name <span class="o">[</span>us-east-2<span class="o">]</span>: us-east-1
Default output format <span class="o">[</span>None<span class="o">]</span>:
</code></pre></div>
<p>In order to run the model job in AWS EMR, we'll first need to configure
a default role for the job to assume. A simple way to do this is already
supported in the AWS CLI tool. The command looks like this:</p>
<div class="highlight"><pre><span></span><code>aws emr create-default-roles
</code></pre></div>
<p>In order to set up the execution environment on the nodes before we
run the model prediction code, we'll need to execute a few commands. The
mrjob package supports this through a configuration file called
mrjob.conf. The config file is written in YAML and looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">runners</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">emr</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="nt">bootstrap</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">sudo yum update -y</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">sudo yum install git -y</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">sudo pip-3.6 install -r ./requirements.txt#</span><span class="w"></span>
<span class="w"> </span><span class="nt">setup</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">export PYTHONPATH=$PYTHONPATH:model_mapreduce_job/#</span><span class="w"></span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">export APP_SETTINGS=ProdConfig</span><span class="w"></span>
</code></pre></div>
<p>The file can be found
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/mrjob.conf">here</a>.</p>
<p>The file is able to hold configuration for several types of runners;
for now we'll only configure the EMR runner. The bootstrap section holds
commands that will be executed one time, when the cluster node is first
created. In this section we're updating the yum package manager,
installing the git client, and installing all of the python dependencies
we need to run the model package from the
<a href="https://github.com/schmidtbri/map-reduce-ml-model-deployment/blob/master/requirements.txt">requirements.txt</a>
file in the project.</p>
<p>The setup section holds commands that will be executed whenever the
MapReduce job starts up. In this section, we are setting up the
PYTHONPATH environment variable that the python interpreter will need in
order to find the code files that make up the job. We are also setting
the APP_SETTINGS environment variable that tells the job which
environment it is running in; for now we're running the job with the
ProdConfig settings.</p>
<p>Now that we have the credentials and configuration set up, we can run
the job in AWS. The command looks like this:</p>
<div class="highlight"><pre><span></span><code>python model_mapreduce_job/ml_model_map_reduce_job.py <span class="se">\</span>
--conf-path<span class="o">=</span>./mrjob.conf -r emr --iam-service-role EMR_DefaultRole <span class="se">\</span>
--model_qualified_name iris_model ./data/input.ldjson
</code></pre></div>
<p>The mrjob package will create an S3 bucket for the job, upload the code
and data to the S3 bucket, create an EMR cluster for the job, and run
the job. The results of the job will be stored into the same S3 bucket.</p>
<h1>Closing</h1>
<p>By using the MapReduce framework, we are able to make a large number of
predictions on a cluster of computers. Because of the simple design of
the MapReduce framework, a lot of the complexities of running a job on
many computers are abstracted out. This deployment option for machine
learning models enables us to deploy model prediction jobs against truly
massive data sets.</p>
<p>By building the prediction job so that it uses the MLModel interface,
the deployment of a model as a MapReduce job is greatly simplified. The
MapReduce job that we built in this blog post is able to host any
machine learning model that uses the MLModel interface which makes the
code highly reusable. Once again, the MLModel interface allowed us to
abstract out the complexities of building a machine learning model from
the complexities of deploying a machine learning model.</p>
<p>One of the drawbacks of the implementation is the fact that it only
accepts LDJSON encoded files as input to the job. This is for the sake
of simplicity, since having the field names along with the data makes
the code easier to understand. An improvement to the code would be to
enable other protocols so that we can use other file types with the job.
Furthermore, it would be easy to make the choice of input and output
protocols a command line option that can be chosen at execution time.</p>
<h1>A gRPC Service ML Model Deployment</h1>
<p>Published 2020-01-20 by Brian Schmidt.</p>
<p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that I
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog post</a>,
inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog post</a>,
and a Kafka streaming application in this <a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">blog post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/grpc-ml-model-deployment">github repo</a>.</p>
<h1>Introduction</h1>
<p>With the rise of service oriented architectures and microservice
architectures, the <a href="https://grpc.io/">gRPC</a> system has
become a popular choice for building services. gRPC is a fairly new
system for doing inter-service communication through Remote Procedure
Calls (RPC) that started in Google in 2015. A remote procedure call is
an abstraction that allows a developer to make a call to a function that
runs in a separate process, but that looks like it executes locally.
gRPC is a standard for defining the data exchanged in an RPC call and
the API of the function through <a href="https://developers.google.com/protocol-buffers">protocol
buffers</a>.
gRPC also supports many other features, such as simple and streaming RPC
invocations, authentication, and load balancing.</p>
<p>Protocol buffers are defined through an interface definition language,
and the code that actually does the serialization/deserialization is
then generated from the definition. Once a protocol buffer definition
file is created, the protocol buffer definition can be compiled into
many different programming languages through a compiler. This allows
gRPC to be a cross-language standard for a common exchange format
between services.</p>
<p>gRPC services are coded in much the same way as a regular web service
but have several differences that will affect the service we'll build in
this blog post. First, protocol buffers are statically typed, which
makes the serialized data packages smaller but allows for less
flexibility in the code of the service. Second, protocol buffers must be
compiled to source code, which makes it harder to evolve services that
use them. Lastly, a protocol buffer is a binary data structure that is
optimized for size and processing speed, whereas a JSON data structure
is a string-based data structure optimized for simplicity and
readability. In <a href="https://dev.to/plutov/benchmarking-grpc-and-rest-in-go-565">performance
comparisons</a>,
protocol buffers have been found to be many times faster than JSON.</p>
<p>In previous blog posts, we've used JSON exclusively, to keep things
simple. JSON allowed the services and applications to deserialize the
data structure and send it directly to the model without having to worry
about the contents of the data structure. This is not possible with gRPC
since the service requires explicit knowledge of the schema of the
model's incoming and outgoing data.</p>
<h1>Package Structure</h1>
<div class="highlight"><pre><span></span><code>- model_grpc_service (python package for service)
    - __init__.py
    - config.py (configuration for the application)
    - ml_model_grpc_endpoint.py (MLModel gRPC endpoint class)
    - model_manager.py (model manager singleton class)
    - service.py (service code)
- scripts
    - client.py (single prediction test)
    - generate_proto.py
- tests (unit tests)
- Dockerfile
- Makefile
- model_service.proto (protocol buffer definition of gRPC service)
- model_service_pb2.py (python protocol buffer code)
- model_service_pb2_grpc.py (python gRPC service bindings)
- model_service_template.proto (protocol buffer template file)
- README.md
- requirements.txt
- setup.py
- test_requirements.txt
</code></pre></div>
<p>This structure can be seen in the <a href="https://github.com/schmidtbri/grpc-ml-model-deployment">github
repository</a>.</p>
<h1>Installing the Model</h1>
<p>In order to create a gRPC service for ML models we'll first install a
model package into the environment. We'll use the iris_model package,
which has been used in
<a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">several</a>
<a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
posts</a>.
The model package itself was created in <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">this blog
post</a>.
The model package can be installed from its git repository with this
command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Now that we have the model package in the environment, we can add it to
the config.py module:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
    <span class="n">models</span> <span class="o">=</span> <span class="p">[{</span>
        <span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
        <span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
    <span class="p">}]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/config.py#L4-L12">here</a>.</p>
<p>This configuration class is used by the service in all environments. The
module_name and class_name fields allow the application to find the
MLModel class that implements the prediction functionality of the
iris_model package. The list can hold information for many models, so
there's no limitation to how many models can be hosted by the service.</p>
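<p>For example, hosting a second model would only require adding another
entry to the list. Here is a sketch of what that might look like, with a
purely hypothetical second model package:</p>

```python
class Config(dict):
    """Configuration class listing the models hosted by the service."""
    models = [{
        "module_name": "iris_model.iris_predict",
        "class_name": "IrisModel"
    }, {
        # hypothetical second model, shown only to illustrate the list format
        "module_name": "other_model.other_predict",
        "class_name": "OtherModel"
    }]
```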
<p>The reason that we need to install the model package before we can write
any other code is that the model's input and output schemas are
needed to define the gRPC service's API.</p>
<h1>Generating a Protocol Buffer Definition</h1>
<p>Since we can't code the gRPC service until we have a .proto file with
the definition of the service's API, our first task is to generate a
.proto file from the models that will be hosted by the service. To
automatically generate the file from the iris_model's input and
output schemas we'll use the <a href="https://jinja.palletsprojects.com/en/2.10.x/">Jinja2 templating
tool</a>. Jinja2
generates documents by combining a template file with a data structure:
it allows a developer to isolate the unchanging parts of a document in
the template, and keeps the parts that change in the data structure.
First we'll create a template, and after that we'll add the schema
information to it to generate a .proto file for the service.</p>
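<p>As a quick illustration of how Jinja2 combines a template with a data
structure, here is a minimal sketch that renders a tiny template; it is
unrelated to the actual service template, which is shown below:</p>

```python
import jinja2

# the static text lives in the template; the variable parts come from the
# data structure passed to render()
template = jinja2.Template("message {{ name }}_input {}")

# rendering combines the template with the data to produce the document
rendered = template.render(name="iris_model")
```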
<h2>The Template File</h2>
<p>First we'll create the <a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto">template
file</a>
from which we'll generate the .proto file:</p>
<div class="highlight"><pre><span></span><code>syntax = "proto3";
package model_grpc_service;
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto#L1-L3">here</a>.</p>
<p>At the top of the template, we declare that we'll use the proto3 format,
and the name of the package is "model_grpc_service". Next, we'll
declare some data structures:</p>
<div class="highlight"><pre><span></span><code>message empty {}
message model {
string qualified_name = 1;
string display_name = 2;
string description = 3;
sint32 major_version = 4;
sint32 minor_version = 5;
string input_type = 6;
string output_type = 7;
string predict_operation = 8;
}
message model_collection {
repeated model models = 1;
}
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto#L5-L20">here</a>.</p>
<p>These data structures will be used by an operation that will be declared
further down in the template. The data structures hold information about
the models that are hosted by the service, including the names of the
input and output types and the name of the prediction operation for the
model. The model_collection type holds a list of model objects.</p>
<p>Next, we'll generate an input type for the models hosted by the service:</p>
<div class="highlight"><pre><span></span><code><span class="cp">{%</span> <span class="k">for</span> <span class="nv">model</span> <span class="k">in</span> <span class="nv">models</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">message </span><span class="cp">{{</span> <span class="nv">model.qualified_name</span> <span class="cp">}}</span><span class="x">_input { </span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">for</span> <span class="nv">field</span> <span class="k">in</span> <span class="nv">model.input_schema</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> </span><span class="cp">{{</span> <span class="nv">field.type</span> <span class="cp">}}</span><span class="x"> </span><span class="cp">{{</span> <span class="nv">field.name</span> <span class="cp">}}</span><span class="x"> = </span><span class="cp">{{</span> <span class="nv">field.index</span> <span class="cp">}}</span><span class="x">;</span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">}</span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto#L22-L26">here</a>.</p>
<p>This template code uses the qualified name of a model and the schema of
the input of the model to generate a protocol buffer type that matches
the model's input. The name of the input type for a model always follows
this pattern: "<model_qualified_name>_input". Each field in the
input schema of the model is translated to the equivalent field type in
a protocol buffer and is given the same name. Lastly, an index is
generated and assigned to the field.</p>
<p>Next, we'll do the same for the output schema of the model:</p>
<div class="highlight"><pre><span></span><code><span class="cp">{%</span> <span class="k">for</span> <span class="nv">model</span> <span class="k">in</span> <span class="nv">models</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">message </span><span class="cp">{{</span> <span class="nv">model.qualified_name</span> <span class="cp">}}</span><span class="x">_output { </span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">for</span> <span class="nv">field</span> <span class="k">in</span> <span class="nv">model.output_schema</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> </span><span class="cp">{{</span> <span class="nv">field.type</span> <span class="cp">}}</span><span class="x"> </span><span class="cp">{{</span> <span class="nv">field.name</span> <span class="cp">}}</span><span class="x"> = </span><span class="cp">{{</span> <span class="nv">field.index</span> <span class="cp">}}</span><span class="x">;</span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">}</span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto#L27-L31">here</a>.</p>
<p>Now we can start to define the service's API:</p>
<div class="highlight"><pre><span></span><code><span class="x">service ModelgRPCService {</span>
<span class="x"> rpc get_models (empty) returns (model_collection) {}</span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">for</span> <span class="nv">model</span> <span class="k">in</span> <span class="nv">models</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> rpc </span><span class="cp">{{</span> <span class="nv">model.qualified_name</span> <span class="cp">}}</span><span class="x">_predict (</span><span class="cp">{{</span> <span class="nv">model.qualified_name</span> <span class="cp">}}</span><span class="x">_input) returns (</span><span class="cp">{{</span> <span class="nv">model.qualified_name</span> <span class="cp">}}</span><span class="x">_output) {}</span>
<span class="x"> </span><span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">}</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service_template.proto#L33-L38">here</a>.</p>
<p>This code defines the operations that the service implements. The first
operation is called "get_models" and it uses the first set of protobuf
data structures that we defined above. This operation is simple since it
does not change with the models that are being hosted by the gRPC
service. It accepts the "empty" type since it does not require any
inputs, and it returns the "model_collection" type.</p>
<p>Next, we will generate a set of prediction operations, one for each
model hosted by the service. The name of the predict operation always
follows this pattern: "<model_qualified_name>_predict". The model's
input and output types are added to the operation by name.</p>
<h2>Using the Template File</h2>
<p>This template file is now ready to be used, so we'll create a python
script that will take it and add information about the models that we
actually want to host in the service. The script to do this is in the
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/generate_proto.py">generate_proto.py
script</a>.</p>
<p>This code will make use of the ModelManager class that has been used in
<a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">several</a>
<a href="https://www.tekhnoal.com/streaming-ml-model-deployment.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
posts</a>.
The ModelManager class is responsible for loading models from
configuration, maintaining references to the model objects, and
returning information about the models. In this section we'll use the
get_models() and get_model_metadata() operations to access the
information needed to generate the protocol buffer definition.</p>
<p>The script starts by instantiating the ModelManager and loading the
models from the configuration:</p>
<div class="highlight"><pre><span></span><code><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/generate_proto.py#L9-L10">here</a>.</p>
<p>Then the script loads the Jinja2 template file:</p>
<div class="highlight"><pre><span></span><code><span class="n">template_loader</span> <span class="o">=</span> <span class="n">jinja2</span><span class="o">.</span><span class="n">FileSystemLoader</span><span class="p">(</span><span class="n">searchpath</span><span class="o">=</span><span class="s2">"./"</span><span class="p">)</span>
<span class="n">template_env</span> <span class="o">=</span> <span class="n">jinja2</span><span class="o">.</span><span class="n">Environment</span><span class="p">(</span><span class="n">loader</span><span class="o">=</span><span class="n">template_loader</span><span class="p">)</span>
<span class="n">template</span> <span class="o">=</span> <span class="n">template_env</span><span class="o">.</span><span class="n">get_template</span><span class="p">(</span><span class="s2">"model_service_template.proto"</span><span class="p">)</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/generate_proto.py#L23-L25">here</a>.</p>
<p>Now that the template is loaded, we can generate the data structure that
will be passed to the template:</p>
<div class="highlight"><pre><span></span><code><span class="n">models</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">():</span>
<span class="n">model_details</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">])</span>
<span class="n">models</span><span class="o">.</span><span class="n">append</span><span class="p">({</span>
<span class="s2">"qualified_name"</span><span class="p">:</span> <span class="n">model_details</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">],</span>
<span class="s2">"input_schema"</span><span class="p">:</span> <span class="p">[{</span>
<span class="s2">"index"</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"name"</span><span class="p">:</span> <span class="n">field_name</span><span class="p">,</span>
<span class="s2">"type"</span><span class="p">:</span> <span class="n">type_mappings</span><span class="p">[</span><span class="n">model_details</span><span class="p">[</span><span class="s2">"input_schema"</span><span class="p">][</span><span class="s2">"properties"</span><span class="p">][</span><span class="n">field_name</span><span class="p">][</span><span class="s2">"type"</span><span class="p">]]</span>
<span class="p">}</span> <span class="k">for</span> <span class="n">index</span><span class="p">,</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">model_details</span><span class="p">[</span><span class="s2">"input_schema"</span><span class="p">][</span><span class="s2">"properties"</span><span class="p">])],</span>
<span class="s2">"output_schema"</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span>
<span class="s2">"index"</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span>
<span class="s2">"name"</span><span class="p">:</span> <span class="n">field_name</span><span class="p">,</span>
<span class="s2">"type"</span><span class="p">:</span> <span class="n">type_mappings</span><span class="p">[</span><span class="n">model_details</span><span class="p">[</span><span class="s2">"output_schema"</span><span class="p">]</span> <span class="p">[</span><span class="s2">"properties"</span><span class="p">][</span><span class="n">field_name</span><span class="p">][</span><span class="s2">"type"</span><span class="p">]]</span>
<span class="p">}</span> <span class="k">for</span> <span class="n">index</span><span class="p">,</span> <span class="n">field_name</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">model_details</span><span class="p">[</span><span class="s2">"output_schema"</span><span class="p">][</span><span class="s2">"properties"</span><span class="p">])]</span>
<span class="p">})</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/generate_proto.py#L28-L47">here</a>.</p>
<p>The code builds a dictionary for each model in the ModelManager,
containing the model's qualified name, input schema, and output schema.
The python data types are converted to the equivalent protocol buffer
types as it goes along. The resulting list of dictionaries is the data
structure that is used by the Jinja2 template defined above to generate
a protocol buffer definition.</p>
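<p>The type_mappings dictionary referenced in the code above translates
the type names found in the models' JSON schemas into protocol buffer
type names. A minimal sketch of such a mapping is shown here with assumed
type pairs; the actual mapping is defined in the generate_proto.py
script:</p>

```python
# hypothetical mapping from JSON schema types to protocol buffer types;
# the real mapping lives in the generate_proto.py script
type_mappings = {
    "number": "double",
    "integer": "sint64",
    "string": "string",
    "boolean": "bool",
}

# a field's schema type is translated just like in the script above
proto_type = type_mappings["number"]
```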
<p>Lastly, we'll render the template with the information we just extracted
from the models and then save the generated file to disk:</p>
<div class="highlight"><pre><span></span><code><span class="n">output_text</span> <span class="o">=</span> <span class="n">template</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">models</span><span class="o">=</span><span class="n">models</span><span class="p">)</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">output_file</span><span class="p">,</span> <span class="s2">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">output_text</span><span class="p">)</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/generate_proto.py#L50-L53">here</a>.</p>
<p>Now that the template and the script that uses it are complete, we can
generate a protocol buffer definition for the service with these
commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python scripts/generate_proto.py --output_file<span class="o">=</span>model_service.proto
</code></pre></div>
<p>The file generated by the command above is called "model_service.proto"
and it can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_service.proto">here</a>.
The protocol buffer definition contains the types needed for the
get_models operation as well as the operation itself. It also contains
the types and operations needed to interact with the iris_model, which
were automatically extracted from the information provided by the model.</p>
<p>By using a template and script approach to generating a protocol buffer
definition we are able to host any number of models inside of the gRPC
service. This is possible because every model that will be hosted is
required to expose its input and output schema through the MLModel
interface.</p>
<h1>Defining the Service</h1>
<p>Now that we have a protocol buffer definition for the gRPC service we
can actually start writing the code to implement the service itself. To
do this, we first need to compile the protocol buffer into its python
implementation. This is done with this command:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python -m grpc_tools.protoc --proto_path<span class="o">=</span>. --python_out<span class="o">=</span>. --grpc_python_out<span class="o">=</span>. model_service.proto
</code></pre></div>
<p>This command generates two files: the model_service_pb2.py file and
the model_service_pb2_grpc.py file. The model_service_pb2.py file
contains the python data structures that will serialize and deserialize
from native python types to the protocol buffer binary format. The
model_service_pb2_grpc.py file contains the bindings that will allow
us to write a service that implements the operations defined in the
protocol buffer definition and also to write client code that can call
the implementations.</p>
<p>We'll start by creating a python file that contains the main service
codebase. We'll also implement the get_models operation in this file,
since it is a static endpoint that does not depend on the presence of
any particular model.</p>
<p>The gRPC service is defined as a class that inherits from a "Servicer"
class that was generated by the protoc compiler:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelgRPCServiceServicer</span><span class="p">(</span><span class="n">model_service_pb2_grpc</span><span class="o">.</span><span class="n">ModelgRPCServiceServicer</span><span class="p">):</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/service.py#L20-L21">here</a>.</p>
<p>Within the class, each operation is defined as a method with the same
name as the operation in the .proto file. The get_models operation is
defined like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_models</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">request</span><span class="p">,</span> <span class="n">context</span><span class="p">):</span>
<span class="n">model_data</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="n">models</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">m</span> <span class="ow">in</span> <span class="n">model_data</span><span class="p">:</span>
<span class="n">response_model</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">m</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">],</span>
<span class="n">display_name</span><span class="o">=</span><span class="n">m</span><span class="p">[</span><span class="s2">"display_name"</span><span class="p">],</span>
<span class="n">description</span><span class="o">=</span><span class="n">m</span><span class="p">[</span><span class="s2">"description"</span><span class="p">],</span>
<span class="n">major_version</span><span class="o">=</span><span class="n">m</span><span class="p">[</span><span class="s2">"major_version"</span><span class="p">],</span>
<span class="n">minor_version</span><span class="o">=</span><span class="n">m</span><span class="p">[</span><span class="s2">"minor_version"</span><span class="p">],</span>
<span class="n">input_type</span><span class="o">=</span><span class="s2">"</span><span class="si">{}</span><span class="s2">_input"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">]),</span>
<span class="n">output_type</span><span class="o">=</span><span class="s2">"</span><span class="si">{}</span><span class="s2">_output"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">]),</span>
<span class="n">predict_operation</span><span class="o">=</span><span class="s2">"</span><span class="si">{}</span><span class="s2">_predict"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">]))</span>
<span class="n">models</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">response_model</span><span class="p">)</span>
<span class="n">response_models</span> <span class="o">=</span> <span class="n">model_collection</span><span class="p">()</span>
<span class="n">response_models</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">models</span><span class="p">)</span>
<span class="k">return</span> <span class="n">response_models</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/service.py#L33-L52">here</a>.</p>
<p>The operation does not receive any data in the request and returns a
model_collection data structure in the response. The model_collection
data structure was defined in the .proto file and compiled into a python
class by the protoc compiler. In order to fill the model_collection, we
iterate through the data returned by the ModelManager creating a list of
model objects as we go along. We then create the model_collection from
the list and return it to the client.</p>
<h1>MLModelgRPCEndpoint Class</h1>
<p>In order for the service to host any model that uses the MLModel base
class, we'll need to create a class that translates the protocol buffer
data structures into the native python data structures used by the
models. This class will be instantiated for every model that is hosted
by the service.</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelgRPCEndpoint</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/ml_model_grpc_endpoint.py#L12-L13">here</a>.</p>
<p>When the service starts up, we'll create one instance of this class
for every model. The __init__ method looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model_qualified_name</span><span class="p">):</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"'</span><span class="si">{}</span><span class="s2">' not found in ModelManager instance."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">))</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Initializing endpoint for model: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/ml_model_grpc_endpoint.py#L15-L30">here</a>.</p>
<p>The __init__ method has one argument called "model_qualified_name"
which tells the endpoint class which model it will be hosting. The
__init__ method gets a reference to the ModelManager object that is
managed by the service, then it gets a reference to the model object
from the ModelManager object using the model_qualified_name argument.
Lastly, we check that the model instance is actually available in the
ModelManager, raising a ValueError if it is not.</p>
<p>Now that we have an instance of the endpoint for the MLModel object, we
need to write a method that will make the predict method available as a
gRPC endpoint. We'll do this by defining the __call__ method on the
endpoint class. When a <a href="https://www.journaldev.com/22761/python-callable-__call__">__call__
method</a>
is attached to a class, it turns all instances of the class into
callables, which allows instances of the class to be used like
functions. This will be useful later when we need to initialize a
dynamic number of endpoints in the gRPC service.</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">request</span><span class="p">,</span> <span class="n">context</span><span class="p">):</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">MessageToDict</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="n">preserving_proto_field_name</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">output_protobuf_name</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">_output"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="n">output_protobuf</span> <span class="o">=</span> <span class="n">MLModelgRPCEndpoint</span><span class="o">.</span><span class="n">_get_protobuf</span><span class="p">(</span><span class="n">output_protobuf_name</span><span class="p">)</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">output_protobuf</span><span class="p">(</span><span class="o">**</span><span class="n">prediction</span><span class="p">)</span>
<span class="k">return</span> <span class="n">response</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/ml_model_grpc_endpoint.py#L32-L50">here</a>.</p>
<p>The method uses the MessageToDict function from the protobuf package to
turn a protocol buffer data structure into a Python dictionary. The
dictionary is then passed into the model's predict method and a
prediction is returned.</p>
<p>Now that we have a prediction, we have to find the right protocol buffer
data structure in which to return the prediction result to the client.
To do this, a special method called
"<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/ml_model_grpc_endpoint.py#L52-L54">_get_protobuf</a>"
is used, which goes into the model_service_pb2.py module where the
python protocol buffer definitions are stored and dynamically imports
the correct class for the output of the model. For example, the
iris_model's output protocol buffer definition is called
"iris_model_output". This lookup is possible because the output
protocol buffer of a model is always named according to the same
pattern. In the last step, we hand the model's prediction over to the
protocol buffer class, which initializes itself with the prediction
data, and we return the resulting object.</p>
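<p>This kind of lookup can be done with Python's built-in getattr
function. Here is a simplified sketch of the idea, using a stand-in
namespace object in place of the real model_service_pb2 module:</p>

```python
import types

# stand-in for the generated model_service_pb2 module; in the real service
# this attribute would be a protocol buffer message class
model_service_pb2 = types.SimpleNamespace(iris_model_output=dict)

def _get_protobuf(protobuf_name):
    # look up the protocol buffer class by name at runtime
    return getattr(model_service_pb2, protobuf_name)

# the output type name always follows the "{qualified_name}_output" pattern
output_class = _get_protobuf("iris_model_output")
```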
<h1>Creating gRPC Endpoints Dynamically</h1>
<p>Now that we have a class that can handle any model object, we need to
connect it to the service. To do this, we'll create an __init__
method in the service class that will execute when the service starts
up:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
<span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">():</span>
<span class="n">endpoint</span> <span class="o">=</span> <span class="n">MLModelgRPCEndpoint</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">])</span>
<span class="n">operation_name</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">_predict"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">])</span>
<span class="nb">setattr</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">operation_name</span><span class="p">,</span> <span class="n">endpoint</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/model_grpc_service/service.py#L23-L31">here</a>.</p>
<p>The __init__ method first instantiates the ModelManager class and
loads the models listed in the configuration. Once the models are in
memory, we create an endpoint object for each one in a loop. For each
model, we create an MLModelgRPCEndpoint object which is given the
model's qualified name. Then we generate the model's operation name
which matches the operation name for the model's predict operation
listed in the .proto file. For example, the iris_model's predict
operation is named "iris_model_predict". Lastly, we use the operation
name and dynamically set an attribute on the service class that attaches
the newly created endpoint to the class. This last step allows the
service to find the right endpoint for the operation when a call for a
prediction from a certain model is received. Because each endpoint
object is callable, the service can invoke it as if it were a method of
the class, even though it is actually an instance of a separate endpoint
class.</p>
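<p>The idea can be sketched with a minimal, hypothetical example: a callable object assigned to an instance with setattr can be invoked through the instance just like a method:</p>

```python
class Endpoint:
    """A callable object standing in for one service operation."""

    def __init__(self, name):
        self.name = name

    def __call__(self, request):
        # a real endpoint would call the model's predict method;
        # here we just echo the request back
        return "{} handled {}".format(self.name, request)


class Service:
    pass


service = Service()

# attach the endpoint under the generated operation name, just as the
# service's __init__ does with setattr
setattr(service, "iris_model_predict", Endpoint("iris_model_predict"))

# the callable attribute can now be invoked like a method
print(service.iris_model_predict("request"))
```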
<h1>Using the Service</h1>
<p>We now have a complete service that we can test out. To do this we'll
execute these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
python model_grpc_service/service.py
</code></pre></div>
<p>In order to test out the service, I created a simple script that sends a
single gRPC request to the service. The script is found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/client.py">here</a>.
To send a request to the get_models operation, the code looks like
this:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="n">grpc</span><span class="o">.</span><span class="n">insecure_channel</span><span class="p">(</span><span class="s2">"localhost:50051"</span><span class="p">)</span> <span class="k">as</span> <span class="n">channel</span><span class="p">:</span>
<span class="n">stub</span> <span class="o">=</span> <span class="n">ModelgRPCServiceStub</span><span class="p">(</span><span class="n">channel</span><span class="p">)</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">stub</span><span class="o">.</span><span class="n">get_models</span><span class="p">(</span><span class="n">empty</span><span class="p">())</span>
<span class="nb">print</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/grpc-ml-model-deployment/blob/master/scripts/client.py#L9-L15">here</a>.</p>
<p>To send a test request to the iris_model_predict operation of the
service, execute this command:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python scripts/client.py --iris_model_predict
</code></pre></div>
<p>The script will contact the service running locally, make a prediction
with some sample data and print out the prediction result.</p>
<h1>Closing</h1>
<p>In this blog post we've shown how to deploy an ML model inside a gRPC
service. As gRPC becomes more popular, the option of deploying ML models
as gRPC services is becoming more attractive. As in previous blog posts,
we've built the service so that it can support any number of ML models,
as long as they implement the ML Model interface. This is one more type
of deployment that we implemented without having to modify the
iris_model package. The ability to deploy an ML model in different ways
without having to rewrite any part of the model code is very valuable
and ensures good software engineering practices.</p>
<p>By using gRPC to deploy an MLModel, we're able to take advantage of all
of the features of gRPC. These benefits include lightweight and fast
serialization of messages and built-in support for streaming. The
ability to document a service API using protocol buffers also simplifies
the documentation and roll out of a new service. Lastly, the ability to
compile service and client codebases from the protocol buffer
definitions allows us to avoid many common errors.</p>
<p>In previous blog posts, deploying a new model was as simple as
installing the model package into the environment and adding it to the
configuration of the application. The schema of the model's inputs and
outputs did not affect the application code at all. In the code of this
blog post, we have to do more work because of the nature of protocol
buffers, since the generated code in the project is specific to a set of
models. Because of this, adding a new model to the gRPC service requires
us to generate a new .proto file from the model's input and output
schemas, generate python code from the .proto file, and finally add the
model to the configuration of the service. The extra steps make it more
complex to deploy the service.</p>
<p>In the future, the service could be improved by handling more complex
schemas, since currently the schema mapping between native python types
and protocol buffers only supports simple data structures. Another way
to improve the service is to add support for streaming endpoints for
each model. Lastly, protocol buffers have a mechanism for evolving
message schemas; the code could be improved by safely evolving the schema
of the service through this mechanism when the model schema changes.</p>
<h1>A Streaming ML Model Deployment</h1>
<p>December 29, 2019, by Brian Schmidt</p>
<p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>In this blog post I'll show how to deploy the same ML model that I
deployed as a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
as a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog
post</a>,
and inside an AWS Lambda in this <a href="https://www.tekhnoal.com/lambda-ml-model-deployment.html">blog
post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/streaming-ml-model-deployment">github
repo</a>.</p>
<h1>Introduction</h1>
<p>In general, when a client communicates with a software service two
patterns are available: synchronous and asynchronous communication. When
doing synchronous communication, a message is sent to the service which
blocks the sender until the operation is done and the result is returned
to the client. With an asynchronous message, the service receives the
message and does not block the sender of the message while it does the
processing. We've already seen an asynchronous deployment for a machine
learning model in a <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">previous blog
post</a>.
In this blog post, we'll show a similar type of deployment that is
useful in different situations. We'll be focusing on deploying an ML
model as part of a stream processing system.</p>
<p><a href="https://en.wikipedia.org/wiki/Stream_processing">Stream processing</a>
is a data processing paradigm that treats a dataset as an unending
stream of ordered records. A stream processor works by receiving a
record from a data stream, processing it, and putting it in another data
stream. This approach is different from <a href="https://en.wikipedia.org/wiki/Batch_processing">batch
processing</a>,
in which a process sees a data set as a batch of records that are
processed together in one processing run. Stream processing is
inherently asynchronous, since a producer of records does not have to
coordinate with the process that consumes the records.</p>
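<p>The distinction can be sketched with a toy example; the doubling "model" here is purely illustrative. A stream processor sees one record at a time, while a batch processor operates on the whole dataset in one run:</p>

```python
def double(record):
    return record * 2

def stream_process(records):
    # a stream processor consumes one record at a time and emits
    # results as it goes; it never holds the whole dataset
    for record in records:
        yield double(record)

def batch_process(records):
    # a batch processor receives the entire dataset in one run
    return [double(record) for record in records]

stream_results = list(stream_process(iter([1, 2, 3])))
batch_results = batch_process([1, 2, 3])
print(stream_results, batch_results)  # [2, 4, 6] [2, 4, 6]
```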
<p>In order for a stream processor to receive messages from producers, a
<a href="https://en.wikipedia.org/wiki/Message_broker">message broker</a> is
often used. In this case, the message broker acts as middleware that
enables producers and consumers to communicate without being explicitly
aware of each other. The message broker allows the system to be more
<a href="https://en.wikipedia.org/wiki/Service_loose_coupling_principle">decoupled</a>
than in other types of software architectures.</p>
<p>In a <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">previous blog
post</a>,
we used Redis as a message broker to deploy a model inside a task queue.
One difference between this blog post and that one is
the lack of a result backend, since we are not going to store the
results of a prediction into a result store for later retrieval. The ML
model stream processor we'll build will pick up data used for prediction
from the message broker and put the resulting predictions back into the
message broker. Instead of Redis, we'll be using Kafka as the message
broker.</p>
<h1>Software Architecture</h1>
<p><img alt="Architecture" src="https://www.tekhnoal.com/architecture.png" width="100%"></p>
<p>The model stream processor application we will build will communicate
with other software components through topics on a message broker. A
topic is a channel of communication that exists in a message broker. A
software service can "produce" messages to a topic and also "consume"
messages from a topic. Each model will need three topics for its own
use: an input topic from which it will receive data used to make
predictions, an output topic to which it will write the prediction
results, and an error topic to which it will write any input messages
that caused an error to occur. The error topic is essentially an
<a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/InvalidMessageChannel.html">invalid message
channel</a>
for the model.</p>
<h1>Kafka for Stream Processing</h1>
<p>To show how to deploy an ML model as a stream processor, we'll be using
<a href="https://en.wikipedia.org/wiki/Apache_Kafka">Kafka</a> as the
message broker service. Over the last few years, Kafka has become an
important tool for doing stream processing because of its high
performance and rich tool ecosystem.</p>
<p>To connect to Kafka from python, we'll use the <a href="https://aiokafka.readthedocs.io/en/stable/">aiokafka python
library</a>. This
library can be used to produce and consume messages on kafka as well as
other operations. The aiokafka library uses the <a href="https://realpython.com/async-io-python/">asyncio
library</a> to
improve the performance of the application. Asyncio is a library in
python that helps to write concurrent code that performs IO-bound
operations in a more performant manner. The async/await syntax will
appear in the code of this blog post; I won't go out of my way to
explain it since there are many better places to learn about this
programming paradigm.</p>
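<p>As a minimal illustration (unrelated to Kafka), a coroutine defined with async def suspends at each await, letting the event loop run other IO-bound work in the meantime:</p>

```python
import asyncio

async def fetch(name):
    # "await" suspends this coroutine so the event loop can run
    # other IO-bound work in the meantime
    await asyncio.sleep(0)
    return "result for {}".format(name)

# the application's main() uses asyncio.get_event_loop() the same way;
# new_event_loop() is used here so the sketch runs anywhere
loop = asyncio.new_event_loop()
result = loop.run_until_complete(fetch("iris_model"))
loop.close()
print(result)  # result for iris_model
```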
<h1>Package Structure</h1>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_stream_processor</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">app</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">application</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">configuration</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">application</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">ml_model_stream_processor</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">MLModel</span> <span class="nv">stream</span> <span class="nv">processor</span> <span class="nv">class</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_manager</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">model</span> <span class="nv">manager</span> <span class="nv">singleton</span> <span class="nv">class</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">scripts</span>
<span class="o">-</span> <span class="nv">create_topics</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">script</span> <span class="k">for</span> <span class="nv">automating</span> <span class="nv">topic</span> <span class="nv">creation</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">receive_messages</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">script</span> <span class="k">for</span> <span class="nv">receiving</span> <span class="nv">messages</span> <span class="nv">from</span> <span class="nv">a</span> <span class="nv">topic</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">send_messages</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">script</span> <span class="k">for</span> <span class="nv">sending</span> <span class="nv">messages</span> <span class="nv">to</span> <span class="nv">a</span> <span class="nv">topic</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">unit</span> <span class="nv">test</span> <span class="nv">suite</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">docker</span><span class="o">-</span><span class="nv">compose</span>.<span class="nv">yml</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen in the <a href="https://github.com/schmidtbri/streaming-ml-model-deployment">github
repository</a>.</p>
<h1>MLModelStreamProcessor Class</h1>
<p>To be able to have an MLModel that sends and receives data from Kafka
topics, we'll write a class that wraps around an MLModel instance. The
class will take care of finding and connecting to Kafka brokers,
serializing and deserializing the messages from Kafka, and detecting
errors.</p>
<p>We'll start by creating the class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelStreamProcessor</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="sd">"""Processor class for MLModel stream processors."""</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/ml_model_stream_processor.py#L12-L13">here</a>.</p>
<p>The __init__() method of the class contains a lot of the
functionality of the class:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">model_qualified_name</span><span class="p">,</span> <span class="n">loop</span><span class="p">,</span> <span class="n">bootstrap_servers</span><span class="p">):</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"'</span><span class="si">{}</span><span class="s2">' not found in ModelManager instance."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">))</span>
<span class="n">base_topic_name</span> <span class="o">=</span> <span class="s2">"model_stream_processor.</span><span class="si">{}</span><span class="s2">.</span><span class="si">{}</span><span class="s2">.</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">model_qualified_name</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">major_version</span><span class="p">,</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">minor_version</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">consumer_topic</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">.inputs"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">base_topic_name</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">producer_topic</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">.outputs"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">base_topic_name</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">error_producer_topic</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">.errors"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">base_topic_name</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_consumer</span> <span class="o">=</span> <span class="n">AIOKafkaConsumer</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">consumer_topic</span><span class="p">,</span> <span class="n">loop</span><span class="o">=</span><span class="n">loop</span><span class="p">,</span>
<span class="n">bootstrap_servers</span><span class="o">=</span><span class="n">bootstrap_servers</span><span class="p">,</span> <span class="n">group_id</span><span class="o">=</span><span class="vm">__name__</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_producer</span> <span class="o">=</span> <span class="n">AIOKafkaProducer</span><span class="p">(</span><span class="n">loop</span><span class="o">=</span><span class="n">loop</span><span class="p">,</span>
<span class="n">bootstrap_servers</span><span class="o">=</span><span class="n">bootstrap_servers</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/ml_model_stream_processor.py#L15-L54">here</a>.</p>
<p>When the processor class is first instantiated, it first gets an
instance of the ModelManager class and retrieves from it the model it
will manage. The model is identified by
the qualified_name, which should be unique for the model we're trying
to deploy. The __init__ method also accepts an asyncio loop that is
created once for the whole application, and also the name of the kafka
bootstrap server to use. Before we try to finish initializing the stream
processor, we check that the model instance actually exists within the
ModelManager singleton, if the model can't be found we'll raise an
exception.</p>
<p>After that, we generate the kafka topic names for the three topics that
each model needs. The topic names follow a fixed pattern and cannot
be parameterized. The base_topic_name is the same for all three topics
and contains the name of the stream processing application, the
qualified name of the model, and the model's major and minor versions.
Then we can generate the three unique names of the topics we'll need for
the model from the base_topic_name. The consumer topic will contain
input data for the model, the producer topic will contain the output of
the model for successful predictions, and the error producer topic will
contain all of the input messages that caused errors in the model.</p>
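<p>As an illustration, here is a sketch of the naming pattern, assuming the iris_model at version 1.0:</p>

```python
# assumed example values for the iris model from the blog post
model_qualified_name = "iris_model"
major_version = 1
minor_version = 0

# base name shared by all three topics, as in the __init__ method
base_topic_name = "model_stream_processor.{}.{}.{}".format(
    model_qualified_name, major_version, minor_version)

consumer_topic = "{}.inputs".format(base_topic_name)
producer_topic = "{}.outputs".format(base_topic_name)
error_producer_topic = "{}.errors".format(base_topic_name)

print(consumer_topic)        # model_stream_processor.iris_model.1.0.inputs
print(producer_topic)        # model_stream_processor.iris_model.1.0.outputs
print(error_producer_topic)  # model_stream_processor.iris_model.1.0.errors
```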
<p>Once this is done, we are finally able to create the consumer and
producer objects that we'll use to read from and write to Kafka. These
objects are created once and reused throughout the lifecycle of the
stream processor. The producer and consumer classes are provided by the
aiokafka package.</p>
<p>Even though the stream processor object is initialized once the
__init__() method finishes executing, we still need to start the
producer and consumer objects within it. The start() method is called at
application startup to connect the
stream processor to the Kafka topics that it will use:</p>
<div class="highlight"><pre><span></span><code><span class="k">async</span> <span class="k">def</span> <span class="nf">start</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_consumer</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_producer</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/ml_model_stream_processor.py#L61-L65">here</a>.</p>
<p>Once the stream processor class is initialized and started, we need to
process messages:</p>
<div class="highlight"><pre><span></span><code><span class="k">async</span> <span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">async</span> <span class="k">for</span> <span class="n">message</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">_consumer</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">message</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">serialized_prediction</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">prediction</span><span class="p">)</span><span class="o">.</span><span class="n">encode</span><span class="p">()</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_producer</span><span class="o">.</span><span class="n">send_and_wait</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">producer_topic</span><span class="p">,</span> <span class="n">serialized_prediction</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_producer</span><span class="o">.</span><span class="n">send_and_wait</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">error_producer_topic</span><span class="p">,</span> <span class="n">message</span><span class="o">.</span><span class="n">value</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/ml_model_stream_processor.py#L67-L77">here</a>.</p>
<p>The process() method uses an async for loop to continuously process
messages from the input Kafka topic. The message is then deserialized
using JSON, and the resulting data structure is sent to the model's
predict() method. The prediction result is then serialized to a JSON
string and encoded to a byte array. Lastly, the prediction is written to
the output Kafka topic. If any exceptions are raised during this
process, the input message that caused the error is written to the error
Kafka topic so that we can try to reprocess it later (or try some other
error handling method).</p>
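<p>The per-message logic can be sketched outside of Kafka as a plain function; the FakeModel class here is a stand-in for a real MLModel instance, and the function returns a destination name instead of writing to a topic:</p>

```python
import json

class FakeModel:
    """Stand-in for an MLModel instance; the real model's predict()
    accepts a data dict and returns a prediction dict."""
    def predict(self, data):
        return {"species": "setosa"}

def process_message(model, raw_message):
    # mirrors the body of process(): deserialize, predict, serialize;
    # on failure the original message is routed to the error topic
    try:
        data = json.loads(raw_message)
        prediction = model.predict(data=data)
        return "outputs", json.dumps(prediction).encode()
    except Exception:
        return "errors", raw_message

destination, payload = process_message(FakeModel(), b'{"sepal_length": 5.1}')
print(destination, payload)  # outputs b'{"species": "setosa"}'

destination, payload = process_message(FakeModel(), b"not json")
print(destination)  # errors
```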
<p>Just as the Kafka producer and consumer are started in the start()
method of the class, we need a stop() method so that they can be shut
down gracefully:</p>
<div class="highlight"><pre><span></span><code><span class="k">async</span> <span class="k">def</span> <span class="nf">stop</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_consumer</span><span class="o">.</span><span class="n">stop</span><span class="p">()</span>
<span class="k">await</span> <span class="bp">self</span><span class="o">.</span><span class="n">_producer</span><span class="o">.</span><span class="n">stop</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/ml_model_stream_processor.py#L79-L83">here</a>.</p>
<h1>Installing the Model</h1>
<p>Now that we have a streaming processor class, we can install a model
package that will be hosted by the class. To do this, we'll use the
iris_model package that we built in a <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous blog
post</a>.
The model package can be installed from its git repository with this
command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Now we can add the model's details to the config.py module so that we
can dynamically load the model into the application later:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
<span class="n">models</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span>
<span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
<span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
<span class="p">}</span>
<span class="p">]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/config.py#L4-L12">here</a>.</p>
<p>This configuration class is used by the application in all environments.
The module_name and class_name fields allow the application to find
the MLModel class that implements the prediction functionality of the
iris_model package.</p>
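<p>As a rough sketch of how such a configuration entry can drive dynamic loading (the real work happens inside ModelManager.load_models; collections.Counter stands in here for the model class, since the iris_model package may not be installed):</p>

```python
import importlib

# hypothetical configuration entry; collections.Counter stands in for
# iris_model.iris_predict.IrisModel
model_config = {"module_name": "collections", "class_name": "Counter"}

# import the module by name, then look up the class inside it
module = importlib.import_module(model_config["module_name"])
model_class = getattr(module, model_config["class_name"])
model = model_class()
print(type(model).__name__)  # Counter
```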
<h1>Streaming Application</h1>
<p>In order to use the MLModelStreamProcessor class, we need to write code
that will dynamically instantiate it from configuration for each MLModel
class that will be hosted by the application. We'll do this in the
app.py module:</p>
<div class="highlight"><pre><span></span><code><span class="n">configuration</span> <span class="o">=</span> <span class="nb">__import__</span><span class="p">(</span><span class="s2">"model_stream_processor"</span><span class="p">)</span><span class="o">.</span> \
<span class="fm">__getattribute__</span><span class="p">(</span><span class="s2">"config"</span><span class="p">)</span><span class="o">.</span> \
<span class="fm">__getattribute__</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"APP_SETTINGS"</span><span class="p">])</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L13-L20">here</a>.</p>
<p>The application starts by importing a configuration class from the
config.py module; the class to use is selected by an environment
variable called "APP_SETTINGS". The application also
instantiates the ModelManager singleton that hosts the models. A full
explanation of the ModelManager class can be found in
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous</a>
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment">blog
posts</a>.</p>
<p>Next we'll create the function that actually starts and runs the
application:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="n">loop</span> <span class="o">=</span> <span class="n">asyncio</span><span class="o">.</span><span class="n">get_event_loop</span><span class="p">()</span>
<span class="n">asyncio</span><span class="o">.</span><span class="n">set_event_loop</span><span class="p">(</span><span class="n">loop</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L23-L27">here</a>.</p>
<p>The main() function of the application first gets a reference to an
asyncio event loop that will be shared by all of the stream processors in the
application. The loop allows the streaming processors to efficiently
cooperate to do IO-bound tasks like writing to the network.</p>
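<p>As a minimal illustration of why a shared loop matters, two coroutines
that each wait on a (simulated) IO operation can overlap their waits; this
sketch is independent of the application code:</p>

```python
import asyncio
import time

async def fake_io_task(name, delay):
    # stands in for an IO-bound operation, e.g. a network write
    await asyncio.sleep(delay)
    return name

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

start = time.monotonic()
# both coroutines wait concurrently, so this takes ~0.2s, not ~0.4s
results = loop.run_until_complete(
    asyncio.gather(fake_io_task("a", 0.2), fake_io_task("b", 0.2)))
elapsed = time.monotonic() - start
loop.close()
```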
<p>Once we have an event loop, we can start instantiating the streaming
processors:</p>
<div class="highlight"><pre><span></span><code><span class="n">stream_processors</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">():</span>
<span class="n">stream_processors</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">MLModelStreamProcessor</span><span class="p">(</span>
<span class="n">model_qualified_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"qualified_name"</span><span class="p">],</span>
<span class="n">loop</span><span class="o">=</span><span class="n">loop</span><span class="p">,</span>
<span class="n">bootstrap_servers</span><span class="o">=</span><span class="n">configuration</span><span class="o">.</span><span class="n">bootstrap_servers</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L29-L34">here</a>.</p>
<p>Each stream processor is responsible for hosting one MLModel object from
the ModelManager singleton that we initialized above.</p>
<p>The stream processors are not started up and connected to a Kafka topic
yet, so we start them up like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">for</span> <span class="n">stream_processor</span> <span class="ow">in</span> <span class="n">stream_processors</span><span class="p">:</span>
<span class="n">loop</span><span class="o">.</span><span class="n">run_until_complete</span><span class="p">(</span><span class="n">stream_processor</span><span class="o">.</span><span class="n">start</span><span class="p">())</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L36-L37">here</a>.</p>
<p>Each stream processor is started by calling its start() method. Since
the method is asynchronous, it is called by using the
run_until_complete() method of the asyncio loop.</p>
<div class="highlight"><pre><span></span><code><span class="k">try</span><span class="p">:</span>
<span class="k">for</span> <span class="n">stream_processor</span> <span class="ow">in</span> <span class="n">stream_processors</span><span class="p">:</span>
<span class="n">loop</span><span class="o">.</span><span class="n">run_until_complete</span><span class="p">(</span><span class="n">stream_processor</span><span class="o">.</span><span class="n">process</span><span class="p">())</span>
<span class="k">except</span> <span class="ne">KeyboardInterrupt</span><span class="p">:</span>
<span class="n">logging</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Process interrupted."</span><span class="p">)</span>
<span class="k">finally</span><span class="p">:</span>
<span class="k">for</span> <span class="n">stream_processor</span> <span class="ow">in</span> <span class="n">stream_processors</span><span class="p">:</span>
<span class="n">loop</span><span class="o">.</span><span class="n">run_until_complete</span><span class="p">(</span><span class="n">stream_processor</span><span class="o">.</span><span class="n">stop</span><span class="p">())</span>
<span class="n">loop</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">logging</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Successfully shutdown the processors."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L39-L48">here</a>.</p>
<p>When all of the stream processors are started up, we are ready to
process messages from Kafka. To do this, we call the process() method of
each stream processor with the asyncio loop. The loop will run the
processors forever, unless a keyboard interrupt is received. When an
interrupt happens, each processor is stopped by calling the stop()
method, then we close the asyncio loop itself, and then we can exit the
application.</p>
<p>The application is started from the command line with this code at the
bottom of the module:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">"__main__"</span><span class="p">:</span>
<span class="n">main</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/model_stream_processor/app.py#L51-L52">here</a>.</p>
<p>Now that we have an application that can run the stream processor
classes, we can test things against a Kafka broker instance.</p>
<h1>Setting Up a Development Environment</h1>
<p>To set up a development environment we'll use docker images with the
docker compose tool. The docker images come from the <a href="https://hub.docker.com/u/confluentinc/">official
dockerhub repository</a> of
Confluent, the company founded by the original creators of Kafka. The
<a href="https://docs.docker.com/compose/">docker-compose tool</a> is
useful for building development environments because it automates a lot
of steps that would need to be performed manually.</p>
<p>The docker-compose.yml file in the project root contains configuration
for three services:</p>
<ul>
<li><a href="https://zookeeper.apache.org/">zookeeper</a>, a service for maintaining shared configuration and doing synchronization</li>
<li><a href="https://kafka.apache.org/">kafka</a>, the message broker, which depends on zookeeper</li>
<li><a href="https://www.confluent.io/confluent-control-center/">confluent control center</a>, a user interface service useful for debugging</li>
</ul>
<p>The <a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/docker-compose.yml">docker-compose.yml</a>
file contains the docker image information, configuration options, and
network settings for each service. It also contains dependency
information for each service so that they are started in the right
order.</p>
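<p>In outline, a compose file for this kind of environment looks something
like the following. This is an illustrative sketch only; the image versions,
ports, and options are assumptions, not copied from the project's
docker-compose.yml file.</p>

```yaml
version: "3"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  control-center:
    image: confluentinc/cp-enterprise-control-center:latest
    depends_on:
      - kafka
    ports:
      - "9021:9021"
```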
<p>To start up the three services, we need to execute this command from the
root of the project:</p>
<div class="highlight"><pre><span></span><code>docker-compose up
</code></pre></div>
<p>To see if everything came up correctly, execute this command in another
shell:</p>
<div class="highlight"><pre><span></span><code>docker-compose ps
</code></pre></div>
<p>If everything looks good, there should be three docker containers running
and the confluent control center UI should be accessible at this URL:
<a href="http://localhost:9021/clusters">http://localhost:9021/</a>.</p>
<h1>Creating Kafka Topics</h1>
<p>In order to more easily create the topics needed to deploy the stream
processor for a model, I created a simple command line tool. The tool
reads the configuration of the streaming application, generates the
correct topic names, connects to the Kafka broker, and creates the topics
for each model. The tool can be found in the scripts folder in the
<a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/scripts/create_topics.py">create_topics.py</a>
module.</p>
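<p>The core of such a script can be sketched as follows. The helper names
here are my own and kafka-python is assumed as the client library; the real
create_topics.py script in the repository may differ.</p>

```python
def topic_names(model_qualified_name: str, model_version: str) -> list:
    """Build the three topic names used by one model's stream processor."""
    base = "model_stream_processor.{}.{}".format(model_qualified_name, model_version)
    return [base + "." + suffix for suffix in ("inputs", "outputs", "errors")]

def create_topics(bootstrap_servers: str, models: list) -> None:
    """Create the topics for every (qualified_name, version) pair given."""
    # kafka-python provides an admin client that can create topics
    from kafka.admin import KafkaAdminClient, NewTopic
    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    admin.create_topics(new_topics=[
        NewTopic(name=name, num_partitions=1, replication_factor=1)
        for qualified_name, version in models
        for name in topic_names(qualified_name, version)])

# usage: create_topics("localhost:9092", [("iris_model", "0.1")])
```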
<p>To use the tool, execute these commands from the root of the project:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python scripts/create_topics.py --bootstrap_servers<span class="o">=</span>localhost:9092
</code></pre></div>
<p>The first command sets the PYTHONPATH environment variable so that the
configuration module can be found; the second command executes the CLI
tool that creates the topics.</p>
<p>Now we can go into the confluent control center UI and see the topics
that were just created:</p>
<p><img alt="Topics" src="https://www.tekhnoal.com/topics.png" width="100%"></p>
<p>Since the configuration points at the iris_model package, there are now
three topics for that model's stream processor. If more models are
listed in the configuration of the application, more topics would be
created by the tool.</p>
<h1>Running the Application</h1>
<p>Now that we have the broker and topics for the stream processor, we can
start up the application and send some messages to the model.</p>
<p>First, we'll start the application with these commands in a new command
shell:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
<span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python model_stream_processor/app.py
</code></pre></div>
<p>The streaming processor for the iris_model wrote these messages to the
log:</p>
<div class="highlight"><pre><span></span><code>INFO:model_stream_processor:Initializing stream processor <span class="k">for</span> model: iris_model
INFO:model_stream_processor:iris_model stream processor: Consuming messages from topic..
INFO:model_stream_processor:iris_model stream processor: Producing messages to topics...
INFO:model_stream_processor:iris_model stream processor: Starting consumer and producer.
</code></pre></div>
<p>The stream processor is now ready to receive messages in the "inputs"
topic. To more easily send messages to a topic, I built a simple CLI
tool that reads messages from stdin and sends them to the topic. The tool
is in the <a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/scripts/send_messages.py">send_messages.py
module</a>.
To use the tool, execute this command in a new command shell:</p>
<div class="highlight"><pre><span></span><code>python scripts/send_messages.py --topic<span class="o">=</span>model_stream_processor.iris_model.0.1.inputs --bootstrap_servers<span class="o">=</span>localhost:9092
</code></pre></div>
<p>The tool will start and wait for input from the command line, every time
the ENTER key is pressed the contents of stdin will be sent to the
"inputs" topic.</p>
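<p>The heart of such a tool can be sketched like this. The function names
are my own and aiokafka is assumed as the asyncio-compatible Kafka client;
the actual send_messages.py module may be structured differently.</p>

```python
import sys

def encode_message(line: str) -> bytes:
    """Strip the trailing newline from a line of stdin and encode it."""
    return line.rstrip("\n").encode("utf-8")

async def send_lines(topic: str, bootstrap_servers: str) -> None:
    """Send each line typed on stdin to the given topic as one message."""
    from aiokafka import AIOKafkaProducer
    producer = AIOKafkaProducer(bootstrap_servers=bootstrap_servers)
    await producer.start()
    try:
        for line in sys.stdin:  # one message per ENTER press
            await producer.send_and_wait(topic, encode_message(line))
    finally:
        await producer.stop()
```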
<p>To be able to see the output messages produced by the stream processor I
built a similar CLI tool that consumes messages from a topic and prints
them to the screen. The tool is in the <a href="https://github.com/schmidtbri/streaming-ml-model-deployment/blob/master/scripts/receive_messages.py">receive_messages.py
module</a>.
To use it, execute this command in a new command shell:</p>
<div class="highlight"><pre><span></span><code>python scripts/receive_messages.py --topic<span class="o">=</span>model_stream_processor.iris_model.0.1.outputs --bootstrap_servers<span class="o">=</span>localhost:9092
</code></pre></div>
<p>Now we're ready to send some messages to the stream processor. To do
this, type the following JSON string into the send_messages command
that we started above:</p>
<div class="highlight"><pre><span></span><code>{"sepal_length": 1.1, "sepal_width": 1.2, "petal_length": 1.3, "petal_width": 1.4}
</code></pre></div>
<p>The receive_messages command should print out the prediction message
from the model stream processor:</p>
<div class="highlight"><pre><span></span><code>{"species": "setosa"}
</code></pre></div>
<p>The last thing we can test is the error handling of the stream
processor. To do this we have to listen to the "errors" topic of the
stream processor. We can do this by executing the receive_messages
command with the "errors" topic as an option:</p>
<div class="highlight"><pre><span></span><code>python scripts/receive_messages.py --topic<span class="o">=</span>model_stream_processor.iris_model.0.1.errors --bootstrap_servers<span class="o">=</span>localhost:9092
</code></pre></div>
<p>To cause an error in the stream processor we can send in a malformed
JSON string to the send_messages command that should still be running:</p>
<div class="highlight"><pre><span></span><code>{"sepal_length": 1.1, "sepal_width": 1.2, "petal_length": 1.3, "petal_width": 1.4
</code></pre></div>
<p>The stream processor will catch the exception and send the input that
caused the error to the "errors" topic. We can see the message that
caused the error in the confluent control center UI:</p>
<p><img alt="Error Message" src="https://www.tekhnoal.com/error_message.png" width="100%"></p>
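<p>The routing decision inside the stream processor can be pictured with a
small helper like the one below. This is a simplified sketch with names of
my own choosing; the actual MLModelStreamProcessor implementation in the
repository may differ.</p>

```python
import json

def route_message(raw: bytes, predict) -> tuple:
    """Decide which topic one consumed message's result belongs on.

    Valid input is deserialized and passed to the model's predict function;
    anything that raises an exception is sent to the errors topic unchanged.
    """
    try:
        data = json.loads(raw)
        prediction = predict(data)
        return ("outputs", json.dumps(prediction).encode("utf-8"))
    except Exception:
        return ("errors", raw)

# a malformed JSON string ends up on the errors topic with the original bytes
topic, payload = route_message(b'{"sepal_length": 1.1', lambda data: data)
```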
<h1>Closing</h1>
<p>In this blog post, we've shown how to deploy an ML model inside a
streaming application. This type of deployment is becoming more and more
useful in recent times, as the popularity of stream processing and Kafka
grows. As in previous blog posts, we've built an application that can
support any number of ML models that implement the MLModel interface. The
only requirement for deployment is that the model package is installed
in the environment and the configuration of the application is updated.
The flexibility of this approach has allowed us to deploy the
iris_model ML model in five different applications without any
modification of the model code itself.</p>
<p>Another benefit of the stream processing application shown in this blog
post is the fact that we are using an asyncio-compatible Kafka client
library. By using asynchronous programming, we are able to greatly
increase the performance of the code. In
<a href="https://stackabuse.com/asynchronous-vs-synchronous-python-performance-analysis/">tests</a>,
asynchronous Python code is able to significantly outperform normal
synchronous code. The performance boost is most pronounced when working
with file IO and network IO applications, which our streaming processor
application will definitely benefit from.</p>
<p>To keep things simple, we used JSON strings in the messages we sent
through Kafka. However, there are more efficient standards for
serializing data which we could have used. For example, the confluent
schema registry works with Avro schemas, and Avro is <a href="https://www.confluent.io/blog/avro-kafka-data/">well
supported</a>
in the Kafka ecosystem. Another area for improvement in the project is
the set of CLI tools built to test the application. They are very
simple and don't support many of the options that would be needed for a
real production application. For example, the create_topics.py script
only creates topics with a replication factor of one. We can improve
this tool by adding more of the options supported by Kafka's <a href="https://kafka.apache.org/quickstart#quickstart_createtopic">topic
creation CLI tool</a>.</p>
<h1>An AWS Lambda ML Model Deployment</h1>
<p>Brian Schmidt · 2019-11-10</p>
<p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>I also showed how to deploy the same ML model used in this blog post as
a batch job in this <a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
post</a>,
and in a task queue in this <a href="https://www.tekhnoal.com/task-queue-ml-model-deployment.html">blog
post</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/lambda-ml-model-deployment">github
repo</a>.</p>
<h1>Introduction</h1>
<p>In the last few years, a new cloud computing paradigm has emerged:
serverless computing. This new paradigm flips the normal way of
provisioning resources in a cloud environment on its head. Whereas a
normal application is deployed onto pre-provisioned servers that are
running before they are needed, a serverless application's codebase is
deployed and the servers are assigned to run the application as demand
for the application rises.</p>
<p>Although "serverless" can have several different interpretations, the
one that is most commonly used by developers is Functions as a Service
(FaaS). In this context, a function is a small piece of software that
does one thing, and hosting a function as a service means that the cloud
provider manages the server on which the function runs and allocates the
resources needed to run the function. Another interesting application of
the serverless paradigm is databases that are run and managed by cloud
providers, some examples of this are AWS Aurora, and Google Cloud
Datastore. However, these services don't run code that is provided by
the user, so they are not as interesting for deploying an ML model.</p>
<p>Serverless functions provide several benefits over
traditionally-deployed software. Serverless functions are inherently
elastic since they run only when an event triggers them, which makes them
easier to deploy and manage. They are also cheaper to run for the same
reason, since charges for execution time of a serverless function only
accrue when it is actually running. Lastly, using serverless functions
makes software engineers more productive, since a lot of deployment
details are abstracted out by the cloud provider, greatly simplifying
the deployment process.</p>
<p>Serverless functions have some drawbacks as well. The resources assigned
to a function are reclaimed by the cloud provider after a period of
inactivity, which means that the next time the function is executed
extra latency will be incurred when the resources are reassigned to the
function. Cloud providers often have limitations on the resources that a
function can consume in a given period of time, which means that a serverless
function might not be a good fit for certain workloads. Lastly, access
to the underlying server that is running the function is not available,
which limits the ability to control certain aspects of the execution
environment.</p>
<p>In this blog post, I will show how to deploy a machine learning model on
<a href="https://aws.amazon.com/lambda/">AWS Lambda</a>, which is the
AWS serverless function offering. The code for this blog post can run
locally, but to go through all of the scenarios explained it's necessary
to get an AWS account. We'll also show how to integrate the lambda with
AWS API Gateway, which will make the model hosted by the lambda
accessible through a REST API. To interact with the AWS API, the <a href="https://aws.amazon.com/cli/">AWS
CLI</a> package needs to be
installed as well.</p>
<h1>Serverless Framework</h1>
<p>The <a href="https://serverless.com/">serverless framework</a> is a
software framework for developing applications that use the serverless
FaaS model for deployment. The framework provides a command line
interface (CLI) that can operate across different cloud providers and
helps software engineers to develop, deploy, test, and monitor
serverless functions. We'll be using the serverless framework to work
with the AWS Lambda service.</p>
<p>In order to use the serverless framework, we need to first <a href="https://nodejs.org/en/download/">install the
node.js runtime</a>. After
this, we can install the serverless framework with this command:</p>
<div class="highlight"><pre><span></span><code>npm install -g serverless
</code></pre></div>
<p>After this, we need to get an AWS account and add permissions to allow
the framework to create resources, instructions can be found
<a href="https://serverless.com/framework/docs/providers/aws/guide/credentials/">here</a>.</p>
<h1>Package Structure</h1>
<p>To begin, I set up the project structure for the application package:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_lambda</span> <span class="ss">(</span> <span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">model</span> <span class="nv">lambda</span> <span class="nv">app</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">web_api</span> <span class="ss">(</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">handling</span> <span class="nv">http</span> <span class="nv">requests</span><span class="o">/</span><span class="nv">responses</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">controllers</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">schemas</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">lambda_function</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">lambda</span> <span class="nv">entry</span> <span class="nv">point</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_manager</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">scripts</span>
<span class="o">-</span> <span class="nv">openapi</span>.<span class="nv">py</span> <span class="ss">(</span><span class="nv">script</span> <span class="k">for</span> <span class="nv">generating</span> <span class="nv">an</span> <span class="nv">OpenAPI</span> <span class="nv">specification</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span> <span class="nv">unit</span> <span class="nv">tests</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">application</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">serverless</span>.<span class="nv">yaml</span> <span class="ss">(</span> <span class="nv">configuration</span> <span class="k">for</span> <span class="nv">serverless</span> <span class="nv">framework</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen in the <a href="https://github.com/schmidtbri/lambda-ml-model-deployment">github
repository</a>.</p>
<h1>Lambda Handler</h1>
<p>The AWS Lambda service is event-oriented, which means that it runs code
in response to events. The entry point for the code is a function
called the lambda handler. The lambda handler function is expected to
receive two parameters: event and context. The event parameter is
usually a dictionary that contains the details of the event that
triggered the execution of the lambda. The context parameter is a
dictionary that holds information about the function execution and the
execution environment.</p>
<p>To begin, we'll add an entry point for the lambda in the
lambda_function.py module:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">lambda_handler</span><span class="p">(</span><span class="n">event</span><span class="p">,</span> <span class="n">context</span><span class="p">):</span>
<span class="sd">"""Lambda handler function."""</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L14-L15">here</a>.</p>
<p>We'll be adding code to the handler function later.</p>
<h1>Model Manager Class</h1>
<p>In order to manage a collection of MLModel objects in the lambda, we'll
reuse a piece of code that we've used before in a <a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous blog
post</a>.
In the previous post, I wrote a class called "ModelManager" that is
responsible for instantiating MLModel classes from configuration,
returning information about the model objects being managed, and returning
references to the model objects upon request. We can reuse the class in
this project since we'll need the same functionality.</p>
<p>The ModelManager class has three methods: get_models(), which returns a
list of the models under management, get_model_metadata(), which returns
metadata about a single model, and get_model(), which returns a
reference to a model under management. The code for the ModelManager
class can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/model_manager.py">here</a>.
For a full explanation of the code in the class, please read the
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">original blog
post</a>.</p>
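<p>In condensed form, the behavior of the ModelManager class looks roughly
like this. This is a sketch for illustration only; the real implementation
linked above differs in its details, and the model attribute names used here
are assumptions.</p>

```python
import importlib

class ModelManager:
    """Holds a collection of instantiated MLModel objects."""

    def __init__(self):
        self._models = []

    def load_models(self, configuration):
        # configuration: list of dicts naming the module and class of each model
        for entry in configuration:
            module = importlib.import_module(entry["module_name"])
            model_class = getattr(module, entry["class_name"])
            self._models.append(model_class())

    def get_models(self):
        # summary information about every model under management
        return [{"display_name": model.display_name,
                 "qualified_name": model.qualified_name}
                for model in self._models]

    def get_model_metadata(self, qualified_name):
        # metadata about a single model, or None if it is not hosted here
        model = self.get_model(qualified_name)
        if model is None:
            return None
        return {"display_name": model.display_name,
                "qualified_name": model.qualified_name,
                "description": model.description}

    def get_model(self, qualified_name):
        # a reference to a single model under management
        return next((model for model in self._models
                     if model.qualified_name == qualified_name), None)
```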
<p>In order to use the ModelManager class within the model lambda we have
to first instantiate it, then call the load_model() method to load
MLModels objects we want to host in the lambda. Since the model classes
will load their parameters from disk when they are instantiated, it's
important that we only do this one time, when the lambda starts up. We
can do this by adding this code at the top of the lambda_function.py
module:</p>
<div class="highlight"><pre><span></span><code><span class="c1"># instantiating the model manager class</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="c1"># loading the MLModel objects from configuration</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L7-L11">here</a>.</p>
<p>By putting this initialization at the top of the lambda function module,
we can be sure that the models are initialized one time only. The
configuration is loaded from the config.py module found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/config.py">here</a>.</p>
<h1>REST Endpoints</h1>
<p>An AWS Lambda function is able to handle events from <a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-services.html">several
sources</a>
in the AWS ecosystem. In this blog post, we'll build a simple web
service that can serve predictions from the models that are hosted by
the lambda. To do this, we'll add an API Gateway as an event source to
the lambda function later. To be able to handle the HTTP requests sent
by the API Gateway, we'll copy the code from a <a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous blog
post</a>
used to build a Flask web service. The code that defines the REST
endpoints is isolated inside of <a href="https://github.com/schmidtbri/lambda-ml-model-deployment/tree/master/model_lambda/web_api">a
subpackage</a>
inside of the model_lambda package, since we want to easily adapt the
model lambda for other types of integrations.</p>
<p>The data models accepted by the REST endpoints will be the same as in
the previous blog post. We'll
use the marshmallow schema package to define the schemas of the objects
accepted by and returned from the endpoints. The schemas can be found in
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/web_api/schemas.py">this
module</a>.
Since the API Gateway is handling all of the functionality normally
handled by a web application framework, we'll avoid using Flask for
building the application. However, we still have to define controller
functions that receive requests and return responses to a client. To do
this we'll reuse the controllers from the <a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous blog
post</a>
and rewrite them a bit to remove the Flask dependency. The new
controller functions can be found in <a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/web_api/controllers.py">this
module</a>.</p>
<p>The web_api package within the model_lambda application is built along
the same lines as a web application. It is built in this way so that it
isolates the functionality to one package within the application. Now
that we have the ability to receive HTTP requests and return HTTP
responses, we have to integrate this code with the AWS Lambda service;
we'll do this in the next section.</p>
<h1>Handling API Gateway Events</h1>
<p>The AWS Lambda service integrates with other systems by using event
types. For this blog post, we'll be integrating with an <a href="https://aws.amazon.com/api-gateway/">AWS API
Gateway</a>, to do this
we'll need to handle AWS API Gateway
<a href="https://docs.aws.amazon.com/lambda/latest/dg/with-on-demand-https.html">events</a>.
The Lambda service sends events to our lambda by encoding all
information about an HTTP request into a dictionary data structure and
calling the lambda handler function with the dictionary as the "event"
parameter. In order to integrate our REST endpoint code with the API
Gateway, we'll need to recognize the event type, route the request to
the right REST endpoint, encode the HTTP response into a dictionary, and
return it to the Lambda service. The Lambda service will then return the
response to the API Gateway which will create the actual HTTP response
that will go back to the client.</p>
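<p>The response half of that contract can be sketched with a small helper
(my own illustration, not code from the repository) that encodes a controller
result into the dictionary shape API Gateway expects back from the lambda:</p>

```python
import json

def make_proxy_response(status_code: int, body: dict) -> dict:
    """Encode an HTTP response as an API Gateway proxy integration dict."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        # API Gateway expects the body to be a string, so it is JSON-encoded
        "body": json.dumps(body),
    }

response = make_proxy_response(200, {"species": "setosa"})
```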
<p>To recognize the API Gateway event type, we'll check for a few fields in
the event dictionary:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="n">event</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"resource"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> \
<span class="ow">and</span> <span class="n">event</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"path"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> \
<span class="ow">and</span> <span class="n">event</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"httpMethod"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L16-L19">here</a>.</p>
<p>Once we're sure that we have an API Gateway event, we can choose which
REST endpoint to route the request to:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="n">event</span><span class="p">[</span><span class="s2">"resource"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"/api/models"</span> <span class="ow">and</span> <span class="n">event</span><span class="p">[</span><span class="s2">"httpMethod"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"GET"</span><span class="p">:</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">get_models</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L21-L23">here</a>.</p>
<p>If the API Gateway event is a request for the "models" endpoint with the
GET verb, we'll route it to the get_models() controller function. This
function returns a list of the models available for prediction to the API
Gateway, which then returns it as an HTTP response to the client
system.</p>
<p>Next, we'll route to the metadata endpoint:</p>
<div class="highlight"><pre><span></span><code><span class="k">elif</span> <span class="n">event</span><span class="p">[</span><span class="s2">"resource"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"/api/models/</span><span class="si">{qualified_name}</span><span class="s2">/metadata"</span> \
<span class="ow">and</span> <span class="n">event</span><span class="p">[</span><span class="s2">"httpMethod"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"GET"</span><span class="p">:</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">get_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">event</span><span class="p">[</span><span class="s2">"pathParameters"</span><span class="p">][</span><span class="s2">"qualified_name"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L25-L27">here</a>.</p>
<p>The get_metadata() function requires a parameter called
"qualified_name" which is the unique name of the model that the client
wants the metadata for. This parameter is parsed for us from the path of
the request by the API Gateway, and is sent in the "pathParameters"
field in the event dictionary.</p>
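<p>To make the event structure concrete, here is a simplified sketch of
the kind of event dictionary the API Gateway sends to the lambda. Only
the fields used by the routing code are shown, and the values are
illustrative:</p>
<div class="highlight"><pre><code>
# a simplified, hand-built example of an API Gateway proxy event,
# showing only the fields that the routing code relies on
example_event = {
    "resource": "/api/models/{qualified_name}/metadata",
    "path": "/api/models/iris_model/metadata",
    "httpMethod": "GET",
    "pathParameters": {"qualified_name": "iris_model"},
    "body": None
}

# the routing code can pull the model name out of the parsed path parameters
qualified_name = example_event["pathParameters"]["qualified_name"]
print(qualified_name)
</code></pre></div>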
<p>Next, we'll route to the "predict" endpoint:</p>
<div class="highlight"><pre><span></span><code><span class="k">elif</span> <span class="n">event</span><span class="p">[</span><span class="s2">"resource"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"/api/models/</span><span class="si">{qualified_name}</span><span class="s2">/predict"</span> \
<span class="ow">and</span> <span class="n">event</span><span class="p">[</span><span class="s2">"httpMethod"</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"POST"</span> \
<span class="ow">and</span> <span class="n">event</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"pathParameters"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> \
<span class="ow">and</span> <span class="n">event</span><span class="p">[</span><span class="s2">"pathParameters"</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"qualified_name"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">response</span> <span class="o">=</span> <span class="n">predict</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">event</span><span class="p">[</span><span class="s2">"pathParameters"</span><span class="p">][</span><span class="s2">"qualified_name"</span><span class="p">],</span> <span class="n">request_body</span><span class="o">=</span><span class="n">event</span><span class="p">[</span><span class="s2">"body"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L29-L34">here</a>.</p>
<p>This endpoint takes a little more effort since it also requires that the
body of the request be sent to the predict() function.</p>
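<p>Note that the body of an API Gateway proxy event arrives as a JSON
string in the "body" field. A minimal sketch of how a controller
function might deserialize it (the field names in the payload are
illustrative, not necessarily the iris_model schema):</p>
<div class="highlight"><pre><code>
import json

# the request body arrives as a JSON string and must be deserialized
# before the model can use it; the field names here are made up
request_body = '{"sepal_length": 5.1, "sepal_width": 3.5}'
data = json.loads(request_body)
print(data["sepal_length"])
</code></pre></div>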
<p>Lastly, we'll raise an error for any resources in the API Gateway that
we can't handle:</p>
<div class="highlight"><pre><span></span><code><span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"This lambda cannot handle this resource."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L36-L37">here</a>.</p>
<p>This last statement raises an exception if the lambda can't handle the
resource that the API Gateway is requesting. This should never happen if
the API Gateway is created correctly, since only the three resources
listed above will be added to the API Gateway when we create it.</p>
<p>Now that the REST endpoint code has handled the request and created a
response, we have to encode it into a dictionary that the Lambda service
will send back to the API Gateway:</p>
<div class="highlight"><pre><span></span><code><span class="k">return</span> <span class="p">{</span>
<span class="s2">"isBase64Encoded"</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s2">"statusCode"</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">status</span><span class="p">,</span>
<span class="s2">"headers"</span><span class="p">:</span> <span class="p">{</span><span class="s2">"Content-Type"</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">mimetype</span><span class="p">},</span>
<span class="s2">"body"</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">data</span>
<span class="p">}</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L39-L44">here</a>.</p>
<p>Lastly, we close the lambda handler by throwing an exception if we can't
identify the event type:</p>
<div class="highlight"><pre><span></span><code><span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"This lambda cannot handle this event type."</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/model_lambda/lambda_function.py#L46-L47">here</a>.</p>
<p>The code in this section forms an adapter layer between the Lambda
service and the web application that we want to build. For the sake of
good engineering practice, we isolate the code that interfaces with the
AWS Lambda service from the code that handles HTTP requests and
responses. By building the code this way, we have a much easier time
writing unit tests for it.</p>
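<p>To illustrate why this isolation helps with testing, here is a minimal
sketch of a unit test. The simplified handler below is a stand-in for the
real lambda_handler, not the actual code; it only reproduces the event
type check:</p>
<div class="highlight"><pre><code>
def simple_handler(event, context):
    # simplified stand-in for the real lambda_handler: recognize an
    # API Gateway event by its fields, otherwise raise an error
    if event.get("resource") is not None \
            and event.get("path") is not None \
            and event.get("httpMethod") is not None:
        return {"statusCode": 200}
    raise ValueError("This lambda cannot handle this event type.")

# because the handler is a plain function that takes a dictionary, it can
# be exercised with hand-built events and no AWS services at all
response = simple_handler(
    {"resource": "/api/models", "path": "/api/models", "httpMethod": "GET"},
    context=None)
print(response["statusCode"])
</code></pre></div>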
<h1>Adding Serverless Configuration</h1>
<p>The serverless framework provides a command for starting a python lambda
project, but we'll skip it since we already created the lambda handler
code inside of the model_lambda package. Instead, we'll create the
settings file that the serverless framework uses by hand. The file is
named serverless.yml and it should be in the root of the project.</p>
<p>To begin we'll add a few basic things to the file:</p>
<div class="highlight"><pre><span></span><code><span class="n">service</span><span class="o">:</span><span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">service</span><span class="w"></span>
<span class="n">provider</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">name</span><span class="o">:</span><span class="w"> </span><span class="n">aws</span><span class="w"></span>
<span class="w"> </span><span class="n">runtime</span><span class="o">:</span><span class="w"> </span><span class="n">python3</span><span class="o">.</span><span class="mi">7</span><span class="w"></span>
<span class="n">stage</span><span class="o">:</span><span class="w"> </span><span class="n">dev</span><span class="w"></span>
<span class="n">region</span><span class="o">:</span><span class="w"> </span><span class="n">us</span><span class="o">-</span><span class="n">east</span><span class="o">-</span><span class="mi">2</span><span class="w"></span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L1-L8">here</a>.</p>
<p>These values will be used by the serverless framework to create a
service. A service can contain one or more functions plus any other
resources needed to support them. The name of the service is
"model-service", the provider will be AWS and the function runtime will
be python 3.7. The default stage will be "dev" and the default region
will be us-east-2. The values can be changed at deployment time.</p>
<p>Now we can add a function to the service:</p>
<div class="highlight"><pre><span></span><code><span class="nb">functions</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="n">model</span><span class="o">-</span><span class="n">lambda</span><span class="p">:</span><span class="w"></span>
<span class="w"> </span><span class="n">handler</span><span class="p">:</span><span class="w"> </span><span class="n">model_lambda</span><span class="p">.</span><span class="n">lambda_function</span><span class="p">.</span><span class="n">lambda_handler</span><span class="w"></span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L16-L18">here</a>.</p>
<p>The function will be named "model-lambda", and the handler points at the
location of the lambda_handler function that we put into the
lambda_function module. The lambda_handler function is located within
the lambda_function module, which is located in the model_lambda
package.</p>
<p>These lines are the only ones needed to get the basic settings in place
for the lambda. In the next sections we'll add more lines to the
serverless.yml file to handle other things.</p>
<h1>Building a Deployment Package</h1>
<p>The serverless framework can help us to build a deployment package for
the model-lambda, but to do this we need to add an extension called
"serverless-python-requirements". This extension allows the serverless
framework to create deployment packages that include all of the python
dependencies for the model-lambda code. The extension uses the
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/requirements.txt">requirements.txt</a>
file in the root of the project. To install the extension, use this
command:</p>
<div class="highlight"><pre><span></span><code>sls plugin install -n serverless-python-requirements
</code></pre></div>
<p>This command will add a node_modules folder to the project folder, and
some other files to keep track of the node.js dependencies of the
extension.</p>
<p>In order for the serverless framework to make use of the extension for
this project, we have to add this line to the serverless.yml file:</p>
<div class="highlight"><pre><span></span><code>plugins:
- serverless-python-requirements
</code></pre></div>
<p>This code can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L38-L39">here</a>.</p>
<p>Once serverless can find the extension, we can modify the way that the
extension will create the deployment package by adding these lines to
the serverless.yml file:</p>
<div class="highlight"><pre><span></span><code><span class="n">custom</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">pythonRequirements</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">dockerizePip</span><span class="o">:</span><span class="w"> </span><span class="kc">true</span><span class="w"></span>
<span class="w"> </span><span class="n">slim</span><span class="o">:</span><span class="w"> </span><span class="kc">true</span><span class="w"></span>
<span class="w"> </span><span class="n">noDeploy</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">apispec</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">PyYAML</span><span class="w"></span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L41-L47">here</a>.</p>
<p>The dockerizePip option makes the serverless-python-requirements
extension install the packages within the <a href="https://github.com/lambci/docker-lambda">docker-lambda
image</a>, which
guarantees that the deployment package will work in the Lambda service.
The slim option causes the extension to leave several unneeded file
types out of the deployment package, such as "__pycache__" directories.</p>
<p>The packages in the noDeploy list are excluded from the build; in this
case we don't need the apispec and PyYAML packages in the lambda.</p>
<p>Once we have all of this set up, we can test the creation of the
deployment package by using this command:</p>
<div class="highlight"><pre><span></span><code>sls package
</code></pre></div>
<p>After executing this command, the serverless framework will create a new
folder called ".serverless" inside of the project root. This folder
contains several different files that will be used when deploying the
service to AWS. The file we are interested in is called
"model-service.zip"; this file is the deployment package that will be
used to create the lambda. When we open this file, we'll see that the
serverless framework packaged almost all of the files in the
project folder into the deployment package, most of which are not
needed. To prevent this, we'll add these lines to the serverless.yml
file:</p>
<div class="highlight"><pre><span></span><code><span class="kd">package</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="n">exclude</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"**/**"</span><span class="w"></span>
<span class="w"> </span><span class="k">include</span><span class="o">:</span><span class="w"></span>
<span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="s2">"model_lambda/**"</span><span class="w"></span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L10-L14">here</a>.</p>
<p>These lines tell the serverless framework to only add the code in the
model_lambda python package to the lambda deployment package. This step
is important because the AWS Lambda service has a limit on the size of
deployment packages.</p>
<p>Having written scripts that build lambda deployment packages for lambdas
that have scikit-learn and numpy before, I can say that the
serverless-python-requirements extension makes everything much simpler.
The addition of the docker image for compiling source Python packages
makes everything even better since it guarantees that the deployment
package will work correctly in the AWS Lambda python environment. By
leveraging the serverless framework and the
serverless-python-requirements extension, we've avoided writing a lot of
code for deploying the lambda.</p>
<h1>Deploying the Model Lambda</h1>
<p>Now that we have the deployment package in hand, we can try to create
the lambda in AWS. To do this, we execute this command:</p>
<div class="highlight"><pre><span></span><code>sls deploy
</code></pre></div>
<p>This command will interact with the AWS API to create the lambda, using
a CloudFormation template. If we log in to the AWS console, we can see
the lambda listed in the user interface of the AWS Lambda service:</p>
<p><img alt="Lambda UI" src="https://www.tekhnoal.com/lambda_ui.png" width="100%"></p>
<p>We can execute the lambda in the cloud with this command:</p>
<div class="highlight"><pre><span></span><code>serverless invoke -f model-lambda -s dev -r us-east-1 -p tests/data/api_gateway_list_models_event.json
</code></pre></div>
<p>The command executes the lambda through the AWS API using a test event
from the unit tests folder.</p>
<h1>Adding a RESTful Interface</h1>
<p>Now that we have a lambda working inside of the AWS Lambda service, we
need to connect it to an event source. The serverless framework supports
this by adding an "events" array to the lambda function in
serverless.yml file:</p>
<div class="highlight"><pre><span></span><code>events:
- http:
path: api/models
method: get
- http:
path: api/models/{qualified_name}/metadata
method: get
request:
parameters:
paths:
qualified_name: true
- http:
path: api/models/{qualified_name}/predict
method: post
request:
parameters:
paths:
qualified_name: true
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/serverless.yml#L19-L36">here</a>.</p>
<p>The three events above correspond to three AWS API Gateway resources
that will trigger a lambda execution when they receive requests. After
adding these events, we can execute the deploy command again to create
the API Gateway:</p>
<div class="highlight"><pre><span></span><code>sls deploy
</code></pre></div>
<p>The API Gateway and its resources are added to the CloudFormation
template that serverless manages for the service, and serverless uses
the AWS API to create the API Gateway and route the events to the
lambda.</p>
<p>The deploy command returned the URL of the new API Gateway endpoints, so
to test out the new API Gateway I simply executed this command:</p>
<div class="highlight"><pre><span></span><code>curl https://ra2nrqnhrj.execute-api.us-east-1.amazonaws.com/dev/api/models
</code></pre></div>
<p>As expected, the endpoint returned information about the iris_model
MLModel that is configured. Note that the endpoint is not secured, so
it's not a good idea to keep the API Gateway running for a long time. To
delete the AWS resources we've been working with, execute this command:</p>
<div class="highlight"><pre><span></span><code>sls remove
</code></pre></div>
<p>Even though we can create an API Gateway by using the serverless
framework, the serverless.yml file is missing a lot of information that
is provided by an OpenAPI specification. In order to properly document
the API, I created an OpenAPI specification for it, which can be found
<a href="https://github.com/schmidtbri/lambda-ml-model-deployment/blob/master/openapi_specification.yaml">here</a>.</p>
<h1>Closing</h1>
<p>A benefit of deploying an ML model on an AWS Lambda is the simplicity of
the deployment. By removing the need to manage servers, the path to
deploying an ML model is much faster and simpler. Another benefit is the
number of integrations that AWS provides for the Lambda service. In this
blog post, we showed how to integrate the lambda with an API Gateway to
create a RESTful service, but there are many other options available.</p>
<p>A drawback of the Lambda service is that it suffers from cold start
latency. A cold start happens when a lambda is executed in response to an
event after not being used for 15 minutes; when this happens, the lambda
takes extra time to respond to the request. <a href="https://mikhail.io/serverless/coldstarts/aws/">This blog
post</a> goes
into the details of this problem. The cold start problem becomes even
more pronounced with a lambda that is hosting an ML model, because the
model parameters need to be deserialized when the lambda first starts
up, which adds to the cold start time.</p>
<p>Another problem that we might face when deploying an ML model inside a
lambda is the limits on the deployment package size. The AWS Lambda
service limit for the deployment package size is 50 MB. When packaging
model files along with the deployment package we might go beyond that
limit very easily. This can be fixed by having the lambda pick up the
model files from an S3 bucket. I will show details for a simple and
general way to do this in a later blog post.</p>
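<p>As a preview, a minimal sketch of that approach might look like the
following, assuming the boto3 package is available in the lambda; the
bucket and key names are hypothetical:</p>
<div class="highlight"><pre><code>
import boto3

def load_model_file(bucket, key, local_path="/tmp/model.pickle"):
    """Download a serialized model file from S3 into the lambda's /tmp
    directory, the only writable filesystem in the Lambda environment."""
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, local_path)
    return local_path

# calling this at module level means the download happens once per
# container on a cold start, not on every request; the names below are
# hypothetical
# MODEL_PATH = load_model_file("my-model-bucket", "iris_model/model.pickle")
</code></pre></div>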
<p>An interesting way to improve the code is to make it possible to
integrate other data sources in AWS with the model lambda. For example,
we can have the Lambda listen for events coming from a Simple Queueing
Service queue, make a prediction and put the prediction result in
another SQS queue. Another option is to do a similar integration with
the <a href="https://aws.amazon.com/kinesis/data-streams/">AWS Kinesis</a>
service for doing streaming analytics. Both of these services can be
integrated with AWS Lambda easily.</p>
<h1>A Task Queue ML Model Deployment</h1>
<p>2019-10-24, by Brian Schmidt</p>
<p>This blog post builds on the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/task-queue-ml-model-deployment">github repo</a>.</p>
<h1>Introduction</h1>
<p>When building software, we may come across situations in which we want
to execute a long-running operation behind the scenes while keeping the
main execution path of the code running. This is useful when the
software needs to remain responsive to a user, and the long running
operation would get in the way. These types of operations often involve
contacting another service over the network or writing data to IO. For
example, when a web service needs to send an email, often the best way
to do it is to launch a task in the background that will actually send
the email, and return a response to the client immediately.</p>
<p>These types of tasks are often handled in a task queue, which can also
be called a <a href="https://en.wikipedia.org/wiki/Job_queue">job queue</a>. A task
queue is a service that receives requests to perform tasks, and handles
finding the resources necessary for the task, and scheduling the task.
It can also store the results of the tasks for later retrieval. Tasks
usually execute asynchronously, which means that the client does not
wait for the result of the task, but synchronous execution can also be
supported.</p>
<p>A task queue can also execute tasks on many different physical
computers, which makes it a distributed system. To handle communication
between many machines, a task queue often makes use of a <a href="https://en.wikipedia.org/wiki/Message_broker">message
broker</a>
service to handle message passing between the worker processes that
execute the tasks and the clients of the tasks. A message broker service
acts as a middle man, receiving, storing, routing, and sending messages
between many different services. A message broker service is an
implementation of the
<a href="https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern">publish-subscribe</a>
pattern. The benefit of using this pattern is that the services that
communicate over the message broker remain decoupled from each other.</p>
<p>A task queue can be useful for machine learning model deployments, since
a machine learning model may take some time to make a prediction and
return a result. Most often, the ML prediction algorithm itself is
CPU-bound, which means that it is limited by the availability of CPU
time. This means that a task queue is usually not necessary for the
deployment of the ML model itself, but for dealing with the loading of
data that the prediction algorithm may need to make a prediction which
is an IO-bound process. Another situation in which a task queue may be
useful is when we need to make thousands of predictions and return them
as a result; in this case it would be useful to launch an asynchronous
task that will take care of the predictions behind the scenes and then
come back later to access the results.</p>
<h1>Task Queueing With Celery</h1>
<p>Celery is a python package that handles most of the complexity of
distributing and executing tasks across different processes. Celery is
able to use many different types of message brokers to distribute tasks;
for this blog post we'll use the Redis message broker. In order to
access task results, Celery supports several kinds of result storage
backends; for this blog we'll also use Redis to store the prediction
results of the model. As in previous blog posts, we'll be deploying the
iris_model package, which was developed as an example and has now been
deployed several times.</p>
<p>Since we are now dealing with more than one service and we are
communicating data between several different processes over a network,
it's useful to visualize the activity of the task queue with a software
architecture diagram:</p>
<p><img alt="Software Architecture" src="https://www.tekhnoal.com/software_architecture.png" width="100%"></p>
<p>The client application installs the Celery application package and sends
task requests through the tasks that are defined in it. Whenever a task
needs to be executed, the client sends a message to the task broker with
any parameters that the task needs to execute. The message broker
receives messages and holds them until they are picked up by the worker
processes. The workers run the Celery application and pick up messages
from the message broker; when a task is completed, they store the
results in the result storage backend.</p>
<h1>Package Structure</h1>
<p>To begin, I set up the project structure for the application package:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_task_queue</span> <span class="ss">(</span> <span class="nv">python</span> <span class="nv">package</span> <span class="k">for</span> <span class="nv">task</span> <span class="nv">queue</span> <span class="nv">app</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">__main__</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">command</span> <span class="nv">line</span> <span class="nv">entry</span> <span class="nv">point</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">celery</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">celery</span> <span class="nv">application</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">ml_model_task</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">task</span> <span class="nv">class</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">scripts</span>
<span class="o">-</span> <span class="nv">simple_test</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">single</span> <span class="nv">prediction</span> <span class="nv">test</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">continuous_test</span>.<span class="nv">py</span> <span class="ss">(</span> <span class="nv">multiple</span> <span class="nv">prediction</span> <span class="nv">test</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span> <span class="nv">unit</span> <span class="nv">tests</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">setup</span>.<span class="nv">py</span>
</code></pre></div>
<p>This structure can be seen here in the <a href="https://github.com/schmidtbri/task-queue-ml-model-deployment">github
repository</a>.</p>
<h1>Model Async Task</h1>
<p>Creating an asynchronous task with the Celery package is simple: it's as
easy as putting a function decorator on a function. An example of how to
do this can be found in the <a href="https://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#application">Celery startup
guide</a>.
The function decorator allows the client application to call the
function just like a local function, while having the actual execution
of the code happen asynchronously in a worker process running in a
different computer. In the client code, the function acts as a facade
that hides the complexities of parameter serialization/deserialization,
network communication and other complexities of the distributed nature
of the task queue.</p>
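<p>As a point of comparison, the decorator-based approach from the Celery
startup guide looks roughly like this. The Redis URLs are illustrative,
and calling the task function directly (rather than through .delay())
runs it synchronously in the current process:</p>
<div class="highlight"><pre><code>
from celery import Celery

# a minimal Celery application using a local Redis instance as both the
# message broker and the result backend; the URLs are illustrative
app = Celery("tasks",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/0")

@app.task
def add(x, y):
    # the decorated function can be executed by a worker with
    # add.delay(2, 2), or called directly as a plain function
    return x + y

print(add(2, 2))
</code></pre></div>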
<p>The function decorator is a simple way to get started with Celery tasks,
but we have some special requirements that make it hard to create Celery
tasks this way. For example, Celery task functions don't maintain state
between requests. If we had to instantiate an MLModel object for every
task request, the model parameters would have to be loaded and
deserialized over and over for each request. To get around this
limitation we'll have to code the ML model async task in such a way that
it can maintain an instance of an MLModel object in memory between
requests. A way to do this can be found in the Celery documentation
<a href="https://docs.celeryproject.org/en/latest/userguide/tasks.html#custom-task-classes">here</a>.</p>
<p>Following the example in the documentation, we'll define a class that
inherits from the celery.Task base class:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">celery</span> <span class="kn">import</span> <span class="n">Task</span>
<span class="k">class</span> <span class="nc">MLModelPredictionTask</span><span class="p">(</span><span class="n">Task</span><span class="p">):</span>
<span class="sd">"""Celery Task for making ML Model predictions."""</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/ml_model_task.py#L3-L9">here</a>.</p>
<p>Now we'll define the task class' __init__ method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">module_name</span><span class="p">,</span> <span class="n">class_name</span><span class="p">):</span>
<span class="sd">"""Class constructor."""</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="kc">None</span>
<span class="n">model_module</span> <span class="o">=</span> <span class="n">importlib</span><span class="o">.</span><span class="n">import_module</span><span class="p">(</span><span class="n">module_name</span><span class="p">)</span>
<span class="n">model_class</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">model_module</span><span class="p">,</span> <span class="n">class_name</span><span class="p">)</span>
<span class="k">if</span> <span class="nb">issubclass</span><span class="p">(</span><span class="n">model_class</span><span class="p">,</span> <span class="n">MLModel</span><span class="p">)</span> <span class="ow">is</span> <span class="kc">False</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"MLModelPredictionTask can only be used with subtypes of MLModel."</span><span class="p">)</span>
<span class="c1"># saving the reference to the class to avoid having to import it again</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model_class</span> <span class="o">=</span> <span class="n">model_class</span>
<span class="c1"># adding a name to the task object</span>
<span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{}</span><span class="s2">.</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="vm">__name__</span><span class="p">,</span> <span class="n">model_class</span><span class="o">.</span><span class="n">qualified_name</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/ml_model_task.py#L11-L26">here</a>.</p>
<p>The __init__() method accepts two parameters: the name of the module
where we can find the MLModel-derived class, and the name of the class
in that module that implements the prediction functionality. The
__init__() method then calls the __init__() method of the Celery
Task base class to make sure that all of the required initialization
code is executed correctly. Then the "_model" property is set to None
(for now). After this, we dynamically import the MLModel class from the
environment, and check that it is a subclass of MLModel. Next, we save a
reference to the class in the "_model_class" property of the new task
object, but we do not instantiate the model class itself; the reason for
this is explained below. Lastly, we set a unique name for the Celery
task based on the name of the MLModelPredictionTask class' module and
the qualified name of the MLModel class that is being hosted inside of
this instance of the MLModelPredictionTask class. The name of the task
is set dynamically so that we are able to host many different models
within the same celery application, while guaranteeing that the tasks
will have unique names.</p>
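<p>The naming scheme itself is plain string formatting. A minimal sketch, using an illustrative stand-in class (the qualified_name attribute mirrors the one on MLModel subclasses):</p>

```python
class IrisModel:
    # illustrative stand-in for an MLModel subclass
    qualified_name = "iris_model"

# the hosting module's name plus the model's qualified name
# yields a task name that is unique per hosted model
task_name = "{}.{}".format("model_task_queue.ml_model_task",
                           IrisModel.qualified_name)
print(task_name)  # prints model_task_queue.ml_model_task.iris_model
```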
<p>Next, the initialize() method is responsible for instantiating
the model class and saving the reference as a property of the
MLModelPredictionTask object:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">model_object</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model_class</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="o">=</span> <span class="n">model_object</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/ml_model_task.py#L28-L31">here</a>.</p>
<p>Lastly, the run() method is responsible for doing the work of the async
task:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">initialize</span><span class="p">()</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/ml_model_task.py#L33-L37">here</a>.</p>
<p>The run() method checks if the model class is instantiated before it
attempts to make a prediction. If it is not instantiated, it calls the
initialize() method to create the model object before making a
prediction with it. The run() method is the one that defines the actual
functionality of the Celery task.</p>
<p>In
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous</a>
<a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
posts</a>,
the instantiation of the model class happens in the __init__()
method of the class that is managing the model object. After this, we
can use the model class to make a prediction. We have to take a
different approach in this application because we need to keep the model
class from being instantiated in the client application that is using
the asynchronous task. This happens because the client application
instantiates and manages an instance of the task class in its own
process space, and uses it to communicate with the worker processes that
are actually doing the work. To keep the model class from being
instantiated in the client application, the run() method is actually
responsible for initializing the model class instead of the
__init__() method. The only downside to this approach is that when
the worker process instantiates the task class, it will not have an
instance of the model class in memory, it will only be created the first
time that a prediction is made.</p>
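<p>The lazy-initialization pattern described above can be sketched independently of Celery. The class and method names here are illustrative, not part of the Celery API:</p>

```python
class LazyPredictionTask:
    """Sketch of lazy model initialization: the expensive model object is
    only created in the process that actually runs the task."""

    def __init__(self, model_class):
        # only a reference to the class is stored; no model is loaded yet
        self._model_class = model_class
        self._model = None

    def run(self, data):
        # the model is instantiated on the first call, i.e. in the worker
        if self._model is None:
            self._model = self._model_class()
        return self._model.predict(data)


class FakeModel:
    """Stand-in for an MLModel subclass."""
    def predict(self, data):
        return {"echo": data}


task = LazyPredictionTask(FakeModel)
assert task._model is None           # nothing loaded in the "client"
result = task.run({"x": 1})          # first call triggers initialization
assert task._model is not None       # model now lives in the "worker"
```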
<h1>Celery Application</h1>
<p>Now that we have a Celery task that can host an MLModel-based class, we
can start building a Celery application that hosts the tasks.</p>
<p>First, we will install a machine learning model that will be hosted by
the Celery application. For this we'll use the iris_model package that
I've already shown in
<a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog
posts</a>:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements#egg<span class="o">=</span>iris_model
</code></pre></div>
<p>Then, we'll create a configuration class for the application:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">Config</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="sd">"""Configuration for all environments."""</span>
<span class="n">models</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span>
<span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
<span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
<span class="p">}</span>
<span class="p">]</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/config.py#L4-L12">here</a>.</p>
<p>The configuration class defines a property called "models" that is a list
of dictionaries, each dictionary containing two keys. The "module_name"
key points at a module that contains an MLModel-derived class, and the
"class_name" key contains the name of the class. By storing the
locations of the classes in this way, adding a new MLModel class to the
application is as simple as adding an entry to the list. The
configuration above points at the IrisModel class that we just installed
in the iris_model package. This class is meant to hold configuration
that is shared by all of the environments.</p>
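<p>The mechanism that makes this configuration work is dynamic importing with importlib, as shown in the task's __init__() method. A minimal sketch, using a standard-library class as a stand-in for a real model class:</p>

```python
import importlib

def load_class(module_name, class_name):
    """Dynamically import a class given its module and class names."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# using a standard-library class as a stand-in for an MLModel subclass
encoder_class = load_class("json", "JSONEncoder")
encoder = encoder_class()
print(encoder.encode({"a": 1}))  # prints {"a": 1}
```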
<p>In the same file we also store configuration for different environments,
here is the configuration class for the production environment:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ProdConfig</span><span class="p">(</span><span class="n">Config</span><span class="p">):</span>
<span class="sd">"""Configuration for the prod environment."""</span>
<span class="n">broker_url</span> <span class="o">=</span> <span class="s1">'redis://localhost:6379/0'</span>
<span class="n">result_backend</span> <span class="o">=</span> <span class="s1">'redis://localhost:6379/0'</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/config.py#L15-L19">here</a>.</p>
<p>The configuration points at a Redis service on localhost for
now. With the configuration taken care of, we can start building
the Celery application. To do this we start by instantiating a task
registry:</p>
<div class="highlight"><pre><span></span><code><span class="n">registry</span> <span class="o">=</span> <span class="n">TaskRegistry</span><span class="p">()</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/celery.py#L14">here</a>.</p>
<p>Next, we add tasks to the task registry:</p>
<div class="highlight"><pre><span></span><code><span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">Config</span><span class="o">.</span><span class="n">models</span><span class="p">:</span>
<span class="n">registry</span><span class="o">.</span><span class="n">register</span><span class="p">(</span><span class="n">MLModelPredictionTask</span><span class="p">(</span><span class="n">module_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"module_name"</span><span class="p">],</span> <span class="n">class_name</span><span class="o">=</span><span class="n">model</span><span class="p">[</span><span class="s2">"class_name"</span><span class="p">]))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/celery.py#L17-L19">here</a>.</p>
<p>The loop iterates through the list of models in the configuration,
instantiates a MLModelPredictionTask for each model, and registers the
new task with the task registry object we defined above.</p>
<p>Celery tasks are usually automatically registered in a task registry as
soon as they are instantiated, but we have a special situation because
of the dynamic and configuration-driven nature of the Celery
application. The manual registration of the task shown above is needed
because we don't know how many tasks we will be hosting in the
application; we only know this once the application starts up and reads
the configuration.</p>
<p>Now that we have a task registry with tasks in it, we can create the
Celery application object:</p>
<div class="highlight"><pre><span></span><code><span class="n">app</span> <span class="o">=</span> <span class="n">Celery</span><span class="p">(</span><span class="vm">__name__</span><span class="p">,</span> <span class="n">tasks</span><span class="o">=</span><span class="n">registry</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/celery.py#L22-L23">here</a>.</p>
<p>The name of the application is pulled from the module name, and the
tasks parameter is the task registry object we defined above.</p>
<p>Lastly, we need to point the Celery application to a broker and result
backend so that the clients and workers can communicate. These settings
are loaded from the configuration classes we've already defined:</p>
<div class="highlight"><pre><span></span><code><span class="n">app</span><span class="o">.</span><span class="n">config_from_object</span><span class="p">(</span><span class="s2">"model_task_queue.config.</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'APP_SETTINGS'</span><span class="p">]))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/model_task_queue/celery.py#L26">here</a>.</p>
<p>The name of the environment is loaded from an environment variable
called "APP_SETTINGS". The environment variable is then used to load
the correct configuration object from the config.py file.</p>
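<p>The environment-driven lookup can be sketched without Celery; the class names below are illustrative, and a dictionary lookup stands in for the dotted-path resolution that config_from_object performs:</p>

```python
import os

class Config:
    """Configuration shared by all environments."""
    broker_url = None

class ProdConfig(Config):
    broker_url = "redis://localhost:6379/0"

class TestConfig(Config):
    broker_url = "redis://localhost:6380/0"

# the APP_SETTINGS environment variable names the configuration class
os.environ["APP_SETTINGS"] = "ProdConfig"

# resolve the name to a class defined in this module
config_class = globals()[os.environ["APP_SETTINGS"]]
print(config_class.broker_url)  # prints redis://localhost:6379/0
```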
<h1>Using the Task</h1>
<p>To use the iris_model task in the Celery application we just built,
we'll need to start up an instance of redis to serve as the message
broker and storage backend for the task queue. To do this, we can use a
docker image with this command:</p>
<div class="highlight"><pre><span></span><code>docker run -d -p <span class="m">6379</span>:6379 redis
</code></pre></div>
<p>Now that we have a redis instance to communicate with, we can start a
Celery worker process:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">OBJC_DISABLE_INITIALIZE_FORK_SAFETY</span><span class="o">=</span>YES
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
<span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
python3 -m model_task_queue --loglevel INFO
</code></pre></div>
<p>The OBJC_DISABLE_INITIALIZE_FORK_SAFETY environment variable is
needed on macOS to allow Celery to fork processes when handling task
execution. The APP_SETTINGS environment variable is needed so that the
Celery application will load the right configuration. The PYTHONPATH
environment variable allows the Python interpreter to find the dependencies of
the Celery application. The last command starts the Celery worker process
by calling the script in the __main__.py module.</p>
<p>Next, we can try out the task itself in a python interactive session:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">import</span> <span class="nn">os</span>
<span class="o">>>></span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"APP_SETTINGS"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"ProdConfig"</span>
<span class="o">>>></span> <span class="kn">from</span> <span class="nn">model_task_queue.celery</span> <span class="kn">import</span> <span class="n">app</span>
<span class="o">>>></span> <span class="n">task</span> <span class="o">=</span> <span class="n">app</span><span class="o">.</span><span class="n">tasks</span><span class="p">[</span><span class="s2">"model_task_queue.ml_model_task.iris_model"</span><span class="p">]</span>
<span class="o">>>></span> <span class="n">task</span><span class="o">.</span><span class="vm">__dict__</span>
<span class="p">{</span><span class="s1">'_model'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'_model_class'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span>
<span class="err">'</span><span class="nc">iris_model</span><span class="o">.</span><span class="n">iris_predict</span><span class="o">.</span><span class="n">IrisModel</span><span class="s1">'>, '</span><span class="n">name</span><span class="s1">':</span>
<span class="s1">'model_task_queue.ml_model_task.iris_model'</span><span class="p">,</span> <span class="s1">'_exec_options'</span><span class="p">:</span>
<span class="p">{</span><span class="s1">'queue'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'routing_key'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'exchange'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">'priority'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'expires'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'serializer'</span><span class="p">:</span> <span class="s1">'json'</span><span class="p">,</span>
<span class="s1">'delivery_mode'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'compression'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'time_limit'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">'soft_time_limit'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'immediate'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">'mandatory'</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">'ignore_result'</span><span class="p">:</span> <span class="kc">False</span><span class="p">}}</span>
</code></pre></div>
<p>When using the Celery task, we first need to instantiate the Celery
application object that is hosting the task. This happens when we import
the model_task_queue.celery module. Once we have the application
object, we can query the app.tasks dictionary for the model task we are
interested in. The name of the task is dynamically generated from the
qualified name of the model that it is hosting.</p>
<p>As can be seen above, when the task is first instantiated, it does not
have an object reference in the _model property. This is as we
intended, since we are using the Celery application as a client and we
don't want the task to instantiate the model class which would cause the
model to be deserialized in the client process.</p>
<p>Now that we have an instance of the task, we can try to execute it:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">result</span> <span class="o">=</span> <span class="n">task</span><span class="o">.</span><span class="n">delay</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="p">{</span> <span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">5.0</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.2</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">})</span>
<span class="o">>>></span> <span class="n">result</span><span class="o">.</span><span class="n">ready</span><span class="p">()</span>
<span class="kc">True</span>
<span class="o">>>></span> <span class="n">result</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
<span class="p">{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}</span>
</code></pre></div>
<p>We use the task.delay() method to call the task asynchronously, getting
back a result object that can be used to get a result once the task is
completed. The ready() method of the result can be used to check on the
status of the result of the task. Once it is completed, the result can be
retrieved from the result backend with the get() method.</p>
<p>If the task throws an exception, the result will also throw an exception
when it is accessed:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">result</span> <span class="o">=</span> <span class="n">task</span><span class="o">.</span><span class="n">delay</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="p">{</span> <span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">5.0</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.2</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="s2">"asdfg"</span><span class="p">})</span>
<span class="o">>>></span> <span class="n">result</span><span class="o">.</span><span class="n">ready</span><span class="p">()</span>
<span class="kc">True</span>
<span class="o">>>></span> <span class="n">result</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
<span class="n">Traceback</span> <span class="p">(</span><span class="n">most</span> <span class="n">recent</span> <span class="n">call</span> <span class="n">last</span><span class="p">):</span>
<span class="o">...</span>
<span class="n">ml_model_abc</span><span class="o">.</span><span class="n">MLModelSchemaValidationException</span><span class="p">:</span> <span class="n">Failed</span> <span class="n">to</span> <span class="n">validate</span> <span class="nb">input</span> <span class="n">data</span><span class="p">:</span> <span class="n">Key</span> <span class="s1">'petal_width'</span> <span class="n">error</span><span class="p">:</span> <span class="n">asdfg</span> <span class="n">should</span> <span class="n">be</span> <span class="n">instance</span> <span class="n">of</span> <span class="s1">'float'</span>
</code></pre></div>
<p>Because the "petal_width" field contains data that does not meet the
schema of the iris model, the model threw an exception of type
MLModelSchemaValidationException. The exception was caught by the celery
worker, serialized, and transported back to the client.</p>
<h1>Test Script</h1>
<p>In order to test the Celery application, we'll code a script that will
make use of the iris_model task asynchronously. To use the application,
we import the Celery application from the module where it is defined:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">model_task_queue.celery</span> <span class="kn">import</span> <span class="n">app</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/scripts/concurrent_test.py#L4">here</a>.</p>
<p>Next, we'll define a function that starts a task, waits for it to
complete, and returns the prediction result:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">request_task</span><span class="p">(</span><span class="n">data</span><span class="p">):</span>
<span class="n">task</span> <span class="o">=</span> <span class="n">app</span><span class="o">.</span><span class="n">tasks</span><span class="p">[</span><span class="s2">"model_task_queue.ml_model_task.iris_model"</span><span class="p">]</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">task</span><span class="o">.</span><span class="n">delay</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="c1"># waiting for the task to complete</span>
<span class="k">while</span> <span class="n">result</span><span class="o">.</span><span class="n">ready</span><span class="p">()</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">True</span><span class="p">:</span>
<span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="n">prediction</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">timeout</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">return</span> <span class="n">prediction</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/scripts/concurrent_test.py#L7-L17">here</a>.</p>
<p>Lastly, we'll define a function that uses the function above to test the
iris_model task concurrently:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">run_test</span><span class="p">():</span>
<span class="n">data</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">5.0</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.2</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.2</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">5.5</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.5</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">4.9</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.1</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.5</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.1</span><span class="p">},</span>
<span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">4.4</span><span class="p">,</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.3</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">}</span>
<span class="p">]</span>
<span class="k">with</span> <span class="n">Executor</span><span class="p">(</span><span class="n">max_workers</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span> <span class="k">as</span> <span class="n">exe</span><span class="p">:</span>
<span class="n">jobs</span> <span class="o">=</span> <span class="p">[</span><span class="n">exe</span><span class="o">.</span><span class="n">submit</span><span class="p">(</span><span class="n">request_task</span><span class="p">,</span> <span class="n">d</span><span class="p">)</span> <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">data</span><span class="p">]</span>
<span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">job</span><span class="o">.</span><span class="n">result</span><span class="p">()</span> <span class="k">for</span> <span class="n">job</span> <span class="ow">in</span> <span class="n">jobs</span><span class="p">]</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"The tasks returned these predictions: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">results</span><span class="p">))</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/task-queue-ml-model-deployment/blob/master/scripts/concurrent_test.py#L20-L30">here</a>.</p>
<p>The function sets up a few inputs for the model in the data list. It
then calls the task concurrently using the ThreadPoolExecutor context
manager from the concurrent.futures module in the Python standard
library. The context manager executes
the request_task() function concurrently in four worker threads.</p>
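<p>The concurrency pattern used by the script can be sketched with any function standing in for request_task() (the fake function below is purely illustrative):</p>

```python
from concurrent.futures import ThreadPoolExecutor

def fake_request_task(data):
    # stand-in for the request_task() function that calls the Celery task
    return {"species": "setosa", "input": data}

inputs = [{"sepal_length": 5.0}, {"sepal_length": 5.5}]

# each submit() call schedules one task request on a worker thread;
# result() blocks until that request has completed
with ThreadPoolExecutor(max_workers=4) as executor:
    jobs = [executor.submit(fake_request_task, d) for d in inputs]
    results = [job.result() for job in jobs]

print(len(results))  # prints 2
```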
<p>To run the script, we'll need the redis docker image and the worker
process to be running. The script above can be executed from the command
line by using these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span>./
<span class="nb">export</span> <span class="nv">APP_SETTINGS</span><span class="o">=</span>ProdConfig
python3 scripts/concurrent_test.py
</code></pre></div>
<h1>Closing</h1>
<p>In this blog post I showed how to build a task queue application that is
able to host machine learning models. A task queue is very useful in
certain situations for deploying ML models because of the capabilities
it brings to the table. Task queues allow applications to do work
asynchronously in the background without blocking the main
application.</p>
<p>The ML model deployment strategy I showed in this blog post works in the
same way as the
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous</a>
<a href="https://www.tekhnoal.com/etl-job-ml-model-deployment.html">blog
posts</a>
I've published. The Celery application I built is not tied to a single
ML model; it works with any ML model that implements the MLModel base
class. The application can also host any number of models, and because
they are loaded from configuration, a new model can be
added to the Celery application without modifying the code. By following
good software engineering design practices, we are able to easily put
machine learning models into production without having to worry about
the implementation details of the models. All of these capabilities stem
from the design of the <a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">MLModel base
class</a>.</p>
<p>Another interesting feature of the Celery package is that we can launch
tasks from a variety of different languages. There are client libraries
for <a href="https://github.com/mher/node-celery">node.js</a> and
<a href="https://github.com/gjedeer/celery-php">PHP</a>. This
flexibility makes it possible to use Python for building and deploying
ML models, and to use other languages for the work that is best suited
for them.</p>
<p>A drawback of this approach is that when the Celery application is built
and deployed, the dependencies of the machine learning models that it is
hosting are installed along with it. This means that if two models
depend on different versions of scikit-learn or pandas, for example,
they won't be able to be installed in the same Celery application. This
limits the usefulness of the Celery application somewhat, since it can't
host models together that have conflicting requirements.</p>
<p>Another drawback of this approach is the extra complexity that it
entails, since it requires a message broker service, a result storage
service, and the worker processes to be running for the task queue to be
available to client applications. All of these requirements add extra
complexity to this deployment option.</p>
<p>The Celery application I built only handles single prediction
requests. Even though this is useful, it would make more sense for the
Celery application to run longer prediction jobs that make thousands of
predictions at a time. An improvement would be to launch prediction
tasks that take large files as input, feed the individual records in the
file to the model, and store the resulting predictions back into a
storage service.
The long-running task can also be instrumented to report its progress
back to the client that requested the predictions.</p>A Batch Job ML Model Deployment2019-09-20T09:24:00-05:002019-09-20T09:24:00-05:00Brian Schmidttag:www.tekhnoal.com,2019-09-20:/etl-job-ml-model-deployment.html<p>In previous blog posts I showed how to develop an ML model in such a way that makes it easy to deploy, and I showed how to create a web app that is able to deploy any model that follows the same design pattern. However, not all ML models are deployed within web apps. In this blog post I deploy the same model used in the previous blog posts as an ETL job.</p><p>This blog post continues the ideas started in
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">three</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/using-ml-model-abc.html">blog posts</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/etl-job-ml-model-deployment">github
repo</a>.</p>
<h1>Introduction</h1>
<p>In previous blog posts I showed how to develop an ML model in such a way
that makes it easy to deploy, and I showed how to create a web app that
is able to deploy any model that follows the same design pattern.
However, not all ML models are deployed within web apps.
In this blog post I deploy the same model used in the previous blog
posts as an ETL job.</p>
<p>An <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL
job</a>
is a procedure for copying data from a source system into a destination
system, with some processing along the way. The acronym ETL stands for
extract, transform, and load; as in extract from a source system,
transform the data into a format compatible with the destination system,
and load the resulting data into the destination system. ETLs are most
commonly associated with <a href="https://en.wikipedia.org/wiki/Data_warehouse">data
warehousing</a>
systems, in which they are used to take data from a system of record and
transform it to make it useful for reporting.</p>
<p>ETL jobs are useful for making predictions available to end users or to
other systems. The ETL for such an ML model deployment looks like this:
extract features used for prediction from a source system, send the
features to the model for prediction, and save the predictions to a
destination system.</p>
<p>A big distinction between ML models that are deployed in an ETL job and
the Flask web application shown in the <a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous blog
post</a>
is that the ETL job is <em>not</em> a real time system since it is not expected
to return predictions to the client quickly. ETLs are also meant to
process thousands of records at a time, whereas a web app processes one
record (request) at a time. A real-time deployment of an ML model should
be able to return single predictions in less than a second, whereas an ETL
deployment has a looser time constraint but makes many more predictions.</p>
<p>Another distinction between an ETL job deployment and a web service
deployment of an ML model is that an ETL saves predictions to data
storage, and the predictions are then accessed from there by the users
of the predictions. This means that the user of the predictions does not
interact with the model directly, and only has access to the predictions
saved since the ETL last ran. I call this distinction <em>interactive</em> vs.
<em>non-interactive</em> ML models. When an ML model is deployed
non-interactively, the users of the predictions have limitations as to
how they are able to use the model since they don't have direct access
to the model.</p>
<h1>Bonobo for ETL Jobs</h1>
<p>The <a href="https://www.bonobo-project.org/">bonobo package</a> is a
Python package for writing ETL jobs, offering a simple, Pythonic
interface for writing code that loads, transforms, and saves data. The
package works well for small datasets that can be processed in a single
process, but is less useful for larger datasets. Nevertheless, it is
perfect for small-scale data processing. The package has a strong
object-oriented bent and encourages good software engineering
practices through a well-designed API.</p>
<p>The bonobo package does data processing by running <a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed acyclic
graphs
(DAG)</a>
of operations defined by the user. I won't get into the complex aspects
of what a DAG is in this post, so to define it simply: a DAG of data
processing steps is a set of steps that can be executed in a certain
order in time based on their dependencies. For example, in order to
transform a data record we must first load the record into memory,
therefore the Extract step must be done before the Transform step. Each
step in a DAG is called a "transformation"; a transformation can do one
of three things: load data, transform data, or save data.</p>
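<p>To make the DAG idea concrete, here is a minimal, hypothetical sketch of an extract, transform, and load chain built from plain Python generators (this is not the bonobo API, and all names in it are made up for illustration; bonobo wires up its transformations in a conceptually similar streaming fashion):</p>

```python
def extract():
    # Extract step: yield raw records from a source (here, an in-memory list).
    for record in [{"value": 1}, {"value": 2}, {"value": 3}]:
        yield record


def transform(records):
    # Transform step: modify each record as it streams through.
    for record in records:
        yield {"value": record["value"] * 10}


def load(records):
    # Load step: save each record to a destination (here, a list).
    destination = []
    for record in records:
        destination.append(record)
    return destination


result = load(transform(extract()))
print(result)  # [{'value': 10}, {'value': 20}, {'value': 30}]
```

<p>Because each step is a generator, records flow through the chain one at a time rather than being materialized all at once, which is what makes this style of DAG suitable for streaming data processing.</p>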
<h1>ETL Application</h1>
<p>To develop the ETL application with the Bonobo package, I first set up
the project structure:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">data</span> <span class="ss">(</span><span class="nv">folder</span> <span class="k">for</span> <span class="nv">test</span> <span class="nv">data</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">model_etl</span> <span class="ss">(</span><span class="nv">folder</span> <span class="k">for</span> <span class="nv">application</span> <span class="nv">code</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">etl_job</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">model_node</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span><span class="nv">folder</span> <span class="k">for</span> <span class="nv">unit</span> <span class="nv">tests</span><span class="ss">)</span>
<span class="o">-</span> .<span class="nv">gitignore</span>
<span class="o">-</span> <span class="nv">Makefile</span>
<span class="o">-</span> <span class="nv">README</span>.<span class="nv">md</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This folder structure can be seen
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment">here</a>
in the github repository.</p>
<p>This folder structure for the ETL application looks very similar to the
one used in the Flask application in the previous blog post. We will be
following the same practices as before, adding documentation, unit tests
and a Makefile to the application to ensure quality code and to make it
easier to use.</p>
<h1>MLModelTransformer Class</h1>
<p>Running a machine learning model prediction step inside an ETL DAG
requires many of the same things as running a model inside a web
application. In the <a href="https://www.tekhnoal.com/using-ml-model-abc.html">previous blog
post</a>
we managed instances of MLModel classes inside a ModelManager singleton
object. The ModelManager object was used by the web application to
maintain a list of MLModel objects, and returned information about them
on request.</p>
<p>When a model makes a prediction, it is making a transformation on an
incoming record and returning a prediction. Therefore, to embed an ML
model inside of a bonobo ETL job, we just need to write a
transformation. We can write a transformation as a class:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelTransformer</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">module_name</span><span class="p">,</span> <span class="n">class_name</span><span class="p">):</span>
<span class="n">model_module</span> <span class="o">=</span> <span class="n">importlib</span><span class="o">.</span><span class="n">import_module</span><span class="p">(</span><span class="n">module_name</span><span class="p">)</span>
<span class="n">model_class</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">model_module</span><span class="p">,</span> <span class="n">class_name</span><span class="p">)</span>
<span class="n">model_object</span> <span class="o">=</span> <span class="n">model_class</span><span class="p">()</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">model_object</span><span class="p">,</span> <span class="n">MLModel</span><span class="p">)</span> <span class="ow">is</span> <span class="kc">False</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"The MLModelNode can only hold references to objects of type MLModel."</span><span class="p">)</span>
<span class="c1"># saving the model reference</span>
<span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">model_object</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/model_node.py#L7-L20">here</a>.</p>
<p>The __init__ method receives two parameters: module_name and
class_name. The __init__ method uses these parameters to
dynamically import and instantiate an MLModel class and saves a
reference to the newly created object. The __init__ method also
verifies that the class inherits from the MLModel base class.</p>
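<p>The dynamic import mechanism used by the __init__ method can be illustrated with a small stand-alone example that uses only the standard library (the class loaded here is from the standard library, purely for illustration):</p>

```python
import importlib


def load_class(module_name, class_name):
    # Dynamically import a module by name and look up a class inside it,
    # mirroring the pattern used in the MLModelTransformer __init__ method.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# For illustration, load a class from the standard library by name.
ordered_dict_class = load_class("collections", "OrderedDict")
instance = ordered_dict_class()
print(type(instance).__name__)  # OrderedDict
```

<p>This is what allows the module and class names to come from configuration or CLI parameters rather than being hardcoded as import statements.</p>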
<p>Just like the ModelManager class from the Flask app, the
MLModelTransformer class instantiates and maintains a reference to an
MLModel object internally. However, it is not meant to be a singleton
object and it only holds one MLModel object.</p>
<p>The MLModelTransformer class is meant to be plugged into a bonobo DAG
and exchange data with other transformations in the DAG. For that
purpose we implement a __call__ method:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">yield</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="k">except</span> <span class="n">MLModelSchemaValidationException</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">raise</span> <span class="n">e</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/model_node.py#L22-L27">here</a>.</p>
<p>The __call__() method makes the class a
<a href="https://www.journaldev.com/22761/python-callable-__call__">callable</a>.
This mechanism is used by the bonobo package to feed data into the DAG
transformation and receive data back. The <em>yield</em> keyword allows bonobo
to run transformations asynchronously. By implementing the transformer
this way, we can compose many different DAGs that use MLModel derived
classes to do data transformations.</p>
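<p>To illustrate why implementing __call__ with yield is useful, here is a hypothetical stand-alone callable class (not part of this project's code) that can be invoked like a function while streaming its results lazily:</p>

```python
class UppercaseTransformer(object):
    """A hypothetical callable transformation that streams results with yield."""

    def __call__(self, data):
        # Yielding (rather than returning) lets the caller consume results
        # lazily, and lets a transformation emit zero, one, or many output
        # records per input record.
        yield data.upper()


transformer = UppercaseTransformer()
results = list(transformer("iris"))
print(results)  # ['IRIS']
```

<p>A framework that drives such a callable can iterate over the generator it returns, pushing each yielded record downstream as it becomes available.</p>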
<p>Now we can test the MLModelTransformer class to make sure it's
working as expected. First, we have to install a model into the
environment; we'll install the iris_model package that was built in a
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">previous blog
post</a>:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Now that we have a model package in the environment, we use a Python
interactive session to instantiate the class and try to make a
prediction:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">model_etl.model_node</span> <span class="kn">import</span> <span class="n">MLModelTransformer</span>
<span class="o">>>></span> <span class="n">model_transformer</span> <span class="o">=</span> <span class="n">MLModelTransformer</span><span class="p">(</span><span class="n">module_name</span><span class="o">=</span><span class="s2">"iris_model.iris_predict"</span><span class="p">,</span> <span class="n">class_name</span><span class="o">=</span><span class="s2">"IrisModel"</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">generator</span> <span class="o">=</span> <span class="n">model_transformer</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="mf">4.4</span><span class="p">,</span>
<span class="o">...</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="mf">2.9</span><span class="p">,</span> <span class="s2">"petal_length"</span><span class="p">:</span> <span class="mf">1.4</span><span class="p">,</span> <span class="s2">"petal_width"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">})</span>
<span class="o">>>></span> <span class="n">result</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">generator</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">result</span>
<span class="p">[{</span><span class="s1">'species'</span><span class="p">:</span> <span class="s1">'setosa'</span><span class="p">}]</span>
</code></pre></div>
<p>We first instantiate the transformer class by pointing it at the module
and class in the iris_model package that implements the MLModel base
class. Then we can make a prediction by calling the class with a single
dictionary object. The transformer yields its predictions, so we have
to cast the return value of the transformer into a list to view it on
the screen.</p>
<p>As in the previous blog posts, we are trying to write the code in such a
way as to make it reusable in many situations. The MLModelTransformer
class can be used to load and manage ML model objects in any bonobo ETL,
which saves time and work later. One caveat to this, however, is that
the ETL must feed records to the MLModelTransformer object exactly as
the MLModel expects them, since any schema differences will raise an
MLModelSchemaValidationException from the model within the transformer.
In practice, this means that the IrisModel.predict() method expects to
receive a dictionary containing several floating point numbers; if the
data source does not provide records with this schema, we have to
transform the incoming data to match it.</p>
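<p>For example, if a data source provided records with differently-named fields, a small adapter transformation placed in the graph before the model could rename them. This is a hypothetical sketch; the source field names are made up for illustration:</p>

```python
def adapt_record(record):
    # Hypothetical adapter: rename source fields to the schema the
    # IrisModel expects, and coerce string values to floats.
    return {
        "sepal_length": float(record["SepalLength"]),
        "sepal_width": float(record["SepalWidth"]),
        "petal_length": float(record["PetalLength"]),
        "petal_width": float(record["PetalWidth"]),
    }


source_record = {"SepalLength": "4.4", "SepalWidth": "2.9",
                 "PetalLength": "1.4", "PetalWidth": "0.2"}
print(adapt_record(source_record))
# {'sepal_length': 4.4, 'sepal_width': 2.9, 'petal_length': 1.4, 'petal_width': 0.2}
```

<p>An adapter like this would sit between the extractor and the MLModelTransformer in the chain, so that schema validation inside the model never fires on well-formed source data.</p>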
<h1>Creating a Graph</h1>
<p>A bonobo application runs an ETL from a Graph object that is defined at
application startup. Any number of transformations can be used, and they
can be arranged into complex DAGs. Every Graph object must contain at
least one extractor to get data from an outside source, and one loader
to save data to an outside destination. The bonobo package provides
several options for accessing data files; we'll use the LDJSON extractor
and loader transformations to define a simple Graph inside a function:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_graph</span><span class="p">(</span><span class="o">**</span><span class="n">options</span><span class="p">):</span>
<span class="n">graph</span> <span class="o">=</span> <span class="n">bonobo</span><span class="o">.</span><span class="n">Graph</span><span class="p">()</span>
<span class="n">graph</span><span class="o">.</span><span class="n">add_chain</span><span class="p">(</span>
<span class="n">LdjsonReader</span><span class="p">(</span><span class="n">options</span><span class="p">[</span><span class="s2">"input_file"</span><span class="p">],</span> <span class="n">mode</span><span class="o">=</span><span class="s1">'r'</span><span class="p">),</span>
<span class="n">MLModelTransformer</span><span class="p">(</span><span class="n">module_name</span><span class="o">=</span><span class="s2">"iris_model.iris_predict"</span><span class="p">,</span> <span class="n">class_name</span><span class="o">=</span><span class="s2">"IrisModel"</span><span class="p">),</span>
<span class="n">LdjsonWriter</span><span class="p">(</span><span class="n">options</span><span class="p">[</span><span class="s2">"output_file"</span><span class="p">],</span> <span class="n">mode</span><span class="o">=</span><span class="s1">'w'</span><span class="p">))</span>
<span class="k">return</span> <span class="n">graph</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/graph.py#L8-L14">here</a>.</p>
<p>The function receives two file names as parameters. The input file name
is used to instantiate an LdjsonReader object that will load data from a
local JSON file, and the output file name is used to instantiate an
LdjsonWriter to write data to a local JSON file. The MLModelTransformer
class is instantiated by pointing it at the IrisModel class.</p>
<p>We can now instantiate the graph from an interactive Python session:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">model_etl.etl_job</span> <span class="kn">import</span> <span class="n">get_graph</span>
<span class="o">>>></span> <span class="n">graph</span> <span class="o">=</span> <span class="n">get_graph</span><span class="p">(</span><span class="s2">"data/input.json"</span><span class="p">,</span> <span class="s2">"data/output.json"</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">graph</span>
<span class="o"><</span><span class="n">bonobo</span><span class="o">.</span><span class="n">structs</span><span class="o">.</span><span class="n">graphs</span><span class="o">.</span><span class="n">Graph</span> <span class="nb">object</span> <span class="n">at</span> <span class="mh">0x10a52ffd0</span><span class="o">></span>
</code></pre></div>
<p>The great thing about this approach to building ETLs is that a different
reader or writer can be easily swapped in to add functionality, while
the core transformations of the ETL remain unchanged. For example, we
can implement a Graph that reads CSV files and writes TSV files in the
same module, and select it at runtime using a parameter.</p>
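<p>One way to select the file format at runtime is a small factory keyed by a format parameter. This is a hypothetical sketch; the reader classes here are simplified stand-ins for bonobo's file transformations, not the real API:</p>

```python
# Hypothetical stand-ins for bonobo's file reader transformations.
class LdjsonReader:
    def __init__(self, path):
        self.path = path


class CsvReader:
    def __init__(self, path):
        self.path = path


READERS = {
    "ldjson": LdjsonReader,
    "csv": CsvReader,
}


def get_reader(file_format, path):
    # Select the reader class at runtime based on a format parameter,
    # leaving the rest of the graph unchanged.
    try:
        return READERS[file_format](path)
    except KeyError:
        raise ValueError("Unsupported format: {}".format(file_format))


reader = get_reader("csv", "data/input.csv")
print(type(reader).__name__)  # CsvReader
```

<p>The same dictionary-dispatch approach could select an entire graph-building function, so a `--format` CLI parameter could switch the ETL between file types without touching the model transformation.</p>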
<h1>Running the ETL Process Locally</h1>
<p>The graph defined in the previous section works well when running it
from an interactive Python session, but it would be better to run it
from the command line. Before writing the code to create a simple command
line interface, we need to create some parameters for the input and
output file names:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_argument_parser</span><span class="p">(</span><span class="n">parser</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">bonobo</span><span class="o">.</span><span class="n">get_argument_parser</span><span class="p">(</span><span class="n">parser</span><span class="o">=</span><span class="n">parser</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--input_file"</span><span class="p">,</span> <span class="s2">"-i"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Path of the input file."</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--output_file"</span><span class="p">,</span> <span class="s2">"-o"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Path of the output file."</span><span class="p">)</span>
<span class="k">return</span> <span class="n">parser</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/etl_job.py#L7-L14">here</a>.</p>
<p>The function retrieves the standard command line parser that is defined by
the bonobo package, and adds two parameters for the input and output
file names. The new parser object is then returned.</p>
<p>To create a CLI interface we define a __main__ block inside of
the etl_job.py module and use the parser defined above:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">get_argument_parser</span><span class="p">()</span>
<span class="k">with</span> <span class="n">bonobo</span><span class="o">.</span><span class="n">parse_args</span><span class="p">(</span><span class="n">parser</span><span class="p">)</span> <span class="k">as</span> <span class="n">options</span><span class="p">:</span>
<span class="n">bonobo</span><span class="o">.</span><span class="n">run</span><span class="p">(</span>
<span class="n">get_graph</span><span class="p">(</span><span class="o">**</span><span class="n">options</span><span class="p">),</span>
<span class="n">services</span><span class="o">=</span><span class="p">{}</span>
<span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/etl_job.py#L17-L23">here</a>.</p>
<p>The graph can now be run from the command line with these commands:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span><span class="s2">"</span><span class="si">${</span><span class="nv">PYTHONPATH</span><span class="si">}</span><span class="s2">:./"</span>
python model_etl/etl_job.py --input_file<span class="o">=</span>data/input.json --output_file<span class="o">=</span>data/output.json
</code></pre></div>
<p>First, we add the current directory to the PYTHONPATH environment
variable to ensure that the Python modules will be found. Then we can
execute the graph through the command line interface in the etl_job.py
module, passing the CLI parameters. The input file is included in the
repository
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/data/input.json">here</a>,
and contains 15 records, which we can see were processed by the three
transformations in the graph. The output will be saved as an LDJSON file
in the data folder of the project.</p>
<p>The ETL graph looks pretty good now: it is able to run from the command
line and we can parametrize the input and output files. However, a
real-world ETL is probably not accessing data from the local hard drive,
so we'll add the ability to access data from other places.</p>
<h1>Accessing Data from a Service</h1>
<p>When testing an ETL job locally, it is easiest to load data from and
save data to the local hard drive. When running the ETL in a production
environment, the ETL code will most likely be accessing data from remote
storage systems. We could write new implementations of the
LdjsonReader and LdjsonWriter classes that access files from a remote
system, but this would not be a best practice.</p>
<p>To be able to write code once and reuse it in many different situations,
the bonobo package supports dependency injection through service
abstractions. A
<a href="https://en.wikipedia.org/wiki/Service_(systems_architecture)">service</a>
is a software component that provides functionality to other software
components. For example, the os Python package that is part of the
standard library can be thought of as a service, since it provides
access to the operating system. <a href="https://en.wikipedia.org/wiki/Dependency_injection">Dependency
injection</a>
is a software pattern that allows software components to be written in
such a way that makes them easier to reuse in many different
situations.</p>
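<p>A minimal hypothetical example of dependency injection (the class and function names here are made up for illustration): instead of a transformation opening files itself, it receives a filesystem-like object, so one implementation can later be swapped for another without changing the transformation:</p>

```python
class InMemoryFilesystem:
    # A hypothetical filesystem service backed by a dictionary; a local
    # or S3-backed implementation could expose the same read() interface.
    def __init__(self, files):
        self._files = files

    def read(self, path):
        return self._files[path]


def count_lines(filesystem, path):
    # The transformation depends on the injected service's interface,
    # not on a concrete filesystem, so any implementation can be swapped in.
    return len(filesystem.read(path).splitlines())


fs_service = InMemoryFilesystem({"data/input.json": "line1\nline2\n"})
print(count_lines(fs_service, "data/input.json"))  # 2
```

<p>This is exactly the property we want for the ETL: the graph keeps calling the same interface while the injected service decides whether the bytes come from the local disk or from S3.</p>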
<p>In the example we set up for this blog post, we are interested in
accessing files from a remote data source, but without changing the
ETL's Graph. In this way, we can easily change the data source of the
ETL in the future without changing the code of the ETL. To show how to
do this, I will swap the local filesystem for an S3 bucket as the file
source, without changing the bonobo Graph object.</p>
<p>The bonobo package provides a mechanism for injecting service instances
into a Graph at runtime. Right now, the JSON files are being accessed
through a local filesystem service that is injected by default into
every Graph. To be able to access files from a remote service, we'll
just replace the default filesystem service instance with another
service instance with the same interface that loads files from a remote
source.</p>
<p>As an example, we'll show how to access files stored in S3. To be able
to access files in an S3 bucket, we first have to install the fs-s3fs
package with this command:</p>
<div class="highlight"><pre><span></span><code>pip install fs-s3fs
</code></pre></div>
<p>Now we can instantiate a special type of filesystem that accesses files
from an AWS S3 bucket but has the same interface as a local filesystem.
The <a href="https://www.pyfilesystem.org/">fs package</a> already
provided this functionality when we accessed local files in the example
above, so we know that the code will work with the S3 filesystem.</p>
<p>To inject a service through the bonobo package we define a dictionary
like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_services</span><span class="p">(</span><span class="o">**</span><span class="n">options</span><span class="p">):</span>
<span class="k">return</span> <span class="p">{</span>
<span class="s1">'fs'</span><span class="p">:</span> <span class="n">S3FS</span><span class="p">(</span><span class="n">options</span><span class="p">[</span><span class="s2">"bucket"</span><span class="p">],</span>
<span class="n">aws_access_key_id</span><span class="o">=</span><span class="n">options</span><span class="p">[</span><span class="s2">"key"</span><span class="p">],</span>
<span class="n">aws_secret_access_key</span><span class="o">=</span><span class="n">options</span><span class="p">[</span><span class="s2">"secret_key"</span><span class="p">],</span>
<span class="n">endpoint_url</span><span class="o">=</span><span class="n">options</span><span class="p">[</span><span class="s2">"endpoint_url"</span><span class="p">],)</span>
<span class="p">}</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/s3_etl_job.py#L8-L15">here</a>.</p>
<p>The new fs filesystem service replaces the service that bonobo
instantiates by default at startup. The extra options needed to connect
to S3 are received through keyword arguments; we'll provide them to the
function at runtime.</p>
<p>In order to run the new ETL, we'll create a new CLI interface for it:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">get_argument_parser</span><span class="p">(</span><span class="n">parser</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">bonobo</span><span class="o">.</span><span class="n">get_argument_parser</span><span class="p">(</span><span class="n">parser</span><span class="o">=</span><span class="n">parser</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--input_file"</span><span class="p">,</span> <span class="s2">"-i"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Path of the input file."</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--output_file"</span><span class="p">,</span> <span class="s2">"-o"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Path of the output file."</span><span class="p">)</span>
<span class="c1"># these parameters are added for accessing different S3 services</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--bucket"</span><span class="p">,</span> <span class="s2">"-b"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Bucket name in S3 service."</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--key"</span><span class="p">,</span> <span class="s2">"-k"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Key to access S3 service."</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--secret_key"</span><span class="p">,</span> <span class="s2">"-sk"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Secret key to access the S3 service."</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s2">"--endpoint_url"</span><span class="p">,</span> <span class="s2">"-ep"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s2">"Endpoint URL for S3 service."</span><span class="p">)</span>
<span class="k">return</span> <span class="n">parser</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/s3_etl_job.py#L18-L31">here</a>.</p>
<p>The new command line argument parser still accepts input and output file
names, but now also receives parameters to access the S3 bucket where
the data to be processed is stored. The parameters are: the key and
secret key to access the bucket, and the endpoint URL for contacting the
S3 service.</p>
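<p>As a rough illustration, a parser with the same options can be exercised directly on a sample command line (argument names mirror the ones above; help strings are abbreviated, and this is a sketch rather than the exact parser from the repository):</p>

```python
import argparse

# Build a parser with the same options described above.
parser = argparse.ArgumentParser()
parser.add_argument("--input_file", "-i", type=str, default=None)
parser.add_argument("--output_file", "-o", type=str, default=None)
parser.add_argument("--bucket", "-b", type=str, default=None)
parser.add_argument("--key", "-k", type=str, default=None)
parser.add_argument("--secret_key", "-sk", type=str, default=None)
parser.add_argument("--endpoint_url", "-ep", type=str, default=None)

# Parsing an example command line yields a namespace of options.
options = parser.parse_args([
    "--input_file=input.json",
    "--output_file=output.json",
    "--bucket=data",
])
print(options.bucket)  # data
```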
<p>Lastly, we'll add a __main__ block that will actually run the ETL
job:</p>
<div class="highlight"><pre><span></span><code><span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">get_argument_parser</span><span class="p">()</span>
<span class="k">with</span> <span class="n">bonobo</span><span class="o">.</span><span class="n">parse_args</span><span class="p">(</span><span class="n">parser</span><span class="p">)</span> <span class="k">as</span> <span class="n">options</span><span class="p">:</span>
<span class="n">bonobo</span><span class="o">.</span><span class="n">run</span><span class="p">(</span>
<span class="n">get_graph</span><span class="p">(</span><span class="o">**</span><span class="n">options</span><span class="p">),</span>
<span class="n">services</span><span class="o">=</span><span class="n">get_services</span><span class="p">(</span><span class="o">**</span><span class="n">options</span><span class="p">)</span>
<span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/etl-job-ml-model-deployment/blob/master/model_etl/s3_etl_job.py#L34-L40">here</a>.</p>
<p>The bonobo graph that actually runs the ETL does not change at all,
since we are only injecting a new service for accessing the files. This
shows the power of accessing outside resources from your code through
interfaces, since it makes it possible to run the application in many
different contexts without changing the application code itself. In this
case, the code that actually accesses the files that will be processed
is injected at runtime into the DAG.</p>
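<p>The idea of injecting the file-access service can be sketched in a few lines. This is a minimal stand-in, not bonobo's actual API: the transform code depends only on an abstract "fs" service, and the concrete implementation is chosen at runtime.</p>

```python
import io

def extract(fs):
    # The transform only knows the service interface: open(path) -> file.
    with fs.open("input.json") as f:
        return f.read()

class InMemoryFS:
    """A stand-in service; a real run might inject an S3-backed one."""
    def __init__(self, files):
        self.files = files
    def open(self, path):
        return io.StringIO(self.files[path])

# The service is injected at runtime; swapping it changes where the
# data comes from without changing extract() at all.
services = {"fs": InMemoryFS({"input.json": '{"a": 1}'})}
result = extract(services["fs"])
print(result)  # {"a": 1}
```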
<p>In order to test the loading and saving of files to S3, we can run a
drop-in replacement service locally. The <a href="https://min.io/">minio
project</a> replicates the S3 API, and also
provides a docker image. To run an instance of minio locally, I used
this command:</p>
<div class="highlight"><pre><span></span><code>docker run -p <span class="m">9000</span>:9000 --name minio -e <span class="s2">"MINIO_ACCESS_KEY=TEST"</span> -e <span class="s2">"MINIO_SECRET_KEY=ASDFGHJKL"</span> -v /Users/brian/Code/etl-job-ml-model-deployment:/data minio/minio server/data
</code></pre></div>
<p>The minio service instance is accessing the local filesystem to serve
files, and I pointed it at the root of the project. When minio is
running in this way, it makes the folders it finds in the local
filesystem available as buckets through its interface. We can see the
files hosted by the minio service by accessing the minio web UI:</p>
<p><img alt="Minio UI" src="https://www.tekhnoal.com/minio_ui.png" width="100%"></p>
<p>Now we can try out the new ETL job by executing this command:</p>
<div class="highlight"><pre><span></span><code><span class="nb">export</span> <span class="nv">PYTHONPATH</span><span class="o">=</span><span class="s2">"</span><span class="si">${</span><span class="nv">PYTHONPATH</span><span class="si">}</span><span class="s2">:./"</span>
python model_etl/s3_etl_job.py --input_file<span class="o">=</span>input.json --output_file<span class="o">=</span>output.json --bucket<span class="o">=</span>data --key<span class="o">=</span>TEST --secret_key<span class="o">=</span>ASDFGHJKL --endpoint_url<span class="o">=</span>http://127.0.0.1:9000/
</code></pre></div>
<p>The command above will run the new ETL, providing it with the
credentials it needs to access the S3 service. This section showed how
by injecting dependencies into the bonobo Graph, we can change the way
the ETL accesses data without having to change the code of the ETL
itself.</p>
<h1>Closing</h1>
<p>In this blog post, I showed how to deploy the iris model developed in a
previous blog post inside of an ETL application. By splitting the
deployment code and the model code into separate packages, I'm able to
reuse the model in many different types of deployments. By structuring
the codebases in this way, I'm able to keep the machine learning code
separate from the deployment code very effectively.</p>
<p>In addition, by creating the MLModelTransformer class that works with
the bonobo package, we can leverage all of the tools that bonobo has for
building ETL applications. For example, the bonobo package provides
functionality to load data from CSV files, JSON files, and databases.
Bonobo also makes it easy to extend its capabilities with custom code
through its highly modular object-oriented design. It also enforces good
coding practices by supporting service dependency injection and
parametrization.</p>
<p>One downside of this example is that this ETL is not meant to handle
large-scale data processing, since it can only run on a single computer.
A better way to do data processing over data sets that don't fit in the
memory of a single computer is to use Apache Spark. Another drawback of
the Bonobo package is that it does not support joins and aggregations
over the data, since it only allows each incoming record to be processed
individually.</p>
<p>Even though the ETL application is able to make predictions with the
MLModelTransformer class, it is very common for business logic to also
be needed in a real-world deployment of an ML model. For example, we
might want to prevent the model from making a prediction in certain
locales or jurisdictions for legal reasons. For the sake of simplicity,
I didn't include any business logic in the DAG we defined. The business
logic should not be packaged inside of the MLModel class. We can keep it
separate by creating a separate transformer that implements the business
logic and putting it in the DAG. This way, we can apply the business
logic without mixing it with the machine learning code in the MLModel
class.</p>
<p>Another common situation in a real-world deployment of an ML model is
the need to keep track of the predictions made by the model outside of
the results that are provided to the clients of the system. This is a
special log that the model generates as it is operating. Some of the
contents of the prediction log would be: the inputs used to make a
prediction, internal data that the model generated as it was making a
prediction, and the output sent back to the client system. This is a
more advanced requirement of an ML model deployment that I may expand on
in another blog post.</p>Using the ML Model Base Class2019-07-28T09:12:00-05:002019-07-28T09:12:00-05:00Brian Schmidttag:www.tekhnoal.com,2019-07-28:/using-ml-model-abc.html<p>In previous blog posts I showed how to build a simple base class for abstracting machine learning models and how to create a python package that makes use of the base class. In this blog post I aim to use the ideas from the previous blog posts to build a simple application that uses the MLModel base class to deploy a model.</p><p>This blog post continues the ideas started in two
<a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">previous</a>
<a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">blog posts</a>.</p>
<p>The code in this blog post can be found in this <a href="https://github.com/schmidtbri/using-ml-model-abc">github repo</a>.</p>
<h1>Introduction</h1>
<p>In previous blog posts I showed how to build a simple base class for
abstracting machine learning models and how to create a python package
that makes use of the base class. In this blog post I aim to use the
ideas from the previous blog posts to build a simple application that
uses the MLModel base class to deploy a model. I will be using the
iris_model package built in <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">this blog
post</a>.</p>
<p>When creating software, interacting with a component through an
abstraction makes the code easier to understand and evolve. In the
vocabulary of <a href="https://en.wikipedia.org/wiki/Software_design_pattern">software design
patterns</a>,
this is called the <a href="https://en.wikipedia.org/wiki/Strategy_pattern">strategy
pattern</a>.
When using the strategy pattern, the implementation details of a
software component (the "strategy") are not decided up front; they are
deferred until later. Instead, the interface between the component and
the code that uses it is designed and put into code first. The code
that uses the component is then written against this abstract
interface, trusting that any implementation will match the agreed-on
contract. Concrete implementations of the strategy can then be written
as needed. This approach makes it easy to switch between
implementations, and even to choose which implementation to use at
runtime, which makes the software more flexible.</p>
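<p>The strategy pattern described above can be sketched in a few lines (class names here are illustrative, not from the repository):</p>

```python
from abc import ABC, abstractmethod

class Strategy(ABC):
    """The agreed-on interface; implementations are decided later."""
    @abstractmethod
    def execute(self, value):
        ...

class DoubleStrategy(Strategy):
    def execute(self, value):
        return value * 2

class SquareStrategy(Strategy):
    def execute(self, value):
        return value ** 2

def run(strategy, value):
    # Client code is written against the abstraction, so the concrete
    # implementation can be swapped at runtime.
    return strategy.execute(value)

print(run(DoubleStrategy(), 3))  # 6
print(run(SquareStrategy(), 3))  # 9
```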
<p>By interacting with machine learning models through the MLModel
abstraction, it becomes possible to build applications that can host any
model that implements the MLModel interface. This way, simple model
deployments become much faster since a custom-made application is not
needed to put a model into production. The application I will show in
this blog post takes advantage of this fact to allow a software engineer
to install and deploy any number of models that implement the MLModel
base class inside a web application.</p>
<p>Overall, I aim to show how to deploy the model code in the iris_model
package into a simple web application. I also want to show how the
MLModel abstraction makes the use of machine learning models much easier
in production software.</p>
<h1>Flask Web Application</h1>
<p>One of the simplest ways to build a web application with python is to
use the <a href="https://www.fullstackpython.com/flask.html">Flask framework</a>.
Flask makes it easy to set up a simple web application that serves web
pages and a RESTful interface.</p>
<p>To begin, I set up the project structure for the application package:</p>
<div class="highlight"><pre><span></span><code><span class="o">-</span> <span class="nv">model_service</span>
<span class="o">-</span> <span class="nv">static</span> <span class="ss">(</span> <span class="nv">folder</span> <span class="nv">containing</span> <span class="nv">the</span> <span class="nv">static</span> <span class="nv">web</span> <span class="nv">assets</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">templates</span> <span class="ss">(</span> <span class="nv">folder</span> <span class="k">for</span> <span class="nv">the</span> <span class="nv">html</span> <span class="nv">templates</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">__init__</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">config</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">endpoints</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">model_manager</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">schemas</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">views</span>.<span class="nv">py</span>
<span class="o">-</span> <span class="nv">scripts</span> <span class="ss">(</span> <span class="nv">folder</span> <span class="nv">containing</span> <span class="nv">scripts</span> <span class="ss">)</span>
<span class="o">-</span> <span class="nv">tests</span> <span class="ss">(</span> <span class="nv">folder</span> <span class="nv">containing</span> <span class="nv">the</span> <span class="nv">unit</span> <span class="nv">test</span> <span class="nv">suite</span><span class="ss">)</span>
<span class="o">-</span> <span class="nv">requirements</span>.<span class="nv">txt</span>
<span class="o">-</span> <span class="nv">test_requirements</span>.<span class="nv">txt</span>
</code></pre></div>
<p>This structure can be seen
<a href="https://github.com/schmidtbri/using-ml-model-abc">here</a>
in the github repository.</p>
<p>The Flask application is set up with this code in the __init__.py
file:</p>
<div class="highlight"><pre><span></span><code><span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"APP_SETTINGS"</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">app</span><span class="o">.</span><span class="n">config</span><span class="o">.</span><span class="n">from_object</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">'APP_SETTINGS'</span><span class="p">])</span>
<span class="n">bootstrap</span> <span class="o">=</span> <span class="n">Bootstrap</span><span class="p">(</span><span class="n">app</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/__init__.py#L8-L12">here</a>.</p>
<p>The Flask application is initialized by instantiating the Flask()
class. The configuration is imported from the configuration classes
found in the config.py file; there is one configuration class per
environment. The name of the configuration to use is read from the
"APP_SETTINGS" environment variable, which makes it easy to change the
configuration of the app at runtime.</p>
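<p>The per-environment configuration classes follow a pattern like the sketch below. The class names and settings here are hypothetical, not the repository's actual config; Flask's <code>app.config.from_object()</code> reads the uppercase attributes of whichever class is named.</p>

```python
import os

# One base class with defaults, and one subclass per environment.
class Config:
    DEBUG = False
    MODELS = []

class DevelopmentConfig(Config):
    DEBUG = True

class ProductionConfig(Config):
    DEBUG = False

# The APP_SETTINGS environment variable selects the class at runtime,
# e.g. APP_SETTINGS="config.ProductionConfig" in a real deployment.
os.environ["APP_SETTINGS"] = "DevelopmentConfig"
selected = globals()[os.environ["APP_SETTINGS"]]
print(selected.DEBUG)  # True
```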
<p>The configuration classes can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/config.py">here</a>.
More information about this pattern for managing and importing
configuration details in Flask applications can be found
<a href="https://flask.palletsprojects.com/en/1.1.x/config/#configuring-from-environment-variables">here</a>.
Lastly, I am using the
<a href="https://pythonhosted.org/Flask-Bootstrap/basic-usage.html">flask_bootstrap</a>
package for adding bootstrap elements to the web pages; this package is
initialized after loading the configuration.</p>
<p>So far, this is a simple Flask application that is not able to manage or
serve machine learning models, in the next section we will start to add
the functionality needed to do this.</p>
<h1>Model Manager Class</h1>
<p>In order to use the iris_model class within the Flask application we
are building, we need to have a way to manage the model object within
the Python process. To do this we will create a ModelManager class that
follows the <a href="https://en.wikipedia.org/wiki/Singleton_pattern">singleton
pattern</a>.
The ModelManager class will be instantiated one time at application
startup. The ModelManager singleton instantiates MLModel classes from
configuration, and returns information about the model objects being
managed as well as references to the model objects.</p>
<p>Let's get started, here is the class declaration:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">ModelManager</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="n">_models</span> <span class="o">=</span> <span class="p">[]</span>
</code></pre></div>
<p>The code above can be found <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/model_manager.py#L6-L8">here</a>.</p>
<p>The ModelManager class has a private list property called _models that
will contain the references to the model objects that are under
management.</p>
<p>Now we need a way to actually instantiate the model classes, the code to
do this is below:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@classmethod</span>
<span class="k">def</span> <span class="nf">load_models</span><span class="p">(</span><span class="bp">cls</span><span class="p">,</span> <span class="n">configuration</span><span class="p">):</span>
<span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">configuration</span><span class="p">:</span>
<span class="n">model_module</span> <span class="o">=</span> <span class="n">importlib</span><span class="o">.</span><span class="n">import_module</span><span class="p">(</span><span class="n">c</span><span class="p">[</span><span class="s2">"module_name"</span><span class="p">])</span>
<span class="n">model_class</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">model_module</span><span class="p">,</span> <span class="n">c</span><span class="p">[</span><span class="s2">"class_name"</span><span class="p">])</span>
<span class="n">model_object</span> <span class="o">=</span> <span class="n">model_class</span><span class="p">()</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">model_object</span><span class="p">,</span> <span class="n">MLModel</span><span class="p">)</span> <span class="ow">is</span> <span class="kc">False</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">"The ModelManager can only hold references to objects of type MLModel."</span><span class="p">)</span>
<span class="bp">cls</span><span class="o">.</span><span class="n">_models</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">model_object</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/model_manager.py#L10-L21">here</a>.</p>
<p>The load_models() class method receives a configuration dictionary
object and iterates through it, importing the classes from the
environment, instantiating the classes, and saving the references to the
objects in the _models class property. The method also checks that the
classes being imported and instantiated are instances of the MLModel
base class. The ModelManager singleton object is able to hold any number
of model objects.</p>
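<p>The dynamic import at the heart of load_models() can be tried on its own with a standard-library class (collections.OrderedDict stands in for a model class here purely for illustration):</p>

```python
import importlib

configuration = [
    {"module_name": "collections", "class_name": "OrderedDict"},
]

instances = []
for c in configuration:
    # Import the module by name, look up the class on it, and
    # instantiate it, just as load_models() does for MLModel subclasses.
    module = importlib.import_module(c["module_name"])
    cls = getattr(module, c["class_name"])
    instances.append(cls())

print(type(instances[0]).__name__)  # OrderedDict
```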
<p>The ModelManager class also provides three other methods that help to
use the models that it manages. The <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/model_manager.py#L23-L33">get_models()
method</a>
returns a list of dictionaries with information about the model object.
The
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/model_manager.py#L35-L52">get_model_metadata()</a>
method returns detailed data about a single model object, identified
with the qualified_name property of the model object. The metadata
returned by this method contains the input and output schemas of the
model encoded as JSON schema dictionaries. Lastly, the
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/model_manager.py#L54-L63">get_model()</a>
method searches the models in the _models list and returns a reference
to one model object. When searching through the list of model objects in
the _models class property, the qualified name of the model is used to
identify the model.</p>
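<p>The qualified-name lookup that get_model() performs amounts to a search like the following sketch (the model records here are hypothetical):</p>

```python
models = [
    {"qualified_name": "iris_model", "display_name": "Iris Model"},
    {"qualified_name": "other_model", "display_name": "Other Model"},
]

def get_model(qualified_name):
    # Return the first model whose qualified name matches, else None.
    return next(
        (m for m in models if m["qualified_name"] == qualified_name),
        None,
    )

print(get_model("iris_model")["display_name"])  # Iris Model
```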
<p>With the ModelManager class, we can now test it out with the iris_model
package from <a href="https://www.tekhnoal.com/improving-the-mlmodel-base-class.html">the previous blog
post</a>.
To do this we need to install the package from github by executing this
command:</p>
<div class="highlight"><pre><span></span><code>pip install git+https://github.com/schmidtbri/ml-model-abc-improvements
</code></pre></div>
<p>Once we have the iris_model package installed in the environment, we
can use a python interactive session to execute this code to try out the
ModelManager class:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">model_service.model_manager</span> <span class="kn">import</span> <span class="n">ModelManager</span>
<span class="o">>>></span> <span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="p">[</span>
<span class="o">...</span> <span class="p">{</span>
<span class="o">...</span> <span class="s2">"module_name"</span><span class="p">:</span> <span class="s2">"iris_model.iris_predict"</span><span class="p">,</span>
<span class="o">...</span> <span class="s2">"class_name"</span><span class="p">:</span> <span class="s2">"IrisModel"</span>
<span class="o">...</span> <span class="p">}</span>
<span class="o">...</span><span class="p">])</span>
<span class="o">>>></span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="p">[{</span><span class="s1">'display_name'</span><span class="p">:</span> <span class="s1">'Iris Model'</span><span class="p">,</span> <span class="s1">'qualified_name'</span><span class="p">:</span> <span class="s1">'iris_model'</span><span class="p">,</span> <span class="s1">'description'</span><span class="p">:</span> <span class="s1">'A machine learning model for predicting the species of a flower based on its measurements.'</span><span class="p">,</span> <span class="s1">'major_version'</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">'minor_version'</span><span class="p">:</span> <span class="mi">1</span><span class="p">}]</span>
</code></pre></div>
<p>The ModelManager class is being used to load the IrisModel class which
is found in the iris_model package within the iris_predict module;
the information needed to find the class is held within the
configuration. Once the model object is instantiated, the get_models()
method is called to get data about the models in memory.</p>
<p>In order to use the ModelManager class within the Flask application we
have to instantiate it and call the load_models() method. Since the model
classes will load their parameters from disk when they are instantiated,
it's important that we only do this one time at application startup. We
can do that by adding this code to the __init__.py module:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">before_first_request</span>
<span class="k">def</span> <span class="nf">instantiate_model_manager</span><span class="p">():</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_manager</span><span class="o">.</span><span class="n">load_models</span><span class="p">(</span><span class="n">configuration</span><span class="o">=</span><span class="n">app</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s2">"MODELS"</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/__init__.py#L18-L22">here</a>.</p>
<p>The @app.before_first_request decorator on the function causes it to
be executed before requests can be handled by the application. The model
manager configuration is loaded from the Flask application configuration
found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/config.py#L5-L10">here</a>.</p>
<p>The ModelManager class handles the complexities of instantiating and
managing model objects in memory. As long as a MLModel-derived class can
be found in the python environment, then it can be loaded and managed by
the ModelManager class.</p>
<h1>Flask REST Endpoints</h1>
<p>To make use of the models hosted in the ModelManager object, we will
first build a simple REST interface that will allow clients to find and
make predictions. To define the data models that are returned by the
REST interface we make use of the <a href="https://marshmallow.readthedocs.io/en/3.0/quickstart.html">marshmallow schema
package</a>.
Although it's not strictly necessary to use it to build a web app, the
marshmallow package provides a simple and quick way to build schemas and
do serialization and deserialization.</p>
<p>The Flask application has three endpoints: a models endpoint for getting
information about all models hosted by the app, a metadata endpoint for
getting information about a specific model, and a predict endpoint for
making predictions with a specific model.</p>
<p>The models endpoint is created by registering a function with the Flask
application:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/api/models"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s1">'GET'</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">get_models</span><span class="p">():</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">models</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="n">response_data</span> <span class="o">=</span> <span class="n">model_collection_schema</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span><span class="n">models</span><span class="o">=</span><span class="n">models</span><span class="p">))</span><span class="o">.</span><span class="n">data</span>
<span class="k">return</span> <span class="n">response_data</span><span class="p">,</span> <span class="mi">200</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/endpoints.py#L14-L32">here</a>.</p>
<p>The function uses the ModelManager class to access data about all models
hosted within it. It uses the get_models() method in the same way that
the index view does. The response_data is serialized
using a marshmallow schema object which is instantiated from the schema
class defined
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/schemas.py#L4-L15">here</a>.</p>
<p>The metadata endpoint is built similarly to the models endpoint. The
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/endpoints.py#L35-L67">metadata endpoint function</a>
uses the ModelManager class to access information about the models. In
the same way as the models endpoint, the metadata endpoint also defines
a set of <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/schemas.py#L18-L37">schema
classes</a>
for serialization.</p>
<p>The <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/endpoints.py#L70-L130">predict endpoint</a>
functions differently from the previous endpoints since it does not
define a schema class for the input and output data that it expects. If
a client wants to know what fields it needs to send to a model to make a
prediction, it can find a description of the fields in the JSON schema
published by the metadata endpoint. If a new version of a model with new
input or output schemas is installed into the Flask application, the
code of the Flask app would not have to change at all to accommodate the
new model.</p>
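<p>A sketch of the idea: instead of a hard-coded schema, the endpoint can check incoming data against whatever JSON schema the model publishes. The minimal checker below only verifies required fields (a real endpoint would use a full JSON Schema validator), and the schema shown is a hypothetical example of what a model might publish.</p>

```python
# Hypothetical input schema, as a model might publish it in its metadata.
input_schema = {
    "type": "object",
    "required": [
        "sepal_length", "sepal_width", "petal_length", "petal_width",
    ],
}

def check_required(data, schema):
    # Report any required fields missing from the request body; the
    # endpoint code never needs to know the field names in advance.
    return [f for f in schema.get("required", []) if f not in data]

missing = check_required(
    {"sepal_length": 5.1, "sepal_width": 3.5}, input_schema
)
print(missing)  # ['petal_length', 'petal_width']
```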
<h1>Flask Views</h1>
<p>The Flask framework is also able to render web pages using Jinja
templates, a great guide for learning about this can be found
<a href="https://code.tutsplus.com/tutorials/templating-with-jinja2-in-flask-essentials--cms-25571">here</a>.
To add webpages rendered with Jinja templates to the web application I
added the <a href="https://github.com/schmidtbri/using-ml-model-abc/tree/master/model_service/templates">templates
folder</a>
to the application package. In it I created the <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/base.html">base html
template</a>,
from which other templates inherit. The base template uses styles from
the bootstrap package. To render the templates into views I also added
the <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/views.py">views.py
module</a>.</p>
<p>In order to show some information about the models that are in the
ModelManager object, I added the <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/index.html">index.html
template</a>.
To render the template, I added this code to the views.py module:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s1">'/'</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s1">'GET'</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">index</span><span class="p">():</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">models</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_models</span><span class="p">()</span>
<span class="k">return</span> <span class="n">render_template</span><span class="p">(</span><span class="s1">'index.html'</span><span class="p">,</span> <span class="n">models</span><span class="o">=</span><span class="n">models</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/views.py#L8-L17">here</a>.</p>
<p>The index view function is registered at the Flask application's
root URL, so it serves as the homepage. The ModelManager
is then instantiated, but since it is a singleton that was first
instantiated at application startup, the reference to the singleton
object is returned with all of the model objects already loaded. Next,
we use the singleton's get_models() method to get a list of models
available. Lastly, we send the list of models returned to the template
for rendering, and return the resulting webpage to the user. This view
also renders links to a model's metadata and prediction views. These
views are presented below. The index webpage looks like this:</p>
<p><img alt="Index View" src="https://www.tekhnoal.com/index_view.png" width="100%"></p>
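<p>The singleton behavior described above can be sketched as follows. This is a minimal, hypothetical illustration of the pattern, not the actual ModelManager implementation:</p>

```python
class ModelManager:
    """Minimal sketch of a singleton that holds loaded model objects."""
    _instance = None

    def __new__(cls):
        # Return the already-created instance on every call after the first,
        # so models loaded at application startup stay available in views.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = []
        return cls._instance

    def load_model(self, model):
        self._models.append(model)

    def get_models(self):
        return list(self._models)
```

<p>Because every call to ModelManager() returns the same object, a view function can instantiate it cheaply and still see the models that were loaded at startup.</p>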
<p>A similar approach is followed for the metadata view, which displays an
individual model's metadata as well as the input and output schemas. The
template for this view is
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/metadata.html">here</a>,
and the view function is
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/views.py#L20-L29">here</a>.
One difference between this view and the index view is that it accepts a
path parameter that determines which model's metadata is rendered in the
view. The metadata webpage looks like this:</p>
<p><img alt="Metadata View" src="https://www.tekhnoal.com/metadata_view.png" width="100%"></p>
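<p>As an illustration of a view that takes a path parameter, here is a self-contained sketch. The route and the in-memory metadata store are hypothetical stand-ins for the real views.py code and the ModelManager singleton:</p>

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical stand-in for the metadata held by the ModelManager singleton.
MODEL_METADATA = {
    "iris_model": {"display_name": "Iris Model", "version": "0.1.0"}
}

@app.route("/models/<qualified_name>/metadata", methods=["GET"])
def display_metadata(qualified_name):
    # Flask passes the path segment in as the qualified_name argument,
    # which determines which model's metadata is returned.
    metadata = MODEL_METADATA.get(qualified_name)
    if metadata is None:
        return jsonify(error="model not found"), 404
    return jsonify(metadata)
```

<p>The real view renders an HTML template instead of returning JSON, but the path-parameter mechanics are the same.</p>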
<h1>Dynamic Web Form</h1>
<p>The last webpage of the application uses both a view that renders the
page and the predict REST endpoint. The prediction web page for a model
renders a dynamic form from the input JSON schema provided by the model,
accepts user input and sends it to the prediction REST endpoint when
the user presses the "Predict" button, and lastly displays the
prediction results from the model.</p>
<p>The prediction web page is rendered like the other views:</p>
<div class="highlight"><pre><span></span><code><span class="nd">@app</span><span class="o">.</span><span class="n">route</span><span class="p">(</span><span class="s2">"/models/<qualified_name>/predict"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s1">'GET'</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">display_form</span><span class="p">(</span><span class="n">qualified_name</span><span class="p">):</span>
<span class="n">model_manager</span> <span class="o">=</span> <span class="n">ModelManager</span><span class="p">()</span>
<span class="n">model_metadata</span> <span class="o">=</span> <span class="n">model_manager</span><span class="o">.</span><span class="n">get_model_metadata</span><span class="p">(</span><span class="n">qualified_name</span><span class="o">=</span><span class="n">qualified_name</span><span class="p">)</span>
<span class="k">return</span> <span class="n">render_template</span><span class="p">(</span><span class="s1">'predict.html'</span><span class="p">,</span> <span class="n">model_metadata</span><span class="o">=</span><span class="n">model_metadata</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/views.py#L32-L38">here</a>.</p>
<p>The template, however, is different because it uses JQuery to get the
input schema of the model from the metadata endpoint:</p>
<div class="highlight"><pre><span></span><code><span class="x">$(document).ready(function() {</span>
<span class="x">$.ajax({</span>
<span class="x"> url: '/api/models/</span><span class="cp">{{</span><span class="nv">model_metadata.qualified_name</span><span class="cp">}}</span><span class="x">/metadata',</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/predict.html#L14-L16">here</a>.</p>
<p>If the request returns successfully then we use the <a href="https://github.com/brutusin/json-forms">brutusin forms
package</a> to
render a form from the model's input JSON schema. The webform created
from the JSON schema is dynamic, which allows a custom form to be
created for any model that is hosted by the application. Below is the
code to render the form:</p>
<div class="highlight"><pre><span></span><code><span class="n">success</span><span class="o">:</span><span class="w"> </span><span class="kd">function</span><span class="o">(</span><span class="n">data</span><span class="o">)</span><span class="w"> </span><span class="o">{</span><span class="w"></span>
<span class="w"> </span><span class="n">var</span><span class="w"> </span><span class="n">container</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">document</span><span class="o">.</span><span class="na">getElementById</span><span class="o">(</span><span class="s1">'prediction_form'</span><span class="o">);</span><span class="w"></span>
<span class="w"> </span><span class="n">var</span><span class="w"> </span><span class="n">BrutusinForms</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">brutusin</span><span class="o">[</span><span class="s2">"json-forms"</span><span class="o">];</span><span class="w"></span>
<span class="w"> </span><span class="n">bf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">BrutusinForms</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">data</span><span class="o">.</span><span class="na">input_schema</span><span class="o">);</span><span class="w"></span>
<span class="w"> </span><span class="n">bf</span><span class="o">.</span><span class="na">render</span><span class="o">(</span><span class="n">container</span><span class="o">);</span><span class="w"></span>
<span class="o">}</span><span class="w"></span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/predict.html#L17-L22">here</a>.</p>
<p>Lastly, there is a <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/predict.html#L29-L36">JQuery
request</a>
to make the prediction when the user presses the "Predict" button, and a
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/templates/predict.html#L37-L45">callback
function</a>
that renders the prediction to the webpage.</p>
<p>Here is a screen shot of the prediction webpage:</p>
<p><img alt="Predict View" src="https://www.tekhnoal.com/predict_view.png" width="100%"></p>
<h1>Documentation</h1>
<p>To make the REST API easier to use we will produce documentation for it.
A common way to document RESTful interfaces is the <a href="https://swagger.io/docs/specification/about/">OpenAPI
specification</a>.
In order to automatically create an OpenAPI document for the RESTful API
that the model service provides, I used the python <a href="https://github.com/marshmallow-code/apispec">apispec
package</a>.
The apispec package is able to automatically extract schema information
from marshmallow Schema classes, and is able to extract endpoint
specifications from Flask @app.route decorated functions.</p>
<p>To be able to automatically extract the OpenAPI specification document
from the code, I created a python script called
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/scripts/openapi.py">openapi.py</a>.
The script creates an object to describe the document:</p>
<div class="highlight"><pre><span></span><code><span class="n">spec</span> <span class="o">=</span> <span class="n">APISpec</span><span class="p">(</span>
<span class="n">openapi_version</span><span class="o">=</span><span class="s2">"3.0.2"</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s1">'Model Service'</span><span class="p">,</span>
<span class="n">version</span><span class="o">=</span><span class="s1">'0.1.0'</span><span class="p">,</span>
<span class="n">info</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="vm">__doc__</span><span class="p">),</span>
<span class="n">plugins</span><span class="o">=</span><span class="p">[</span><span class="n">FlaskPlugin</span><span class="p">(),</span> <span class="n">MarshmallowPlugin</span><span class="p">()],</span>
<span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/scripts/openapi.py#L9-L15">here</a>.</p>
<p>Then we can add the marshmallow schema classes, which are imported from
the <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/schemas.py">schemas.py
module</a>:</p>
<div class="highlight"><pre><span></span><code><span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"ModelSchema"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">ModelSchema</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"ModelCollectionSchema"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">ModelCollectionSchema</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"JsonSchemaProperty"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">JsonSchemaProperty</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"JSONSchema"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">JSONSchema</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"ModelMetadataSchema"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">ModelMetadataSchema</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">components</span><span class="o">.</span><span class="n">schema</span><span class="p">(</span><span class="s2">"ErrorSchema"</span><span class="p">,</span> <span class="n">schema</span><span class="o">=</span><span class="n">ErrorSchema</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/scripts/openapi.py#L17-L22">here</a>.</p>
<p>To document the paths of the API, the OpenAPI specification has to be
added to the docstrings of the controller functions that are registered
with the Flask application; an example of how to do this can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/model_service/endpoints.py#L16-L25">here</a>.
After this is done, we can add the paths to the OpenAPI document using
the code below:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="n">app</span><span class="o">.</span><span class="n">test_request_context</span><span class="p">():</span>
<span class="n">spec</span><span class="o">.</span><span class="n">path</span><span class="p">(</span><span class="n">view</span><span class="o">=</span><span class="n">get_models</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">path</span><span class="p">(</span><span class="n">view</span><span class="o">=</span><span class="n">get_metadata</span><span class="p">)</span>
<span class="n">spec</span><span class="o">.</span><span class="n">path</span><span class="p">(</span><span class="n">view</span><span class="o">=</span><span class="n">predict</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/scripts/openapi.py#L24-L27">here</a>.</p>
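<p>To show roughly what such a docstring looks like, here is a hedged sketch of a controller function annotated for apispec. The endpoint and response description are hypothetical; the actual docstrings are in the endpoints.py module linked above:</p>

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/models", methods=["GET"])
def get_models():
    """Get a list of hosted models.
    ---
    get:
      responses:
        200:
          description: List of model metadata objects.
    """
    # apispec reads everything after the '---' marker as OpenAPI YAML
    # when spec.path(view=get_models) is called.
    return jsonify([])
```
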
<p>Once all the components are loaded from the codebase, the OpenAPI
document can be saved to disk as a YAML file, using <a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/scripts/openapi.py#L29-L30">this
code</a>.
The resulting file can be found
<a href="https://github.com/schmidtbri/using-ml-model-abc/blob/master/openapi_specification.yaml">here</a>.
There is also an <a href="https://editor.swagger.io/">open source
viewer</a>
for OpenAPI documents, which can do automatic code generation and
render a webpage for viewing the document:</p>
<p><img alt="OpenAPI Documentation" src="https://www.tekhnoal.com/openapi_doc.png" width="100%"></p>
<h1>Conclusion</h1>
<p>In this blog post I showed how to create a web application that is able
to host any model that inherits from and follows the standards of the
MLModel base class. By using an abstraction to deal with machine
learning model code, it becomes possible to write an application that
can deploy any model, instead of building applications that can deploy
only one ML model.</p>
<p>A drawback of this blog post's approach is that the types of the fields
in objects given to and returned from the model object's predict() method
must be serializable to JSON and the schema package must be able to
create a JSON schema for them. This is not always easy to do with more
complicated data models. Since this is a web application, the use of
JSON schema makes a lot of sense, but there are situations in which a
JSON schema is not the best way to publish schema information.</p>
<p>A point I want to highlight is that I am purposefully maintaining
separate codebases for the model code and the application code. In this
approach, the model is a python package that is installed into the
application codebase. By separating the model code from the application
code, creating new versions of the model becomes simpler and more
straightforward. It also enables Data Scientists and engineers to
maintain separate codebases that better fit their needs, as well as
making it possible to deploy the same model package in multiple
applications and to deploy different versions of the same model.</p>Improving the MLModel Base Class2019-06-12T09:21:00-05:002019-06-12T09:21:00-05:00Brian Schmidttag:www.tekhnoal.com,2019-06-12:/improving-the-mlmodel-base-class.html<p>In the previous blog post in this series I showed an object oriented design for a base class that does Machine Learning model prediction. The design of the base class was intentionally very simple so that I could show a simple example of how to use the base class with a scikit-learn model. I showed an easy way to publish schema metadata about the model inputs and outputs, and how to write model deserialization code so that it is hidden from the users of the model. I also showed how to hide the implementation details of the model by translating the user's input to the model's input so that the user of the model doesn't have to know how to use pandas or numpy. In this blog post I will continue to make improvements to the MLModel class and the example that I used in the previous post.</p><p>This blog post continues with the ideas developed in the previous post
in this series.</p>
<p>All of the code shown in this post can be found in <a href="https://github.com/schmidtbri/ml-model-abc-improvements">this Github repository</a>.</p>
<p>In the previous blog post in this series I showed an object oriented
design for a base class that does Machine Learning model prediction. The
design of the base class was intentionally very simple so that I could
show a simple example of how to use the base class with a scikit-learn
model. I showed an easy way to publish schema metadata about the model
inputs and outputs, and how to write model deserialization code so that
it is hidden from the users of the model. I also showed how to hide the
implementation details of the model by translating the user's input to
the model's input so that the user of the model doesn't have to know how
to use pandas or numpy. In this blog post I will continue to make
improvements to the MLModel class and the example that I used in the
previous post.</p>
<p>In this blog post I will make the iris example code from the previous
post into a full python package with many features that will make the
iris model easier to install and use from other python packages. I will
also continue to improve the MLModel base class. In general, I want to
show how to make ML code easier to install and use.</p>
<p>When I was doing research for this blog post I found a great <a href="https://towardsdatascience.com/building-package-for-machine-learning-project-in-python-3fc16f541693">blog post</a>
by <a href="https://towardsdatascience.com/@mbednarski">Mateusz Bednarski</a>
showing how to build machine learning models as python packages. There
are some similarities between what I will show here and that blog post,
however, this post focuses more on the deployment of ML models into
production systems, whereas Mateusz's post focuses on packaging the
training code.</p>
<p>This blog post assumes that you have some experience with Python. I will
be referencing resources for learning the tools that I will be using in
the blog post.</p>
<h2>Making the Iris Model into a Python Package</h2>
<p>Another improvement that we can make to the example code is to make it
into a full-fledged Python package. This makes it easier to use and
install in other projects. The goal here is to treat ML models as just
another Python package, which makes it possible to leverage all of the
tools that Python has for packaging and reusing code. A good guide for
structuring python packages can be found
<a href="https://python-packaging.readthedocs.io/en/latest/#">here</a>.</p>
<p>A common pattern in ML code is that it is almost
always hard to use and deploy. This is something that teams that do
machine learning know very well, since the code written by a Data
Scientist almost always needs to be rewritten by a software engineer
before it is possible to deploy it into production systems. Luckily, we
have a lot of tools to make the transition from experimental model to
production model a smoother process. In this section I will show a few
simple steps that will make the example model from <a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">the last blog
post</a>
into an installable Python package. To accomplish this, we will add
version information to the package, add a command line interface to the
training script, add Sphinx documentation, and add a setup.py file to
the project. As an additional touch, we will automate the documentation
process for the interface of the ML model.</p>
<p>First of all we need to reorganize the code in the project a little bit:</p>
<div class="highlight"><pre><span></span><code>- project_root
    - docs (a folder, package documentation goes in here)
    - iris_model (a folder, iris package code goes in here)
        - model_files (a folder, the model files go in here)
        - __init__.py (this file is for the python package)
        - iris_predict.py (the prediction code goes here)
        - iris_train.py (the training script goes here)
    - tests (a folder, unit tests for the iris_model package go here)
    - ml_model_abc.py (the MLModel base class goes here)
    - requirements.txt
    - setup.py (the package installation script goes here)
</code></pre></div>
<p>A lot of this code is shared with the <a href="https://www.tekhnoal.com/a-simple-ml-model-base-class.html">previous blog
post</a>,
but it is reorganized here to make it possible to have an ML model that
can be installed as a Python package.</p>
<h2>Adding Package Versioning</h2>
<p>Python packages are usually versioned using <a href="https://semver.org/">semantic
versioning</a>. Software packages that
use semantic versioning must declare a public API. This is complicated
when we want to do versioning of ML models because we have two APIs: the
API for making model predictions and the API for training the model. We
can deal with this complexity by tying the different components of the
semantic version of the package to the prediction API and the training
API of the package.</p>
<p>I chose to version the prediction API of the model using the major and
minor version components of the semantic versioning standard. The
reasoning for this is that a lot of users are affected by changes in the
prediction API, but not as many users are affected by changes in the
training API. This is because ML models are usually used by many people
but trained by a few experts. The patch number of the version can be
used to version changes to the training API.</p>
<p>As an example, whenever the ML model prediction API changes in a
backward-incompatible way the major version number will go up, and
whenever it changes in a backwards-compatible way the minor version will
go up. This approach ensures that any user of the ML model package will
know how changes in the prediction API will affect them when they
install the package. A simple way to understand when to increase the
major or minor version numbers is to do so when the input and output
schemas of the model change. Lastly, any changes to the model training
API will cause the patch version number to go up.</p>
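<p>The versioning scheme described above can be summarized in a small helper function. This is a hypothetical illustration of the rules, not code from the package:</p>

```python
def bump_version(version, change):
    """Bump a semantic version according to the scheme described above:
    prediction API changes drive the major and minor numbers, training
    API changes drive the patch number."""
    major, minor, patch = (int(n) for n in version.split("."))
    if change == "prediction-breaking":      # backward-incompatible prediction API change
        return "{}.0.0".format(major + 1)
    elif change == "prediction-compatible":  # backward-compatible prediction API change
        return "{}.{}.0".format(major, minor + 1)
    elif change == "training":               # any training API change
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("unknown change type: {}".format(change))
```
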
<p>A <a href="https://packaging.python.org/guides/single-sourcing-package-version/">common approach</a>
for storing version information in a python package is to put a
"__version__" property into the __init__.py module in the root
of the package:</p>
<div class="highlight"><pre><span></span><code><span class="n">__version_info__</span> <span class="o">=</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">__version__</span> <span class="o">=</span> <span class="s2">"."</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="nb">str</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="n">__version_info__</span><span class="p">])</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/__init__.py#L1-L2">here</a>.</p>
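<p>One way a setup.py can pick up this single-sourced version without importing the package is to parse the __version_info__ tuple out of the __init__.py source text. This is a sketch of the approach; the repository's setup.py may do it differently:</p>

```python
import re

def read_version(init_source):
    """Recover the version string from __init__.py source text that
    defines __version_info__ = (major, minor, patch)."""
    match = re.search(r"__version_info__\s*=\s*\(([^)]+)\)", init_source)
    if match is None:
        raise RuntimeError("__version_info__ not found")
    parts = [part.strip() for part in match.group(1).split(",")]
    return ".".join(parts)
```
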
<p>I like to think of an ML model as a software component like any other,
the only difference being that an ML model is statistically significant.
Of course, being statistically significant adds a lot of complexity, but
at the end of the day ML models are just code that can be managed just
like any other piece of code. In this section we can see how to take a
step in that direction by attaching version information to the IrisModel
package.</p>
<p>Although semantic versioning is not designed to be used for versioning
models, we can apply it here to version model code and gloss over the
more complicated aspects of ML models. For example, we can't use
semantic versioning to version model parameters since they are not part
of the codebase and don't have an API. This is a problem that I will
tackle in another blog post.</p>
<h2>Adding a CLI interface to the Training Script</h2>
<p>When building ML models, the training code is often written in Jupyter
notebooks. While there are ways to automate the training process with
notebooks, it's a lot easier to do it through the command line. To do
this we will add a simple command line interface to the Iris model
training script. We will create the interface using the
<a href="https://docs.python.org/3/library/argparse.html">argparse</a>
package and then create a function that calls the train() function when
the iris_train.py script is called from the command line.</p>
<p>To create the argparse ArgumentParser object, we create a dedicated
function (the reason for this will be explained below):</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">argument_parser</span><span class="p">():</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">argparse</span><span class="o">.</span><span class="n">ArgumentParser</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="s1">'Command to train the Iris model.'</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-gamma'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s2">"store"</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="s2">"gamma"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s1">'Gamma value used to train the SVM model.'</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">'-c'</span><span class="p">,</span> <span class="n">action</span><span class="o">=</span><span class="s2">"store"</span><span class="p">,</span> <span class="n">dest</span><span class="o">=</span><span class="s2">"c"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="nb">float</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s1">'C value used to train the SVM model.'</span><span class="p">)</span>
<span class="k">return</span> <span class="n">parser</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_train.py#L39-L50">here</a>.</p>
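<p>To see how the parser behaves, here is a self-contained sketch that reproduces the parser and feeds it an argument list directly. In the real script, parse_args() is called with no arguments so that it reads sys.argv:</p>

```python
import argparse

def argument_parser():
    # Same shape as the parser in iris_train.py: two optional SVM hyperparameters.
    parser = argparse.ArgumentParser(description="Command to train the Iris model.")
    parser.add_argument("-gamma", action="store", dest="gamma", type=float,
                        help="Gamma value used to train the SVM model.")
    parser.add_argument("-c", action="store", dest="c", type=float,
                        help="C value used to train the SVM model.")
    return parser

# Parse an explicit argument list instead of sys.argv for illustration.
args = argument_parser().parse_args(["-gamma", "0.001", "-c", "1.0"])
```

<p>Arguments that are not supplied come back as None, which is what the main() function below checks for when deciding how to call train().</p>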
<p>To call the train() function from the command line, I created a new
function called main(). The function gets a parser object, parses the
incoming parameters, and calls the train() function:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
<span class="n">parser</span> <span class="o">=</span> <span class="n">argument_parser</span><span class="p">()</span>
<span class="n">results</span> <span class="o">=</span> <span class="n">parser</span><span class="o">.</span><span class="n">parse_args</span><span class="p">()</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">if</span> <span class="n">results</span><span class="o">.</span><span class="n">gamma</span> <span class="ow">is</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">results</span><span class="o">.</span><span class="n">c</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">train</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">results</span><span class="o">.</span><span class="n">gamma</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">results</span><span class="o">.</span><span class="n">c</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">train</span><span class="p">(</span><span class="n">gamma</span><span class="o">=</span><span class="n">results</span><span class="o">.</span><span class="n">gamma</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">results</span><span class="o">.</span><span class="n">gamma</span> <span class="ow">is</span> <span class="kc">None</span> <span class="ow">and</span> <span class="n">results</span><span class="o">.</span><span class="n">c</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">train</span><span class="p">(</span><span class="n">c</span><span class="o">=</span><span class="n">results</span><span class="o">.</span><span class="n">c</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">train</span><span class="p">(</span><span class="n">gamma</span><span class="o">=</span><span class="n">results</span><span class="o">.</span><span class="n">gamma</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="n">results</span><span class="o">.</span><span class="n">c</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="n">traceback</span><span class="o">.</span><span class="n">print_exc</span><span class="p">()</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">EX_SOFTWARE</span><span class="p">)</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">EX_OK</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_train.py#L53-L75">here</a>.</p>
<p>The reason for adding the main() function to wrap the train() function
is so that the main() function can be registered as an entry point when
the iris_model package is installed. The main() function also handles
parsing the command line arguments, calls the train() function, handles
exceptions and returns the success or error code to the operating system
when the training process is done. Another benefit of this approach is
that the train() function can still be imported into other code and called
as a function, while now also having a CLI interface.</p>
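<p>The four-way dispatch in main() can also be written more compactly by passing only the arguments that were actually supplied, letting train() fall back to its own defaults. Here is a minimal, self-contained sketch with a stand-in train() function (the real one trains the SVM model):</p>

```python
import argparse


def train(gamma=1.0, c=1.0):
    # stand-in for the real training function, which trains the SVM model
    return {"gamma": gamma, "c": c}


def argument_parser():
    parser = argparse.ArgumentParser(prog="iris_train")
    parser.add_argument('-gamma', action="store", dest="gamma", type=float)
    parser.add_argument('-c', action="store", dest="c", type=float)
    return parser


def main(argv=None):
    results = argument_parser().parse_args(argv)
    # keep only the arguments the user actually provided, so that
    # train()'s own defaults apply for anything left unset
    kwargs = {k: v for k, v in vars(results).items() if v is not None}
    return train(**kwargs)
```

<p>Calling main(["-gamma", "0.01"]) then trains with the supplied gamma and the default c, without spelling out each combination in an if/elif chain.</p>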
<h2>Adding Sphinx Documentation</h2>
<p>One of the great parts of working in the Python ecosystem is the Sphinx
package, which is used for creating documentation from source files.
There are a lot of
<a href="https://pythonhosted.org/an_example_pypi_project/sphinx.html">great</a>
<a href="https://www.sphinx-doc.org/en/1.5/tutorial.html">guides</a>
for documenting your package using Sphinx, so I won't go through it
again here. For this blog post, I followed these guides to create a
simple documentation page and <a href="https://schmidtbri.github.io/ml-model-abc-improvements/">hosted it on Github
pages</a>.
Adding documentation is a simple process and it is done by almost all
Python packages that have more than a few users. After putting together
the basic documentation, I followed a few simple extra steps to fully
automate the creation of the documentation for the model.</p>
<p>First of all, I added documentation strings to all classes and methods
in the iris_model package where it made sense. <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_predict.py#L41-L47">Here is an
example</a>
of how I documented the predict() method using the docstring in the .py
file. The docstring is formatted so that it can be automatically built
by the sphinx autodoc extension. This extension makes it easy to extract
docstrings from python packages and modules and build documentation. A
good guide for using the autodoc extension can be found
<a href="https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html">here</a>.</p>
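<p>As an illustration (not the exact docstring from the repository), a predict() docstring formatted for the autodoc extension might look like this:</p>

```python
class IrisModel:
    def predict(self, data):
        """Make a prediction with the Iris model.

        :param data: dictionary with the four measurements of a flower
        :type data: dict
        :returns: dictionary with the predicted species
        :rtype: dict
        """
```

<p>The autodoc extension reads the :param:, :returns:, and :rtype: fields and renders them as structured documentation for the method.</p>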
<p>However, one problem with using the MLModel base class for writing code
is that the predict() method of a class that inherits from MLModel only
accepts a single parameter called "data" as input. This makes it hard to
document the input schema of the model through autodoc since the data
structure accepted by the model for prediction can't be easily described
in the docstring. The same problem happens when we try to document the
return type of the predict() method. Luckily, we can automatically
extract the JSON Schema representation of the input and output schemas
of the model. In order to leverage this, I used the
<a href="https://sphinx-jsonschema.readthedocs.io/en/latest/">sphinx-jsonschema</a>
extension to automatically add the schema information to a documentation
page. The process for adding it is simple, I just had to add this code
to an .rst file:</p>
<div class="highlight"><pre><span></span><code><span class="p">..</span> <span class="ow">jsonschema</span><span class="p">::</span> ../build/input_schema.json
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/docs/source/iris_predict.rst">here</a>.</p>
<p>The only problem is that the input and output JSON schema strings are
not saved to disk where the jsonschema extension can access them, but are
only available from an instance of the IrisModel class. To fix this, I added <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/00d5e558f9af7571d824d597107412ed86681e8b/docs/source/conf.py#L29-L39">this
code</a>
to the conf.py file that creates the Sphinx documentation. The code
instantiates an IrisModel object, extracts the JSON Schema strings, and
saves them to a location that can then be read by the Sphinx documentation
generator. The documentation that is generated can be seen
<a href="https://schmidtbri.github.io/ml-model-abc-improvements/iris_predict.html#iris-model-input-schema">here</a>
and
<a href="https://schmidtbri.github.io/ml-model-abc-improvements//iris_predict.html#iris-model-output-schema">here</a>.</p>
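<p>A simplified version of that conf.py logic looks like this. The IrisModel class here is a stand-in whose schemas are already plain JSON Schema dictionaries (the real model exposes Schema objects that must first be converted), and the paths are illustrative:</p>

```python
import json
import os
import tempfile


class IrisModel:
    # stand-in model exposing schemas as JSON Schema dictionaries
    input_schema = {"type": "object",
                    "properties": {"sepal_length": {"type": "number"}}}
    output_schema = {"type": "object",
                     "properties": {"species": {"type": "string"}}}


def write_schemas(model, build_dir):
    # save the schemas to disk where the sphinx-jsonschema
    # directive can read them during the documentation build
    os.makedirs(build_dir, exist_ok=True)
    for name, schema_dict in [("input_schema", model.input_schema),
                              ("output_schema", model.output_schema)]:
        with open(os.path.join(build_dir, name + ".json"), "w") as f:
            json.dump(schema_dict, f)


build_dir = os.path.join(tempfile.mkdtemp(), "build")
write_schemas(IrisModel(), build_dir)
```

<p>Running this from conf.py before the build starts guarantees the schema files always reflect the current model code.</p>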
<p>Since we are using the argparse library for creating the CLI interface
for the training script, we can use the
<a href="https://sphinx-argparse.readthedocs.io/en/stable/index.html">sphinxarg.ext</a>
Sphinx extension to automatically generate the documentation. This was
as easy as adding this code to the .rst file that describes the training
script:</p>
<div class="highlight"><pre><span></span><code><span class="p">..</span> <span class="ow">argparse</span><span class="p">::</span>
<span class="nc">:module:</span> iris_model.iris_train
<span class="nc">:func:</span> argument_parser
<span class="nc">:prog:</span> iris_train
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/docs/source/iris_train.rst">here</a>.</p>
<p>The sphinxarg.ext extension then imports the iris_train module, calls
the argument_parser() function to get an ArgumentParser instance, and
uses that object to generate the documentation.
The results can be seen in the documentation
<a href="https://schmidtbri.github.io/ml-model-abc-improvements//iris_train.html#iris-training-code-cli-documentation">here</a>.</p>
<p>This section shows how it is possible to write the code of an ML model
in such a way that the documentation can be created automatically.
Exposing the input and output schemas of the model as JSON schema
strings makes it possible for a Data Scientist to communicate the
requirements of the model clearly to the end user of the model. At the
same time, by exposing the hyperparameters of the training script as
command line options, it becomes possible to automatically document the
training process. Writing the ML model code in this way ensures that any
changes to the code are documented automatically whenever the
documentation is regenerated.</p>
<h2>Adding a setup.py File</h2>
<p>Now that we have the ML model code structured as a Python package,
versioned, and documented, we'll add a
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/setup.py">setup.py</a>
file to the project folder. The setup.py file is used by the setuptools
package to install python packages and makes the ML model easily
installable in a virtual environment. A great guide for writing the
setup.py file for your package can be found
<a href="https://github.com/kennethreitz/setup.py">here</a>.</p>
<p>In the iris_model package setup.py file, most of the fields are very
easy to understand and they are better explained in other guides. In
this blog post, I'll focus on the sections of the setup.py file that had
to be specifically modified for the ML model package. First of all, we
want to point at the folder that contains the iris_model package; we
can do this with this line in the setup.py file:</p>
<div class="highlight"><pre><span></span><code><span class="n">packages</span><span class="o">=</span><span class="p">[</span><span class="s2">"iris_model"</span><span class="p">],</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/setup.py#L20">here</a>.</p>
<p>Next, we need to make sure that the ml_model_abc.py Python module is
installed along with the iris_model package. In the future, it would be
better to take this code and put it into another Python package that the
iris_model package would depend on, but for now we just need this line
of code:</p>
<div class="highlight"><pre><span></span><code><span class="n">py_modules</span><span class="o">=</span><span class="p">[</span><span class="s2">"ml_model_abc"</span><span class="p">],</span>
</code></pre></div>
<p>The code above can be found <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/setup.py#L19">here</a>.</p>
<p>Next, we take care of the model parameters. The ML model requires that
the model parameters be available for loading at prediction time; the
setup.py file can handle this with these lines of code:</p>
<div class="highlight"><pre><span></span><code><span class="n">package_data</span><span class="o">=</span><span class="p">{</span><span class="s1">'iris_model'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'model_files/svc_model.pickle'</span><span class="p">]},</span>
<span class="n">include_package_data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/setup.py#L25-L28">here.</a></p>
<p>This ensures that when the package is installed into an environment, the
model parameters will be copied along with the model_files folder.</p>
<p>Next, we have to register the iris_train.py script as an entry point.
This makes it possible to run the training script from the command line
inside of an environment where the iris_model package is installed:</p>
<div class="highlight"><pre><span></span><code><span class="n">entry_points</span><span class="o">=</span><span class="p">{</span> <span class="s1">'console_scripts'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'iris_train=iris_model.iris_train:main'</span><span class="p">,]</span>
</code></pre></div>
<p>The code above can be found <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/setup.py#L29-L32">here</a>.</p>
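<p>Putting these pieces together, the relevant parts of the setup.py file look roughly like this. This is a sketch, not the exact file; standard fields such as name, version, and dependencies are elided:</p>

```python
from setuptools import setup

setup(
    # ... name, version, and other standard fields ...
    # install the shared base class module alongside the package
    py_modules=["ml_model_abc"],
    packages=["iris_model"],
    # ship the pickled model parameters with the package
    package_data={'iris_model': ['model_files/svc_model.pickle']},
    include_package_data=True,
    # register the training script as a CLI entry point
    entry_points={'console_scripts': ['iris_train=iris_model.iris_train:main']},
)
```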
<p>Once we have all of this in the setup.py file, we can try a pip
install in a new virtual environment. We will install the package
directly from the git repository to keep things simple. The shell
commands to do this are:</p>
<div class="highlight"><pre><span></span><code>mkdir example
<span class="nb">cd</span> example
<span class="c1"># creating a virtual environment</span>
python3 -m venv venv
<span class="c1">#activating the virtual environment, on a mac computer</span>
<span class="nb">source</span> venv/bin/activate
<span class="c1"># installing the iris_model package from the github repository</span>
pip install git+https://github.com/schmidtbri/ml-model-abc-improvements#egg<span class="o">=</span>iris_model
</code></pre></div>
<p>Now we can test the installation by starting an interactive Python
interpreter and executing this Python code:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">iris</span>\<span class="n">_model</span><span class="o">.</span><span class="n">iris</span>\<span class="n">_predict</span> <span class="kn">import</span> <span class="nn">IrisModel</span>
<span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">model</span>
<span class="o"><</span><span class="n">iris_model</span><span class="o">.</span><span class="n">iris_predict</span><span class="o">.</span><span class="n">IrisModel</span> <span class="nb">object</span> <span class="n">at</span> <span class="mh">0x105d1e940</span><span class="o">></span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">input_schema</span>
<span class="n">Schema</span><span class="p">({</span><span class="s1">'sepal_length'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">float</span><span class="s1">'>, </span>
<span class="s1">'sepal_width'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">float</span><span class="s1">'>, </span>
<span class="s1">'petal_length'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">float</span><span class="s1">'>, </span>
<span class="s1">'petal_width'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">float</span><span class="s1">'>})</span>
<span class="o">>>></span> <span class="n">model</span><span class="o">.</span><span class="n">output_schema</span>
<span class="n">Schema</span><span class="p">({</span><span class="s1">'species'</span><span class="p">:</span> <span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">str</span><span class="s1">'>})</span>
</code></pre></div>
<p>Next, we can test the CLI interface for the training code by executing
this command in a shell:</p>
<div class="highlight"><pre><span></span><code>iris_train -c<span class="o">=</span><span class="m">10</span>.0 -gamma<span class="o">=</span><span class="m">0</span>.01
</code></pre></div>
<p>This section showed how to install the iris_model Python package using
common Python packaging tools, and how to use and retrain the model in
different Python environments.</p>
<h2>Model Metadata in the MLModel Base Class</h2>
<p>In the <a href="https://towardsdatascience.com/a-simple-ml-model-base-class-ab40e2febf13">previous blog
post</a>
we showed an MLModel base class with two required abstract properties:
"input_schema" and "output_schema". These two properties were required
to be provided by any class that derived from the MLModel base class and
were used to publish schema metadata about the input and output data of
the model. In order to keep things simple, I chose not to expose more
metadata through class properties; however, there are several other
pieces of metadata that would be useful to expose to the outside world.
For example:</p>
<ul>
<li>display_name, a property that returns a display name for the model</li>
<li>qualified_name, a property that returns the qualified name of the model, a qualified name is an unambiguous identifier for the model</li>
<li>description, a property that returns a description of the model</li>
<li>major_version, a property that returns the model's major version as a string</li>
<li>minor_version, a property that returns the model's minor version as a string</li>
</ul>
<p>These properties are exposed as object properties and can be accessed
the same way as the input_schema and output_schema properties. The new
code for the MLModel base class now looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModel</span><span class="p">(</span><span class="n">ABC</span><span class="p">):</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">display_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">qualified_name</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">description</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">major_version</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">minor_version</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">validate</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</code></pre></div>
<p>The code above can be found <a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/ml_model_abc.py#L4-L74">here</a>.</p>
<p>The new MLModel base class works just like the previous
implementation, but now also requires the properties described above to
be published as instance properties.</p>
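<p>A useful side effect of declaring the properties as abstract is that Python refuses to instantiate a derived class that forgets one of them. A minimal sketch, using a trimmed-down version of the base class:</p>

```python
from abc import ABC, abstractmethod


class MLModel(ABC):
    @property
    @abstractmethod
    def display_name(self):
        raise NotImplementedError()

    @abstractmethod
    def predict(self, data):
        raise NotImplementedError()


class CompleteModel(MLModel):
    # provides every abstract member, so it can be instantiated
    display_name = "Complete Model"

    def predict(self, data):
        return data


class IncompleteModel(MLModel):
    # forgets display_name, so instantiating it raises a TypeError
    def predict(self, data):
        return data
```

<p>CompleteModel() works, while IncompleteModel() raises a TypeError naming the missing abstract property, which catches incomplete model implementations at instantiation time rather than at first use.</p>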
<p>This metadata is added in the __init__.py file of the iris_model
package, since it is applicable to the whole package:</p>
<div class="highlight"><pre><span></span><code><span class="c1"># a display name for the model</span>
<span class="n">__display_name__</span> <span class="o">=</span> <span class="s2">"Iris Model"</span>
<span class="c1"># returning the package name as the qualified name for the model</span>
<span class="n">__qualified_name__</span> <span class="o">=</span> <span class="vm">__name__</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">"."</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
<span class="c1"># a description of the model</span>
<span class="n">__description__</span> <span class="o">=</span> <span class="s2">"A machine learning model for predicting the species of a flower based on its measurements."</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/__init__.py#L4-L11">here</a>.</p>
<p>In order to show how a class that derives from the MLModel base class
can publish these properties, we can modify the Iris model example used
in the <a href="https://towardsdatascience.com/a-simple-ml-model-base-class-ab40e2febf13">previous blog
post</a>.
The Iris model class now looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">ml_model_abc</span> <span class="kn">import</span> <span class="n">MLModel</span>
<span class="kn">from</span> <span class="nn">iris_model</span> <span class="kn">import</span> <span class="n">__version_info__</span><span class="p">,</span> <span class="n">__display_name__</span><span class="p">,</span> <span class="n">__qualified_name__</span><span class="p">,</span> <span class="n">__description__</span>
<span class="k">class</span> <span class="nc">IrisModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="c1"># accessing the package metadata</span>
<span class="n">display_name</span> <span class="o">=</span> <span class="n">__display_name__</span>
<span class="n">qualified_name</span> <span class="o">=</span> <span class="n">__qualified_name__</span>
<span class="n">description</span> <span class="o">=</span> <span class="n">__description__</span>
<span class="n">major_version</span> <span class="o">=</span> <span class="n">__version_info__</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">minor_version</span> <span class="o">=</span> <span class="n">__version_info__</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="c1"># stating the input schema of the model as a Schema object</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="p">({</span><span class="s1">'sepal_length'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'sepal_width'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'petal_length'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'petal_width'</span><span class="p">:</span> <span class="nb">float</span><span class="p">})</span>
<span class="c1"># stating the output schema of the model as a Schema object</span>
<span class="n">output_schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="p">({</span><span class="s1">'species'</span><span class="p">:</span> <span class="nb">str</span><span class="p">})</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">))</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span>
<span class="s2">"model_files"</span><span class="p">,</span>
<span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="n">data</span><span class="p">[</span><span class="s2">"sepal_length"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"sepal_width"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"petal_length"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"petal_width"</span><span class="p">]])</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="n">targets</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'setosa'</span><span class="p">,</span> <span class="s1">'versicolor'</span><span class="p">,</span> <span class="s1">'virginica'</span><span class="p">]</span>
<span class="n">species</span> <span class="o">=</span> <span class="n">targets</span><span class="p">[</span><span class="n">y_hat</span><span class="p">]</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"species"</span><span class="p">:</span> <span class="n">species</span><span class="p">}</span>
</code></pre></div>
<p>The code above can be found
<a href="https://github.com/schmidtbri/ml-model-abc-improvements/blob/master/iris_model/iris_predict.py#L1-L65">here</a>.</p>
<p>The display name, qualified name, and description properties are set as
string class properties in the IrisModel class, and they are accessed
from the __init__ module. The major and minor version properties are
extracted from the package's __version_info__ tuple.</p>
<p>There can be some situations in which a single Python package will hold
more than one MLModel derived class. In that case the display name,
qualified name, and description metadata would be set individually
within each MLModel-derived class itself instead of being read from the
package-wide metadata stored in the __init__ module.</p>
<p>The class properties are now easily accessible from the model object, to
show this we can instantiate the object and access the properties:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="kn">from</span> <span class="nn">iris_model.iris_predict</span> <span class="kn">import</span> <span class="n">IrisModel</span>
<span class="o">>>></span> <span class="n">iris_model</span> <span class="o">=</span> <span class="n">IrisModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">iris_model</span><span class="o">.</span><span class="n">qualified_name</span>
<span class="s1">'iris_model'</span>
<span class="o">>>></span> <span class="n">iris_model</span><span class="o">.</span><span class="n">display_name</span>
<span class="s1">'Iris Model'</span>
</code></pre></div>
<p>These new metadata properties make it easier to introspect information
about a model, which in turn makes it possible to manage many MLModel
objects in the same Python process.</p>
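<p>For example, several models can be collected into a simple registry keyed by qualified name. The two model classes here are hypothetical stand-ins that only carry metadata:</p>

```python
class ModelA:
    # hypothetical model exposing the metadata properties
    qualified_name = "model_a"
    display_name = "Model A"


class ModelB:
    qualified_name = "model_b"
    display_name = "Model B"


# build a registry keyed by each model's unambiguous qualified name
registry = {model.qualified_name: model for model in (ModelA(), ModelB())}


def display_name_of(qualified_name):
    # look up a model's metadata without knowing its concrete class
    return registry[qualified_name].display_name
```

<p>Because the qualified name is unambiguous, it makes a safe dictionary key for routing prediction requests to the right model object.</p>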
<h2>Future Improvements</h2>
<p>In this blog post we showed how to do versioning of an ML model using
standard conventions of python packages, however the model parameters of
the Iris model also need to be versioned over time and metadata about
them also needs to be kept. This is a problem that I will tackle in a
future blog post.</p>
<p>Another problem that we did not tackle in this blog post is how to have
a more complex API for ML models. For example, the Iris model is only
allowed to have one predict() method, this makes it impossible to do
more complex operations with the Iris model. In a future blog post I
will show how to modify the ML model base class to allow this.</p>A Simple ML Model Base Class2019-04-02T09:20:00-05:002019-04-02T09:20:00-05:00Brian Schmidttag:www.tekhnoal.com,2019-04-02:/a-simple-ml-model-base-class.html<p>When creating software it is often useful to write abstract classes to help define different interfaces that classes can implement and inherit from. By creating a base class, a standard can be defined that simplifies the design of the whole system and clarifies every decision moving forward.</p><p>When creating software it is often useful to write abstract classes to
help define different interfaces that classes can implement and inherit
from. By creating a base class, a standard can be defined that
simplifies the design of the whole system and clarifies every decision
moving forward.</p>
<p>The integration of ML models with other software components is often
complicated and can benefit greatly from using an Object Oriented
approach. Recently, I've been seeing this problem solved in many
different ways, so I decided to try to implement my own solution.</p>
<p>In this post I will describe a simple implementation of a base class for
Machine Learning Models. This post will focus on making predictions with
ML models, and integrating ML models with other software components.
Training code will not be shown to keep the code simple. The code in
this post will be written in Python, if you aren't familiar with
abstract base classes in Python,
<a href="https://www.python-course.eu/python3_abstract_classes.php">here</a>
is a good place to learn.</p>
<h2>Scikit-learn's Approach to Base Classes</h2>
<p>The most well known ML software package in python is scikit-learn, and
it provides a set of abstract base classes in the <a href="https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/base.py">base.py
module</a>.
The scikit-learn API is a great place to learn about machine learning
software engineering in general, but in this case we want to focus on
its approach to base classes for making predictions with ML models.</p>
<p>Scikit-learn defines an abstract base class called Estimator, which is
meant to be the base class for any class that is able to learn from a
data set; a class that derives from Estimator must implement a "fit"
method. Scikit-learn also defines a Predictor base class, meant to be
the base class for any class that is able to infer from learned
parameters when presented with new data; a class that derives from
Predictor must implement a "predict" method. These two base classes are
some of the most commonly used abstractions in the Scikit-learn package.
By defining these base classes, the Scikit-learn project provides a
strong foundation for coding ML algorithms.</p>
<p>These two interfaces are broad enough to take us far, but what about
serialization and deserialization? ML models need to be loaded from
storage before they can be used. On this front scikit-learn is mostly
silent, and no standard interface for hiding the details of model
serialization and deserialization is provided. Also, what if we need to
publish schema information about the input and output data that a model
needs for scoring? Scikit-learn does not provide a way to do this
either, since it uses numpy arrays for input and output.</p>
<p>Because of these factors, using Scikit-learn's API is not necessarily
the best way to integrate ML models with other software components.
Integrating a Scikit-learn model with other software components by using
the Scikit-learn API exposes internal details about how the model is
serialized and how information is passed into the model. For example, if
a Data Scientist hands over a scikit-learn model in a pickled file along
with some code, a software engineer would have to be familiar with how
to deserialize the model object and how to structure a Numpy array in
such a way that it will be accepted by the model's predict() method. The
best way to solve this problem is to hide these implementation details
behind an interface.</p>
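<p>To make the handoff problem concrete, here is a small sketch. It uses a plain Python class as a stand-in for a pickled scikit-learn model, so the details are hypothetical, but it shows what the consumer of such an artifact has to already know: the serialization format, the object's type, and the undocumented column order of the input.</p>

```python
import io
import pickle


# stand-in for a scikit-learn model: from the pickled artifact alone, the
# consumer cannot tell what type it is or how the input must be laid out
class OpaqueModel:
    def predict(self, X):
        # X must be a 2-D sequence with columns in a fixed, undocumented
        # order; here column 2 happens to be the one that matters
        return [row[2] * 0.5 for row in X]


# the Data Scientist pickles the model...
buffer = io.BytesIO()
pickle.dump(OpaqueModel(), buffer)

# ...and the engineer must know to unpickle it and how to shape the input
buffer.seek(0)
model = pickle.load(buffer)
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))  # prints [0.7]
```

Hiding both steps behind an interface removes this implicit knowledge from the handoff.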
<p>In summary, to simplify the use of ML models within production systems,
it would be useful to solve a couple of issues:</p>
<ul>
<li>
<p>How to consistently and transparently send data to the model</p>
</li>
<li>
<p>How to load serialized model assets when instantiating a model</p>
</li>
<li>
<p>How to publish input and output data schema information</p>
</li>
</ul>
<h2>Some Solutions</h2>
<p>Over the last few years, a few big tech companies have been developing
proprietary in-house machine learning infrastructure and software. Some
of these companies sell access to their ML platform and others have
published details about their approach to ML infrastructure. Also, there
have been a few open source projects that seek to simplify the
deployment of ML models to production systems. In this section I will
describe some solutions that have emerged recently for the problems
described above.</p>
<h3>AWS Sagemaker</h3>
<p>AWS Sagemaker is a platform for training and deploying ML models within
the AWS ecosystem. The platform has several ready-made ML algorithms
that can be leveraged without writing a lot of code. However, a way to
deploy custom ML code to the platform is provided. To deploy a
prediction endpoint on top of the Sagemaker service, a Python Flask
application with "/ping" and "/invocations" endpoints must be created
and deployed within a Docker container.</p>
<p>In the Sagemaker example published
<a href="https://github.com/awslabs/amazon-sagemaker-examples/blob/35941a33425b3a441275abc7243eb1f959a584e4/advanced_functionality/scikit_bring_your_own/container/decision_trees/predictor.py#L24-L43">here</a>,
we can see the recommended way to run the model prediction code within
the Flask application. In the example, the scikit-learn model object is
deserialized and saved as a class property, and the model is then
accessed by the "predict" method. This implementation does not provide a
way to publish schema metadata about the model and does not enforce any
specific implementation on the model code. The AWS Sagemaker library
does not provide a base class to help write the model code.</p>
<h3>Facebook</h3>
<p>Facebook published a blog post about their ML systems
<a href="https://code.fb.com/ml-applications/introducing-fblearner-flow-facebook-s-ai-backbone/">here</a>.
The FBLearner Flow system is made up of workflows and operators. A
workflow is a single unit of work with a specific set of inputs and
outputs; each workflow is made up of operators, which perform simple
operations on data. The blog post shows how to train a Decision Tree model on the
iris data set. The blog post does not provide many implementation
details about their internal Python packages. An interesting part of the
approach taken is the fact that schema metadata is attached to every
workflow created, ensuring type safety at runtime. There are no details
about loading and storing model assets. Facebook's FBFlow Python package
does not use base classes that developers can inherit from to write
code, but uses function annotations to attach metadata to ML model code.</p>
<h3>Uber</h3>
<p>Uber published a blog post about their approach to custom ML models
<a href="https://eng.uber.com/michelangelo-pyml/">here</a>. Uber's
PyML package is used to deploy ML models that are not natively supported
by Uber's Michelangelo ML platform, which is described
<a href="https://eng.uber.com/michelangelo/">here</a>. The PyML
package does not specify how to write model training code, but does
provide a base class for writing ML model prediction code. The base
class is called DataFrameModel. The interface is very simple: it has
only two methods, the __init__() method and the predict() method.
The model assets are required to be deserialized in the class
constructor and all prediction code is in the predict method of the
class.</p>
<p>The DataFrameModel interface requires the use of Pandas dataframes or
tensors when giving data to the model for prediction. This is a design
decision that can backfire because there is no way to tell the user of the
model how to structure the input data to the model. However, the use of
the __init__() method for loading model assets helps to hide the
complexity of the model from the user. Also, by using base classes that
must be inherited from in order to deploy code to the production
systems, certain requirements can be more easily checked.</p>
<h3>Seldon Core</h3>
<p>Seldon Core is an open source project for hosting ML models. It supports
custom Python models, as described
<a href="https://docs.seldon.io/projects/seldon-core/en/latest/python/python_component.html">here</a>.
The model code is required to be in a Python class with an
__init__() method and a predict() method. It follows Uber's design
closely but does not use an abstract base class to enforce the
interface. Another difference is that Seldon allows the model class to
return results in several different ways, and not just in Pandas
dataframes. Seldon also allows the model class to return column name
metadata for the model inputs, but no type metadata.</p>
<h1>A Simple ML Model Base Class</h1>
<p>NOTE: All of the code shown in this section can be found in <a href="https://github.com/schmidtbri/simple-ml-model-abc">this
Github
repository</a>.</p>
<p>In this section I will present a simple abstract base class that
combines the strengths of the approaches shown above into one abstract
base class for ML models. I will also explain the reasoning behind the
design.</p>
<p>Here is the code for the abstract base class:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">abc</span> <span class="kn">import</span> <span class="n">ABC</span><span class="p">,</span> <span class="n">abstractmethod</span>

<span class="k">class</span> <span class="nc">MLModel</span><span class="p">(</span><span class="n">ABC</span><span class="p">):</span>
<span class="sd">""" An abstract base class for ML model prediction code """</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">input_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@property</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">output_schema</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">raise</span> <span class="ne">NotImplementedError</span><span class="p">()</span>
<span class="nd">@abstractmethod</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="bp">self</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">validate</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</code></pre></div>
<p>The code looks very similar to Uber's and Seldon Core's approaches. The
model file deserialization code is still expected to be implemented in
the __init__() method, and the prediction code is still expected to
be in the predict() method. Any model that needs to be used by other
software packages is expected to derive from the MLModel abstract base
class and implement these two methods.</p>
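<p>One benefit of using an abstract base class is that the contract is enforced at instantiation time. The sketch below repeats the base class so it runs standalone, and shows that a subclass which forgets to implement predict() cannot even be constructed:</p>

```python
from abc import ABC, abstractmethod


class MLModel(ABC):
    """ An abstract base class for ML model prediction code """
    @property
    @abstractmethod
    def input_schema(self):
        raise NotImplementedError()

    @property
    @abstractmethod
    def output_schema(self):
        raise NotImplementedError()

    @abstractmethod
    def __init__(self):
        raise NotImplementedError()

    @abstractmethod
    def predict(self, data):
        self.input_schema.validate(data)


# a subclass that forgets to implement predict() cannot be instantiated,
# because predict() is still marked abstract
class IncompleteModel(MLModel):
    input_schema = None
    output_schema = None

    def __init__(self):
        pass


try:
    IncompleteModel()
except TypeError as e:
    print("rejected:", e)
```

This kind of early failure is much easier to debug than a missing method surfacing at prediction time in production.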
<p>However, there are some differences. The input to the predict method is
not expected to be of any particular type, it can be any Python type as
long as the input data is packaged into a single input parameter called
"data". This is different from Seldon Core's and Uber's approaches, which
require Numpy arrays or Pandas dataframes.</p>
<p>Another difference is that the base class shown above requires the model
creator to attach schema metadata to their implementation. The base
class has two extra properties that are not present in the Seldon Core
and Uber implementations: the "input_schema" and "output_schema"
properties are meant to publish the schema of the data that the model
will accept in the predict method and the schema of the data that the
model will output from the predict method. To do this, I will use the
python schema package, but there are many options for writing and
enforcing schemas, for example the marshmallow and schematics
python packages.</p>
<p>We also need to define a way for a model creator to raise exceptions.
For this we can write a simple custom Exception:</p>
<div class="highlight"><pre><span></span><code><span class="k">class</span> <span class="nc">MLModelException</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span>
<span class="sd">""" Exception type for use within MLModel derived classes """</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="ne">Exception</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
</code></pre></div>
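<p>A sketch of how a model implementation might use this exception type. The predict function below is hypothetical, but it shows the wrapping pattern: internal errors are re-raised as MLModelException, so callers only ever need to handle one exception type regardless of what went wrong inside the model.</p>

```python
class MLModelException(Exception):
    """ Exception type for use within MLModel derived classes """


# a hypothetical predict function showing the wrapping pattern
def predict(data):
    try:
        # any internal failure (missing key, bad value, library error)
        # surfaces here
        return {"score": 1.0 / data["value"]}
    except (KeyError, ZeroDivisionError) as e:
        # re-raise as the model's own exception type, keeping the cause
        raise MLModelException("prediction failed: {}".format(e)) from e


print(predict({"value": 4.0}))  # prints {'score': 0.25}
```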
<h1>Using the Base Class</h1>
<p>This blog post deals purely with the ML code that will be used for
predicting in production and not with the model training code. However,
we still need to have a model to work with. Here's a simple scikit-learn
model training script:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pickle</span>

<span class="kn">from</span> <span class="nn">sklearn</span> <span class="kn">import</span> <span class="n">datasets</span><span class="p">,</span> <span class="n">svm</span>

<span class="n">iris</span> <span class="o">=</span> <span class="n">datasets</span><span class="o">.</span><span class="n">load_iris</span><span class="p">()</span>
<span class="n">svm_model</span> <span class="o">=</span> <span class="n">svm</span><span class="o">.</span><span class="n">SVC</span><span class="p">(</span><span class="n">gamma</span><span class="o">=</span><span class="mf">0.001</span><span class="p">,</span> <span class="n">C</span><span class="o">=</span><span class="mf">100.0</span><span class="p">)</span>
<span class="n">svm_model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">iris</span><span class="o">.</span><span class="n">data</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="n">iris</span><span class="o">.</span><span class="n">target</span><span class="p">[:</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">))</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"model_files"</span><span class="p">,</span> <span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'wb'</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">svm_model</span><span class="p">,</span> <span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div>
<p>Now that we have a trained model, we can write the class that will inherit from MLModel and make predictions:</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pickle</span>

<span class="kn">from</span> <span class="nn">numpy</span> <span class="kn">import</span> <span class="n">array</span>
<span class="kn">from</span> <span class="nn">schema</span> <span class="kn">import</span> <span class="n">Schema</span><span class="p">,</span> <span class="n">Or</span>

<span class="k">class</span> <span class="nc">IrisSVCModel</span><span class="p">(</span><span class="n">MLModel</span><span class="p">):</span>
<span class="sd">""" A demonstration of how to use the MLModel base class """</span>
<span class="n">input_schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="p">({</span><span class="s1">'sepal_length'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'sepal_width'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'petal_length'</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span>
<span class="s1">'petal_width'</span><span class="p">:</span> <span class="nb">float</span><span class="p">})</span>
<span class="c1"># the output of the model will be one of three strings</span>
<span class="n">output_schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="p">({</span><span class="s1">'species'</span><span class="p">:</span> <span class="n">Or</span><span class="p">(</span><span class="s2">"setosa"</span><span class="p">,</span>
<span class="s2">"versicolor"</span><span class="p">,</span>
<span class="s2">"virginica"</span><span class="p">)})</span>
<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">dir_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">))</span>
<span class="n">file</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir_path</span><span class="p">,</span> <span class="s2">"model_files"</span><span class="p">,</span> <span class="s2">"svc_model.pickle"</span><span class="p">),</span> <span class="s1">'rb'</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
<span class="n">file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
<span class="c1"># calling the super method to validate against the</span>
<span class="c1"># input_schema</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">data</span><span class="p">)</span>
<span class="c1"># converting the incoming dictionary into a numpy array</span>
<span class="c1"># that can be accepted by the scikit-learn model</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="n">data</span><span class="p">[</span><span class="s2">"sepal_length"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"sepal_width"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"petal_length"</span><span class="p">],</span>
<span class="n">data</span><span class="p">[</span><span class="s2">"petal_width"</span><span class="p">]])</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="c1"># making the prediction</span>
<span class="n">y_hat</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_svm_model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="c1"># converting the prediction into a string that will match</span>
<span class="c1"># the output schema of the model, this list will map the</span>
<span class="c1"># output of the scikit-learn model to the string expected by</span>
<span class="c1"># the output schema</span>
<span class="n">targets</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'setosa'</span><span class="p">,</span> <span class="s1">'versicolor'</span><span class="p">,</span> <span class="s1">'virginica'</span><span class="p">]</span>
<span class="n">species</span> <span class="o">=</span> <span class="n">targets</span><span class="p">[</span><span class="n">y_hat</span><span class="p">]</span>
<span class="k">return</span> <span class="p">{</span><span class="s2">"species"</span><span class="p">:</span> <span class="n">species</span><span class="p">}</span>
</code></pre></div>
<p>One useful thing about using the schema package for building the input
and output schemas of the model is that it supports exporting the schema
in the JSON schema format:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">model</span> <span class="o">=</span> <span class="n">IrisSVCModel</span><span class="p">()</span>
<span class="o">>>></span> <span class="nb">print</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">input_schema</span><span class="o">.</span><span class="n">json_schema</span><span class="p">(</span><span class="s2">"https://example.com/my-schema.json"</span><span class="p">)))</span>
<span class="p">{</span><span class="s2">"type"</span><span class="p">:</span> <span class="s2">"object"</span><span class="p">,</span> <span class="s2">"properties"</span><span class="p">:</span> <span class="p">{</span><span class="s2">"sepal_length"</span><span class="p">:</span> <span class="p">{</span><span class="s2">"type"</span><span class="p">:</span> <span class="s2">"number"</span><span class="p">},</span> <span class="s2">"sepal_width"</span><span class="p">:</span> <span class="p">{</span><span class="s2">"type"</span><span class="p">:</span> <span class="s2">"number"</span><span class="p">},</span>
<span class="o">...</span>
<span class="o">...</span>
</code></pre></div>
<h2>Conclusion</h2>
<p>In this post I showed a few different approaches to deploying ML model
code to production systems. I also showed an implementation of a Python
base class that brings together the best features of the different
approaches discussed. In conclusion I will discuss some of the benefits
of the approach I sketched out above.</p>
<p>The MLModel base class has very few dependencies. It does not require
the model creator to use Pandas, numpy, or any other Python package to
transfer data to the model. This also means that it does not force the
user of the model to know any internal implementation details about the
model. On the other hand, Uber's solution requires that the user of the
model know how to work with Pandas dataframes. However, if the model
creator still wishes to accept numpy arrays or Pandas dataframes to
their model, the MLModel base class shown above still allows this.</p>
<p>By using python dictionaries for model input and output, the model is
easier to use. There is no need to understand how to use numpy arrays or
Pandas dataframes, remember the order of the columns, or know how the
output columns are encoded in order to use the model.</p>
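<p>To illustrate, here is a hypothetical rule-based model that exposes the same dictionary-in, dictionary-out interface; the thresholds are made up for illustration, but the caller needs no knowledge of arrays, column ordering, or label encoding:</p>

```python
# a hypothetical rule-based model exposing a dictionary interface;
# callers never see numpy arrays, column order, or integer labels
class RuleBasedIrisModel:
    def predict(self, data):
        # illustrative thresholds on petal length only
        if data["petal_length"] < 2.5:
            species = "setosa"
        elif data["petal_length"] < 5.0:
            species = "versicolor"
        else:
            species = "virginica"
        return {"species": species}


model = RuleBasedIrisModel()
print(model.predict({"sepal_length": 5.1, "sepal_width": 3.5,
                     "petal_length": 1.4, "petal_width": 0.2}))
# prints {'species': 'setosa'}
```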
<p>By stating the input and output schemas of a model programmatically, it
is possible to compare different models' schemas through automated
tools. This can be useful when tracking model changes across many
different versions of a model. Facebook's approach allows schema
metadata to be attached to ML models, but no other approach discussed
above does this.</p>
<p>By hiding the deserialization code behind the __init__() method, the
deserialization technique or the storage location of model files can be
changed without affecting the code that uses the model. In the same way,
I can replace the code in the predict() method without affecting the
user of the model, as long as the input and output schemas remain the
same. This is the benefit of using Object Oriented Programming to hide
implementation details from users of your code.</p>
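<p>A minimal sketch of this idea, assuming two hypothetical model versions: the storage mechanism changes between them, but because the interface is identical, the calling code does not change at all.</p>

```python
import json
import os
import tempfile


class ModelV1:
    """ loads its weights from a JSON file on local disk """
    def __init__(self, path):
        with open(path) as f:
            self._weights = json.load(f)

    def predict(self, data):
        return {"score": self._weights["w"] * data["x"]}


class ModelV2:
    """ same interface, but the weights now come from somewhere else
        (hard-coded here; could be a database or object storage) """
    def __init__(self, path):
        self._weights = {"w": 2.0}

    def predict(self, data):
        return {"score": self._weights["w"] * data["x"]}


# write a weights file for ModelV1 to load
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"w": 2.0}, f)
    path = f.name

# the calling code is identical for both versions
for model_class in (ModelV1, ModelV2):
    model = model_class(path)
    print(model.predict({"x": 3.0}))  # prints {'score': 6.0}

os.remove(path)
```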
<p>There are some other improvements that can be added to the MLModel base
class shown in this post, but these will be shown in a later blog post.</p>