1. Caching for ML Model Deployments

    Wed 10 August 2022

    In a software system, a cache is a data store that temporarily holds computation results or frequently-accessed data. Reading a result from the cache lets us avoid paying the cost of recomputing it, and reading a frequently-accessed piece of data from the cache lets us avoid paying the cost of fetching it from a slower data store. When a cache hit occurs, the data being sought is found and returned to the caller; when a cache miss occurs, the data is not found and must be recomputed or fetched from the slower data store by the caller. A data cache is generally built on low-latency storage, which makes it more expensive to run. Machine learning model deployments can benefit from caching because making predictions with a model can be a CPU-intensive process, especially for large and complex models. Predictions that take a long time to compute can be cached and returned later when the same prediction is requested; this type of caching is also known as memoization. Another reason a prediction can take a long time is data enrichment. Data enrichment is the process of adding fields to a model's input from a data store before a prediction is made; this process adds latency to the prediction and can also benefit from caching.
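
    As a rough illustration of the memoization idea, here is a minimal sketch of a wrapper that caches predictions in an in-memory dictionary keyed by a hash of the input. The CachingModelWrapper name, the dictionary cache, and the assumption that the model exposes a predict() method taking and returning dictionaries are all illustrative; a production deployment would more likely use a dedicated low-latency cache store, which this sketch does not cover.

        import hashlib
        import json


        def cache_key(model_input: dict) -> str:
            # Build a deterministic key from the prediction input.
            serialized = json.dumps(model_input, sort_keys=True)
            return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


        class CachingModelWrapper:
            """Memoizes predictions of a wrapped model in a simple in-memory dict."""

            def __init__(self, model):
                self._model = model   # hypothetical model with a predict(dict) -> dict method
                self._cache = {}

            def predict(self, model_input: dict) -> dict:
                key = cache_key(model_input)
                if key in self._cache:          # cache hit: return the stored result
                    return self._cache[key]
                prediction = self._model.predict(model_input)   # cache miss: compute and store
                self._cache[key] = prediction
                return prediction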

    read more
  2. Data Enrichment for ML Model Deployments

    Sun 01 May 2022

    Machine learning models need data to make predictions. When deploying a model to a production setting, this data is not necessarily available from the client system that is requesting the prediction. When this happens, some other source is needed for the data that is required by the model but not provided by the client system. The process of accessing this data and joining it to the client's prediction request is called data enrichment. In all cases, the model itself should not need to be modified in order to do data enrichment; the process should be transparent to the model. In this blog post, we'll show a method for doing data enrichment that does not require the model itself to be modified.
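
    To make the idea concrete, here is a minimal sketch of enrichment done outside the model, assuming the model exposes a predict(dict) -> dict method and the data store exposes a lookup(key) -> dict method; both interfaces and all names here are assumptions made for illustration, not the post's actual implementation.

        class EnrichingModelWrapper:
            """Adds fields from a data store to the request before calling the model."""

            def __init__(self, model, data_store, key_field: str):
                self._model = model            # hypothetical model with a predict(dict) -> dict method
                self._data_store = data_store  # hypothetical store with a lookup(key) -> dict method
                self._key_field = key_field    # request field used to look up the extra data

            def predict(self, request: dict) -> dict:
                extra_fields = self._data_store.lookup(request[self._key_field])
                enriched_input = {**request, **extra_fields}  # client fields plus enrichment fields
                return self._model.predict(enriched_input)    # the model never sees the difference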

    read more
  3. Decorator Pattern for ML Models

    Sun 27 February 2022

    The decorator pattern is a software engineering pattern that allows software to be more flexible, more reusable, and more cohesive. In this blog post, we’ll explore how decorators work, how to implement them, and how to apply them to the MLModel base class.
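
    As a rough sketch of the pattern, the snippet below wraps a simplified MLModel interface in a decorator that exposes the same interface and adds behavior around the wrapped call; the interface here is reduced to a single predict() method and the class names are illustrative rather than the definitions used in the post.

        import abc


        class MLModel(abc.ABC):
            """Simplified stand-in for the MLModel base class (single abstract method)."""

            @abc.abstractmethod
            def predict(self, data: dict) -> dict:
                ...


        class MLModelDecorator(MLModel):
            """Wraps an MLModel and exposes the same interface, so decorators can be stacked."""

            def __init__(self, model: MLModel):
                self._model = model

            def predict(self, data: dict) -> dict:
                return self._model.predict(data)


        class LoggingDecorator(MLModelDecorator):
            """Example decorator that adds behavior (logging) around the wrapped prediction."""

            def predict(self, data: dict) -> dict:
                print(f"predict called with: {data}")
                return super().predict(data)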

    read more
  4. Property-Based Testing for ML Models

    Fri 03 September 2021

    Property-based testing is a form of software testing that allows developers to write more comprehensive tests for software components. Property-based tests work by asserting that certain properties of the software component under test hold over a wide range of inputs. Property-based tests rely on the generation of inputs for a component and are a form of generative testing. When doing property-based testing it is useful to think in terms of invariants within the software component that we are testing. An invariant is a condition or assumption that we expect will never be violated by the component.
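
    As an example of asserting an invariant over generated inputs, the sketch below uses the hypothesis package to check that a scoring function always returns a probability between 0 and 1; the predict_probability function, its inputs, and their ranges are placeholders invented for illustration, not code from the post.

        from hypothesis import given, strategies as st


        # Placeholder scoring function; a real test would import the model code under test.
        def predict_probability(age: int, income: float) -> float:
            return 1.0 / (1.0 + 2.718281828459045 ** -(0.01 * age + 0.00001 * income - 1.0))


        @given(age=st.integers(min_value=18, max_value=100),
               income=st.floats(min_value=0.0, max_value=1_000_000.0, allow_nan=False))
        def test_prediction_is_a_valid_probability(age, income):
            # Invariant: for any valid input, the output must be a probability in [0, 1].
            assert 0.0 <= predict_probability(age, income) <= 1.0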

    read more
  5. Training and Deploying an ML Model

    Thu 15 July 2021

    This post is a collection of several different techniques that I wanted to learn. In this blog post I'll be using open source Python packages to do automated data exploration, automated feature engineering, automated machine learning, and model validation. I'll also be using Docker and Kubernetes to deploy the model. I'll cover the entire codebase of the model, from the initial data exploration to the deployment of the model behind a RESTful API in Kubernetes.

    read more

« Page 2 / 5 »