Other articles

  1. Load Tests for ML Models

    Thu 01 September 2022

    In a previous blog post we showed how to create a RESTful model service for a machine learning model that we want to deploy. A common requirement for RESTful services is that they keep working while many users access them at the same time. In this blog post we'll show how to create a load testing script for an ML model service.

    read more
  2. Caching for ML Model Deployments

    Wed 10 August 2022

    In a software system, a cache is a data store that temporarily holds computation results or frequently accessed data. Reading a computation result from a cache avoids the cost of recomputing it, and reading frequently accessed data from a cache avoids the cost of fetching it from a slower data store. When a cache “hit” occurs, the data being sought is found and returned to the caller. When a “miss” occurs, the data is not found and must be recomputed or fetched from the slower data store by the caller. A data cache is generally built on low-latency storage, which tends to be more expensive to run. Machine learning model deployments can benefit from caching because making predictions with a model can be a CPU-intensive process, especially for large and complex models. Predictions that take a long time to make can be cached and returned later when the same prediction is requested. This type of caching is also known as memoization. A prediction can also take a long time when data enrichment is needed. Data enrichment is the process of adding fields to a model's input from a data store before a prediction is made. This process adds latency to the prediction and can benefit from caching.

    read more
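The memoization idea in the caching excerpt above can be sketched in a few lines. This is only an illustration under stated assumptions: `PredictionCache` and `slow_predict` are hypothetical names, the cache is a plain in-process dictionary rather than a dedicated low-latency store, and the model input is assumed to be a JSON-serializable dictionary.

```python
import json
from typing import Any, Callable, Dict


class PredictionCache:
    """Memoizes predictions in memory, keyed on the serialized model input.

    Hypothetical helper for illustration; a real deployment would likely
    use an external cache and an eviction policy.
    """

    def __init__(self, predict: Callable[[Dict[str, Any]], Any]) -> None:
        self._predict = predict
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def __call__(self, data: Dict[str, Any]) -> Any:
        # Sort keys so that equal inputs always produce the same cache key.
        key = json.dumps(data, sort_keys=True)
        if key in self._store:
            # Cache hit: return the stored prediction without recomputing.
            self.hits += 1
            return self._store[key]
        # Cache miss: compute the prediction and store it for later requests.
        self.misses += 1
        result = self._predict(data)
        self._store[key] = result
        return result


def slow_predict(data: Dict[str, Any]) -> Any:
    """Stand-in for an expensive model prediction (assumption)."""
    return data["x"] * 2 + data["y"]


cached_predict = PredictionCache(slow_predict)
first = cached_predict({"x": 1, "y": 3})   # miss: computed by slow_predict
second = cached_predict({"y": 3, "x": 1})  # hit: same input, different key order
```

Serializing the input with sorted keys is one possible way to build a cache key; it assumes that two requests with equal field values should always receive the same prediction.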
  3. Data Enrichment for ML Model Deployments

    Sun 01 May 2022

    Machine learning models need data to make predictions. When deploying a model to a production setting, this data is not necessarily available from the client system that is requesting the prediction. When this happens, some other source is needed for the data that is required by the model but not provided by the client system. The process of accessing the data and joining it to the client's prediction request is called data enrichment. In all cases, the model itself should not need to be modified in order to do data enrichment; the process should be transparent to the model. In this blog post, we'll show a method for doing data enrichment that does not require the model itself to be modified.

    read more
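As a rough sketch of the data enrichment idea in the excerpt above, the wrapper below joins fields from a data store onto the client's request before calling an unmodified prediction function. Everything named here is an assumption made for illustration: `CUSTOMER_STORE` is an in-memory dictionary standing in for a real data store, and `model_predict`, `with_enrichment`, and the `customer_id` field are hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory table standing in for a real data store.
CUSTOMER_STORE: Dict[str, Dict[str, Any]] = {
    "c-1": {"age": 42, "region": "west"},
}


def model_predict(data: Dict[str, Any]) -> str:
    """Stand-in model that needs fields the client does not send (assumption)."""
    return f"{data['region']}:{data['age'] + data['amount']}"


def with_enrichment(
    predict: Callable[[Dict[str, Any]], Any],
    store: Dict[str, Dict[str, Any]],
    key_field: str,
) -> Callable[[Dict[str, Any]], Any]:
    """Wraps predict so the request is enriched from store before the call."""

    def enriched_predict(request: Dict[str, Any]) -> Any:
        # Look up the extra fields by the key the client provided.
        # A missing key raises KeyError here; real code would handle that case.
        extra_fields = store[request[key_field]]
        # Join the stored fields to the client's request without mutating it.
        enriched_request = {**request, **extra_fields}
        return predict(enriched_request)

    return enriched_predict


predict = with_enrichment(model_predict, CUSTOMER_STORE, key_field="customer_id")
prediction = predict({"customer_id": "c-1", "amount": 8})
```

The prediction function is left untouched, so the enrichment stays transparent to the model: the client sends only `customer_id` and `amount`, and the wrapper supplies `age` and `region`.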
  4. Decorator Pattern for ML Models

    Sun 27 February 2022

    The decorator pattern is a software engineering pattern that allows software to be more flexible, more reusable, and more cohesive. In this blog post, we’ll explore how decorators work, how to implement them, and how to apply them to the MLModel base class.

    read more
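The decorator pattern described in the last excerpt can be sketched as follows. This is a minimal illustration and not the MLModel base class from the post: `Model`, `DoublingModel`, `ModelDecorator`, and `CallCountingDecorator` are hypothetical classes invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class Model(ABC):
    """Hypothetical stand-in for a model base class."""

    @abstractmethod
    def predict(self, data: Dict[str, Any]) -> Dict[str, Any]:
        ...


class DoublingModel(Model):
    """Toy model used only to exercise the decorator."""

    def predict(self, data: Dict[str, Any]) -> Dict[str, Any]:
        return {"result": data["x"] * 2}


class ModelDecorator(Model):
    """Wraps a model and exposes the same interface, delegating by default."""

    def __init__(self, model: Model) -> None:
        self._model = model

    def predict(self, data: Dict[str, Any]) -> Dict[str, Any]:
        return self._model.predict(data)


class CallCountingDecorator(ModelDecorator):
    """Adds behavior around predict without changing the wrapped model."""

    def __init__(self, model: Model) -> None:
        super().__init__(model)
        self.calls = 0

    def predict(self, data: Dict[str, Any]) -> Dict[str, Any]:
        self.calls += 1
        return super().predict(data)


decorated = CallCountingDecorator(DoublingModel())
output = decorated.predict({"x": 3})
```

Because the decorator implements the same interface as the model it wraps, callers can use `decorated` anywhere a `Model` is expected, and decorators can be stacked on top of each other.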
