1. 10 Ways to Deploy a Machine Learning Model

    Wed 28 October 2020

    In previous blog posts we've seen how it is possible to deploy the same model in ten different ways. The model itself was developed one time and released as a package, which was then used in each deployment. These blog posts started as an exercise in finding new and interesting ways to deploy an ML model, so we decided to write this blog post about some of the things that we've learned along the way. In order to be able to deploy the same model in 10 different ways, we needed to build the model so that it was not incompatible with all the different ways we wanted to deploy it. We also needed to make it easy to install and to make sure that the model published metadata about itself. All of these features of the model became very important once we needed to deploy it into a real software system.

    read more
  2. An Apache Beam ML Model Deployment

    Fri 31 July 2020

    Data processing pipelines are useful for solving a wide range of problems. For example, an Extract, Transform, and Load (ETL) pipeline is a type of data processing pipeline that is used to extract data from one system and save it to another system. Inside of an ETL, the data may be transformed and aggregated into more useful formats. ETL jobs are useful for making the predictions made by a machine learning model available to users or to other systems. The ETL for such an ML model deployment lookslike this: extract features used for prediction from a source system, send the features to the model for prediction, and save the predictions to a destination system. In this blog post we will show how to deploy a machine learning model inside of a data processing pipeline that runs on the Apache Beam framework.

    read more
  3. A ZeroRPC ML Model Deployment

    Mon 04 May 2020

    There are many different ways for two software processes to communicate with each other. When deploying a machine learning model, it's often simpler to isolate the model code inside of its own process. Any code that needs to use the model to make predictions then needs to communicate with the process that is running the model code to make predictions. This approach is easier than embedding the model code in the process that needs the predictions because it saves us the trouble of recreating the model's algorithm in the programming language of the process that needs the predictions. RPC calls are also used widely to connect code that is executing in different processes. In the last few years, the rise in popularity of microservice architectures has also caused the rise in popularity of RPC for integrating systems.

    read more
  4. A Websocket ML Model Deployment

    Sat 04 April 2020

    In the world of web applications, the ability to create responsive and interactive experiences is limited when we do normal request-response requests against a REST API. In the request-response programming paradigm, requests are always initiated by the client system and fulfilled by the server and continuously sending and receiving data is not supported. To fix this problem, the Websocket standard was created. Websockets allow a client and service to exchange data in a bidirectional, full-duplex connection which stays open for a long period of time. This approach offers much higher efficiency in the communication between the server and client. Just like a normal HTTP connection, Websockets work in ports 80 and 443 and support proxies and load balancers. Websockets also allow the server to send data to the client without having first received a request from the client which helps us to build more interactive applications.

    read more
  5. A MapReduce ML Model Deployment

    Sun 23 February 2020

    Because of the growing need to process large amounts of data across many computers, the Hadoop project was started in 2006. Hadoop is a set of software components that help to solve large scale data processing problems using clusters of computers. Hadoop supports mass data storage through the HDFS component and large scale data processing through the MapReduce component. Hadoop clusters have become a central part of the infrastructure of many companies because of their usefulness. In this blog post, we'll focus on the MapReduce component of Hadoop since we will be deploying a machine learning model, which is a compute-intensive process. MapReduce is a programming framework for data processing which is useful for processing large amounts of distributed data. MapReduce is able to handle errors and failures in the computation. MapReduce is also inherently parallel in nature but abstracts out that fact, making the code look like single-process code.

    read more

« Page 3 / 5 »