This blog post builds on the ideas started in three previous blog posts.

In this blog post I'll show how to deploy the same ML model that I deployed as a batch job in this blog post, as a task queue in this blog post, and inside an AWS Lambda in this blog post.

The code in this blog post can be found in this GitHub repo.

Introduction

In general, when a client communicates with a software service, two patterns are available: synchronous and asynchronous communication. With synchronous communication, the client sends a message to the service and is blocked until the operation is done and the result is returned. With asynchronous communication, the service receives the message and processes it without blocking the sender. We've already seen an asynchronous deployment for a machine learning model in a previous blog post. In this blog post, we'll show a similar type of deployment that is useful in different situations. We'll be focusing on deploying an ML model as part of a stream processing system.
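To make the difference concrete, here is a minimal sketch using only the Python standard library. The `predict` function is a hypothetical stand-in for a slow service operation, and the thread pool is just a convenient way to show a non-blocking call; it is not how the stream processor in this post works.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def predict(x: float) -> float:
    """Hypothetical slow service operation (not part of the real project)."""
    time.sleep(0.05)
    return x * 2


# Synchronous: the caller is blocked until the result comes back.
sync_result = predict(21)

# Asynchronous: submit() returns a Future right away, so the caller can
# keep working and collect the result later.
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(predict, 21)
    caller_kept_running = True  # stands in for other work done in the meantime
    async_result = future.result()
```

Both calls produce the same value; the only difference is whether the caller has to wait while the work is being done.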

Stream processing is a data processing paradigm that treats a dataset as an unending stream of ordered records. A stream processor works by receiving a record from a data stream, processing it, and putting it in another data stream. This approach is different from batch processing, in which a process sees a data set as a batch of records that are processed together in one processing run. Stream processing is inherently asynchronous, since a producer of records does not have to coordinate with the process that consumes the records.
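The receive, process, and forward loop described above can be sketched with two in-memory queues standing in for the data streams. This is only an illustration: `asyncio.Queue` is not a message broker, the `None` sentinel exists only so the demo can end, and the names `stream_processor`, `source`, and `sink` are made up for this example.

```python
import asyncio
from typing import Callable, Optional


async def stream_processor(source: asyncio.Queue, sink: asyncio.Queue,
                           process: Callable[[dict], dict]) -> None:
    """Read records one at a time, process each, and forward the result."""
    while True:
        record: Optional[dict] = await source.get()
        if record is None:  # demo-only sentinel; a real stream is unending
            break
        await sink.put(process(record))


async def main() -> list:
    source, sink = asyncio.Queue(), asyncio.Queue()
    # The producer writes records without coordinating with the consumer.
    for value in (1, 2, 3):
        await source.put({"value": value})
    await source.put(None)
    await stream_processor(source, sink, lambda r: {"value": r["value"] * 10})
    # Drain the output stream so the results can be inspected.
    return [sink.get_nowait() for _ in range(sink.qsize())]


results = asyncio.run(main())
```

Each record is handled on its own as it arrives, which is what distinguishes this loop from a batch job that loads the whole data set before processing it.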

In order for a stream processor to receive messages from producers, a message broker is often used. In this case, the message broker acts as middleware that enables producers and consumers to communicate without being explicitly aware of each other. The message broker allows the system to be more decoupled than in other types of software architectures.

In a previous blog post, we used Redis as a message broker to deploy a model inside a task queue. The main difference in this blog post is the lack of a result backend: we are not going to store the results of a prediction in a result store for later retrieval. Instead, the ML model stream processor we'll build will pick up the data used for prediction from the message broker and put the resulting predictions back into the message broker. Instead of Redis, we'll be using Kafka as the message broker.

Software Architecture

The model stream processor application we will build will communicate with other software components through topics on a message broker. A topic is a channel of communication that exists in a message broker. A software service can "produce" messages to a topic and also "consume" messages from a topic. Each model will need three topics for its own use: an input topic from which it will receive data used to make predictions, an output topic to which it will write the prediction results, and an error topic to which it will write any input messages that caused an error to occur. The error topic is essentially an invalid message channel for the model.
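As a rough sketch of how the three topics fit together, the function below decides which topic a processed message should go to. The topic naming scheme, the `topic_names` and `route` helpers, and the use of JSON for message bodies are all assumptions made for illustration; the real application may name its topics and encode its messages differently.

```python
import json
from typing import Callable, Tuple


def topic_names(model_name: str) -> dict:
    """Build the three topic names for a model (hypothetical naming scheme)."""
    return {
        "input": f"{model_name}.inputs",
        "output": f"{model_name}.outputs",
        "error": f"{model_name}.errors",
    }


def route(raw: bytes, predict: Callable[[dict], dict], topics: dict) -> Tuple[str, bytes]:
    """Return the destination topic and payload for one input message."""
    try:
        data = json.loads(raw.decode("utf-8"))
        prediction = predict(data)
        return topics["output"], json.dumps(prediction).encode("utf-8")
    except Exception:
        # Anything that fails to parse or predict is forwarded unchanged to
        # the error topic, which acts as the invalid message channel.
        return topics["error"], raw
```

With a toy `predict` function, a valid message ends up on the output topic and a malformed one on the error topic, so a bad input never stops the processor from handling the next record.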

Kafka for Stream Processing

To show how to deploy an ML model as a stream processor, we'll be using Kafka as the message broker service. Over the last few years, Kafka has become an important tool for doing stream processing because of its high performance and rich tool ecosystem.

To connect to Kafka from Python, we'll use the aiokafka library. This library can be used to produce and consume messages on Kafka, as well as to perform other operations. The aiokafka library is built on asyncio, a module in Python's standard library that helps to write concurrent code that performs IO-bound operations in a more performant manner. The async/await syntax will appear in the code of this blog post, but I won't go out of my way to explain it, since there are many better places to learn about this programming paradigm.
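For readers who have not used aiokafka before, here is a minimal sketch of a consume-and-produce loop. It is not the code from this project: the `handle` and `relay` functions, the topic names, the group id, and the broker address are placeholders, and running `relay` requires the aiokafka package and a reachable Kafka broker.

```python
import json


def handle(raw: bytes) -> bytes:
    """Placeholder processing step: parse the JSON body and echo it back."""
    record = json.loads(raw.decode("utf-8"))
    return json.dumps({"echo": record}).encode("utf-8")


async def relay(input_topic: str, output_topic: str, bootstrap_servers: str) -> None:
    """Consume messages from one topic and produce results to another."""
    # Imported here so the helper above can be used without aiokafka installed.
    from aiokafka import AIOKafkaConsumer, AIOKafkaProducer

    consumer = AIOKafkaConsumer(
        input_topic,
        bootstrap_servers=bootstrap_servers,
        group_id="example_group",  # placeholder consumer group
    )
    producer = AIOKafkaProducer(bootstrap_servers=bootstrap_servers)
    await consumer.start()
    await producer.start()
    try:
        # The consumer is an async iterator that yields messages as they arrive.
        async for message in consumer:
            await producer.send_and_wait(output_topic, handle(message.value))
    finally:
        # Always close the connections, even if processing fails.
        await consumer.stop()
        await producer.stop()
```

Assuming a broker is listening on `localhost:9092`, the loop could be started with `asyncio.run(relay("example_input", "example_output", "localhost:9092"))`.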

Package Structure

-   model_stream_processor
    -   __init__.py
    -   app.py (application code)
    -   config.py (configuration for the application)
    -   ml_model_stream_processor.py (MLModel stream processor class)
    -   model_manager.py (model manager singleton class)
-   scripts
    -   create_topics.py (script for automating topic creation)
    -   receive_messages.py (script for receiving messages from a topic)
    -   send_messages.py (script for sending messages to a topic)
-   tests (unit test suite)
-   Makefile
-   README.md
-   docker-compose.yml
-   requirements.txt
-   setup.py
-   test_requirements.txt

This structure can be seen in the GitHub repository.

MLModelStreamProcessor Class

To have an MLModel that receives data from and sends data to Kafka topics, we'll write a class that wraps around an MLModel instance. The class will take care of finding and connecting to Kafka brokers, serializing and deserializing the messages from Kafka, and detecting errors.

We'll start by creating the class:

class MLModelStreamProcessor(object):
    """Processor class for MLModel stream processors."""