# `slingshot-microservice`: A Rust framework for standard microservice design

`slingshot-microservice` is a Rust package that provides a simple, opinionated
framework for building microservices. The framework makes the following
assumptions about a microservice:

1. A microservice listens for incoming requests on its own single, dedicated
   queue (RabbitMQ).
2. Incoming requests are in the form of a 64-bit unsigned integer (`u64`).
3. Microservices process requests via a `process` function, which takes four
   arguments: the incoming request (`u64`), a `read_file` function, a
   `write_file` function, and a database ORM `connection`.
4. All microservices must communicate with the shared PostgreSQL database via
   an ORM connection passed into `process`.
   - Rust microservices use `diesel::pg::PgConnection`.
   - Python microservices use `sqlalchemy.engine.base.Connection`.
5. The `process` function returns a set of IDs (also `u64`) that are the result
   of processing the incoming request. Each of these IDs is also associated
   with a "case variable" that is used for routing the result to the
   appropriate outbound queues. Case variables for routing must be one of:
   boolean, integer, or string.
6. Rather than hard-coding the inbound and outbound queues, the
   microservice communicates with a self-contained configuration service shared
   across all microservices.
   - This service provides the inbound queue name, as well as any outbound
     queues and their corresponding case variables.
   - It is also responsible for providing the RabbitMQ connection details
     (host, port, username, password), and the object-storage host plus GNU
     `pass` references for the S3 access key and secret key.

The `slingshot-microservice` framework handles setting up the RabbitMQ
connection, listening to the inbound queue, and routing results based on case
variables.

## Adding The Framework To Your Project

Add `slingshot-microservice` to your `Cargo.toml` dependencies directly from Codeberg:

```toml
[dependencies]
slingshot-microservice = { git = "https://codeberg.org/seanhly/slingshot-microservice" }
```

Then fetch and build dependencies:

```bash
cargo build
```

## Python Usage

`slingshot-microservice` ships Python bindings built with
[PyO3](https://pyo3.rs) and [maturin](https://www.maturin.rs). Pre-built
ABI3 wheels work on Python ≥ 3.8 without requiring Rust locally.

### Installing

**From PyPI** (once published):

```bash
pip install slingshot-microservice
```

**From git** (Rust toolchain required):

```bash
pip install git+https://codeberg.org/seanhly/slingshot-microservice
```

**From a local clone** (for development):

```bash
pip install maturin
pip install -e .
```

### Usage

```python
# Postponed annotation evaluation keeps `tuple[...]` and `X | Y` valid on Python 3.8/3.9.
from __future__ import annotations

from typing import Generator
from sqlalchemy.engine.base import Connection

from slingshot_microservice.typing import ReadFileFn, WriteFileFn
from slingshot_microservice import Microservice


def process(
    request: int,
    read_file: ReadFileFn,
    write_file: WriteFileFn,
    connection: Connection,
) -> Generator[tuple[int, bool | int | str], None, None]:
    # Read the input object for this request via the "in" bucket reference.
    reader = read_file("in", request)
    input_data = reader.read().decode()

    # Stage the output for the "out" bucket reference.
    writer = write_file("out", request)
    writer.write(f"Hello {input_data}".encode())

    # Yield the result ID together with its case variable for routing.
    yield (request, True)


microservice = Microservice("simple-py-microservice", "sys-map.slingshot.cv", process)
microservice.start()
```

### Type Annotations

`slingshot_microservice.typing` exports `Protocol`-based types for use in
editors and type-checkers:

| Symbol | Description |
|---|---|
| `ReadFileFn` | Type of the `read_file` callable; `read_file(key, id)` returns an object that behaves like `BinaryIO` |
| `WriteFileFn` | Type of the `write_file` callable; `write_file(key, id)` returns an object that behaves like `BinaryIO` |
| `ProcessFn` | The generator signature expected by `Microservice`: `(request, read_file, write_file, connection)` |
| `CaseVariable` | `bool \| int \| str`; the valid case variable types |

### Publishing Wheels

Build and upload to PyPI using maturin:

```bash
pip install maturin
maturin publish
```

For CI/cross-compilation (Linux, macOS, Windows), use
[maturin-action](https://github.com/PyO3/maturin-action) in GitHub/Codeberg
Actions. Because the extension is compiled with ABI3 (`abi3-py38`), a single
Linux wheel covers all CPython versions ≥ 3.8.

## Example Usage

```rust
use slingshot_microservice::Microservice;
use diesel::pg::PgConnection;
use slingshot_microservice::{AnyError, ReadFileFn, WriteFileFn};
use std::io::{Read, Write};

fn process(
    request: u64,
    read_file: &ReadFileFn,
    write_file: &WriteFileFn,
    connection: &mut PgConnection,
) -> Result<Vec<(u64, String)>, AnyError> {
    // Read the input object for this request via the "in" bucket reference
    let mut input = String::new();
    let mut reader = read_file("in", request)?;
    reader.read_to_string(&mut input)?;

    // Stage the output for the "out" bucket reference
    let mut writer = write_file("out", request)?;
    writer.write_all(input.as_bytes())?;

    // Return the result ID together with its case variable for routing
    Ok(vec![(request, "case_a".to_string())])
}

fn main() {
    // Create a new microservice instance with the processing function
    let microservice = Microservice::new(
        "simple-microservice",
        "sys-map.example.com",
        process
    );

    // Start the microservice (this will block and listen for incoming requests)
    microservice.start();
}
```

## How It Works

The configuration service responds to requests of the form
`https://{HOSTNAME}/{MICROSERVICE_NAME}`. All configuration is done over HTTP
GET. The response contains a JSON object with two fields: an inbound queue name
(`in`) and a list (`out`) that maps case variables to outbound queue names. For
example:

```json
{
  "in": "simple-microservice-inbound",
  "out": [
    {
      "case": "case_a",
      "queues": ["case_a_outbound_1", "case_a_outbound_2"]
    },
    {
      "case": "case_b",
      "queues": ["case_b_outbound"]
    }
  ]
}
```

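
One way a client could turn the `out` list into a lookup table is sketched below. This is an illustration only, not the framework's internal code: `build_routes` and `route_key` are hypothetical names. The sketch keys the table on both the type and the value of the case variable, because in Python `True == 1`, so a plain `dict` would otherwise merge a boolean case with an integer case.

```python
import json

# Example response, copied from the configuration shown above.
CONFIG_JSON = """
{
  "in": "simple-microservice-inbound",
  "out": [
    {"case": "case_a", "queues": ["case_a_outbound_1", "case_a_outbound_2"]},
    {"case": "case_b", "queues": ["case_b_outbound"]}
  ]
}
"""


def route_key(case):
    """Hypothetical helper: build a dict key that keeps True and 1 distinct."""
    return (type(case).__name__, case)


def build_routes(config):
    """Hypothetical helper: map each case variable to its outbound queues."""
    routes = {}
    for entry in config["out"]:
        routes.setdefault(route_key(entry["case"]), []).extend(entry["queues"])
    return routes


config = json.loads(CONFIG_JSON)
routes = build_routes(config)
print(config["in"])
print(routes[route_key("case_a")])
```

If a service only ever uses string case variables, a plain `dict` keyed by the string would be enough.
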
The case variables used for routing can be one of: string, integer, or boolean.
For example, a binary classification microservice might choose the outbound
queue for each result based on a case variable that is either `false` or `true`:

```json
{
  "in": "binary-classification-inbound",
  "out": [
    {
      "case": false,
      "queues": ["binary-classification-false-outbound"]
    },
    {
      "case": true,
      "queues": ["binary-classification-true-outbound"]
    }
  ]
}
```

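
A `process` function for such a classifier might look like the sketch below. It is a stand-in that runs without the framework: the even/odd rule and the `is_positive` helper are invented purely for illustration, and the unused `read_file`, `write_file`, and `connection` arguments are passed as `None`.

```python
from typing import Iterator, Tuple


def is_positive(request: int) -> bool:
    """Placeholder classifier (an assumption): even IDs count as positive."""
    return request % 2 == 0


def process(request: int, read_file, write_file, connection) -> Iterator[Tuple[int, bool]]:
    # A real service would read its input via read_file and query via connection.
    # The yielded boolean is the case variable that selects the outbound queue.
    yield (request, is_positive(request))


# Exercise the generator directly, without RabbitMQ or object storage.
results = list(process(42, None, None, None)) + list(process(7, None, None, None))
print(results)
```
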
The configuration service also provides the RabbitMQ connection details (host,
port, etc.).

Object storage credentials are fetched separately from
`https://sys-map.slingshot.cv/object-storage`. The access-key and secret-key
values returned there are GNU `pass` entry names, so the runtime resolves the
actual secrets with `pass show <key>` before constructing the S3 client.

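
The secret lookup can be sketched as follows. This is not the framework's code: `resolve_secret` is a hypothetical helper, the entry name is made up, and the sketch assumes the usual `pass` convention that the first line of an entry is the secret itself. The command runner is a parameter so the function can be tried without a password store.

```python
import subprocess


def resolve_secret(entry_name, run=subprocess.run):
    """Hypothetical helper: return the secret stored under a `pass` entry."""
    completed = run(
        ["pass", "show", entry_name],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if `pass` exits non-zero
    )
    # Assumption: the first line of the entry holds the secret.
    lines = completed.stdout.splitlines()
    if not lines:
        raise ValueError(f"pass entry {entry_name!r} is empty")
    return lines[0]


def fake_run(cmd, **kwargs):
    """Stand-in for subprocess.run so the example works without `pass`."""
    return subprocess.CompletedProcess(cmd, 0, stdout="s3cr3t\nextra notes\n", stderr="")


secret = resolve_secret("example/s3-secret-key", run=fake_run)
print(len(secret))
```
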
When the microservice first starts up, it makes a request to the configuration
service to get the queue metadata. Then it starts to listen to the inbound
queue. Inbound requests are processed by the user-programmed `process`
function, which is called with `(request, read_file, write_file, connection)`
and returns a set of tuples of the form `(result_id, case_variable)`.

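
The startup request can be sketched with the standard library alone. `config_url`, `fetch_config`, and `fake_opener` are hypothetical helpers, not part of the framework's API; the sketch assumes only the URL shape and JSON fields described in this section. The HTTP opener is injectable so the function can be exercised offline.

```python
import io
import json
from urllib.parse import quote
from urllib.request import urlopen


def config_url(hostname, microservice_name):
    """Build https://{HOSTNAME}/{MICROSERVICE_NAME} (hypothetical helper)."""
    return f"https://{hostname}/{quote(microservice_name, safe='')}"


def fetch_config(hostname, microservice_name, opener=urlopen):
    """GET the queue metadata and decode the JSON body."""
    with opener(config_url(hostname, microservice_name)) as response:
        return json.load(response)


def fake_opener(url):
    """Offline stand-in for urlopen that returns a canned response."""
    body = {"in": "simple-microservice-inbound", "out": []}
    return io.BytesIO(json.dumps(body).encode())


url = config_url("sys-map.example.com", "simple-microservice")
config = fetch_config("sys-map.example.com", "simple-microservice", opener=fake_opener)
print(url)
print(config["in"])
```
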
Within each `process` pass:

1. `read_file(key, id)` treats `key` as a bucket reference such as `in`, not
   as the canonical bucket name. On first use, the runtime fetches
   `https://{HOSTNAME}/{MICROSERVICE_NAME}/{key}` to resolve the real bucket
   name, caches that mapping, and then returns a synchronous reader for object
   `id` in that bucket using the AWS SDK.
2. `write_file(key, id)` resolves `key` through the same cached lookup and
   returns an opened local file handle for writing, staging the output for
   `s3://{resolved_bucket}/{id}`.
3. `connection` is an ORM-backed PostgreSQL connection passed into `process`
   (`diesel::pg::PgConnection` in Rust, `sqlalchemy.engine.base.Connection`
   in Python).
4. After `process` returns, opened files are closed.
5. Then staged write files are uploaded to S3 with the AWS SDK, local staged
   files are deleted, and local temporary directories are removed.
6. Only after file finalization is complete are output IDs published to
   outbound queues.

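
The ordering in the steps above can be modelled in a few lines. This is a rough sketch under stated assumptions, not the framework's implementation: `FileSession`, `lookup`, and `upload` are hypothetical, the bucket lookup and the S3 upload are replaced by plain callables, and only the write path is modelled.

```python
import os
import tempfile


class FileSession:
    """Hypothetical model of one `process` pass: cached lookups, staged writes."""

    def __init__(self, lookup_bucket, upload):
        self._lookup_bucket = lookup_bucket  # key -> real bucket name
        self._upload = upload                # (bucket, object_id, local_path)
        self._buckets = {}                   # cache of resolved bucket names
        self._tmpdir = tempfile.mkdtemp()
        self._staged = []                    # (handle, bucket, object_id, path)

    def resolve(self, key):
        # Only the first use of a key triggers a lookup; later calls hit the cache.
        if key not in self._buckets:
            self._buckets[key] = self._lookup_bucket(key)
        return self._buckets[key]

    def write_file(self, key, object_id):
        bucket = self.resolve(key)
        path = os.path.join(self._tmpdir, f"{len(self._staged)}-{object_id}")
        handle = open(path, "wb")
        self._staged.append((handle, bucket, object_id, path))
        return handle

    def finalize(self):
        # Step 4: close every opened file.
        for handle, _, _, _ in self._staged:
            handle.close()
        # Step 5: upload each staged file, delete it, then remove the temp directory.
        for _, bucket, object_id, path in self._staged:
            self._upload(bucket, object_id, path)
            os.remove(path)
        os.rmdir(self._tmpdir)


lookups, uploads, published = [], [], []


def lookup(key):
    lookups.append(key)
    return f"bucket-for-{key}"


def upload(bucket, object_id, path):
    with open(path, "rb") as staged:
        uploads.append((bucket, object_id, staged.read()))


session = FileSession(lookup_bucket=lookup, upload=upload)
session.write_file("out", 1).write(b"first")
session.write_file("out", 2).write(b"second")
session.finalize()
published.append(1)  # Step 6: publish IDs only after finalization.
print(lookups, uploads)
```

Staging writes locally and uploading only after `process` returns means no result ID reaches an outbound queue before its output object has been uploaded.
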
The output queue routing step looks like this:

Pseudocode:
```
for each (result_id, case_variable) in process(request, read_file, write_file, connection):
    for each outbound_queue in config.out[case_variable]:
        send result_id to outbound_queue
```
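
A runnable Python rendering of that loop might look like the sketch below. `route_results` and `publish` are hypothetical names, the routing table is written by hand with string cases only, and skipping results whose case variable has no configured queue is an assumption, since the behavior for unmatched cases is not specified here.

```python
def route_results(results, routes, publish):
    """Send each result ID to every queue configured for its case variable."""
    for result_id, case_variable in results:
        # Assumption: a case with no configured queues is skipped silently.
        for queue in routes.get(case_variable, []):
            publish(queue, result_id)


routes = {
    "case_a": ["case_a_outbound_1", "case_a_outbound_2"],
    "case_b": ["case_b_outbound"],
}
sent = []
route_results(
    [(10, "case_a"), (11, "case_b"), (12, "case_c")],
    routes,
    lambda queue, result_id: sent.append((queue, result_id)),
)
print(sent)
```
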