DearDiary.jl: A lightweight but powerful machine learning experiment tracking tool for Julia

#mlops #tracking #workflow #restapi

After months of planning and a few weeks of development, DearDiary.jl is finally here and ready to use! It is a tool for tracking machine learning experiments in Julia that aims to be lightweight, easy to use, and flexible enough to adapt to different workflows.

Motivation

I had an unpleasant experience maintaining Julia interfaces (MLFlowClient.jl, MLJFlow.jl) to the REST API of Python's MLflow. The API turned out to be poorly documented and incomplete, with some abandoned or partially implemented features (and new ones are still being added...). That gave me an idea: why not write the same kind of API, well designed and well documented, in Julia? This package is that idea.

Core concepts

Architecture-first

Many Julia packages follow a monolithic architecture. My goal was instead to build something that can be easily maintained and extended over time, with a focus on developer experience and code readability (inspired by Alan Edelman's TED talk and the "micro-package" architecture of MLJ.jl).

DearDiary.jl is built on an N-layered architecture, which lets us encapsulate different functionality in separate layers for better collaboration and separation of concerns.
It currently consists of the following layers:

Repository layer: responsible for data storage and retrieval.
Service layer: handles package logic and data processing.
Route layer: manages RESTful API endpoints and HTTP requests.

A frontend layer is planned for the future.
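
To make the layering concrete, here is a minimal sketch of how a repository, a service, and a route layer can be stacked in plain Julia. Every name in it (Experiment, InMemoryRepository, insert_experiment!, create_experiment, handle_create) is hypothetical and chosen only for illustration; it is not DearDiary.jl's actual API, and an in-memory Dict stands in for a real database.

```julia
# Hypothetical sketch of a three-layer design; none of these names come from DearDiary.jl.

# A simple, flat, immutable record.
struct Experiment
    id::Int
    name::String
end

# --- Repository layer: storage and retrieval (in-memory stand-in for a database) ---
struct InMemoryRepository
    experiments::Dict{Int,Experiment}
end
InMemoryRepository() = InMemoryRepository(Dict{Int,Experiment}())

function insert_experiment!(repo::InMemoryRepository, name::AbstractString)
    id = length(repo.experiments) + 1
    repo.experiments[id] = Experiment(id, String(name))
    return id
end

fetch_experiment(repo::InMemoryRepository, id::Integer) = get(repo.experiments, id, nothing)

# --- Service layer: validation and package logic ---
function create_experiment(repo::InMemoryRepository, name::AbstractString)
    cleaned = strip(name)
    isempty(cleaned) && return nothing  # reject blank names
    return insert_experiment!(repo, cleaned)
end

# --- Route layer: turn request parameters into a service call and a status code ---
function handle_create(repo::InMemoryRepository, params::Dict{String,String})
    id = create_experiment(repo, get(params, "name", ""))
    return id === nothing ? (status = 400, id = nothing) : (status = 201, id = id)
end

# Usage: each layer only talks to the one directly below it.
repo = InMemoryRepository()
ok = handle_create(repo, Dict("name" => "baseline"))
bad = handle_create(repo, Dict("name" => "   "))
```

Because the route layer never touches storage directly, the Dict could be swapped for a database-backed repository without changing the service or route code.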

Simple types

One of the problems I found while working on the MLflow integration project was the overuse of complex types. Imagine a type that has a field that is another type, with a field that is yet another type, which finally has a field holding an integer. Well, that's real, and you can find it if you are curious enough.
DearDiary.jl tries to avoid that by keeping types simple, flat, fully immutable, and as clear as possible. Never go looking for complexity when you don't need it.
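
As a rough illustration of the difference, the sketch below contrasts a deeply nested type with a flat, immutable one. The type names (Inner, Middle, Outer, Iteration) and their fields are made up for this example and are not DearDiary.jl's actual definitions.

```julia
# Hypothetical types for illustration only; not DearDiary.jl's actual definitions.

# Deeply nested: reaching the integer means walking through three levels.
struct Inner
    value::Int
end
struct Middle
    inner::Inner
end
struct Outer
    middle::Middle
end

# Flat and immutable: every field is a plain value, visible at a glance.
struct Iteration
    id::Int
    experiment_id::Int
    notes::String
end

nested = Outer(Middle(Inner(42)))
flat = Iteration(1, 7, "first run")

nested.middle.inner.value  # three hops to reach one integer
flat.experiment_id         # one hop

# Fields of an immutable struct cannot be reassigned;
# "updating" a record means constructing a new value.
updated = Iteration(flat.id, flat.experiment_id, "first run, revised")
```

Flat records like this map directly onto database rows and JSON objects, which keeps the code that moves data between layers short.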

Flexible by design

DearDiary.jl is flexible enough to adapt to different workflows. You can use it as a standalone package, integrate it with other tools in your ML pipeline, or call it from the "outside world" through its RESTful API.
If something is not implemented the way you want, you can always modify or extend it, thanks to its modularity.

Portability

DearDiary.jl is designed to be portable. You can run it locally, on a server, or in the cloud. Thanks to SQLite as the default storage backend, you can easily move your projects between different environments without worrying about compatibility issues.

Note: one of the main goals for the next releases is to support more storage backends, such as other SQL and NoSQL databases and cloud storage solutions.

Getting started

A tutorial is available in the documentation. It covers installation and an example workflow with MLJ.jl.

Contributing

Contributions are welcome! If you find a bug or have a feature request, please open an issue on the GitHub repository. Pull requests are also encouraged. Please make sure to follow the existing code style and include tests for any new features.
