
OpenTelemetry at mishmash io

  • Here you can find open source code that we developed because we found it useful in our own work. We share it with the open source community because we believe it might be useful for your own software development effort too.

    The projects here are not directly related to our distributed database. See our integrations section for open source that you can use along with mishmash io.

OpenTelemetry is a collection of tools to instrument, generate and export telemetry data - metrics, logs and traces.

Collecting exported telemetry enables deep observability of software systems and at mishmash io we rely heavily on OpenTelemetry when testing our own code or when monitoring production run-times. It is an invaluable source of information on how our code works.

About

Over time we've developed additional tools to help us collect, process and visualize telemetry data.

On these pages we share our OpenTelemetry-related tools and their source code along with a few ideas on how you can use them in your software development process.

Scenarios

The following are a few quick examples of how you can use OpenTelemetry and our open source tools.

They are meant to give you an overview before you go deeper into each individual topic.

First steps

If you've never used OpenTelemetry before, begin by collecting some data and getting a feel for what's in it.

The easiest way to get started is to take an existing app, such as one you're currently working on, instrument it with an OpenTelemetry agent (or plugin) that exports signals (logs, metrics and traces), and collect these signals with our simple stand-alone Parquet server.

Zero changes to your code

OpenTelemetry supports auto-instrumentation - agents (or other tools) that can attach themselves to a running program and automatically collect telemetry as code is executed. These agents can also 'understand' when your app uses popular frameworks, such as logging libraries or REST clients and servers, without you having to change anything in your app code.

Jump to how to auto-instrument Java, Python and JavaScript. For other languages, go to the OpenTelemetry documentation.
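As a rough illustration of what "zero changes" can look like in practice, here is a minimal sketch that prepares the standard OpenTelemetry environment variables and the launch command for an auto-instrumented app. The endpoint `http://localhost:4317` (the conventional OTLP gRPC port), the function names, and the file names are assumptions made for this example; check the documentation of the server you run for its actual address.

```python
import os


def otlp_export_env(service_name, endpoint="http://localhost:4317"):
    """Return a copy of the current environment with OTLP export settings added.

    The endpoint default is an assumption: 4317 is the conventional OTLP gRPC port.
    """
    env = dict(os.environ)
    env.update(
        {
            "OTEL_SERVICE_NAME": service_name,
            "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
            "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
            # Export all three signals over OTLP.
            "OTEL_TRACES_EXPORTER": "otlp",
            "OTEL_METRICS_EXPORTER": "otlp",
            "OTEL_LOGS_EXPORTER": "otlp",
        }
    )
    return env


def python_command(script):
    """Command line that runs a Python script under the OpenTelemetry auto-instrumentation wrapper."""
    return ["opentelemetry-instrument", "python", script]


def java_command(agent_jar, app_jar):
    """Command line that attaches the OpenTelemetry Java agent to a JVM app."""
    return ["java", f"-javaagent:{agent_jar}", "-jar", app_jar]
```

The app itself stays untouched: you would pass one of these commands together with the prepared environment to something like `subprocess.run(command, env=env)`.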

Telemetry from your app is exported (via a network protocol called 'OTLP') and then saved to Apache Parquet files by our simple stand-alone server.
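To make the step from OTLP data to columnar files more concrete, here is a minimal sketch that flattens a trace payload in OTLP's JSON encoding into flat rows of the kind a columnar file could hold. The output column names (`service_name`, `duration_ns` and so on) are hypothetical and chosen for this example only; they are not the schema our server writes.

```python
def flatten_spans(payload):
    """Flatten an OTLP/JSON trace payload into a list of flat row dicts."""
    rows = []
    for resource_spans in payload.get("resourceSpans", []):
        # Attributes arrive as [{"key": ..., "value": {"stringValue": ...}}, ...].
        resource_attrs = {
            kv["key"]: next(iter(kv["value"].values()), None)
            for kv in resource_spans.get("resource", {}).get("attributes", [])
        }
        for scope_spans in resource_spans.get("scopeSpans", []):
            scope_name = scope_spans.get("scope", {}).get("name")
            for span in scope_spans.get("spans", []):
                # OTLP/JSON encodes 64-bit timestamps as strings.
                start = int(span["startTimeUnixNano"])
                end = int(span["endTimeUnixNano"])
                rows.append(
                    {
                        "service_name": resource_attrs.get("service.name"),
                        "scope_name": scope_name,
                        "trace_id": span["traceId"],
                        "span_id": span["spanId"],
                        "name": span["name"],
                        "start_ns": start,
                        "duration_ns": end - start,
                    }
                )
    return rows


# A tiny made-up payload, for illustration only.
sample = {
    "resourceSpans": [
        {
            "resource": {
                "attributes": [{"key": "service.name", "value": {"stringValue": "demo-app"}}]
            },
            "scopeSpans": [
                {
                    "scope": {"name": "demo.instrumentation"},
                    "spans": [
                        {
                            "traceId": "5b8efff798038103d269b633813fc60c",
                            "spanId": "eee19b7ec3c1b174",
                            "name": "GET /items",
                            "startTimeUnixNano": "1700000000000000000",
                            "endTimeUnixNano": "1700000000250000000",
                        }
                    ],
                }
            ],
        }
    ]
}

rows = flatten_spans(sample)
```

Each row carries the resource and scope context alongside the span itself, which is what makes the data easy to filter and aggregate later.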

Once you've collected logs, metrics and traces files, exploring their contents is easy with our repackaged Apache Drill and Apache Superset container images. We have extended the original images with telemetry-related functionality such as UDFs, data source queries, charts and dashboards.
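As one possible way to explore such files programmatically, here is a minimal sketch that prepares a SQL query for Apache Drill's REST API. The base URL (Drill's default web port 8047), the file path and the column name `service_name` are assumptions for illustration; adjust them to your own setup and to the actual columns in your files.

```python
import json
import urllib.request


def build_drill_request(sql, base_url="http://localhost:8047"):
    """Build (but do not send) a POST request for Drill's /query.json endpoint."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical path and column name: count spans per service in a traces file.
SPANS_PER_SERVICE = (
    "SELECT service_name, COUNT(*) AS span_count "
    "FROM dfs.`/data/telemetry/traces.parquet` "
    "GROUP BY service_name"
)

request = build_drill_request(SPANS_PER_SERVICE)
```

With a Drill instance running, `urllib.request.urlopen(request)` would send the query and return the result set as JSON.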

What are the OpenTelemetry signals exactly?

On this page we're not going to document OpenTelemetry itself. Our focus is to give you an idea of how we use it and which OpenTelemetry-related tools we share as open source.

If you would like to understand the individual OpenTelemetry signals, though, here are a few links to the official docs: Logs, Metrics and Traces.

Collecting telemetry from unit tests

Running unit tests on your code changes is a great way to ensure you're not breaking existing functionality, and so helps keep your new version stable. At mishmash io we like to go a bit further - by also ensuring there's no degradation in performance or observability, both of which are very important in production environments.

Such issues might be tricky to detect, especially in a distributed system - where you can't be sure that changes made to one component won't negatively affect another component running in a separate process, container or server.

OpenTelemetry, with its metrics, logs and spans signals, is a great tool for collecting the data you need to make sure your new code does not break your production-level standards, even across processes, containers and servers. On top of that, you can also monitor your tests over time and get a better understanding of the direction your development is going in.
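To illustrate the kind of check this enables, here is a minimal sketch that compares span durations collected during a test run against a baseline and flags operations that became noticeably slower. The function name, the 20% tolerance and the sample numbers are assumptions for this example, not a description of our tooling.

```python
from statistics import median


def find_regressions(baseline, current, tolerance=0.2):
    """Return spans whose median duration grew by more than `tolerance`.

    Both arguments map a span name to a list of durations in milliseconds.
    The result maps each regressed span name to (baseline median, current median).
    """
    regressions = {}
    for name, durations in current.items():
        if not durations or not baseline.get(name):
            continue  # nothing to compare against
        before = median(baseline[name])
        after = median(durations)
        if before > 0 and after > before * (1 + tolerance):
            regressions[name] = (before, after)
    return regressions


# Made-up durations from two test runs, for illustration only.
baseline_run = {"db.query": [10, 12, 11], "http.get": [50, 52, 51]}
current_run = {"db.query": [11, 12, 10], "http.get": [70, 75, 72]}

slower = find_regressions(baseline_run, current_run)
# slower == {"http.get": (51, 72)}
```

Using the median rather than the mean keeps a single slow outlier in a test run from triggering a false alarm.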

At mishmash io we've combined telemetry collection with another great open source tool - Testcontainers, which allows running your tests along with all the remote services they need.

Continuous Telemetry for teams

Collecting telemetry on development laptops might help each individual developer, but when teams employ Continuous Integration/Continuous Delivery practices, storing data over time and sharing it among all developers becomes more and more important.
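For telemetry from many CI runs to stay comparable over time, each run needs to be identifiable. One common way to do that is the standard `OTEL_RESOURCE_ATTRIBUTES` environment variable, which attaches key-value pairs to everything a process exports. Here is a minimal sketch that builds its value; `service.version` is a standard attribute name, while `ci.build.id` and `vcs.branch` are hypothetical names picked for this example.

```python
from urllib.parse import quote


def resource_attributes(attrs):
    """Format a dict as an OTEL_RESOURCE_ATTRIBUTES value (key1=value1,key2=value2).

    Values are percent-encoded so that commas, spaces and similar characters
    cannot break the key-value list.
    """
    return ",".join(f"{key}={quote(str(value), safe='')}" for key, value in attrs.items())


# Hypothetical build metadata, for illustration only.
value = resource_attributes(
    {
        "service.version": "1.4.2",
        "ci.build.id": 17,
        "vcs.branch": "feature/login",
    }
)
# value == "service.version=1.4.2,ci.build.id=17,vcs.branch=feature%2Flogin"
```

A CI job would set this value in the environment of the tests it launches, so that stored telemetry can later be grouped by build or branch.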

What telemetry means for software companies

Company-wide access to telemetry helps with:

  • On-boarding new team members, as they can learn more quickly how your systems work
  • Mapping overall trends in your software's quality and your teams' progress
  • Finding potential areas for improvement
  • Iterating quicker, as telemetry data easily proves or disproves the feasibility of software features

However, one technical challenge with continuous telemetry is the large amount of data that is produced and accumulated over time. Because we also use much of the Apache Big Data stack, we developed tools to ingest, process and explore OpenTelemetry data at scale with various Apache projects.

Examples of our open source tools include embeddable data sources that run within Apache-project clusters, such as one that ingests directly into Apache Druid.

Another is a data source for Apache Pulsar that allows scalable pre-processing and pipelining of the data.


© 2024, Mishmash I O UK Ltd. or its affiliates. All rights reserved.