case study

Adding Observability to with Transaction Monitoring and

My Role
Security & Application Review Enterprise
1 month

Observability in a Bubble app is not easy. Ed and the team at Habitude have built a workflow system on top of Bubble, using custom data types to describe generic workflow processes, which then use Bubble’s easy integration capabilities to connect Enterprise systems. It’s a bit like Zapier on steroids, but it does much more and is targeted at Enterprise admin problems. That means a lot of long-running, business-critical processes with numerous inter-system dependencies: a great example of where observability is vital.

The issues with observability on Bubble are:

  • The built-in Logs have a short retention period - 14 days on a professional plan
  • There is no way to ship Bubble logs off for analysis, event detection and alerting
  • Using Bubble’s inbuilt database for event logging, analysis and alerting is resource intensive - it is entirely possible for the monitoring itself to overload an app that is already under load.

With Bubble, workflows can fail mid-flow for multiple reasons, including reasons you have no sight of or information on. There is no “system log” in Bubble - all you get is “failed - try again”. This is what we had to overcome in the design.

Additionally, Bubble has no concept of transactions or rollback. Workflows aren’t synchronous and sequential.

The design approach we took with Habitude had two parts:

We created a transaction system for critical workflows. A critical workflow logs a “start process” event into a separate Bubble app and, at the end of the process (which could occur asynchronously in another backend workflow), a “completed process” event. Along with these events it logs all the information needed to diagnose and retry the transaction. Within this transaction monitoring app, we detect failed transaction events and send them off to Logzio.
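The detection step can be illustrated outside Bubble with a minimal Python sketch. It is not the Habitude implementation, which lives in Bubble workflows: the `ProcessEvent` shape, the `find_failed_transactions` name and the 30-minute timeout are all assumptions made for illustration. The idea is simply that a “start process” with no matching “completed process” after a deadline is treated as a failed transaction.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ProcessEvent:
    """One row logged into the monitoring app (hypothetical shape)."""
    transaction_id: str
    kind: str                 # "start process" or "completed process"
    at: datetime              # when the event was logged
    context: dict = field(default_factory=dict)  # data needed to diagnose and retry


def find_failed_transactions(events, now, timeout=timedelta(minutes=30)):
    """Return start events that have no completion and are older than `timeout`."""
    completed = {e.transaction_id for e in events if e.kind == "completed process"}
    return [
        e for e in events
        if e.kind == "start process"
        and e.transaction_id not in completed
        and now - e.at > timeout
    ]
```

Because each start event carries its own context, whatever picks up the returned list has what it needs to raise an alert or retry the transaction. The timeout would need tuning per workflow, since long-running processes may legitimately stay open for a long time.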

We send log events out of Bubble: we trap all workflow errors and all plugin errors and ship them out, which stops us logging into the DB and creating more load in our main applications. This gives us observability of errors that would otherwise be silent workflow failures.
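As a rough sketch of what a shipped error event might look like, the Python below serialises trapped errors as newline-delimited JSON, a format that bulk HTTP log listeners commonly accept. The field names (`app`, `source`, `workflow`, `message`) and the `build_log_batch` helper are hypothetical and not taken from the Habitude setup. The HTTP call itself is left out, because the endpoint and token are account-specific.

```python
import json


def build_log_batch(errors, app_name):
    """Serialise trapped errors as newline-delimited JSON, one event per line."""
    lines = []
    for err in errors:
        record = {
            "app": app_name,                  # which Bubble app raised the error
            "level": "error",
            "source": err["source"],          # "workflow" or "plugin"
            "workflow": err.get("workflow"),  # None when the workflow is unknown
            "message": err["message"],
        }
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)
```

Batching several events into one request body keeps the number of outbound calls from the app low, which matters when the aim is to avoid adding load to the application being monitored.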

Then within Logzio we centrally analyse and alert on all these events. We chose Logzio because of its reasonable and predictable pricing.

We found more silent workflow failures than we had expected - during testing we picked up a lot of edge cases that would otherwise have been invisible.