Interesting thread. We're building Markhub, a B2B collaboration SaaS, and we take a pragmatic, hybrid approach to our observability stack. Our philosophy: use open source where it gives us control and flexibility, and use managed SaaS where it saves us time and engineering overhead.
Here's our stack:
Logs: We self-host a simple stack. Fluentd collects the logs and ships them to Elasticsearch for storage and analysis. It's powerful, gives us full control over our data, and is more cost-effective at our scale than a managed logging service.
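For anyone picturing the application side of a pipeline like this, here is a minimal sketch, assuming the app writes one JSON object per line and a collector such as Fluentd tails and parses that output. The `JsonFormatter` class, the `build_logger` helper, and the `fields` convention for extra context are hypothetical names for illustration, not a description of Markhub's actual setup.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: structured context is passed as
        # extra={"fields": {...}} and merged into the top-level object.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("markhub.api")
    log.info("workspace created", extra={"fields": {"workspace_id": "w-123"}})
```

Emitting one JSON object per line keeps the collector configuration simple, since each line can be parsed into fields without custom regexes before it is indexed.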
Metrics & Monitoring: For this, we decided not to reinvent the wheel and use Datadog. The out-of-the-box dashboards, alerting, and deep integration with our cloud providers (AWS/GCP) save our small team hundreds of hours. The cost is justified by the engineering time we save.
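To give a sense of how little code custom metrics need on the application side, here is a minimal sketch, assuming a Datadog Agent is listening for DogStatsD datagrams over UDP on `127.0.0.1:8125` (the usual default). The metric name, tags, and the `format_metric` and `send_metric` helpers are hypothetical; in practice most teams would use Datadog's official client library rather than hand-rolling this.

```python
import socket
from typing import Mapping, Optional


def format_metric(
    name: str,
    value: float,
    metric_type: str = "c",
    tags: Optional[Mapping[str, str]] = None,
    sample_rate: Optional[float] = None,
) -> str:
    """Build a DogStatsD datagram: name:value|type[|@rate][|#k:v,k:v].

    metric_type is "c" (count), "g" (gauge), "h" (histogram), "ms" (timer),
    "s" (set), or "d" (distribution).
    """
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        # Sort tags so the output is deterministic.
        parts.append("#" + ",".join(f"{k}:{v}" for k, v in sorted(tags.items())))
    return "|".join(parts)


def send_metric(datagram: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one datagram over UDP (fire and forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name and tags, for illustration only.
    line = format_metric("markhub.docs.created", 1, "c", {"env": "prod", "plan": "team"})
    print(line)  # markhub.docs.created:1|c|#env:prod,plan:team
```

Because the transport is UDP, emitting a metric never blocks or fails the request path, which is part of why this style of instrumentation is cheap to adopt.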
Traces: We're currently using Datadog's APM for tracing as well, since it's tightly integrated with the rest of our Datadog setup. However, we're actively exploring a move to OpenTelemetry for more vendor neutrality.
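One concrete piece of that vendor neutrality is context propagation: OpenTelemetry propagates trace context between services using the W3C Trace Context `traceparent` header by default, which is not tied to any one backend. Below is a minimal, stdlib-only sketch of generating and parsing that header. The `new_traceparent` and `parse_traceparent` helpers are hypothetical and only handle version `00`; a real migration would rely on the OpenTelemetry SDK's propagators instead.

```python
import re
import secrets
from typing import Optional

# version "00" - 32 hex trace-id - 16 hex parent-id - 2 hex trace-flags
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_traceparent(sampled: bool = True) -> str:
    """Create a traceparent header value for a brand-new trace."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> Optional[dict]:
    """Parse a version-00 traceparent value; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero identifiers are defined as invalid by the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


if __name__ == "__main__":
    header = new_traceparent()
    print(header)
    print(parse_traceparent(header))
```

Any service that forwards this header unchanged lets a trace continue across components, whichever backend ends up storing the spans.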
It's working out well. The key has been to be honest about our most valuable resource: engineering time. We self-host where we have a clear need for control (logs), and we pay for a service where the platform provides undeniable value and speed (metrics/monitoring).