Telemetry & Observability for Elixir Apps at Cars.com with Zack Kayser & Ethan Gunderson

About this Episode

Published December 12, 2024 | Duration: 42:39

Zack Kayser and Ethan Gunderson, Software Engineers at Cars Commerce, join the Elixir Wizards to share their expertise on telemetry and observability in large-scale systems. Drawing from their experience at Cars.com, a platform that serves high traffic and many concurrent users, they discuss the technical and organizational challenges of scaling applications, managing microservices, and implementing effective observability practices.

The conversation highlights the pivotal role observability plays in diagnosing incidents, anticipating system behavior, and asking unplanned questions of a system. Zack and Ethan explore tracing, spans, and the unique challenges introduced by LiveView deployments and WebSocket connections.

They also discuss the benefits of OpenTelemetry as a vendor-agnostic instrumentation tool, the significance of Elixir’s telemetry library, and practical steps for developers starting their observability journey. Additionally, Zack and Ethan introduce their upcoming book, Instrumenting Elixir Applications, which will offer guidance on integrating telemetry and tracing into Elixir projects.

Topics Discussed:

  • Cars.com’s transition to Elixir and scaling solutions
  • The role of observability in large-scale systems
  • Uncovering insights by asking unplanned system questions
  • Managing high-traffic and concurrent users with Elixir
  • Diagnosing incidents and preventing recurrence using telemetry
  • Balancing data collection with storage constraints
  • Sampling strategies for large data volumes
  • Tracing and spans in observability
  • LiveView’s influence on deployments and WebSocket behavior
  • Mitigating downstream effects of socket reconnections
  • Contextual debugging for system behavior insights
  • Observability strategies for small vs. large-scale apps
  • OpenTelemetry for vendor-agnostic instrumentation
  • Leveraging OpenTelemetry contrib libraries for easy setup
  • Elixir’s telemetry library as an ecosystem cornerstone
  • Tracing as the first step in observability
  • Differentiating observability from business analytics
  • Profiling with OpenTelemetry Erlang project tools
  • The value of profiling for performance insights
  • Making observability tools accessible and impactful for developers

Links Mentioned

https://www.carscommerce.inc/
https://www.cars.com/
https://hexdocs.pm/telemetry/readme.html
https://kubernetes.io/
https://github.com/ninenines/cowboy
https://hexdocs.pm/bandit/Bandit.html
https://hexdocs.pm/broadway/Broadway.html
https://hexdocs.pm/oban/Oban.html
https://www.dynatrace.com/
https://www.jaegertracing.io/
https://newrelic.com/
https://www.datadoghq.com/
https://www.honeycomb.io/
https://fly.io/phoenix-files/how-phoenix-liveview-form-auto-recovery-works/
https://www.elastic.co/
https://opentelemetry.io/
https://opentelemetry.io/docs/languages/erlang/
https://opentelemetry.io/docs/concepts/signals/traces/
https://opentelemetry.io/docs/specs/otel/logs/
https://github.com/runfinch/finch
https://hexdocs.pm/telemetry_metrics/Telemetry.Metrics.html
https://opentelemetry.io/blog/2024/state-profiling
https://www.instrumentingelixir.com/
https://prometheus.io/
https://www.datadoghq.com/dg/monitor/ts/statsd/
https://x.com/kayserzl
https://github.com/zkayser
https://bsky.app/profile/ethangunderson.com
https://github.com/open-telemetry/opentelemetry-collector-contrib

Special Guests: Ethan Gunderson and Zack Kayser.

Transcript (English):