Releases: badoo/pinba2

v2.6.0: wire packet compression and performance optimisation

18 Mar 16:53

This release introduces packet versions in an experimental and mostly backwards-compatible way.
Backwards compatibility is not an issue for now, since no publicly available clients use versions yet.
A spec describing the wire protocol will be released, and a fully backwards-compatible migration path will be implemented at some point in the future.

Short list of changes

  • wire packet versioning support implemented
  • lz4 compression support in v1 wire packets
  • major performance optimisation in the internal word -> id dictionary translation, resulting in major cpu scalability improvements in some workloads. This fix has resulted in a 20-300% cpu usage reduction in some Badoo internal setups; the speedup is proportional to the variability of tag names in incoming packets.
  • build fix for archive (i.e. non git) downloads
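The word -> id translation mentioned above can be pictured as string interning: each distinct tag name or value gets a small integer the first time it is seen, and later lookups reuse it. The following is a minimal Python sketch under that assumption; `WordDictionary` and its methods are hypothetical names for illustration, not pinba2's actual (C++) implementation.

```python
class WordDictionary:
    """Maps each distinct word to a stable integer id (hypothetical sketch)."""

    def __init__(self):
        self._id_by_word = {}   # word -> id
        self._word_by_id = []   # id -> word

    def get_or_add(self, word: str) -> int:
        """Return the id for `word`, assigning the next free id on first sight."""
        word_id = self._id_by_word.get(word)
        if word_id is None:
            word_id = len(self._word_by_id)
            self._id_by_word[word] = word_id
            self._word_by_id.append(word)
        return word_id

    def word(self, word_id: int) -> str:
        """Reverse lookup: id -> word."""
        return self._word_by_id[word_id]


d = WordDictionary()
# A repeated word maps to the id it was given the first time.
ids = [d.get_or_add(w) for w in ["script", "db_query", "script"]]
```

The more distinct tag names a packet stream carries, the more often this lookup runs, which is consistent with the speedup being proportional to tag name variability.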

v2.5.0: major performance improvements

24 Apr 14:35

Major performance and scalability improvements across the board.

v2.4.1: ipv6 listen support

21 Feb 20:53

Added ipv6 listen support. The engine now listens on all resolved addresses instead of just the first one.
The new default for pinba_address is "*", which means "all ipv4 and ipv6 addresses".
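Listening on every resolved address, rather than only the first, can be sketched as follows. This is an illustrative Python sketch, not pinba2's code: it assumes UDP sockets, uses `host=None` as a stand-in for "*", and `open_udp_listeners` is a hypothetical helper name.

```python
import socket


def open_udp_listeners(host, port):
    """Bind one UDP socket per address that `host` resolves to.

    host=None stands in for "*" here (an assumption of this sketch):
    with AI_PASSIVE it resolves to the IPv4 and IPv6 wildcard addresses.
    """
    listeners = []
    infos = socket.getaddrinfo(host, port, family=socket.AF_UNSPEC,
                               type=socket.SOCK_DGRAM,
                               flags=socket.AI_PASSIVE)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError:
            continue  # address family not supported on this host
        try:
            if family == socket.AF_INET6:
                # Keep the IPv6 socket from also claiming IPv4,
                # so that both wildcard sockets can bind the same port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind(sockaddr)
        except OSError:
            sock.close()
            continue
        listeners.append(sock)
    return listeners


listeners = open_udp_listeners(None, 0)  # port 0: let the OS pick a free port
bound = [s.getsockname() for s in listeners]
for s in listeners:
    s.close()
```

Iterating over every `getaddrinfo` result is what distinguishes this from binding only the first returned address.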

v2.4.0: Docker hub and minor features

12 Feb 16:34

Features

  • fractional percentiles, i.e. it is now possible to measure the 99.9, 99.99 and 99.9999 percentiles
  • significantly improved report row_count estimation (the new estimate is based on recent selects, which calculate the precise value)
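To illustrate what a fractional percentile means on bucketed data, here is a small nearest-rank sketch in Python. The bucket layout and the `percentile_from_histogram` function are assumptions made for this example and do not reflect how pinba2 stores or queries histograms.

```python
import math
from fractions import Fraction


def percentile_from_histogram(buckets, percentile):
    """Nearest-rank percentile over (upper_bound, count) buckets.

    `buckets` must be sorted by upper_bound. Returns the upper bound of the
    bucket that contains the requested percentile, or None if empty.
    """
    total = sum(count for _, count in buckets)
    if total == 0:
        return None
    # Exact arithmetic, so that 99.9 or 99.9999 do not suffer float rounding.
    rank = math.ceil(total * Fraction(str(percentile)) / 100)
    rank = max(rank, 1)
    seen = 0
    for upper_bound, count in buckets:
        seen += count
        if seen >= rank:
            return upper_bound
    return buckets[-1][0]


# Hypothetical data: 100000 requests, upper bounds in milliseconds.
buckets = [(10, 99000), (100, 900), (1000, 90), (10000, 10)]
p99 = percentile_from_histogram(buckets, 99)
p999 = percentile_from_histogram(buckets, 99.9)
p9999 = percentile_from_histogram(buckets, 99.99)
```

With this data the integer percentile hides the slow tail that the fractional ones expose, which is the practical reason to measure them.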

Build

  • docker hub automated build support by our friends: https://hub.docker.com/r/alexanderilyin/pinba2
  • fixed build issues causing early crash when using default Dockerfile
  • gcc8 compatibility (please report anything that is still broken)

Bugfixes

  • startup errors (like 'address already in use') are now reported nicely, instead of silently aborting
  • deleting a report while selects are running on it should not cause segfaults

v2.3.1: Bugfix release

04 Jun 16:38

Fix a few minor bugs, including the "histogram is too small" assert failure and a race condition in a rare case for packet reports only.

v2.3.0: hdr histograms, performance (and a very small breaking change)

03 May 11:48

This release is an effort to improve performance by introducing hdr histograms (forked from https://github.com/HdrHistogram/HdrHistogram_c) and optimising internal timer aggregation and histogram machinery.

Currently in production at Badoo we're seeing up to 5 million timers/sec (each with 10 to 30 tags) per instance.
Some heavily loaded reports (the very non-specific ones that aggregate almost the entire stream) are hitting the 100% cpu mark, so this release aims to improve that.

Breaking change

  • added the 'timers_skipped_by_bloom' field to the 'active' report. This is breaking because the field was added 'in the middle', after the 'timers_aggregated' field.
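Why a field added 'in the middle' is breaking can be shown with a short Python sketch. Only `timers_aggregated` and `timers_skipped_by_bloom` come from the note above; `field_before`, `field_after` and the row values are hypothetical placeholders, since the full column list of the 'active' report is not given here.

```python
# Column layouts before and after the change (placeholder neighbours).
OLD_COLUMNS = ["field_before", "timers_aggregated", "field_after"]
NEW_COLUMNS = ["field_before", "timers_aggregated",
               "timers_skipped_by_bloom", "field_after"]

old_row = (10, 500, 7)
new_row = (10, 500, 120, 7)


def by_position(row, index):
    """Fragile: the meaning of `index` depends on the column layout."""
    return row[index]


def by_name(columns, row, name):
    """Robust: looks the value up by column name."""
    return dict(zip(columns, row))[name]


# Index 2 used to mean field_after; after the change it is the new field.
before = by_position(old_row, 2)
after = by_position(new_row, 2)
safe = by_name(NEW_COLUMNS, new_row, "field_after")
```

Clients that select columns by name are unaffected; only code relying on column positions needs adjusting.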

Release highlights

  • histograms now use hdr_histogram-like machinery internally
    • percentiles (at the end of histogram interval) become coarser and might shift slightly
    • percentiles (at the start of histogram interval) become more precise
    • histograms will use slightly more memory on average (if your workload is anything like ours)
    • performance should improve in most cases
    • it's now possible and feasible to have histograms with 1 microsecond resolution (which is nice if you measure some short on-cpu functions for example) - they'll use more cpu (~2x for 1us vs 1ms histograms).
  • performance enhancements
    • 'request' reports are now significantly faster (and use less memory) both in aggregation and selects (converted them to be very similar to timer reports internally). use case: aggregating stats from nginx (response codes, etc.)
    • added bloom filters for individual timers in the packet (fast skip for timers that the report is definitely not interested in)
    • coordinator thread now uses considerably less resources (5M timers/sec are transcoded into internal format using ~1 cpu core).
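The precision trade-off described above (finer at the start of the interval, coarser at the end) follows from log-linear bucketing. The Python sketch below shows the general idea only; it is a simplified illustration, not the layout used by HdrHistogram_c or by pinba2's fork, and `bucket_index`, `bucket_range` and `sub_bucket_bits` are names chosen for this example.

```python
def bucket_index(value, sub_bucket_bits=3):
    """Map a non-negative integer to a log-linear bucket index.

    Values below 2**sub_bucket_bits get one bucket each; above that,
    every power-of-two range is split into 2**(sub_bucket_bits - 1)
    equally wide buckets, so bucket width doubles as values grow.
    """
    count = 1 << sub_bucket_bits
    half = count >> 1
    if value < count:
        return value
    shift = value.bit_length() - sub_bucket_bits
    top = value >> shift              # always in [half, count)
    return count + (shift - 1) * half + (top - half)


def bucket_range(index, sub_bucket_bits=3):
    """Return the [low, high) value range covered by a bucket index."""
    count = 1 << sub_bucket_bits
    half = count >> 1
    if index < count:
        return index, index + 1
    k = index - count
    shift = k // half + 1
    top = half + k % half
    low = top << shift
    return low, low + (1 << shift)


small = bucket_range(bucket_index(5))     # exact bucket for a small value
large = bucket_range(bucket_index(1000))  # much wider bucket for a large value
```

Small values land in buckets one unit wide, while a value near 1000 shares a bucket 128 units wide, which is why percentiles near the end of the range become coarser.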

v2.2.1: Bugfix release

17 Jan 13:09

  • fixed percentile fields (broken in 2.2.0 when implementing _percent fields).

v2.2.0: Implement percent fields like in old pinba

16 Jan 14:45

Implement percent fields like in old pinba.

v2.1.1

10 Nov 14:18

Increased the max number of keys in reports from 7 to 15. This change incurs no memory overhead (unless you're using more than 7 keys, that is :) ).

v2.1.0: Dictionary rework, raw histograms and more

27 Oct 12:25

This release is a merge to master from the dictionary_refcounted_erase branch (which was the 2.0.9 release).

Histograms

Internals

  • internal string dictionaries have been reworked. Fixed an issue where sending random data (tag names and values) could cause dictionaries to grow indefinitely large (and oom the engine). Now we track all incoming strings and remove them from memory when they're no longer used.
    • This has the consequence of increased memory and cpu usage though (YMMV, but we've observed ~3x memory usage and ~2x cpu usage increase).
  • timer bloom filtering fixed, should no longer say 'no' where 'maybe' is the correct answer
  • added pinba_enable_blooms setting to mysql engine, default ON
  • fixed a few assertion failures when selecting percentiles from hashtable histograms (info reports only nowadays)
  • added runtime symbol discovery for recvmmsg, pthread_setname_np, pthread_setaffinity_np (those were only checked at compile-time before). This should theoretically improve performance at high packet rates (like 1.1M packets/sec we have at some setups at Badoo).
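The bloom filter fix above concerns the one property such a filter must never violate: it may answer 'maybe' for an item that was never added, but it must not answer 'no' for one that was. A minimal Python sketch of that invariant follows; the `BloomFilter` class, its sizes and its hash choice are assumptions for illustration and are unrelated to pinba2's actual implementation.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: false positives possible, false negatives not."""

    def __init__(self, size_bits=64, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a single integer

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "little") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        """False means 'definitely not added'; True means 'maybe'."""
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical tag names that a report is interested in.
wanted = ["group", "server", "operation"]
report_filter = BloomFilter()
for tag in wanted:
    report_filter.add(tag)

answers = [report_filter.might_contain(tag) for tag in wanted]
```

A 'no' answer lets a report skip a timer without inspecting it further, so a false 'no' would silently drop data, whereas a false 'maybe' only costs an extra check.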