Cache Benchmarks (github.com)
53 points by jjwiseman 6 days ago | 22 comments
  • junon 3 days ago

    The choice of log graphs here probably wasn't necessary and seems to have hurt more than it helped. Despite looking relatively similar, memcached performed 3x faster than redis on some benchmarks whilst appearing only slightly above average.

    Otherwise, very thorough and well done benchmark from the looks of it. Redis my beloved not holding up so well against some others these days it looks like.
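
    A toy illustration of the point above (the numbers are invented, not taken from the benchmark): on a log axis, bar height tracks the logarithm of the value, so a 3x throughput gap collapses into a small visual difference.

```python
import math

# Invented numbers: memcached at 300k ops/s vs redis at 100k ops/s,
# i.e. a 3x throughput difference.
memcached, redis = 300_000, 100_000

# On a linear axis, bar heights are proportional to the values themselves:
linear_ratio = memcached / redis  # 3.0

# On a log axis, heights are proportional to log10 of the values, so the
# same 3x gap becomes the visual difference between ~5.48 and 5.0:
log_ratio = math.log10(memcached) / math.log10(redis)

print(f"linear: {linear_ratio:.2f}x taller, log: {log_ratio:.2f}x taller")
# The 3x-taller linear bar is only about 1.1x taller on a log axis.
```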

    • tidwall 6 hours ago | parent

      Thanks for the feedback. I updated the graphs to use linear scale by default. Log scale is still available, which is useful for viewing latency. I also kept only the summary graphs in the README and moved all graphs into a separate file.

      • junon 6 hours ago | parent

        Nice, way clearer in my opinion. Thanks!

    • nchmy 2 days ago | parent

      I wouldn't have even noticed if I hadn't seen this comment. Definitely necessary to change to linear scale.

      Also, while I appreciate the thoroughness, I think it would be very useful to reduce the number of graphs significantly. Maybe 10x fewer. Just present the key ones that tell the story, and put the rest in another folder.

    • robinhoodexe 3 days ago | parent

      Agreed, it'd be nice to see the graphs with a linear scale.

  • phoronixrly 3 days ago

    > c8g.8xlarge (32 core non-NUMA ARM64)

    Requests are scheduled on half of these. Despite that, a plateau is hit after 8 threads? Is this a 16-core, 32-thread type of setup?

    Also, consider redoing this in linear scale.

    Edit: Oddly enough, no? 1 thread per core as per https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-opti...

    • tidwall 6 hours ago | parent

      Thanks for the feedback. The plateauing is due to a taskset misconfiguration. I fixed it and reran the benches. They look much better now.
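
      A quick way to catch this class of misconfiguration is to check the process's CPU affinity mask before running. A minimal sketch, assuming Linux (where `os.sched_getaffinity` is available):

```python
import os

# Check whether this process has been left pinned to a subset of cores,
# e.g. by a stray `taskset` invocation wrapping the benchmark.
def allowed_cpus():
    getter = getattr(os, "sched_getaffinity", None)  # Linux-only API
    if getter is None:
        # Non-Linux fallback: assume the process may use every CPU.
        return set(range(os.cpu_count() or 1))
    return getter(0)  # affinity mask of the current process (pid 0 = self)

mask = allowed_cpus()
print(f"process may run on {len(mask)} of {os.cpu_count()} CPUs")
```

      If the mask is smaller than the machine's core count, thread scaling will plateau at the mask size regardless of how many worker threads the benchmark spawns.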

    • swiftcoder 2 days ago | parent

      > a plateau is hit after 8 threads?

      Most of the graphs plateau around 6 threads, for pretty much all the caches under test. I wonder if there is some interesting architectural issue with cache-sharing on this particular platform?

      • namibj 2 days ago | parent

        I'd guess memory controller behavior, especially if it's set up not for parallel IOPS but for single-threaded throughput (channel interleaving).

    • everfrustrated 2 days ago | parent

      g = Graviton, which doesn't support SMT, so 1 vCPU is 1 full core

  • greatgib 3 days ago

    Does anyone have an idea about why there is sometimes such a gap between Valkey and Redis? I would have expected only a marginal difference at this point.

    • ac130kz 2 days ago | parent

      Why so? Soon after the fork, Valkey went beyond what Redis is capable of, due to high volumes of funding and new attention from devs wanting to improve its performance.

    • dilyevsky 3 days ago | parent

      I assume because valkey is multithreaded and redis isn’t

  • motorest 3 days ago

    I'm happy to see Valkey consistently outperforming Redis. It should be food for thought for anyone considering rug pulls.

  • everfrustrated 2 days ago

    It'd be good to also include AWS's own hosted variant, ElastiCache. They do a bunch of tuning as well, so the results would likely differ even from running the same software on the same AWS instance type.

  • namibj 2 days ago

    I'm sad to see memcached is used with the legacy text protocol instead of the recommended (and supported by the benchmarking software) binary protocol.

    That shouldn't be representative of any modern deployment, and not declaring this anywhere outside the code itself is IMO misleading.

    Please fix the documentation or better, run that one and update the graphs.

    • tidwall 4 hours ago | parent

      Memcached's binary protocol was deprecated years ago. It's no longer recommended.
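
      For context on what replaced it: memcached's "meta" commands reuse the text framing rather than the binary format. A sketch of the same lookup in both wire formats (the key name is made up for illustration):

```python
# The same GET in memcached's classic text protocol vs the newer "meta"
# protocol, which superseded the deprecated binary protocol.
key = "user:42"

# Classic text protocol: "get <key>\r\n"
text_cmd = f"get {key}\r\n".encode()

# Meta protocol: "mg <key> <flags>\r\n"; the "v" flag requests the value.
meta_cmd = f"mg {key} v\r\n".encode()

print(text_cmd)  # b'get user:42\r\n'
print(meta_cmd)  # b'mg user:42 v\r\n'
```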

  • randomtoast 3 days ago

    I would also like to see linear scaling graphs.

    • tidwall 4 hours ago | parent

      I updated the graphs to use linear scaling. Thanks for the feedback.

  • magnio 3 days ago

    Anyone know how Garnet outperforms the others so much in the pipelining >1 tests while being written in C#?

    • algorithmsRcool 2 days ago | parent

      If you look at Garnet's source code it is very non-idiomatic C#. It goes to extraordinary lengths to avoid the garbage collector. Almost all memory management is done with unmanaged memory and pointers.

      They also have a very clever internal design and do some other tricks like strategically avoiding async/await and moving I/O operations onto the network request thread.
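
      For readers unfamiliar with what the pipelining >1 tests measure: the client concatenates many commands into one write and reads all replies back, so the server's request-parsing path dominates. A minimal sketch of building such a batch in the RESP wire format that Redis, Valkey, and Garnet all speak (keys and values here are invented):

```python
def resp_encode(*args):
    """Encode one command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        b = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(b), b))
    return b"".join(out)

# A pipeline is just several encoded commands sent in a single write;
# the server parses them back-to-back without waiting on the client.
pipeline = b"".join([
    resp_encode("SET", "k1", "v1"),
    resp_encode("SET", "k2", "v2"),
    resp_encode("GET", "k1"),
])

print(resp_encode("GET", "k1"))  # b'*2\r\n$3\r\nGET\r\n$2\r\nk1\r\n'
```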

    • idoubtit 2 days ago | parent

      I think the programming language is not relevant, especially since startup time plays no role. A different design can have much more impact, and IIRC Garnet is not fully compatible with Redis.

      The main difference appears to be that Garnet is more parallel, according to this student's report on benchmarking various key-value stores (see the "CPU usage" sections in the PDF): https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A196...