There is an increasing number of libraries which are described as high performance and have benchmarks to back that claim up. Here is a selection that I am aware of.
Disruptor library
http://code.google.com/p/disruptor/LMAX aims to be the fastest trading platform in the world. Clearly, in order to achieve this we needed to do something special to achieve very low-latency and high-throughput with our Java platform. Performance testing showed that using queues to pass data between stages of the system was introducing latency, so we focused on optimising this area.
The Disruptor is the result of our research and testing. We found that cache misses at the CPU-level, and locks requiring kernel arbitration are both extremely costly, so we created a framework which has "mechanical sympathy" for the hardware it's running on, and that's lock-free.
The 6 million TPS benchmark was measured on a 3Ghz dual-socket quad-core Nehalem based Dell server with 32GB RAM.
http://martinfowler.com/articles/lmax.html
Java Chronicle
https://github.com/peter-lawrey/Java-ChronicleThis library is an ultra low latency, high throughput, persisted, messaging and event driven in memory database. The typical latency is as low as 16 nano-seconds and supports throughputs of 5-20 million messages/record updates per second.
It uses almost no heap, trivial GC impact, can be much larger than your physical memory size (only limited by the size of your disk). and can be shared
between processes with better than 1/10th latency of using Sockets over loopback.
It can change the way you design your system because it allows you to have independent processes which can be running or not at the same time (as no messages are lost) This is useful for restarting services and testing your services from canned data. e.g. like sub-microsecond durable messaging.
You can attach any number of readers, including tools to see the exact state of the data externally. e.g. You can use; od -t cx1 {file} to see the current state.
Colt Matrix library
http://acs.lbl.gov/software/colt/Scientific and technical computing, as, for example, carried out at CERN, is characterized by demanding problem sizes and a need for high performance at reasonably small memory footprint. There is a perception by many that the Java language is unsuited for such work. However, recent trends in its evolution suggest that it may soon be a major player in performance sensitive scientific and technical computing. For example, IBM Watson's Ninja project showed that Java can indeed perform BLAS matrix computations up to 90% as fast as optimized Fortran. The Java Grande Forum Numerics Working Group provides a focal point for information on numerical computing in Java. With the performance gap steadily closing, Java has recently found increased adoption in the field. The reasons include ease of use, cross-platform nature, built-in support for multi-threading, network friendly APIs and a healthy pool of available developers. Still, these efforts are to a significant degree hindered by the lack of foundation toolkits broadly available and conveniently accessible in C and Fortran.
The latest stable Colt release breaks the 1.9 Gflop/s barrier on JDK ibm-1.4.1, RedHat 9.0, 2x IntelXeon@2.8 GHz.
Javolution
http://javolution.org/ Javolution real-time goals are simple: To make your application faster and more time predictable! That being accomplished through:
- High performance and time-deterministic (real-time) util / lang / text / io / xml base classes.
- Context programming in order to achieve true separation of concerns (logging, performance, etc).
- A testing framework addressing not only unit tests but also performance and regression tests as well.
- Straightforward and low-level parallel computing capabilities with ConcurrentContext. Struct and Union base classes for direct interfacing with native applications (e.g. C/C++).
- World's fastest and first hard real-time XML marshalling/unmarshalling facility. Simple yet flexible configuration management of your application.
Trove collections for primitives
http://trove.starlight-systems.com/
The Trove library provides high speed regular and primitive collections for Java.
The GNU Trove library has two objectives:
- Provide "free" (as in "free speech" and "free beer"), fast, lightweight implementations of the java.util Collections API. These implementations are designed to be pluggable replacements for their JDK equivalents.
- Provide primitive collections with similar APIs to the above. This gap in the JDK is often addressed by using the "wrapper" classes (java.lang.Integer, java.lang.Float, etc.) with Object-based collections. For most applications, however, collections which store primitives directly will require less space and yield significant performance gains.
MG4J: Managing Gigabytes for Java™
http://mg4j.dsi.unimi.it/
MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java. MG4J is a highly customisable, high-performance, full-fledged search engine providing state-of-the-art features (such as BM25/BM25F scoring) and new research algorithms.
Other links
Overview of 8 performance libraries
http://www.dzone.com/links/r/8_best_open_source_high_performance_java_collecti.html
Sometimes collection classes in JDK may not sufficient. We may require some high performance hashtable, Bigarrays etc. Check out the list of open source high performance collection libraries.
Serialization benchmark
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
This is a comparison of some serialization libraries.
Its still hard to beat hand coded serialization.