SphinxConnector.NET 4.2 has been released

by Administrator 30. May 2019 12:20

We're pleased to announce that SphinxConnector.NET 4.2 is available for download and via NuGet!

This release contains some compatibility fixes for Manticore 3 and adds support for Manticore's REGEX() and CONCAT() functions. If you are using JSON attributes in conjunction with the fluent API, you might want to check out two new methods we've added to the JsonObjectSerializer. You can use these methods to access the raw bytes or chars returned from searchd and pass them directly to your JSON serializer, if it supports that. This avoids the creation of intermediate strings, which can be a huge saving. If you're looking for alternatives to JSON.NET, which SphinxConnector.NET uses by default: we recently did a comparison of JSON serializers which can be found here.

For a full list of changes, please refer to the version history.


Comparing JSON Serializers

by Dennis 2. May 2019 13:44

As you might be aware, SphinxConnector.NET’s fluent API uses JSON.NET as its default JSON serializer when dealing with JSON attributes. As the old saying goes “No one has ever been fired for using JSON.NET” ;-) Choosing JSON.NET was obvious, because it has been the de-facto standard in the .NET world since, well, forever. Even Microsoft adopted it as the default library for handling JSON in ASP.NET (MVC/Web API), so it’s probably not even an additional dependency for most people, which is a bonus.

There are, however, a bunch of other JSON libraries out there that do better in terms of performance and memory usage than JSON.NET, so I decided to test some of them with real-world data from a customer. I'll be comparing ServiceStack.Text, Utf8Json, and Jil to JSON.NET 12.0.1.

Switching JSON serializers in SphinxConnector.NET’s fluent API is pretty easy: just inherit from JsonObjectSerializer, implement Serialize and Deserialize, register your class by setting the JsonObjectSerializer property, and you’re good to go. Here’s what the ServiceStackJsonSerializer looks like:

class ServiceStackJsonSerializer : JsonObjectSerializer
{
    public override string Serialize(object obj) => ServiceStack.Text.JsonSerializer.SerializeToString(obj);

    public override object Deserialize(string json, Type type) => ServiceStack.Text.JsonSerializer.DeserializeFromString(json, type);
}
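Registering the serializer is then just the assignment mentioned above. The following one-liner is only a placeholder sketch: the post doesn’t show which object exposes the JsonObjectSerializer property, so ‘fulltextStore’ here is an assumption, not the documented location.

// Placeholder sketch: 'fulltextStore' stands in for whatever object exposes the
// JsonObjectSerializer property in your setup (see the SphinxConnector.NET docs).
fulltextStore.JsonObjectSerializer = new ServiceStackJsonSerializer();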

As mentioned earlier, I’m using real-world data that is used in production and is pretty JSON-heavy. For the benchmark I’m only querying the JSON data, to better isolate the effect of the different libraries on execution speed (Op/s). Here are the results (data queried from Manticore 2.8.2):

BenchmarkDotNet=v0.11.5, OS=Windows 7 SP1 (6.1.7601.0)
Intel Core i5-3570 CPU 3.40GHz (Ivy Bridge), 1 CPU, 4 logical and 4 physical cores
Frequency=3420566 Hz, Resolution=292.3493 ns, Timer=TSC
  [Host]     : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3324.0
  Job-FEFVFA : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3324.0
InvocationCount=1  UnrollFactor=1  
| Method            | Mean     | Error    | StdDev   | Op/s  | Ratio | Gen 0  | Gen 1  | Gen 2 | Allocated |
|------------------ |--------- |--------- |--------- |------ |------ |------- |------- |------ |---------- |
| JsonNet           | 205.2 ms | 4.617 ms | 7.456 ms | 4.874 | 1.00  | 3000.0 | 1000.0 | -     | 13.75 MB  |
| ServiceStack.Text | 195.9 ms | 1.734 ms | 1.537 ms | 5.104 | 0.94  | 1000.0 | -      | -     | 7.24 MB   |
| Utf8Json          | 190.9 ms | 1.335 ms | 1.249 ms | 5.239 | 0.92  | 2000.0 | -      | -     | 9.45 MB   |
| JilJson           | 190.0 ms | 1.263 ms | 1.182 ms | 5.262 | 0.92  | 3000.0 | 1000.0 | -     | 14.44 MB  |

The results are pretty impressive: ServiceStack.Text uses nearly 50% less memory, causes fewer garbage collections, and is also slightly faster than JSON.NET*. Utf8Json is even faster than ServiceStack.Text but uses more memory. Jil performs best, albeit only slightly, but does so at the cost of an even higher memory consumption than JSON.NET, which isn’t surprising, as this is an intentional design decision.

Could we do even better?

We can. In SphinxConnector.NET 4.2 we'll be adding two more overloads to the Deserialize method:

public virtual object Deserialize(ReadOnlyMemory<byte> json, Type type)
public virtual object Deserialize(ReadOnlyMemory<char> json, Type type)

Instead of passing a string containing the JSON data, we pass an instance of ReadOnlyMemory<char> or ReadOnlyMemory<byte>. This way, we avoid the creation of an intermediate string and instead pass on the raw data we’ve read from the socket.

ServiceStack.Text supports deserializing JSON from ReadOnlyMemory<char>, and Utf8Json accepts an array of bytes. Jil and JSON.NET don’t offer any methods that work directly on chars or bytes; for these two we need to create a MemoryStream over the ReadOnlyMemory<byte>, which in turn gets wrapped in a TextReader that can then be passed to the corresponding Deserialize methods.
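For JSON.NET, that wrapping could look roughly like the sketch below. The class and field names are made up for illustration, and the string-based members are simply carried over from the example above; it requires the System.IO, System.Text and System.Runtime.InteropServices namespaces.

class JsonNetMemorySerializer : JsonObjectSerializer
{
    static readonly Newtonsoft.Json.JsonSerializer Serializer = Newtonsoft.Json.JsonSerializer.CreateDefault();

    public override string Serialize(object obj) => Newtonsoft.Json.JsonConvert.SerializeObject(obj);

    public override object Deserialize(string json, Type type) => Newtonsoft.Json.JsonConvert.DeserializeObject(json, type);

    public override object Deserialize(ReadOnlyMemory<byte> json, Type type)
    {
        // Expose the raw bytes as a Stream (no copy when the memory is array-backed,
        // which is an assumption here), wrap it in a TextReader and let JSON.NET read
        // directly from it instead of from an intermediate string.
        Stream stream = MemoryMarshal.TryGetArray(json, out ArraySegment<byte> segment)
            ? new MemoryStream(segment.Array, segment.Offset, segment.Count, writable: false)
            : new MemoryStream(json.ToArray());

        using (var streamReader = new StreamReader(stream, Encoding.UTF8))
        using (var jsonReader = new Newtonsoft.Json.JsonTextReader(streamReader))
        {
            return Serializer.Deserialize(jsonReader, type);
        }
    }
}

With the new overloads in place, here are the results again: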

| Method            | Mean     | Error    | StdDev    | Op/s  | Ratio | Gen 0  | Gen 1  | Gen 2 | Allocated |
|------------------ |--------- |--------- |---------- |------ |------ |------- |------- |------ |---------- |
| JsonNet           | 196.9 ms | 2.050 ms | 1.8170 ms | 5.078 | 1.00  | 3000.0 | 1000.0 | -     | 13.25 MB  |
| ServiceStack.Text | 191.3 ms | 1.119 ms | 0.9346 ms | 5.228 | 0.97  | 1000.0 | -      | -     | 5.56 MB   |
| Utf8Json          | 189.4 ms | 1.088 ms | 1.0177 ms | 5.279 | 0.96  | 2000.0 | -      | -     | 7.75 MB   |
| Jil               | 193.2 ms | 1.295 ms | 1.2116 ms | 5.175 | 0.98  | 3000.0 | 1000.0 | -     | 13.93 MB  |

After eliminating the intermediate string allocations and passing our chars/bytes directly to ServiceStack.Text and Utf8Json, we can observe a further reduction in memory usage. JSON.NET and Jil see some improvement, but not as much as the other two. JSON.NET is also faster than before in terms of Op/s, whereas Jil is slightly slower.

* The performance advantages of the tested libraries over JSON.NET don’t show up that much in a benchmark like this, because there’s much more going on here than just deserializing JSON data (query execution, network transmission, etc.). If you compared just the deserialization step, you’d see much more of the speed advantage these libraries provide (check out the benchmarks on the respective project sites).

Conclusion

 

Replacing JSON.NET with an alternative JSON serializer in SphinxConnector.NET’s fluent API can lead to a significant reduction in memory usage and a nice improvement in speed. Given that this can be done with just a few lines of code, it is something I’d seriously look into for applications that make use of JSON attributes. On a related note, I’d also consider replacing JSON.NET within ASP.NET MVC/Web API in case your application is built with these frameworks.


SphinxConnector.NET 4.1 has been released

by Dennis 30. November 2018 15:13

We're pleased to announce that SphinxConnector.NET 4.1 is available for download and via NuGet!

For this release we’ve continued our work on improving performance and reducing memory usage. With the release of .NET Core 2.1 and the introduction of Span<T> and related types, we were able to build further on the work done in 4.0. For example, we now read data from searchd for primitive types more efficiently, namely by reading and converting values directly from the array of bytes instead of creating a string first. We also optimized the process of building JSON path strings within the fluent API, which could be a big source of allocations.
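To illustrate the kind of Span-based conversion this refers to (illustrative only, not SphinxConnector.NET’s actual internal code), a numeric value can be parsed straight from UTF-8 bytes with Utf8Parser, without ever materializing a string:

using System;
using System.Buffers.Text;
using System.Text;

class SpanParsingSketch
{
    static void Main()
    {
        // Stand-in for bytes as they arrive from searchd (assumption for the demo).
        byte[] responseBuffer = Encoding.UTF8.GetBytes("1234567");
        ReadOnlySpan<byte> raw = responseBuffer.AsSpan();

        // Convert directly from the UTF-8 bytes; no intermediate string is allocated.
        if (Utf8Parser.TryParse(raw, out long value, out int bytesConsumed))
        {
            Console.WriteLine($"parsed {value} from {bytesConsumed} bytes");
        }
    }
}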

For a full list of changes, please refer to the version history.


SphinxConnector.NET 4.0 has been released

by Dennis 18. April 2018 13:32

We’re pleased to announce that the next major release of SphinxConnector.NET is finally available for download and via NuGet!

Highlights

  • Support for .NET Core via .NET Standard 2.0
  • Fully asynchronous implementations of both SphinxQL and fluent API
  • Numerous performance improvements

In addition, Manticore Search is now fully supported without the need to change the mysql_version_string setting.

Please note that in order to use the .NET Standard 2.0 build, you’ll have to install SphinxConnector.NET via NuGet.

Breaking changes

A new method named Metadata() has been added to IFulltextQuery which returns query metadata via an out parameter. Overloads of ToList(), First() etc. that previously returned query metadata that way have been removed.
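In calling code, the change looks roughly like the following sketch. The query itself is abbreviated and partly assumed, and the metadata variable is just a name picked for illustration; only the Metadata()/out pattern comes from the release notes.

// 4.0: query metadata was obtained via dedicated overloads of ToList(), First() etc.
// 4.1: request it explicitly through Metadata() with an out parameter (sketch, names assumed):
var results = session.Query<Post>()
                     .Match("sphinx")
                     .Metadata(out var metadata)
                     .ToList();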

JSON.NET is now an explicit dependency; previously it had been merged into the SphinxConnector.NET assembly.

We've also decided to not provide a special build for Mono anymore, as demand for this was virtually nonexistent.

For a full list of changes please refer to the version history.


Performance Improvements in SphinxConnector.NET 4.0

by Dennis 23. March 2018 14:02

As we're preparing the release of SphinxConnector.NET 4.0, I'd like to share some improvements we have made since V3. Let's start with performance: as many of you might be aware, there currently is a strong focus in the .NET ecosystem on improving performance, often by reducing memory consumption, especially in low-level code. Case in point: .NET Core, whose development is driven by a strong focus on providing great performance. This is accomplished, among other things, by using less memory to eliminate unnecessary garbage that would otherwise need to be collected by the GC. This is especially true for the aforementioned low-level code that might be called several hundred or even thousand times a second, e.g. in a very busy web application or API.

During the development of SphinxConnector.NET 4.0 we took it upon ourselves to reduce the memory used by the fluent API and SphinxQL. Following are two benchmarks (created with the great BenchmarkDotNet) based on our Stackoverflow with Sphinx sample web app. The benchmarked methods are based on the search method from the sample, which executes three queries: the search itself and two facet queries, so it resembles what is done in a real-world application. The same queries are executed via plain SphinxQL and via the fluent API. The benchmark is run against Manticore Search 2.6.2 on an Ubuntu Server 16 VM with 1 GB RAM and 1 CPU core; the search term used is ‘c#’.
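The rough shape of the benchmark classes is sketched below; only the method names are taken from the result tables that follow, the bodies are omitted and everything else is assumed for illustration.

using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]   // enables the Gen 0/Allocated columns shown in the results
public class SearchBenchmarks
{
    [Benchmark]
    public void SearchSphinxQL()
    {
        // the search query plus the two facet queries, issued as plain SphinxQL
    }

    [Benchmark]
    public void SearchFluentAPI()
    {
        // the same three queries expressed via the fluent API
    }
}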

BenchmarkDotNet=v0.10.12, OS=Windows 7 SP1 (6.1.7601.0)
Intel Core i5-3570 CPU 3.40GHz (Ivy Bridge), 1 CPU, 4 logical cores and 4 physical cores
Frequency=3330156 Hz, Resolution=300.2862 ns, Timer=TSC
  [Host]     : .NET Framework 4.7 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2558.0
  Job-MZUNZG : .NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2558.0
Jit=RyuJit  Platform=X64  Force=False  
| Method          | Mean      | Error     | StdDev    | Op/s   | Gen 0   | Allocated |
|---------------- |---------- |---------- |---------- |------- |-------- |---------- |
| SearchSphinxQL  | 7.421 ms  | 0.1472 ms | 0.1305 ms | 134.76 | 23.4375 | 72.07 KB  |
| SearchFluentAPI | 10.555 ms | 0.2626 ms | 0.2579 ms | 94.75  | 93.7500 | 314.95 KB |

 

As you can see, there’s significantly less memory being allocated when queries are executed directly via SphinxQL compared to the fluent API. Some of this is of course expected, as the fluent API does the heavy lifting of creating the queries and the result objects for you, which comes with a cost. But there’s certainly room for improvement as you can see from the following results with SphinxConnector.NET 4.0:

| Method          | Mean     | Error     | StdDev    | Op/s  | Gen 0   | Allocated |
|---------------- |--------- |---------- |---------- |------ |-------- |---------- |
| SearchSphinxQL  | 7.552 ms | 0.1415 ms | 0.2441 ms | 132.4 | 7.8125  | 45.99 KB  |
| SearchFluentAPI | 9.656 ms | 0.1931 ms | 0.4030 ms | 103.6 | 31.2500 | 131.81 KB |

 

Memory usage with SphinxQL has been reduced by about a third and Gen0 collections are down by over 60%! The reduction seen with the fluent API is even higher with more than 50% of allocations now gone and Gen0 collections reduced to a third of their previous value!

So what did we do to achieve these numbers? In this release we focused on addressing all the obvious issues that popped up during profiling and manual code inspection: avoiding unnecessary copying of data, avoiding or deferring the creation of objects where possible, reusing buffers, etc. This mostly applies to the SphinxQL infrastructure, which needed a major overhaul anyway in order to provide async implementations of the query methods.
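As an illustration of the buffer reuse mentioned above (illustrative only, not the library’s actual code): renting from ArrayPool instead of allocating a fresh array per read keeps Gen0 garbage down.

using System.Buffers;
using System.IO;

class BufferReuseSketch
{
    static void ReadResponse(Stream source)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(8192);   // rent a pooled buffer instead of allocating one
        try
        {
            int read = source.Read(buffer, 0, buffer.Length);
            // ... parse 'read' bytes ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);            // return it so the next read can reuse it
        }
    }
}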

One thing that had a major impact on the fluent API is the use of the awesome FastExpressionCompiler to compile expressions. It not only compiles expression trees faster than Expression.Compile(), but also often produces faster code, both of which result in an increase in Op/s.
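Here’s a minimal, self-contained sketch of that swap (not SphinxConnector.NET’s internal code): the only change is calling CompileFast() instead of Compile() on an expression tree.

using System;
using System.Linq.Expressions;
using FastExpressionCompiler;   // provides the CompileFast() extension method

class Document
{
    public long Id { get; set; }
}

class CompileFastSketch
{
    static void Main()
    {
        Expression<Func<Document, long>> selector = d => d.Id;

        Func<Document, long> compiled     = selector.Compile();      // built-in compiler
        Func<Document, long> compiledFast = selector.CompileFast();  // FastExpressionCompiler

        // Both delegates produce the same result; the second one is compiled faster
        // and often runs faster as well.
        Console.WriteLine(compiled(new Document { Id = 1 }) == compiledFast(new Document { Id = 1 }));
    }
}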

To conclude, SphinxConnector.NET 4.0 comes with major improvements in this area, but there’s still room for more optimizations, which will be done over the next releases, so stay tuned!
