Performance Improvements in SphinxConnector.NET 4.0

by Dennis 23. March 2018 14:02

As we're preparing the release of SphinxConnector.NET 4.0 I'd like to share some improvements we have made since V3. Lets start with performance: as many of you might be aware, there currently is a strong focus in the .NET ecosystem on improving performance, often done by reducing memory consumption, especially in low-level code. Case in point: .NET Core. The development of .NET Core is driven by a strong focus on providing great performance. This is accomplished, among other things, by using less memory to eliminate unnecessary garbage that would need to be collected by the GC. This is especially true for the aforementioned low-level stuff that might be called several hundred or even thousand times a second,e.g. in a very busy web application or API.

During the development of SphinxConnector.NET we took it upon ourselves to reduce the memory that is being used by the fluent API and SphinxQL. Following are two benchmarks (created with the great BenchmarkDotNet) based on our Stackoverflow with Sphinx sample web app. The benchmarked methods are based on the search method from the sample, which executes three queries: the search itself and two facet queries, so it resembles a scenario which is akin to what is done in a real-world application. The same queries are executed via plain SphinxQL and the fluent API. The benchmark is run against Manticore Search 2.6.2 on a Ubuntu Server 16 VM with 1 GB RAM and 1 CPU core, the search term used is ‘c#’.

BenchmarkDotNet=v0.10.12, OS=Windows 7 SP1 (6.1.7601.0)
Intel Core i5-3570 CPU 3.40GHz (Ivy Bridge), 1 CPU, 4 logical cores and 4 physical cores
Frequency=3330156 Hz, Resolution=300.2862 ns, Timer=TSC
  [Host]     : .NET Framework 4.7 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2558.0
  Job-MZUNZG : .NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2558.0
Jit=RyuJit  Platform=X64  Force=False  
MethodMeanErrorStdDevOp/sGen 0Allocated
SearchSphinxQL 7.421 ms 0.1472 ms 0.1305 ms 134.76 23.4375 72.07 KB
SearchFluentAPI 10.555 ms 0.2626 ms 0.2579 ms 94.75 93.7500 314.95 KB

 

As you can see, there’s significantly less memory being allocated when queries are executed directly via SphinxQL compared to the fluent API. Some of this is of course expected, as the fluent API does the heavy lifting of creating the queries and the result objects for you, which comes with a cost. But there’s certainly room for improvement as you can see from the following results with SphinxConnector.NET 4.0:

MethodMeanErrorStdDevOp/sGen 0Allocated
SearchSphinxQL 7.552 ms 0.1415 ms 0.2441 ms 132.4 7.8125 45.99 KB
SearchFluentAPI 9.656 ms 0.1931 ms 0.4030 ms 103.6 31.2500 131.81 KB

 

Memory usage with SphinxQL has been reduced by about a third and Gen0 collections are down by over 60%! The reduction seen with the fluent API is even higher with more than 50% of allocations now gone and Gen0 collections reduced to a third of their previous value!

So what did we do to achieve these numbers? In this release we focused on addressing all the obvious issues that popped up under profiling and manual code inspections: avoid unnecessary copying of data, avoid or defer the creation of objects when possible, reuse buffers etc. This mostly applies to the SphinxQL infrastructure which needed a major overhaul anyway, to provide async implementations of query methods.

One thing that had a major impact on the fluent API, is the use of the awesome FastExpressionCompiler to compile expressions. It not only compiles expression trees faster than Expression.Compile() but also often produces faster code, which both results in an increase of Op/s.

To conclude, SphinxConnector.NET 4.0 comes with major improvements in this area, but there’s still room for more optimizations, which will be done over the next releases, so stay tuned!

Tags: ,

Optimized Attribute Filtering with SphinxConnector.NET’s Fluent API

by Dennis 1. February 2013 12:11

An interesting article over at the MySQL Performance Blog was recently published about optimizing Sphinx queries that only filter by an attribute (i.e. do not contain a full-text query). I recommend reading the article first and then coming back here, but here’s a quick summary: a Sphinx query that only filters by an attribute may be relatively slow compared to an equivalent query in a regular DBMS. The reason for this is the fact that one cannot create indexes (as in B-tree indexes) for attributes in Sphinx as one would do in a DBMS. So to retrieve the results of such a query, Sphinx has to perform a full-scan of the index which is relatively costly depending on the size of the index.

The article describes a neat trick to get around this limitation: by adding a full-text indexed field for an attribute and querying that, one can achieve a greatly improved query time. In this post I’d like to demonstrate how this technique can be used with SphinxConnector.NET’s fluent API in conjunction with a real-time index.

The index in the articles example contains data about books, so I’ll be using that here as well. These documents have an integer attribute for a user id that we’d like to store as a full-text indexed field. Let’s take a look at what the document model should look like and which additional settings need to be applied.

To add the user id attribute to the full-text index it needs to be converted to a string. We’ll also add a prefix to each value to avoid it being included in the results of a “regular” full-text query. To do this, we add a string property to the document model that returns the converted and prefixed value:

public class CatalogItem
{
    public int Id { get; set; }

    public int UserId { get; set; }

    public string Title { get; set; }

    public string UserIdKey
    {
        get { return "userkey_" + UserId; }
    }
}

As of Version 3.2, SphinxConnector.NET will automatically exclude any read-only property when selecting the results of a query, so no further setup is required here (it will of course still be inserted into the index during a save).

In previous versions of SphinxConnector.NET the UserIdKey property would have to be configured as follows:

fulltextStore.Conventions.IsFulltextFieldOnly = memberInfo => memberInfo.Name == "UserIdKey";

A query that uses the new attribute would then look this:

IList<CatalogItem> results = session.Query<CatalogItem>().
                                     Match("@UserIdKey userkey_42").
                                     ToList();

For the sake of completeness, here’s the corresponding Sphinx configuration:

index catalog
{
    type = rt
    path = catalog
rt_field = title rt_field = useridkey rt_attr_string = title rt_attr_uint = userid }

Tags: , ,

How-to