Percentiles aggregation with Redis

We all know that we should monitor pro-actively our applications and response is a natural choice when working on performance. Concerning timings, it’s important to keep this golden rule in mind “Average is bad !” Averages are used too often for in monitoring because they are quite easy/memory efficient to compute (sum of values/number of samples). There are hundreds of posts and articles that explain this and why you should not rely on it (like Anscombe’s quartet). Percentiles are much better, but less easy to compute especially on a large-scale distributed system (big data, multiple sources, …)

Aggregating percentiles is a common problem for everyone and it exists a lot of ways to do it. I highly suggest to read this article from baselab : this is just a must-read if you plan to work on this topic !  One section in this last article refers to “Buckets” and this is exactly the approach we will use here.  “Another approach is to divide the domain into multiple buckets and for each value/occurrence within the bucket boundaries increment this bucket’s count by one”


Increments ? Do you know an appropriate key-value store for this purpose ? Yes, Redis.  My implementation consists of storing all buckets into a redis hash, each bucket being a hash field. The field name is the upper bound and the value is the number of samples for this interval. We use HINCRBY (complexity O(1)) on client/emitter side to increment the number of samples for the target bucket. This will give us some sort of histogram in redis for that event. It could be incremented by one to many clients because commands are isolated and atomic. On the other side (supervisor, admin dashboard, …), in order to get the N-percentile for that hash, we need to get all fields, sort them (hash fields are not sorted by default) and find the exact field name where the aggregated number of samples (starting from the first) is beyond the target number of samples (total samples * N-percentiles).

One particular thing to notice is the bucket size. There is no rule concerning this and it could a constant, linear, exponential, … scale. The choice of the bucket size depends mainly on two factors : the desired accuracy (small size means better accuracy) and the domain (delta between  min/max). For example, a constant bucket size of 50 ( [0-50],[50-100],[100-150], …) is not accurate for small values (ex 130 => [100-150]) but much more accurate for higher values (like 1234 => [1200-1250]). Over the time, I generally use some kind of varying bucket size inspired by HDR Histogram.

Bucket Size/Round Factor Step Lower Value Upper Value
1 0 0 128
2 1 128 384
4 2 384 896
8 3 896 1920
16 4 1920 3968
32 5 3968 8064
64 6 8064 16256
128 7 16256 32640
256 8 32640 65408

Number of redis hash fields

For example, a sample value of 2040 is located in the interval [2032-2048]. This is just an example of a non-linear scale that provides a good accuracy, while keeping an acceptable number of hash fields

The number of hash fields is important because it will directly impact on memory or storage (if configured). Hashes have a hidden memory optimization using ziplist when the number of fields is below a configurable threshold (512 by default). So, we can’t have too many fields/buckets if persistence is required. In practice, depending on the number of samples and the distribution of values, some buckets are often missing. This is not a problem and give us more possibilities to select the bucket size.

Querying Redis

During my work on this topic, my initial intention was to create a LUA script to compute percentiles in redis : we don’t need to get all the fields+values on processing side, and percentile computation is just a matter of sorting/counting hash fields. After several test, I can easily say that it is a very bad idea ! The best version of my script is quite efficient –I suppose-  but taking several ms for the worst cases (eg. thousands of buckets). You should know that only one script can be executed at the same time on redis and I prefer to do more IO than blocking my redis instance. By the way, the complexity of HGETALL is O(1), so quite CPU-efficient.

Precision on client side

This implementation has a nice advantage: the precision is configured on client/emitter side, because the algorithm to compute the bucket is located there. You could change the algorithm at runtime or even have multiple implementations that you can apply to your components. On server-side, it doesn’t matter because you just have to sort and count. This is pretty good when you have multiple kind of apps and components, all having a unique domain (nano seconds, milliseconds…)

Smart Key Pattern

Since now, I’ve considered only one hash, but this is not enough if we want to create beautiful graphs (99th percentile for the last 7 days per hour). A common way –in NoSQL- to solve this problem is to use the smart key pattern. A Key can be much more that a simple constant string value. If we know how we will query the data model later (and we know in this case), let’s add dynamic parts into the key. We’re working with timeseries so a rounded timestamp is a natural choice. All the keys will follow the pattern ‘mykey:hour:roundedts’.

We can’t aggregate percentiles, and there is no “magical algorithm” to compute percentiles from percentiles. If we want to investigate response times during a few minutes, we need aggregations  for every minute. The best way to do this is to increment several smart keys at the same time (minute, 5 minute, hour, day, …) on client side. That’s not a real problem on client side if your binding implementents pipelining.

Limit the impact on client side

We’ve seen that buckets can be incremented quite easily on client side, I mean without spending too much cpu cycles. However, itis not always a good thing. Depending on your applications, a better approach could be to implement some kind of batch processing. If you receive 100 requests during the same second, there is a high probability that you could group your timings (10 requests < 10ms, 30 requests between 10 and 20, 15 between 20 and 30ms, …). This will reduce significantly the number of commands executed.

To illustrate the concept, I’ve created a small gist using nodejs. This example is not prod-ready and far from my current implementation (C#) but it should help you to understand the principles of this approach.

A few line of codes later, and with the help of your favorite chart library, you can visualize this kind of graph. Please take a moment to compare the average and percentiles. Percentiles are definitively muc better !



Performance Tips & Tricks with Xamarin for Android

I was recently involved on a project made with Xamarin. This new mobile app is not released yet, but it’s on a good track and I’m satisfied of the result. The initial objective was to create several Xamarin apps (Brands x Areas  on Android + iOS) with an important focus on code sharing & quality. I can’t talk too much on the software architecture but it’s built with MvvmCross.

This article targets Xamarin for Android (aka MonoDroid) only. I hope to find the time to write the same kind of article for iOS, but most of things listed here are pertinent for both platforms.

Watch out network requests

The primary data source for the app are public web apis. These days, it’s a fairly common architecture for a mobile application but don’t forget that latency on mobile networks matters. On a mobile network, we can’t really expect to get the response in less than 100 ms like on desktop browsers. To illustrate this, we’ve simply generated trace messages on every HTTP request ! (url, headers, and sometimes content) It may seem very basic, but it’s terribly important. This helped us to detect duplicated calls, lack of caching, ghost UI controls, duplicated events handlers …

Here are some HTTP logs at app’s startup. Same color = Same uri. Look at the timings, any problem here?

xam_iosI suggest you enable this kind of logging very early in your project, so everyone will be familiar and educated to mobile latency. Logging all IOs was certainly one of the best idea, but it’s only half of the jjob. To reduce the number of requests you could

  • Use Batch requests (combine several calls into a single request) or design your api following “Scenario Driven Design” principles
  • Use Abuse local caching, for example with Akavache or Sqlite
  • Use modernhttpclient and check gzip compression
  • Use pertinent speculative requests and background refreshe

Watch out memory allocation

My team and I am are used to develop server side code (web sites, web apis, workers, ..), running on servers having decent hardware. Unfortunately, you don’t have –yet- that luxury on a mobile app. I don’t think there are limitations but to be clear your app will have to run with only a few MBs. You should be more concerned on memory than ever. There are mainly two problems :

  • Excessive allocation adds pressure on GC

Garbage Collection & Memory and Performance Best Practices are must read to understand GC mechanisms on Xamarin. Why GC is so bad ? It just stops all running threads, including the UI thread L. Bye Bye, smooth animations and fast rendering. Minor collections are cheap (only a few ms, sometimes 50 ms) but Major collections are quite slow (from 100 ms to 1 sec). Don’t forget that Major GC collects Gen1 and large object space heaps (LOS). The LOS is where objects that require more than 8000 bytes are kept.  8000 is small, terribly small, especially when you have to consume web apis.

  • Memory leak increase major collections and leads to maximum GRefs reached

If you don’t pay attention on memory, your nursery, major heap or LOS will be rapidly full. Since Xamarin.Android 4.1.0, a full  -major- GC is performed when a gref threshold is crossed. This threshold is 90% of the known maximum grefs for the platform: 1800 grefs on the emulator (2000 max), and 46800 grefs on hardware (maximum 52000).

Here are some GC minor messages, visible via logcat. You should always try to understand these messages, and when they happen.

xam_gcs.pngObserve how the LOS size increases after HttpClient calls, a major GC is expected very soon…

Here is another typical example: Suppose translations contains 2000 items, and that this code is executed on each webview. Failed !


To detect excessive allocation and memory leaks, Android Studio are Xamarin Profiler are great. The produced mlpd file by the last tool can also be opened by Heap-Shot (mono).

Optimize your images

iOS has a built-in image optimization process but not Android. Like for Web performance Optimization, it’s better to have small & optimized images for both the GPU and your app size. Especially on Android, where are there are several folders, it is easy to include hundreds of images.

Many online services are available but you can also use directly pngout/opting for png and jpegtran for jpeg files. Just for your information, 25% was saved of the total assets size.

Note : please be sure to not include useless resources

Optimize your webviews

It’s also very common to have webviews in a mobile app. You can think of the WebView as a chromeless browser window that’s typically configured to run fullscreen. It’s sometimes required to explicitly NOT use native code for some topics (login, registration, content boxes with html, ….)  There are two well-known issues with them. First, webviews and their mobile browser counterpart don’t have similar performance profiles (Do u webview ?) ; it’s less accurate since latest versions, but that’s still something to take into account. Second, a webview is often focused on one feature and that’s all. Traditional SPAs has references to all scripts and styles in the main document, leading to a few seconds lost at startup (networks requests, parsing, layout …). That’s a terribly bad idea to integrate an SPA (made with any JS framework), and hope the page to be fully loaded in less than 5 secs. This simply means that your webviews should be optimized for your native app usage.

  • Log every request of your webview
  • Intercept HTTP requests and sometimes replace response
  • Use Javascript interfaces to bootstrap your JS
  • Check HTTP caching headers
  • Optimize your web resources (js, css, images, …) for a webview usage

Here is what you should AVOID (aka our initial attempt to integrate a SPA into a webview)


Optimize your .NET code and keep your prod build clean

Developers often add debug messages, diagnostics code, local variables … don’t forget to cleanup that kind of code. Here is one of my favorite examples:


string.format and JsonConvert are executed even in a RELEASE build.

I will not talk about Asynchronous programming, which is just mandatory to create a responsive and professional app in 2015 but as a general perf recommendation, try to understand the “hot path” on your app/views. For example, the primary serializer used in this app is Newtonsoft.Json. It’s not the fastest .NET json serializer but certainly the more mature on Xamarin. There are some tips to make it go even faster. Using stream for deserialization is also great to avoid allocation on LOS, because the threshold could be easily reached with json format (see memory section, only 8KB).

Finally, like for any app, keep your production build optimized and clean.

  • Use a Release configuration
  • Remove dead code, unused variables & classes
  • Disable Android debugging
  • Review logging & diagnostics

Understand the Android Platform and controls

The app may be written in C#, using Mono/Xamarin … but it’s still running on Android. If you are familiar with Windows development, Xaml or WP, please try to forget everything because it’s a completely different platform. Is it better to use a listview or a recyclerview ? How to implement properly a Pager Adapter ? Is my layout optimized ? So yes, you will have to read a lot of articles here but it’s not a waste of time, because it is very close to the –official- Android development. Most of the articles, tips, SO answers that you can read on Android also apply to Xamarin.Android. I’ve already talked about webviews and we’ve done a lot of things that are specific to Android to get the fastest page load. At the middle of the project, a memory leak was identified on a recyclerview. There were simply too many instances of a custom controls.


Our understanding of the inner PagerAdapter was just totally wrong, and we’ve fixed it after reading several SO answers.

Here the main tricks & tips that help me a lot. I wish I knew this at the beginning. My team and I did a lot of codes fixes & fine tuning that are mostly irrelevant without our  context, that’s why you have to find your way into Xamarin development.  I Hope this will help you.

Storing Time Series in Redis

Last time, I explored how to store time series in Microsoft Azure Table Service. This time I’ll do the same but in Redis. It’s is a very popular key-value store (but not only) and I highly encourage you to review it if you still don’t know it. In addition to my latest post it’s important to note that Time Series have very special properties.

  • Append only
    Most of the time, collectors or emitters append data “at the end”.
  • Readonly mode
    Time Series are never updated. Points are just read to create beautiful charts, compute statistics, etc … Most probably, latest points are more important than the others : it’s common to use last minute, last hour, last day in charts.
  • Big Data
    A simple collector that runs every second creates 86 400 points per day,  2 592 000 per months, 31 536 000 per year… and this is just for a single metric.

There are many ways to implement Time series in Redis, for example by using sorted sets, lists or even hashes. None is really better and the choice depends on your context (yes, again). A few weeks ago, for one of my needs, I needed a very efficient way to store thousands of thousands of data points.  Here are the details of this implementation, heavily inspired by a reposity of @antirez

How it works

The main idea is to use the APPEND command.

If key already exists and is a string, this command appends the value at the end of the string. If key does not exist it is created and set as an empty string, so APPEND will be similar to SET in this special case.

Every time series is stored into a single string containing multiple data points that are mostly ordered in time. So, the library appends every data point as a serialized string terminated by a special character (#).Internally, every data point contains two parts: a timestamp and a value (also delimited by a special character (‘:’).

Now to make things a little bit more efficient, Key Names are generated via a basic algorithm. The key name contains a constant (user-defined) and a rounded timestamp, so we will not add all the points to the same key. The rounding factor depends on your configuration settings. If you set it to one hour, all data points inserted between 4PM and 5PM will go to the same key.

Reading data is done via GETRANGE. Why not a basic GET ? It’s mainly because each key can have a lot of data points. I don’t want to risk an OutOfMemoryException, Excessive Large Heap Allocations or GC … Depending of the rounding factor, it is also possible to hit redis several times. For example, if the rounding factor is set to one hour and you want the last 4 hours.

Pros Cons
Very space efficient Inserts at the end only
Fast inserts (APPEND is O(1)) Unordered list
Up to 512 MB Range queries in a serie not supported (requires fixed-size)
TTL on keys

Introducing RedisTS (Redis Time Series)

I’ve committed an implementation on github and a nuget package is available here. The usage is pretty trivial and you can inspect the test project or samples to understand how to use it.

Note : RedisTS depends on StackExchange.Redis. An open ConnectionMultiplexer is required and redists will never open a new connection or close the current.

At first, you have to create a client you’re your time series.

var options= new TimeSeriesOptions(3600 * 1000, 1, TimeSpan.FromDays(1));
var client = TimeSeriesFactory.New(db, "msts", options);

This operations is cheap and the client is –more or less- stateless. Once you have a client, you can add one or more datapoints. This is typically done via a background task or a scheduled task.

//here I add new datapoint. Date : now and Value : 123456789
client.AddAsync(DateTime.UtcNow, 123456789);

That’s all on client side. Now if you insert data points from one location, it’s expected to read them from another one, maybe to display beautiful charts or compute statistics. For example, if you want to get all the data points since one hour for the serie “myts”, you can use the following piece of code :

client.RangeAsync(DateTime.UtcNow.AddHours(-1), DateTime.UtcNow);

That’s all for today. I hope this will help you and feedback is heavily encouraged. Thanks

Moving to is a free, open source, community-focused unit testing tool for the .NET Framework. Written by the original inventor of NUnit v2, is the latest technology for unit testing C#, F#, VB.NET and other .NET languages. works with ReSharper, CodeRush, TestDriven.NET and Xamarin.”

Many Test Framework are available in .Net : MS Test, NUnit, xUnit, … Testing is an important aspect in Agile and XP and you can’t reach Continuous Deployement without it. Today, we write more and more tests, sometimes we have more tests than production code ; that’s why a decent test framework is so important today.

Why another test framework?

There are several articles & posts on internet explaining the pros and cons of each Test Framework. Here are my –unordered- personal reasons:

  • Nuget ‘All the way’

You don’t need to install a vsix extensions to VS, or manually add a reference to a file-based dependency (Microsoft.VisualStudio.QualityTools.UnitTestFramework.dll…. v10.0 or v10.1?)  or to create a specific Test Project. In xUnit, you just need a basic class Library with a reference to the official Nuget. The VS Test Runner is even a nuget. Everything is distributed via nupkg and it’s terribly simple and efficient.

  • Great community & active development is free and open source. The code is hosted on github (800 commits, 30 contibutors)  and was previously on codeplex. The core team is made of inspired evangelists and very active. The official twitter account has 1400 followers and it’s still growing. xUnit is extensible and a lot of extensions like AutoFixture are already available on

  • Part of the next big thing

Do you know ASP.NET 5, Xamarin, DNX, …? xUnit is already compatible with all of these platforms. xUnit may be the first class citizen and default choice in .net in the future. It’s not useless to spend a couple of hours to understand it.


  • Well integrated in .NET ecosystem

There are troubles to switch to xUnit because it is already well integrated in the .net ecosystem: Team Foundation Server,, AppVeyor, TeamCity, Resharper,, …

  • It helps to write “Better, Faster, Stronger” Tests

It’s maybe the most important reason. I have the impression that parameterized tests (Via Theories), Generic Tests, extensibility via Attributes, Shared Context, … remove a lot of glue and frictions in my test project. Assertions & Attributes are quite common compared to the others test Framework but there are a few additional interesting properties.xunittest

Integrating xUnit with Team Foundation Server

As I said, is already well integrated with several Build Server like AppVeyor or TeamCity. Using it with TFS is not so complicated but we will have a few extra configuration steps.

  1. Download the latest stable version of runner.visualstudio
  2. Change the extension to zip and extract all the files.
  3. 3 assemblies should be in the folder build\_common
  4. The 3 files show to committed somewhere in TFS (up to you)
  5. xunitdllsOpen Team Explorer Tab, Build -> Actions ->Manage Build Controllers, Select Properties. You should fill the TFS path to custom assemblies (xUnit)


That’s all. You can now write your tests and build your solution on the build agent. xUnit will be automatically used.

One particular problem in the current version is that TestFilters/TestCategories are not supported. xUnit has Traits but these ones are not understood by the runner. It’s common to mix unit tests & integration tests in a solution in order to have a better and stronger test strategy.


If you are in this case, you have only two options:

  • Wait for xunit.runner.visualstudio 2.1 (summer 2015 ?)
    A recent commit added support to Test Filters. It’s a question a week to see it on prod. (Note : the fix is available on the public MyGet feed)
  • Create another solution without integration tests
    A little bit painful to maintain but having a clean and fast solution to build is a good practice.

I’m completely convinced by and I hope you too ! If you still don’t know it,  please give it a try.

Being valuable, Being a swiss knife

ultimate-swiss A Swiss Knife is a very popular pocket knife, used by several armies -but not only- and generally has a very sharp blade, as well as various tools, such as screwdrivers, a can opener, and many others. These attachments are stowed inside the handle of the knife through a pivot point mechanism. When I was a child, I owned one and was fascinated by its utility. What a great design! To be clear, my wish, as a software engineer is to be a swiss knife and I am very proud to be considered as is by my co-workers or my managers. I will try to explain in this post why it’s important to me and why every developer should take this into consideration.

The need for a swiss knife

A swiss knife is a highly valuable tool that everyone wants to have in its pocket because it helps you to solve any problem. And yes this is the sad tragedy of the software industry: problems and puzzles are everywhere. Why do I have an exception here? How will you implement this business feature ? What is this crappy code ? What will be the architecture of our future web site? Why our application has miserably failed yesterday? Is this new hype tech/language/framework/tool/… interesting for us? .… Of course, we can’t predict our future problems but solving puzzles is the essence of software engineering. At the end of the month, we’re not paid just for coding but for solving problems. The most important thing is here : each time that we fix up something, we bring value to someone or something: our customers, our company, our co-workers or even ourself. Bringing value is what makes us valuable, one of the most important recognition in our job. Even better, try to avoid puzzles, but it’s another story.

To be a swiss knife you need … to be focused on value

Even nowadays, I can see some developers waiting for tasks, committing quick and dirty solutions, being focused only on their scope and not aligned on company’s objectives … of course this isn’t how we should be. Everyone in the company should be focused on the final product, on the customer, on business value, to help customer support & product owners. Read more on devops culture. Many people can say here “Of course, I am” but don’t be too naive : Being focused only on value means that you should sometimes work on boring tasks, old-fashioned techs, deprecated libraries, crappy code, ..etc… This is the tradeoff

To be a swiss knife you need … to think as a software engineer

An engineer is a professional practitioner of engineering, concerned with applying scientific knowledge, mathematics, and ingenuity to develop solutions for technical, societal and commercial problems.

We’ve already seen several articles that illustrate this concept but please stop to be focused on your favorite stack. NOW ! After a decade, I’ve seen so many things… For example, WebForms were very cool 10 years ago (and yes it was really compared to the others available techs at this period), but at some moment I’ve switched to something else. Because I learned strong foundations, I was able to reuse my skills quite easily in another tech after. Don’t learn to code. Learn to think ! The Silver Bullet Syndrome is a good illustration of this problem. It is the belief that “the next big change in tools, resources or procedures will miraculously or magically solve all of an organization’s problems”. Stop to chase the chimera, it won’t really fix it. If it helps you to fix your current problems, you will quickly see new ones. The Answer to the Ultimate Question of Life, The Universe, and Everything is not “42”, but “it depends on the context”. Like for design patterns, there is absolutely no magic solution for everything, you just need to find a good one, by applying YOUR vision in YOUR context. Like a real craftman, you need tools in your daily job but unlike a real craftman, our tools are constantly evolving. Do you know an Industry in the world with so many flavors & communities that allow you to work on tools released less than two weeks ? Neither do I. You have to learn to think and practices and not technologies, you should not be narrow minded but open to experience and feedbacks.

To be a swiss knife you … don’t need to be a technical expert

There are several technical experts and evangelists in this world (and we need them) but we’re not all dedicated to this future. By the way, it’s so boring to always do the same thing and to always chase the same daemons. We have the illusion that TOP programmers are the most valuable resource for a company but I don’t think so. You no longer need to be a brilliant programmer to bring success to your company or to achieve success. You just have to solve problems and the good news is that it’s fairly useless to stay hours in front of our screen to reinvent the wheel in each project. On one side, you have a supercomputer in your pocket, another supercomputer on your desk, and dozens of supercomputers in the cloud. On another side, thousands of open source frameworks and libraries can do 90% of the work for you. GitHub, Wikipedia, Stack Overflow, and of course the very wide range of articles, tutorials, feedbacks, posts available on the Internet. The hard part of building software is specification, design, modelling … not how you will write code. Architecture, Monitoring & Diagnostics, TestS, Ownership, Continuous delivery, maintainability, Performance & Scalability, … so many things that should be part of your software.

To be a swiss knife you should … practice

The positive side effect of considering myself as a swiss knife is for learning new stuff and run experimentation. We know that it’s vital for us to learn because our world is moving so fast. When I started to work, it was very different and I know it will be different in 10 years too. Compared to a traditional developers attached to its “confort zone”, I have much more occasions to try & test new things in a real world examples. At the end, if it doesn’t work properly, if I takes too much times, … it doesn’t matter because I am my own product owner. All these tests give me more arguments, experiences, more skills, to become a better problem solver aka a better swiss knife. It’s a way to be pro-active : I have a wide range of things in my pocket to help me in every situation. To conclude, I prefer to know thousands of topics -imperfectly- compared to mastering just only one. I know that I have enough skills and motivations to adapt myself to any problem. A new stack to learn? Ok, let me several days, I’ll do it. It doesn’t really matter the number of skills that you have, what is important is to be able to adapt to each situation, as a soldier with a swiss knife. I know that if I have to work on an unknow topic, I will spent my hours to learn it in order to be efficient and in order to make the right decisions when it’s needed.

Storing time series in Microsoft Azure Table Storage

In a recent project, I needed to store time series into Microsoft Azure Table Storage. As for most of NoSQL databases, the design is very important and you have to mainly think about your access patterns.

A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. Time series data often arise when monitoring an application or tracking business metric, but occur naturally in many application areas like finance, economics, medicine …


Time series are very frequently plotted via –beautiful- charts. Here is an example on availability on Application Insight. In this example, the end user can view response time for several time ranges. Data is aggregated depending on the selected time range (every second for last hour, every minute for last day, every 5 minutes for last 48 hours, …) to keep it easy to understand.

If you’re unfamiliar with Table Service on Azure, I highly recommend you to read the Introduction then the Design Guide ; after reading both articles, I hope you will understand why storing time series is not so easy and needs this kind of article.

To illustrate following designs, I use random generated data, basically one point/second. Having one data point per second makes it very clear and easy to read. In your implementation, you can have more than one point per second; if doesn’t matter.

Basic design

A first design could be to create a basic entity for each data points, like this one

public class DataPoint : TableEntity
    public long Value { get; set; }
    public DataPoint() { }
    public DataPoint(string source, long timeStamp, long value)
        this.PartitionKey = source;
        this.RowKey = timeStamp.ToString();
        this.Value = value;

Why a timestamp for the time property? It’s simply a better choice because RowKey is a string (constraint by Table Service). Ticks could be an idea but I prefer a common value (Unix timestamp) for time axis, so any client could request this table via REST endpoint.

This will produce this kind of results


The RowKey equals here to the Unix timestamp in seconds (due to my test data) but it could be the traditional Unit timestamp (ms). Now If you want to query entities for a specific time range (two hours), you need to use a range query on the RowKey. This is the 2nd most efficient query type on table storage. Here is an example for the first two hours of 2015:

// time range : first two hours of 2015
var from = new DateTime(2015, 1, 1, 0, 0, 0);
var to = new DateTime(2015, 1, 1, 2, 0, 0);

// generate the range query
string filter = TableQuery.CombineFilters(
    TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "mysource"),
        TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, ToSecondsTimestamp(from).ToString()),
        TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, ToSecondsTimestamp(to).ToString()))

TableQuery<DataPoint> query = new TableQuery<DataPoint>().Where(filter);
var results = table.ExecuteQuery<DataPoint>(query);

The generated filter is $filter= (PartitionKey eq ‘mysource’) and ((RowKey ge ‘1420070400’) and (RowKey le ‘1420077600’))

This design seems to be good but it isn’t for real world scenarios. First, the rule #1 of Table Service (scalability) is not followed here. Performance is not the same for 1 000 rows and 1 000 000 000 rows. Having a hot partition (always the same PartitionKey) is an anti-scalability on Table Service. Second, the number of data points included in the desired time range affects performance : in my example, for 2 hours, we will receive 7200 entities (one entity for every second). The REST service limits the number of returned entities to 1000 per request. So, we will have to execute several requests to get all entities for the selected time range. Fortunately the .net will silently do this job for us but it’s not the case for all clients. And what about last 24 hours ? What about data every ms ?

Pros Cons
Natural representation No Scalability (hot partition)
Easy to query with Range queries Inefficient queries (limited to 1000 entities per request)
  Slow inserts (limited to 100 points by EGT)

A better approach could be to store row in reverse order. New data points are automatically added at the beginning of the table. Queries should be a little more efficient, but not so many in facts.

Advanced Design

The key idea with time series is that one dimension is well-known: the time. Every request contains a lower bound (from) and an upper bound (to). So, we will use the Compound key pattern to compute smart Partition & Row Keys thus enabling a client to lookup related data with an efficient query. A very common technique when working with time series is to round the time value with a magic factor. To fully understand these magic factors, you need one of my gists. ToRoundedSecondsTimestamp() is heavily used in the my examples. For example Tuesday, 28 April 2015 12:04:35 UTC could be rounded to

Factor Rounded timestamp Datetime
1 (no factor) 1430222735 Tuesday, 28 April 2015 12:04:35 UTC
60 (one hour) 1430222700 Tuesday, 28 April 2015 12:04:00 UTC
3600 (one hour) 1430222400 Tuesday, 28 April 2015 12:00:00 UTC
86400 (one day) 1430179200 Tuesday, 28 April 2015 00:00:00 UTC

In this design, each row contains more than one data points; basically all points included in the current step (between the current rounded timestamp and the next rounded timestamp). This is possible thanks to DynamicTableEntity in Table Service. Just to illustrate the concept, here is an example with PartitionFactor = 60 secs and RowFactor = 5 secs.


Notice how PartitionKey/RowKey are now rounded timestamps. This is exactly the same data in my first design. In this example, each row contains 5 data points and each partition contains 12 rows. Of course, in a real scenario, our objective is to maximize the number of points in a single row.

To get the first two hours of 2015, the query is a little bit more complicated but not less efficient.

var from_ts = ToRoundedSecondsTimestamp(new DateTime(2015, 1, 1, 0, 0, 0), RowFactor);
var to_ts = ToRoundedSecondsTimestamp(new DateTime(2015, 1, 1, 2, 0, 0).AddSeconds(-1), RowFactor);

// generate all row keys in the time range
var nbRows = (to_ts - from_ts) / RowFactor + 1;
var rowKeys = new long[nbRows];

for (var i = 0; i < nbRows; i++)
    rowKeys[i] = from_ts + i * RowFactor;

// group row keys by partition
var partitionKeys = rowKeys.GroupBy(r => ToRoundedTimestamp(r, PartitionFactor));

var partitionFilters = new List<string>();
foreach (var part in partitionKeys)
    // PartitionKey = X and (RowKey=Y or RowKey=Y+1 or RowKey=Y+2 ...)
    string partitionFilter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, part.Key.ToString());
    string rowsFilter = string.Join(" " + TableOperators.Or + " ", part.Select(r => TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, r.ToString())));
    string combinedFilter = TableQuery.CombineFilters(partitionFilter, TableOperators.And, rowsFilter);

// combine all filters
string final = string.Join(" " + TableOperators.Or + " ", partitionFilters);
var query = new TableQuery<DynamicTableEntity>().Where(final);
var res = table.ExecuteQuery(query);

//do something with results...

The generated filter (with partition Factor = one hour, row Factor =4 minutes) is $filter= (PartitionKey eq ‘1420070400’) and (RowKey eq ‘1420070400’ or RowKey eq ‘1420070640’ or RowKey eq ‘1420070880’ or RowKey eq ‘1420071120’ or RowKey eq ‘1420071360’ or RowKey eq ‘1420071600’ or RowKey eq ‘1420071840’ or RowKey eq ‘1420072080’ or RowKey eq ‘1420072320’ or RowKey eq ‘1420072560’ or RowKey eq ‘1420072800’ or RowKey eq ‘1420073040’ or RowKey eq ‘1420073280’ or RowKey eq ‘1420073520’ or RowKey eq ‘1420073760’) or (PartitionKey eq ‘1420074000’) and (RowKey eq ‘1420074000’ or RowKey eq ‘1420074240’ or RowKey eq ‘1420074480’ or RowKey eq ‘1420074720’ or RowKey eq ‘1420074960’ or RowKey eq ‘1420075200’ or RowKey eq ‘1420075440’ or RowKey eq ‘1420075680’ or RowKey eq ‘1420075920’ or RowKey eq ‘1420076160’ or RowKey eq ‘1420076400’ or RowKey eq ‘1420076640’ or RowKey eq ‘1420076880’ or RowKey eq ‘1420077120’ or RowKey eq ‘1420077360’)

So, what are good PartitionKey/RowKey factors? It’s up to you and depends on your context but there are two constraints. First, a table service entity is limited to 1 MB with a maximum of 255 properties (including the PartitionKey, RowKey, and Timestamp). If you have data every second, 4 mins (60*4 = 240) seems to be a good RowKey Factor, but if you have several points per second it won’t be pertinent. The second problem is the filter length. Having too many partitions and rows will create a very long filter that can be rejected by the table service (HTTP 414 “Request URI too long”). In this case, you can execute partitions scans by removing row keys in the produced filter but this will be less efficient (depends on the number of rows in each partition)

Pros Cons
Highly scalable Point queries are not always possible (Uri length limits)
Fast inserts (if required) Granularity should be defined at the beginning (how many rows/partition ? how many points ?)
Dev tools can’t be used with dynamic columns

To conclude, Microsoft Azure Table Storage is a fast and very powerful service that you should not forget to use for your application, even if they are not hosted on Azure. However you should use proper designs for your tables, very far from traditional relational designs. Choosing a good couple of PartitionKey/RowKey is very important. We covered in this article a simple use case (time series) but there are so many scenarios… In terms of pricing, it’s very cheap ; but your choice should not be driven by costs here, but by scalability and performance.

Source code is available here : Basic and Advanced 

How to avoid 26 API requests on your page?

The problem

Create an applications relying on web APIs seems to be quite popular these days. There is already an impressive collection of ready-to-use public APIs (check it at ) that we can consume to create mashup, or add features into our web sites.

In the meantime, it has never been so easy to create your own REST-like API with node.js, web api, ruby or whatever tech you want. It’s also very common to create your own private/restricted API for your SPA, Cross-platform mobiles apps or your own IoT device. The naïve approach, when building our web API, is to add one API method for every feature ; at the end, we got a well-architectured and brilliant web API following the Separation of Concerns principle : one method for each feature. Let’s put it all together in your client … it’s a drama in terms of web performance for the end user and there is never less than 100s requests/sec  on staging environment; Look at your page, there are 26 API calls on your home page !

I talk here about web apps but it’s pretty the same for mobile native applications .RTT and latency is much more important than bandwidth speed. It’s impossible to create a responsive and efficient application with a chatty web API.

The proper approach

At the beginning of December 2014, I attended at the third edition in APIdays in Paris. There was an interesting session –among the others- on Scenario Driven Design by @ijansch.

The fundamental concept in any RESTful API is the resource. It’s an abstract concept and it’s merely different from a data resource. A resource should not be a raw data model (the result of an SQL query for the Web) but should be defined with client usage in mind. “REST is not an excuse to expose your raw data model. “ With this kind of approach, you will create dump clients and smart APIs with a thick business logic.

A common myth is “Ok, but we don’t know how our API is consumed, that’s why we expose raw data”. Most of the times, it’s false. Let’s take the example of twitter timelines. They are list of tweets or messages displayed in the order in which they were sent, with the most recent on top. This is a very common feature and you can see timelines in every twitter client. Twitter exposes a timeline API and API clients just have to call this API to get timelines. Especially, clients don’t have to compute timelines by themselves, by requesting XX times the twitter API for friends, tweets of friends, etc …

I think this is an important idea to keep in mind when designing our APIS. generally,  We don’t need to be so RESTful (Wwhat about HATEOS ?). Think more about API usability and scenarios, that RESTfullness.

The slides of this session are available here.

Another not-so-new approach: Batch requests

Reducing the number of request from a client is a common and well-known Web Performance Optimization technique. Instead of several small images, it’s better to use sprites. Instead of many js library files, it’s better to combine them. Instead of several API calls, we can use batch requests.

Batch requests are not REST-compliant, but we already know that we should sometimes break the rules to have better performance and better scaliblity.

If you find yourself in need of a batch operation, then most likely you just haven’t defined enough resources., Roy T. Fieldin, Father of REST

What is a batch request ?

A batch request contains several different API requests into a single POST request. HTTP provides a special content type for this kind of scenario: Multipart. On server-side, requests are unpacked and dispatched to the appropriate API methods. All responses are packed together and sent back to the client as a single HTTP response.

Here is an example of a batch request:


POST http://localhost:9000/api/batch HTTP/1.1
Content-Type: multipart/mixed; boundary="1418988512147"
Content-Length: 361

Content-Type: application/http; msgtype=request

GET /get1 HTTP/1.1
Host: localhost:9000

Content-Type: application/http; msgtype=request

GET /get2 HTTP/1.1
Host: localhost:9000

Content-Type: application/http; msgtype=request

GET /get3 HTTP/1.1
Host: localhost:9000



HTTP/1.1 200 OK
Content-Length: 561
Content-Type: multipart/mixed; boundary="91b1788f-6aec-44a9-a04f-84a687b9d180"
Server: Microsoft-HTTPAPI/2.0
Date: Fri, 19 Dec 2014 11:28:35 GMT

Content-Type: application/http; msgtype=response

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

"I am Get1 !"
Content-Type: application/http; msgtype=response

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

"I am Get2 !"
Content-Type: application/http; msgtype=response

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

"I am Get3 !"

Batch requests are already supported by many web Frameworks and allowed by many API providers: web api, a node.js module, Google Cloud platform, Facebook, Stackoverflow,Twitter …

Batch support in web api

To support batch requests in your web API, you just have to add a new custom route  :

routeName: "batch",
routeTemplate: "api/batch",
batchHandler: new DefaultHttpBatchHandler(GlobalConfiguration.DefaultServer)

Tip : the DefaultBatchHandler doesn’t provide a way to limit the number of requests in a batch. To avoid performance issues, we may want to limit to 100/1000/… concurrent requests. You have to create your own implementation by inheriting DefaultHttpBatchHandler.

This new endpoint will allow client to send batch requests and you have nothing else to do on server-side. On client side,  to send batch requests, you can use this jquery.batch, batchjs, angular-http-batcher module, …

I will not explain all details here but there is an interesting feature provided by DefaultHttpBatchHandler : the property ExecutionOrder allow to choose between sequential or no sequential processing order. Thanks to the TAP programming model, it’s possible to execute API requests in parallel (for true async API methods)

Here is the result of async/sync batch requests for a pack of three 3 web methods taking one second to be processed.


Finally, Batch requests are not a must-have feature but it’s certainly something to keep in mind. It can help a lot in some situations. A very simple demo application is available here. Run this console app or try to browse localhost:9000/index.html. From my point of view here are some Pros/Cons of this approach.

Pros Cons
Better client performance (less calls) May increase complexity of client code
Really easy to implement on server side Hide real clients scenarios, not REST compliant
Parallel requests processing on server side Should limit batch size at server level for public API
Allow to use GET, POST, PUT, DELETE, … Browser cache may not work properly