Why is GraphQL not suitable for reports?
This article describes my personal experience of using GraphQL for reports in an ASP.NET Core application. I’m not claiming that my solution is the best one; my focus was on keeping maintenance costs low and the implementation simple.
I will provide only a high-level overview of GraphQL without diving deep into more advanced features like Schema Stitching, etc.
I’m open to discussion about this topic, so feel free to prove me wrong :)
What is GraphQL?
GraphQL is a query language for APIs that lets clients run dynamic queries against application data.
The main benefit is that the API client can specify exactly which data it wants to retrieve. GraphQL is a beautiful tool for serving different projections of the same normalized source data through a single API.
For example, an e-commerce project may use it to retrieve different amounts of product information, depending on the current screen.
How did I try to use GraphQL?
Imagine a read-only reporting application as an example. Read-only means that the data is already collected and validated elsewhere and then just pushed into our database.
The responsibility of our application is to aggregate and show data depending on selected dimensions.
For instance, let’s imagine a denormalized table containing metric values (such as sales revenue or the number of new customers) that we need to aggregate by TenantId and display.
In this example:
- MetricValues is a denormalized table that contains a metric value per Tenant, Country, Region, and Date.
- Metrics is a table that contains a description of each metric.
- Tenants is a table that contains information about the tenants of the multitenant application that is the source of the metrics.
At first glance, GraphQL offers a huge benefit here, as it allows filtering this denormalized table by any dimension we want. So we can build multiple different reports on top of the same structure.
Let’s create a GraphQL endpoint with HotChocolate (one of the GraphQL packages for .NET):
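A minimal sketch of such an endpoint is shown here. It assumes an ASP.NET Core web project with implicit usings and the HotChocolate.AspNetCore and HotChocolate.Data packages. The MetricValue class, its properties, and the in-memory sample rows are hypothetical stand-ins for the real table.

```csharp
// Program.cs of an ASP.NET Core web project (implicit usings enabled).
// Assumed NuGet packages: HotChocolate.AspNetCore and HotChocolate.Data.
using HotChocolate.Data;

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddGraphQLServer()
    .AddQueryType<Query>()
    .AddFiltering()   // enables the generated "where" argument
    .AddSorting();    // enables the generated "order" argument

var app = builder.Build();
app.MapGraphQL();     // maps the GraphQL endpoint (default path: /graphql)
app.Run();

// Hypothetical shape of one row of the denormalized MetricValues table.
public class MetricValue
{
    public int TenantId { get; set; }
    public int MetricId { get; set; }
    public string Country { get; set; } = "";
    public string Region { get; set; } = "";
    public DateTime Date { get; set; }
    public decimal Value { get; set; }
}

public class Query
{
    // In-memory stand-in for the database table (assumption of this sketch).
    private static readonly List<MetricValue> Rows = new()
    {
        new() { TenantId = 1, MetricId = 1, Country = "DE", Region = "EU", Date = new DateTime(2021, 1, 1), Value = 100m },
        new() { TenantId = 2, MetricId = 1, Country = "US", Region = "NA", Date = new DateTime(2021, 1, 1), Value = 70m },
    };

    // HotChocolate translates the client's filter and sort arguments
    // into Where/OrderBy calls on the returned IQueryable.
    [UseFiltering]
    [UseSorting]
    public IQueryable<MetricValue> GetMetricValues() => Rows.AsQueryable();
}
```

With this in place, a client should be able to send a query such as { metricValues(where: { country: { eq: "DE" } }) { tenantId value } } and get back only the matching raw rows.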
It looks pretty straightforward, but a question immediately arises: how can we aggregate the metric value by tenant?
The problem of Data Aggregations
Group data on the client
The first and simplest option would be to group data on the client and calculate all aggregated values there.
This solution might seem feasible on a small scale. On a larger scale, however, the performance impact will be dramatic: databases are optimized for filtering and aggregating large data sets, while client applications don’t have the proper tools for that.
Also, transferring the raw denormalized rows increases network traffic and the time needed to request and render the data.
This solution will definitely increase the cost of maintenance and will create performance issues in the future.
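To make the cost concrete, here is a small sketch of client-side grouping, written as a C# console program with plain LINQ. The MetricValue record and the sample rows are assumptions made for illustration. Note that every raw row has to reach the client before the first total can be computed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rows as the client would receive them from the API: all raw rows
// are transferred before any aggregation can start.
var rows = new List<MetricValue>
{
    new(1, "DE", "EU", new DateTime(2021, 1, 1), 100m),
    new(1, "FR", "EU", new DateTime(2021, 1, 1), 50m),
    new(2, "US", "NA", new DateTime(2021, 1, 1), 70m),
    new(2, "US", "NA", new DateTime(2021, 1, 2), 30m),
};

// The aggregation runs in client memory, not in the database.
var totalsByTenant = rows
    .GroupBy(r => r.TenantId)
    .Select(g => new { TenantId = g.Key, Total = g.Sum(r => r.Value) })
    .OrderBy(t => t.TenantId);

foreach (var t in totalsByTenant)
    Console.WriteLine($"Tenant {t.TenantId}: {t.Total}");
// Prints:
// Tenant 1: 150
// Tenant 2: 100

// Hypothetical shape of one denormalized row.
public record MetricValue(int TenantId, string Country, string Region, DateTime Date, decimal Value);
```

Four rows are harmless, but the same code has to download and hold every row of the table once the report covers real data.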
Group data inside the Query class
Another option would be to group data inside the query class.
However, with this approach, we lose the ability to filter by dimensions, which was the whole point of using GraphQL in the first place.
We could add a parameter for each dimension and filter by those.
However, such an approach does not differ from using ASP.NET Core controllers to serve the data.
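The parameter-based variant can be sketched as follows. This is plain C# and LINQ without any GraphQL package, and the names Query, TenantTotal, and GetTotalsByTenant, as well as the sample rows, are hypothetical. The point is that the result type exposes only TenantId and Total, so every dimension we want to filter by must be added to the method signature by hand.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var query = new Query();

// No filter: totals over all rows.
foreach (var t in query.GetTotalsByTenant())
    Console.WriteLine($"Tenant {t.TenantId}: {t.Total}");

// Filtering works only through the parameters we added manually.
foreach (var t in query.GetTotalsByTenant(country: "US"))
    Console.WriteLine($"US only, tenant {t.TenantId}: {t.Total}");

// Hypothetical row and result shapes.
public record MetricValue(int TenantId, string Country, string Region, DateTime Date, decimal Value);
public record TenantTotal(int TenantId, decimal Total);

public class Query
{
    // In-memory stand-in for the database table (assumption of this sketch).
    private static readonly List<MetricValue> Rows = new()
    {
        new(1, "DE", "EU", new DateTime(2021, 1, 1), 100m),
        new(1, "FR", "EU", new DateTime(2021, 1, 1), 50m),
        new(2, "US", "NA", new DateTime(2021, 1, 1), 70m),
        new(2, "US", "NA", new DateTime(2021, 1, 2), 30m),
    };

    // Country and Region are gone from the result type, so each dimension
    // needs its own explicit parameter, just like in a controller action.
    public IQueryable<TenantTotal> GetTotalsByTenant(string? country = null, string? region = null)
    {
        IQueryable<MetricValue> source = Rows.AsQueryable();
        if (country is not null) source = source.Where(r => r.Country == country);
        if (region is not null) source = source.Where(r => r.Region == region);

        return source
            .GroupBy(r => r.TenantId)
            .Select(g => new TenantTotal(g.Key, g.Sum(r => r.Value)));
    }
}
```

Adding a filter by Date, or by any new dimension, means changing this method again, which is exactly the maintenance cost a dynamic query layer was supposed to remove.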
Group data with a custom HotChocolate extension
There is no clear solution here, so I assume that such extensions would have to be introduced case by case, resulting in a more complicated development process. That’s why I did not consider this option: we would have to dive deep into package-specific configuration every time we need a new report.
After analyzing all the challenges GraphQL created when used together with denormalized data, I switched from GraphQL to my own package called QueryNinja.
The functionality is pretty much the same. However, it allows applying filters, sorting rules, and projections in the middle of the IQueryable chain. Let’s take a look at our example report implemented with QueryNinja.
As we can see, we need to call the .WithQuery() extension method on the desired IQueryable instance; it appends all of the specified filters and ordering rules in place. The IQuery instance contains the filters and sorting rules and is deserialized from the request query parameters. More details are in the GitHub wiki.
With this approach, we have an opportunity to filter denormalized data first and then aggregate it as we need.
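The order of operations can be illustrated with plain LINQ. This sketch does not use QueryNinja's API: ApplyFilters is a hypothetical helper, and the hard-coded filter list stands in for filters that would normally be built from the request's query parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var rows = new List<MetricValue>
{
    new(1, "DE", "EU", new DateTime(2021, 1, 1), 100m),
    new(1, "FR", "EU", new DateTime(2021, 1, 1), 50m),
    new(2, "US", "NA", new DateTime(2021, 1, 1), 70m),
    new(2, "DE", "EU", new DateTime(2021, 1, 2), 30m),
}.AsQueryable();

// Stand-in for filters deserialized from the request (assumption).
var filters = new List<Expression<Func<MetricValue, bool>>>
{
    r => r.Region == "EU",
    r => r.Date >= new DateTime(2021, 1, 1),
};

var report = rows
    .ApplyFilters(filters)        // 1. filter the raw denormalized rows
    .GroupBy(r => r.TenantId)     // 2. aggregate only what is left
    .Select(g => new { TenantId = g.Key, Total = g.Sum(r => r.Value) });

foreach (var line in report)
    Console.WriteLine($"Tenant {line.TenantId}: {line.Total}");
// Prints:
// Tenant 1: 150
// Tenant 2: 30

// Hypothetical shape of one denormalized row.
public record MetricValue(int TenantId, string Country, string Region, DateTime Date, decimal Value);

public static class QueryableFilterExtensions
{
    // Hypothetical helper (not part of QueryNinja): chains one Where call
    // per filter onto the source before any grouping happens.
    public static IQueryable<T> ApplyFilters<T>(
        this IQueryable<T> source,
        IEnumerable<Expression<Func<T, bool>>> filters)
        => filters.Aggregate(source, (current, filter) => current.Where(filter));
}
```

Because the filters are expression trees applied to an IQueryable, a database provider could translate both the filtering and the grouping into a single query, so only the aggregated rows leave the database.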
This article describes a problem that I was dealing with recently. I did my best to research all possible options so that I could drop my own package in favor of already existing GraphQL packages. Including:
I do not claim that my package is nearly as good as the GraphQL-related packages, and I appreciate the work done by the GraphQL community.
GraphQL still has powerful advantages, such as:
- The schema allows you to verify your request before you execute it.
- Schema Stitching allows joining multiple different data sources under the same GraphQL endpoint.
- Security rules can be defined to limit malicious usage of the GraphQL endpoint.
However, this particular case of reporting applications raises complex issues that are hard to overcome with existing GraphQL implementations.