Context

Aggregations are a powerful feature of the public API, allowing users to compute summary statistics such as totals, averages, and counts directly on the server.
This reduces the amount of data transferred to the client, optimizing performance and ensuring efficiency in data analysis.

The API enables aggregations through grouping and aggregation functions.
By leveraging these capabilities, users can obtain pre-aggregated data in a structured manner, which is particularly useful in financial and accounting applications.

A key advantage is that users can filter the data to be aggregated using all filtering options supported by DataLoaders.
This means filters can be applied not only to the main entity but also to its parent entity (if it exists) and its child entities (if they exist).
This provides extensive flexibility in refining aggregated results, ensuring that only relevant data is considered while maintaining optimal performance.

There is also a simpler alternative to perform aggregations without manually configuring grouping, filters, and aggregation parameters.
The Aggregation API provides a guided approach based on a predefined catalog of aggregations:

This approach is recommended for most use cases and does not require knowledge of the underlying business queries.

Some Use Case Examples

Advantages

Example of Query

The following query demonstrates how aggregations work.
It retrieves accounting entry lines where the subAccount code starts with 6 or 7, within the specified date range, and computes total sums for creditAmount and debitAmount, grouped by subAccountId and accountingEntry.date (by Quarter).

Only the aggregation results and grouping values are returned in the response, not the detailed lines used for the calculations.

query {
  accountingEntryLines(
    where: {
      and: [
        { accountingEntry: { date: { gte: "2021-01-01", lte: "2025-01-31" } } }
        { subAccount: {
            or: [
              { code: { startsWith: "6" } },
              { code: { startsWith: "7" } }
            ]
          }
        }
      ]
    }
  ) {
    edges {
      node {
        creditAmount
        debitAmount
        subAccountId
        accountingEntry {
          date
        }
      }
    }
    aggregate(
      aggregateFields: ["creditAmount", "debitAmount"],
      groupFields: ["subAccountId", "accountingEntry.date|Quarter"]
    ) {
      groupValues {
        name
        value
      }
      aggregates {
        name
        value
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}

Definition of the aggregate Block

Aggregations work by grouping data using specified fields and then applying aggregation functions to compute summary values. Additionally, sorting can optionally be applied in ascending or descending order.

Aggregation is defined using three main components:

aggregate(
   aggregateFields: ["creditAmount", "debitAmount"],
   groupFields: ["subAccountId", "accountingEntry.date|Quarter"],
   orderFields: ["creditAmount|DESC"]
)

1. Aggregate Data

The aggregateFields array, which specifies the fields to aggregate, is mandatory.

Available aggregations are:

Aggregation Type Keyword Description Field Type
Sum SUM Returns the sum, default if no keyword is specified Decimal
Average AVG Returns the arithmetic mean Decimal
Minimum MIN Returns the smallest value Decimal
Maximum MAX Returns the largest value Decimal
Distinct Count DISTINCTCOUNT Counts the number of unique values for an Id field Id
Count COUNT Counts the number of values (use __count as field name) __count (constant)

By default, specifying ["creditAmount", "debitAmount"] in aggregateFields will return the sum of creditAmount and debitAmount for each group defined in groupFields.

Example with SUM
aggregateFields: ["creditAmount", "debitAmount"]

To use other aggregation types, you can append a | followed by the aggregation keyword to each field name.
Although it is implicit, you may also choose to explicitly mention SUM.

Example with MAX
aggregateFields: ["creditAmount|MAX", "debitAmount|MAX"]

To count how many distinct values appear for a specific field in each group (e.g. number of unique customers per product), use: DIISTINCTCOUNT

Example with DISTINCTCOUNT
aggregateFields: ["customerId|DISTINCTCOUNT"]

This returns the number of unique customer IDs in each group, e.g. number of different customers per product.

To count the number of items in each group (rather than aggregating a specific field), use the special keyword __count.
This is different from other aggregations, as it does not apply to a specific field but rather to the number of entries per group.

Examples with COUNT
aggregateFields: ["__count"]

You can also combine __count with other aggregations:

aggregateFields: ["__count", "creditAmount|MAX", "customerId|DISTINCTCOUNT"]
Example for a global COUNT (without groupFields)

When you want to get a total count without grouping, you must include id in the node.
Example: total number of customers

query {
  customers {
    edges {
      node {
        id
      }
    }
    aggregate(
      aggregateFields: ["__count"]
    ) {
      aggregates { value }
    }
  }
}

2. Grouping Data

The groupFields array, which specifies the fields used for grouping results, is optional.

  • If groupFields is defined, the query must also specify groupValues in the selection set to retrieve the grouping information:

    groupValues {
      name
      value
    }
    
    Example: grouping by subAccountId
    groupFields: ["subAccountId"]
    
  • If groupFields is not defined, the aggregation is computed on the entire dataset. This allows retrieving global indicators such as the total turnover of a customer without requiring any grouping by date or other dimensions.

    In this case, the groupValues block cannot be used because there is no grouping information to return.

    Here do not use:

    groupValues {
        name
        value
      }
    


For date-based grouping, users can define time intervals by using the | separator. The resulting grouped date value will always be expressed in the format YYYY-MM-DDT00:00:00.000Z, where the day component is always set to 01 for Month, Quarter, and Year groupings.

Grouping Type Example Value
Day 2024-03-30T00:00:00.000Z
Week 2024-03-30T00:00:00.000Z
Month 2024-03-01T00:00:00.000Z
Quarter 2024-07-01T00:00:00.000Z
Year 2024-01-01T00:00:00.000Z

Date grouping is one of the possible elements inside the groupFields array, meaning it can be combined with other fields for more granular aggregations.

Example: grouping by subAccountId and then by monthly period
groupFields: ["subAccountId", "accountingEntry.date|Month"]

3. Ordering Data

The orderFields array is optional and allows sorting of aggregated results based on specific fields.

In the aggregate block, each field can be ordered in ascending (ASC) or descending (DESC) order using the | separator.

For example, ordering by totalNet in descending order:

orderFields: ["totalNet|DESC"] 

When using a count aggregation with __count, you can also sort by the number of elements in each group:

orderFields: ["__count|DESC"]

This is particularly useful when you want to retrieve the most frequent groups, such as the top 10 customers by number of documents or the most used products.

This feature is particularly useful when retrieving Top N results.
For example, to get the top 3 customers by totalNet in descending order, limiting the number of records returned by the API, you can combine orderFields with first:

 salesQuotes(
    first: 3
    ...){
    ...
    aggregate(
       aggregateFields: ["totalNet", "totalLiquid"],
       groupFields: ["customerId", "documentDate|Quarter"],
       orderFields: ["totalNet|DESC"] 
    ) {
       groupValues { value }
       aggregates { value }
    }
 }

This ensures that only the top 3 customers by totalNet are returned, reducing the volume of data processed and making API responses more efficient.

Caution

orderFields and first can only be used to sort and retrieve a Top N when the expected result is a direct aggregation based on the fields specified in groupFields.

For example, if we group by customerId and want to retrieve the top 3 customers with the highest totalNet in descending order, orderFields and first can be used effectively.

However, if an additional aggregation needs to be applied after the API has performed its aggregation, such as grouping customers by country, then it will not be possible to limit the result to a Top N at the API level.
This is because we would need 100% of the customer-level aggregations first in order to compute the country-level aggregation locally.

In such cases, orderFields also becomes irrelevant, as the data must be re-aggregated and re-sorted locally after retrieval. For instance, instead of ordering by totalNet per customer, we would now need to order by totalNet per country after performing the local aggregation.

4. Retrieving Aggregated Results

The API processes the request and returns aggregated data structured by the specified groups.
The aggregated results may contain:

  • groupValues: The values of the fields used for grouping, each containing a name (field name) and a value (grouped value).
  • aggregates: The computed sums, averages, or counts for each group, each containing a name (aggregation type or field name) and a value (computed result).

Here is an example of a query that returns the names and values of both grouped and aggregated fields:

{
  aggregate(
    aggregateFields: ["debitAmount", "creditAmount"],
    groupFields: ["subAccountId", "accountingEntry.date|Quarter"],
    orderFields: ["debitAmount|DESC"] 
  ) {
    groupValues { name value }
    aggregates { name value }
  }
}

Example Response:
This response shows aggregated values grouped by subAccountId and by month, where each group contains the computed creditAmount and debitAmount.

  {
      "data": {
          "accountingEntryLines": {
              "edges": [],
              "aggregate": [
                  {
                      "groupValues": [
                          {
                              "name": "subAccountId",
                              "value": "fe5ddebf-92b5-4d6c-b56b-82efeefd95f6"
                          },
                          {
                              "name": "accountingEntry.date|Month",
                              "value": "2024-03-01T00:00:00.000Z"
                          }
                      ],
                      "aggregates": [
                          {
                              "name": "creditAmount",
                              "value": "6000"
                          },
                          {
                              "name": "debitAmount",
                              "value": "0"
                          }
                      ]
                  },
                  ...
              ]
          }
      }
  }

Alternatively, for a simplified definition, we could simply mention that groupValues and aggregates return values.

{
  groupValues { value }
  aggregates { value }
}

5. Pagination

  • Pagination works the same way as in a standard query, as explained in the pagination documentation.
    Key concepts / API - Pagination
    However, it applies to aggregated results rather than the detailed queried data used for the calculations.
  • Retrieving the Full Aggregation Results:
    To retrieve the full set of aggregation results, the query must be executed multiple times, assigning $endCursor with the endCursor value from the previous response, until hasNextPage is false.
  • Pagination Limits:
    The maximum number of aggregation results that can be retrieved in one request is 500 (first:500).
    If hasNextPage is true, additional requests should be made using after: $endCursor.

Example: If pagination is set to 50 using first: 50 and the requested filter covers 2000 accounting entry lines, all 2000 lines will be used for aggregations and groupings, but only the first 50 aggregation results will be returned.

query ($endCursor: String!) {
  salesInvoices (
    first: 50 
    after: $endCursor
    where: {...}
  ) {
    edges {
      node {
        ...
      }
    }
    aggregate(
      aggregateFields: [...],
      groupFields: [...],
      orderFields: [...]
    ) 
    {
      pageInfo { 
        endCursor
        hasNextPage
      }
    }
  }
}

Important Points and Limitations

Grouping Restrictions for Child Entities

While it is possible to group by fields from an entity and its main entity, it is not allowed to group by fields from child entities.

For example, grouping by subAccountId is valid because it belongs to the main entity. Similarly, grouping by accountingEntry.date is also allowed, as accountingEntry is the parent entity of accountingEntryLines.
However, grouping by subAccount.code is not possible since subAccount is a child entity.

This limitation exists because of data loader optimizations. When executing a query, the system splits it into multiple queries to optimize performance:

  1. The main query retrieves the primary entity records.
  2. Separate follow-up queries fetch child entity data, but only for the specific child entities referenced in the query.

In the case of aggregations, child entity data has not yet been retrieved at the time of aggregation.
Since the grouping operation happens at the aggregation stage, before the child entity queries are executed, it is not possible to group by a field that has not yet been loaded.

Exposing Fields Used in aggregateFields and groupFields

It is mandatory to declare all fields used for aggregation or grouping in the edges node.

For example, to aggregate and group:

{
  ...
  aggregate(
    aggregateFields: ["creditAmount", "debitAmount"],
    groupFields: ["subAccountId", "accountingEntry.date|Quarter"]
  )
  ...
}

You must expose:

edges {
  node {
    creditAmount
    debitAmount
    subAccountId
    accountingEntry {
      date
    }
  }
}

Conversely, it is not allowed to expose fields in edges that are not used in the aggregation.

For example, if totalNet is exposed in edges while the aggregation is:

{
  ...
  aggregate(
    aggregateFields: ["totalQuantity"],
    groupFields: ["productId", "salesOrder.documentDate|Quarter"]
  )
  ...
}

The following error will be returned:

{
  "errors": [
      {
          "message": "totalNet field is not used in the aggregation and should be removed from edges.",
          "extensions": {
              "code": "INVALID_INPUT"
          }
      }
  ]
}
Understanding totalCount in Aggregations

In a standard query, totalCount typically returns the total number of records matching the query, regardless of the number of results returned based on the defined pagination.

However, in the context of aggregations, while totalCount remains syntactically valid, its value is not meaningful.
This is because aggregations transform and group data, making the raw count of records irrelevant to the aggregated output.

This is not a critical issue, as aggregation logic inherently requires retrieving the entire set of results to be useful.
Consequently, the total number of aggregated results can be determined by fully paginating through the aggregation query, as explained in the pagination section.