Aggregations

Quick Links

Accounts Accounting Entries Products Customers Sales quotes Sales invoices Suppliers Purchase invoices

Context

Aggregations are a powerful feature of the public API, allowing users to compute summary statistics such as totals, averages, and counts directly on the server.
This reduces the amount of data transferred to the client, optimizing performance and ensuring efficiency in data analysis.

The API enables aggregations through grouping and aggregation functions.
By leveraging these capabilities, users can obtain pre-aggregated data in a structured manner, which is particularly useful in financial and accounting applications.

A key advantage is that users can filter the data to be aggregated using all filtering options supported by DataLoaders.
This means filters can be applied not only to the main entity but also to its parent entity (if it exists) and its child entities (if they exist).
This provides extensive flexibility in refining aggregated results, ensuring that only relevant data is considered while maintaining optimal performance.

Simpler alternative: Aggregation API

There is also a simpler alternative to perform aggregations without manually configuring grouping, filters, and aggregation parameters.
The Aggregation API provides a guided approach based on a predefined catalog of aggregations:

use aggregationCatalog to discover what can be aggregated
use aggregationExecute to run the aggregation using only the allowed parameters

This approach is recommended for most use cases and does not require knowledge of the underlying business queries.

Some Use Case Examples

Financial Reporting: Aggregate accounting entry lines to compute total debit and credit amounts per account over a specified period, such as monthly or quarterly balances. This enables users to track financial health and prepare accurate financial statements.

Account Code	Date (Aggregated by Month)	Total Debit	Total Credit
`601000` (Purchases of Goods)	`2024-03-01`	`1500,00`	`0,00`
`606000` (General Expenses)	`2024-03-01`	`8500,00`	`0,00`
`701000` (Product Sales)	`2024-03-01`	`0,00`	`22,00`
`707000` (Service Sales)	`2024-03-01`	`0,00`	`35,500`

Journal-Level Analysis: Aggregate debit and credit amounts per journal to analyze financial flows at a higher level, ensuring compliance and better financial tracking. Since accounting entries must always balance, total debit and total credit amounts are equal.

Journal Code	Date (Aggregated by Month)	Total Debit	Total Credit
`GEN`	`2024-03-01`	`34850,00`	`34850,00`
`SALES`	`2024-03-01`	`11200,00`	`11200,00`
`PURCHASE`	`2024-03-01`	`1050,00`	`1050,00`
`BANK`	`2024-03-01`	`400,00`	`400,00`

Sales Performance Analysis: Summarize sales document lines to calculate the total quantity sold per product or the total revenue per product category, helping businesses evaluate best-selling products and optimize inventory management.

Product	Date (Aggregated by Month)	Total Quantity Sold	Total Sales Amount
`PRD001`	`2024-03-01`	`150`	`45,00`
`PRD002`	`2024-03-01`	`85`	`25,500`
`PRD003`	`2024-03-01`	`230`	`78,00`

Customer Revenue Insights: Aggregate sales document headers to determine the total revenue per customer, allowing businesses to identify high-value clients and adjust marketing or sales strategies accordingly.

Customer Code	Date (Aggregated by Quarter)	Total Sales Amount
`CUST001`	`2024-04-01`	`150,00`
`CUST002`	`2024-04-01`	`97,500`
`CUST003`	`2024-04-01`	`42,600`
`CUST004`	`2024-04-01`	`205,300`

Purchase Invoice Analysis: Aggregate purchase invoice amounts per supplier to track total expenses over a given period, helping businesses manage supplier relationships and control costs.

Supplier Code	Date (Aggregated by Quarter)	Total Purchase Amount
`SUPP001`	`2024-04-01`	`320,00`
`SUPP002`	`2024-04-01`	`210,500`

Advantages

Efficient and Optimized Data Processing: Aggregations are computed on the server, reducing network load, minimizing database queries, and enhancing performance by returning pre-aggregated results.
Scalability: Handles large datasets efficiently without performance degradation, ensuring seamless data retrieval even for complex queries.
Flexible Grouping and Filtering: Users can define custom groups using account IDs, dates, or other fields while applying filters on the main entity, its parent entity (if applicable), and its child entities (if they exist) for precise data control.
Simplified Querying and Comprehensive Analysis: A single request retrieves structured insights, eliminating complex data post-processing while enabling detailed financial and operational reporting.
Flexible Time-Based Grouping: Aggregations allow for grouping data using different time periods such as Day, Week, Month, Quarter, and Year.
Minimal Overhead for High-Level Analysis: The API’s aggregation capabilities enable users to perform advanced financial analysis efficiently, reducing computational workload on the client side.

Example of Query

The following query demonstrates how aggregations work.
It retrieves accounting entry lines where the subAccount code starts with 6 or 7, within the specified date range, and computes total sums for creditAmount and debitAmount, grouped by subAccountId and accountingEntry.date (by Quarter).

Only the aggregation results and grouping values are returned in the response, not the detailed lines used for the calculations.

query {
  accountingEntryLines(
    where: {
      and: [
        { accountingEntry: { date: { gte: "2021-01-01", lte: "2025-01-31" } } }
        { subAccount: {
            or: [
              { code: { startsWith: "6" } },
              { code: { startsWith: "7" } }
            ]
          }
        }
      ]
    }
  ) {
    edges {
      node {
        creditAmount
        debitAmount
        subAccountId
        accountingEntry {
          date
        }
      }
    }
    aggregate(
      aggregateFields: ["creditAmount", "debitAmount"],
      groupFields: ["subAccountId", "accountingEntry.date|Quarter"]
    ) {
      groupValues {
        name
        value
      }
      aggregates {
        name
        value
      }
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}

Definition of the `aggregate` Block

Aggregations work by grouping data using specified fields and then applying aggregation functions to compute summary values. Additionally, sorting can optionally be applied in ascending or descending order.

Aggregation is defined using three main components:

aggregateFields: Specifies the fields to be aggregated (e.g., sum of amounts).
groupFields: Defines how the data should be grouped (e.g., by account, by date).
orderFields: Determines the sorting order of the aggregated results (e.g., ordering by totalNet in descending order).

aggregate(
   aggregateFields: ["creditAmount", "debitAmount"],
   groupFields: ["subAccountId", "accountingEntry.date|Quarter"],
   orderFields: ["creditAmount|DESC"]
)

1. Aggregate Data

The aggregateFields array, which specifies the fields to aggregate, is mandatory.

Available aggregations are:

Aggregation Type	Keyword	Description	Field Type
Sum	`SUM`	Returns the sum, default if no keyword is specified	Decimal
Average	`AVG`	Returns the arithmetic mean	Decimal
Minimum	`MIN`	Returns the smallest value	Decimal
Maximum	`MAX`	Returns the largest value	Decimal
Distinct Count	`DISTINCTCOUNT`	Counts the number of unique values for an Id field	Id
Count	`COUNT`	Counts the number of values (use `__count` as field name)	__count (constant)

By default, specifying ["creditAmount", "debitAmount"] in aggregateFields will return the sum of creditAmount and debitAmount for each group defined in groupFields.

Example with SUM

aggregateFields: ["creditAmount", "debitAmount"]

To use other aggregation types, you can append a | followed by the aggregation keyword to each field name.
Although it is implicit, you may also choose to explicitly mention SUM.

Example with MAX

aggregateFields: ["creditAmount|MAX", "debitAmount|MAX"]

To count how many distinct values appear for a specific field in each group (e.g. number of unique customers per product), use: DIISTINCTCOUNT

Example with DISTINCTCOUNT

aggregateFields: ["customerId|DISTINCTCOUNT"]

This returns the number of unique customer IDs in each group, e.g. number of different customers per product.

To count the number of items in each group (rather than aggregating a specific field), use the special keyword __count.
This is different from other aggregations, as it does not apply to a specific field but rather to the number of entries per group.

Examples with COUNT

aggregateFields: ["__count"]

You can also combine __count with other aggregations:

aggregateFields: ["__count", "creditAmount|MAX", "customerId|DISTINCTCOUNT"]

Example for a global COUNT (without `groupFields`)

When you want to get a total count without grouping, you must include id in the node.
Example: total number of customers

query {
  customers {
    edges {
      node {
        id
      }
    }
    aggregate(
      aggregateFields: ["__count"]
    ) {
      aggregates { value }
    }
  }
}

2. Grouping Data

The groupFields array, which specifies the fields used for grouping results, is optional.

If groupFields is defined, the query must also specify groupValues in the selection set to retrieve the grouping information:
```
groupValues {
  name
  value
}
```
Example: grouping by subAccountId
```
groupFields: ["subAccountId"]
```
If groupFields is not defined, the aggregation is computed on the entire dataset. This allows retrieving global indicators such as the total turnover of a customer without requiring any grouping by date or other dimensions.

In this case, the groupValues block cannot be used because there is no grouping information to return.

Here do not use:
```
groupValues {
    name
    value
  }
```

For date-based grouping, users can define time intervals by using the | separator. The resulting grouped date value will always be expressed in the format YYYY-MM-DDT00:00:00.000Z, where the day component is always set to 01 for Month, Quarter, and Year groupings.

Grouping Type	Example Value
`Day`	`2024-03-30T00:00:00.000Z`
`Week`	`2024-03-30T00:00:00.000Z`
`Month`	`2024-03-01T00:00:00.000Z`
`Quarter`	`2024-07-01T00:00:00.000Z`
`Year`	`2024-01-01T00:00:00.000Z`

Date grouping is one of the possible elements inside the groupFields array, meaning it can be combined with other fields for more granular aggregations.

Example: grouping by subAccountId and then by monthly period

groupFields: ["subAccountId", "accountingEntry.date|Month"]

3. Ordering Data

The orderFields array is optional and allows sorting of aggregated results based on specific fields.

In the aggregate block, each field can be ordered in ascending (ASC) or descending (DESC) order using the | separator.

For example, ordering by totalNet in descending order:

orderFields: ["totalNet|DESC"] 

When using a count aggregation with __count, you can also sort by the number of elements in each group:

orderFields: ["__count|DESC"]

This is particularly useful when you want to retrieve the most frequent groups, such as the top 10 customers by number of documents or the most used products.

This feature is particularly useful when retrieving Top N results.
For example, to get the top 3 customers by totalNet in descending order, limiting the number of records returned by the API, you can combine orderFields with first:

 salesQuotes(
    first: 3
    ...){
    ...
    aggregate(
       aggregateFields: ["totalNet", "totalLiquid"],
       groupFields: ["customerId", "documentDate|Quarter"],
       orderFields: ["totalNet|DESC"] 
    ) {
       groupValues { value }
       aggregates { value }
    }
 }

This ensures that only the top 3 customers by totalNet are returned, reducing the volume of data processed and making API responses more efficient.

Caution

orderFields and first can only be used to sort and retrieve a Top N when the expected result is a direct aggregation based on the fields specified in groupFields.

For example, if we group by customerId and want to retrieve the top 3 customers with the highest totalNet in descending order, orderFields and first can be used effectively.

However, if an additional aggregation needs to be applied after the API has performed its aggregation, such as grouping customers by country, then it will not be possible to limit the result to a Top N at the API level.
This is because we would need 100% of the customer-level aggregations first in order to compute the country-level aggregation locally.

In such cases, orderFields also becomes irrelevant, as the data must be re-aggregated and re-sorted locally after retrieval. For instance, instead of ordering by totalNet per customer, we would now need to order by totalNet per country after performing the local aggregation.

4. Retrieving Aggregated Results

The API processes the request and returns aggregated data structured by the specified groups.
The aggregated results may contain:

groupValues: The values of the fields used for grouping, each containing a name (field name) and a value (grouped value).
aggregates: The computed sums, averages, or counts for each group, each containing a name (aggregation type or field name) and a value (computed result).

Here is an example of a query that returns the names and values of both grouped and aggregated fields:

{
  aggregate(
    aggregateFields: ["debitAmount", "creditAmount"],
    groupFields: ["subAccountId", "accountingEntry.date|Quarter"],
    orderFields: ["debitAmount|DESC"] 
  ) {
    groupValues { name value }
    aggregates { name value }
  }
}

Example Response:
This response shows aggregated values grouped by subAccountId and by month, where each group contains the computed creditAmount and debitAmount.

  {
      "data": {
          "accountingEntryLines": {
              "edges": [],
              "aggregate": [
                  {
                      "groupValues": [
                          {
                              "name": "subAccountId",
                              "value": "fe5ddebf-92b5-4d6c-b56b-82efeefd95f6"
                          },
                          {
                              "name": "accountingEntry.date|Month",
                              "value": "2024-03-01T00:00:00.000Z"
                          }
                      ],
                      "aggregates": [
                          {
                              "name": "creditAmount",
                              "value": "6000"
                          },
                          {
                              "name": "debitAmount",
                              "value": "0"
                          }
                      ]
                  },
                  ...
              ]
          }
      }
  }

Alternatively, for a simplified definition, we could simply mention that groupValues and aggregates return values.

{
  groupValues { value }
  aggregates { value }
}

5. Pagination

Pagination works the same way as in a standard query, as explained in the pagination documentation.
Key concepts / API - Pagination
However, it applies to aggregated results rather than the detailed queried data used for the calculations.
Retrieving the Full Aggregation Results:
To retrieve the full set of aggregation results, the query must be executed multiple times, assigning $endCursor with the endCursor value from the previous response, until hasNextPage is false.
Pagination Limits:
The maximum number of aggregation results that can be retrieved in one request is 500 (first:500).
If hasNextPage is true, additional requests should be made using after: $endCursor.

Example: If pagination is set to 50 using first: 50 and the requested filter covers 2000 accounting entry lines, all 2000 lines will be used for aggregations and groupings, but only the first 50 aggregation results will be returned.

query ($endCursor: String!) {
  salesInvoices (
    first: 50 
    after: $endCursor
    where: {...}
  ) {
    edges {
      node {
        ...
      }
    }
    aggregate(
      aggregateFields: [...],
      groupFields: [...],
      orderFields: [...]
    ) 
    {
      pageInfo { 
        endCursor
        hasNextPage
      }
    }
  }
}

Important Points and Limitations

Grouping Restrictions for Child Entities

While it is possible to group by fields from an entity and its main entity, it is not allowed to group by fields from child entities.

For example, grouping by subAccountId is valid because it belongs to the main entity. Similarly, grouping by accountingEntry.date is also allowed, as accountingEntry is the parent entity of accountingEntryLines.
However, grouping by subAccount.code is not possible since subAccount is a child entity.

This limitation exists because of data loader optimizations. When executing a query, the system splits it into multiple queries to optimize performance:

The main query retrieves the primary entity records.
Separate follow-up queries fetch child entity data, but only for the specific child entities referenced in the query.

In the case of aggregations, child entity data has not yet been retrieved at the time of aggregation.
Since the grouping operation happens at the aggregation stage, before the child entity queries are executed, it is not possible to group by a field that has not yet been loaded.

Exposing Fields Used in `aggregateFields` and `groupFields`

It is mandatory to declare all fields used for aggregation or grouping in the edges node.

For example, to aggregate and group:

{
  ...
  aggregate(
    aggregateFields: ["creditAmount", "debitAmount"],
    groupFields: ["subAccountId", "accountingEntry.date|Quarter"]
  )
  ...
}

You must expose:

edges {
  node {
    creditAmount
    debitAmount
    subAccountId
    accountingEntry {
      date
    }
  }
}

Conversely, it is not allowed to expose fields in edges that are not used in the aggregation.

For example, if totalNet is exposed in edges while the aggregation is:

{
  ...
  aggregate(
    aggregateFields: ["totalQuantity"],
    groupFields: ["productId", "salesOrder.documentDate|Quarter"]
  )
  ...
}

The following error will be returned:

{
  "errors": [
      {
          "message": "totalNet field is not used in the aggregation and should be removed from edges.",
          "extensions": {
              "code": "INVALID_INPUT"
          }
      }
  ]
}

Understanding `totalCount` in Aggregations

In a standard query, totalCount typically returns the total number of records matching the query, regardless of the number of results returned based on the defined pagination.

However, in the context of aggregations, while totalCount remains syntactically valid, its value is not meaningful.
This is because aggregations transform and group data, making the raw count of records irrelevant to the aggregated output.

This is not a critical issue, as aggregation logic inherently requires retrieving the entire set of results to be useful.
Consequently, the total number of aggregated results can be determined by fully paginating through the aggregation query, as explained in the pagination section.

< 6. Benefits of DataLoaders