GroupBy Query Python Examples

Now that you understand the Intersight Telemetry API, you might want to extract inforamtion via Python. All the information from the previous articles is still relevant. If you want to learn more about the Telemetry APIs in general, please review the previous articles. This article will focus on applying that knowledge to the Intersight Python SDK.

Installation and Authentication

You will have to install the Python SDK on your machine for all of the examples below. Please have a look at the Intersight Developer Center Downloads or use pip for installation.

Once you have the module installed, you will need to write some code to handle authentication. Authentication for the Telemetry APIs is handled like it is for any other API in Intersight. You will have to get an API Key and Secret Key from Intersight. You can then use them to create an API Client in the Python SDK. If you are unfamiliar with this process, you can review the Intersight Python SDK documentation. All the examples in this section will assume that authentication is already created and will not include the code for it.

Request elements

Before we get started writing requests, let's have a quick look at all the request elements mapped to Python.

Property Location Class
queryType
dataSource intersight.model.telemetry_druid_data_source TelemetryDruidDataSource
intervals
granularity intersight.model.telemetry_druid_granularity TelemetryDruidGranularity
aggregations intersight.model.telemetry_druid_aggregator TelemetryDruidAggregator
filter intersight.model.telemetry_druid_and_filter intersight.model.telemetry_druid_selector_filter TelemetryDruidAndFilter TelemetryDruidSelectorFilter
postAggregations intersight.model.telemetry_druid_post_aggregator TelemetryDruidPostAggregator
dimensions intersight.model.telemetry_druid_dimension_spec TelemetryDruidDimensionSpec
limitSpec intersight.model.telemetry_druid_default_limit_spec TelemetryDruidDefaultLimitSpec

As you can see, most of the elements require the import of a class and can't just be specified in pure strings or objects. There are further classes available for different Druid features that we are not discussing in this guide. You can browse the module to have a look at them all.

Basic query

If you just want to query data for a certain instrument, the code is actually quite simple and will look something like this:

from intersight.api import telemetry_api
from intersight.model.telemetry_druid_group_by_request import TelemetryDruidGroupByRequest
from intersight.model.telemetry_druid_data_source import TelemetryDruidDataSource
from intersight.model.telemetry_druid_granularity import TelemetryDruidGranularity
from intersight.model.telemetry_druid_and_filter import TelemetryDruidAndFilter
from intersight.model.telemetry_druid_selector_filter import TelemetryDruidSelectorFilter
from intersight.model.telemetry_druid_aggregator import TelemetryDruidAggregator

api_instance = telemetry_api.TelemetryApi(api_client)

group_by_request = TelemetryDruidGroupByRequest(
     query_type = "groupBy",
     data_source = TelemetryDruidDataSource(
         type = "table",
         name = "PhysicalEntities"
     ),
     granularity = TelemetryDruidGranularity(
        type = "period",
        origin = None,
        time_zone = "Europe/Vienna",
        duration = None,
        period = "PT12H"
     ),
     intervals = ["2024-02-20T19:47:00.000Z/2024-02-28T19:47:00.000Z"],
     dimensions = [],
     filter = TelemetryDruidAndFilter(
         type = "and",
         fields = [TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "instrument.name",
             value = "hw.fan"
         )]
     ),
     aggregations = [TelemetryDruidAggregator(
         type = "longMax",
         name = "hw-fan-speed_max-Max",
         field_name = "hw.fan.speed_max"
     )]
)
api_response = api_instance.query_telemetry_group_by(group_by_request)
for entry in api_response:
    print(f"{entry['timestamp']} - {entry['event']}")

Let's go over this query. The interval and granularity determine the time period and the amount of results that can be retrieved. You can use all the same settings described in the generic API documentation. Same goes for the data_source. Based on the metric you want to query, you need to adapt the naem of the table in the data_source. This is linked to the filter where you can select which rows you want to be used for the result. Intersight has many different events in one table, thus you need to filter for specific events, like "hw.Fan" - Fan-related data. Finally, to actually get data, we need to add an aggregation which determines the output of our query. In this case we are selecting the maximum value of the field "hw.fan.speed_max". Please refer back to the generic documentation for any specifics around each of these inputs.

Our query has an interval of 7 days with a period of 12 hours, thus we should get 14 results, each showing the maximum value of the fan speed during that period. This will result in an output like this:

2024-02-20 12:00:00+01:00 - {'hw-fan-speed_max-Max': 19110.0}
2024-02-21 00:00:00+01:00 - {'hw-fan-speed_max-Max': 19565.0}
ADDITIONAL RESULTS ARE HIDDEN

Of course you can format the output any way you want and use it as an input for other code.

Filters

The previous example retrieves the maximum value for all results. If we wanted to only retrieve the maximum value for one device type, we would have to add an additional filter. Let's say we want to filter for one specific model of server. This would result in a query like this:

from intersight.api import telemetry_api
from intersight.model.telemetry_druid_group_by_request import TelemetryDruidGroupByRequest
from intersight.model.telemetry_druid_data_source import TelemetryDruidDataSource
from intersight.model.telemetry_druid_granularity import TelemetryDruidGranularity
from intersight.model.telemetry_druid_and_filter import TelemetryDruidAndFilter
from intersight.model.telemetry_druid_selector_filter import TelemetryDruidSelectorFilter
from intersight.model.telemetry_druid_aggregator import TelemetryDruidAggregator

api_instance = telemetry_api.TelemetryApi(api_client)

group_by_request = TelemetryDruidGroupByRequest(
     query_type = "groupBy",
     data_source = TelemetryDruidDataSource(
         type = "table",
         name = "PhysicalEntities"
     ),
     granularity = TelemetryDruidGranularity(
        type = "period",
        origin = None,
        time_zone = "Europe/Vienna",
        duration = None,
        period = "PT12H"
     ),
     intervals = ["2024-02-20T19:47:00.000Z/2024-02-28T19:47:00.000Z"],
     dimensions = [],
     filter = TelemetryDruidAndFilter(
         type = "and",
         fields = [TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "instrument.name",
             value = "hw.fan"
         ), TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "model",
             value = "UCSC-C220-M6S"
         )]
     ),
     aggregations = [TelemetryDruidAggregator(
         type = "longMax",
         name = "hw-fan-speed_max-Max",
         field_name = "hw.fan.speed_max"
     )]
)
api_response = api_instance.query_telemetry_group_by(group_by_request)
for entry in api_response:
    print(f"{entry['timestamp']} - {entry['event']}")

In addition to the basic query, this also adds a filter that reduces the amount of rows that will be retrieved. Thus the maximum is only calculated on rows of data that have been retrieved from "UCSC-C220-M6S" servers. Thus the result will look slightly different even though the query was performed on the same account:

2024-02-20 12:00:00+01:00 - {'hw-fan-speed_max-Max': 19110.0}
2024-02-21 00:00:00+01:00 - {'hw-fan-speed_max-Max': 19110.0}
ADDITIONAL RESULTS ARE HIDDEN

You could add additional filters or modify this filter by adding more attributes to it based on your requirements. This will allow you to retrieve data only for those elements that you really need information for.

GroupBy clauses

The basic query retrieves one result for all endpoints. If you wanted to get results for certain groups of endpoints, or even individual endpoints, you would have to adjust the query. This can be achieved by adding "dimensions". Let's say we want to get the maximum fan speed for each host. The query for this would look something like this:

from intersight.api import telemetry_api
from intersight.model.telemetry_druid_group_by_request import TelemetryDruidGroupByRequest
from intersight.model.telemetry_druid_data_source import TelemetryDruidDataSource
from intersight.model.telemetry_druid_granularity import TelemetryDruidGranularity
from intersight.model.telemetry_druid_dimension_spec import TelemetryDruidDimensionSpec
from intersight.model.telemetry_druid_and_filter import TelemetryDruidAndFilter
from intersight.model.telemetry_druid_selector_filter import TelemetryDruidSelectorFilter
from intersight.model.telemetry_druid_aggregator import TelemetryDruidAggregator

api_instance = telemetry_api.TelemetryApi(api_client)

group_by_request = TelemetryDruidGroupByRequest(
     query_type = "groupBy",
     data_source = TelemetryDruidDataSource(
         type = "table",
         name = "PhysicalEntities"
     ),
     granularity = TelemetryDruidGranularity(
        type = "period",
        origin = None,
        time_zone = "Europe/Vienna",
        duration = None,
        period = "PT12H"
     ),
     intervals = ["2024-02-20T19:47:00.000Z/2024-02-28T19:47:00.000Z"],
     dimensions = [TelemetryDruidDimensionSpec(
         type = "default",
         dimension = "host.name",
         output_name = "host.name"
     )],
     filter = TelemetryDruidAndFilter(
         type = "and",
         fields = [TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "instrument.name",
             value = "hw.fan"
         )]
     ),
     aggregations = [TelemetryDruidAggregator(
         type = "longMax",
         name = "hw-fan-speed_max-Max",
         field_name = "hw.fan.speed_max"
     )]
)
api_response = api_instance.query_telemetry_group_by(group_by_request)
for entry in api_response:
    print(f"{entry['timestamp']} - {entry['event']}")

This looks like a very minor change, but the result actually changes drastically based on this:

2024-02-20 12:00:00+01:00 - {'host.name': 'Yosemite FI-A', 'hw-fan-speed_max-Max': 7317.0}
2024-02-20 12:00:00+01:00 - {'host.name': 'Yosemite FI-B', 'hw-fan-speed_max-Max': 8244.0}
ADDITIONAL RESULTS ARE HIDDEN

As you can see, we now have multiple results for the same timestamp. All unique combinations of dimensions will result in one result per timestamp. In our case we chose the Hostname as a dimension, thus we get one result per host and timestamp. You could use any type of grouping to get the output that is most helpful for you.

TopN requests

Sometimes you might want to understand if there are any outliers or problems in your environment. For that purpose, the highest or lowest values are usually most interesting. If you ran a request that retrieved one datapoint per server and timestamp, you could have results with 1000s of lines. To make the whole thing more manageable, we can limit the result set to only the top few datapoints. This often makes sense if you already have a GroupBy clause in the query. You can then limit the number of output rows by addding a LimitSpec like this:

from intersight.api import telemetry_api
from intersight.model.telemetry_druid_group_by_request import TelemetryDruidGroupByRequest
from intersight.model.telemetry_druid_data_source import TelemetryDruidDataSource
from intersight.model.telemetry_druid_granularity import TelemetryDruidGranularity
from intersight.model.telemetry_druid_dimension_spec import TelemetryDruidDimensionSpec
from intersight.model.telemetry_druid_default_limit_spec import TelemetryDruidDefaultLimitSpec
from intersight.model.telemetry_druid_and_filter import TelemetryDruidAndFilter
from intersight.model.telemetry_druid_selector_filter import TelemetryDruidSelectorFilter
from intersight.model.telemetry_druid_aggregator import TelemetryDruidAggregator

api_instance = telemetry_api.TelemetryApi(api_client)

group_by_request = TelemetryDruidGroupByRequest(
     query_type = "groupBy",
     data_source = TelemetryDruidDataSource(
         type = "table",
         name = "PhysicalEntities"
     ),
     granularity = TelemetryDruidGranularity(
        type = "period",
        origin = None,
        time_zone = "Europe/Vienna",
        duration = None,
        period = "PT12H"
     ),
     intervals = ["2024-02-20T19:47:00.000Z/2024-02-28T19:47:00.000Z"],
     dimensions = [TelemetryDruidDimensionSpec(
         type = "default",
         dimension = "host.name",
         output_name = "host.name"
     )],
     limit_spec = TelemetryDruidDefaultLimitSpec(
         limit = 2,
         offset = 0,
         columns = []
     ),
     filter = TelemetryDruidAndFilter(
         type = "and",
         fields = [TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "instrument.name",
             value = "hw.fan"
         )]
     ),
     aggregations = [TelemetryDruidAggregator(
         type = "longMax",
         name = "hw-fan-speed_max-Max",
         field_name = "hw.fan.speed_max"
     )]
)
api_response = api_instance.query_telemetry_group_by(group_by_request)
for entry in api_response:
    print(f"{entry['timestamp']} - {entry['event']}")

This query uses a LimitSpec to retrieve the first two two results from the dataset, thus your result will look something like this:

2024-02-20 12:00:00+01:00 - {'host.name': 'Yosemite FI-A', 'hw-fan-speed_max-Max': 7317.0}
2024-02-20 12:00:00+01:00 - {'host.name': 'Yosemite FI-B', 'hw-fan-speed_max-Max': 8244.0}

Notice that there are only two results and no further datapoints. This does also not select the maximum, it just selects the first N requests from the regular results. Ordering would normally be available via the "columns" parameter. This is currently not supported in Intersight.

Post-processing of results

Sometimes you might want to calculate a value from multiple values you retrieved through the request. This can be achived by an additional level of processing through "post_aggregations". A typical use case for this would be calculating an average value by dividing the sum of all datapoints by the count of all elements. This could look something like below:

from intersight.api import telemetry_api
from intersight.model.telemetry_druid_group_by_request import TelemetryDruidGroupByRequest
from intersight.model.telemetry_druid_data_source import TelemetryDruidDataSource
from intersight.model.telemetry_druid_granularity import TelemetryDruidGranularity
from intersight.model.telemetry_druid_and_filter import TelemetryDruidAndFilter
from intersight.model.telemetry_druid_selector_filter import TelemetryDruidSelectorFilter
from intersight.model.telemetry_druid_aggregator import TelemetryDruidAggregator
from intersight.model.telemetry_druid_post_aggregator import TelemetryDruidPostAggregator

api_instance = telemetry_api.TelemetryApi(api_client)

group_by_request = TelemetryDruidGroupByRequest(
     query_type = "groupBy",
     data_source = TelemetryDruidDataSource(
         type = "table",
         name = "PhysicalEntities"
     ),
     granularity = TelemetryDruidGranularity(
        type = "period",
        origin = None,
        time_zone = "Europe/Vienna",
        duration = None,
        period = "PT12H"
     ),
     intervals = ["2024-02-20T19:47:00.000Z/2024-02-28T19:47:00.000Z"],
     dimensions = [],
     filter = TelemetryDruidAndFilter(
         type = "and",
         fields = [TelemetryDruidSelectorFilter(
             type = "selector",
             dimension = "instrument.name",
             value = "hw.fan"
         )]
     ),
     aggregations = [TelemetryDruidAggregator(
         type = "longSum",
         name = "count",
         field_name = "hw.fan.speed_count"
     ),TelemetryDruidAggregator(
         type = "longSum",
         name = "hw.fan.speed-Sum",
         field_name = "hw.fan.speed"
     )],
     post_aggregations = [TelemetryDruidPostAggregator(
         type = "expression",
         name = "post_aggregation",
         expression = "(\"hw.fan.speed-Sum\" / \"count\")"
    )]
)
api_response = api_instance.query_telemetry_group_by(group_by_request)
for entry in api_response:
    print(f"{entry['timestamp']} - {entry['event']}")

As you can see in this example, we are retrieving two values via aggregations: the sum and the count. Then we use the post aggregation to calculate a new result by dividing the sum with the count. This results in an output like this:

2024-02-20 12:00:00+01:00 - {'hw.fan.speed-Sum': 144815919.0, 'count': 19284.0, 'post_aggregation': 7509.0}
2024-02-21 00:00:00+01:00 - {'hw.fan.speed-Sum': 545500500.0, 'count': 72000.0, 'post_aggregation': 7576.0}
ADDITIONAL RESULTS ARE HIDDEN

This output contains the two values we requested as well as the value computed from them. There are a lot of computation options available if you want to achieve a different goal.