Get started
1. Name your dataset and add it to your Worker
Add the following to your wrangler.toml
file to create a binding to a Workers Analytics Engine dataset. A dataset is like a table in SQL: the rows and columns should have consistent meaning.
wrangler.toml[[analytics_engine_datasets]]
binding = "<BINDING_NAME>"
dataset = "<DATASET_NAME>"
2. Write data points from your Worker
You can write data points to your Worker by calling the writeDataPoint()
method that is exposed on the binding that you just created.
async fetch(request, env) { env.WEATHER.writeDataPoint({ 'blobs': ["Seattle", "USA", "pro_sensor_9000"], // City, State 'doubles': [25, 0.5], 'indexes': ["a3cd45"] }); return new Response("OK!");
}
A data point is a structured event that consists of:
- Blobs (strings) — The dimensions used for grouping and filtering. Sometimes called labels in other metrics systems.
- Doubles (numbers) — The numeric values that you want to record in your data point.
- Indexes — (strings) — Used as a sampling key.
In the example above, suppose you are collecting air quality samples. Each data point written represents a reading from your weather sensor. The blobs define city, state, and sensor model — the dimensions you want to be able to filter queries on later. The doubles define the numeric temperature and air pressure readings. And the index is the ID of your customer. You may want to include context about the incoming request, such as geolocation, to add additional data to your datapoint.
Currently, the writeDataPoint()
API accepts ordered arrays of values. This means that you must provide fields in a consistent order. While the indexes
field accepts an array, you currently must only provide a single index. If you attempt to provide multiple indexes, your data point will not be recorded.
3. Query data using the SQL API
You can query the data you have written in two ways:
- SQL API — Best for writing your own queries and integrating with external tools like Grafana.
- GraphQL API — This is the same API that powers the Cloudflare dashboard.
For the purpose of this example, we will use the SQL API.
Create an API token
Create an API Token that has the Account Analytics Read
permission.
Write your first query
The following query returns the top 10 cities that had the highest average humidity readings when the temperature was above zero:
SELECT blob1 AS city, SUM(_sample_interval * double2) / SUM(_sample_interval) AS avg_humidityFROM WEATHERWHERE double1 > 0
GROUP BY cityORDER BY avg_humidity DESC
LIMIT 10
You can run this query by making an HTTP request to the SQL API:
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql" \
--header "Authorization: Bearer <API_TOKEN>" \
--data "SELECT blob1 AS city, SUM(_sample_interval * double2) / SUM(_sample_interval) AS avg_humidity FROM WEATHER WHERE double1 > 0 GROUP BY city ORDER BY avg_humidity DESC LIMIT 10"
Refer to the Workers Analytics Engine SQL Reference for a full list of supported SQL functionality.
Working with time series data
Workers Analytics Engine is optimized for powering time series analytics that can be visualized using tools like Grafana. Every event written from the runtime is automatically populated with a timestamp
field. It is expected that most time series will round, and then GROUP BY
the timestamp
. For example:
SELECT intDiv(toUInt32(timestamp), 300) * 300 AS t, blob1 AS city, SUM(_sample_interval * double2) / SUM(_sample_interval) AS avg_humidityFROM WEATHERWHERE timestamp >= NOW() - INTERVAL '1' DAY AND double1 > 0
GROUP BY t, cityORDER BY t, avg_humidity DESC
This query first rounds the timestamp
field to the nearest five minutes. Then, it groups by that field and city and calculates the average humidity in each city for a five minute period.
Refer to Querying Workers Analytics Engine from Grafana for more details on how to create efficient Grafana queries against Workers Analytics Engine.