I am a long term user of MRTG+RRDtool, and I am considering using TSDB for my monitoring data. I would like to do the same as MRTG does - to have:
5-min average data for the last day and half,
30-min aggregate for ~10 days back,
1h aggregate for the last month or so, and
daily aggregate kept indefinitely.
I am able to create various CA-materialized views either on top of the raw data, or on top of each other. But I would like to use them all together - to be able to draw for example a weekly graph not only for the last week, but also further back in history, even though there is only corasely-grained aggregate data available. Is there a better way how to do this than to use a complicated UNION of subselects on all four materialized views, as described here?
Also, when creating an aggregate on top of another aggregate, can the time bucket column use the same name on both aggregates? When I try to do so, I get the continuous aggregate view must include a valid time bucket function error:
=> create materialized view temperature_5min with (timescaledb.continuous) AS select id, time_bucket(interval '5 mins', time) AS time_bucket,
avg(value), max(value), min(value) from temperature GROUP BY id, time_bucket;
NOTICE: refreshing continuous aggregate "temperature_5min"
HINT: Use WITH NO DATA if you do not want to refresh the continuous aggregate on creation.
CREATE MATERIALIZED VIEW
=> create materialized view temperature_30min
with (timescaledb.continuous)
as
select time_bucket(interval '30 min', temperature_5min.time_bucket) as time_bucket,
id,
avg(avg), min(min), max(max)
from temperature_5min
group by id, temperature_5min.time_bucket;
ERROR: continuous aggregate view must include a valid time bucket function
I am probably missing a column alias ... AS something somewhere, but can’t figure out where. And all the examples I have seen so far seem to use a different name of the time column in the upper and lower aggregates.
@jonatasdp - thanks for the reply. I am not sure which part of the above DDL do you mean. My base table temperature contains colum time of type timestamptz NOT NULL. Adding an explicit conversion to ::timestamptz after time_bucket('30 mins', time_bucket) does not help.
Moreover, I think the problem is in the column name itself, because when I change the time-bucket column name to something else (e.g. time_bucket_30m instead of time_bucket), the second materialized view can be created without errors:
=> create materialized view temperature_30min
with (timescaledb.continuous)
AS select id, time_bucket('30 mins', time_bucket) AS time_bucket_30m,
avg(avg), max(max), min(min)
from temperature_5min
GROUP BY id, time_bucket_30m;
NOTICE: refreshing continuous aggregate "temperature_30min"
HINT: Use WITH NO DATA if you do not want to refresh the continuous aggregate on creation.
CREATE MATERIALIZED VIEW
Hi @Yenya, I guess this is a knowing bug. I remember the column orders was an issue:
But there are an workaround to make this work until next release. In the original hypertable place the time dimension column (“timestamp”) at the same order you do the time bucket in caggs (in this case as the 1st column).
Looking at the same problem again (upgraded to Pg16 and TSDB 2.19.3) - the problem is still there. But according to above mentioned issue, this is indeed a problem of column name aliasing, and is present in Pg itself. The workaround is to use numbers instead of column names in GROUP BY:
create materialized view temperature_30min
with (timescaledb.continuous)
as select time_bucket('30 mins', time) as time,
avg(avg), -- other columns here
from temperature_5min
group by 1, 2;
Also, I would now use counter_agg()/gauge_agg() in the first continuous aggregation, and rollup() in the other ones.