I have an interest in Process Automation and wish to transfer time-series data from an OPC UA server to TimescaleDB. There are numerous products available (OPC Router, PTC KEPServerEX), and I believe it was also possible with Telegraf. My limited understanding is that Telegraf is now maintained by InfluxData, so perhaps the output to PostgreSQL is no longer available. Can anybody comment or suggest alternative approaches? Thanks.
I’m working for a Swiss research organisation on automation and automatic control. We don’t use OPC Unified Architecture, but I started googling your question.
Maybe the following would work:
OPC UA ↔ Telegraf ↔ TimescaleDB
given these two pages:
Thanks for the feedback. I have had a quick look at Telegraf. I am slightly concerned that, as it is maintained by InfluxData, it may not be in their interest to keep the PostgreSQL output updated. However, I have configured the OPC UA input and am currently struggling with PostgreSQL on the output side.
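For reference, here is roughly the shape of what I am attempting. The endpoint, node identifiers, and connection string below are placeholders rather than my actual values:

```toml
# Sketch only: OPC UA input feeding the PostgreSQL output.
[[inputs.opcua]]
  ## Placeholder endpoint; replace with your server's address.
  endpoint = "opc.tcp://localhost:4840"

  [[inputs.opcua.nodes]]
    name            = "Pump_1_Speed"   # placeholder node
    namespace       = "2"
    identifier_type = "s"
    identifier      = "Pump_1_Speed"

[[outputs.postgresql]]
  ## Placeholder connection string.
  connection = "host=localhost port=5432 user=postgres dbname=historian sslmode=disable"
```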
Hey @karibi are you still working on this?
I use MQTT → Telegraf → TimescaleDB for OT data. I have found it to work quite well!
Debugging Telegraf and the PostgreSQL output is the most mind-numbing part of the process.
Here are a few things to consider:
Data in Telegraf is passed from the inputs to the outputs. It is therefore essential that the data is clean and structured in the way we want the output destination to receive it.
The Telegraf data model has 4 parts:
- Measurement (Metric) Name
- Tags (Metadata)
- Fields (Values)
- Timestamp
Inputs gather timestamped values along with metadata and group them under a measurement (metric) name.
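In Telegraf's line protocol notation, a single point carries all four parts. Using the pump example that comes up later in this thread (timestamps in line protocol are nanoseconds by default):

```
Pump_Station_E,tag_id=Pump_1_Speed value=60 1750268574000000000

measurement = Pump_Station_E
tag         = tag_id=Pump_1_Speed
field       = value=60
timestamp   = 1750268574000000000 (nanoseconds)
```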
The PostgreSQL output plugin will dynamically CREATE a table, or ALTER a table if the table or a column does not exist, based on the measurement, tags, and fields.
It does this using SQL templates, where the measurement (metric) name is mapped to the table via {{ .table }}, and tags and fields are mapped to columns via {{ .columns }}.
Now ask yourself: what is the table schema going to look like?
- Narrow (columns for time, tag_id, value)
- Wide (a column for time and one for each metric, i.e. pump speed, psi, level…)
If you want a narrow table for a facility, your data could look something like this:
Measurement, tag_id=tag, value=value, time
Pump_Station_E, tag_id=Pump_1_Speed, value=60, 1750268574
Pump_Station_E, tag_id=Pump_2_Speed, value=54, 1750268574
This will CREATE a table called Pump_Station_E with columns for tag_id, value, and time:
Pump_Station_E
| time       | tag_id       | value |
--------------------------------------
| 1750268574 | Pump_1_Speed | 60    |
| 1750268574 | Pump_2_Speed | 54    |
If you want a wide table for a facility, your data could look something like this:
Measurement, field_1=value, field_2=value, time
If your timestamps match you will get this…
Pump_Station_E, Pump_1_Speed=60, Pump_2_Speed=54, 1750268574
This will CREATE a table called Pump_Station_E with columns for EACH unique tag_id (containing the value) and time:
Pump_Station_E
| time       | Pump_1_Speed | Pump_2_Speed |
---------------------------------------------
| 1750268574 | 60           | 54           |
If your timestamps DO NOT match you will get this…
Pump_Station_E, Pump_1_Speed=60, 1750268574
Pump_Station_E, Pump_2_Speed=54, 1750268570
Pump_Station_E
| time       | Pump_1_Speed | Pump_2_Speed |
---------------------------------------------
| 1750268570 | null         | 54           |
| 1750268574 | 60           | null         |
There are many other ways to structure your tables and their columns—much of this has to do with how much data you are handling and what data will be queried together.
I shoot for queries that are efficient and not overly complicated by joins unless necessary.
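For example, with the narrow Pump_Station_E table above (and assuming the time column is stored as timestamptz rather than a raw epoch integer), a join-free five-minute average per tag using TimescaleDB's time_bucket might look like this sketch:

```sql
-- Sketch: 5-minute averages per tag over the last day, no joins needed.
SELECT time_bucket('5 minutes', time) AS bucket,
       tag_id,
       avg(value) AS avg_value
FROM "Pump_Station_E"
WHERE time > now() - INTERVAL '1 day'
GROUP BY bucket, tag_id
ORDER BY bucket, tag_id;
```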
Let me know if this is helpful. I am happy to share config snippets if you need.
-Grant
Thank you, Grant, for your extensive notes. Frankly, I struggled to get this to work with PostgreSQL, so ultimately I used a slightly different approach. I have a reasonable amount of experience with process historians. I went from my OPC UA server via Telegraf → InfluxDB v3 Community Edition; essentially, I replaced PostgreSQL with InfluxDB. Part of my reasoning is that InfluxData maintain Telegraf, hence I thought that going forward I would be more confident that Telegraf would continue to work with InfluxDB, and less confident about the interface to PostgreSQL. However, I am not suggesting that this approach is technically better, as I don't know.
Did you go with a narrow or wide table format? I certainly agree about trying to keep the queries simple and minimising the number of joins, so did you end up with a wide format? In a wide table format, I believe the maximum number of columns in a single PostgreSQL table is 1600, although practically it may be a bit less. In my application, this could be limiting. You could probably get Telegraf to route information to different tables, but then I was back to doing battle with the Telegraf-to-PostgreSQL configuration file again. How many rows do you have in PostgreSQL, and what sort of retrieval times are you getting?
Chris
I currently have narrow tables by facility, indexing on the tag_id.
I find the wide table to be a bit messy as the DB scales, especially if devices are added to or removed from the system.
How has your experience been with InfluxDB v3 on aging data older than a week?
Also, what’s your polling rate on the OPC UA server?
Speaking to the Telegraf support of PostgreSQL: it is my understanding that while Telegraf is maintained by InfluxData, many of the plug-ins (inputs, processors, and outputs) are maintained by the community, as it is an open-source project.
You have given me much food for thought. Would you mind providing me with your telegraf.conf file so that I can see how you have configured your PostgreSQL output? Currently, my system is a small prototype, and I only have around 120 tags polled at 5-second intervals. I haven’t seen any particular issues with ageing data, but the application only queries the immediate history, so nothing older than 6 weeks. One of the limitations of the Community Edition is that you are limited to a timespan of 72 hours in a single query. How many rows are you expecting to have in PostgreSQL?
Chris
Can anybody provide me with an example telegraf.conf file showing how to output data in a narrow format suitable for PostgreSQL? I am trying to achieve something like this:
Pump_Station_E
| time       | tag_id       | value |
--------------------------------------
| 1750268574 | Pump_1_Speed | 60    |
| 1750268574 | Pump_2_Speed | 54    |
Thanks
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  logfile = ""
  omit_hostname = true

[[inputs.mqtt_consumer]]
  ## Broker and topic settings omitted here; at a minimum you will need
  ## something like the following (placeholders, not my working values):
  # servers = ["tcp://localhost:1883"]
  # topics = ["plant/#"]

[[outputs.postgresql]]
  connection = "host=localhost port=5432 user=postgres password=postgres dbname=guardsman_data sslmode=disable"
  create_templates = [
    '''CREATE TABLE {{ .table }} ({{ .columns }})''',
    '''SELECT create_hypertable({{ .table|quoteLiteral }}, 'time', chunk_time_interval => INTERVAL '7d')''',
    '''ALTER TABLE {{ .table }} SET (timescaledb.compress)'''
  ]
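On the narrow-format question above: one option is Telegraf's unpivot processor, which rotates each field into its own metric with the field name stored in a tag. A sketch (naming the tag key "tag_id" is my choice to match the table layout, not the plugin default):

```toml
# Sketch: turn wide metrics (Pump_1_Speed=60, Pump_2_Speed=54) into
# narrow ones (tag_id=Pump_1_Speed value=60), one metric per field.
[[processors.unpivot]]
  tag_key   = "tag_id"   # tag that will receive the field name
  value_key = "value"    # field that will receive the field value
```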
This config will automatically create a TimescaleDB hypertable based on your metric (measurement) name.
Note the following line in the [agent] section:
omit_hostname = true
This removes the host tag that Telegraf adds to all data points by default. Remember that all tags will become columns, so if you don’t want a tag in your DB you have to remove it before it gets to the output.
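For example, the generic tagexclude modifier can strip tags per plugin before they reach the output. A sketch, shown here dropping a hypothetical unwanted "topic" tag:

```toml
[[outputs.postgresql]]
  connection = "host=localhost port=5432 user=postgres dbname=guardsman_data sslmode=disable"
  ## Generic modifier: listed tags are dropped before writing, so they
  ## never become columns ("topic" is a hypothetical unwanted tag here).
  tagexclude = ["topic"]
```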
I am working on updating the create_templates to help provision all of the elements necessary to optimise a TimescaleDB table (adding indexes and continuous aggregates automatically). Stay tuned for that.