kdb+ for PMA High Speed Analytics
2021-10-25 | 7 min read
In the current era of digitization, data is everywhere. From humans to machines, everyone and everything is generating enormous amounts of data. It’s overwhelming, and as the volume grows, extracting insight from it is becoming more and more challenging.
Take the electronics manufacturing industry, which depends heavily on the testing and accuracy of integrated components. Insight into component behavior not only increases productivity but also reveals new opportunities to improve existing components and design better ones.
Keysight’s PathWave Manufacturing Analytics Solution (PMA), a proven, groundbreaking application, delivers manufacturing improvements through acquisition, automation, and advanced analytics. It enables the smart factory to mitigate the risk of failure and downtime with real-time, optimized anomaly detection algorithms, and to send only the most important information across different IoT devices under test.
Why kdb+ Database?
In the electronics manufacturing industry, machines and IoT devices generate huge amounts of data every second. These data points arrive as a series of measurements or events that are tracked, monitored, downsampled, and aggregated over time. Given the nature and velocity of this data, processing it, analyzing it, and building analytics on it with traditional technologies is not a simple task. As data has evolved (structured, unstructured, time series, etc.), the tools and technologies for storing and processing high-speed data have evolved with it.
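To make "downsampled and aggregated over time" concrete, here is a minimal Python sketch that groups timestamped readings into one-minute buckets and averages each bucket. The readings and the `downsample` helper are invented for illustration; this is not how PMA or kdb+ implements the step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (seconds since start, measured value).
readings = [(0, 1.0), (20, 3.0), (45, 2.0), (61, 4.0), (119, 6.0), (130, 5.0)]

def downsample(rows, bucket_seconds):
    """Group readings into fixed time buckets and average each bucket."""
    buckets = defaultdict(list)
    for ts, value in rows:
        # Integer division maps every timestamp to the start of its bucket.
        bucket_start = (ts // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}

per_minute = downsample(readings, 60)
print(per_minute)
```

Six raw readings collapse into three per-minute averages, which is the kind of reduction that keeps long-running measurement streams manageable.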
Time series databases (TSDBs) are specially designed to store and process time series data efficiently and to build fast analytics on top of it. One of the most popular and fastest-growing time series databases is kdb+.
KX provides kdb+, a relational, columnar database with a built-in query language, q. It is well known for exceptionally fast analytics on large-scale datasets, both in motion and at rest. It is fully 64-bit and has built-in multi-core processing and multi-threading. The same architecture is used for real-time and historical data. Because the query language is integral to the database, it operates directly on the data, which removes the need for data transfer and dramatically increases the speed of analytics.
What makes kdb+ so fast?
As an in-memory, time-series database, kdb+ makes data available for queries as soon as it is ingested. It stores data in memory first, which supports much higher ingestion rates than other technologies: many millions of readings per second on a single server.
The ingestion process combines the performance of sequential writes to disk with the instant availability of data from memory. In addition, the columnar storage format allows efficient bulk writes to tables on disk.
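The idea of an in-memory, column-oriented table can be sketched in a few lines of Python. The `ColumnTable` class and its column names are hypothetical, and the sketch only mimics the layout (one list per column, rows appended in memory and immediately queryable). It says nothing about kdb+ internals.

```python
class ColumnTable:
    """A toy in-memory table that keeps one list per column."""

    def __init__(self, *columns):
        self.columns = {name: [] for name in columns}

    def insert(self, **row):
        # Reject rows that do not match the schema, so columns stay aligned.
        if set(row) != set(self.columns):
            raise ValueError("row does not match table columns")
        for name, value in row.items():
            self.columns[name].append(value)

    def column(self, name):
        return self.columns[name]

table = ColumnTable("time", "device", "voltage")
table.insert(time=1, device="dut-a", voltage=3.29)
table.insert(time=2, device="dut-b", voltage=3.31)
table.insert(time=3, device="dut-a", voltage=3.35)

# An aggregate over one column touches only that column's list,
# and it sees rows the moment they have been appended.
max_voltage = max(table.column("voltage"))
```

Because each column is a contiguous sequence, an aggregate such as `max` reads only the data it needs, and an append is a cheap operation at the end of each list.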
Some of the highlights:
1. kdb+ is a vector-oriented database with a built-in programming and query language
2. The entire kdb+ database and query language have a very small footprint (800 KB)
3. kdb+ is optimized for data storage
Because of these advantages, it can support large data volumes on a single server.
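As a rough illustration of the vector-oriented style, the Python sketch below applies each operation to a whole column at once and flags readings that deviate strongly from the column mean. The helper names, the sample data, and the threshold are assumptions made for this example (`where` is loosely modelled on the q keyword of the same name). It is not PMA's anomaly detection algorithm.

```python
from statistics import mean

def vsub(xs, y):
    """Subtract a scalar from every element of a column."""
    return [x - y for x in xs]

def vabs_gt(xs, limit):
    """Return a boolean mask marking elements whose magnitude exceeds limit."""
    return [abs(x) > limit for x in xs]

def where(mask):
    """Return the indices at which the mask is true."""
    return [i for i, flag in enumerate(mask) if flag]

voltage = [3.0, 3.0, 3.0, 5.0, 3.0]        # hypothetical column of readings
deviation = vsub(voltage, mean(voltage))    # one operation over the whole column
outliers = where(vabs_gt(deviation, 1.0))   # indices of suspicious readings
```

The calling code contains no row-by-row loop: every step takes a column and returns a column, which is the habit a vector language encourages.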
kdb+/tick Architecture
The diagram below illustrates the high-level kdb+ architecture.
The data feeds are time series data, mostly collected from different devices under test (DUTs), for example Keysight i3070, i1000, etc. The feed handler parses these feeds into a format that kdb+ can ingest.
Once the feed handler has parsed the data, it passes it to the tickerplant.
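The feed handler's job can be sketched as follows, assuming a purely hypothetical CSV feed with a timestamp, device, component, and value per line. The actual output format of the testers is not described in this article, so the field names and sample lines are invented.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw feed: the real tester output format is not described here.
RAW_FEED = """2021-10-25T08:00:00,dut-a,R101,99.8
2021-10-25T08:00:01,dut-b,C204,4.71
"""

def parse_feed(text):
    """Turn raw CSV lines into typed records ready for ingestion."""
    records = []
    for ts, device, component, value in csv.reader(io.StringIO(text)):
        records.append({
            "time": datetime.fromisoformat(ts),   # string -> timestamp
            "device": device,
            "component": component,
            "value": float(value),                # string -> number
        })
    return records

records = parse_feed(RAW_FEED)
```

The point is the conversion from loosely typed text into typed records, so that downstream components never have to parse strings.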
The tickerplant is a kdb+ application. Its tables can be queried using q like any other kdb+ database. It captures the initial data feed, writes it to the log file, and publishes the messages to any registered subscribers (e.g. the real-time database).
At the end of a business day, the log file is deleted, a new one is created, and the real-time database is saved to the historical database. Once all the data has been saved to the historical database, the real-time database purges its tables.
Log file: The tickerplant logs the q messages it receives from the feed handler to the log file. The log is used for recovery: if the real-time database (RDB) must restart, the log file is replayed to return it to the current state.
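The log, publish, replay, and end-of-day steps described above can be sketched in Python. This is a toy model under stated assumptions: the `TickerPlant` class, the JSON log format, and the plain lists standing in for the real-time and historical databases are all invented for illustration and do not reflect the actual kdb+/tick implementation.

```python
import json
import os
import tempfile

class TickerPlant:
    """Toy tickerplant: log each message, then publish it to subscribers."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.subscribers = []
        open(self.log_path, "w").close()   # start the day with an empty log

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # Write to the log first so the message can be replayed later.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(message) + "\n")
        for callback in self.subscribers:
            callback(message)

    def replay(self, callback):
        """Feed every logged message to a restarted subscriber."""
        with open(self.log_path) as log:
            for line in log:
                callback(json.loads(line))

    def end_of_day(self):
        # Delete the old log and create a new, empty one.
        os.remove(self.log_path)
        open(self.log_path, "w").close()

log_path = os.path.join(tempfile.mkdtemp(), "tick.log")
tp = TickerPlant(log_path)

rdb = []                       # stands in for the real-time database
tp.subscribe(rdb.append)
tp.publish({"device": "dut-a", "value": 99.8})
tp.publish({"device": "dut-b", "value": 4.71})

recovered = []                 # a restarted RDB rebuilds its state from the log
tp.replay(recovered.append)

hdb = []                       # stands in for the historical database
hdb.extend(rdb)                # end of day: save the RDB contents ...
rdb.clear()                    # ... purge the RDB tables ...
tp.end_of_day()                # ... and roll the log file
```

Writing to the log before publishing is the design choice that makes recovery possible: anything a subscriber has seen is guaranteed to be in the log.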
How does Keysight PathWave Manufacturing Analytics use kdb+?
Keysight PMA uses kdb+ as its main backend engine, powering complex, real-time analytics. kdb+ stores all the data and events collected from different devices and processes them in real time. Its high performance and modest hardware requirements simplify many storage and analytics needs.
Looking to the future, the scalable, fault-tolerant architecture of kdb+ makes PMA faster, safer, and more secure.
Tackling the data growth over time
The velocity and volume of data continue to grow, along with the need to perform analyses ever faster. This challenges traditional approaches and databases that were never designed to support these demands.
kdb+ is ideally suited to these demands because of its unique combination of a high-performance in-memory, columnar, relational database with an integrated vector-oriented programming system.
For more information, visit Keysight’s PathWave Manufacturing Analytics Solution (PMA).