Databases are created to store data, but the way they are designed depends on your business objectives. Most business applications store data in an OLTP (On-Line Transaction Processing) database, which is accessed by numerous users to perform fast, simple queries. When you go to the supermarket, the Point-of-Sale system at the cash register uses an OLTP database. Another example is that of a bank, with tellers storing data for each transaction in an OLTP system. Even when you send a text from your smartphone to someone, an OLTP system is running on the backend.
OLTP is designed to store day-to-day business transactions and is well-suited for querying specific records, for instance, the email address of customer ABC. Thousands of such queries can be run simultaneously on an OLTP database, but when you need a strategic view of business data, queries start to get increasingly complex and require aggregations among numerous tables. For instance, a query for compiling year-over-year profits is best suited for an OLAP (On-Line Analytical Processing) database, which provides a multi-dimensional view of enterprise data rather than a transaction-level view.
Together, OLTP and OLAP form the two sides of the data warehousing coin. OLTP systems are the original, disparate data sources across the enterprise. On the other hand, OLAP systems integrate data from these transactional sources and present a multi-dimensional view for reporting and analytics.
Analogy-Based Explanation
Think of a business directory in the form of a book that stores transactions as they happen. If you want to see a specific customer’s transactions, you’d go to the index to see how the book’s contents are sorted and jump to the relevant page. Easy enough, but say you want to see business revenue over the last two months. You’d have to read the whole book, compile a list of transactions made over the last two months, and then add the dollar value. This would take you considerable time and resources.
You could also have another, separate book that stores contents in a different manner. It could have chapters that list Revenues, Expenditures, Sales, Inventory, etc., where transactions are grouped accordingly. Now if you want to see revenues for the last two months, you could use the index to see where data for your time period is listed and add revenues from the revenue chapter. On the other hand, if you want to a specific customer’s purchase amount, you’d have to go through all purchases and then filter them for that specific customer.
The first example is that of an OLTP database while the second depicts the functionality of an OLAP database. Let’s take a deeper look at the technical differences between the two types of databases.
Differences Between OLTP and OLAP Databases
OLTP | OLAP | |
Purpose | Manage and control fundamental business operations | Provide a consolidated view of enterprise data for reporting, decision-making, and planning |
Data source | The original source of data where business data is being recorded in real-time | Data is integrated from different OLTP systems |
Query type | Simple, standardized queries based on line items. Typically returning a few records. | Large, complex queries used for business decisions. Aggregation of tables across multiple databases is often required. |
Query form | Centered on INSERT, DELETE, and UPDATE commands | Transactions don’t need to be recorded, so mostly SELECT command is used for reporting |
View of data | A view of day-to-day business transactions is presented | A multi-dimensional view of enterprise data is presented |
Data processing | Fast processing because queries are small and simple | Complex query processing may take hours. Full-load from OLTP in batches is also time-intensive |
Data integrity | ACID-Compliance is necessary to preserve the accuracy and integrity of each transaction to keep them organized | OLAP is not frequently modified, so maintaining accuracy and integrity of data isn’t an issue |
Design requirements | Design changes as per industry requirements. Retail, airline, medical, etc. will all be using specifically designed OLTP databases | Design changes as per reporting subject. Sales, inventory, marketing, etc. will all have design changes specific to the subject |
Type of user | Used by employees on the front-line, like clerks and cashiers, to record and review transactions | Used by knowledge workers, like C-suite and analysts, to run complex queries for analysis |
Normalization | Data is stored in 3rd Normal Form (3NF) to facilitate simple queries | Data is denormalized to improve query performance when aggregation must be performed |
Backup | Data is backed up regularly without fail because it stores all functional data for the business. Legal compliance is also a factor. | Backups are rarely needed because OLAP is built to persist data. If recovery is needed, OLTP backups can be restored. |
Storage requirements | Relatively smaller because OLTP is based on current data. Most businesses archive historical data. | Significantly larger because OLAP contains data from many sources and requires storage of aggregate structures and numerous indexes to optimize query performance |
Bringing OLTP and OLAP Together to Build Your Data Warehouse
Now that you’re clear on the differences, you can start considering how to build your data warehouse by bringing together OLTP and OLAP databases. Ideally, you’d need a solution that allows you to perform Extract-Transform-Load (ETL) operations visually so you can integrate data from various OLTP databases, and preferably model them within the same solution to build your OLAP system.
Modern data warehouse automation tools reduce OLAP database creation time and effort. They allow for automated ETL code generation, provide simple interfaces for data modeling, and offer built-in transformation libraries to format data for consistency when integrating numerous OLTP systems across the enterprise.
2 comments