OLAP (short for Online Analytical Processing cubes) refers to multidimensional databases optimized for data warehouse requirements and specific OLAP applications. The OLAP cube is a technique of storing data (or measures) in a multidimensional system, usually for reporting purposes. OLAP’s online nature makes the multidimensional data model a crucial part of it. With OLAP, analysts post complex iterative queries during interactive sessions instead of overnight batch jobs. Multidimensional modeling allows OLAP cubes to resolve these complex queries in real time. The query time over relational databases is further enhanced by the pre-summarization of OLAP cubes across dimensions.
The model also enforces simplicity. It is composed of logical cubes, measures, hierarchies, dimensions, attributes, and levels, and is inherently simple as it defines objects representing real-world business entities. This simplicity, as Ralph Kimball says in The Data Warehouse Toolkit, is “the central attraction of the dimensional model of a business…(it) is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently.” When using OLAP, analysts will understand the business measures they need to examine, the attributes and dimensions that make these measures meaningful, and the organization of their business dimensions into levels and hierarchies.
How Are OLAP Cubes Structured?
OLAP cubes store data just like a traditional database, but they are structured quite differently. Where databases have historically been designed according to the needs of the IT systems, OLAP cubes are created for business users for advanced analytics, using business logic and understanding. OLAP cubes need to be capable of simultaneously reporting on millions of records and are thus optimized for analysis. In fact, business users can use plain English to query OLAP cubes.
OLAP cubes are made of dimensions – these are lists or categories of related items that express a specific facet of a business and are derived from dimension tables. For instance, if the business in question is a clothing brand, the dimensions could be internal departments, store locations or phone models. Dimensions are akin to the rows and columns that compose a pivot table. A data set of multiple dimensions makes a cube, which can hold data, calculate data, and collect data from the users. One cube will generally store data relevant to one business aspect, like revenue or items in stock. A three-dimensional cube arises when you create duplicate spreadsheets with the same data structure.
Within OLAP cubes, measures are the numeric facts categorized by dimensions; they are derived from records in the fact table and give you information about the specific quantities you are interested in. Dimensions are akin to a category of related items; in this vein, hierarchies can be thought of as subcategories. Hierarchies contain multiple levels and let you “drill up” or “drill down” on your data, thus permitting data analysis at different granularity levels.
Do You Need to Use OLAP Cubes?
The answer depends on your specific needs. OLAP’s functionality derives from how it analyzes data through multiple dimensions and processes large data sets in a way that is impossible using a standard spreadsheet. Using cubes, you could even make data discoveries that would not have been possible otherwise.
In comparison to two-dimensional relational tables, the cube shape represents the physical storage of multidimensional measures in an analytic workplace. This shape allows for rotation – a key advantage of dimensional modeling. Different analysts will need to view data in different ways, and OLAP allows them to manipulate and view the data according to their unique requirements. It makes for a powerful online visualization system with faster response time. Users can, therefore, slice and dice cube data using various filters, measures, and dimensions. It is also easier to find outliers in the clusters with OLAP.
Businesses can use OLAP cubes for a range of activities, including budgeting, planning, analysis, and reporting. All information reporting and calculations remain consistent in an OLAP cube, and it allows for the quick creation and analysis of “what if” scenarios that help with forecasting. What’s more, the OLAP database is easily searchable for both specific and broad terms. OLAP also provides the building blocks of data mining tools, business modeling tools, and performance reporting tools. Additionally, it is an especially productive software when analyzing time series. These properties make OLAP especially suitable within the context of big data and data warehouses.
That said, dimensional models exclude naturally non-hierarchal data like network structured data, thanks to an inherent assumption of an underlying data hierarchy. This means the OLAP system cannot access transactional data. The number of dimensions a single OLAP cube can contain is also restricted. Any time a modification is required, the entire cube needs to be fully updated in a time-consuming process. Finally, OLAP cubes require data to be organized into a snowflake or star schema, both of which are difficult to administer or implement.
Wondering if you need OLAP cubes for effective data warehouse management? Contact our data architects to find out more about dimensional modeling.
One comment