In-Database Analytics
What is In-Database Analytics?
Enterprise data analysis can be performed in many different ways. It is not a single set of defined processes that work for every data set or every type of business question.
In-database analytics is one way to analyze data where the processing of data is performed within a database. It has multiple benefits over traditional analytics techniques, where databases and their processing are separated.
The traditional BI approach to data analysis
Traditionally, BI data analysis is achieved by moving data out of a database or data warehouse and into a BI product before you can perform the operations to answer your complex business questions.
When databases and data warehouses were small enough that the time taken to extract the data as an initial step was minimal and not a huge burden for BI tools.
In 2003, an Encyclopedia of Information Systems article noted: “For large organizations, several gigabytes of data may be recorded in the enterprise database within a week or even a day.”
In 18 years, gigabytes have turned into terabytes and petabytes of data collected each week. The volume of business data has exploded due to automated internal logging, IoT data capture activities, and app and cloud integration data extraction.
The data extraction step now comes at a high cost in performance.
Most BI products still use an import from database function before processing data. Despite the performance cost, it remains an easy method to manipulate data, as vendors can create custom ways to model data and extract queries.
Problems with traditional business intelligence data analysis import
There are multiple problems BI products encounter with this conventional approach to data analysis.
1. Importing large datasets impacts performance
Despite increasing bandwidth capabilities, importing very large datasets to BI platforms takes time. This means you are waiting to perform any analysis. Performance and scalability are both hindered.
2. Importing large datasets cut off rapidly changing data
Lengthy import times result in quickly changing data not being included in your data set. This is a show-stopper if you need near-real-time analysis.
3. Security implications with full data import
Your data in motion on external networks are at risk as it imports from the database to your BI tools. Besides enhancing your security, having only your analysis results moving across the network limits your valuable business data exposure.
4. Proprietary data manipulation techniques
While manipulating data within a BI product may be easy for experienced users, proprietary data modeling and manipulation techniques mean that you have vendor lock-in. You may need to start again from scratch with modeling and analysis if you want to change products.
In-database analytics changes the BI game
In-database analytics does analytical querying in the database rather than importing the BI tool’s data and then performing other operations.
This is a push function from the BI tool rather than a traditional pull.
How do we work directly with the database across relational, NoSQL, and columnar databases to speak a language they all recognize? The answer is SQL. SQL remains the language of databases since the 70s, supported by every database worth its money.
BI products convert the business question to a SQL query for in-database analytics, then pass it along to the database.
With operations performed on the database, rather than in the BI tool, you remove the resource-heavy pull step, plus you now have portable code within the database you can take to another tool. Only the modeling, query, and results are sent between the database and your BI tool regarding security. The entire dataset is never in motion.
How other BI tools do in-database analytics
Some BI tools use additional data constructs inside the database, such as C++ or Java modules, to perform complex querying. This results in queries that cannot be ported, as the constructs are not standard across the different types of databases, unlike pure SQL.
The cloud has given rise to a slew of business intelligence tools that promise insights by helping you write and organize SQL queries. These tools are excellent for single-use reports and analyses by technical users who can write SQL. These tools are overly simplistic and do not allow for ad-hoc analyses and reports by business users.
Even with pure SQL, querying large and complex datasets is prone to errors, not reusable, sometimes tricky to understand, and not easy to collaborate on code.
How bipp performs in-database analytics
The bipp approach to in-database analytics is different. Unlike our competitors, we support the modern data stack and are SQL native.
Instead of slow-moving cubes or redundant data extracts (which means users can wait days for outdated dashboards), bipp integrates data analytics into your enterprise data warehouse.
Using this model, there’s no layer between the data and the dashboard so that users can rest assured the data they’re looking at is the freshest, most real-time data available. bipp lets you scale by directly hitting your database every time a report is fetched, or the dashboard is refreshed.
To see in-database analytics in action…