Thursday, June 21

Author: Lan Zagar

pgpredict – Predictive analytics in PostgreSQL

2ndQuadrant, Data Mining, Lan's PlanetPostgreSQL, PostgreSQL
We all realize how important it is to be able to analyze the data we gather and extract useful information from it. 2UDA is a step in that direction: it aims to bring together data storage and management (PostgreSQL) with data mining and analysis (Orange). pgpredict is a project in development that aims to be the next step and bring it all full circle. Starting with data (in our case stored in a database), we first need to give experts access to it so they can analyze it with specialized tools and methods. Afterwards, when they have, for example, trained a predictive model that solves a problem important to us, they need to be able to convey those results back so we can exploit them. This is precisely what pgpredict tries to solve: deploying predictive models directly (more…)

2UDA – New features in Orange (Part 2)

Data Mining
Orange is continuously being improved and made friendlier and more useful, based on user feedback and experience. Some new features were already described in Part 1 of this blog series. Two other features that appeared recently (available in the latest 2UDA package) are the Color widget and the reporting functionality. The Color widget introduces a handy new way to assign colors to variable values, appending this information to the data so that it is used in all subsequent visualizations. In older versions, the user had the option to set the color palette in individual visualization widgets, which meant that the process had to be repeated in each widget. To try out the new feature with 2UDA, start Orange and use the SQL Table widget to open the sample_cars table from the (more…)

2UDA RC1 – New features in Orange (Part 1)

Data Mining
The 2UDA installation package was recently updated to include the newly released PostgreSQL 9.5 RC1. The new package also contains an updated version of Orange, which brings new features, improvements, and bug fixes. A summary of the more noticeable changes can be found in the 2UDA release notes. In this first of a series of posts, I will explore changes related to working with data stored in PostgreSQL databases: logging, approximate preprocessing, materializing queries, and schema selection. There are lots of other features to talk about, so stay tuned for the subsequent posts.

Logging

Let’s start with the new logging functionality. For anyone interested in exploring larger databases or performing more complex analyses with 2UDA, it is now much easier to see where most of the time is (more…)