Skip to main content

Informix NoSQL: hybrid storage and access details.


On December 17th, John Miller talked about Informix NoSQL hybrid storage.

The presentation and audio will soon be posted at: http://ibm.co/JFkr1B

In Q&A, there was a question and comment by Lyn Robison from Gartner on advising customers to use the right data model for the right problem/application. We completely agree. We'd like customers to use the right model for the right problem. With informix not only you can use either standard relational model or flexible schema with JSON for your appdev. Obviously, this can create data silos if your API is tied to specific data model. We believe data model should not restrict data access. Hence the hybrid access.






So, how does this hybrid access work in Informix NoSQL?
Informix provides both SQL and NoSQL semantics. You can define regular relational schema or simply use MongoAPI and Informix creates the database, collections and documents just like MongoDB does. Let’s look at the implementation details. Some of these are already in our detailed deepdive at:slidesha.re/1gEGXW6

Data representation:

SQL:  SQL (relational) data is stored in regular tables with rows and columns.  Logical schema is distinct from the physical schema.

NoSQL: Flexible schema for the data means, no upfront definition of tables or rows or columns or their types. Each row could have random values.  To achieve this, NoSQL databases like MongoDB store data in JSON (actually in its binary form called BSON).  JSON is a series of key-value pair.  You can nest key-value pars within other key value pairs to form hierarchical structures or arrays.  This is generally referred to as document-structure.

For each NoSQL collection, we create a table with a BSON data type. BSON is Binary JSON (http://bsonspec.org/). JSON is Java Script Object Notation (http://www.json.org/).  All the MongoDB APIs exchange information with the server using BSON.   When the client sends data as BSON, it’s stored AS IS in BSON.

NoSQL pays more attention to application flexibility and agile appdev instead of storage efficiency.  So, for now, additional space for key in key-value pair is fine.  Eventually, all the databases will be looking at making JSON/BSON storage efficient. Within Informix, you can use compression to get space savings. 

Query Processing:
NoSQL API on SQL data:

Relational data (relations or resultset) can treated as “regular” JSON documents; columnname-value becomes key-value pair.  So, translating between relational data to JSON and vice-versa becomes easy.  

This helps enterprises with large dataset in relational tables to use it in the web appdev stack like MEAN.  

SELECT partner, pnum, country from partners;

 partner                 pnum            Country
 Pronto                   1748                    Australia
 Kazer                    1746                    USA           
 Diester                  1472                    Spain
 Consultix                1742                    France

{parnter: “Pronto”, pnum:”1748”, Country: “Australia”}
{parnter: “Kazer”, pnum:”1746”, Country: “USA”}
{parnter: “Diester”, pnum:”1472”, Country: “Spain”}
{parnter: “Consultix”, pnum:”1742”, Country: “France”}

Listner translates the query and the data object between relational and JSON/BSON form.  

db.partners.find({name:”Pronto”}, {pnum:1, country:1}).sort({b:-1})

SELECT a, b FROM t WHERE a = 2.0 ORDER BY b DESC;

db.partners.save({pnum:1632, name:”EuroTop”, Country: “Belgium”});
INSERT into partners(pnum, name, country values(1632, ”EuroTop”, “Belgium”);
db.partners.delete({name:”Artics”});
DELETE FROM PARTNERS WHERE name = “Artics”;
Db.partners.update({country:”Holland”},{$set:{country:”Netherland”}}, {multi: true});
UPDATE partners SET country = “Netherland” WHERE country = “Holland”;

The moment you have this kind of access, from MongoDB API, you can exploit the relational database features like transactions, views, joins, grouping, OLAP window functions, stored procedures, etc.

In this case, if a JSON query references a non-existent column, they’ll get the error.  The intent is not to simply extend existing relational schema, but to make the existing enterprise data available to new APIs seamlessly.

SQL on NoSQL data:

If you have the following JSON collections:

JSON Collection V:
{ "_id" : ObjectId("526a1bb1e1819d4d98a5fa4b"), "c1" : 1, "c2" : 2 }

JSON Collection w:
{ "_id" : ObjectId("526b005cfb9d36501fe31605"), "x" : 255, "y" : 265, "z" : 395}
{ "_id" : ObjectId("52716cdcf278706cb7bf9955"), "x" : 255, "y" : 265, "z" : 395}
{ "_id" : ObjectId("5271c749fa29acaeba572302"), "x" : 1, "y" : 2, "z" : 3 }

Ideally, you would want to use SQL like this:

SELECT V.c1, V.c2, W.x,W.y,W.z
FROM V, W
WHERE V.c1 = W.x;


Currently, you’ll have to use expressions and dotted notations to extract the specific key-value pairs.

SELECT bson_value_int(jc1.data, 'x'),
       bson_value_lvarchar(jc1.data, 'y'), 
        bson_value_int(jc1.data, 'z') , 
        bson_value_int(jc2.data, 'c1'), 
        bson_value_lvarchar(jc2.data, 'c2') 
FROM w jc1, v jc2
WHERE  bson_value_int(jc1.data, 'x') =   bson_value_int(jc2.data, "c1");

You can also create views on top of these and make the access much simpler for application developers.

create view vwjc(jc1x, jc1y, jc1z, jc2c1, jc2c2) as
SELECT bson_value_int(jc1.data, 'x'),
       bson_value_lvarchar(jc1.data, 'y'), 
        bson_value_int(jc1.data, 'z') , 
        bson_value_int(jc2.data, 'c1'), 
        bson_value_lvarchar(jc2.data, 'c2') 
FROM w jc1, v jc2
WHERE bson_value_int(jc1.data, 'x') = bson_value_int(jc2.data, 'c1');

Summary:
You can model using relational or NoSQL concepts within the same database and access data from  either SQL and MongoDB API without replicating the data or having ETL.  Since you just have one copy of the data, you'll be accessing the consistent copy of the data.












Comments

Popular posts from this blog

Swami Vivekananda: The Monk That Nobody Sent to Chicago

  There’s a saying in Chicago: “We don’t want nobody that nobody sent.” This was the cold reception Swami Vivekananda faced when he arrived in the windy city in July 1893, determined to attend the World Parliament of Religions that September. He belonged to no organization, carried no letter of recommendation, his countrymen were nobody, and represented an alien religion to the Western world. As the days passed, his hope of attending the parliament dwindled. With money running out and the odds stacked against him, he left the Windy City and went to Boston, praying for a glimmer of opportunity.  Swamiji came to America to share India’s most profound gift: the wisdom of the Hindu sages, preserved through centuries of oral tradition and embodied by its monks. This was 1893, not 1993—India was under the British grip, its resources drained, and its spirit subdued. Swamiji’s mission was not just a cultural exchange; it was a bold step toward envisioning a future where India could re...

Why Should Databases Go Natural?

From search to CRM, applications are adopting natural language and intuitive interactions. Should databases follow? This article provides a strategic perspective. Amid the many technological evolutions in software and hardware (CISC/RISC, Internet, Cloud, and AI), one technology has endured:  Relational Database Systems   (RDBMS), aka SQL databases. For over 50 years, RDBMS has survived and thrived, overcoming many challenges. It has evolved and adopted beneficial features from emerging technologies like object-relational databases and now competes robustly with   NoSQL databases .  Today, RDBMS dominates the market, with four of the top five databases and seven of the top ten being relational. RDBMS has smartly borrowed ideas, like JSON support, from NoSQL, while NoSQL has also borrowed from RDBMS. NoSQL no longer rejects SQL. From a user perspective, all modern databases have SQL-inspired query language and a set of APIs. All applications manage the respective data...

iQ Interactive: Cool Things for Developers on Couchbase Capella iQ

  The landscape of software development is ever-evolving with the advent of new technologies. As we venture into 2023, natural language processing ( NLP ) is rapidly emerging as a pivotal aspect of programming. Unlike previous generations of tools that primarily aimed at enhancing coding productivity and code quality, the new generation of Artificial Intelligence ( GenAI ) tools, like iQ, is set to revolutionize every facet of a developer's workflow. This encompasses a wide range of activities: Reading, writing, and rewriting specifications Designing, prototyping, and coding Reviewing, refactoring, and verifying software Going through the iterative cycle of deploying, debugging, and improving the software Create a draft schema and sample data for any use case Natural language queries. Generate sample queries on a given dataset Fix the syntax error for a query Don't stop here. Let your imagination fly. Although the insights garnered from iQ are preliminary and should be treated ...