SQL is the only 22nd century tool available in 21st century
Here are the articles comparing architecture, database objects, data types and data modeling of Oracle with Couchbase. This will focus on SQL support.
Oracle was the first and staunch supporter of SQL. Oracle's SQL implementation beat IBM to market by two years. That changed the fortune of one or two people :-) All of the modern relational databases implement SQL. So much so, relational databases are sometimes called SQL databases, much to the chagrin of C. J. Date. Nations are known by their languages... English, French, and American(!). It's not a stretch for a class of database system to be known by their languages as well. SQL has been so effective, many big data and non-relational systems have picked up SQL as the lingua franca. SQL is here to stay, even for noSQL systems.
SQL > SQL
SQL may have started as a structured english query language, but it has grown to be so much more. Roughly, SQL now consists of:
- Data Definition Language (DDLs):
- to create the physical objects (indexes, tables, views, triggers) to represent the model
- Various data types and their exact definitions
- Declarative Query Language
- SELECT, INSERT, UPDATE, DELETE, MERGE, etc.
- Arithmetic, logical, set operators
- Transactions
- definition of the framework
- Control statements: BEGIN WORK, COMMIT, ROLLBACK, SAVEPOINT
- Procedural language
- Stored procedures (PL/SQL, T-SQL, SPL)
- Optimizer
- The query optimizer, especially the cost based optimizer,
- Modern data types, structures and features
- Spatial, text search, queues
There's ANSI SQL Standard, defined in 14 volumes. It's safe to say no one vendor implements all of the standard. Sometimes a feature is implemented in a product ahead of the standard. E.g. Hierarchical queries in Oracle is implemented with CONNECT BY extension, but the standard syntax is via recursive Common Table Expressions. Oracle supports both for backward compatibility.
SQL in Couchbase
SQL, the language and the underlying principles, has been unreasonably effective even in NoSQL databases. NoSQL databases went from "absolutely no SQL" to "Not Only SQL". Whatever works. :-). In reality, SQL itself wasn't the problem, it was the solution. When the operational databases went from relational to NoSQL model, it was difficult to implement on a distributed NoSQL database and still meet the performance and high-availability objectives. It took time, but now a number of NoSQL systems are doing that. Couchbase follows SQL closely and others to various degrees.
Couchbase has designed N1QL (SQL for JSON). Couchbase has two N1QL query engines, one for OLTP, another for OLAP. The use cases for each are straight forward and are explained further here.
High Level Comparison
(Click on the image to view PDF)
Query Processing
Oracle SQL: SQL takes one or more sets of tuples(rows), processes the set as specified by the query and give you another set of tuples (rows). While it's normal to think about row by row processing, SQL rules come from and apply to a set.
Couchbase N1QL: N1QL stands for Non 1st-normal Query Language. The goal of N1QL language and the engine is to give developers and enterprises an expressive, powerful, and complete language for querying, transforming, and manipulating JSON data. Means, it's designed to manipulate more than just numbers and strings. It can easily handled nested objects, arrays, arrays of objects, objects of arrays, arrays of arrays of objects and so forth. Real world data, and therefore JSON, can be nested and complex. You need a language to give that power to the developer.
Query Processing Architecture [Query service]
All databases have query, index, data layers. Couchbase has abstracted these into different processes interacting via APIs. Hence, you can multiple instance of these in multiple nodes of the cluster and can be combined in any combination on each node to meet specific and elastic workloads. The diagram below shows the logical flow of the query processing. Each query is run on a single node of a query service using one or more index and data nodes.
Couchbase analytics service supports N1QL, targeted for queries analyzing large amounts of data on its shared-nothing MPP engine. It can analyze data from one or more Couchbase clusters as well datasets in CSV, TSV, JSON formats. More formats are in the roadmap. It has DDLs to map data from data engine and create external data sets. It has extensive DML (SELECT) to run reports on the data, including joins, window functions, and user defined functions. See the details in the talk below. Rest of the article mainly focuses on the statements and feature in the query service targeted for OLTP (operational) workload.
Oracle | Couchbase |
CREATE DATABASE travel; Documentation Like any mature product, this statement has 100s of options for various levels of auditing, logging, character set, storage. Database->Schema->Table form the database object hierarchy. | #Create a bucket couchbase-cli bucket-create -c 192.168.1.5:8091 --bucket travel; Documentation Bucket->Scope->Collection form the database object hierarchy in Couchbase. In addition to usual options, observe two important paramters: "bucket-ramsize" and "bucket-replica" which indicates the number of data copies. "bucket-ramsize" tells the system how much memory to allocate for the bucket in EACH node -- critical for performance; "bucket-replica" is a number that tells how many copies of the data should be kept within the same cluster- critical for availability. |
CREATE SCHEMA AUTHORIZATION s1 CREATE TABLE t1(a int) GRANT SELECT on t1 to HR; Documentation | CREATE scope s1; |
CREATE TABLE t1(c1 int primary key, c2 varchar(255), c3 decimal(9,2)); CREATE TABLE t2(c1 JSON); Documentation create a table with all the columns, types, constraints, partitions defined.
| CREATE COLLECTION t1; Documentation Big difference with a table and a collection is the schema definition. The data is considered to be JSON implicitly. JSON is the model. Each document must have a user generated unique key, called document-key that lives outside the JSON document. The document itself can up to 20MiB. The collection is hash partitioned automatically into 1024 virtual partitions (called vbucket). It will also inherit the replication factor from the bucket setting. |
| ALTER TABLE
Documentation | NO ALTER necessary. Since there's no schema, columns (fields) can neither be added/dropped/modified for the whole collection. Since it's JSON, you can simply change the field type from one document to another or change the type in each document of the whole collection using UPDATE statement. The partitioning is always on the hash of the document key, which cannot be changed. |
| CREATE INDEX i1 ON t1(c2, c1);
Documentation | CREATE INDEX i1 ON t1(c2, c1);
Documentation
Indexing itself is a big topic and will be discussed in the next article. |
CREATE FUNCTION locations(vActivity)
RETURNS varchar(255)
vname varchar(255) := "";
BEGIN
SELECT name into vname
FROM `travel-sample`.inventory.landmark
WHERE activity = vActivity);
RETURN vint; END; Documentation
| CREATE FUNCTION CREATE FUNCTION locations(vActivity) { (
SELECT name
FROM `travel-sample`.inventory.landmark
WHERE activity = vActivity) }; Documentation
This shows equivalent function, each executing a SQL statement and returning some values. Except from the variable declaration, separate RETURN statement, it looks the same. Couchbase function can only have a single expression or a SELECT statement.
Use JavaScript function to write fully procedural function. Note Couchbase 7.0 has JavaScript functions and Couchbase 7.1(2022) will add the ability to execute N1QL statements (including transactions) within the JavaScript functions.
function calccircle() {
const pi = 3.141592653589793;
var radius, area;
rq = select r as rad from [1, 2, 3, 4, 5] as r;
var acc = [];
for (const row of rq) {
radius = row["rad"];
area = pi * (radius ** 2);
var qi = INSERT INTO b VALUES(UUID(), {"radius": $radius, "area": $area}) RETURNING * ;
for (const r1 of qi) {acc.push(r1);}
}
return acc;
}
create or replace function ps1() language javascript as "pselect1" at "p1" ;
EXECUTE FUNCTION calccircle();
SELECT cinfo
FROM calccircle() as cinfo; |
In addition to these frequently used DDL statements, Oracle has more than 50 different CREATE or DROP statements. Some types of objects are useful (e.g. sequences) and are unsupported in Couchbase. You'll have to use work arounds for that. Others like CREATE JAVA, CREATE INMEMORY JOIN GROUP may be obscure and unnecessary.
DATA MANIPULATION LANGUAGE STATEMENTS
In Oracle, SELECT, INSERT, UPDATE, DELETE, UPSERT, MERGE are the main data manipulation statements. Same is true for Couchbase N1QL language. SELECT is the most used statement - In Oracle, just the SELECT syntax diagram itself runs into 24 pages. It the most features and complexity. Other statements, while typically simple and straightforward, can use SELECT as a subquery and things can get complicated. Especially when the SQL is generated by tools. Let's compare the common features, things that are common and things are different.
Recommended prereq:
1. Couchbase N1QL boolean logic: here
2. Couchbase N1QL Data types: here
-- Comparison to Oracle Data types
3. Couchbase N1QL Literals: here
4. Couchbase N1QL Identifiers: here
DML Comparison:
ORACLE | Couchbase |
SELECT: Projection
The projection clause can have simple “*”, column references, expressions, subqueries, aggregate expressions, window functions.
| SELECT: Projection or simply SELECT clause
You’ll find the projection is similar to SQL’s projection with support for *, field references, and so forth. Couchbase even has support for window functions. Being a query engine for JSON, it also support JSON expressions in the projection. SELECT {"name": fn || ln } FROM cxdata; |
SELECT: FROM clause
The FROM clause can have a table reference, subqueries, or a table function. These tables can joined via one of the JOIN operations supported in Oracle. It supports a rich set of JOIN operations. | SELECT: FROM clause
Couchbase FROM clause can have a collection reference, subqueries or a table function. It can also have a JSON expression. Couchbase query supports INNER JOIN, LEFT OUTER JOIN and a limited RIGHT OUTER JOIN in the query. It also supports NEST, UNNEST, NEST OUTER, UNEST OUTER functionality to create nested and flatten the JSON document. |
SELECT: WHERE clause
Supports simple logical expressions, correlated, non-correlated subqueries. | SELECT: WHERE clause
Supports simple logical expressions, correlated, non-correlated subqueries. |
Standard clauses with GROUP BY, ORDER BY, HAVING clauses. | N1QL supports all these with the same syntax. |
CTE: Common Table Expressions Non-recursive CTEs Recursive CTEs
| CTE: Common Table Expressions N1QL supports only non-recursive CTEs. Recursive CTEs are unsupported (Feb, 2022). |
CONNECT BY hierarchical queries | Unsupported. Have to work around by client program. |
Window analytic functions & Window Clause | Window functions & Window clause |
Pagination ROWNUM OFFSET X ROWS FETCH NEXT Y ROWS ONLY; ROW_NUMBER() OVER()
| Pagination OFFSET X LIMIT Y ROW_NUMBER() OVER()
|
Set operators: UNION, UNION ALL, INTERSECT, MINUS | Set operators: UNION [ALL], INTERSECT [ALL], EXCEPT [ALL] |
INSERT statement Single table insert Insert into…select Multi table insert Some more options
| INSERT statement Single table insert Single table multi-value support Insert into…select
|
UPDATE, DELETE, MERGE statements | UPDATE, DELETE and MERGE statements are very similar to Oracle’s respective statements. Important to note that by default, Couchbase supports single document atomicity. To achieve the SQL like multi-document atomicity, use singleton transaction feature or a full multi-statement transaction with N1QL. |
This should give you an overview of commonly used SQL and N1QL statements in Oracle and Couchbase. Each product is too extensive to cover comprehensively. The official documentation will give you further details. We'll go deeper into DML. Next up are indexes, optimizer and transactions.
Comments
Post a Comment