
MongoDB to Couchbase: An Introduction to Developers and Experts


 

Six thousand years ago, the Sumerians invented writing for transaction processing - Gray and Reuter

By any measure, MongoDB is a popular document-oriented JSON database. In the last dozen years, it has grown from its humble beginnings of a single lock per database to a modern database with multi-document transactions and snapshot isolation. MongoDB University has trained a large number of developers to develop on the MongoDB database.

There are many JSON databases now. While it's easy to start with MongoDB to learn NoSQL and flexible JSON schema, many customers choose Couchbase for performance, scale, and SQL. As you progress in your database evaluation and evolution, you should learn about other JSON databases. We're working on an online training course for MongoDB experts to learn Couchbase easily. Until we publish that, you'll have to read this article. :)

If you know an RDBMS like Microsoft SQL Server or Oracle, we have two easy-to-follow courses that map your database knowledge to Couchbase:

  1. CB116m - Intro to Couchbase for MSSQL Experts
  2. CB116o - Introduction to Couchbase for Oracle Experts

Summary

MongoDB and Couchbase have many things in common. Both are NoSQL distributed databases, use the JSON model, have high-level query languages with support for select-join-project operations, have secondary indexes, have an optimizer that chooses the query plan automatically, and support intra-cluster and inter-cluster replication.

As you'd expect, there are differences. Some are more significant than others. Couchbase is designed to be distributed from the get-go. For example, the data container, the Bucket, is always distributed (nothing to share). Simply add new nodes and the system will automatically redistribute the data. Intra-cluster replication requires no new servers: simply set the number of replicas and you're all set. From the developer interaction perspective, the big difference is the query language itself: MongoDB has a proprietary query language and Couchbase has N1QL, SQL for JSON. MongoDB uses its B-Tree-based index for search as well and recently released $searchbeta for the Atlas service using Apache Lucene; Couchbase has a built-in Full-Text Search.

High-Level Topics:

  1. Resources
  2. Architecture
  3. Database Objects
  4. Data Types
  5. Data Model
  6. SDK
  7. Query Language
  8. Indexes
  9. Optimizer
  10. Transactions
  11. Analytics

Resources

Architecture


Laptop Version: 

Laptop version architecture infographic.

MongoDB: Simply install mongod on your laptop and start it with the right parameters, and you're up and running. A single process handles the whole database. This has changed a little bit in 4.2, where you'd need mongos to run your transactions. All of the MongoDB features (data, indexing, query) are available here, except full-text search, which is available only on the Atlas service.

Laptop version architecture infographic for Couchbase.

Couchbase: Couchbase is different. It has abstracted each of the services (data, index, query, search, analytics, eventing), and you can choose which of them to run on your instance to optimize resources. A typical installation has data, index, and query. Search, eventing, and analytics will also run on your laptop; install and use them per your use case.

Cluster deployment: As with most NoSQL databases, both MongoDB and Couchbase can scale out. In MongoDB, you can scale by sharding the collection across multiple nodes. You can shard by hash or range. Without explicit sharding, each collection remains in a single shard. The config servers store the metadata and configuration for the cluster. MongoDB is uniformly distributed and Couchbase is multi-dimensionally distributed. The mongod process (service) manages data, index, and query on every shard (node), whereas mongos does the distributed query processing and merging of intermediate results and does not manage any data or index. Mongos acts as the coordinator and mongod is the worker bee.


MongoDB Cluster infographic.

Couchbase can be deployed in a uniform distribution with each node managing the data and all services: data, index, query, analytics, and eventing. Each service corresponds to a layer in a traditional database. These services are loosely coupled: they run in different process spaces and communicate via a network, hence they can be deployed uniformly in a single node or distributed multi-dimensionally on a cluster.

The choice depends on your workload and SLAs. The data itself is stored in buckets. All the buckets are hash partitioned among given nodes — this is automatic and doesn't require any specification. When the application has the document keys, it can directly operate on the data without any intervening nodes. This is one of the key architectural differences contributing to the high performance and scale-out of Couchbase. 
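The automatic hash partitioning described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the FNV-1a hash is a stand-in rather than the hash Couchbase actually uses, the node names and the round-robin assignment are hypothetical, and the partition count of 1024 is the vbucket figure quoted elsewhere in this article. The point is that a client holding the partition map can go from a document key straight to the owning node.

```javascript
// Illustrative only: FNV-1a stands in for the real hash; node names are hypothetical.
const NUM_VBUCKETS = 1024;

// 32-bit FNV-1a hash of a string key.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a document key to one of the fixed partitions.
function vbucketFor(key) {
  return fnv1a(key) % NUM_VBUCKETS;
}

// Assign partitions to nodes round-robin; the client keeps this map locally.
function buildPartitionMap(nodes) {
  const map = new Array(NUM_VBUCKETS);
  for (let vb = 0; vb < NUM_VBUCKETS; vb++) {
    map[vb] = nodes[vb % nodes.length];
  }
  return map;
}

// Key -> partition -> node, with no intermediate router involved.
function nodeFor(key, partitionMap) {
  return partitionMap[vbucketFor(key)];
}

const partitionMap = buildPartitionMap(["node-a", "node-b", "node-c"]);
console.log(nodeFor("customer::xyz124", partitionMap));
```

Adding a node in this model only means rebuilding the partition map; the key-to-partition step stays the same, which is why the application never has to specify a sharding strategy.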

In addition, there are no config servers. The metadata and its management are built into the core database. The data service manages data, clusters, and replication within a Couchbase cluster. Replication between multiple Couchbase clusters is managed by XDCR. Read this article to understand the replication mechanisms in MongoDB and Couchbase: Replication in NoSQL document databases (Mongo DB vs Couchbase).

Couchbase Cluster infographic.


Inside the Cluster Deployment

MongoDB's cluster components and deployment are explained here, and I assume that it is prior knowledge, so I'll avoid repeating.

Couchbase deployment starts with the key-value data service. This is the (consistent) hash-distributed key-value data store. It also has intra-cluster replication built in, eliminating any need for separate replica servers or config servers. The query service orchestrates the execution of N1QL queries using GSI (Global Secondary Indexing) and FTS (Full-Text Search) indexes as needed. FTS manages the full-text index and can be queried directly or via the N1QL query service. The Eventing service enables you to automatically trigger an action (by executing a JavaScript function) upon data mutation. The Couchbase Analytics engine is an MPP data and query engine. It makes a copy of the data and redistributes it into its nodes, and it executes queries in parallel for the best performance possible. All of these can be used seamlessly through the rich set of APIs in our SDKs, which are available in all the popular languages.


Couchbase Cluster infographic.


Database Objects

MongoDB has collections and databases as the logical objects users work with. Couchbase traditionally had just buckets. A bucket served for resource management (e.g. amount of memory used) and security, as well as being the data container. In 6.5, we introduced the notion of collection and scope as a developer preview. This bucket:scope:collection hierarchy is analogous to an RDBMS's database:schema:table. This makes the database more secure and better suited to multi-tenancy. In 6.5, without the developer preview, each bucket uses a default scope and collection, making the transition seamless.

MongoDB and Couchbase comparison infographic.

| RDBMS | MongoDB | Couchbase |
|---|---|---|
| Database | Database | Bucket |
| Table | Collection | Bucket (Future: Collection) |
| Row | Document (BSON) | Document (standard JSON) |
| Column | Field/Attribute | Field/Attribute |
| Partition (Table/collection/bucket) | Not partitioned by default. Hash & range partitioning (sharding) is supported manually. | Partition (hash automatic) |


Notes to Developers

In MongoDB, you start with your instance (deployment) and create databases, collections, and indexes.

In Couchbase, you start with your instance and create your buckets and indexes. Each bucket can have multiple types of documents, so each document should have an application-designated field for recognizing its type, for example {"type": "parts"}. Since each bucket can have any number of types of documents, you should avoid creating too many buckets. This also means that when you create an index, you'll usually want one per type: customer, parts, orders, etc. So, the index creation will include a WHERE clause for the document type.

SQL

CREATE INDEX ix_customer_zip ON customer(zip) WHERE type = "customer";
SELECT * FROM customer WHERE zip = 94040 AND type = "customer";



Each MongoDB document contains an explicitly provided or implicitly generated document id field _id.

In Couchbase, the users should generate and insert an immutable document key for each document. When inserting via N1QL, you can use the UUID() function to generate one for you. But, it's a good practice to have a regular structure for the document key.
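As a minimal sketch of what "a regular structure for the document key" could look like, the snippet below builds keys of the form `<type>::<id>` and stamps the same type into the document body. The `::` separator and the helper names `makeDocumentKey` and `makeCustomer` are assumptions chosen for illustration, not a Couchbase requirement or API.

```javascript
// Hypothetical convention: "<type>::<id>". The "::" separator is an assumption.
function makeDocumentKey(type, id) {
  if (!type || !id) {
    throw new Error("both type and id are required");
  }
  return `${type}::${id}`;
}

// Build the key/value pair an application would insert.
// The "type" field in the value is what a WHERE type = "customer" index filters on.
function makeCustomer(id, fields) {
  return {
    key: makeDocumentKey("customer", id),
    value: { ...fields, type: "customer", id },
  };
}

const entry = makeCustomer("xyz124", { name: "Joe Montana", zip: 94040 });
console.log(entry.key); // customer::xyz124
console.log(entry.value);
```

A predictable key like this lets the application fetch a document directly by key when it already knows the type and the id, without running a query first.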

Data Types

MongoDB's data model is BSON and Couchbase's data model is JSON. The proprietary BSON format has some types that JSON does not. JSON has string, numeric, boolean (true/false), array, and object types. BSON has string, numeric, boolean, array, object, binary, UTC DateTime, timestamp, and many other custom proprietary extensions. The most common difference is DateTime and timestamp. In Couchbase, all time-related data is stored as a string in ISO 8601 format. Couchbase N1QL has a plethora of functions to extract, convert, and calculate on time values. Full function details are available in this article.

| Data Type | MongoDB | Couchbase | JSON |
|---|---|---|---|
| Numbers | BSON Number | JSON Number | { "id": 5, "balance": 2942.59 } |
| String | BSON String | JSON String | { "name": "Joe", "city": "Morrisville" } |
| boolean | BSON Boolean | JSON Boolean | { "premium": true, "pending": false } |
| datetime | Custom data format | JSON ISO 8601 string with extract, convert, and arithmetic functions | { "soldate": "2017-10-12T13:47:41.068-07:00" }; MongoDB: { "soldate": ISODate("2012-12-19T06:01:17.171Z") } |
| spatial data | GeoJSON | Supports nearest neighbor and spatial distance. | "geometry": {"type": "Point", "coordinates": [-104.99404, 39.75621]} |
| MISSING | Unsupported | MISSING | |
| NULL | JSON null | JSON null | { "last_address": null } |
| Objects | Flexible JSON Objects | Flexible JSON Objects | { "address": {"street": "1, Main street", "city": "Morrisville", "zip": "94824"} } |
| Arrays | Flexible JSON Arrays | Flexible JSON Arrays | { "hobbies": ["tennis", "skiing", "lego"] } |
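To illustrate what storing time as an ISO 8601 string means in practice, here is a small sketch using only the JavaScript standard library. It is not the N1QL date API; the helper names `toUtcIso` and `millisBetween` are hypothetical. It shows the same kind of convert and calculate steps applied to the `soldate` value from the table above.

```javascript
// Normalize an ISO 8601 string to UTC so that string order matches time order.
function toUtcIso(isoString) {
  const date = new Date(isoString);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`not a valid date: ${isoString}`);
  }
  return date.toISOString();
}

// Milliseconds elapsed between two ISO 8601 strings.
function millisBetween(startIso, endIso) {
  return new Date(endIso).getTime() - new Date(startIso).getTime();
}

const sold = toUtcIso("2017-10-12T13:47:41.068-07:00");
console.log(sold); // 2017-10-12T20:47:41.068Z
console.log(millisBetween("2017-10-12T00:00:00Z", "2017-10-13T00:00:00Z")); // 86400000
```

Once every value is normalized to the same offset, plain string comparison sorts the values chronologically, which is one reason the string representation works well with ordinary indexes.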


All About MISSING

MISSING is the value of a field absent in the JSON document or literal.

In the document {"name":"joe"}, every field other than "name" is MISSING. You can also set the value of a field to MISSING to make the field disappear. Traditional relational databases use three-valued logic with true, false, and NULL. With the addition of MISSING, N1QL uses four-valued logic.

You have the following expressions with MISSING:

| Expression | Meaning | Example |
|---|---|---|
| IS MISSING | Returns true if the document does not have a status field | SELECT * FROM CUSTOMER WHERE status IS MISSING; |
| IS NOT MISSING | Returns true if the document has a status field | SELECT * FROM CUSTOMER WHERE status IS NOT MISSING; |
| MISSING and NULL | MISSING is a known missing quantity; null is a known UNKNOWN. You can check for a null value, similar to MISSING, with the IS NULL or IS NOT NULL expression. | Valid JSON: {"status": null} |
| MISSING value | Make a field of any type disappear by setting it to MISSING | UPDATE CUSTOMER SET status = MISSING WHERE cxid = "xyz232" |
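JavaScript happens to offer a close analogy for this distinction, which may help when reasoning about it: an absent property behaves like MISSING and an explicit `null` behaves like NULL. The sketch below is only that analogy, not N1QL semantics in full; `fieldState` and `unsetField` are hypothetical helper names.

```javascript
// Analogy: an absent property plays the role of MISSING, null plays the role of NULL.
function fieldState(doc, field) {
  if (!Object.prototype.hasOwnProperty.call(doc, field)) return "MISSING";
  if (doc[field] === null) return "NULL";
  return "VALUE";
}

// Analogous to SET field = MISSING: the field disappears from the document.
function unsetField(doc, field) {
  const copy = { ...doc };
  delete copy[field];
  return copy;
}

const customers = [
  { cxid: "xyz232", status: "Premium" },
  { cxid: "abc111", status: null },
  { cxid: "pqr482" },
];

console.log(customers.map((c) => fieldState(c, "status"))); // [ 'VALUE', 'NULL', 'MISSING' ]
console.log(fieldState(unsetField(customers[0], "status"), "status")); // MISSING
```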


Data Modeling

| Relationship | MongoDB | Couchbase |
|---|---|---|
| 1:1 | Embedded Object (implicit); Document Key Reference | Embedded Object (implicit); Document Key Reference |
| 1:N | Embedded Array of Objects; Document Key Reference; Query with $lookup operator | Embedded Array of Objects; Document Key Reference; Query with INNER, LEFT OUTER, RIGHT OUTER, NEST, UNNEST joins |
| N:M | Embedded Array of Objects; Arrays of objects with references; Difficult to query with $lookup operator | Embedded Array of Objects; Arrays of objects with references; Query with INNER, LEFT OUTER, RIGHT OUTER, NEST, UNNEST joins |

Physical Space Management

| Item | MongoDB | Couchbase |
|---|---|---|
| Table Storage | File system directory | File system directory |
| Index Storage | File system directory | File system directory |
| Partitioning - Data | Range and hash sharding are supported. | Hash partitioning. Stored in 1024 vbuckets. |
| Partitioning - Index | Tied to the collection sharding strategy since all (sub) indexes are local to each mongod node. | Always detached from Bucket. Global Index (can use a different strategy than the bucket/collection). Supports hash partitioning of the indexes. Range partitioning is manual via partial indexes. |


SDKs

My personal knowledge of both SDKs is limited.  There should be equivalent APIs, drivers, and connectors with the two products.  If not, please let us know.

| SDK | MongoDB | Couchbase |
|---|---|---|
| Java | MongoDB Java driver | Couchbase Java SDK, Simba & CDATA JDBC |
| C | MongoDB C Driver, ODBC driver | Couchbase C SDK, Simba & CDATA ODBC |
| .NET, LINQ | MongoDB .NET provider | Couchbase .NET provider, LINQ provider |
| PHP, Python, Perl, Node.js | MongoDB SDK on all these languages | Couchbase SDK on all these languages |
| golang | MongoDB Go SDK | Couchbase Go SDK |

Query Language

SELECT

MongoDB has multiple APIs for selecting documents. Both find() and aggregate() can do the job of simple SELECT statements. We'll look at aggregate() later in the section.

SQL

/* MongoDB */
db.CUSTOMER.find({zip: 94040})

/* Couchbase: N1QL */
SELECT * FROM CUSTOMER WHERE zip = 94040;




In MongoDB, providing _id is optional. If you don't provide its value, Mongo will generate the field value and save it. Providing document KEY is mandatory in Couchbase.

SQL

/* MongoDB */
db.CUSTOMER.save({_id: "xyz124", id: "xyz124", name: "Joe Montana", status: "Premium", zip: 94040})

/* Couchbase: N1QL */
INSERT INTO CUSTOMER (KEY, VALUE) VALUES
("xyz124", {"id": "xyz124", "name": "Joe Montana", "status": "Premium", "zip": 94040})



SQL

/* MongoDB: $set changes only the zip field; without it, the whole document is replaced */
db.CUSTOMER.update({_id: "xyz124"}, {$set: {zip: 94587}})

/* Couchbase: N1QL */
UPDATE CUSTOMER SET zip = 94587 WHERE id = "xyz124"



SQL

/* MongoDB */
db.CUSTOMER.remove({_id: "pqr482"})

/* Couchbase: N1QL. One of the statements will do for this data/schema. */
DELETE FROM CUSTOMER WHERE id = "pqr482";
DELETE FROM CUSTOMER WHERE META().id = "pqr482";



MERGE

A MERGE operation on a set of JSON documents is often required as part of your ETL process or daily updates. MERGE statements can involve complex data sources with complex business rule-based predicates. Couchbase provides the standard MERGE operation with the same semantics. In MongoDB, you have to write a long program to do this, and some of the set operation rules (e.g. each document should be updated ONLY once) are difficult to enforce from an application. In Couchbase, you can simply use the MERGE statement, just like in an RDBMS.

SQL

/* MongoDB */
/* Unavailable. Need to work around using aggregate(), a custom-logic program, and update(). */

/* Couchbase: N1QL. The second statement is ANSI SQL compliant. */
MERGE INTO CUSTOMER
     USING (SELECT id FROM CN WHERE x < 10) AS CN
            ON KEY CN.id WHEN MATCHED THEN
                  UPDATE SET CUSTOMER.o4=1;

MERGE INTO CUSTOMER
     USING (SELECT id FROM CN WHERE x < 10) AS CN
            ON (CN.id = META(CUSTOMER).id) WHEN MATCHED THEN
                  UPDATE SET CUSTOMER.o4=1;
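To make the "each document should be updated only once" rule concrete, here is a sketch of what an application-side merge has to do by hand. It is an in-memory simplification with hypothetical names (`mergeUpdate`, the `customer` map, the `cn` rows), it covers only the WHEN MATCHED THEN UPDATE branch, and it makes no claim about how either database implements the operation.

```javascript
// In-memory stand-ins: `target` maps document key -> document,
// `sourceRows` plays the role of the USING subquery result.
function mergeUpdate(target, sourceRows, keyOf, applyUpdate) {
  const updated = new Set();
  for (const row of sourceRows) {
    const key = keyOf(row);
    if (!target.has(key)) continue; // no WHEN NOT MATCHED branch in this sketch
    if (updated.has(key)) {
      // Enforce the set-operation rule: a target document is updated at most once.
      throw new Error(`document ${key} matched more than once`);
    }
    target.set(key, applyUpdate(target.get(key), row));
    updated.add(key);
  }
  return updated.size;
}

const customer = new Map([
  ["c1", { name: "Joe" }],
  ["c2", { name: "Ann" }],
]);
const cn = [{ id: "c1", x: 3 }, { id: "c9", x: 5 }];

const count = mergeUpdate(customer, cn, (row) => row.id, (doc) => ({ ...doc, o4: 1 }));
console.log(count); // 1
console.log(customer.get("c1")); // { name: 'Joe', o4: 1 }
```

Even this small version needs bookkeeping for duplicate matches, which is the kind of logic a declarative MERGE statement takes off the application's hands.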



DESCRIBE:

JSON data is self-describing and flexible. MongoDB Schema helper is available via Compass visualization in the Enterprise Edition only.

Couchbase has INFER to analyze and understand the schema. Both the query service and the analytics service can infer schema.

    1. Query service INFER command.
    2. Analytics Service has array_infer_schema() function.
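The idea behind schema inference can be sketched in a few lines: walk a set of documents and record, per field, which JSON types occur and how often. This is a toy illustration only, not how INFER or array_infer_schema() is implemented; the function names and the sample route documents are assumptions.

```javascript
// Classify a JSON value the way a schema summary would.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "string", "number", "boolean", "object"
}

// Count, per top-level field, how many documents carry each type.
function inferSchema(docs) {
  const schema = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      const type = jsonType(value);
      schema[field] = schema[field] || {};
      schema[field][type] = (schema[field][type] || 0) + 1;
    }
  }
  return schema;
}

const routes = [
  { airline: "DL", distance: 335.3, equipment: "763" },
  { airline: "US", distance: 775.5, equipment: null },
];
console.log(inferSchema(routes));
```

A field that shows up with more than one type, like `equipment` here, is exactly the kind of finding that a flexible-schema database needs a tool to surface.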

Query Editor screenshot.


Here's the INFER output example:

JavaScript

INFER `travel-sample`;

{
    "requestID": "59c444b1-a468-486b-aac3-949be1ddaed1",
    "clientContextID": "634e367b-ac7c-4815-90da-1506d6902d78",
    "signature": null,
    "results": [
        [
            {
                "#docs": 816,
                "$schema": "http://json-schema.org/draft-06/schema",
                "Flavor": "`stops` = 0, `type` = \"route\"",
                "properties": {
                    "airline": {
                        "#docs": 816,
                        "%docs": 100,
                        "samples": [
                            "9K",
                            "DL",
                            "KL",
                            "US",
                            "WN"
                        ],
                        "type": "string"
                    },
                    "airlineid": {
                        "#docs": 816,
                        "%docs": 100,
                        "samples": [
                            "airline_1629",
                            "airline_2009",
                            "airline_3090",
                            "airline_4547",
                            "airline_5265"
                        ],
                        "type": "string"
                    },
                    "destinationairport": {
                        "#docs": 816,
                        "%docs": 100,
                        "samples": [
                            "ACK",
                            "ATL",
                            "BWI",
                            "CMH",
                            "MAN"
                        ],
                        "type": "string"
                    },
                    "distance": {
                        "#docs": 816,
                        "%docs": 100,
                        "samples": [
                            49.792009674515775,
                            335.34343397923425,
                            775.5437991859698,
                            2524.506189235734,
                            6139.9648921034795
                        ],
                        "type": "number"
                    },
                    "equipment": {
                        "#docs": [
                            1,
                            815
                        ],
                        "%docs": [
                            0.12,
                            99.87
                        ],
                        "samples": [
                            [
                                null
                            ],
                            [
                                "73W 738",
                                "763",
                                "CNA",
                                "CRJ",
                                "ERJ CRJ"
                            ]
                        ],
                        "type": [
                            "null",
                            "string"
                        ]
                    },
...
See the rest of this at:
https://blog.couchbase.com/introduction-to-couchbase-for-mongodb-developers-and-experts/



EXPLAIN

Explain tells you the query plan for each query: the indexes chosen, the predicates and other pushdowns, join types, join order, etc. Both MongoDB and Couchbase produce the explain output in JSON form, a natural thing for JSON databases.

JSON

MongoDB Enterprise > db.CUSTOMER.find({zip:94040}).explain()
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "test.CUSTOMER",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "zip" : {
        "$eq" : 94040
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "zip" : 1
        },
        "indexName" : "zip_1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
          "zip" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "zip" : [
            "[94040.0, 94040.0]"
          ]
        }
      }
    },
    "rejectedPlans" : [ ]
  },
  "serverInfo" : {
    "host" : "MacBook-Pro-4.attlocal.net",
    "port" : 27017,
    "version" : "4.0.0",
    "gitVersion" : "3b07af3d4f471ae89e8186d33bbb1d5259597d51"
  },
  "ok" : 1
}
MongoDB Enterprise >



Couchbase EXPLAIN:

JSON

EXPLAIN SELECT * FROM CUSTOMER WHERE zip = 94040;
[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexScan3",
          "index": "ix_customer",
          "index_id": "b312ed00505a074d",
          "index_projection": {
            "primary_key": true
          },
          "keyspace": "CUSTOMER",
          "namespace": "default",
          "spans": [
            {
              "exact": true,
              "range": [
                {
                  "high": "94040",
                  "inclusion": 3,
                  "low": "94040"
                }
              ]
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "Fetch",
          "keyspace": "CUSTOMER",
          "namespace": "default"
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "Filter",
                "condition": "((`CUSTOMER`.`zip`) = 94040)"
              },
              {
                "#operator": "InitialProject",
                "result_terms": [
                  {
                    "expr": "self",
                    "star": true
                  }
                ]
              }
            ]
          }
        }
      ]
    },
    "text": "SELECT * FROM CUSTOMER WHERE zip = 94040;"
  }
]



The query workbench also has a visual explain along with profiling (for a different query).

Query Results infographic.

GROUP BY

MongoDB’s “GROUP BY” clause is part of the aggregate() API. Here’s the comparison.

Unlike SQL and N1QL, the MongoDB query API has a lot of implicit meaning without formal definitions. With N1QL, you're aware of the groupings (b and c) and aggregations (SUM(a)) explicitly.

SQL

/* MongoDB: grouping and aggregation are combined in the $group stage. */
db.t.aggregate([
  { $group: { _id: { b: "$b", c: "$c" }, total: { $sum: "$a" } } }
])

/* Couchbase: N1QL */
SELECT b, c, SUM(a)
FROM t
GROUP BY b, c
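For readers who want to see what both statements compute, the following sketch performs the same grouping and summation over an in-memory array. It is plain JavaScript with a hypothetical `groupAndSum` helper and made-up rows, meant only to spell out the semantics of grouping by b and c and summing a.

```javascript
// Equivalent in spirit to: SELECT b, c, SUM(a) FROM t GROUP BY b, c
function groupAndSum(rows) {
  const groups = new Map();
  for (const { a, b, c } of rows) {
    const groupKey = JSON.stringify([b, c]); // composite grouping key
    const group = groups.get(groupKey) || { b, c, sum: 0 };
    group.sum += a;
    groups.set(groupKey, group);
  }
  return [...groups.values()];
}

const t = [
  { a: 1, b: "x", c: 10 },
  { a: 2, b: "x", c: 10 },
  { a: 5, b: "y", c: 10 },
];
console.log(groupAndSum(t));
// [ { b: 'x', c: 10, sum: 3 }, { b: 'y', c: 10, sum: 5 } ]
```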



ORDER BY

SQL

/* MongoDB: $sort stage of aggregate() */
{ $sort : { age : -1, posts: 1 } }

/* Couchbase: N1QL */
ORDER BY age DESC, posts ASC



OFFSET and LIMIT

These are commonly used for the offset pagination method, which both MongoDB and Couchbase support. However, keyset pagination is a superior approach that uses fewer resources and performs better. MongoDB uses the $skip and $limit clauses and N1QL uses OFFSET and LIMIT. I'm unsure about the pagination optimizations done in MongoDB.
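The difference between the two pagination methods is easiest to see side by side. The sketch below works on an in-memory array sorted by a unique `id`; the helper names are hypothetical, and the linear `filter` is a stand-in for the index seek a database would perform.

```javascript
// Rows must be sorted by a unique key (here: id) for keyset pagination to work.
const rows = Array.from({ length: 10 }, (_, i) => ({ id: i + 1 }));

// Offset pagination: the first `offset` rows are produced and thrown away on
// every request, so deep pages cost more.
function pageByOffset(sortedRows, offset, limit) {
  return sortedRows.slice(offset, offset + limit);
}

// Keyset pagination: continue after the last key seen on the previous page.
// A database would seek into an index here instead of filtering linearly.
function pageByKeyset(sortedRows, lastSeenId, limit) {
  return sortedRows.filter((row) => row.id > lastSeenId).slice(0, limit);
}

const offsetPage = pageByOffset(rows, 3, 3);
const keysetPage = pageByKeyset(rows, 3, 3);
console.log(offsetPage.map((r) => r.id)); // [ 4, 5, 6 ]
console.log(keysetPage.map((r) => r.id)); // [ 4, 5, 6 ]
```

Both calls return the same page, but the keyset version only needs the last key from the previous page, so its cost does not grow with the page number when an index on the sort key is available.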

JOINs

Joins are generally discouraged in NoSQL databases and in MongoDB in particular. But the real world is complex and cannot always be denormalized into a single collection. MongoDB has the $lookup operator for joins and does a nested loop from one collection (potentially sharded) to another collection (which cannot be sharded). In Couchbase, all the buckets are partitioned (sharded), and N1QL supports JOIN operations on them (INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, joins with subqueries, NEST and UNNEST). We have a detailed article showing the equivalent operations between MongoDB and Couchbase. I recommend you read the article Joining JSON: Comparing Couchbase and MongoDB.

| JOIN Type | MongoDB | Couchbase |
|---|---|---|
| INNER JOIN | No. $lookup is a limited left outer join on unsharded collections only. Applications have to do that and then remove the documents without the matching documents. | ON clause requires document key reference. Equi-join only. |
| LEFT OUTER JOIN | Limited $lookup. Cannot join on arrays. Need to flatten arrays manually before the join. | Full left outer join including array predicates in the ON clause. |
| RIGHT OUTER JOIN | Unsupported. Must be handled in the application. | Limited RIGHT OUTER JOIN support; worked around using other JOINs. |
| FULL OUTER JOIN | Unsupported. Must be handled in the application. | Worked around using other JOINs. |
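The INNER JOIN row above says an application has to run the left outer join and then remove the documents without matches. A minimal sketch of that two-step workaround follows, using in-memory arrays; the function names and sample documents are hypothetical, and the lookup only imitates the general shape of a $lookup result (an array of matches per left document).

```javascript
// Left outer join in the style of a lookup: every left document gets an array of matches.
function leftOuterLookup(left, right, leftField, rightField, as) {
  return left.map((doc) => ({
    ...doc,
    [as]: right.filter((r) => r[rightField] === doc[leftField]),
  }));
}

// Inner join = left outer join minus the documents without a match.
function innerFromLeftOuter(joined, as) {
  return joined.filter((doc) => doc[as].length > 0);
}

const orders = [
  { orderId: 1, custId: "c1" },
  { orderId: 2, custId: "c7" },
];
const customerDocs = [{ custId: "c1", name: "Joe" }];

const outer = leftOuterLookup(orders, customerDocs, "custId", "custId", "customer");
const inner = innerFromLeftOuter(outer, "customer");
console.log(outer.length); // 2
console.log(inner.map((o) => o.orderId)); // [ 1 ]
```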


GRANT and REVOKE

SQL

/* MongoDB */
db.grantPrivilegesToRole()
db.revokePrivilegesFromRole()

/* Couchbase: N1QL */
GRANT query_select ON orders, customers TO bill, linda;
REVOKE query_update ON `travel-sample` FROM debby



INDEXES

Below is an overview of the index capabilities of MongoDB and Couchbase. Both have a variety of indexes. Couchbase index types and usage are well documented in the article: Create the Right Index and Get the Right Performance. In addition, Couchbase has a built-in index advisor for individual statements as well as for the workload, plus the Index Advisor Service, which is updated monthly.

| Index Type | MongoDB | Couchbase |
|---|---|---|
| Primary Index | Table Scans, Primary Index | Primary Index |
| Secondary Index | Secondary Index | Secondary Index |
| Composite Index | Composite Index | Composite Index |
| Functional Index (Expression Index) | Unavailable | Functional Index, Expression Index |
| Partial Index | Unavailable | Partial Index |
| Range Partitioned Index | Range partitioned, Interval, List, Ref, Hash, Hybrid partitioned Index | Manual range partitioned using partial Index |
| ARRAY Index | 1. B-Tree based index with one array-key per index. 2. The one array key can be simple or composite (multi-key). | 1. B-tree based index with one array-key per index. 2. Array key can be composite. 3. Using SEARCH(): inverted tree based index with an unlimited number of array keys per index. |
| Array Index on Expressions | Unavailable | Yes |
| Objects | Yes | Yes |


Full-Text Search

MongoDB has built-in text search support and is now experimenting with integrating Lucene on its Atlas service via the $searchbeta feature. Couchbase has a built-in full-text search indexing service that you can run on your laptop and on the cluster. Again, we have a detailed article comparing the text search feature-by-feature, with examples. Couchbase 6.5 integrates FTS with N1QL, making querying even easier.

Optimizer

A query optimizer tries to rewrite the query for better performance, choose the most appropriate index, decide on index pushdowns, join order, and join type, and create a plan that the engine can execute. Each database has a specialized optimizer that understands the capabilities and quirks of the engine.

| Feature | MongoDB | Couchbase |
|---|---|---|
| Optimizer Type | Query shape based | Rule based; cost based (preview in 6.5) |
| Index selection | Query shape based | Rule based; cost based (preview in 6.5) |
| Query Rewrite | No | Yes, limited. |
| JOIN Order | As written, procedural using the aggregation framework | User specified (left to right) |
| Join Type | Nested Loop | Nested Loop; Hash Join |
| HINTS | Yes. $hint | Yes. USE INDEX, USE HASH |
| EXPLAIN | $explain | EXPLAIN |
| Visual Explain | Yes | Yes |
| Query Profiling | Yes | Yes |
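The Join Type row distinguishes nested loop joins from hash joins. As a rough sketch of the two algorithms (textbook versions over in-memory arrays, with hypothetical function names and sample data, not either engine's implementation), the code below produces the same result both ways. The nested loop compares every pair of rows, while the hash join builds a lookup table once and then probes it.

```javascript
// Nested loop join: compare every left row with every right row.
function nestedLoopJoin(left, right, leftKey, rightKey) {
  const out = [];
  for (const l of left) {
    for (const r of right) {
      if (l[leftKey] === r[rightKey]) out.push({ ...l, ...r });
    }
  }
  return out;
}

// Hash join: build a hash table on one side, then probe it with the other.
function hashJoin(left, right, leftKey, rightKey) {
  const table = new Map();
  for (const r of right) {
    const bucket = table.get(r[rightKey]) || [];
    bucket.push(r);
    table.set(r[rightKey], bucket);
  }
  const out = [];
  for (const l of left) {
    for (const r of table.get(l[leftKey]) || []) out.push({ ...l, ...r });
  }
  return out;
}

const orderRows = [
  { orderId: 1, custId: "c1" },
  { orderId: 2, custId: "c2" },
  { orderId: 3, custId: "c1" },
];
const customerRows = [
  { custId: "c1", name: "Joe" },
  { custId: "c2", name: "Ann" },
];

const viaLoop = nestedLoopJoin(orderRows, customerRows, "custId", "custId");
const viaHash = hashJoin(orderRows, customerRows, "custId", "custId");
console.log(JSON.stringify(viaLoop) === JSON.stringify(viaHash)); // true
```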


Transactions

NoSQL databases were invented to avoid SQL and transactions. Over time, each database is adding one, the other, or both!  MongoDB has added distributed multi-document transactions with snapshot isolation. Couchbase 6.5 has added distributed multi-document transactions with read committed isolation. Couchbase 7.0 provides distributed transactions for all of the data operations: N1QL statements, document updates. From "BEGIN WORK" to "COMMIT WORK" in Couchbase, functionality will look pretty familiar to SQL developers. The Java developers will love the easy-to-use lambda to program for transactions.

| Feature | MongoDB | Couchbase |
|---|---|---|
| Index updates | Indexes are synchronously maintained | Indexes are asynchronously maintained |
| Atomicity | Single document; multi-document (in 4.2) | Single document; multi-document (in 6.5); multi-statement, multi-operation, multi-document (7.0) |
| Consistency | Data and indexes are updated synchronously. By default, dirty read on data and indexes. | Data access is always consistent. Indexes have multiple consistency levels (UNBOUNDED, AT_PLUS, REQUEST_PLUS). |
| Isolation | Default: dirty read. Transaction: snapshot isolation. | Optimistic locking with CAS checking. Transactions: monotonic atomic isolation. |
| Durability | Durable with write majority option. | Durable with confirmation after replication. |
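The "optimistic locking with CAS checking" entry in the Isolation row can be illustrated with a tiny in-memory store. This is a conceptual sketch only: the `CasStore` class is hypothetical, and using a simple counter as the CAS value is an assumption made for readability rather than a description of how Couchbase generates CAS values.

```javascript
// Minimal in-memory store where every document carries a CAS (version) value.
class CasStore {
  constructor() {
    this.docs = new Map();
  }

  // Unconditional write; returns the new CAS value.
  upsert(key, value) {
    const current = this.docs.get(key);
    const cas = current ? current.cas + 1 : 1;
    this.docs.set(key, { value, cas });
    return cas;
  }

  // Returns { value, cas } or undefined if the key does not exist.
  get(key) {
    return this.docs.get(key);
  }

  // Write only if nobody changed the document since it was read.
  replace(key, value, expectedCas) {
    const current = this.docs.get(key);
    if (!current || current.cas !== expectedCas) return false;
    this.docs.set(key, { value, cas: current.cas + 1 });
    return true;
  }
}

const store = new CasStore();
store.upsert("customer::xyz124", { zip: 94040 });
const read = store.get("customer::xyz124");
store.upsert("customer::xyz124", { zip: 94587 }); // a concurrent writer gets in first
const stale = store.replace("customer::xyz124", { zip: 10001 }, read.cas);
console.log(stale); // false
```

When the replace is rejected, the usual pattern is to re-read the document, reapply the change, and retry with the fresh CAS value, so no lock is held while the application works.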


Analytics

Couchbase Analytics is designed to bring you insights on your JSON data without ETL: NoETL for NoSQL. The JSON data in the key-value datastore is copied over to the analytics service, which distributes the data into its storage. The Couchbase query and data services are designed to handle a large number of concurrent operations or queries to run the applications. The analytics service is designed to analyze a large number of documents to bring you insights into the business. In traditional terms, the Analytics service is designed for OLAP, and the rest are designed for OLTP.

MongoDB doesn't have an equivalent analytics service. You'd have to overload your existing cluster with both OLTP and OLAP workloads. As you'll learn, there's no free lunch. The large scans required for analytics workloads will affect the latencies of your OLTP queries. Then, you start allocating new nodes for your secondary and tertiary copies of the data on which you can run the read workload. What will or should happen on a failover? The secondary takes over but, again, affects your OLTP workload.

There's a second reason for a distinct service: query processing for analytics requires a different approach than OLTP queries. There is a great set of resources for you to learn about this service, including the book by Don Chamberlin, co-inventor of SQL.

  1. SQL++ for SQL USERS: A TUTORIAL:  https://resources.couchbase.com/analytics/sql-book
  2. Couchbase Analytics: Under the Hood – Connect Silicon Valley 2018: https://www.youtube.com/watch?v=1dN11TUj58c
  3. From SQL to NoSQL
  4. NoETL for NoSQL – Real-Time Analytics With Couchbase: https://www.youtube.com/watch?v=MIno71jTOUI
  5. N1QL: To Query or To Analyze?
  6. Part 2: N1QL: To Query or To Analyze?

Summary 

Databases are extraordinarily useful. They're nuanced and are also sticky. They're essential to civilization. Sumerians invented writing for transaction processing: to create a database out of clay tablets to keep track of taxes, land, and gold, and to find information. There will be databases forever. Each database is different, whether it is a SQL database or a NoSQL database. Not all SQL databases are the same. Not all NoSQL databases are the same. Understanding different databases enhances your organization's flexibility and effectiveness.

Resources 

  1. SQL++ for SQL USERS: A TUTORIAL:  https://resources.couchbase.com/analytics/sql-book
  2. N1QL Practical guides
  3. Couchbase 6.5 blogs: https://blog.couchbase.com/tag/6.5/
