NoSQL Databases

Comprehensive guide to understanding NoSQL databases, their types, operations, schema flexibility, and best practices.

NoSQL databases are designed to handle unstructured or semi-structured data at large scale. Unlike traditional relational (SQL) databases, they offer flexible data models, horizontal scalability, and a distributed architecture.

Why NoSQL?

  • Handle large volumes of diverse data (e.g., JSON documents, key-value pairs).
  • Scale horizontally by adding machines instead of upgrading a single server.
  • Flexible schema enables faster development iterations.
  • Suitable for real-time big data applications, IoT, content management, etc.

Types of NoSQL Databases

1. Document Stores

Store data as documents, typically in JSON or BSON format.
Example systems: MongoDB, CouchDB.

Example Document (MongoDB)

{
    "_id": "507f1f77bcf86cd799439011",
    "username": "bob",
    "email": "bob@example.com",
    "createdAt": "2025-09-17T15:30:00Z"
}

2. Key-Value Stores

Store data as a key and its corresponding value.
Example systems: Redis, DynamoDB.
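
As a rough illustration of the key-value model, the sketch below uses a plain JavaScript Map as a stand-in for a store such as Redis. The helper names (putUser, getUser) are hypothetical and not part of any real client library; the point is only that lookups go through the exact key and that the application serializes and parses the value itself.

```javascript
// In-memory stand-in for a key-value store (assumption: a real system
// such as Redis would sit behind a network client instead of a Map).
const store = new Map();

// Hypothetical helper: serialize the value, since the store treats it as opaque.
function putUser(id, user) {
    store.set(`user:${id}`, JSON.stringify(user));
}

// Hypothetical helper: look up by exact key and parse; returns null if missing.
function getUser(id) {
    const raw = store.get(`user:${id}`);
    return raw === undefined ? null : JSON.parse(raw);
}

putUser(1001, { username: 'alice', email: 'alice@example.com' });
console.log(getUser(1001).username);
console.log(getUser(9999));
```

Note that there is no way to ask this store for "all users named alice" without scanning every key, which is the usual trade-off of the model.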

Example Key-Value Pair

"user:1001" => '{"username": "alice", "email": "alice@example.com"}'

3. Wide-Column Stores

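Store data in tables whose rows are identified by a key and can each hold a different, dynamic set of columns, typically grouped into column families.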
Example systems: Cassandra, HBase.
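
To make "dynamic columns" concrete, here is a minimal in-memory sketch in which every row carries its own set of columns. It is only a model of the idea, not the Cassandra or HBase API; the function names (putCell, getRow) and the sample rows are made up for illustration.

```javascript
// Row key -> (column name -> value). Each row may hold a different set of columns.
const table = new Map();

// Hypothetical helper: write one cell, creating the row on first use.
function putCell(rowKey, column, value) {
    if (!table.has(rowKey)) {
        table.set(rowKey, new Map());
    }
    table.get(rowKey).set(column, value);
}

// Hypothetical helper: return a row as a plain object, or null if it does not exist.
function getRow(rowKey) {
    const row = table.get(rowKey);
    return row ? Object.fromEntries(row) : null;
}

putCell('user:1001', 'username', 'alice');
putCell('user:1001', 'email', 'alice@example.com');
putCell('user:1002', 'username', 'bob');
putCell('user:1002', 'last_login', '2025-09-17T15:30:00Z');

console.log(getRow('user:1001'));
console.log(getRow('user:1002'));
```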

4. Graph Databases

Store data as nodes and edges representing relationships.
Example systems: Neo4j, ArangoDB.
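
The node-and-edge model can be sketched with adjacency lists, as below. This is an assumption-laden toy rather than how Neo4j or ArangoDB work internally; the helper names (addNode, addEdge, traverse), the FOLLOWS relationship, and the sample users are all hypothetical.

```javascript
// Nodes keyed by id; edges stored as adjacency lists with a relationship type.
const nodes = new Map();
const edges = new Map();

function addNode(id, properties) {
    nodes.set(id, properties);
    edges.set(id, []);
}

function addEdge(from, type, to) {
    edges.get(from).push({ type, to });
}

// Follow edges of one type from a start node, up to maxDepth hops (breadth-first).
function traverse(start, type, maxDepth) {
    const seen = new Set([start]);
    let frontier = [start];
    for (let depth = 0; depth < maxDepth; depth++) {
        const next = [];
        for (const id of frontier) {
            for (const edge of edges.get(id)) {
                if (edge.type === type && !seen.has(edge.to)) {
                    seen.add(edge.to);
                    next.push(edge.to);
                }
            }
        }
        frontier = next;
    }
    seen.delete(start);
    return [...seen];
}

addNode('alice', { email: 'alice@example.com' });
addNode('bob', { email: 'bob@example.com' });
addNode('carol', { email: 'carol@example.com' });
addEdge('alice', 'FOLLOWS', 'bob');
addEdge('bob', 'FOLLOWS', 'carol');

console.log(traverse('alice', 'FOLLOWS', 1));
console.log(traverse('alice', 'FOLLOWS', 2));
```

Relationship-heavy questions such as "who is within two hops of alice?" are a walk over edges here, whereas a relational design would typically need repeated self-joins.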


Schema: Flexible, Not Free-for-All

A common misconception:

"NoSQL is schema-less."

Here’s the real story:

  • Most NoSQL databases don’t enforce a rigid schema at the database level by default.
  • That doesn’t mean there’s no expected structure.
  • In practice, your application assumes a consistent schema to function reliably.

Why Use a Schema in NoSQL?

  1. Data Consistency for the App:
    You don’t want some documents with username and others missing it.

  2. Query Reliability:
    Queries and indexes only work well if the data structure is predictable.

  3. Validation & Data Integrity:
    Application-level schemas (e.g., using Mongoose) help enforce field types, required fields, and constraints.

Example: Schema Definition with Mongoose (MongoDB)

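import mongoose from 'mongoose';

// Application-level schema: Mongoose validates documents before they are saved.
const userSchema = new mongoose.Schema({
    // `unique: true` asks Mongoose to build a unique index; it is not a validator.
    username: { type: String, required: true, unique: true },
    email: { type: String, required: true, unique: true },
    // Date.now is passed as a function so each document gets its own timestamp.
    createdAt: { type: Date, default: Date.now }
});

// Compile the schema into a model, backed by the `users` collection by default.
const User = mongoose.model('User', userSchema);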

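This keeps malformed documents from sneaking in at the application layer, even though MongoDB does not enforce structure by default. MongoDB can also validate on the server side if you attach a $jsonSchema validator to a collection.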
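
To show what an application-level schema is doing conceptually, here is a dependency-free sketch of similar checks for the fields in userSchema. It is not how Mongoose is implemented, and validateUser is a hypothetical name. Uniqueness is deliberately left out, because it cannot be checked without asking the database.

```javascript
// Hypothetical, dependency-free validator mirroring the fields of userSchema.
function validateUser(doc) {
    const errors = [];
    for (const field of ['username', 'email']) {
        if (typeof doc[field] !== 'string' || doc[field].length === 0) {
            errors.push(`${field} is required and must be a non-empty string`);
        }
    }
    if (errors.length > 0) {
        return { ok: false, errors };
    }
    // Fill in createdAt when absent, similar in spirit to `default: Date.now`.
    const value = { ...doc, createdAt: doc.createdAt ?? new Date() };
    return { ok: true, value };
}

console.log(validateUser({ username: 'alice', email: 'alice@example.com' }).ok);
console.log(validateUser({ username: 'bob' }).errors);
```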


Basic NoSQL Operations

Insert a Document (MongoDB)

db.users.insertOne({
    username: 'alice',
    email: 'alice@example.com',
    createdAt: new Date()
});

Query a Document

db.users.findOne({ username: 'alice' });

Update a Document

db.users.updateOne(
    { username: 'alice' },
    { $set: { email: 'alice123@example.com' } }
);

Delete a Document

db.users.deleteOne({ username: 'alice' });

CAP Theorem: NoSQL Realities

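In a distributed database, you cannot guarantee all three of the following at once. In practice network partitions will happen, so the real choice is between consistency and availability while a partition lasts:

  • C – Consistency: Every read returns the most recent write (or an error).
  • A – Availability: Every request gets a non-error response, even if the data is outdated.
  • P – Partition Tolerance: The system keeps working despite network splits between nodes.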

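Many NoSQL databases (e.g., Cassandra, CouchDB) favor Availability + Partition Tolerance (AP) and settle for eventual consistency. Others (e.g., MongoDB and HBase in their default configurations) lean toward Consistency + Partition Tolerance (CP), and several let you tune the trade-off per operation.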
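
The trade-off can be illustrated with a toy model of two replicas and a simulated network split. Everything here is an assumption made for the sketch: the Cluster class, its method names, and the 'AP'/'CP' mode flag are hypothetical, and real replication protocols are far more involved.

```javascript
// Two in-memory replicas; `partitioned` simulates a network split between them.
class Cluster {
    constructor(mode) {
        this.mode = mode;              // 'AP' or 'CP' (labels for this sketch only)
        this.partitioned = false;
        this.primary = new Map();
        this.secondary = new Map();
    }

    // Writes go to the primary and replicate only while the link is up.
    write(key, value) {
        this.primary.set(key, value);
        if (!this.partitioned) {
            this.secondary.set(key, value);
        }
    }

    // Reads in this sketch are served by the secondary replica.
    readFromSecondary(key) {
        if (this.partitioned && this.mode === 'CP') {
            // Consistency first: refuse to answer rather than risk stale data.
            throw new Error('unavailable during partition');
        }
        // Availability first: answer with whatever this replica has.
        return this.secondary.get(key);
    }

    // Healing the partition copies missed writes over (eventual consistency).
    heal() {
        this.partitioned = false;
        for (const [key, value] of this.primary) {
            this.secondary.set(key, value);
        }
    }
}

const ap = new Cluster('AP');
ap.write('email', 'alice@example.com');
ap.partitioned = true;
ap.write('email', 'alice123@example.com');
console.log(ap.readFromSecondary('email')); // stale value while partitioned
ap.heal();
console.log(ap.readFromSecondary('email')); // replicas have converged
```

Constructing the cluster with 'CP' instead makes the same read throw during the partition, which is the other side of the choice.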


When to Use NoSQL?

✅ Large volume of unstructured data (e.g., logs, social feeds).
✅ Rapid development without a predefined schema.
✅ Massive horizontal scalability required.
✅ High throughput with relaxed consistency (eventual consistency).

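🚫 Avoid NoSQL for apps that rely on complex joins or strong transactional guarantees across many records; a relational database is usually the better fit, even though some NoSQL systems (such as MongoDB) now support multi-document transactions.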


Best Practices

  • Define expected schema at the application layer (use Mongoose, Joi, etc.).
  • Index frequently queried fields for performance.
  • Avoid storing huge binary files in the database; use object storage instead.
  • Sanitize inputs to prevent injection attacks.
  • Use TTL (time-to-live) features for ephemeral data (e.g., key expiry in Redis, TTL indexes in MongoDB).
  • Back up regularly; a flexible schema does not remove the need for backups.
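
As one example of the input-sanitization practice above: MongoDB query operators are keys that start with $, so a parsed request body such as { "username": { "$ne": null } } passed straight into a filter could match every user. The sketch below shows one simple defense, which is to accept only plain strings for such fields. The function names (requireString, buildLoginFilter) are hypothetical, and a schema validation library may be the better choice in a real application.

```javascript
// Hypothetical guard: accept only plain strings for fields used in query filters.
function requireString(value, fieldName) {
    if (typeof value !== 'string') {
        throw new TypeError(`${fieldName} must be a string`);
    }
    return value;
}

// Build a filter from untrusted input (e.g., a parsed JSON request body).
function buildLoginFilter(body) {
    return { username: requireString(body.username, 'username') };
}

console.log(buildLoginFilter({ username: 'alice' }));

try {
    // An operator object instead of a string is rejected before reaching the database.
    buildLoginFilter({ username: { $ne: null } });
} catch (err) {
    console.log(err.message);
}
```

The resulting filter could then be passed to a call such as db.users.findOne(...), knowing the value is a literal string rather than an operator expression.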