nosql-expert

Name: nosql-expert
Author: sickn33

"Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems."

Tags: NoSQL Databases, Cassandra, MongoDB, DynamoDB, Distributed Databases, GitHub

Installation

npx skills add sickn33/antigravity-awesome-skills --skill nosql-expert

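Before / After Comparison

Before

Applying normalized relational modeling to a NoSQL database leads to inefficient queries and a schema that is hard to scale, and it fails to use what NoSQL is good at.

After

Applying NoSQL expert patterns such as query-first modeling and single-table design optimizes the data structure, avoids hot partitions, and delivers high concurrency and scalability, making full use of the database's performance.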

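Query performance

Low (complex joins) → High (pre-aggregated)

Scalability

Poor (vertical scaling) → Excellent (horizontal scaling)

Data redundancy

Low (normalized) → High (optimized for queries)

Development speed

Slow (complex queries) → Fast (simple queries)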

SKILL.md

name: nosql-expert
description: "Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems."
risk: unknown
source: community
date_added: "2026-02-27"

NoSQL Expert Patterns (Cassandra & DynamoDB)

Overview

This skill provides professional mental models and design patterns for distributed wide-column and key-value stores (specifically Apache Cassandra and Amazon DynamoDB).

Unlike SQL, where you model data entities first, or document stores such as MongoDB, these distributed systems require you to model your queries first.

When to Use

Designing for Scale: Moving beyond simple single-node databases to distributed clusters.
Technology Selection: Evaluating or using Cassandra, ScyllaDB, or DynamoDB.
Performance Tuning: Troubleshooting "hot partitions" or high latency in existing NoSQL systems.
Microservices: Implementing "database-per-service" patterns where highly optimized reads are required.

The Mental Shift: SQL vs. Distributed NoSQL

Feature	SQL (Relational)	Distributed NoSQL (Cassandra/DynamoDB)
Data modeling	Model Entities + Relationships	Model Queries (Access Patterns)
Joins	CPU-intensive, at read time	Pre-computed (Denormalized) at write time
Storage cost	Expensive (minimize duplication)	Cheap (duplicate data for read speed)
Consistency	ACID (Strong)	BASE (Eventual) / Tunable
Scalability	Vertical (Bigger machine)	Horizontal (More nodes/shards)

The Golden Rule: In SQL, you design the data model to answer any query. In NoSQL, you design the data model to answer specific queries efficiently.

Core Design Patterns

1. Query-First Modeling (Access Patterns)

You typically cannot "add a query later" without migration or creating a new table/index.

Process:

List all Entities (User, Order, Product).
List all Access Patterns ("Get User by Email", "Get Orders by User sorted by Date").
Design Table(s) specifically to serve those patterns with a single lookup.
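
The process above can be sketched as a small coverage check. This is a minimal illustration, not a prescribed tool: the `TableDesign` type, the `orders_by_user` table, and the column names are hypothetical, chosen only to mirror the two example access patterns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableDesign:
    """Hypothetical description of one table built for one access pattern."""
    name: str
    partition_key: str
    sort_key: Optional[str] = None


# Step 2: access patterns are written down before any table exists.
# Step 3: each pattern is mapped to one table that answers it with a single lookup.
ACCESS_PATTERNS = {
    "Get User by Email": TableDesign("users_by_email", partition_key="email"),
    "Get Orders by User sorted by Date": TableDesign(
        "orders_by_user", partition_key="user_id", sort_key="order_date"
    ),
}


def uncovered(patterns):
    """Return the access patterns that no table has been designed for yet."""
    return [p for p in patterns if p not in ACCESS_PATTERNS]
```

Calling `uncovered([...])` with the full list of required queries flags any pattern that would otherwise surface later as a migration or a new index.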

2. The Partition Key is King

Data is distributed across physical nodes based on the Partition Key (PK).

Goal: Even distribution of data and traffic.
Anti-Pattern: Using a low-cardinality PK (e.g., status="active" or gender="m") creates Hot Partitions, limiting throughput to a single node's capacity.
Best Practice: Use high-cardinality keys (User IDs, Device IDs, Composite Keys).
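
To make the hot-partition effect concrete, the following sketch hashes partition keys onto a hypothetical 8-node cluster and counts where requests land. The modulo hashing is a simplified stand-in for a real token ring or partition map, and the key values are illustrative assumptions.

```python
import hashlib
from collections import Counter

NUM_NODES = 8  # hypothetical cluster size


def node_for(partition_key: str) -> int:
    """Map a partition key to a node by hashing (simplified stand-in for a token ring)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


def load_per_node(keys):
    """Count how many requests land on each node."""
    return Counter(node_for(k) for k in keys)


# Low cardinality: 10,000 requests, but only two distinct key values.
low = load_per_node("active" if i % 10 else "inactive" for i in range(10_000))

# High cardinality: 10,000 requests, one distinct key per user.
high = load_per_node(f"user-{i}" for i in range(10_000))
```

With the low-cardinality key, at most two nodes receive any traffic no matter how many nodes the cluster has; with per-user keys the load spreads across all of them.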

3. Clustering / Sort Keys

Within a partition, data is sorted on disk by the Clustering Key (Cassandra) or Sort Key (DynamoDB).

This allows for efficient Range Queries (e.g., WHERE user_id=X AND date > Y).
It effectively pre-sorts your data for specific retrieval requirements.
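
A toy in-memory model can show why such a range query is cheap: rows inside a partition are kept in sort-key order, so a range read is a binary search followed by a contiguous slice. The `SortedPartitionStore` class is a hypothetical illustration, not the API of either database.

```python
import bisect
from collections import defaultdict


class SortedPartitionStore:
    """Toy model: each partition keeps its rows ordered by sort key."""

    def __init__(self):
        # pk -> (sorted sort keys, rows in the same order)
        self._partitions = defaultdict(lambda: ([], []))

    def put(self, pk, sk, row):
        keys, rows = self._partitions[pk]
        i = bisect.bisect_left(keys, sk)
        if i < len(keys) and keys[i] == sk:
            rows[i] = row  # same primary key: overwrite (upsert)
        else:
            keys.insert(i, sk)
            rows.insert(i, row)

    def query_after(self, pk, sk_exclusive):
        """Rows in one partition with sort key > sk_exclusive, in key order."""
        keys, rows = self._partitions.get(pk, ([], []))
        start = bisect.bisect_right(keys, sk_exclusive)
        return rows[start:]
```

Using ISO-8601 date strings as sort keys works here because they sort lexicographically in chronological order.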

4. Single-Table Design (Adjacency Lists)

Primary use: DynamoDB (but concepts apply elsewhere)

Store multiple entity types in one table to enable pre-joined reads.

PK (Partition)	SK (Sort)	Data Fields...
`USER#123`	`PROFILE`	`{ name: "Ian", email: "..." }`
`USER#123`	`ORDER#998`	`{ total: 50.00, status: "shipped" }`
`USER#123`	`ORDER#999`	`{ total: 12.00, status: "pending" }`

Query: PK="USER#123"
Result: Fetches the User Profile AND all Orders in one Query request (DynamoDB returns at most 1 MB per page, so very large partitions need pagination).
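
The table above can be mimicked in plain Python to show how one partition read returns several entity types at once. This is an in-memory sketch under stated assumptions: the `query` and `split_by_type` helpers and the `USER#456` item are hypothetical, and the list scan stands in for what a real store does by jumping directly to the partition.

```python
# Items from the table above, plus one hypothetical item in another partition.
ITEMS = [
    {"PK": "USER#123", "SK": "PROFILE", "name": "Ian"},
    {"PK": "USER#123", "SK": "ORDER#998", "total": 50.00, "status": "shipped"},
    {"PK": "USER#123", "SK": "ORDER#999", "total": 12.00, "status": "pending"},
    {"PK": "USER#456", "SK": "PROFILE", "name": "Ana"},
]


def query(items, pk, sk_prefix=""):
    """Return the items of one partition in SK order, optionally filtered by SK prefix."""
    matches = [it for it in items if it["PK"] == pk and it["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda it: it["SK"])


def split_by_type(items):
    """Separate a mixed result set into the profile item and the order items."""
    profile = next((it for it in items if it["SK"] == "PROFILE"), None)
    orders = [it for it in items if it["SK"].startswith("ORDER#")]
    return profile, orders
```

`query(ITEMS, "USER#123")` returns the profile and both orders together, while `query(ITEMS, "USER#123", "ORDER#")` narrows the same partition to orders only, which is the role a sort-key prefix condition plays.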

5. Denormalization & Duplication

Don't be afraid to store the same data in multiple tables to serve different query patterns.

Table A: users_by_id (PK: uuid)
Table B: users_by_email (PK: email)

Trade-off: You must keep the copies consistent across tables yourself, often by accepting eventual consistency or by using batch writes.
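
The cost of duplication shows up in the write path. The sketch below models the two tables as dictionaries; `create_user` and `change_email` are hypothetical helpers, and a real system would also have to handle a failure between the two writes (for example with a batch or a retry), which this sketch does not.

```python
import uuid

users_by_id = {}     # Table A, PK: uuid
users_by_email = {}  # Table B, PK: email


def create_user(name, email):
    """Write the same record to both tables so each lookup is a single key read."""
    user_id = str(uuid.uuid4())
    record = {"id": user_id, "name": name, "email": email}
    users_by_id[user_id] = dict(record)
    users_by_email[email] = dict(record)
    return user_id


def change_email(user_id, new_email):
    """Changing the duplicated field means touching every table that stores it."""
    record = users_by_id[user_id]
    old_email = record["email"]
    record["email"] = new_email
    users_by_email[new_email] = dict(record)  # insert under the new key first
    del users_by_email[old_email]             # then remove the stale copy
```

Reads stay trivial (one key lookup per table), while every update to a duplicated field fans out to all tables that carry it.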

Specific Guidance

Apache Cassandra / ScyllaDB

Primary Key Structure: ((Partition Key), Clustering Columns)
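No Joins, Limited Aggregates: CQL has no JOIN, and GROUP BY works only on primary key columns, so aggregates are efficient only within a single partition. Pre-calculate cross-partition aggregates at write time, for example in a separate counter table.
Avoid ALLOW FILTERING: If you see this in production, your data model is probably wrong. Unless the query is restricted to a single partition, it implies a scan across the whole cluster.
Writes are Cheap: Inserts and updates are appends to the commit log and an in-memory memtable, later flushed to immutable SSTables (an LSM tree), and regular writes do not read existing data first. Worry less about write volume than about read efficiency.
Tombstones: A delete writes a tombstone marker rather than removing data, and reads must skip tombstones until compaction purges them after gc_grace_seconds. Avoid high-velocity delete patterns (like queues) in standard tables.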
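
As a rough illustration of pre-calculating an aggregate instead of computing it at read time, the sketch below updates a per-user count on every write. The dictionaries stand in for a detail table and a counter table; the names `orders_by_user`, `order_count_by_user`, and `record_order` are assumptions made for this example.

```python
from collections import defaultdict

orders_by_user = defaultdict(list)      # detail rows, one partition per user
order_count_by_user = defaultdict(int)  # pre-computed aggregate ("counter table")


def record_order(user_id, order_id, total):
    """Write the detail row and bump the aggregate in the same write path."""
    orders_by_user[user_id].append({"order_id": order_id, "total": total})
    order_count_by_user[user_id] += 1


def order_count(user_id):
    """Answer 'how many orders?' with one key lookup instead of a GROUP BY."""
    return order_count_by_user.get(user_id, 0)
```

The read becomes a single lookup; the price is one extra write per order and the need to keep the two structures in step.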

Amazon DynamoDB

GSI (Global Secondary Index): Use GSIs to create alternative views of your data (e.g., "Search Orders by Date" instead of by User).
- Note: GSIs are eventually consistent.
LSI (Local Secondary Index): Sorts data differently within the same partition. Must be created at table creation time.
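WCU / RCU: Understand the capacity modes (provisioned vs. on-demand). Single-table design helps optimize consumed capacity units.
TTL: Use a Time-To-Live attribute (a Number holding a Unix epoch timestamp in seconds) to expire old data automatically. TTL deletes consume no write capacity, but they run in the background, so expired items can still appear in reads until they are removed.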
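
A minimal sketch of working with a TTL attribute follows. DynamoDB expects the TTL attribute to be a Number holding Unix epoch seconds; the attribute name `expires_at` and both helper functions are hypothetical choices for this example, and no AWS call is made.

```python
import time

TTL_ATTRIBUTE = "expires_at"  # hypothetical attribute name configured as the table's TTL


def with_ttl(item, ttl_seconds, now=None):
    """Return a copy of the item with an expiry timestamp in Unix epoch seconds."""
    now = int(time.time()) if now is None else int(now)
    return {**item, TTL_ATTRIBUTE: now + ttl_seconds}


def is_live(item, now=None):
    """Treat items past their expiry as gone, even if they have not been deleted yet."""
    now = int(time.time()) if now is None else int(now)
    expires_at = item.get(TTL_ATTRIBUTE)
    return expires_at is None or expires_at > now
```

Filtering with something like `is_live` on the read path covers the window in which an item has expired but has not yet been removed by the background process.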

Expert Checklist

Before finalizing your NoSQL schema:

Access Pattern Coverage: Does every query pattern map to a specific table or index?
Cardinality Check: Does the Partition Key have enough unique values to spread traffic evenly?
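Split Partition Risk: For any single partition (e.g., a single user's orders), will it grow indefinitely? If it can exceed the store's practical limit (DynamoDB caps an item collection at 10 GB when the table has an LSI; Cassandra partitions are commonly kept far smaller, around 100 MB as a rule of thumb), "shard" the partition with a bucket suffix, e.g., USER#123#2024-01.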
Consistency Requirement: Can the application tolerate eventual consistency for this read pattern?
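
The partition-sharding idea from the checklist (keys such as USER#123#2024-01) can be sketched as two helpers: one builds the bucketed key for a write, the other lists the buckets a date-range read has to visit. Monthly buckets and the function names are assumptions for illustration; the right bucket size depends on how fast a partition grows.

```python
from datetime import date


def bucketed_pk(user_id, day):
    """Partition key for a write: the user plus the month the row belongs to."""
    return f"USER#{user_id}#{day:%Y-%m}"


def buckets_between(user_id, start, end):
    """All monthly partition keys a read must query to cover [start, end]."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(f"USER#{user_id}#{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys
```

Each bucket stays bounded in size, at the cost of a read over a long date range turning into one query per bucket.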

Common Anti-Patterns

❌ Scatter-Gather: Querying all partitions to find one item (Scan).
❌ Hot Keys: Putting all "Monday" data into one partition.
❌ Relational Modeling: Creating Author and Book tables and trying to join them in code. (Instead, embed Book summaries in Author, or duplicate Author info in Books).
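
One common way to relieve a hot key such as the "Monday" partition is write sharding: append a small suffix so writes spread over several partitions. The sketch below is one possible approach, not the only fix; the shard count, the `DAY#Monday` key format, and both helpers are hypothetical.

```python
import random

NUM_SHARDS = 4  # hypothetical; sized to the write rate of the hot key


def sharded_write_key(hot_key, rng=random):
    """Pick one of NUM_SHARDS partitions at random for each write."""
    return f"{hot_key}#{rng.randrange(NUM_SHARDS)}"


def all_shard_keys(hot_key):
    """Keys a reader must query, then merge, to see all data for the hot key."""
    return [f"{hot_key}#{n}" for n in range(NUM_SHARDS)]
```

The trade-off is deliberate: reads become a small, bounded fan-out over `NUM_SHARDS` known partitions, which is still very different from scanning every partition in the table.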

Version: 1.0.0

Updated: 2026-03-16
