---
id: sm-senior-data-scientist
name: "senior-data-scientist"
url: https://skills.yangsir.net/skill/sm-senior-data-scientist
author: davila7
domain: data-ai
tags: ["data-science", "machine-learning", "statistical-modeling", "python", "predictive-analytics"]
install_count: 2700
rating: 4.40 (47 reviews)
github: https://github.com/davila7/claude-code-templates
---

# senior-data-scientist

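> This skill acts as a world-class senior data scientist focused on production-grade AI/ML/data systems, providing core tools and experiment design capabilities.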

**Stats**: 2,700 installs · 4.4/5 (47 reviews)

## Before / After Comparison

### Data science support for production-grade AI/ML systems

## Readme

# Senior Data Scientist

World-class senior data scientist skill for production-grade AI/ML/Data systems.

## Quick Start

### Main Capabilities

```
# Core Tool 1: experiment designer
python scripts/experiment_designer.py --input data/ --output results/

# Core Tool 2: feature engineering pipeline
python scripts/feature_engineering_pipeline.py --target project/ --analyze

# Core Tool 3: model evaluation suite
python scripts/model_evaluation_suite.py --config config.yaml --deploy
```

## Core Expertise

This skill covers world-class capabilities in:

- Advanced production patterns and architectures

- Scalable system design and implementation

- Performance optimization at scale

- MLOps and DataOps best practices

- Real-time processing and inference

- Distributed computing frameworks

- Model deployment and monitoring

- Security and compliance

- Cost optimization

- Team leadership and mentoring

## Tech Stack

- **Languages:** Python, SQL, R, Scala, Go

- **ML Frameworks:** PyTorch, TensorFlow, Scikit-learn, XGBoost

- **Data Tools:** Spark, Airflow, dbt, Kafka, Databricks

- **LLM Frameworks:** LangChain, LlamaIndex, DSPy

- **Deployment:** Docker, Kubernetes, AWS/GCP/Azure

- **Monitoring:** MLflow, Weights & Biases, Prometheus

- **Databases:** PostgreSQL, BigQuery, Snowflake, Pinecone

## Reference Documentation

### 1. Statistical Methods Advanced

Comprehensive guide available in `references/statistical_methods_advanced.md` covering:

- Advanced patterns and best practices

- Production implementation strategies

- Performance optimization techniques

- Scalability considerations

- Security and compliance

- Real-world case studies

### 2. Experiment Design Frameworks

Complete workflow documentation in `references/experiment_design_frameworks.md` including:

- Step-by-step processes

- Architecture design patterns

- Tool integration guides

- Performance tuning strategies

- Troubleshooting procedures
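
The reference file itself is not reproduced here. As one illustration of the kind of calculation an experiment design workflow usually starts with, the sketch below estimates the per-arm sample size for an A/B test on a conversion rate. It assumes a two-sided two-proportion z-test with equal allocation and the normal approximation; the function name `sample_size_per_arm` and its defaults are hypothetical, not part of this skill's scripts.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control: float, p_treatment: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-arm sample size for a two-sided two-proportion z-test.

    Assumption: equal allocation between arms and the normal approximation.
    """
    if p_control == p_treatment:
        raise ValueError("The two rates must differ to define an effect size.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)
```

Halving the detectable effect roughly quadruples the required sample, which is usually the first trade-off to discuss when planning an experiment.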

### 3. Feature Engineering Patterns

Technical reference guide in `references/feature_engineering_patterns.md` with:

- System design principles

- Implementation examples

- Configuration best practices

- Deployment strategies

- Monitoring and observability

## Production Patterns

### Pattern 1: Scalable Data Processing

Enterprise-scale data processing with distributed computing:

- Horizontal scaling architecture

- Fault-tolerant design

- Real-time and batch processing

- Data quality validation

- Performance monitoring
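
The data quality validation step above can be sketched as a set of named row-level rules with a per-rule failure count. This is a minimal single-process illustration, not the skill's pipeline: the `RULES` dictionary, the field names `user_id` and `amount`, and the `validate` function are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
Rule = Callable[[Row], bool]

# Hypothetical rules for an illustrative dataset; real rules depend on the schema.
RULES: Dict[str, Rule] = {
    "user_id_present": lambda r: bool(r.get("user_id")),
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}


def validate(rows: Iterable[Row], rules: Dict[str, Rule]) -> Tuple[List[Row], Dict[str, int]]:
    """Return the rows that pass every rule, plus a failure count per rule."""
    valid: List[Row] = []
    failures = {name: 0 for name in rules}
    for row in rows:
        failed = [name for name, rule in rules.items() if not rule(row)]
        for name in failed:
            failures[name] += 1
        if not failed:
            valid.append(row)
    return valid, failures
```

Keeping the failure counts per rule, rather than a single pass/fail flag, makes it possible to alert on a specific rule when its failure rate changes.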

### Pattern 2: ML Model Deployment

Production ML system with high availability:

- Model serving with low latency

- A/B testing infrastructure

- Feature store integration

- Model monitoring and drift detection

- Automated retraining pipelines
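
The source does not say which drift metric it has in mind. One common choice for a single numeric feature is the Population Stability Index (PSI), sketched below with equal-width bins taken from the reference sample. The function name, bin count, and `eps` floor are assumptions; the 0.1 and 0.25 thresholds mentioned in the comment are a widely used rule of thumb, not a value from this skill.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a live sample.

    Assumption: equal-width bins over the range of the reference sample;
    live values outside that range are clamped into the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Rule of thumb (convention, not from the source):
# PSI < 0.1 is usually read as stable, PSI > 0.25 as significant drift.
```

A monitoring job could compute this per feature on a schedule and trigger the retraining pipeline when the value stays above the chosen threshold.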

### Pattern 3: Real-Time Inference

High-throughput inference system:

- Batching and caching strategies

- Load balancing

- Auto-scaling

- Latency optimization

- Cost optimization
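
As a rough sketch of the batching and caching strategies listed above, the class below sends only unseen inputs to the model, in fixed-size batches, and serves repeats from an in-memory cache. `CachedBatchPredictor` and `toy_model` are hypothetical names, and the unbounded dictionary cache is an assumption made for brevity; a production system would bound it or use an external cache.

```python
from typing import Callable, Dict, Hashable, Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def toy_model(batch: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for a real batch inference call."""
    return [sum(features) for features in batch]


class CachedBatchPredictor:
    def __init__(self, model_fn: Callable[[Sequence], List], batch_size: int = 32):
        self.model_fn = model_fn
        self.batch_size = batch_size
        self.cache: Dict[Hashable, float] = {}  # assumption: unbounded, in-process
        self.model_calls = 0

    def predict(self, inputs: Sequence[Hashable]) -> List[float]:
        # Deduplicate while preserving order, then keep only cache misses.
        missing = [x for x in dict.fromkeys(inputs) if x not in self.cache]
        for batch in batched(missing, self.batch_size):
            self.model_calls += 1
            for x, y in zip(batch, self.model_fn(batch)):
                self.cache[x] = y
        return [self.cache[x] for x in inputs]
```

Inputs must be hashable (for example tuples of features) to serve as cache keys; the `model_calls` counter is there only to make the saving visible.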

## Best Practices

### Development

- Test-driven development

- Code reviews and pair programming

- Documentation as code

- Version control everything

- Continuous integration

### Production

- Monitor everything critical

- Automate deployments

- Feature flags for releases

- Canary deployments

- Comprehensive logging
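
Feature flags and canary deployments both need a stable way to expose a change to a fraction of traffic. One simple approach, sketched here under the assumption that users carry a stable string identifier, is to hash the flag name and user ID into a bucket. The function `in_rollout` and the flag name in the usage note are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, flag: str, percent: float) -> bool:
    """Deterministically decide whether a user falls inside a percentage rollout.

    Assumption: hashing flag and user ID together gives each flag an
    independent, stable assignment of users to buckets 0..9999.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100
```

For example, `in_rollout("user-42", "new-ranker", 5)` returns the same answer on every call, so a user does not flip between variants while the rollout percentage stays fixed.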

### Team Leadership

- Mentor junior engineers

- Drive technical decisions

- Establish coding standards

- Foster learning culture

- Cross-functional collaboration

## Performance Targets

**Latency:**

- P50: < 50ms

- P95: < 100ms

- P99: < 200ms

**Throughput:**

- Requests/second: > 1000

- Concurrent users: > 10,000

**Availability:**

- Uptime: 99.9%

- Error rate: < 0.1%
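
To make the targets above concrete, the sketch below checks a sample of request latencies against the P50/P95/P99 limits and converts an uptime percentage into a downtime budget. The nearest-rank percentile definition and the function names are assumptions; monitoring systems often use interpolated or histogram-based estimates instead.

```python
import math
from typing import Dict, Sequence

# Latency limits in milliseconds, taken from the targets listed above.
TARGETS_MS = {50: 50.0, 95: 100.0, 99: 200.0}


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile (assumption: no interpolation)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latency(samples_ms: Sequence[float]) -> Dict[str, bool]:
    """Map each percentile label to whether it meets its target."""
    return {f"P{p}": percentile(samples_ms, p) < limit
            for p, limit in TARGETS_MS.items()}


def allowed_downtime_minutes(uptime_pct: float, days: int = 30) -> float:
    """Downtime budget implied by an uptime target over a window of days."""
    return (1 - uptime_pct / 100) * days * 24 * 60
```

Under this arithmetic, a 99.9% uptime target leaves roughly 43 minutes of downtime in a 30-day window.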

## Security & Compliance

- Authentication & authorization

- Data encryption (at rest & in transit)

- PII handling and anonymization

- GDPR/CCPA compliance

- Regular security audits

- Vulnerability management
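
For the PII handling item above, a minimal sketch of two common techniques follows: keyed hashing to replace an identifier with a stable token, and masking for display. Note that keyed hashing is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping. The function names and the normalization step are assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a stable keyed hash (HMAC-SHA256).

    Assumption: values are normalized by trimming and lowercasing so that
    the same identifier always maps to the same token.
    """
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"
```

The secret key should live in a secrets manager rather than in code, and rotating it changes every token, which matters for any joins that depend on them.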

## Common Commands

```
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/

# Training
python scripts/train.py --config prod.yaml
python scripts/evaluate.py --model best.pth

# Deployment
docker build -t service:v1 .
kubectl apply -f k8s/
helm upgrade service ./charts/

# Monitoring
kubectl logs -f deployment/service
python scripts/health_check.py

```

## Resources

- Advanced Patterns: `references/statistical_methods_advanced.md`

- Implementation Guide: `references/experiment_design_frameworks.md`

- Technical Reference: `references/feature_engineering_patterns.md`

- Automation Scripts: `scripts/` directory

## Senior-Level Responsibilities

As a world-class senior professional:

**Technical Leadership**

- Drive architectural decisions

- Mentor team members

- Establish best practices

- Ensure code quality

**Strategic Thinking**

- Align with business goals

- Evaluate trade-offs

- Plan for scale

- Manage technical debt

**Collaboration**

- Work across teams

- Communicate effectively

- Build consensus

- Share knowledge

**Innovation**

- Stay current with research

- Experiment with new approaches

- Contribute to community

- Drive continuous improvement

**Production Excellence**

- Ensure high availability

- Monitor proactively

- Optimize performance

- Respond to incidents

**Weekly installs**: 2.0K · **Repository**: [davila7/claude-…emplates](https://github.com/davila7/claude-code-templates) · **GitHub stars**: 23.1K · **First seen**: Jan 20, 2026

**Security audits**: [Gen Agent Trust Hub: Pass](/davila7/claude-code-templates/senior-data-scientist/security/agent-trust-hub) · [Socket: Pass](/davila7/claude-code-templates/senior-data-scientist/security/socket) · [Snyk: Pass](/davila7/claude-code-templates/senior-data-scientist/security/snyk)

**Installed on**: opencode 1.6K · gemini-cli 1.6K · codex 1.6K · github-copilot 1.6K · kimi-cli 1.5K · amp 1.5K

---
*Source: https://skills.yangsir.net/skill/sm-senior-data-scientist*
*Markdown mirror: https://skills.yangsir.net/api/skill/sm-senior-data-scientist/markdown*