data-analysis
データ探索、レポート生成、データ一貫性の検証、意思決定支援に利用されます。AIエージェントスキルとして、作業効率と自動化能力を向上させます。
npx skills add supercent-io/skills-template --skill data-analysisBefore / After 効果比較
1 组従来のデータ分析プロセスは、手動でのデータクレンジング、変換、レポート生成を伴うことが多く、時間と手間がかかり、エラーが発生しやすいものでした。データの一貫性を検証し、意思決定を支援する際には、多大な手作業による介入が必要とされ、効率が低く、ビジネスニーズに迅速に対応することが困難でした。
data-analysisスキルを活用することで、データ探索、レポート生成、一貫性検証が自動化され、非常に効率的になります。これにより、複雑なデータセットを迅速に処理し、正確な洞察を提供し、意思決定プロセスを大幅に加速させ、ビジネスが市場の変化により迅速に適応できるようになります。
data-analysis
Data Analysis When to use this skill Data exploration: Understand a new dataset Report generation: Derive data-driven insights Quality validation: Check data consistency Decision support: Make data-driven recommendations Instructions Step 1: Load and explore data Python (Pandas): import pandas as pd import numpy as np # Load CSV df = pd.read_csv('data.csv') # Basic info print(df.info()) print(df.describe()) print(df.head(10)) # Check missing values print(df.isnull().sum()) # Data types print(df.dtypes) SQL: -- Inspect table schema DESCRIBE table_name; -- Sample data SELECT * FROM table_name LIMIT 10; -- Basic stats SELECT COUNT() as total_rows, COUNT(DISTINCT column_name) as unique_values, MIN(numeric_column) as min_val, MAX(numeric_column) as max_val, AVG(numeric_column) as avg_val FROM table_name; Step 2: Data cleaning # Handle missing values df['column'].fillna(df['column'].mean(), inplace=True) df.dropna(subset=['required_column'], inplace=True) # Remove duplicates df.drop_duplicates(inplace=True) # Type conversions df['date'] = pd.to_datetime(df['date']) df['category'] = df['category'].astype('category') # Remove outliers (IQR method) Q1 = df['value'].quantile(0.25) Q3 = df['value'].quantile(0.75) IQR = Q3 - Q1 df = df[(df['value'] >= Q1 - 1.5IQR) & (df['value'] <= Q3 + 1.5*IQR)] Step 3: Statistical analysis # Descriptive statistics print(df['numeric_column'].describe()) # Grouped analysis grouped = df.groupby('category').agg({ 'value': ['mean', 'sum', 'count'], 'other': 'nunique' }) print(grouped) # Correlation correlation = df[['col1', 'col2', 'col3']].corr() print(correlation) # Pivot table pivot = pd.pivot_table(df, values='sales', index='region', columns='month', aggfunc='sum' ) Step 4: Visualization import matplotlib.pyplot as plt import seaborn as sns # Histogram plt.figure(figsize=(10, 6)) df['value'].hist(bins=30) plt.title('Distribution of Values') plt.savefig('histogram.png') # Boxplot plt.figure(figsize=(10, 6)) sns.boxplot(x='category', y='value', data=df) plt.title('Value by Category') plt.savefig('boxplot.png') # Heatmap (correlation) plt.figure(figsize=(10, 8)) sns.heatmap(correlation, annot=True, cmap='coolwarm') plt.title('Correlation Matrix') plt.savefig('heatmap.png') # Time series plt.figure(figsize=(12, 6)) df.groupby('date')['value'].sum().plot() plt.title('Time Series of Values') plt.savefig('timeseries.png') Step 5: Derive insights # Top/bottom analysis top_10 = df.nlargest(10, 'value') bottom_10 = df.nsmallest(10, 'value') # Trend analysis df['month'] = df['date'].dt.to_period('M') monthly_trend = df.groupby('month')['value'].sum() growth = monthly_trend.pct_change() * 100 # Segment analysis segments = df.groupby('segment').agg({ 'revenue': 'sum', 'customers': 'nunique', 'orders': 'count' }) segments['avg_order_value'] = segments['revenue'] / segments['orders'] Output format Analysis report structure # Data Analysis Report ## 1. Dataset overview - Dataset: [name] - Records: X,XXX - Columns: XX - Date range: YYYY-MM-DD ~ YYYY-MM-DD ## 2. Key findings - Insight 1 - Insight 2 - Insight 3 ## 3. Statistical summary | Metric | Value | |------|-----| | Mean | X.XX | | Median | X.XX | | Std dev | X.XX | ## 4. Recommendations 1. [Recommendation 1] 2. [Recommendation 2] Best practices Understand the data first: Learn structure and meaning before analysis Incremental analysis: Move from simple to complex analyses Use visualization: Use a variety of charts to spot patterns Validate assumptions: Always verify assumptions about the data Reproducibility: Document analysis code and results Constraints Required rules (MUST) Preserve raw data (work on a copy) Document the analysis process Validate results Prohibited (MUST NOT) Do not expose sensitive personal data Do not draw unsupported conclusions References Pandas Documentation Matplotlib Gallery Seaborn Tutorial Examples Example 1: Basic usage Example 2: Advanced usageWeekly Installs12.3KRepositorysupercent-io/sk…templateGitHub Stars53First SeenJan 24, 2026Security AuditsGen Agent Trust HubPassSocketFailSnykPassInstalled oncodex12.2Kgemini-cli12.2Kopencode12.2Kgithub-copilot12.2Kcursor12.2Kamp12.1K
ユーザーレビュー (0)
レビューを書く
レビューなし
統計データ
ユーザー評価
この Skill を評価