# TerraPulse Data Lab — Tools & Stack

The scientific analysis environment for TerraPulse research.
## Quick Start
```bash
# 1. Activate the terrapulse environment
pyenv activate terrapulse

# 2. Start Jupyter from the project root
cd ~/projects/terrapulse
jupyter notebook

# 3. Open the template notebook
#    → workspaces/notebook-template.ipynb

# 4. Or create a new workspace
python -c "
from terrapulse.lab.workspace import create_workspace
create_workspace('my-study', 'My Analysis', 'Your Name')
"
```
## Python Stack
### Core Analysis
| Package | Version | Purpose | Install |
|---|---|---|---|
| Polars | 1.39+ | Fast DataFrame operations (Rust-powered) | pip install polars |
| NumPy | 2.4+ | Numerical arrays, linear algebra | pip install numpy |
| SciPy | 1.17+ | Statistics, signal processing, optimization | pip install scipy |
| scikit-learn | 1.8+ | Machine learning, clustering, regression | pip install scikit-learn |
### Visualization
| Package | Version | Purpose | Install |
|---|---|---|---|
| Plotly | 6.6+ | Interactive charts (HTML + static export) | pip install plotly |
| Kaleido | 1.2+ | Static image export for Plotly (PNG, SVG, PDF) | pip install kaleido |
| KaTeX | CDN | LaTeX equation rendering in admin/web | Loaded via CDN |
### AI / Inference
| Package | Version | Purpose | Install |
|---|---|---|---|
| TinyGrad | 0.12+ | Lightweight ML inference and training | pip install tinygrad |
### Notebook
| Package | Version | Purpose | Install |
|---|---|---|---|
| Jupyter | 4.x | Interactive notebook environment | pip install jupyter |
### Data Access
| Package | Purpose |
|---|---|
| terrapulse.lab.extract | PostgreSQL → Parquet extraction |
| terrapulse.lab.workspace | Workspace management |
| httpx | TerraPulse API client |
| psycopg2 | Direct PostgreSQL access |
| DuckDB | Local analytical queries |
## Install Everything
```bash
pip install polars scikit-learn plotly kaleido scipy tinygrad jupyter numpy
```
Or from the project:
```bash
pip install -e ".[dev]"
```
## Workspace Structure
```
workspaces/{slug}/
├── index.md          # Hypothesis, methodology, findings (LaTeX supported)
├── workspace.json    # Metadata (author, status, tags)
├── data/             # Extracted datasets (Parquet, DuckDB, CSV)
├── scripts/          # Analysis scripts (Python)
├── tmp/              # Scratch files (gitignored)
└── www/              # Published output (HTML charts, PNG figures, PDF papers)
```
## Creating a Workspace
```python
from terrapulse.lab.workspace import create_workspace

create_workspace(
    slug="my-analysis",
    title="Temperature Trends in Global Cities",
    author="Your Name",
    description="Analyzing NASA POWER data for warming signals.",
)
```
## Extracting Data
```python
from terrapulse.lab.extract import extract_metric, extract_multi, extract_sql

# Single metric
extract_metric("earthquake_magnitude", "workspaces/my-analysis/data/eq.parquet", days=365)

# Multiple metrics (for correlation)
extract_multi(
    ["temperature_2m", "us_aqi"],
    "workspaces/my-analysis/data/temp_aqi.parquet",
    days=30,
)

# Arbitrary SQL
extract_sql(
    "SELECT * FROM observations WHERE country = 'United States' AND metric = 'drought_area_pct'",
    "workspaces/my-analysis/data/us_drought.parquet",
)
```
## Using the API as a Data Source
```python
import httpx

API = "https://terrapulse.info/api/v1"

# Time-series
ts = httpx.get(f"{API}/observations/timeseries", params={"metric": "earthquake_magnitude", "days": 30}).json()

# Stats
stats = httpx.get(f"{API}/stats").json()

# Geocoding
geo = httpx.get(f"{API}/geo/reverse", params={"lat": 37.77, "lon": -122.42}).json()

# Alerts
alerts = httpx.get(f"{API}/alerts/").json()

# Space weather events
events = httpx.get(f"{API}/space-weather/events", params={"days": 7}).json()
```
## LaTeX in Research Notes
Workspace `index.md` files support LaTeX via KaTeX:

Inline: `$r = 0.413$` renders as $r = 0.413$
Display:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
Supported: fractions, Greek letters, summations, integrals, matrices, subscripts/superscripts, and all standard LaTeX math.
## Jupyter Notebook
A template notebook is provided at `workspaces/notebook-template.ipynb`.
To start Jupyter:
```bash
cd ~/projects/terrapulse
jupyter notebook
```
The notebook demonstrates:
- Extracting data from PostgreSQL via `terrapulse.lab.extract`
- Using the TerraPulse API as a data source
- Cross-metric analysis with Polars
- Interactive Plotly visualizations
- Available metrics and stats
## Publishing Workflow
1. Create workspace — `create_workspace("my-study", "Title")`
2. Extract data — `extract_metric(...)` → Parquet files in `data/`
3. Analyze — Python scripts in `scripts/`, Jupyter notebooks
4. Visualize — Plotly charts exported to `www/` (HTML + PNG)
5. Document — write findings in `index.md` with LaTeX equations
6. Publish — update workspace status to "published"
7. Review — Admin UI shows figures with captions, rendered LaTeX, and interactive charts
## Research Standards
TerraPulse Lab aims to produce statistically sound, reproducible analysis:
- All data extracts are versioned Parquet files (reproducible)
- Statistical significance requires $p < 0.05$ at a minimum
- Effect sizes must be reported alongside p-values
- Multiple-comparison corrections (Bonferroni, FDR) are applied when testing many hypotheses
- Confidence intervals preferred over point estimates
- All code is committed alongside findings
- Visualizations must include axis labels, units, and sample sizes
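As a sketch of these standards in practice, using synthetic data and Welch's t-test as the example test (the group sizes, seed, and number of hypotheses `m` are illustrative):

```python
import numpy as np
from scipy import stats

# Synthetic two-group comparison
rng = np.random.default_rng(42)
a = rng.normal(0.0, 1.0, 200)
b = rng.normal(0.3, 1.0, 200)

# Report the p-value together with an effect size, never p alone
t, p = stats.ttest_ind(a, b, equal_var=False)   # Welch's t-test
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd    # effect size

# Bonferroni correction when this is one of m tests
m = 5
p_adj = min(p * m, 1.0)

# Approximate 95% confidence interval for the mean difference
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
diff = b.mean() - a.mean()
ci = (diff - 1.96 * se, diff + 1.96 * se)
```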