API Reference
Welcome to the highentDCA Python API documentation. This section provides detailed information about the modules, classes, and functions available in the highentDCA package.
Overview
highentDCA extends the adabmDCA framework with specialized functionality for entropy-decimated DCA models. The package is organized into several modules:
- Checkpoint: Checkpoint strategies for saving model state
- Training: Training functions for graph-based DCA models
- edDCA Model: Entropy decimation algorithm implementation
- CLI: Command-line interface entry point
- Parser: Argument parsing utilities
- Entropy Computation: Thermodynamic integration for entropy calculation
Quick Links
Core Modules
| Module | Description |
|---|---|
| highentDCA.models.edDCA | Entropy decimation fitting algorithm |
| highentDCA.training | Graph training utilities |
| highentDCA.checkpoint | Checkpoint management classes |
| highentDCA.parser | CLI argument parsers |
| highentDCA.scripts.entropy | Entropy computation via thermodynamic integration |
Common Imports
# Model training
from highentDCA.models.edDCA import fit
# Training utilities
from highentDCA.training import train_graph
# Checkpoint classes
from highentDCA.checkpoint import Checkpoint, DecCheckpoint
# Argument parsing
from highentDCA.parser import add_args_train, add_args_edDCA
# Entropy computation
from highentDCA.scripts.entropy import compute_entropy
Usage Examples
Example 1: Basic edDCA Training
import torch
from pathlib import Path
from adabmDCA.dataset import DatasetDCA
from adabmDCA.utils import init_chains, init_parameters, get_device
from adabmDCA.sampling import get_sampler
from highentDCA.models.edDCA import fit
from highentDCA.checkpoint import DecCheckpoint
# Configuration
device = get_device("cuda")
dtype = torch.float32
# Load dataset
dataset = DatasetDCA(
    path_data="data/protein_family.fasta",
    alphabet="protein",
    device=device,
    dtype=dtype,
)
# Initialize parameters and chains
params = init_parameters(L=dataset.L, q=dataset.q, device=device, dtype=dtype)
chains = init_chains(
    nchains=10000,
    L=dataset.L,
    q=dataset.q,
    device=device,
    dtype=dtype,
)
log_weights = torch.zeros(chains.shape[0], device=device, dtype=dtype)
# Set up sampler
sampler = get_sampler("gibbs")
# Configure checkpoint
checkpoint = DecCheckpoint(
    file_paths={
        "log": Path("output/training.log"),
        "params": Path("output/params.dat"),
        "chains": Path("output/chains.fasta"),
    },
    tokens=dataset.tokens,
    args={
        "model": "edDCA",
        "data": "data/protein_family.fasta",
        "alphabet": "protein",
        "density": 0.02,
        "drate": 0.01,
        # ... other args
    },
    target_density=0.02,
)
# Train edDCA model
fit(
    sampler=sampler,
    chains=chains,
    log_weights=log_weights,
    fi_target=dataset.fi,
    fij_target=dataset.fij,
    params=params,
    mask=torch.ones_like(params["coupling_matrix"]),
    lr=0.01,
    nsweeps=10,
    target_pearson=0.95,
    target_density=0.02,
    drate=0.01,
    checkpoint=checkpoint,
)
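The roles of target_density and drate in the call above can be illustrated on a toy problem. The sketch below is not highentDCA's implementation: it assumes that density means the fraction of couplings still active and that each decimation step removes a fraction drate of the currently active couplings. The function names and the random scores are hypothetical stand-ins for whatever criterion the real algorithm uses, and the re-training that happens between decimation steps is omitted.

```python
import random


def density(active, n_total):
    """Fraction of couplings that are still active."""
    return len(active) / n_total


def decimate_step(active, scores, drate):
    """Drop a fraction `drate` of the active couplings, lowest score first.

    At least one coupling is removed per step so the loop always progresses.
    """
    n_remove = max(1, int(drate * len(active)))
    ranked = sorted(active, key=lambda edge: scores[edge])
    return set(ranked[n_remove:])


random.seed(0)
n_total = 1000
# Hypothetical importance score per coupling (assumption for illustration).
scores = {edge: random.random() for edge in range(n_total)}

active = set(range(n_total))
target_density, drate = 0.02, 0.01
steps = 0
while density(active, n_total) > target_density:
    # A real run would re-converge the model here before pruning again.
    active = decimate_step(active, scores, drate)
    steps += 1

print(f"reached density {density(active, n_total):.3f} after {steps} steps")
```

Because a fixed fraction of the remaining couplings is removed at each step, the density decays roughly geometrically, which is why a small drate needs many steps to reach a low target density.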
Example 2: Custom Checkpoint Strategy
from pathlib import Path
from highentDCA.checkpoint import DecCheckpoint
# Create custom density checkpoints
custom_densities = [0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.02]
# training_args is the dictionary of training arguments (see `args` in Example 1)
checkpoint = DecCheckpoint(
    file_paths={
        "log": Path("output/custom.log"),
        "params": Path("output/params.dat"),
        "chains": Path("output/chains.fasta"),
    },
    tokens="protein",
    args=training_args,
    checkpt_steps=custom_densities,
    target_density=0.02,
)
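One plausible reading of checkpt_steps is a list of densities at which the model state is saved as decimation proceeds. The sketch below illustrates that reading only; crossed_checkpoints and the density trajectory are hypothetical and do not reproduce DecCheckpoint's internal logic.

```python
def crossed_checkpoints(schedule, previous_density, current_density):
    """Return the scheduled densities passed when the density drops
    from `previous_density` to `current_density`."""
    return [d for d in schedule if current_density <= d < previous_density]


custom_densities = [0.9, 0.7, 0.5, 0.3, 0.1, 0.05, 0.02]
# Hypothetical density trajectory of a decimation run (assumption).
trajectory = [1.0, 0.95, 0.85, 0.6, 0.45, 0.2, 0.04, 0.02]

saved = []
for prev, curr in zip(trajectory, trajectory[1:]):
    # A large step may cross several scheduled densities at once.
    saved.extend(crossed_checkpoints(custom_densities, prev, curr))

print(saved)
```

Checking for a crossing rather than for equality matters because the density changes in discrete steps and rarely lands exactly on a scheduled value.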
Example 3: Computing Entropy
from highentDCA.scripts.entropy import compute_entropy
from adabmDCA.io import load_params
from adabmDCA.sampling import get_sampler
from adabmDCA.utils import init_chains
# Load trained model
params, tokens, L, q = load_params("output/params.dat")
# Initialize chains
chains = init_chains(nchains=10000, L=L, q=q, device="cuda")
# Get sampler
sampler = get_sampler("gibbs")
# Compute entropy
entropy = compute_entropy(
    params=params,
    path_targetseq="data/target_sequence.fasta",
    sampler=sampler,
    chains=chains,
    output="output/entropy",
    label="density_0.020",
    tokens=tokens,
    theta_max=5.0,
    nsteps=100,
    nsweeps=100,
    device="cuda",
)
print(f"Model entropy: {entropy:.4f}")
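For readers unfamiliar with thermodynamic integration, the idea can be checked on a model small enough to enumerate. The sketch below is a generic illustration under stated assumptions, not the procedure used by compute_entropy: it integrates over an inverse temperature beta from 0 to 1 (the parameters theta_max and path_targetseq suggest the package instead integrates over a bias toward a target sequence), and it uses exact averages where a real run would use sampled chains. All names and values are hypothetical.

```python
import itertools
import math
import random

random.seed(1)
L, q = 3, 4  # toy model: 3 sites, 4 states per site

# Random fields and couplings of a small Potts-like model.
fields = [[random.gauss(0, 1) for _ in range(q)] for _ in range(L)]
couplings = {
    (i, j): [[random.gauss(0, 0.5) for _ in range(q)] for _ in range(q)]
    for i in range(L)
    for j in range(i + 1, L)
}


def energy(seq):
    """E(seq) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j)."""
    e = -sum(fields[i][a] for i, a in enumerate(seq))
    e -= sum(J[seq[i]][seq[j]] for (i, j), J in couplings.items())
    return e


energies = [energy(s) for s in itertools.product(range(q), repeat=L)]


def log_z(beta):
    """Exact log partition function at inverse temperature beta."""
    return math.log(sum(math.exp(-beta * e) for e in energies))


def mean_energy(beta):
    """Exact <E> under p_beta proportional to exp(-beta * E)."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


# d(log Z)/d(beta) = -<E>_beta, and log Z(0) = L * log(q), so
# log Z(1) = L * log(q) - integral_0^1 <E>_beta d(beta)  (trapezoid rule).
nsteps = 1000
means = [mean_energy(k / nsteps) for k in range(nsteps + 1)]
integral = sum(0.5 * (means[k] + means[k + 1]) / nsteps for k in range(nsteps))
log_z_ti = L * math.log(q) - integral

# Entropy of the model at beta = 1: S = <E> + log Z.
entropy_ti = mean_energy(1.0) + log_z_ti
entropy_exact = mean_energy(1.0) + log_z(1.0)
print(f"TI entropy: {entropy_ti:.4f}, exact entropy: {entropy_exact:.4f}")
```

The integration needs a reference point where the partition function is known exactly. Here that point is the uniform distribution at beta = 0; a strongly biased distribution concentrated on a single sequence can serve the same purpose.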
Example 4: Training on Specific Graph
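from highentDCA.training import train_graph
import torch
# Reuses device, dataset, sampler, chains, params, and checkpoint from Example 1
L, q = dataset.L, dataset.q
# Create sparse mask (e.g., contact map)
mask = torch.zeros(L, q, L, q, device=device, dtype=torch.bool)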
# ... populate mask with desired interactions ...
# Train on this specific graph
chains, params, log_weights, history = train_graph(
    sampler=sampler,
    chains=chains,
    mask=mask,
    fi=dataset.fi,
    fij=dataset.fij,
    params=params,
    nsweeps=10,
    lr=0.01,
    max_epochs=10000,
    target_pearson=0.95,
    checkpoint=checkpoint,
)
# Access training history
import matplotlib.pyplot as plt
plt.plot(history["epochs"], history["pearson"])
plt.xlabel("Epochs")
plt.ylabel("Pearson Correlation")
plt.show()
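The target_pearson threshold refers to a Pearson correlation coefficient. As a reminder of what that number measures, the sketch below computes it for two short lists in plain Python. It assumes the comparison is between data and model correlation values; the numbers are made up for illustration, and the function is not the one used inside train_graph.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical two-point correlations from the data and from the model.
data_corr = [0.10, -0.05, 0.20, 0.00, 0.15]
model_corr = [0.08, -0.04, 0.22, 0.01, 0.12]

target_pearson = 0.95
r = pearson(data_corr, model_corr)
converged = r >= target_pearson
print(f"Pearson: {r:.3f}, converged: {converged}")
```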
Integration with adabmDCA
highentDCA is built on top of adabmDCA, so you have access to all adabmDCA functionality:
# Import adabmDCA utilities
from adabmDCA.fasta import import_from_fasta, write_fasta
from adabmDCA.stats import get_freq_single_point, get_freq_two_points
from adabmDCA.io import load_params, save_params
from adabmDCA.sampling import gibbs_sampling, metropolis
from adabmDCA.statmech import compute_energy, compute_log_likelihood
from adabmDCA.graph import decimate_graph, compute_density
# Use with highentDCA
from highentDCA.models.edDCA import fit
from highentDCA.checkpoint import DecCheckpoint
Type Hints
highentDCA uses Python type hints for better code documentation and IDE support:
from typing import Dict, Callable
import torch
from highentDCA.checkpoint import Checkpoint
def fit(
    sampler: Callable,
    chains: torch.Tensor,
    log_weights: torch.Tensor,
    fi_target: torch.Tensor,
    fij_target: torch.Tensor,
    params: Dict[str, torch.Tensor],
    mask: torch.Tensor,
    lr: float,
    nsweeps: int,
    target_pearson: float,
    target_density: float,
    drate: float,
    checkpoint: Checkpoint,
    fi_test: torch.Tensor | None = None,
    fij_test: torch.Tensor | None = None,
    args=None,
) -> None:
    ...
Module Details
Each module has its own page with detailed documentation:
- Checkpoint: Learn about checkpoint strategies
- Training: Understand graph training functions
- edDCA Model: Deep dive into entropy decimation
- CLI: Command-line interface implementation
- Parser: Argument parsing utilities
- Entropy Computation: Thermodynamic integration details
Contributing
To contribute to the API:
- Follow PEP 8 style guidelines
- Add type hints to all function signatures
- Write comprehensive docstrings (Google style)
- Include examples in docstrings where appropriate
- Update this documentation when adding new features
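As an illustration of the guidelines above, the following sketch shows a small function with type hints and a Google-style docstring that includes an example. The function coupling_density is hypothetical and is not part of the highentDCA API.

```python
def coupling_density(n_active: int, n_total: int) -> float:
    """Compute the fraction of couplings that are still active.

    Args:
        n_active: Number of couplings currently kept in the model.
        n_total: Number of couplings in the fully connected model.

    Returns:
        The density, a value between 0 and 1.

    Raises:
        ValueError: If `n_total` is not positive or `n_active` is out of range.

    Example:
        >>> coupling_density(20, 1000)
        0.02
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_active <= n_total:
        raise ValueError("n_active must be between 0 and n_total")
    return n_active / n_total
```

The Example section doubles as a doctest, so it can be checked automatically with Python's doctest module.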