Tutorial: High-Throughput Screening¶
This tutorial covers systematic screening of compositions, defects, and adsorption configurations.
Learning Objectives¶
- Set up parameter space screening
- Manage large numbers of calculations
- Analyze screening results
Overview¶
High-throughput screening automates exploration of: - Vacancy concentrations - Nitrogen fractions - Dopant elements - Adsorption sites
Step 1: Define Parameter Space¶
from nh3sofc.workflows import ScreeningWorkflow
workflow = ScreeningWorkflow(
base_structure=surface,
parameter_space={
"vacancy_concentration": [0.0, 0.05, 0.10, 0.15, 0.20],
"nitrogen_fraction": [0.5, 0.67, 0.75],
},
work_dir="./oxynitride_screening",
encut=520,
kspacing=0.03,
)
This creates 5 × 3 = 15 calculations.
Step 2: Setup Calculations¶
calc_dirs = workflow.setup()
print(f"Created {len(calc_dirs)} calculation directories")
# Directory structure:
# oxynitride_screening/
# ├── vac0.00_N0.50/
# ├── vac0.00_N0.67/
# ├── vac0.00_N0.75/
# ├── vac0.05_N0.50/
# ...
Step 3: Monitor Progress¶
status = workflow.get_status()
for name, state in status.items():
print(f"{name}: {state}")
# Output:
# vac0.00_N0.50: completed
# vac0.00_N0.67: running
# vac0.05_N0.50: pending
# ...
Step 4: Parse Results¶
import pandas as pd
results = workflow.parse_results()
# Returns DataFrame with columns:
# - vacancy_concentration
# - nitrogen_fraction
# - energy
# - converged
# - ...
print(results.head())
Step 5: Find Optimal¶
optimal = workflow.find_optimal(metric="energy")
print(f"Optimal parameters: {optimal}")
# Or custom criteria
best = results.loc[results["energy"].idxmin()]
print(f"Lowest energy: {best}")
Specialized Screening¶
Composition Screening¶
from nh3sofc.workflows import CompositionScreening
workflow = CompositionScreening(
base_structure=surface,
elements_to_vary={
"La": ["La", "Sr", "Ba", "Ca"],
"V": ["V", "Ti", "Nb", "Ta"],
},
work_dir="./dopant_screening",
)
workflow.setup()
# Creates 4 × 4 = 16 calculations
Adsorbate Screening¶
from nh3sofc.workflows import AdsorbateScreening
workflow = AdsorbateScreening(
surface=surface,
adsorbates=["NH3", "NH2", "NH", "N", "H"],
sites=["ontop_La", "ontop_V", "bridge", "hollow"],
work_dir="./adsorbate_screening",
)
paths = workflow.setup()
# Creates 5 × 4 = 20 calculations
Generating Adsorbate Configurations¶
For fine-grained control over adsorbate placement, use AdsorbatePlacer directly.
All placement methods use uniform sampling on SO(3) for unbiased molecular
orientations, ensuring proper coverage of configurational space.
from nh3sofc.structure import AdsorbatePlacer, SlabStructure
# Load surface
slab = SlabStructure.from_file("surface.vasp")
placer = AdsorbatePlacer(slab)
# Method 1: Random placement with random orientations
configs = placer.add_random(
"NH3",
n_configs=20,
height=2.0,
random_seed=42,
)
# Generates 20 configurations with uniformly sampled positions and orientations
# Method 2: Grid-based placement with multiple orientations per point
configs = placer.add_grid(
"NH3",
grid_size=(3, 3), # 3x3 grid of positions
orientations=4, # 4 random orientations per grid point
height=2.0,
random_seed=42,
)
# Generates 3 × 3 × 4 = 36 configurations
# Method 3: Site-specific placement (most physically meaningful)
# Step 1: Identify surface atoms (within 2 Å of topmost atom)
# Step 2: Filter to only La and V atoms among those surface atoms
configs = placer.add_on_site(
"NH3",
atom_types=["La", "V"], # Select La/V from surface atoms only
n_orientations=5, # 5 orientations per site
height=2.0,
layer_tolerance=2.0, # Top bilayer (default for perovskites)
random_seed=42,
)
# Places above La and V atoms in the TOP BILAYER only
# Bulk La/V atoms are automatically excluded
# For only the topmost layer (e.g., just VO2 termination)
configs = placer.add_on_site(
"NH3",
atom_types=["V"],
n_orientations=3,
layer_tolerance=1.0, # Only topmost ~1 Å
random_seed=42,
)
# Method 4: Collision-aware random placement
configs = placer.add_with_collision(
"NH3",
n_configs=10,
min_distance=2.0, # Minimum atom-atom distance
height=2.0,
random_seed=42,
)
Rotation Sampling¶
All methods apply uniform random rotations using proper SO(3) sampling:
# The rotation uses: β = arccos(1 - 2u) for uniform spherical coverage
# This avoids clustering near the "poles" that occurs with naive sampling
# For reproducibility, always set random_seed
configs_a = placer.add_random("NH3", n_configs=10, random_seed=42)
configs_b = placer.add_random("NH3", n_configs=10, random_seed=42)
# configs_a and configs_b are identical
Filtering Unique Configurations¶
After generating many configurations, filter duplicates based on adsorbate-only RMSD:
from nh3sofc.structure.adsorbates import filter_unique_configs, save_configs
# Remove duplicates (compares only adsorbate atoms, not the slab)
# n_slab_atoms = number of atoms BEFORE adding adsorbate
n_slab_atoms = len(slab.atoms)
unique = filter_unique_configs(
configs,
threshold=0.5, # RMSD threshold in Angstroms
n_slab_atoms=n_slab_atoms
)
print(f"Filtered {len(configs)} → {len(unique)} unique configurations")
# Save for calculations
paths = save_configs(unique, output_dir="./adsorbate_configs", format="vasp")
Why n_slab_atoms? Adsorbate atoms are appended to the end of the structure.
This parameter tells the filter to compare only atoms at index n_slab_atoms onwards,
ignoring the identical slab atoms that would otherwise dominate the RMSD.
Visualizing Results¶
import matplotlib.pyplot as plt
results = workflow.parse_results()
# Heatmap
pivot = results.pivot(
index="vacancy_concentration",
columns="nitrogen_fraction",
values="energy"
)
plt.figure(figsize=(8, 6))
plt.imshow(pivot.values, cmap='RdYlGn_r')
plt.colorbar(label='Energy (eV)')
plt.xlabel('Nitrogen Fraction')
plt.ylabel('Vacancy Concentration')
plt.savefig('screening_heatmap.png', dpi=150)
Complete Example¶
from ase.io import read
from nh3sofc.workflows import ScreeningWorkflow
# 1. Load base structure (POSCAR format)
surface = read("work/surfaces/LaVO3_001/surface.vasp", format="vasp")
# 2. Define screening with meaningful work directory
workflow = ScreeningWorkflow(
base_structure=surface,
parameter_space={
"vacancy_concentration": [0.0, 0.10, 0.20],
"nitrogen_fraction": [0.5, 0.67],
},
work_dir="work/screening/LaVON_composition",
encut=520,
)
# 3. Setup (generates POSCAR files with atoms sorted by element)
workflow.setup()
# 4. After completion, analyze
results = workflow.parse_results()
# 5. Find optimal
optimal = workflow.find_optimal("energy")
print(f"Best: {optimal}")
# 6. Save results
results.to_csv("work/screening/LaVON_composition/results.csv")
Best Practices¶
- Start small - Test with subset before full screening
- Use database - Store all results in ASE database
- Check convergence - Verify all calculations completed
- Automate submission - Use job arrays for HPC
Next Steps¶
- Surface Comparison - Rank screened surfaces
- MACE Force Fields - Use ML for faster screening