Plantation Monitoring System

A system for analyzing and visualizing plantation data using satellite imagery and ground truth data. The project focuses on processing GeoJSON plantation data and creating comprehensive visualizations for analysis.

Project Structure

plantation-monitoring/
├── data/
│   └── Plantations Data.geojson
├── src/
│   ├── data_preprocessing.py
│   ├── data_visualization.py
│   ├── model.py
│   ├── train.py
│   └── inference.py
├── notebooks/
│   └── visualization_analysis.ipynb
├── docker/
│   ├── Dockerfile
│   └── docker-compose.yml
├── requirements.txt
└── README.md

Plantation Detection Model (pretraining)

Model Architecture and Selection

Model Choice: DeepLabV3 with ResNet-50

The plantation detection system uses DeepLabV3 with a ResNet-50 backbone for semantic segmentation. This choice was made for several reasons:

  1. Architecture Benefits

    • Strong feature extraction through ResNet-50 backbone
    • Atrous Spatial Pyramid Pooling (ASPP) for multi-scale processing
    • Effective handling of varying plantation sizes
    • Built-in handling of global context
  2. Model Configuration

    from torchvision.models.segmentation import deeplabv3_resnet50
    import torch.nn as nn

    model = deeplabv3_resnet50(pretrained=True)  # start from pretrained weights
    model.classifier[-1] = nn.Conv2d(256, 2, kernel_size=(1, 1))  # binary head: plantation vs. non-plantation
    • Modified for binary segmentation (plantation vs. non-plantation)
    • Leverages pretrained weights for feature extraction
    • Custom classification head for our specific task
  3. Input Processing

    • Image size: 256x256 pixels
    • Three-channel input (RGB from Sentinel-2)
    • Per-image normalization (zero mean, unit variance)
    • Optional data augmentation (a sketch follows this list):
      • Random horizontal/vertical flips
      • Random rotation
      • Color jittering
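
A minimal sketch of the optional augmentation pipeline, assuming torchvision transforms (the parameters here are illustrative, not the project's exact settings):

# Hypothetical augmentation pipeline for the 256x256 RGB patches
from torchvision import transforms

train_augmentations = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(degrees=90),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
])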

Training Process

  1. Data Preparation

    # Image preprocessing (per image)
    image = (image - image.mean()) / image.std()  # Standardize: zero mean, unit variance
    image = torch.from_numpy(image).float()       # NumPy array -> float32 tensor
  2. Loss Function

    • Binary Cross Entropy with Logits Loss
    • Handles class imbalance through weighting
    criterion = nn.BCEWithLogitsLoss(
        pos_weight=torch.tensor([pos_weight])  # Adjusted for class balance
    )
  3. Optimization

    • Adam optimizer with learning rate scheduling
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode='min', patience=3
    )
  4. Training Schedule

    • Number of epochs: 10 (default)
    • Batch size: 4 (adjustable based on GPU memory)
    • Learning rate: 0.001 with reduction on plateau
    • Checkpointing every 5 epochs (an illustrative loop follows below)
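
Putting the schedule together, an illustrative training loop (a sketch, not the project's actual train.py; it assumes the model, criterion, optimizer, scheduler, and a train_loader defined as above):

import torch

for epoch in range(10):  # default number of epochs
    model.train()
    epoch_loss = 0.0
    for images, masks in train_loader:  # batch size 4
        optimizer.zero_grad()
        logits = model(images)["out"]    # torchvision segmentation models return a dict
        loss = criterion(logits, masks)  # masks: float tensors shaped like logits
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss)           # ReduceLROnPlateau watches the epoch loss
    if (epoch + 1) % 5 == 0:             # checkpoint every 5 epochs
        torch.save(model.state_dict(), f"checkpoint_epoch_{epoch + 1}.pth")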

Model Performance Considerations

  1. Advantages

    • Robust to varying plantation sizes
    • Good handling of spatial context
    • Efficient inference time
    • Memory-efficient training
  2. Limitations

    • Requires good quality RGB imagery
    • May struggle with very small plantations
    • Sensitive to cloud cover in imagery
  3. Performance Metrics (a computation sketch follows this list)

    • IoU (Intersection over Union)
    • Precision and Recall
    • F1 Score
    • Confusion Matrix analysis
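
A minimal sketch of how these metrics can be computed from binary masks (an illustrative helper, not code from this repository):

import numpy as np

def binary_metrics(pred, target, eps=1e-7):
    """pred, target: numpy arrays of 0/1 with identical shape."""
    tp = np.logical_and(pred == 1, target == 1).sum()
    fp = np.logical_and(pred == 1, target == 0).sum()
    fn = np.logical_and(pred == 0, target == 1).sum()
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return {"IoU": iou, "precision": precision, "recall": recall, "F1": f1}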

Alternative Models Considered

  1. U-Net

    • Pros: Lighter weight; well proven on segmentation tasks such as medical imaging
    • Cons: Less global context awareness; no standard pretrained backbone
  2. Mask R-CNN

    • Pros: Instance segmentation capability
    • Cons: Overkill for binary segmentation, slower inference
  3. FCN (Fully Convolutional Network)

    • Pros: Simpler architecture
    • Cons: Less accurate on boundary details

DeepLabV3 was chosen as it provided the best balance of:

  • Accuracy in plantation boundary detection
  • Reasonable training time
  • Good inference speed
  • Memory efficiency
  • Ability to handle varying plantation sizes

Scripts and Notebooks Documentation

1. Model Pretraining (src/pretrain.py)

The pretraining script prepares the training data and trains the plantation detection model; a usage sketch follows the component list below.

Key Components:

  1. Data Preparation: prepare_training_data(geojson_path, output_dir, patch_size=256, max_samples=100)

    • Downloads Sentinel-2 imagery for plantation areas
    • Creates binary masks from plantation polygons
    • Saves georeferenced image-mask pairs
    • Parameters:
      • geojson_path: Path to plantation polygons
      • output_dir: Directory to save processed data
      • patch_size: Size of image patches (default: 256x256)
      • max_samples: Maximum number of samples to process
  2. Dataset Class: class PlantationDataset(Dataset)

    • Handles data loading and preprocessing
    • Performs normalization and augmentation
    • Returns image-mask pairs for training
  3. Model Training: train_model(data_dir, num_epochs=10, batch_size=4, learning_rate=0.001)

    • Uses DeepLabV3 with ResNet-50 backbone
    • Binary segmentation for plantation detection
    • Saves model checkpoints during training
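
An illustrative end-to-end invocation of these components (the import path and output directory are assumptions made for this sketch):

from src.pretrain import prepare_training_data, train_model

prepare_training_data(
    geojson_path="data/Plantations Data.geojson",
    output_dir="data/training",
    patch_size=256,
    max_samples=100,
)
train_model(data_dir="data/training", num_epochs=10, batch_size=4, learning_rate=0.001)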

2. Inference and Analysis (notebooks/plantation_analysis_inference.ipynb)

The inference notebook provides comprehensive analysis and visualization tools.

Features of the notebook (example calls follow the list):

  1. Dataset Analysis

    • Ground truth statistics
    • Spatial distribution visualization
    • Interactive maps of plantation areas
    • Size distribution analysis
  2. Model Performance Analysis: showcase_best_predictions(model, data_dir, num_samples=15)

    • Visualizes top predictions
    • Calculates performance metrics:
      • IoU (Intersection over Union)
      • Precision
      • Recall
      • F1 Score
    • Saves results to outputs/best_predictions/
  3. Coordinate-based Prediction: predict_on_coordinates(model, lon, lat, patch_size=256)

    • Downloads recent Sentinel-2 imagery for given coordinates
    • Runs model prediction
    • Visualizes results with overlays
    • Calculates area statistics
  4. Error Analysis

    • Confusion matrix visualization
    • IoU score distribution
    • False positive/negative analysis
    • Edge case examination
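
Example calls to the notebook helpers above (the argument values are placeholders):

# Visualize top predictions and write them to outputs/best_predictions/
showcase_best_predictions(model, data_dir="data/training", num_samples=15)

# Download recent Sentinel-2 imagery around a point and run the model on it
predict_on_coordinates(model, lon=77.59, lat=12.97, patch_size=256)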

Features

1. Data Processing (data_preprocessing.py)

  • Reads and processes GeoJSON plantation data
  • Handles coordinate transformations and spatial calculations
  • Prepares data for visualization and analysis (see the sketch below)
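
A minimal sketch of the kind of GeoJSON loading this step performs, assuming geopandas (the equal-area CRS choice is illustrative):

import geopandas as gpd

gdf = gpd.read_file("data/Plantations Data.geojson")
gdf = gdf.to_crs(epsg=6933)          # reproject to an equal-area CRS
gdf["area_m2"] = gdf.geometry.area   # per-polygon area in square meters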

2. Data Visualization (data_visualization.py & visualization_analysis.ipynb)

  • Spatial distribution analysis
    • Interactive maps showing plantation locations
    • Area-based visualization
    • District-wise distribution
  • Temporal analysis
    • Planting date patterns
    • Growth progression
    • Coppicing cycles
  • Growth analysis
    • Quality distribution
    • Height analysis
    • Area correlations
  • Species analysis
    • Distribution patterns
    • Growth characteristics
    • Area relationships

3. Machine Learning Pipeline

  • Data preprocessing for model training
  • Model training setup
  • Inference pipeline

Model Details

  • Data Input Structure (8 channels total)
def forward(self, x):
    # x shape: [batch_size, 8, height, width]
    
    # Split input into different sources
    s2_input = x[:, :4]     # Sentinel-2: RGB + NIR bands
    s1_input = x[:, 4:6]    # Sentinel-1: VV + VH bands (radar)
    modis_input = x[:, 6:]  # MODIS: NDVI + EVI (vegetation indices)
  • Separate Encoders for Each Source:
class MultiSourceUNet(nn.Module):
    def __init__(self, n_classes=1):
        super().__init__()
        # Specialized encoders for each data type
        self.s2_encoder = self._create_encoder(4)  # Optical data
        self.s1_encoder = self._create_encoder(2)  # Radar data
        self.modis_encoder = self._create_encoder(2)  # Time-series data
  • Feature Extraction Process:
def _encode_single_source(self, x, encoder):
    features = []
    for layer in encoder:
        x = layer(x)
        features.append(x)  # Store features for skip connections
    return features
  • Feature Fusion (a condensed runnable sketch follows these snippets):
# Fusion layer combines features intelligently
self.fusion = nn.Sequential(
    nn.Conv2d(512 * 3, 512, 1),  # Combine features from all sources
    nn.BatchNorm2d(512),
    nn.ReLU(inplace=True)
)

# In forward pass
fused_features = self.fusion(torch.cat([
    s2_features[-1],  # High-level Sentinel-2 features
    s1_features[-1],  # High-level Sentinel-1 features
    modis_features[-1]  # High-level MODIS features
], dim=1))
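
A condensed, runnable sketch stitching the pieces above together (channel splits, per-source encoders, fusion); the layer sizes are simplified stand-ins, not the real MultiSourceUNet:

import torch
import torch.nn as nn

class MiniMultiSourceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.s2_encoder = nn.Conv2d(4, 512, 3, padding=1)     # Sentinel-2: RGB + NIR
        self.s1_encoder = nn.Conv2d(2, 512, 3, padding=1)     # Sentinel-1: VV + VH
        self.modis_encoder = nn.Conv2d(2, 512, 3, padding=1)  # MODIS: NDVI + EVI
        self.fusion = nn.Sequential(
            nn.Conv2d(512 * 3, 512, 1),  # combine features from all sources
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        s2 = self.s2_encoder(x[:, :4])
        s1 = self.s1_encoder(x[:, 4:6])
        modis = self.modis_encoder(x[:, 6:])
        return self.fusion(torch.cat([s2, s1, modis], dim=1))

fused = MiniMultiSourceNet()(torch.randn(1, 8, 64, 64))  # -> [1, 512, 64, 64]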

Advantages over Standard U-Net:

  • Multi-Source Capability:
# Standard U-Net can only handle one type of input:
class StandardUNet:
    def __init__(self):
        self.encoder = single_encoder(in_channels=3)  # Only RGB

# Our MultiSourceUNet handles multiple sources:
class MultiSourceUNet:
    def __init__(self):
        self.s2_encoder = specialized_encoder(4)  # RGB + NIR
        self.s1_encoder = specialized_encoder(2)  # Radar
        self.modis_encoder = specialized_encoder(2)  # Time series
  • Complementary Information:
# Each source provides unique information:
s2_features  # Spectral information (vegetation, water)
s1_features  # Radar backscatter (structure, moisture)
modis_features  # Temporal patterns (seasonal changes)
  • Robustness to Missing Data:
def forward(self, x):
    # Can still work if some data is missing
    if has_s2_data:
        s2_features = self.s2_encoder(x[:, :4])
    else:
        s2_features = self.get_default_features()
    
    # Continue processing with available data
  • Source-Specific Feature Learning:
def _create_encoder(self, in_channels):
    """Each encoder is initialized for its data type"""
    if in_channels == 4:  # Sentinel-2
        # Reuse pretrained RGB filters; average them to seed the extra NIR channel
        first_conv.weight.data[:, :3] = resnet.conv1.weight.data
        first_conv.weight.data[:, 3] = resnet.conv1.weight.data.mean(dim=1)
    elif in_channels == 2:  # Sentinel-1
        # Radar features get their own, different initialization
        ...
  • Skip Connections with Rich Features:
# Decoding with rich feature combinations
dec4 = self.decoder4(torch.cat([
    dec5,  # High-level fused features
    s2_features[-2]  # Skip connection with optical features
], dim=1))

Installation

Local Setup

  1. Clone the repository:
git clone https://github.com/iabhi7/plantation-monitoring.git
cd plantation-monitoring
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Authenticate with Google Earth Engine:
earthengine authenticate
  5. (Optional) Set up a service account:
  • Create a service-account.json file with your Google Earth Engine credentials
  • Place it in the project root directory

Usage

1. Data Visualization

Run the visualization script:

python src/data_visualization.py

This generates:

  • spatial_distribution.png: Geographic distribution map
  • interactive_map.html: Interactive web visualization
  • temporal_distribution.png: Time-based analysis
  • growth_analysis.png: Growth patterns
  • species_analysis.png: Species distribution
  • dashboard.html: Combined interactive dashboard

2. Jupyter Notebook Analysis

Start Jupyter and open visualization_analysis.ipynb:

jupyter notebook notebooks/visualization_analysis.ipynb

The notebook provides:

  • Detailed data exploration
  • Interactive visualizations
  • Statistical analysis
  • Custom analysis options

3. Data Preprocessing

Process the plantation data:

python src/data_preprocessing.py

Basic Usage

Run the complete pipeline sequentially:

python run_pipeline.py

Parallel Processing

For faster data processing, you can use parallel execution:

  1. Run with default parallel settings (8 workers):
python run_pipeline.py --parallel
  2. Run with a custom number of workers:
python run_pipeline.py --parallel --workers 4

Note: The number of workers should be adjusted based on your system's capabilities. More workers may speed up processing but will also use more memory.

Dependencies

Core requirements:

  • geopandas
  • matplotlib
  • seaborn
  • folium
  • pandas
  • numpy
  • plotly
  • contextily
  • jupyter

The full list is available in requirements.txt.

Testing

Run the test suite:

python test_setup.py

Docker Setup

Build Docker Image

docker-compose build

Sequential Processing

# Run with sequential processing
docker-compose up plantation-monitor

Parallel Processing

# Run with parallel processing (4 workers)
docker-compose up parallel-monitor

# Or specify custom workers
docker-compose run --rm parallel-monitor python run_pipeline.py --parallel --workers 8

Memory Considerations

  • Sequential service is limited to 8GB memory
  • Parallel service is limited to 16GB memory
  • Adjust memory limits in docker-compose.yml based on your system

Service Account

To use a service account:

  1. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
  2. The file will be mounted automatically in the container (an initialization sketch follows below)
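
For reference, an illustrative Earth Engine initialization with a service account (the account email is a placeholder):

import ee

credentials = ee.ServiceAccountCredentials(
    "my-service-account@my-project.iam.gserviceaccount.com",
    "service-account.json",
)
ee.Initialize(credentials)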

Pipeline Components

The pipeline consists of several stages:

  1. Data preprocessing (sequential or parallel; see the sketch below)
  2. Model training
  3. Inference
  4. Visualization
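
A minimal, self-contained sketch of the sequential-versus-parallel split described above (run_pipeline.py's real internals are not shown in this README; the stage function here is a stand-in):

from multiprocessing import Pool

def preprocess_item(item):
    """Stand-in for one unit of stage 1 (data preprocessing)."""
    return item  # real code would download imagery, build masks, etc.

def run_pipeline(items, parallel=False, workers=8):
    if parallel:
        with Pool(processes=workers) as pool:
            processed = pool.map(preprocess_item, items)
    else:
        processed = [preprocess_item(i) for i in items]
    # Stages 2-4 (training, inference, visualization) would follow here.
    return processed

if __name__ == "__main__":
    run_pipeline(list(range(100)), parallel=True, workers=4)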

Data Requirements

  • Place your plantation data in data/Plantations Data.geojson
