FAIR² Drone Dataset Card: Quick-Start Guide

For researchers creating or documenting drone-based wildlife datasets

Below is a streamlined checklist for creating a WildFAIRx-compliant drone dataset card. Review the full dataset card template for detailed guidance.

Before You Start

Gather image/video files and annotation files (if applicable)
Collect drone/sensor specifications (model numbers)
Locate flight logs or mission notes
Find research permits and approval numbers
Create species list with scientific names (use GBIF)
Set aside 2-4 hours for completion, depending on the template you choose

Choose Your Template

Select the appropriate template based on your primary use case:

Object Detection: Detection Template (~2 hours) - Core + Darwin Core + COCO annotations
Multi-Object Tracking: Tracking Template (~2.5 hours) - Detection + MOT format + ID protocols
Behavior Recognition: Behavior Template (~3 hours) - Detection + Ethogram + temporal labels
Robotics Benchmarking: Platform Template (~2.5 hours) - Core + Full telemetry + Minimal annotations
Multiple Tasks: Comprehensive Template (~3-4 hours) - All modules

YAML Front Matter

License: Specify license type (e.g., cc-by-4.0)
Pretty Name: Provide descriptive dataset name
Task Categories: List relevant tasks (e.g., object-detection, image-classification)
Tags: Include relevant tags (e.g., wildlife-monitoring, species names, locations)
Size Categories: Specify dataset size (e.g., n<1k, 1k<n<10k)
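
As a rough illustration of the fields above, the sketch below renders a flat dictionary as YAML front matter using only the standard library. The key names (`license`, `pretty_name`, `task_categories`, `tags`, `size_categories`) follow the Hugging Face dataset card convention and are an assumption about what the template expects; the dataset name is hypothetical. Values are written unquoted, so a real card with colons or other special characters in its values would be better served by a YAML library.

```python
def render_front_matter(meta):
    """Render a flat dict of strings or lists of strings as YAML front matter."""
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, (list, tuple)):
            # Lists become YAML block sequences, one item per line.
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


card_meta = {
    "license": "cc-by-4.0",
    "pretty_name": "Example Savanna Drone Dataset",  # hypothetical name
    "task_categories": ["object-detection", "image-classification"],
    "tags": ["wildlife-monitoring", "drone"],
    "size_categories": ["1k<n<10k"],
}
print(render_front_matter(card_meta))
```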

Dataset Overview

Title: Clear, descriptive title
Description: 2-3 paragraph summary of dataset purpose and content
Authors: List curators/authors with affiliations
Contact Information: Provide contact details or point to discussion forum
Repository: Link to GitHub repository
Homepage: Provide link if available
DOI: Assign or note as pending

Data Structure

Directory Structure: Document file organization with tree diagram
File Formats: List formats for images (e.g., JPG, PNG) and annotations (e.g., COCO JSON, YOLO)
Naming Convention: Explain file naming pattern with examples
Data Splits: Describe train/val/test splits with counts and creation method
Example Files: Link to at least one representative image
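
One way to make the split creation method reproducible and easy to document is to assign whole missions, rather than individual images, to each split. The sketch below is a minimal example under that assumption; the function name, the 80/10/10 fractions, and the mission identifiers are all hypothetical. It does not stratify by location or season, which a real split may need.

```python
import random
from collections import Counter


def split_by_mission(image_to_mission, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign every image to train/val/test so that no mission spans two splits."""
    missions = sorted(set(image_to_mission.values()))
    random.Random(seed).shuffle(missions)  # fixed seed keeps the split reproducible
    n_train = int(len(missions) * fractions[0])
    n_val = int(len(missions) * fractions[1])
    split_of = {}
    for position, mission in enumerate(missions):
        if position < n_train:
            split_of[mission] = "train"
        elif position < n_train + n_val:
            split_of[mission] = "val"
        else:
            split_of[mission] = "test"  # remaining missions go to test
    return {image: split_of[mission] for image, mission in image_to_mission.items()}


# Hypothetical data: 10 missions with 3 images each.
images = {f"m{m:02d}_img{i}.jpg": f"mission_{m:02d}" for m in range(10) for i in range(3)}
assignment = split_by_mission(images)
print(Counter(assignment.values()))  # counts to report in the dataset card
```

Recording the seed and the grouping key in the card lets others regenerate the same split.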

!!! warning "Consistent Naming Conventions"

Ensure file naming is consistent across videos, telemetry files, and occurrence data:
- Use the same date format throughout (e.g., `YYYY_MM_DD` or `DD_MM_YY`)
- Match session/flight identifiers exactly between CSV files and data folders
- Document any naming changes (e.g., `session_1` → `flight_1`) to maintain data linkages
- Test that scripts can locate files using your naming pattern
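
The checks above can be automated with a short script. The following is a sketch only: the file naming pattern (`YYYY_MM_DD_session_N_NNN.ext`) and the function name are hypothetical, so adapt the regular expression to your own convention. Session identifiers are compared exactly, including case, so that a rename such as `session_1` to `flight_1` shows up as a mismatch.

```python
import re

# Hypothetical convention: <YYYY_MM_DD>_<session id>_<3-digit clip number>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{4}_\d{2}_\d{2})_(?P<session>session_\d+)_(?P<clip>\d{3})\.(?:mp4|srt|csv)$"
)


def check_names(filenames, csv_session_ids):
    """Report files that break the pattern and sessions missing on either side."""
    bad_names = []
    sessions_on_disk = set()
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match is None:
            bad_names.append(name)
        else:
            sessions_on_disk.add(match.group("session"))
    expected = set(csv_session_ids)
    return {
        "bad_names": bad_names,
        "missing_on_disk": sorted(expected - sessions_on_disk),
        "missing_in_csv": sorted(sessions_on_disk - expected),
    }


report = check_names(
    [
        "2023_01_15_session_1_001.mp4",
        "2023_01_15_session_1_001.srt",
        "15_01_23_session_2_001.mp4",   # wrong date format
        "2023_01_16_flight_3_001.mp4",  # renamed identifier
    ],
    csv_session_ids=["session_1", "session_2"],
)
print(report)
```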

Darwin Core Metadata

Event Records

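For each survey location/date, document the event date, location (WGS84 coordinates with uncertainty), sampling protocol, and sampling effort.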

!!! tip "Extracting GPS Data from Telemetry"

If your videos have embedded telemetry (SRT files, EXIF data, or flight logs), you can extract GPS coordinates programmatically:
- Use `exiftool` for EXIF GPS data from images
- Parse DJI `.SRT` files for frame-level GPS coordinates
- Extract launch points, min/max bounds, and altitude ranges
- Aggregate video-level data to session/mission-level events
- See [KABR scripts](https://github.com/Imageomics/kabr-behavior-telemetry/tree/main/scripts) for Python examples
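
For orientation, here is a minimal sketch of the SRT parsing step. It is not the KABR implementation. It assumes subtitle blocks that carry `[latitude: ...]`, `[longitude: ...]`, and `rel_alt: ...` fields, as some DJI firmware versions write them (some spell the field `longtitude`); other models use different layouts, so inspect your own files first. Treating the first GPS fix as the launch point is also an assumption.

```python
import re

NUMBER = r"(-?\d+(?:\.\d+)?)"
LAT_RE = re.compile(r"\[latitude\s*:\s*" + NUMBER + r"\]")
LON_RE = re.compile(r"\[longt?itude\s*:\s*" + NUMBER + r"\]")  # tolerates 'longtitude'
REL_ALT_RE = re.compile(r"rel_alt\s*:\s*" + NUMBER)


def parse_srt_gps(srt_text):
    """Return one (lat, lon, rel_alt) tuple per subtitle block that contains GPS."""
    points = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lat, lon, alt = LAT_RE.search(block), LON_RE.search(block), REL_ALT_RE.search(block)
        if lat and lon:
            rel_alt = float(alt.group(1)) if alt else None
            points.append((float(lat.group(1)), float(lon.group(1)), rel_alt))
    return points


def summarize_track(points):
    """Aggregate frame-level points into video-level bounds and altitude range."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    alts = [p[2] for p in points if p[2] is not None]
    return {
        "launch_point": (lats[0], lons[0]),  # assumption: first fix is the launch point
        "min_lat": min(lats), "max_lat": max(lats),
        "min_lon": min(lons), "max_lon": max(lons),
        "min_rel_alt_m": min(alts) if alts else None,
        "max_rel_alt_m": max(alts) if alts else None,
    }


# Made-up sample in the assumed layout.
SAMPLE_SRT = """1
00:00:00,000 --> 00:00:00,033
<font size="28">SrtCnt : 1, DiffTime : 33ms
2023-01-15 08:12:01.123
[iso : 100] [shutter : 1/1000.0] [latitude: -2.345600] [longitude: 34.812300] [rel_alt: 30.100 abs_alt: 1530.200] </font>

2
00:00:00,033 --> 00:00:00,066
<font size="28">SrtCnt : 2, DiffTime : 33ms
2023-01-15 08:12:01.156
[iso : 100] [shutter : 1/1000.0] [latitude: -2.345900] [longtitude: 34.812800] [rel_alt: 42.500 abs_alt: 1542.600] </font>
"""

print(summarize_track(parse_srt_gps(SAMPLE_SRT)))
```

Session-level events can then be built by applying the same min/max aggregation across all videos of a session.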

Occurrence Records

Scientific Names: Verify all species names, including their authorities, against GBIF
Taxonomic Hierarchy: Complete Kingdom, Phylum, Class, Order, Family, Genus, Species for each taxon
Occurrence Table: Create table linking events to species observations
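
A possible shape for the occurrence table is sketched below using Darwin Core term names as column headers. The event identifiers, counts, and `occurrenceID` scheme are hypothetical, and `MachineObservation` is an assumed `basisOfRecord` for drone imagery. Authorship strings are left out here; copy them from GBIF once each name has been verified.

```python
import csv
import io

# Taxonomic hierarchy per verified scientific name (extend for your species list).
TAXONOMY = {
    "Equus quagga": {
        "kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
        "order": "Perissodactyla", "family": "Equidae", "genus": "Equus",
    },
    "Giraffa camelopardalis": {
        "kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
        "order": "Artiodactyla", "family": "Giraffidae", "genus": "Giraffa",
    },
}

DWC_COLUMNS = [
    "occurrenceID", "eventID", "basisOfRecord", "scientificName", "kingdom",
    "phylum", "class", "order", "family", "genus", "individualCount",
]


def build_occurrences(observations):
    """Turn (eventID, scientificName, count) tuples into Darwin Core style rows."""
    rows = []
    for index, (event_id, name, count) in enumerate(observations, start=1):
        row = {
            "occurrenceID": f"{event_id}:occ{index:04d}",  # hypothetical ID scheme
            "eventID": event_id,
            "basisOfRecord": "MachineObservation",  # assumption for drone imagery
            "scientificName": name,
            "individualCount": count,
        }
        row.update(TAXONOMY[name])  # a KeyError here flags an unverified name
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_occurrences([
    ("flight_1", "Equus quagga", 12),
    ("flight_1", "Giraffa camelopardalis", 3),
])
print(to_csv(rows))
```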

Platform Specifications

Platform Type: Drone type (multirotor, fixed-wing, hybrid)
Manufacturer and Model: Full platform identification
Physical Specs: Weight, dimensions, flight time, max speed, wind resistance
Sensor Details: Camera manufacturer, model, resolution, sensor size, focal length, field of view
Additional Sensors: List any thermal, LiDAR, multispectral sensors
Gimbal: Type and axes of stabilization
Autonomy Mode: Manual, waypoint, or fully autonomous
Flight Features Used: Grid, orbit, follow, terrain-following, etc.

Mission Parameters

Flight Altitude: Range in meters AGL
Flight Speed: Speed in m/s
Flight Pattern: Description (grid, transect, adaptive)
Coverage: Area covered per mission
Image Overlap: Forward/side overlap percentages
Environmental Conditions: Weather, temperature, wind, visibility
Telemetry Data: Available flight logs and formats (GPS, IMU, battery, etc.)
Permits: Research permits, IRB/IACUC approvals, aviation regulations followed
Animal Welfare: Minimum altitudes, disturbance protocols

Annotations

Format and Coverage

Supported Tasks: Detection, tracking, segmentation, behavior, re-ID, keypoints
Annotation Format: Specify format for each task (COCO, MOT, etc.)
Label Set: List all classes/species, behaviors, attributes
Total Counts: Images, annotations, annotations per image (min/max/avg)
Per-Class Distribution: Count per class/species
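
The counts above can be computed directly from a COCO annotation file instead of by hand. The sketch below assumes the standard COCO keys (`images`, `annotations`, `categories`, `image_id`, `category_id`) and at least one image; the function name and the tiny in-memory example are hypothetical. In practice the dictionary would come from `json.load` on your annotation file.

```python
from collections import Counter
from statistics import mean


def coco_stats(coco):
    """Summarize image, annotation, and per-class counts for a COCO dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    # Start every image at zero so images without annotations are counted too.
    per_image = {img["id"]: 0 for img in coco["images"]}
    per_class = Counter()
    for ann in coco["annotations"]:
        per_image[ann["image_id"]] += 1
        per_class[names[ann["category_id"]]] += 1
    counts = list(per_image.values())
    return {
        "images": len(counts),
        "annotations": len(coco["annotations"]),
        "per_image_min": min(counts),
        "per_image_max": max(counts),
        "per_image_avg": mean(counts),
        "per_class": dict(per_class),
    }


example = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [{"id": 1, "name": "zebra"}, {"id": 2, "name": "giraffe"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 1},
        {"id": 12, "image_id": 2, "category_id": 2},
    ],
}
print(coco_stats(example))
```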

Quality Metrics

Creation Method: Manual, semi-automatic, or automatic
Annotation Tool: Software used (CVAT, Label Studio, etc.)
Annotators: Who created annotations (experts, students, crowd workers)
Quality Assurance: Number of annotators, inter-annotator agreement, review process
Confidence Scores: Whether included in annotation files
Known Issues: Annotation gaps, difficult cases, systematic biases
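
If two annotators labeled the same instances, inter-annotator agreement on class labels can be reported with Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The guide does not prescribe a metric, so this is one common choice rather than a requirement. The sketch assumes the two label lists are already aligned instance by instance; for bounding boxes, instances would first need to be matched (for example by overlap).

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two aligned lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each annotator's marginal class frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical class
    return (observed - expected) / (1 - expected)


annotator_1 = ["zebra", "zebra", "giraffe", "giraffe"]
annotator_2 = ["zebra", "giraffe", "giraffe", "giraffe"]
print(cohens_kappa(annotator_1, annotator_2))
```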

Annotation Difficulty

Occlusion: Percentage of instances with none/partial/heavy occlusion
Crowd Density: Distribution across sparse/moderate/dense scenarios
Scale Variation: Range of object sizes
Environmental Challenges: Glare, shadows, motion blur, etc.

Dataset Statistics

Temporal Coverage: Date range, seasons represented
Spatial Coverage: Number of locations, total area surveyed
Class Balance: Distribution across classes/species
Baseline Results: Performance metrics if available

Limitations and Considerations

Known Biases: Geographic, temporal, species, environmental
Limitations: Technical constraints, coverage gaps, quality issues
Recommendations: Guidance for appropriate dataset use
Ethical Considerations: Privacy, animal welfare, cultural sensitivities
Reporting Issues: Link to issue tracker or community forum

Licensing

License Details: Confirm license choice and any special conditions
Component Licenses: Note if images, annotations, or code have different licenses
Attribution Requirements: Specify how to cite the dataset

Citation

Dataset Citation: Provide BibTeX entry for the dataset
Associated Paper: Include citation for related publications

Acknowledgements

Funding Sources: List grants and funding agencies
Contributors: Acknowledge field teams, annotators, collaborators
Institutional Support: Note supporting organizations

Optional Sections

Glossary: Define technical terms or specialized calculations
Additional Information: Any other relevant context
Related Datasets: Links to complementary datasets
Multimodal Linkages: Connections to other sensor data

Validation Checklist

WildFAIRx Compliance

DOI assigned or pending
License clearly stated
All required (*) fields completed in template
Machine-readable YAML front matter filled
Contact information provided

Darwin Core Compliance

Event records complete (dates, locations, protocol)
Occurrence records have scientific names with authorities
Taxonomic hierarchy filled (minimum to family level)
Sampling effort quantified
Coordinates in WGS84 with uncertainty
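
Several of these checks lend themselves to automation. The sketch below validates a single event record against the items in this list. The field names are Darwin Core terms, but the set of required fields mirrors this checklist rather than the Darwin Core standard itself, and the function name and sample records are hypothetical. It only handles plain calendar dates, not the date ranges or timestamps that `eventDate` may also hold.

```python
from datetime import date

# Assumption: fields this checklist treats as required for an event record.
REQUIRED_EVENT_FIELDS = (
    "eventID", "eventDate", "decimalLatitude", "decimalLongitude",
    "geodeticDatum", "coordinateUncertaintyInMeters",
    "samplingProtocol", "samplingEffort",
)


def validate_event(record):
    """Return a list of problems found in one event record (empty if none)."""
    problems = [f"missing {name}" for name in REQUIRED_EVENT_FIELDS
                if record.get(name) in (None, "")]
    if problems:
        return problems
    try:
        date.fromisoformat(str(record["eventDate"]))
    except ValueError:
        problems.append("eventDate is not an ISO 8601 calendar date")
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
        uncertainty = float(record["coordinateUncertaintyInMeters"])
    except (TypeError, ValueError):
        return problems + ["coordinates or uncertainty are not numeric"]
    if not -90 <= lat <= 90:
        problems.append("decimalLatitude out of range")
    if not -180 <= lon <= 180:
        problems.append("decimalLongitude out of range")
    if uncertainty <= 0:
        problems.append("coordinateUncertaintyInMeters must be positive")
    if str(record["geodeticDatum"]).upper() not in {"WGS84", "EPSG:4326"}:
        problems.append("geodeticDatum is not WGS84")
    return problems


good_event = {
    "eventID": "flight_1", "eventDate": "2023-01-15",
    "decimalLatitude": -2.3456, "decimalLongitude": 34.8123,
    "geodeticDatum": "WGS84", "coordinateUncertaintyInMeters": 5,
    "samplingProtocol": "drone grid survey", "samplingEffort": "40 flight minutes",
}
bad_event = dict(good_event, eventDate="15/01/2023", decimalLatitude=95, geodeticDatum="NAD27")
print(validate_event(good_event))
print(validate_event(bad_event))
```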

Practical Completeness

Directory structure documented
File naming convention explained
Data splits clearly defined
Annotation format specified
At least one example image linked
Known limitations acknowledged

Common Mistakes to Avoid

Vague Descriptions

- ❌ "We used a drone to collect wildlife images"
- ✓ "DJI Matrice 300 RTK with Zenmuse H20T camera flew grid patterns at 60m AGL"

Missing Geographic Precision

- ❌ "Collected in Tanzania"
- ✓ "Serengeti National Park, Mara Region (-2.3456, 34.8123 ±5m)"

Unclear Sampling Effort

- ❌ "Multiple flights"
- ✓ "45 missions totaling 30 flight hours, covering 2,500 hectares"

Incomplete Species Names

- ❌ "elephants, zebras, giraffes"
- ✓ "Loxodonta africana, Equus quagga, Giraffa camelopardalis"

Undocumented Splits

- ❌ "Split into train/val/test"
- ✓ "Stratified by location and season: missions 1-300 (train), 301-350 (val), 351-400 (test)"

Hidden Biases

- ❌ "Representative wildlife dataset"
- ✓ "Dry season only; large-bodied species overrepresented; morning flights bias against nocturnal species"

Time-Saving Tips

Before Starting:

Gather all information before opening the template
Copy from existing paper methods sections
Look up drone/camera specs online from manufacturer sites

While Completing:

Start with easy sections (Overview, Structure) to build momentum
Mark sections to revisit with TODO notes
Use EXIF data from images for missing camera/GPS information
Give a reasonable estimate when exact values are unavailable, and mark it as an approximation

Automation Opportunities:

Write scripts to extract GPS/altitude from telemetry files (SRT, EXIF, flight logs)
Generate occurrence records from detection annotations
Calculate statistics (image counts, annotation distributions) programmatically
Aggregate frame-level or video-level data to session-level events
Validate Darwin Core format compliance automatically

For Missing Information:

Document what's unavailable rather than leaving blank
Contact original research team if retrofitting dataset
Use "not available" or "not recorded" explicitly

Resources

Validation Tools

Python scripts for template generation and validation (coming soon)
Darwin Core export tools for GBIF submission
HuggingFace conversion utilities

External Resources

Getting Help

!!! question "Questions, Comments, or Concerns?"

For assistance:

Report issues or unclear sections via GitHub Issues
Contribute example cards or improvements
Share feedback on the template

Ready to start?

Download the appropriate template for your task
Gather your information using the checklist above
Set aside 2-4 hours to complete the card, depending on the template
Follow the template section by section
Validate your completed card
Publish your WildFAIRx-compliant dataset!
