Data | Free | Open Source
APACHE NIFI
Data flow automation and pipeline orchestration
Apache-2.0
ABOUT
Engineering teams building ML pipelines must ingest, transform, and route data from dozens of disparate sources (databases, APIs, file shares, message queues, IoT devices) into ML training and inference systems. Manually wiring these integrations is brittle, lacks observability, and makes it impossible to trace data lineage when a model produces unexpected results.
INSTALL
# Download and extract (NiFi 2.x is distributed as a .zip archive and requires Java 21)
curl -sSOL https://dlcdn.apache.org/nifi/2.0.0/nifi-2.0.0-bin.zip
unzip -q nifi-2.0.0-bin.zip
# Start NiFi as a background service
cd nifi-2.0.0 && ./bin/nifi.sh start
INTEGRATION GUIDE
1. Ingest and route streaming data from databases, APIs, and IoT devices into ML training pipelines with automated back-pressure handling
2. Track end-to-end provenance for every piece of data moving through a flow, using the built-in lineage view and provenance reporting
3. Visually design and monitor complex ETL workflows through a drag-and-drop web interface with 300+ built-in processors
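The back-pressure behavior mentioned in item 1 can be illustrated with a small standalone sketch. This is not NiFi code and does not use any NiFi API: it is a loose analogy in plain Python, where a bounded queue stands in for a NiFi connection with a queue-size threshold. The names run_pipeline, max_queued, and consume_delay are hypothetical and chosen only for this example. NiFi itself stops scheduling the upstream component once a connection's threshold is reached; here the producer simply blocks, which has a similar effect of slowing the source to the pace of the consumer.

```python
import queue
import threading
import time


def run_pipeline(num_records, max_queued, consume_delay=0.001):
    """Push records through a bounded queue.

    Returns (consumed_records, peak_queue_depth).
    """
    # Bounded queue: a stand-in for a connection with a size threshold.
    connection = queue.Queue(maxsize=max_queued)
    consumed = []

    def consumer():
        # Slow downstream stage: drains the queue one record at a time.
        while True:
            item = connection.get()
            if item is None:  # sentinel: no more records
                break
            time.sleep(consume_delay)
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    peak_depth = 0
    for record in range(num_records):
        # put() blocks while the queue is full, so the fast producer
        # is throttled instead of piling up unbounded work in memory.
        connection.put(record)
        peak_depth = max(peak_depth, connection.qsize())

    connection.put(None)  # tell the consumer to stop
    worker.join()
    return consumed, peak_depth


if __name__ == "__main__":
    records, peak = run_pipeline(num_records=50, max_queued=5)
    print(f"consumed {len(records)} records, peak queue depth {peak}")
```

The point of the sketch is the bound: however fast the producer runs, the number of queued records never exceeds max_queued, and every record still arrives in order.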
TAGS
data-pipeline, data-integration, etl, workflow-automation, data-engineering