data_engineering   151

Apache NiFi
An easy to use, powerful, and reliable system to process and distribute data.
5 weeks ago by dstarr1
cgarciae/pypeln: Concurrent data pipelines made easy

Pypeline is a simple yet powerful Python library for creating concurrent data pipelines.

- Pypeline was designed for simple, medium-sized data tasks that require concurrency and parallelism but where using a framework like Spark or Dask feels exaggerated or unnatural.
- Pypeline exposes an easy-to-use, familiar, functional API.
- Pypeline enables you to build pipelines using Processes, Threads and asyncio.Tasks via the exact same API.
- Pypeline gives you control over the memory and CPU resources used at each stage of your pipeline.
via:flav  is:repo  data  data_engineering  python  data_pipeline  big_data  async 
11 weeks ago by andrewsardone
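
The pypeln entry above describes pipelines built from stages, each with its own concurrency budget behind one map-style interface. A minimal sketch of that idea follows, using only the Python standard library rather than pypeln itself. The `stage` helper, the `workers` parameter, and the example functions are hypothetical names chosen for illustration, and this sketch only covers threads and per-stage worker counts, not memory limits or streaming between stages.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def stage(fn: Callable[[T], U], items: Iterable[T], workers: int) -> Iterator[U]:
    """Hypothetical helper: apply fn to every item using a thread pool.

    Each stage gets its own pool, so the worker count is set per stage.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order.
        yield from pool.map(fn, items)


def square(x: int) -> int:
    return x * x


def add_one(x: int) -> int:
    return x + 1


def run_pipeline(data: Iterable[int]) -> List[int]:
    # Two chained stages with different worker counts (assumed values).
    squared = stage(square, data, workers=4)
    shifted = stage(add_one, squared, workers=2)
    return list(shifted)


if __name__ == "__main__":
    print(run_pipeline(range(5)))
```

Because `Executor.map` submits all of its inputs up front, the first stage here finishes before the second starts producing results; a dedicated pipeline library would typically overlap the stages instead.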
gaia-pipeline/gaia: Build powerful pipelines in any programming language.
Gaia is an open-source automation platform that makes it easy and fun to build powerful pipelines in any programming language. Based on HashiCorp's go-plugin and gRPC, Gaia is efficient, fast, lightweight and developer-friendly. Gaia is currently in alpha! Do not use it for mission-critical jobs yet!
go  golang  build  ci  devops  data  data_engineering  engineering  development  is:repo 
July 2018 by andrewsardone
