
I was wondering if there are any AWS services or projects that allow us to configure a data pipeline using AWS Lambda functions in code. I am looking for something like the following; assume there is a hypothetical library called pipeline:

# 'lambda' is a reserved word in Python, so the import uses lambda_fn instead
from pipeline import connect, s3, lambda_fn, deploy

p = connect(s3('input-bucket/prefix'),
            lambda_fn(myPythonFunc, dependencies=[list_of_dependencies]),
            s3('output-bucket/prefix'))
deploy(p)

There can be many variations of this idea, of course. This example assumes a single input S3 bucket, for instance; there could instead be a list of input buckets, as in the sketch below.
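For illustration, the same hypothetical API might accept a list of sources; connect, s3, lambda_fn, and deploy are all assumed names from the imaginary pipeline library above:

p = connect([s3('input-bucket-a/prefix'),   # hypothetical: multiple input sources
             s3('input-bucket-b/prefix')],
            lambda_fn(myPythonFunc, dependencies=[list_of_dependencies]),
            s3('output-bucket/prefix'))
deploy(p)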

Can this be done with AWS Data Pipeline? The documentation I have (quickly) read only says that Lambda can be used to trigger a pipeline.

1 Answer


I think the closest thing available is the State Machine functionality in the newly released AWS Step Functions. With these you can coordinate multiple steps that transform your data. I don't believe they support standard event sources, so you would have to create a regular Lambda function (potentially using the Serverless Application Model) to read from S3 and trigger your state machine, as in the sketch below.
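A minimal sketch of that glue function, assuming a Lambda subscribed to S3 object-created events; the state machine ARN and the payload shape are placeholders for illustration:

import json
import boto3

sfn = boto3.client('stepfunctions')

# Placeholder ARN; substitute the ARN of your own state machine.
STATE_MACHINE_ARN = 'arn:aws:states:us-east-1:123456789012:stateMachine:myPipeline'

def handler(event, context):
    # Each record describes one object created in the input bucket.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        # Hand the object's location to the state machine as its input.
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({'bucket': bucket, 'key': key}),
        )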


1 Comment

I think the Serverless Application Model fits what I need. I now have to investigate how to do that in Python :). Thanks!
