AWS Lambda

What is Serverless?

  • Function as a Service - FaaS

  • No cost for deployment

  • Pay only for usage

  • No root access, just a simple function (or application)

  • No server maintenance costs.

What is it good for?

  • For short batch processes (5 min limit)
  • ETL
  • Real time data stream collection and manipulation
  • Scheduled jobs (cron-jobs)
  • REST API web endpoints
  • Bots

AWS Lambda Limits

  • 5 minutes execution

  • 512 MB disk capacity (/tmp)

  • 1024 threads

  • 3 GB memory

  • 50 MB deployment zip

Development

  • In the Lambda console using Cloud9
  • Local environment
  • Local environment using SAM

How does FaaS work?

Trigger (Event Source) -> Function -> Resources (output)

Event Sources (Triggers)

  • Supported Event Sources
  • Amazon API Gateway
  • Scheduled Events
  • Amazon S3
  • Amazon DynamoDB
  • Amazon Simple Email Service
  • ...

Function

  • Python 2.7 and 3.6
  • Node.js 6.10 and 8.10
  • Java 8
  • C# .NET Core
  • Go 1.x

Resources (via IAM)

  • See IAM - Identity and Access Management for more resources
  • S3
  • Amazon DynamoDB
  • SES - Simple Email Service
  • SNS - Simple Notification Service (Sending SMS)
  • Redshift (Data warehouse)
  • ElastiCache clusters
  • RDS - Relational Database Service
  • ...
  • External services

Create an AWS account

  • Create Free Account
  • You will have to supply a credit card, but AWS provides a Free tier.
  • You will need to supply a phone number and they will call you.

Start with AWS Lambda

Task 1 - Hello World URL

  • Create a script in Python that can be accessed via curl.

Hello World in AWS Lambda

  • Press "Create function"

  • Name: "hello"

  • Runtime: Python 3.6

  • Role: Create new role from template(s)

  • Role name: "myrole"

  • Policy templates: Basic Edge Lambda Permission

  • Press "Create function"

The default code will look like this:

def lambda_handler(event, context):
    # TODO implement
    return 'Hello World!'

  • Test it (click on "Test").
  • The first time, it will make you create a new test event.

API Gateway

  • "Add triggers" - select API Gateway

  • Configure Required

  • Create a new API

  • API name: demo

  • Deployment stage: v0

  • Security: Open

  • Once we "save" it, we'll be able to see the "Invoke URL"

  • https://s94rb025f9.execute-api.us-east-1.amazonaws.com/v0/hello

  • curl ...

ERROR 502 Bad Gateway

or

{"message": "Internal server error"}

Add header

  • The function needs to return a dictionary with the status code, the headers, and the body.
  • The headers should include at least the Content-Type.

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'text/html' },
        'body': 'Hello World!'
    }
  • curl ...

Send JSON

import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello World!' })
    }
  • curl ...
{"message": "Hello World!"}

Event details

import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello World!', 'event': event })
    }
  • curl ...
{
    "event": {
        "body": null,
        "headers": {
            "Accept": "*/*",
            "CloudFront-Forwarded-Proto": "https",
            "CloudFront-Is-Desktop-Viewer": "true",
            "CloudFront-Is-Mobile-Viewer": "false",
            "CloudFront-Is-SmartTV-Viewer": "false",
            "CloudFront-Is-Tablet-Viewer": "false",
            "CloudFront-Viewer-Country": "IL",
            "Host": "qspmdah6oj.execute-api.us-east-1.amazonaws.com",
            "User-Agent": "curl/7.54.0",
            "Via": "2.0 2905d0bd25e66c3f788fb2134262d52a.cloudfront.net (CloudFront)",
            "X-Amz-Cf-Id": "T9sXw9uX439w32EXXkQ13hHLoyQCIdTwPkTwpxAdPe1GMd6SYLOI4w==",
            "X-Amzn-Trace-Id": "Root=1-5b2287d5-47b383c8205ec4288b78e2d8",
            "X-Forwarded-For": "84.108.218.54, 54.182.243.56",
            "X-Forwarded-Port": "443",
            "X-Forwarded-Proto": "https"
        },
        "httpMethod": "GET",
        "isBase64Encoded": false,
        "path": "/hello",
        "pathParameters": null,
        "queryStringParameters": null,
        "requestContext": {
            "accountId": "476778636099",
            "apiId": "qspmdah6oj",
            "extendedRequestId": "IeopXGY3oAMFTFA=",
            "httpMethod": "GET",
            "identity": {
                "accessKey": null,
                "accountId": null,
                "caller": null,
                "cognitoAuthenticationProvider": null,
                "cognitoAuthenticationType": null,
                "cognitoIdentityId": null,
                "cognitoIdentityPoolId": null,
                "sourceIp": "84.108.218.54",
                "user": null,
                "userAgent": "curl/7.54.0",
                "userArn": null
            },
            "path": "/v0/hello",
            "protocol": "HTTP/1.1",
            "requestId": "872a7d1f-6fe6-11e8-bec3-9f97ccc17519",
            "requestTime": "14/Jun/2018:15:20:53 +0000",
            "requestTimeEpoch": 1528989653567,
            "resourceId": "rnvjfd",
            "resourcePath": "/hello",
            "stage": "v0"
        },
        "resource": "/hello",
        "stageVariables": null
    },
    "message": "Hello World!"
}
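
The event shown above arrives in the function as an ordinary Python dictionary, so single fields can be picked out of it. A minimal sketch, assuming the API Gateway event layout printed above; the helper name summarize_request is hypothetical and not part of any AWS library:

```python
import json


def summarize_request(event):
    """Pick a few commonly used fields out of an API Gateway event."""
    # Any of these keys may be missing or None in a hand-written test event.
    headers = event.get('headers') or {}
    identity = (event.get('requestContext') or {}).get('identity') or {}
    return {
        'method': event.get('httpMethod'),
        'path': event.get('path'),
        'user_agent': headers.get('User-Agent'),
        'source_ip': identity.get('sourceIp'),
    }


def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps(summarize_request(event))
    }
```

Using .get() with a fallback keeps the function from crashing when the "Test" button sends an event that lacks some of these keys.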

Exercise 1

  • Create your own hello function.

Task 2 - Accept URL GET parameters

  • Accept parameter in the GET request and echo it back

Accept Parameters

import json

def lambda_handler(event, context):
    name = event['queryStringParameters']['name']
    
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello {}!'.format(name) })
    }

  • Save and Click "Test"
  • Observe the error
{
  "errorMessage": "'queryStringParameters'",
  "errorType": "KeyError",
  "stackTrace": [
    [
      "/var/task/lambda_function.py",
      4,
      "lambda_handler",
      "name = event['queryStringParameters']['name']"
    ]
  ]
}
  • Our Python code is not safe enough: it assumes that "queryStringParameters" and its "name" field are always passed in.

Error via the API

  • Before we fix the code, let's see what happens if we access the URL using curl.

  • curl ...

{"message": "Internal server error"}
  • To see the error log, visit:

  • Monitoring

  • Invocation errors

  • Jump to logs
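
What the function prints or logs ends up in those logs, which makes it easier to find out why a request failed. A minimal sketch using the standard logging module; the exact messages and the 'received' field are only illustrations:

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    # Record the incoming event so it can be inspected in the logs later.
    # default=str guards against values that json cannot serialize.
    logger.info('event: %s', json.dumps(event, default=str))

    params = event.get('queryStringParameters') or {}
    logger.info('query parameters: %s', params)

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'received': sorted(params) })
    }
```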

Test Event for API Gateway

  • Before we fix the code, we can fix the test:
{
    "queryStringParameters" : {
        "name": "Foo Bar"
    }
}
  • Try this using the "Test" button.

Also, try it from the console using curl or in your browser (use your own URL).

  • curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?name=Foo%20Bar'
{"message": "Hello Foo Bar!"}
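
The same test event can also be used on your own computer, without the Lambda console. A minimal sketch: the handler is the one from "Accept Parameters", and the event passed in is the test event shown above. Passing None as the context is an assumption that only holds because this handler never uses it.

```python
import json


def lambda_handler(event, context):
    name = event['queryStringParameters']['name']

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello {}!'.format(name) })
    }


if __name__ == '__main__':
    # The same structure that was entered as the test event in the console.
    test_event = {
        'queryStringParameters': {
            'name': 'Foo Bar'
        }
    }
    print(lambda_handler(test_event, None))
```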

Exercise 2

  • Fix the Python code so even if the user does not supply the "name" field it still won't crash.

  • Instead make it return a JSON structure with status "400 Bad Request"

  • Use curl -I or curl -D err.txt to check the headers as well.

  • Create another function that accepts two numbers (parameters a and b), adds them together, and returns JSON that looks like this:

{
   "a" : 23,
   "b" : 19,
   "op" : "+",
   "result" : 42
}

Solution 2 - echo

import json

def lambda_handler(event, context):
    # queryStringParameters is None when the URL has no query string,
    # and may be missing entirely in a hand-written test event.
    params = event.get('queryStringParameters') or {}
    if 'name' not in params:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing "name" field' })
        }

    name = params['name']

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello {}!'.format(name) })
    }

Solution 2 - add

import json

def lambda_handler(event, context):
    # API Gateway always sends this key; if it is missing the trigger is misconfigured.
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }

    # The value is None when the URL has no query string.
    params = event['queryStringParameters'] or {}
    if 'a' not in params or 'b' not in params:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }

    # Query string values arrive as strings; convert them so that the
    # response holds numbers, as in the exercise.
    a = int(params['a'])
    b = int(params['b'])

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : a,
            'b' : b,
            'op' : '+',
            'result': a + b,
        })
    }
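
The solution above still crashes if a or b is not a number (for example ?a=x&b=2), because int() raises ValueError and the caller only sees "Internal server error". A possible sketch of returning "400 Bad Request" in that case as well; the helpers respond and parse_int are hypothetical names introduced only for this example:

```python
import json


def respond(status, payload):
    """Build the dictionary API Gateway expects (hypothetical helper)."""
    return {
        'statusCode': status,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps(payload)
    }


def parse_int(params, name):
    """Return (value, None) on success or (None, error message) on failure."""
    if name not in params:
        return None, 'Missing field "{}"'.format(name)
    try:
        return int(params[name]), None
    except (TypeError, ValueError):
        return None, 'Field "{}" is not an integer'.format(name)


def lambda_handler(event, context):
    params = event.get('queryStringParameters') or {}
    values = {}
    for name in ('a', 'b'):
        value, error = parse_int(params, name)
        if error:
            return respond(400, { 'error': error })
        values[name] = value

    return respond(200, {
        'a': values['a'],
        'b': values['b'],
        'op': '+',
        'result': values['a'] + values['b'],
    })
```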

Task 3 - Multi file application

  • Create an application that has more than one file.

Multi-file application - json

  • Create a file called a.json with some JSON content in it.
{
    "name" : "Apple"
}
  • Change the code to read the file and return it
import json

def lambda_handler(event, context):
    with open('a.json') as fh:
       data = json.load(fh)
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'data' : data })
    }
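
open('a.json') looks for the file relative to the current working directory. If you also want to run the code on your own computer from some other directory, one option is to build the path from the location of the Python file. A sketch; the helper name load_json and its base_dir parameter are hypothetical:

```python
import json
import os


def load_json(filename, base_dir=None):
    """Load a JSON file that sits next to this Python file."""
    if base_dir is None:
        # Directory of the file that contains this function.
        base_dir = os.path.dirname(os.path.abspath(__file__))
    with open(os.path.join(base_dir, filename)) as fh:
        return json.load(fh)


def lambda_handler(event, context):
    data = load_json('a.json')
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'data' : data })
    }
```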

Multi-file application - python module

# lambda_function.py
import json
from mymod import add

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'data' : add(19, 23) })
    }

# mymod.py
def add(x, y):
    return x + y

Local development

mkdir project
cd project
vim lambda_function.py

import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello from a zip file' })
    }

zip ../project.zip *

  • Upload a .ZIP file.
  • Save.
  • Try it using curl.
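
If the zip command is not available, the archive can also be created with Python's standard zipfile module. A minimal sketch; the function name build_zip is hypothetical. As with the zip command above, the archive should be written outside of the project directory, otherwise it would end up containing itself.

```python
import os
import zipfile


def build_zip(project_dir, zip_path):
    """Zip every file under project_dir, storing paths relative to it."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(project_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # lambda_function.py has to sit at the top level of the archive,
                # so store each path relative to the project directory.
                arcname = os.path.relpath(full_path, project_dir)
                zf.write(full_path, arcname)
    return zip_path


if __name__ == '__main__':
    # Run from inside the project directory, like the zip command above.
    build_zip('.', os.path.join('..', 'project.zip'))
```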

Exercise 3

  • Create a 'calculator' application that accepts two numbers 'a' and 'b' and an 'operation' that can be either 'add' or 'multiply'.

  • Return the appropriate result.

  • Create it on your computer in two files: a main web-serving file and a module with two functions, 'add' and 'multiply'.

  • On your local computer create a directory for a project.

  • In the directory create a file called 'lambda_function.py'; this will hold the main function.

  • Also create a file called 'mymodule.py'.

  • Upload the whole thing using zip.

Solution 3

# lambda_function.py
import json
import mymodule

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }

    # The value is None when the URL has no query string.
    params = event['queryStringParameters'] or {}
    if 'a' not in params or 'b' not in params or 'operation' not in params:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }

    operation = params['operation']
    if operation == 'add':
        func = mymodule.add
        op = '+'
    elif operation == 'multiply':
        func = mymodule.multiply
        op = '*'
    else:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Unsupported operation: "{}"'.format(operation) })
        }

    # Query string values arrive as strings; convert them before calculating.
    a = int(params['a'])
    b = int(params['b'])

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : a,
            'b' : b,
            'op' : op,
            'result': func(a, b),
        })
    }

# mymodule.py
def add(x, y):
    return x + y

def multiply(x, y):
    return x * y
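
As the number of operations grows, the if/elif chain can be replaced by a lookup table. A sketch of that alternative using the standard operator module instead of the hand-written functions; the names OPERATIONS and calculate are hypothetical:

```python
import operator

# Maps the value of the 'operation' parameter to its symbol and function.
OPERATIONS = {
    'add': ('+', operator.add),
    'multiply': ('*', operator.mul),
}


def calculate(operation, a, b):
    """Return (symbol, result); raise KeyError for an unknown operation."""
    symbol, func = OPERATIONS[operation]
    return symbol, func(a, b)
```

The handler can catch the KeyError and answer with "400 Bad Request", the same way the solution above handles an unsupported operation.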

Task 4 - Use 3rd party Python modules.

Development machine

import json
import sys

sys.path.insert(0, 'pypi')
import pylev

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
        
    
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }

    distance = pylev.levenshtein(event['queryStringParameters']['a'], event['queryStringParameters']['b'])
    

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'distance': distance,
        })
    }

mkdir app_pylev
cd app_pylev
pip install pylev -t pypi
zip -r ../project.zip *
  • curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?a=abd&b=acx'

Error: must supply either home or prefix/exec-prefix - not both

On OSX you might get the above error. Create the 'setup.cfg' file.

See the file src/examples/app_pylev/setup.cfg.

Third party not pure-python

  • editdistance is a Levenshtein distance module written in C++ and Cython.
import json
import sys

sys.path.insert(0, 'pypi')
import editdistance

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
        
    
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }

    distance = editdistance.eval(event['queryStringParameters']['a'], event['queryStringParameters']['b'])

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'editdistance': distance,
        })
    }

if __name__ == '__main__':
    print(lambda_handler({
       'queryStringParameters': {
          'a': 'xyz',
          'b': 'xrp',
       }
    }, {}))

See the file src/examples/app_editdistance/requirements.txt.

  • Building it needs a Linux box, either locally or on AWS, because the compiled module has to match the platform Lambda runs on.

Docker to build 3rd party modules

amazonlinux

FROM amazonlinux
RUN yum install -y python36
RUN yum install -y findutils which wget

RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py

WORKDIR /opt

docker build -t aws .
  • In the project directory:
rm -rf pypi
docker run -v "$(pwd)":/opt --rm aws pip install -r requirements.txt -t pypi
zip -r ../project.zip *
  • Upload the zip file.

Exercise 4

  • Web Client: A function that uses 'requests' to fetch a URL and return the text of the page.

  • A function that accepts the names of two cities.

  • It calls the API of Open Weather Map and returns the temperature difference between the two places.

Solution: Web client

import json
import sys

sys.path.insert(0, 'pypi')
import requests

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
        
    
    if event['queryStringParameters'] is None or 'url' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }

    r = requests.get(event['queryStringParameters']['url'])

    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'url' : event['queryStringParameters']['url'],
            'content': r.text,
        })
    }

See the files src/examples/web_client/requirements.txt and src/examples/web_client/setup.cfg.

pip install -r requirements.txt -t pypi
zip -r ../project.zip *
curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?url=https://httpbin.org/get'
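
The solution passes whatever the caller sends in the url parameter straight to requests.get. A value without a scheme (for example ?url=httpbin.org/get) makes requests raise an exception, and the caller only sees "Internal server error". A sketch of checking the URL first with the standard library, so the function can answer with "400 Bad Request" instead; the name is_fetchable is hypothetical:

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = ('http', 'https')


def is_fetchable(url):
    """Accept only absolute http(s) URLs that name a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)
```

In the handler this check would go right before the call to requests.get, returning status code 400 when it fails.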

Task 5 - Using S3 - Simple Storage Service

  • Buckets: all users share a single namespace for bucket names, so many names are already taken.

  • Create a bucket in S3 (e.g. your-name).

  • Add permission to the role: attach the policy AmazonS3FullAccess.

  • boto3

S3 List bucket

import json
import boto3


def lambda_handler(event, context):
    client = boto3.client('s3')
    response = client.list_objects(
        Bucket='szabgab',
    )
    # 'Contents' is missing from the response when the bucket is empty.
    objects = [ o['Key'] for o in response.get('Contents', []) ]
    
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'objects': objects, 'res': str(response) })
    }
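
A single list_objects call returns at most 1000 keys. For larger buckets boto3 offers paginators. A sketch that collects all the keys; the function name list_all_keys is hypothetical, and the S3 client is passed in as a parameter so the function can be tried without AWS credentials:

```python
def list_all_keys(client, bucket):
    """Collect every key in the bucket, following pagination."""
    paginator = client.get_paginator('list_objects_v2')
    keys = []
    for page in paginator.paginate(Bucket=bucket):
        # 'Contents' is missing from a page that holds no objects.
        for obj in page.get('Contents', []):
            keys.append(obj['Key'])
    return keys
```

Inside the Lambda function it would be called as list_all_keys(boto3.client('s3'), 'szabgab').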

S3 write object from Lambda

import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('s3')
    
    res = client.put_object(
        Bucket='szabgab',
        Key='abc.json',
        Body=json.dumps({"name": "Foo"}),
    )
    
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'res': res })
    }

Read S3 object

import json
import boto3


def lambda_handler(event, context):
    client = boto3.client('s3')

    res = client.get_object(
        Bucket='szabgab',
        Key='abc.json',
    )
    #obj = res["Body"] # botocore.response.StreamingBody
    content = res["Body"].read()
    data = json.loads(content)
    
    
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        #'body': json.dumps({ 'res': str(res) }),
        #'body': json.dumps({ 'res': str(content) }),
        'body': json.dumps({ 'res': data }),

    }

Trigger Lambda by S3

{
    "name" : "Apple"
}

Trigger by using:

aws s3 cp data.json s3://szabgab/new/data.json

Download the resulting file using

aws s3 cp s3://szabgab/out.json .

Handle S3 event in Lambda

import json
from urllib.parse import unquote_plus

import boto3


def lambda_handler(event, context):
    client = boto3.client('s3')

    record = event["Records"][0]
    res = client.get_object(
        Bucket=record['s3']['bucket']['name'],
        # Object keys arrive URL-encoded in the event (a space becomes '+').
        Key=unquote_plus(record['s3']['object']['key']),
    )
    content = res["Body"].read()
    data = json.loads(content)

    res = client.put_object(
        Bucket='szabgab',
        Key='out.json',
        Body=json.dumps({"data" : data, "event": event}),
    )
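
The event holds a list under "Records", and the handler above reads only its first entry. A sketch of a helper that walks over all the records and returns the bucket name and the decoded key of each one; the name s3_objects is hypothetical:

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) for every record in an S3 event."""
    for record in event.get('Records', []):
        s3 = record['s3']
        # Keys arrive URL-encoded in the event (a space becomes '+').
        yield s3['bucket']['name'], unquote_plus(s3['object']['key'])
```

In the handler, a loop such as "for bucket, key in s3_objects(event):" can then call get_object once per uploaded object.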

Exercise 5

  • Create a function that can be triggered by a URL request passing in parameters like this: ?name="Foo Bar"

  • Create an object in S3 called something like "in/time.json" (with the current time)

  • Create another function that will be triggered by a new object with the prefix "in/"

  • Load that object and create a new object called "out/time.json" using the same object name, but adding 3 new fields to it:

new_time:  the time portion of the name of the object
end_time:  the time when we read the object in the second function.
elapsed:   the difference.
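
The time arithmetic for the three fields could look something like this. A sketch only, assuming that times are stored as seconds since the epoch (as returned by time.time()); the function name add_timing is hypothetical and the S3 calls are left out:

```python
import time


def add_timing(data, start_time, end_time=None):
    """Return a copy of data with the three timing fields added."""
    if end_time is None:
        # The moment the second function read the object.
        end_time = time.time()
    result = dict(data)
    result['new_time'] = start_time
    result['end_time'] = end_time
    result['elapsed'] = end_time - start_time
    return result
```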

AWS Resources

Resources