AWS Lambda
What is Serverless?
- Function as a Service - FaaS
- 0 cost for deployment
- $ for usage
- No root access, just a simple function (or application)
- No server maintenance costs.
What is it good for?
- For short batch processes (5 min limit)
- ETL
- Real time data stream collection and manipulation
- Scheduled jobs (cron-jobs)
- REST API web endpoints
- Bots
AWS Lambda Limits
- 5 minutes execution
- disk capacity 500 MB
- 1024 threads
- 3 GB memory
- 50 MB deployment zip
Development
- In the Lambda console using Cloud9
- Local environment
- Local environment using SAM
How does FaaS work?
Trigger (Event Source) -> Function -> Resources (output)
Event Sources (Triggers)
- Supported Event Sources
- Amazon API Gateway
- Scheduled Events
- Amazon S3
- Amazon DynamoDB
- Amazon Simple Email Service
- ...
Function
- Python 2.7 and 3.6
- Node.JS 6.10 and 8.10
- Java 8
- C# .NET Core
- Go 1.x
Resources (via IAM)
- See IAM - Identity and Access Management for more resources
- S3
- Amazon DynamoDB
- SES - Simple Email Service
- SNS - Simple Notification Service (Sending SMS)
- Redshift (Data warehouse)
- ElastiCache clusters
- RDS - Relational Database Service
- ...
- External services
Create an AWS account
- Create Free Account
- You will have to supply a credit card, but AWS provides a Free tier.
- You will need to supply a phone number and they will call you.
Start with AWS Lambda
- Log in
- Pick a region: e.g. N. Virginia us-east-1
Task 1 - Hello World URL
- Create a script in Python that can be accessed via curl.
Hello World in AWS Lambda
- Press "Create function"
- Name: "hello"
- Runtime: Python 3.6
- Role: Create new role from template(s)
- Role name: "myrole"
- Policy templates: Basic Edge Lambda Permission
- Press "Create function"
The default code will look like this:
def lambda_handler(event, context):
    # TODO implement
    return 'Hello World!'
- Test it (click on "Test")
- First it will make you create a new test-case.
API Gateway
- "Add triggers" - select API Gateway
- Configure Required
- Create a new API
- API name: demo
- Deployment stage: v0
- Security: Open
- Once we "save" it, we'll be able to see the "Invoke URL"
- https://s94rb025f9.execute-api.us-east-1.amazonaws.com/v0/hello
- curl ...
ERROR 502 Bad Gateway
or
{"message": "Internal server error"}
Add header
- The function needs to return a dictionary with the status code and the headers.
- At least the Content-type.
def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'text/html' },
        'body': 'Hello World!'
    }
- curl ...
Send JSON
import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello World!' })
    }
- curl ...
{"message": "Hello World!"}
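Every handler in these slides builds the same response dictionary by hand. One option is to move that into a small helper. The sketch below assumes a helper called `make_response`; that name is made up for this example and is not part of any AWS library.

```python
import json


def make_response(status_code, payload):
    """Build the dictionary that API Gateway expects back from the function."""
    return {
        'statusCode': status_code,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps(payload),
    }


def lambda_handler(event, context):
    return make_response(200, {'message': 'Hello World!'})
```

The same helper can then be reused for error responses, for example `make_response(400, {'error': '...'})`.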
Event details
import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello World!', 'event': event })
    }
- curl ...
{
"event": {
"body": null,
"headers": {
"Accept": "*/*",
"CloudFront-Forwarded-Proto": "https",
"CloudFront-Is-Desktop-Viewer": "true",
"CloudFront-Is-Mobile-Viewer": "false",
"CloudFront-Is-SmartTV-Viewer": "false",
"CloudFront-Is-Tablet-Viewer": "false",
"CloudFront-Viewer-Country": "IL",
"Host": "qspmdah6oj.execute-api.us-east-1.amazonaws.com",
"User-Agent": "curl/7.54.0",
"Via": "2.0 2905d0bd25e66c3f788fb2134262d52a.cloudfront.net (CloudFront)",
"X-Amz-Cf-Id": "T9sXw9uX439w32EXXkQ13hHLoyQCIdTwPkTwpxAdPe1GMd6SYLOI4w==",
"X-Amzn-Trace-Id": "Root=1-5b2287d5-47b383c8205ec4288b78e2d8",
"X-Forwarded-For": "84.108.218.54, 54.182.243.56",
"X-Forwarded-Port": "443",
"X-Forwarded-Proto": "https"
},
"httpMethod": "GET",
"isBase64Encoded": false,
"path": "/hello",
"pathParameters": null,
"queryStringParameters": null,
"requestContext": {
"accountId": "476778636099",
"apiId": "qspmdah6oj",
"extendedRequestId": "IeopXGY3oAMFTFA=",
"httpMethod": "GET",
"identity": {
"accessKey": null,
"accountId": null,
"caller": null,
"cognitoAuthenticationProvider": null,
"cognitoAuthenticationType": null,
"cognitoIdentityId": null,
"cognitoIdentityPoolId": null,
"sourceIp": "84.108.218.54",
"user": null,
"userAgent": "curl/7.54.0",
"userArn": null
},
"path": "/v0/hello",
"protocol": "HTTP/1.1",
"requestId": "872a7d1f-6fe6-11e8-bec3-9f97ccc17519",
"requestTime": "14/Jun/2018:15:20:53 +0000",
"requestTimeEpoch": 1528989653567,
"resourceId": "rnvjfd",
"resourcePath": "/hello",
"stage": "v0"
},
"resource": "/hello",
"stageVariables": null
},
"message": "Hello World!"
}
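Most of the time only a handful of these fields are needed. A minimal sketch of picking them out, using the keys visible in the event above; the function name `request_summary` and the trimmed-down `sample_event` are made up for this example.

```python
def request_summary(event):
    """Pick a few fields out of an API Gateway event, tolerating missing parts."""
    context = event.get('requestContext') or {}
    identity = context.get('identity') or {}
    headers = event.get('headers') or {}
    return {
        'method': event.get('httpMethod'),
        'path': event.get('path'),
        'stage': context.get('stage'),
        'source_ip': identity.get('sourceIp'),
        'user_agent': headers.get('User-Agent'),
    }


# A trimmed-down copy of the event printed above.
sample_event = {
    'httpMethod': 'GET',
    'path': '/hello',
    'headers': {'User-Agent': 'curl/7.54.0'},
    'requestContext': {
        'stage': 'v0',
        'identity': {'sourceIp': '84.108.218.54'},
    },
}

print(request_summary(sample_event))
```

Using `.get()` with an `or {}` fallback means a test event that lacks these keys yields `None` values instead of a `KeyError`.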
Exercise 1
- Create your own hello function.
Task 2 - Accept URL GET parameters
- Accept parameter in the GET request and echo it back
Accept Parameters
import json

def lambda_handler(event, context):
    name = event['queryStringParameters']['name']
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello {}!'.format(name) })
    }
- Save and Click "Test"
- Observe the error
{
"errorMessage": "'queryStringParameters'",
"errorType": "KeyError",
"stackTrace": [
[
"/var/task/lambda_function.py",
4,
"lambda_handler",
"name = event['queryStringParameters']['name']"
]
]
}
- Our Python code is not safe enough: it assumes that a "name" field is always passed in.
Error via the API
- Before we fix the code, let's see what happens if we access the URL using curl.
- curl ...
{"message": "Internal server error"}
- To see the error log, visit:
- Monitoring
- Invocation errors
- Jump to logs
Test Event for API Gateway
- Before we fix the code, we can fix the test:
{
"queryStringParameters" : {
"name": "Foo Bar"
}
}
- Try this using the "Test" button.
- Also, try it from the console using curl or in your browser (use your own URL).
- curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?name=Foo%20Bar'
{"message": "Hello Foo Bar!"}
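The same test event can also be fed to the handler on your own machine, without the console. A minimal sketch; the handler from the "Accept Parameters" slide is repeated here only so that the snippet runs on its own.

```python
import json


def lambda_handler(event, context):
    name = event['queryStringParameters']['name']
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'message': 'Hello {}!'.format(name)}),
    }


test_event = {'queryStringParameters': {'name': 'Foo Bar'}}

# This handler never touches the context, so None is enough for a local run.
response = lambda_handler(test_event, None)
print(response['statusCode'])
print(response['body'])
```

This only exercises the Python code; it does not go through API Gateway, so the URL still needs to be checked with curl.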
Exercise 2
- Fix the Python code so that it won't crash even if the user does not supply the "name" field.
- Instead, make it return a JSON structure with status "400 Bad Request".
- Use curl -I or curl -D err.txt to check the headers as well.
- Create another function that will accept 2 numbers (parameters a and b) and add them together, returning a JSON that looks like this:
{
    "a" : 23,
    "b" : 19,
    "op" : "+",
    "result" : 42
}
Solution 2 - echo
import json

def lambda_handler(event, context):
    if event['queryStringParameters'] is None or 'name' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing "name" field' })
        }
    name = event['queryStringParameters']['name']
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello {}!'.format(name) })
    }
Solution 2 - add
import json

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }
    result = int(event['queryStringParameters']['a']) + int(event['queryStringParameters']['b'])
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'op' : '+',
            'result': result,
        })
    }
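The solution above assumes that a and b are valid integers. If they are not, int() raises a ValueError and the caller sees "Internal server error". One possible way to answer with 400 instead is sketched below; the helper name `parse_int` is made up for this example.

```python
import json


def parse_int(params, field):
    """Return params[field] as an int, or None if it is missing or not a number."""
    try:
        return int(params[field])
    except (KeyError, TypeError, ValueError):
        return None


def lambda_handler(event, context):
    # queryStringParameters can be absent or None; treat both as "no parameters".
    params = event.get('queryStringParameters') or {}
    a = parse_int(params, 'a')
    b = parse_int(params, 'b')
    if a is None or b is None:
        return {
            'statusCode': 400,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'error': 'Both "a" and "b" must be integers'}),
        }
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'a': a, 'b': b, 'op': '+', 'result': a + b}),
    }
```

As a side effect, a and b are returned as numbers rather than as the strings received in the query string.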
Task 3 - Multi file application
- Create an application that has more than one file.
Multi-file application - json
- Create a file called a.json with some JSON content in it.
{
"name" : "Apple"
}
- Change the code to read the file and return it
import json

def lambda_handler(event, context):
    with open('a.json') as fh:
        data = json.load(fh)
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'data' : data })
    }
Multi-file application - python module
# lambda_function.py
import json
from mymod import add

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'data' : add(19, 23) })
    }

# mymod.py
def add(x, y):
    return x+y
Local development
mkdir project
cd project
vim lambda_function.py
import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'message': 'Hello from a zip file' })
    }
zip ../project.zip *
- Upload a .ZIP file.
- Save.
- Try it using curl.
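If the zip command is not available (for example on Windows), the archive can also be built with Python's standard zipfile module. A sketch under that assumption; the function name `build_zip` is made up for this example.

```python
import os
import zipfile


def build_zip(project_dir, zip_path):
    """Pack every file under project_dir into zip_path.

    zip_path should be outside project_dir, otherwise the archive
    would try to include itself.
    """
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(project_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # Store paths relative to the project directory so that
                # lambda_function.py ends up at the top level of the archive.
                zf.write(full_path, os.path.relpath(full_path, project_dir))
```

Running `build_zip('project', 'project.zip')` from the parent directory of the project should give an archive equivalent to the one created by the zip command above.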
Exercise 3
- Create a 'calculator' application that accepts two numbers 'a' and 'b' and an 'operation' that can be either 'add' or 'multiply'.
- Return the appropriate result.
- Create it on your computer in two files: a main web-serving file and a module with two functions, 'add' and 'multiply'.
- On your local computer create a directory for the project.
- In the directory create a file called 'lambda_function.py'; this will hold the main function.
- Also create a file called 'mymodule.py'.
- Upload the whole thing using zip.
Solution 3
# lambda_function.py
import json
import mymodule

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters'] or 'operation' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }
    operation = event['queryStringParameters']['operation']
    if operation == 'add':
        result = mymodule.add(int(event['queryStringParameters']['a']), int(event['queryStringParameters']['b']))
        op = '+'
    elif operation == 'multiply':
        result = mymodule.multiply(int(event['queryStringParameters']['a']), int(event['queryStringParameters']['b']))
        op = '*'
    else:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Unsupported operation: "{}"'.format(operation) })
        }
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'op' : op,
            'result': result,
        })
    }

# mymodule.py
def add(x, y):
    return x+y

def multiply(x, y):
    return x*y
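The if/elif chain in the solution grows with every new operation. An alternative is a lookup table that maps the name of the operation to its symbol and function. A minimal sketch; `OPERATIONS` and `calculate` are names made up for this example, and the two functions are repeated here only so that the snippet runs on its own.

```python
def add(x, y):
    return x + y


def multiply(x, y):
    return x * y


# Maps the value of the 'operation' parameter to (symbol, function).
OPERATIONS = {
    'add': ('+', add),
    'multiply': ('*', multiply),
}


def calculate(operation, a, b):
    """Return (symbol, result), or None for an unsupported operation."""
    if operation not in OPERATIONS:
        return None
    symbol, func = OPERATIONS[operation]
    return symbol, func(a, b)


print(calculate('add', 19, 23))
print(calculate('multiply', 6, 7))
```

With this layout the handler only needs to check for `None` to decide whether to return the "Unsupported operation" error.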
Task 4 - Use 3rd party Python modules.
Development machine
- Levenshtein distance
- Using pylev which is pure Python.
import json
import sys
sys.path.insert(0, 'pypi')
import pylev

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }
    distance = pylev.levenshtein(event['queryStringParameters']['a'], event['queryStringParameters']['b'])
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'distance': distance,
        })
    }
mkdir app_pylev
cd app_pylev
pip install pylev -t pypi
zip -r ../project.zip *
curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?a=abd&b=acx'
Error: must supply either home or prefix/exec-prefix - not both
On OSX you might get the above error. To work around it, create the 'setup.cfg' file (see src/examples/app_pylev/setup.cfg).
Third party not pure-python
- editdistance is a Levenshtein distance module written in C++ and Cython
import json
import sys
sys.path.insert(0, 'pypi')
import editdistance

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
    if event['queryStringParameters'] is None or 'a' not in event['queryStringParameters'] or 'b' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }
    distance = editdistance.eval(event['queryStringParameters']['a'], event['queryStringParameters']['b'])
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'a' : event['queryStringParameters']['a'],
            'b' : event['queryStringParameters']['b'],
            'editdistance': distance,
        })
    }

if __name__ == '__main__':
    print(lambda_handler({
        'queryStringParameters': {
            'a': 'xyz',
            'b': 'xrp',
        }
    }, {}))
- See src/examples/app_editdistance/requirements.txt
- Needs a Linux box, either locally or on AWS.
Docker to build 3rd party modules
FROM amazonlinux
RUN yum install -y python36
RUN yum install -y findutils which wget
RUN wget https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py
WORKDIR /opt
docker build -t aws .
- In the project directory:
rm -rf pypi
docker run -v $(pwd):/opt --rm aws pip install -r requirements.txt -t pypi
zip -r ../project.zip *
- Upload the zip file.
Exercise 4
- Web Client: A function that uses 'requests' to fetch a URL and return the text of the page.
- A function that will accept the names of two cities.
- Call the API of Open Weather Map and return the temperature difference between the two places.
Solution: Web client
import json
import sys
sys.path.insert(0, 'pypi')
import requests

def lambda_handler(event, context):
    if 'queryStringParameters' not in event:
        return {
            'statusCode': 500,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing queryStringParameters' })
        }
    if event['queryStringParameters'] is None or 'url' not in event['queryStringParameters']:
        return {
            'statusCode': 400,
            'headers': { 'Content-Type': 'application/json' },
            'body': json.dumps({ 'error': 'Missing field' })
        }
    r = requests.get(event['queryStringParameters']['url'])
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({
            'url' : event['queryStringParameters']['url'],
            'content': r.text,
        })
    }
- See src/examples/web_client/requirements.txt
- See src/examples/web_client/setup.cfg
pip install -r requirements.txt -t pypi
zip -r ../project.zip *
curl 'https://qspmdah6oj.execute-api.us-east-1.amazonaws.com/v0/hello?url=https://httpbin.org/get'
Task 5 - Using S3 - Simple Storage Service
- Buckets (all users share a single namespace for buckets, so a lot of names are already taken)
- Create a bucket in S3 (e.g. your-name)
- Add permission to the role: attach the policy AmazonS3FullAccess
S3 List bucket
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('s3')
    response = client.list_objects(
        Bucket='szabgab',
    )
    # 'Contents' is missing from the response when the bucket is empty
    objects = [ o['Key'] for o in response.get('Contents', []) ]
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'objects': objects, 'res': str(response) })
    }
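A single list_objects call returns at most 1000 keys. For larger buckets one option is a boto3 paginator, sketched below. The bucket name is the one used in the slide; the function name `list_all_keys` is made up for this example.

```python
import json


def list_all_keys(client, bucket):
    """Collect every key in the bucket, following pagination."""
    keys = []
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket):
        # 'Contents' is absent from a page that holds no objects.
        for obj in page.get('Contents', []):
            keys.append(obj['Key'])
    return keys


def lambda_handler(event, context):
    # Imported here only so that list_all_keys can be tried with a stand-in
    # client on a machine where boto3 is not installed.
    import boto3

    client = boto3.client('s3')
    objects = list_all_keys(client, 'szabgab')
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'objects': objects}),
    }
```

Passing the client in as a parameter keeps `list_all_keys` easy to try locally with a fake client object.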
S3 write object from Lambda
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('s3')
    res = client.put_object(
        Bucket='szabgab',
        Key='abc.json',
        Body=json.dumps({"name": "Foo"}),
    )
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        'body': json.dumps({ 'res': res })
    }
Read S3 object
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('s3')
    res = client.get_object(
        Bucket='szabgab',
        Key='abc.json',
    )
    #obj = res["Body"] # botocore.response.StreamingBody
    content = res["Body"].read()
    data = json.loads(content)
    return {
        'statusCode': 200,
        'headers': { 'Content-Type': 'application/json' },
        #'body': json.dumps({ 'res': str(res) }),
        #'body': json.dumps({ 'res': str(content) }),
        'body': json.dumps({ 'res': data }),
    }
Trigger Lambda by S3
{
"name" : "Apple"
}
Trigger by using:
aws s3 cp data.json s3://szabgab/new/data.json
Download the resulting file using
aws s3 cp s3://szabgab/out.json .
Handle S3 event in Lambda
import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('s3')
    res = client.get_object(
        Bucket=event["Records"][0]['s3']['bucket']['name'],
        Key=event["Records"][0]['s3']['object']['key'],
    )
    content = res["Body"].read()
    data = json.loads(content)
    res = client.put_object(
        Bucket='szabgab',
        Key='out.json',
        Body=json.dumps({"data" : data, "event": event}),
    )
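The handler above reads only the first record and uses the key as-is. Object keys arrive URL-encoded in the S3 event, so a name containing a space or other special characters needs decoding before it is passed to get_object. A sketch of extracting every bucket/key pair; the function name `s3_objects` and the sample event are made up for this example.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Return a list of (bucket, key) pairs, one for each record in an S3 event."""
    pairs = []
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        # Keys are URL-encoded in the notification, e.g. a space becomes '+'.
        key = unquote_plus(record['s3']['object']['key'])
        pairs.append((bucket, key))
    return pairs


# A trimmed-down S3 event holding only the fields used above.
sample_event = {
    'Records': [
        {'s3': {'bucket': {'name': 'szabgab'}, 'object': {'key': 'new/my+data.json'}}},
    ]
}

print(s3_objects(sample_event))
```

For the sample event this prints `[('szabgab', 'new/my data.json')]`.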
Exercise 5
- Create a function that can be triggered by a URL request passing in parameters like this: ?name="Foo Bar"
- Create an object in S3 called something like "in/time.json" (with the current time)
- Create another function that will be triggered by a new object with the prefix "in/"
- Load that object and create a new object called "out/time.json" using the same object name, but adding 3 new fields to it called:
- new_time: the time portion of the name of the object
- end_time: the time when we read the object in the second function
- elapsed: the difference