Integrate custom applications with AWS Lake Formation – Part 1

AWS Lake Formation makes it simple to centrally govern, secure, and globally share data for analytics and machine learning (ML).

With Lake Formation, you can centralize data security and governance using the AWS Glue Data Catalog, letting you manage metadata and data permissions in one place with familiar database-style features. It also delivers fine-grained data access control, so you can make sure users have access to the right data down to the row and column level.

Lake Formation also makes it straightforward to share data internally across your organization and externally, which lets you create a data mesh or meet other data sharing needs with no data movement.

Additionally, because Lake Formation tracks data interactions by role and user, it provides comprehensive data access auditing to verify that the right data was accessed by the right users at the right time.

In this two-part series, we show how to integrate custom applications or data processing engines with Lake Formation using the third-party services integration feature.

In this post, we dive deep into the required Lake Formation and AWS Glue APIs. We walk through the steps to enforce Lake Formation policies within custom data applications. As an example, we present a sample Lake Formation integrated application implemented using AWS Lambda.

The second part of the series introduces a sample web application built with AWS Amplify. This web application showcases how to use the custom data processing engine implemented in the first post.

By the end of this series, you will have a comprehensive understanding of how to extend the capabilities of Lake Formation by building and integrating your own custom data processing components.

Integrate an external application

The process of integrating a third-party application with Lake Formation is described in detail in How Lake Formation application integration works.

In this section, we dive deeper into the steps required to establish trust between Lake Formation and an external application, the API operations that are involved, and the AWS Identity and Access Management (IAM) permissions that need to be set up to enable the integration.

Lake Formation application integration external data filtering

In Lake Formation, it's possible to control which third-party engines or applications are allowed to read and filter data in Amazon Simple Storage Service (Amazon S3) locations registered with Lake Formation.

To do so, you can navigate to the Application integration settings page on the Lake Formation console and enable Allow external engines to filter data in Amazon S3 locations registered with Lake Formation, specifying the AWS account IDs from which third-party engines are allowed to access locations registered with Lake Formation. In addition, you have to specify the allowed session tag values used to identify trusted requests. We discuss in later sections how these tags are used.

Lake Formation application integration settings
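If you prefer to script this configuration, the same settings can also be applied through the Lake Formation PutDataLakeSettings API. The following sketch assumes boto3 and the DataLakeSettings fields AllowExternalDataFiltering, ExternalDataFilteringAllowList, and AuthorizedSessionTagValueList; it reads the current settings and patches only the external filtering fields:

```python
def external_filtering_settings(account_id, tag_values):
    """DataLakeSettings fields that allow external engines from the given
    account to filter data in registered S3 locations, identified by the
    listed session tag values."""
    return {
        "AllowExternalDataFiltering": True,
        "ExternalDataFilteringAllowList": [
            {"DataLakePrincipalIdentifier": account_id}
        ],
        "AuthorizedSessionTagValueList": list(tag_values),
    }


def enable_application_integration(account_id, tag_values):
    """Read-modify-write the data lake settings so existing admins and
    defaults are preserved (requires boto3 and Lake Formation admin rights)."""
    import boto3  # deferred import; the helper above stays dependency-free
    lakeformation = boto3.client("lakeformation")
    settings = lakeformation.get_data_lake_settings()["DataLakeSettings"]
    settings.update(external_filtering_settings(account_id, tag_values))
    lakeformation.put_data_lake_settings(DataLakeSettings=settings)
```

For example, enable_application_integration("123456789012", ["application1"]) mirrors the console configuration described above.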

AWS APIs involved in the Lake Formation application integration

The following is a list of the main AWS APIs needed to integrate an application with Lake Formation:

  • sts:AssumeRole – Returns a set of temporary security credentials that you can use to access AWS resources.
  • glue:GetUnfilteredTableMetadata – Allows a third-party analytical engine to retrieve unfiltered table metadata from the Data Catalog.
  • glue:GetUnfilteredPartitionsMetadata – Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.
  • lakeformation:GetTemporaryGlueTableCredentials – Allows a caller in a secure environment to assume a role with permission to access Amazon S3. To vend such credentials, Lake Formation assumes the role associated with a registered location, for example an S3 bucket, with a scope-down policy that restricts the access to a single prefix.
  • lakeformation:GetTemporaryGluePartitionCredentials – This API is identical to GetTemporaryTableCredentials except that it's used when the target Data Catalog resource is of type Partition. Lake Formation restricts the permission of the vended credentials with the same scope-down policy that restricts access to a single Amazon S3 prefix.

Later in this post, we present a sample architecture illustrating how you can use these APIs.

External application and IAM roles to access data

For an external application to access resources in a Lake Formation environment, it needs to run under an IAM principal (user or role) with the appropriate credentials. Let's consider a scenario where the external application runs under the IAM role MyApplicationRole that is part of the AWS account 123456789012.

In Lake Formation, you have granted access to various tables and databases to two specific IAM roles:

  • AccessRole1
  • AccessRole2

To enable MyApplicationRole to access the resources that have been granted to AccessRole1 and AccessRole2, you must configure the trust relationships for these access roles. Specifically, you must configure the following:

  • Allow MyApplicationRole to assume each of the access roles (AccessRole1 and AccessRole2) using the sts:AssumeRole API
  • Allow MyApplicationRole to tag the assumed session with a specific tag, which is required by Lake Formation. The tag key should be LakeFormationAuthorizedCaller, and the value should match one of the session tag values specified on the Application integration settings page on the Lake Formation console (for example, application1).
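The two requirements above can be sketched with boto3 (a minimal example; the session name is illustrative, and the role ARN and tag value must match your own setup):

```python
def session_tags(tag_value):
    """The session tag Lake Formation checks on the assumed session; the
    value must match one configured on the Application integration
    settings page (for example, "application1")."""
    return [{"Key": "LakeFormationAuthorizedCaller", "Value": tag_value}]


def assume_access_role(role_arn, tag_value):
    """sts:AssumeRole plus sts:TagSession in a single call; returns the
    temporary credentials for the data access role."""
    import boto3  # deferred import so session_tags stays dependency-free
    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName="lf-integrated-app",  # illustrative session name
        Tags=session_tags(tag_value),
    )
    return response["Credentials"]
```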

The following code is an example of the trust relationships configuration for an access role (AccessRole1 or AccessRole2):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:role/MyApplicationRole"
            },
            "Action": "sts:AssumeRole"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:role/MyApplicationRole"
            },
            "Action": "sts:TagSession",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/LakeFormationAuthorizedCaller": "application1"
                }
            }
        }
    ]
}

Additionally, the data access IAM roles (AccessRole1 and AccessRole2) must have the following IAM permissions assigned in order to read Lake Formation protected tables:

{
    "Version": "2012-10-17",
    "Statement": {
        "Sid": "LakeFormationManagedAccess",
        "Effect": "Allow",
        "Action": [
            "lakeformation:GetDataAccess",
            "glue:GetTable",
            "glue:GetTables",
            "glue:GetDatabase",
            "glue:GetDatabases",
            "glue:GetPartition",
            "glue:GetPartitions"
        ],
        "Resource": "*"
    }
}

Solution overview

For our solution, Lambda serves as our external trusted engine and application integrated with Lake Formation. This example is provided in order to understand and see in action the access flow and the Lake Formation API responses. Because it's based on a single Lambda function, it's not meant to be used in production settings or with high volumes of data.

Moreover, the Lambda based engine has been configured to support a limited set of data files (CSV, Parquet, and JSON), a limited set of table configurations (no nested data), and a limited set of table operations (SELECT only). Due to these limitations, the application should not be used for arbitrary tests.

In this post, we provide instructions on how to deploy a sample API application integrated with Lake Formation that implements the solution architecture. The core of the API is implemented with a Python Lambda function. We also show how to test the function with Lambda test events. In the second post in this series, we provide instructions on how to deploy a web frontend application that integrates with this Lambda function.

Access flow for unpartitioned tables

The following diagram summarizes the access flow when accessing unpartitioned tables.

Solution Architecture - Unpartitioned tables

The workflow consists of the following steps:

  1. User A (authenticated with Amazon Cognito or other equivalent systems) sends a request to the application API endpoint, requesting access to a specific table within a specific database.
  2. The API endpoint, created with AWS AppSync, handles the request, invoking a Lambda function.
  3. The function checks which IAM data access role the user is mapped to. For simplicity, the example uses a static hardcoded mapping (mappings={ "user1": "lf-app-access-role-1", "user2": "lf-app-access-role-2"}).
  4. The function invokes the sts:AssumeRole API to assume the user-related IAM data access role (for example, lf-app-access-role-1). The AssumeRole operation is performed with the tag LakeFormationAuthorizedCaller, having as its value one of the session tag values specified when configuring the application integration settings in Lake Formation (for example, {'Key': 'LakeFormationAuthorizedCaller','Value': 'application1'}). The API returns a set of temporary credentials, which we refer to as StsCredentials1.
  5. Using StsCredentials1, the function invokes the glue:GetUnfilteredTableMetadata API, passing the requested database and table name. The API returns information like the table location, a list of authorized columns, and data filters, if defined.
  6. Using StsCredentials1, the function invokes the lakeformation:GetTemporaryGlueTableCredentials API, passing the requested database and table name, the type of requested access (SELECT), and CELL_FILTER_PERMISSION as the supported permission types (because the Lambda function implements logic to apply row-level filters). The API returns a set of temporary Amazon S3 credentials, which we refer to as S3Credentials1.
  7. Using S3Credentials1, the function lists the S3 files stored in the table location S3 prefix and downloads them.
  8. The retrieved Amazon S3 data is filtered to remove those columns and rows that the user is not allowed to access (authorized columns and row filters were retrieved in Step 5), and the authorized data is returned to the user.
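Steps 5–8 can be sketched with boto3 as follows. This is a simplified, non-production sketch: error handling, S3 pagination, file parsing, and row filtering are omitted, and the project_columns helper stands in for the column part of the filtering applied in Step 8.

```python
def project_columns(rows, all_columns, authorized_columns):
    """Step 8 (column part): keep only the values of authorized columns."""
    keep = [i for i, name in enumerate(all_columns) if name in authorized_columns]
    return [[row[i] for i in keep] for row in rows]


def read_unpartitioned_table(sts_credentials, region, catalog_id, database, table):
    """Steps 5-7: table metadata, temporary table credentials, S3 listing.
    sts_credentials are the StsCredentials1 returned by sts:AssumeRole."""
    import boto3
    creds = dict(
        aws_access_key_id=sts_credentials["AccessKeyId"],
        aws_secret_access_key=sts_credentials["SecretAccessKey"],
        aws_session_token=sts_credentials["SessionToken"],
        region_name=region,
    )
    # Step 5: authorized columns, cell filters, and table location
    glue = boto3.client("glue", **creds)
    metadata = glue.get_unfiltered_table_metadata(
        CatalogId=catalog_id, DatabaseName=database, Name=table,
        SupportedPermissionTypes=["COLUMN_PERMISSION", "CELL_FILTER_PERMISSION"],
    )
    # Step 6: scoped-down S3 credentials for the table location
    lakeformation = boto3.client("lakeformation", **creds)
    s3_creds = lakeformation.get_temporary_glue_table_credentials(
        TableArn=f"arn:aws:glue:{region}:{catalog_id}:table/{database}/{table}",
        Permissions=["SELECT"],
        SupportedPermissionTypes=["CELL_FILTER_PERMISSION"],
    )
    # Step 7: list the data files under the table location prefix
    location = metadata["Table"]["StorageDescriptor"]["Location"]
    bucket, _, prefix = location.removeprefix("s3://").partition("/")
    s3 = boto3.client(
        "s3",
        aws_access_key_id=s3_creds["AccessKeyId"],
        aws_secret_access_key=s3_creds["SecretAccessKey"],
        aws_session_token=s3_creds["SessionToken"],
        region_name=region,
    )
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    keys = [obj["Key"] for obj in listing.get("Contents", [])]
    return metadata, keys
```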

Access flow for partitioned tables

The following diagram summarizes the access flow when accessing partitioned tables.

Solution Architecture - Partitioned tables

The steps involved are almost identical to the ones presented for unpartitioned tables, with the following changes:

  • After invoking the glue:GetUnfilteredTableMetadata API (Step 5) and identifying the table as partitioned, the Lambda function invokes the glue:GetUnfilteredPartitionsMetadata API using StsCredentials1 (Step 6). The API returns, among other information, the list of partition values and locations.
  • For each partition, the function performs the following actions:
    • Invokes the lakeformation:GetTemporaryGluePartitionCredentials API (Step 7), passing the requested database and table name, the partition value, the type of requested access (SELECT), and CELL_FILTER_PERMISSION as the supported permission types (because the Lambda function implements logic to apply row-level filters). The API returns a set of temporary Amazon S3 credentials, which we refer to as S3CredentialsPartitionX.
    • Uses S3CredentialsPartitionX to list the partition location S3 files and download them (Step 8).
  • The function appends the retrieved data.
  • Before the Lambda function returns the results to the user (Step 9), the retrieved Amazon S3 data is filtered to remove those columns and rows that the user is not allowed to access (authorized columns and row filters were retrieved in Step 5).
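The per-partition loop can be sketched in the same style (same boto3 assumptions as before). The small helper also illustrates why partition key values must be appended to each record: they are encoded in the S3 prefix, not stored in the data files themselves.

```python
def with_partition_values(rows, partition_values):
    """Append the partition key values (taken from the partition metadata)
    to every record read from that partition's files."""
    return [list(row) + list(partition_values) for row in rows]


def partition_credentials(sts_credentials, region, catalog_id, database, table, values):
    """Step 7 (partitioned case): vend S3 credentials scoped to one partition.
    values is the list of partition key values, e.g. ["1990"]."""
    import boto3
    lakeformation = boto3.client(
        "lakeformation",
        aws_access_key_id=sts_credentials["AccessKeyId"],
        aws_secret_access_key=sts_credentials["SecretAccessKey"],
        aws_session_token=sts_credentials["SessionToken"],
        region_name=region,
    )
    return lakeformation.get_temporary_glue_partition_credentials(
        TableArn=f"arn:aws:glue:{region}:{catalog_id}:table/{database}/{table}",
        Partition={"Values": values},
        Permissions=["SELECT"],
        SupportedPermissionTypes=["CELL_FILTER_PERMISSION"],
    )
```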

Prerequisites

The following prerequisites are needed to deploy and test the solution:

  • Lake Formation should be enabled in the AWS Region where the sample application will be deployed
  • The steps must be run with an IAM principal with sufficient permissions to create the needed resources, including Lake Formation databases and tables

Deploy solution resources with AWS CloudFormation

We create the solution resources using AWS CloudFormation. The provided CloudFormation template creates the following resources:

  • One S3 bucket to store table data (lf-app-data-)
  • Two IAM roles, which will be mapped to application users and their associated Lake Formation permission policies (lf-app-access-role-1 and lf-app-access-role-2)
  • Two IAM roles used for the two created Lambda functions (lf-app-lambda-datalake-population-role and lf-app-lambda-role)
  • One AWS Glue database (lf-app-entities) with two AWS Glue tables, one unpartitioned (users_tbl) and one partitioned (users_partitioned_tbl)
  • One Lambda function used to populate the data lake data (lf-app-lambda-datalake-population)
  • One Lambda function used for the Lake Formation integrated application (lf-app-lambda-engine)
  • One IAM role used by Lake Formation to access the table data and perform credentials vending (lf-app-datalake-location-role)
  • One Lake Formation data lake location (s3://lf-app-data-/datasets) associated with the IAM role created for credentials vending (lf-app-datalake-location-role)
  • One Lake Formation data filter (lf-app-filter-1)
  • One Lake Formation tag (key: sensitive, values: true or false)
  • Tag associations to tag the created unpartitioned AWS Glue table (users_tbl) columns with the created tag

To launch the stack and provision your resources, complete the following steps:

  1. Download the code zip bundle for the Lambda function used for the Lake Formation integrated application (lf-integrated-app.zip).
  2. Download the code zip bundle for the Lambda function used to populate the data lake data (datalake-population-function.zip).
  3. Upload the zip bundles to an existing S3 bucket location (for example, s3://mybucket/myfolder1/myfolder2/lf-integrated-app.zip and s3://mybucket/myfolder1/myfolder2/datalake-population-function.zip).
  4. Choose Launch Stack.

This automatically launches AWS CloudFormation in your AWS account with a template. Make sure you create the stack in your intended Region.

  5. Choose Next to move to the Specify stack details section.
  6. For Parameters, provide the following parameters:
    1. For powertoolsLogLevel, specify how verbose the Lambda function logger should be, from the most verbose to the least verbose (no logs). For this post, we choose DEBUG.
    2. For s3DeploymentBucketName, enter the name of the S3 bucket containing the Lambda functions' code zip bundles. For this post, we use mybucket.
    3. For s3KeyLambdaDataPopulationCode, enter the Amazon S3 location containing the code zip bundle for the Lambda function used to populate the data lake data (datalake-population-function.zip). For example, myfolder1/myfolder2/datalake-population-function.zip.
    4. For s3KeyLambdaEngineCode, enter the Amazon S3 location containing the code zip bundle for the Lambda function used for the Lake Formation integrated application (lf-integrated-app.zip). For example, myfolder1/myfolder2/lf-integrated-app.zip.
  7. Choose Next.

CloudFormation Create Stack with properties

  8. Add additional AWS tags if required.
  9. Choose Next.
  10. Acknowledge the final requirements.
  11. Choose Create stack.

Enable the Lake Formation application integration

Complete the following steps to enable the Lake Formation application integration:

  1. On the Lake Formation console, choose Application integration settings in the navigation pane.
  2. Enable Allow external engines to filter data in Amazon S3 locations registered with Lake Formation.
  3. For Session tag values, choose application1.
  4. For AWS account IDs, enter the current AWS account ID.
  5. Choose Save.

Lake Formation application integration settings

Enforce Lake Formation permissions

The CloudFormation stack created one database named lf-app-entities with two tables named users_tbl and users_partitioned_tbl.

To make sure you're using Lake Formation permissions, you should verify that you don't have any grants set up on these tables for the principal IAMAllowedPrincipals. The IAMAllowedPrincipals group includes any IAM users and roles that are allowed access to your Data Catalog resources by your IAM policies, and it's used to maintain backward compatibility with AWS Glue.

To make sure Lake Formation permissions are enforced, navigate to the Lake Formation console and choose Data lake permissions in the navigation pane. Filter permissions by Database = lf-app-entities and remove all the permissions given to the principal IAMAllowedPrincipals.

For more details on IAMAllowedPrincipals and backward compatibility with AWS Glue, refer to Changing the default security settings for your data lake.
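Removing these grants can also be scripted. The following sketch (assuming boto3; it revokes every permission held by IAM_ALLOWED_PRINCIPALS on the database and on its tables) is an alternative to the console steps:

```python
IAM_ALLOWED = "IAM_ALLOWED_PRINCIPALS"


def is_iam_allowed_principals(grant):
    """True when a list_permissions entry belongs to the backward
    compatibility principal."""
    return grant["Principal"]["DataLakePrincipalIdentifier"] == IAM_ALLOWED


def revoke_iam_allowed_principals(database):
    """Revoke all IAM_ALLOWED_PRINCIPALS grants on a database and its tables
    so that only Lake Formation permissions are enforced."""
    import boto3
    lakeformation = boto3.client("lakeformation")
    resources = [
        {"Database": {"Name": database}},
        {"Table": {"DatabaseName": database, "TableWildcard": {}}},
    ]
    for resource in resources:
        page = lakeformation.list_permissions(Resource=resource)
        for grant in page["PrincipalResourcePermissions"]:
            if is_iam_allowed_principals(grant):
                lakeformation.revoke_permissions(
                    Principal=grant["Principal"],
                    Resource=grant["Resource"],
                    Permissions=grant["Permissions"],
                    PermissionsWithGrantOption=grant.get(
                        "PermissionsWithGrantOption", []
                    ),
                )
```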

Review the created Lake Formation resources and permissions

The CloudFormation stack created two IAM roles (lf-app-access-role-1 and lf-app-access-role-2) and assigned them different permissions on the users_tbl (unpartitioned) and users_partitioned_tbl (partitioned) tables. The specific Lake Formation grants are summarized in the following table.

Both tables belong to the lf-app-entities database.

| IAM Role | users_tbl (Table) | users_partitioned_tbl (Table) |
| --- | --- | --- |
| lf-app-access-role-1 | No access | Read access on columns uid, state, and city for all the records; read access to all columns except address only on rows with state=uk |
| lf-app-access-role-2 | Read access on columns with the tag sensitive = false | Read access to all columns and rows |

To better understand the full permissions setup, you should review the Lake Formation resources and permissions created by CloudFormation. On the Lake Formation console, complete the following steps:

  1. Review the data filters:
    1. Choose Data filters in the navigation pane.
    2. Inspect the lf-app-filter-1 filter.
  2. Review the tags:
    1. Choose LF-Tags and permissions in the navigation pane.
    2. Inspect the sensitive tag.
  3. Review the tag associations:
    1. Choose Tables in the navigation pane.
    2. Choose the users_tbl table.
    3. Inspect the LF-Tags associated with the different columns in the Schema section.
  4. Review the Lake Formation permissions:
    1. Choose Data lake permissions in the navigation pane.
    2. Filter by Principal = lf-app-access-role-1 and inspect the assigned permissions.
    3. Filter by Principal = lf-app-access-role-2 and inspect the assigned permissions.

Test the Lambda function

The Lambda function created by the CloudFormation template accepts JSON objects as input events. The JSON events have the following structure:

 {
  "identity": {
    "username": "XXX"
  },
  "fieldName": "YYY",
  "arguments": {
    "AA": "BB",
    ...
  }
}

Although the identity field is always needed in order to identify the caller, depending on the requested operation (fieldName), different arguments should be provided. The following table lists these arguments.

  • getDbs – List databases. Needed arguments: none. Output: the list of databases the user has access to.
  • getTablesByDb – List tables. Needed arguments: db. Output: the list of tables within a database the user has access to.
  • getUnfilteredTableMetadata – Return the table metadata. Needed arguments: db, table. Output: the output of the glue:GetUnfilteredTableMetadata API.
  • getUnfilteredPartitionsMetadata – Return the table partitions metadata. Needed arguments: db, table. Output: the output of the glue:GetUnfilteredPartitionsMetadata API.
  • getTableData – Get table data. Needed arguments: db, table, noOfRecs: N (number of records to pull), nonNullRowsOnly: true/false (true to filter out records with all null values). Output:
    • location – The table location
    • authorizedData – The records of the table the user has access to
    • allColumns – All the columns of the table (returned only for demonstration and comparison purposes)
    • allData – All the records of the table without any filtering (returned only for demonstration and comparison purposes)
    • cellFilters – The Lake Formation filters (applied to allData to return authorizedData)
    • authorizedColumns – The columns the user has access to (projection applied to allData to return authorizedData)
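Besides the Lambda console, you can also build these events and invoke the function programmatically. The following is a minimal sketch, assuming boto3 and the lf-app-lambda-engine function name created by the CloudFormation stack:

```python
import json


def make_event(username, field_name, **arguments):
    """Build an input event in the structure the sample function expects."""
    event = {"identity": {"username": username}, "fieldName": field_name}
    if arguments:
        event["arguments"] = arguments
    return event


def invoke_engine(event, function_name="lf-app-lambda-engine"):
    """Synchronously invoke the engine function and decode its JSON response."""
    import boto3
    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        Payload=json.dumps(event).encode("utf-8"),
    )
    return json.load(response["Payload"])
```

For example, invoke_engine(make_event("user1", "getTableData", db="lf-app-entities", table="users_tbl", noOfRecs=10, nonNullRowsOnly=True)) is equivalent to running the corresponding Lambda console test event.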

To test the Lambda function, you can create some sample Lambda test events. Complete the following steps:

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose the lf-app-lambda-engine function.
  3. On the Test tab, select Create new event.
  4. For Event JSON, enter a valid JSON (we provide some sample JSON events).
  5. Choose Test.

Create a Lambda test event

  6. Inspect the test results (JSON response).

Lambda Test Result

The following are some sample test events you can try to see how different identities can access different sets of records.

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getDbs"
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getDbs"
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getTablesByDb",
  "arguments": {
    "db": "lf-app-entities"
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getTablesByDb",
  "arguments": {
    "db": "lf-app-entities"
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getUnfilteredTableMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl"
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getUnfilteredTableMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl"
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getUnfilteredTableMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl"
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getUnfilteredTableMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl"
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getUnfilteredPartitionsMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl"
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getUnfilteredPartitionsMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl"
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getUnfilteredPartitionsMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl"
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getUnfilteredPartitionsMetadata",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl"
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getTableData",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl",
    "noOfRecs": 10,
    "nonNullRowsOnly": true
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getTableData",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_tbl",
    "noOfRecs": 10,
    "nonNullRowsOnly": true
  }
}

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getTableData",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl",
    "noOfRecs": 10,
    "nonNullRowsOnly": true
  }
}

{
  "identity": {
    "username": "user2"
  },
  "fieldName": "getTableData",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl",
    "noOfRecs": 10,
    "nonNullRowsOnly": true
  }
}

As an example, in the following test, we request users_partitioned_tbl table data in the context of user1:

{
  "identity": {
    "username": "user1"
  },
  "fieldName": "getTableData",
  "arguments": {
    "db": "lf-app-entities",
    "table": "users_partitioned_tbl",
    "noOfRecs": 10,
    "nonNullRowsOnly": true
  }
}

The following is the related API response:

{
  "database": "lf-app-entities",
  "name": "users_partitioned_tbl",
  "location": "s3://lf-app-data-123456789012/datasets/lf-app-entities/users_partitioned/",
  "authorizedColumns": [
    {
      "Name": "born_year",
      "Type": "string"
    },
    {
      "Name": "city",
      "Type": "string"
    },
    {
      "Name": "name",
      "Type": "string"
    },
    {
      "Name": "state",
      "Type": "string"
    },
    {
      "Name": "surname",
      "Type": "string"
    },
    {
      "Name": "uid",
      "Type": "int"
    }
  ],
  "authorizedData": [
    [
      "1980",
      "bristol",
      "emily",
      "united kingdom",
      "brown",
      4
    ],
    [
      "1980",
      "vancouver",
      "",
      "canada",
      "",
      5
    ],
    [
      "1980",
      "madrid",
      "",
      "spain",
      "",
      6
    ],
    [
      "1980",
      "mexico city",
      "",
      "mexico",
      "",
      10
    ],
    [
      "1980",
      "zurich",
      "",
      "switzerland",
      "",
      11
    ],
    [
      "1980",
      "buenos aires",
      "",
      "argentina",
      "",
      12
    ],
    [
      "1990",
      "london",
      "john",
      "united kingdom",
      "pike",
      1
    ],
    [
      "1990",
      "milan",
      "",
      "italy",
      "",
      2
    ],
    [
      "1990",
      "berlin",
      "",
      "germany",
      "",
      3
    ],
    [
      "1990",
      "munich",
      "",
      "germany",
      "",
      7
    ]
  ],
  "allColumns": [
    {
      "Name": "address",
      "Type": "string"
    },
    {
      "Name": "born_year",
      "Type": "string"
    },
    {
      "Name": "city",
      "Type": "string"
    },
    {
      "Name": "name",
      "Type": "string"
    },
    {
      "Name": "state",
      "Type": "string"
    },
    {
      "Name": "surname",
      "Type": "string"
    },
    {
      "Name": "uid",
      "Type": "int"
    }
  ],
  "allData": [
    [
      "beautiful avenue 123",
      "1980",
      "bristol",
      "emily",
      "united kingdom",
      "brown",
      4
    ],
    [
      "lake street 45",
      "1980",
      "vancouver",
      "david",
      "canada",
      "lee",
      5
    ],
    [
      "plaza principal 6",
      "1980",
      "madrid",
      "sophia",
      "spain",
      "luz",
      6
    ],
    [
      "avenida de arboles 40",
      "1980",
      "mexico city",
      "olivia",
      "mexico",
      "garcia",
      10
    ],
    [
      "pflanzenstrasse 34",
      "1980",
      "zurich",
      "lucas",
      "switzerland",
      "fischer",
      11
    ],
    [
      "avenida de luces 456",
      "1980",
      "buenos aires",
      "isabella",
      "argentina",
      "afortunado",
      12
    ],
    [
      "hidden road 78",
      "1990",
      "london",
      "john",
      "united kingdom",
      "pike",
      1
    ],
    [
      "via degli alberi 56A",
      "1990",
      "milan",
      "mario",
      "italy",
      "rossi",
      2
    ],
    [
      "green road 90",
      "1990",
      "berlin",
      "july",
      "germany",
      "finn",
      3
    ],
    [
      "parkstrasse 789",
      "1990",
      "munich",
      "oliver",
      "germany",
      "schmidt",
      7
    ]
  ],
  "filteredCellPh": "",
  "cellFilters": [
    {
      "ColumnName": "born_year",
      "RowFilterExpression": "TRUE"
    },
    {
      "ColumnName": "city",
      "RowFilterExpression": "TRUE"
    },
    {
      "ColumnName": "name",
      "RowFilterExpression": "state='united kingdom'"
    },
    {
      "ColumnName": "state",
      "RowFilterExpression": "TRUE"
    },
    {
      "ColumnName": "surname",
      "RowFilterExpression": "state='united kingdom'"
    },
    {
      "ColumnName": "uid",
      "RowFilterExpression": "TRUE"
    }
  ]
}

To troubleshoot the Lambda function, you can navigate to the Monitoring tab, choose View CloudWatch logs, and inspect the latest log stream.

Clean up

If you plan to explore Part 2 of this series, you can skip this part, because you will need the resources created here. You can come back to this section at the end of your testing.

Complete the following steps to remove the resources you created following this post and avoid incurring additional costs:

  1. On the AWS CloudFormation console, choose Stacks in the navigation pane.
  2. Choose the stack you created and choose Delete.

Additional considerations

In the proposed architecture, Lake Formation permissions were granted to specific IAM data access roles that the requesting users (for example, the identity field) were mapped to. Another possibility is to assign permissions in Lake Formation to SAML users and groups and then work with the AssumeDecoratedRoleWithSAML API.
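With that approach, the application exchanges the user's SAML assertion for credentials already decorated with the Lake Formation permissions of the SAML user and groups, instead of maintaining a per-user IAM role mapping. The following is a hedged sketch (assuming boto3; the role ARN and identity provider ARN are illustrative placeholders):

```python
CREDENTIAL_KEYS = ("AccessKeyId", "SecretAccessKey", "SessionToken")


def pick_credentials(response):
    """Keep only the credential fields from the API response."""
    return {key: response[key] for key in CREDENTIAL_KEYS}


def assume_decorated_role(saml_assertion, role_arn, principal_arn):
    """Vend credentials decorated with the Lake Formation permissions
    granted to the SAML user and groups contained in the assertion."""
    import boto3
    lakeformation = boto3.client("lakeformation")
    response = lakeformation.assume_decorated_role_with_saml(
        SAMLAssertion=saml_assertion,   # base64-encoded assertion from your IdP
        RoleArn=role_arn,               # an IAM role trusted by the IdP (illustrative)
        PrincipalArn=principal_arn,     # the IAM SAML provider ARN (illustrative)
        DurationSeconds=3600,
    )
    return pick_credentials(response)
```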

Conclusion

In the first part of this series, we explored how to integrate custom applications and data processing engines with Lake Formation. We delved into the required configuration, APIs, and steps to enforce Lake Formation policies within custom data applications. As an example, we presented a sample Lake Formation integrated application built on Lambda.

The information provided in this post can serve as a foundation for creating your own custom applications or data processing engines that need to operate on a Lake Formation protected data lake.

Refer to the second part of this series to see how to build a sample web application that uses the Lambda based Lake Formation application.


About the Authors

Stefano Sandonà is a Senior Big Data Specialist Solution Architect at AWS. Passionate about data, distributed systems, and security, he helps customers worldwide architect high-performance, efficient, and secure data platforms.

Francesco Marelli is a Principal Solutions Architect at AWS. He specializes in the design, implementation, and optimization of large-scale data platforms. Francesco leads the AWS Solutions Architect (SA) analytics team in Italy. He loves sharing his professional knowledge and is a frequent speaker at AWS events. Francesco is also passionate about music.
