Introduction to Time Series Helpers
Generally useful Time Series Helpers
A step-by-step guide with examples to follow along
Description of the ingestion and refinement pipeline for Alloy Environmental Services data
Overview of the Auto-adjusting Budget Alerts
Description of the backdated ingestion process for Liberator data
Explanation of the CD process for the Data Platform
Explanation of the CI process for the Data Platform
How to connect to the Data Platform from Tableau Online using Redshift or Athena
Connect to the Redshift cluster from Google Data Studio
Create an extract to optimise dashboard performance
What is a data catalogue?
An overview of the data lake, its responsibilities, and how data moves through the zones within the data lake
An overview of the data warehouse, its responsibilities, and how data is served from within the Data Platform
Creating Glue jobs in Terraform
Forecast using Exponential Smoothing
Overview of how db snapshots are exported to the Data Platform Landing Zone
How to perform geospatial enrichment on data held in the data platform
Setting up a PySpark environment on a local machine and running Data Platform scripts
A table of terms and tools used by the Data Platform
A guide to continuous data quality testing in Glue Jobs
A beginner's guide to testing on the Data Platform
Forecast using Holt Winters ETS
Ingesting data using AWS Lambda [step-by-step]
Description of how files are imported from Google to S3
Overview of how data is imported from spreadsheets that are stored in G drive
Overview of how files from external suppliers are imported to the Data Platform Landing Zone
Ingest data from CSV files
Description of how spreadsheet files are ingested from G Drive
Overview of how Academy data is ingested onto the Data Platform from MS SQL databases and distributed to Housing Benefits & Needs and Revenues Departments
Ingesting API data into the Data Platform using an AWS Lambda function
Ingesting database tables into the Data Platform using a JDBC Connection
Ingesting tables from a DynamoDB instance into the Data Platform landing zone
Setting up a new Kafka topic to stream events from a new data entity
Ingesting a snapshot of an RDS instance into the Data Platform landing zone
Description of the ingestion process for Liberator data
Local Notebook Environment Setup
How to add a new Google group for a department
How to add users to a Google group
Elements for optimising Glue jobs
Overview of how data is copied from production to pre-production
Prototyping transformation scripts using a Jupyter Notebook
Using AWS Athena to query data in S3
Description of the ingestion process for RingGo data
There are currently four tiers of role within the data platform project, and they are as follows:
Schedule a Glue job to run when new Liberator data is added to the platform
Description of the ingestion and refinement pipeline for Tascomi planning data
Recommendations to write an API ingestion script for a Lambda in the Data Platform
Due to the variety of data sources we have had to develop several different ingestion methods. These methods and the data being ingested are detailed in this section
A guide on how to carry out common tasks in GitHub
Using AWS Glue Studio to create ETL processes
How to use the data catalogue
Use of the watermarks class for recording Glue job states between runs
Overview of the VPC Peering Connection and its purpose