About, Vitor Garbim

About

A builder who happens to do data at scale.

Multilingual senior data engineer with 19+ years building scalable data infrastructure across the tech, financial, and insurance industries. I design cloud-native data lakes, real-time and batch pipelines, and AI-powered metadata platforms on the AWS ecosystem.

At Amazon I led the architecture of a 70+ PB centralized data lake built with AWS Glue (Spark), Lake Formation and Athena, supporting 6,000+ internal users. Certified for Red Security (highly critical data), it replaced a legacy ETL system and introduced cost-effective scalability, observability, and cross-team data transparency.

My work ranges from advanced data modeling and warehousing on Redshift and Lake Formation to real-time pipelines with Kinesis, Glue Streaming and Lambda, integrating RDS, DynamoDB and OpenSearch, with Airflow for orchestration. I led a Data Mesh rollout on AWS DataZone, enhanced with LLMs and machine learning to curate datasets, standardize metrics, and enable self-service discovery across Amazon Operations.

On MLOps, I have delivered infrastructure for ML applications on SageMaker, Bedrock and Kendra, supporting scalable training, secure deployment, and intelligent search. I am also a dedicated mentor and hiring contributor, supporting the talent pipeline through engineering mentorship and technical interviews. Off the clock, I tinker in a homelab and answer to Freddie.

Career

Amazon

Full-time · 7 yrs 2 mos

Senior Data Engineer · OTS DataTech

May 2024 to Present · 2 yrs 2 mos

Austin, Texas · Hybrid

Led the architecture and delivery of a 70+ PB centralized data lake on AWS Glue, Redshift, Lake Formation and Athena (Presto), supporting over 6,000 internal users. Replaced legacy Redshift ETL pipelines, earned Red Security certification for highly critical data, and cut storage and compute costs by 40%. Spearheaded a Data Mesh rollout on AWS DataZone for a governed, self-service business catalog, and mentored engineers across teams on architecture, code quality and operational excellence.

Senior Data Engineer (DE) · GRMC

Feb 2022 to May 2024 · 2 yrs 4 mos

Miami, Florida · Remote

Built technologies to streamline data consumption across Property and Casualty insurance and risk, using AWS CDK, Glue, Lambda, ECS Fargate, ECR, Transfer Family, Lake Formation, Redshift and Athena. Set up and managed data lakes, optimized pipelines, and delivered data ingestion applications in Python, JavaScript and TypeScript.

Data Engineer (DE) · Career Choice

May 2019 to Feb 2022 · 2 yrs 10 mos

Miami-Fort Lauderdale Area · Remote

Improved ingestion and table performance for all Salesforce datasets by automating object ingestion into Redshift with Python (boto3), S3, AWS Data Pipeline and EC2. Designed the Career Choice program data warehouse on Redshift and Glue with self-service tools like QuickSight, and raised code quality with standards and automated deploys on CodeCommit and CodePipeline.

AIG

6 yrs 11 mos

Senior Project Manager & Business Strategy · Global Transformation

Dec 2014 to May 2019 · 4 yrs 6 mos

Miami-Fort Lauderdale Area

Led enterprise transformation in Property and Casualty insurance across 7+ countries in Latin America, Europe and the US, focused on risk, claims and underwriting operations. Designed a future-state data warehouse for the AIG Shared Services Center, unifying many sources and automating 1,400+ reports. Built productivity tools that cut policy issuance cycle times, and trained regional leaders on Lean Six Sigma.

Strategic Planning Coordinator

Aug 2013 to Dec 2014 · 1 yr 5 mos

Greater São Paulo Area

Designed and communicated analyses and recommendations to identify, validate and refine profitable strategic actions for executive and Board-level audiences. Led analysts through complex business analysis using SAS, Oracle, SQL Server and advanced Excel (VBA), and delivered two global KPI dashboards spanning HR, Finance, Operations, Customers and Claims for the LAC and US regions.

System Architect

Jul 2012 to Aug 2013 · 1 yr 2 mos

Greater São Paulo Area

Integrated systems and technologies by coordinating internal and external partners worldwide, streamlining decision-making. Designed and supported complex ETL and ODS processes from many database sources using SAS, Oracle and Microsoft tools, and validated documentation through the Change Approval Board to reduce implementation errors.

HCL Technologies

7 mos

Business Intelligence Consultant

Jan 2012 to Jul 2012 · 7 mos

Greater São Paulo Area

Built the first auto insurance data warehouse in a client environment, following the SDLC methodology. Coordinated with business users, designed the data model, and programmed ETL and ODS processes with SAS BI, SAS DI and SAS Guide to extract from the eBao policy management system.

Santander

1 yr 8 mos

Strategic Planning Analyst

Jun 2010 to Jan 2012 · 1 yr 8 mos

Greater São Paulo Area

Implemented a collections employee scorecard by consolidating company performance data into a single database, and cut the runtime of a business-as-usual (BAU) process by 90% by rewriting SAS code and removing bottlenecks. Built the first collections web dashboard from scratch with MySQL, PHP, HTML5, jQuery and CSS for managers to track results.

Citi

3 yrs 7 mos

MIS Analyst

Dec 2006 to Jun 2010 · 3 yrs 7 mos

Greater São Paulo Area

Worked on Brazil's Credit, Fraud and Collections enterprise data warehouse, defining and validating KPIs with internal and external vendors. Automated 60% of a back-office process while leading a small team, and cut a collections dashboard refresh from 60 to 30 minutes by redesigning the process and database.

GE Money

1 yr 6 mos

MIS Analyst

Jul 2005 to Dec 2006 · 1 yr 6 mos

Greater São Paulo Area

Cut a collectors scorecard process from 8 hours to 20 minutes by automating database extractions and the Excel dashboard with SAS Base and VBA. Built collections data warehouse tables and views in PL/SQL, and automated the boleto creation process to the CNAB bank standard, reducing customer complaints.

Skills

AWS

Glue (Spark), Redshift, Lake Formation, Athena (Presto), EMR (Hadoop), SageMaker, Bedrock, Kendra, Lambda, CDK, ECS Fargate, S3, DataZone

Real-time & batch

Kinesis, Glue Streaming, Airflow, DynamoDB Streams

Languages

Python, TypeScript, JavaScript, SQL

Web & backend

Flask, FastAPI, Node.js, ReactJS, HTML, CSS

Architecture

Data Lake, Data Mesh, Serverless, Scalable system design, MLOps infra, Metadata management

CI/CD

CodePipeline, CodeCommit, GitLab

Security

IAM, Data encryption, Governance, Red Security, Data lineage

Leadership & craft

Mentorship, System design thinking, Interviewing & hiring, Cross-functional collaboration, Agile, Stakeholder alignment, Lean Six Sigma

Education

Universidade Presbiteriana Mackenzie

Bachelor of Computer Information Systems, minor in Business

São Paulo, Brazil · 2008 to 2012

Universidade Presbiteriana Mackenzie

Bachelor's degree, Computer Science

São Paulo, Brazil

FECAP

Computer Software Engineering

São Paulo, Brazil

Let's build something.

Writing, mentoring, or a hard data problem worth solving.

Start a conversation →