Crawler PNG and SVG Icon
AWS Glue Crawler is a component that automatically scans data sources, infers schemas, and creates metadata tables in the AWS Glue Data Catalog.
Last Modified: August 29, 2025
Sizes: 16px, 32px, 48px, 64px
Details
Key Features
- Automatically scans data sources to detect schema and metadata.
- Populates AWS Glue Data Catalog entries.
- Supports incremental crawls for efficiency.
- Integrates with Amazon S3, Amazon RDS, Amazon Redshift, and other JDBC-accessible sources.
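As a rough illustration of the features above, the following sketch builds the request parameters for a crawler that targets an S3 path and, optionally, crawls only newly added folders. The helper name `build_crawler_request` and the example role ARN, database name, and bucket path are hypothetical. The parameter keys follow the Glue `CreateCrawler` API as exposed by boto3, and the sketch assumes that incremental crawls require both schema change behaviors to be set to `LOG`.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str,
    role_arn: str,
    database: str,
    s3_path: str,
    incremental: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for a Glue CreateCrawler call (hypothetical helper)."""
    request: Dict[str, Any] = {
        "Name": name,
        "Role": role_arn,                # IAM role the crawler assumes
        "DatabaseName": database,        # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if incremental:
        # Crawl only folders added since the last run.
        request["RecrawlPolicy"] = {"RecrawlBehavior": "CRAWL_NEW_FOLDERS_ONLY"}
        # Assumption: incremental crawls need both behaviors set to LOG.
        request["SchemaChangePolicy"] = {
            "UpdateBehavior": "LOG",
            "DeleteBehavior": "LOG",
        }
    return request


# Example values below are placeholders, not real resources.
params = build_crawler_request(
    name="demo-crawler",
    role_arn="arn:aws:iam::123456789012:role/DemoGlueRole",
    database="demo_db",
    s3_path="s3://example-bucket/raw/",
)

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_crawler(**params)
#   glue.start_crawler(Name=params["Name"])
```

Keeping the request as a plain dictionary makes it easy to inspect or unit test before any AWS call is made.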
Common Use Cases
- Automatically discovering and cataloging new datasets in S3
- Keeping AWS Glue Data Catalog tables up to date as source schemas change
- Classifying data by file type and structure for ETL jobs
Explore More Icons
Thinkbox XMesh
Thinkbox XMesh is a geometry caching system that optimizes complex animated geometry workflows in 3D applications.
Support
AWS Support provides a range of plans to assist customers with their AWS environments, offering 24/7 technical support, best practices, and guidance from cloud experts.
License Manager
AWS License Manager helps you manage software licenses from vendors like Microsoft, SAP, and Oracle on AWS and on-premises.
Lambda
AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, triggered by events and scaled automatically.
ECS Task
Amazon ECS Task is the smallest deployable unit in ECS, representing a single running container or group of containers defined by a task definition.
Marketplace Light
AWS Marketplace is a digital catalog that makes it easy to find, test, buy, and deploy third-party software that runs on AWS.
Lookout for Equipment
Amazon Lookout for Equipment uses machine learning to detect abnormal equipment behavior and prevent potential failures.
Managed Services
AWS Managed Services (AMS) helps enterprises operate their AWS infrastructure by providing ongoing management, monitoring, patching, and operational support.
Alexa For Business
Alexa for Business is an AWS service that enables organizations to use Alexa-powered devices to improve productivity and manage workplace tasks via voice interaction.
HDFS Cluster
The Amazon EMR HDFS Cluster icon represents the use of Hadoop Distributed File System (HDFS) within Amazon EMR for distributed data storage and processing.
DataSync
AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage and AWS.
SageMaker AI
Amazon SageMaker AI is a fully managed service that enables developers and data scientists to build, train, and deploy ML models at scale.
Compute Optimizer
AWS Compute Optimizer uses machine learning to recommend optimal AWS compute resources for your workloads to reduce cost and improve performance.
Pinpoint
Amazon Pinpoint is a flexible and scalable outbound and inbound marketing communications service for sending targeted messages to customers across multiple channels.
Email Notification
The Email Notification icon typically represents email sent through services such as Amazon SES or Amazon SNS, including alerts, confirmations, and other automated notifications.
Professional Services
AWS Professional Services is a global team of experts that helps customers realize their desired business outcomes using the AWS Cloud through specialized engagements.
Timestream
Amazon Timestream is a fast, scalable, serverless time series database service for IoT and operational applications.
Elemental Conductor
AWS Elemental Conductor is software for managing multiple AWS Elemental Live encoders from a central interface.
Elastic Container Service
Amazon Elastic Container Service (ECS) is a fully managed container orchestration service that allows you to run and scale containerized applications.
CodeDeploy
AWS CodeDeploy is a fully managed deployment service that automates application deployments to Amazon EC2, Lambda, or on-premises servers.
Thinkbox Stoke
Thinkbox Stoke is a tool for accelerating particle simulation workflows and re-timing caches in 3D content creation.
Neuron
AWS Neuron is a software development kit (SDK) that enables running high-performance ML models on AWS Inferentia- and Trainium-based instances.
Site to Site VPN
AWS Site-to-Site VPN connects your on-premises network to AWS over an IPsec VPN tunnel for secure communication.
VPC Lattice
Amazon VPC Lattice helps you securely connect, monitor, and manage service-to-service communication in a consistent way.